Viktoriya Atanasova
June 24, 2024
Virtual Production (VP), and specifically in-camera visual effects (ICVFX), provides considerable benefits for an actor’s immersion in the diegetic world. Several test shoots were conducted to determine whether audio can enhance this experience. They resulted in a comprehensive diagram of the proposed technical set-up and a set of recommendations for future productions.


Virtual Production (VP) that combines real-time game engine technology with live camera tracking allows the final image to be displayed on LED walls. This type of VP gives crew members a shared frame of reference for the virtual environment, which can assist with framing and lighting. It is especially valuable for actors, as it provides a visual reference to the diegetic space and heightens their immersion. Unfortunately, sound does not receive the same level of attention, which reflects an image-centric bias in the film industry. This has left a gap in the literature on audio, despite it being a fundamental part of film. Therefore, this gem focuses on how real-time diegetic sound effects (SFX) can accompany the visuals and further enhance the immersion actors experience during VP.


A series of VP shoots was conducted to determine whether there is a causal link between the use of real-time diegetic SFX and enhanced levels of immersion. Preparations for the experimental productions began with a detailed breakdown of the script, during which ideas about the final sound design were documented.

Script adapted for real-time SFX implementation 

These notes informed the eventual soundscape of the short film. Based on the drafted previs, a spot list containing time-specific dialogue, music, and sound effects was created to facilitate the editing process. Once all materials had been sourced, two key diegetic SFX were chosen for real-time playback. These audio clips were exported with spatial information, such as reverb, already baked in. Diegetic environmental SFX were chosen for these productions, as they are thought to have a greater effect on presence in the virtual environment (VE).
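As a rough illustration of how such a spot list might be organised, the sketch below models each entry as a timed item and picks out the SFX flagged for real-time playback. The field names, the `realtime` flag, and all entries are hypothetical placeholders, not the spot list used in the test shoots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpotEntry:
    timecode_s: float       # position in the previs, in seconds (assumed unit)
    category: str           # "dialogue", "music", or "sfx"
    description: str
    realtime: bool = False  # True if the clip is chosen for on-set playback


def realtime_cues(spot_list):
    """Return the SFX entries flagged for real-time playback, in time order."""
    cues = [e for e in spot_list if e.category == "sfx" and e.realtime]
    return sorted(cues, key=lambda e: e.timecode_s)


# Hypothetical entries, for illustration only.
spot_list = [
    SpotEntry(12.0, "sfx", "placeholder environmental effect B", realtime=True),
    SpotEntry(0.0, "music", "placeholder opening theme"),
    SpotEntry(4.5, "sfx", "placeholder environmental effect A", realtime=True),
    SpotEntry(6.0, "dialogue", "placeholder line"),
    SpotEntry(9.0, "sfx", "placeholder effect added in post only"),
]

for cue in realtime_cues(spot_list):
    print(f"{cue.timecode_s:>6.1f}s  {cue.description}")
```

Keeping the real-time cues in the same list as the post-production material means the on-set playback and the final edit draw on one shared reference.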


The technical set-up of the test shoots was largely dictated by the equipment available at the XR stage at Breda University of Applied Sciences. It consisted of a laptop running Pro Tools, with the two chosen diegetic SFX placed on two separate tracks. The laptop was connected to a KORG nanoKONTROL2 MIDI controller, which was used to trigger playback of the individual audio clips. The resulting audio was routed from the laptop into a StudioLive 16.0.2 USB recording mixer, which then sent the signal to the speakers. For this purpose, two pairs of Martin Audio W3P speakers were placed above the actors, one pair on either side of the XR stage.

Diagram of real-time audio playback system 

In each shot that involved playback of the diegetic SFX, the audio was carefully synchronised to the movement of assets in the VE to deliver the most immersive experience possible to the actors. Several takes were recorded to give each performer a solid basis for comparison in the subsequent data collection. When playback issues arose, the Web MIDI test page was used to check whether the device's input and output were detected and functioning correctly.
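The trigger step in this chain, where a button press on the MIDI controller starts one of the two clips, can be sketched as a small message decoder. This is only an illustration of the idea, not the Pro Tools configuration used in the test shoots. The controller numbers, the clip names, and the press threshold of 64 are assumptions. The only fixed detail is that MIDI Control Change messages carry a status byte in the range 0xB0 to 0xBF, followed by a controller number and a value.

```python
# Hypothetical mapping from controller numbers to the two SFX clips.
CC_TO_CLIP = {41: "sfx_clip_1", 42: "sfx_clip_2"}

CONTROL_CHANGE = 0xB0  # upper nibble of a Control Change status byte
PRESS_THRESHOLD = 64   # assumed: values >= 64 count as "button pressed"


def clip_for_message(message, mapping=CC_TO_CLIP):
    """Return the clip to trigger for a 3-byte MIDI message, or None."""
    if len(message) != 3:
        return None
    status, controller, value = message
    # Ignore anything that is not a Control Change (the channel is ignored).
    if status & 0xF0 != CONTROL_CHANGE:
        return None
    # Ignore button releases, which are assumed to send a low value.
    if value < PRESS_THRESHOLD:
        return None
    return mapping.get(controller)


# A press on the first mapped button, followed by its release.
print(clip_for_message((0xB0, 41, 127)))
print(clip_for_message((0xB0, 41, 0)))
```

Logging the raw bytes alongside the decoded clip name gives a quick way to confirm that the controller is detected and sending what is expected, much like the check performed with the Web MIDI test page.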


Overall, the playback of real-time diegetic SFX during VP shoots has several benefits. It can provide the actor with important information about the state of the VE, such as sound cues, and in doing so transforms the XR stage into a performance space. Additionally, the SFX can serve as a basis for the final edit. The perceptual needs of actors are highly individualised, so different approaches might affect immersion levels differently. However, the quality, quantity, and consistency of the audio elements are foundational to the set-up of the framework.

Participants found that the audio playback added to the feeling of realism and to their perception of the virtual world. This ultimately had a positive effect on their performance: the closer a mediated experience is to its non-mediated counterpart, the more likely a media user is to behave as they would in the real world. The added SFX completed the experience, which helped performers suspend disbelief and feel present in the diegetic world. Conversely, actors reported that the limited use of audio made the moments of silence on set more noticeable and contributed to a decrease of presence in the VE. As a result, they suggested that continuous playback of the soundscape would be more likely to achieve the desired effect. Furthermore, the element of surprise can be a valuable tool depending on the mood, though actors agreed that hearing the SFX beforehand better prepared them for what to expect once shooting commenced.

On-set preparations for Virtual Production Student Network Short film 

Since immersing themselves in the environment requires a consistent and conscious effort from performers, even a small complication can remind them that they are, in fact, in a mediated experience and break the illusion. Actors noted that crucial adjustments can be made to minimise these instances. Beginning with the script, participants placed great importance on the connection between the story and character development, and on the motivations behind them. In other words, the level of immersion actors experience in the VE is linked to their emotional connection to the narrative: if they feel that the actions they are portraying reflect the character, they are more likely to immerse themselves in the story. While working on set, actors reported feeling disconnected from the diegetic world after repeated takes. As a result, they began to lose the necessary cognitive engagement and felt unnatural in their portrayal of the fictional characters. These considerations extend to the implementation of real-time audio in VP, as lapses in synchronisation between the playback of the SFX and movement on the LED wall can negatively affect immersion. Additionally, the number and placement of the speakers relative to the actor played an equally pivotal role in delivering a believable experience. However, academic sources report conflicting data on the necessity of spatialised audio, stating only that multi-speaker systems were found to induce a greater sense of immersion than single-speaker systems.


Based on the findings, the following advice is offered to sound engineers in the VP industry. Despite its advantages, the proposed framework for real-time SFX playback may not be a practical solution for all studios. Its adoption is largely contingent on the type of shoot, the available time and budget, and the need for the practice. Separately from the adoption of immersive audio, participants noted that repeated takes led to their detachment from the VE. As they sensed their conscious efforts in acting slipping away, the actors felt a lack of guidance from the director. Directors are therefore advised to be more involved in helping actors achieve the desired character portrayal. Moreover, producers should manage the amount of time that actors spend on set when not filming. Findings suggested that lengthy preparations between shots tire the imaginations of actors and should be addressed accordingly. Furthermore, participants noted that they could not benefit from the immersive nature of LED wall-enhanced VP, since they spent most of the time facing away from the screen. To mitigate this issue, studios can place a small screen showing the camera feed in front of the actors or, alternatively, use a lower-resolution, movable LED panel to display the extension of the VE as it would appear before the actors.


Although the question of whether a performer’s immersion in the virtual world can be enhanced through real-time SFX has no simple answer, the practice presents clear benefits. During the test shoots, certain limitations arose due to the quality of the available equipment. Even so, the chosen approach was preferred given the project circumstances. Additionally, the relevant literature takes conflicting stances on which audio elements are most likely to elicit feelings of presence. Recommendations on whether to implement real-time audio in Virtual Productions therefore remain dependent on the specialised needs of the film crew, as well as the opportunity to do so. More importantly, the research found that additional accommodations should be made for actors on set to prevent a decrease of immersion in the diegetic space during production. Finally, this gem concludes that audio is an essential part of any media experience, and the benefits it provides for the VP pipeline should be explored further.