Making a ‘Virtual Choir’ video with free* software: Part 2 – Audio


In this three-part series of posts, I’ll take you through why and how to make one of those charming multi-screen, multi-track musical videos, based on my own experiences. I’ve used software that’s freely available online [though see update below!], and I’m very much coming at this from the perspective of an amateur video editor, in the hope that my tribulations might make life easier for anyone contemplating putting one of these together.

Click here for Part 1 and here for Part 3

[Update, March 2021: I’ve recently done a couple more of these videos, and decided to return to these posts, to see if they can be made more helpful, in the light of my more recent experiences. Most importantly, I’ve downgraded the headline from ‘free’ to ‘free*’. It’s definitely possible to do this with freely available software – but I’ve found that spending a little money on professional editing software makes the process roughly 10 times easier and more enjoyable.]

We’ve looked at why we might want to have a go at a split-screen music video. Now let’s look at one way of actually doing it.

Note: This is just one of a thousand different ways you could approach this. I’m not claiming this is the best way – just the one that worked for me, which I mostly figured out as I went along.

Further note: I’m going to address this to the moderately tech-savvy. This is purely a guide to what I did – take all or none of it. It presupposes using YouTube tutorials to get the basics of the software, so I’m not going to cover these in the guide.

What you’ll need

This is the most basic version of the equipment you’ll need to put this together.

  • A reasonably well-specced computer
    • There’s no getting away from this, I’m afraid – video editing eats processing power for breakfast. You’ll need a reasonable amount of RAM and a decent CPU. If you’re using a MacBook, you’ve probably already got this. If not, check your system specs – I reckon 4-8 GB of RAM and a reasonably modern processor should do it, together with enough space on the hard drive for quite a few videos!
  • Audio editing software
    • I used Cubase, which is available as a free trial. If you need longer, it’s not too expensive to buy, or you could try Audacity, which is rather more fiddly, but free for life
  • Video editing software
    • Adobe Premiere Pro. It has a really good introductory tutorial built in. I initially used it on a free trial, but subsequently decided it was worth the money to purchase a subscription for now (~£20 per month)
    • There’s also Lightworks, which is free and does the same sorts of things, and Shotcut, which is also well-specced. However, I have found that these free editors become unstable after a certain number of tracks are added. A little investment in the software prevents a multitude of headaches down the line
  • Handbrake
    • This helps us make sure all the video files submitted to us can be edited by the software, by converting them all into the same format
  • Time

Step 1: Create the Guide

You could simply make your performers record audio and video at the same time. However, this can be a little overwhelming – it’s a lot of pressure to think about both the visual and the audio at the same time when you are recording yourself, and it makes editing and controlling the audio trickier.

We’re going to record the audio and video components of the video separately, then put them together afterwards. This means that the performers can focus entirely on getting their performance right, then, having done so, can effectively mime the video. This allows for a more engaging presentation.

Creating the Guide

The performers need a guide recording to perform along to. It can be as simple as a metronome, but the more the performers feel like they’re performing with others, the better, and some have used preexisting recordings for this purpose, grafting their own voices or instruments on top of it.

I don’t find either of these solutions particularly gratifying. Using a metronome can lead to a rather mechanical performance, and singing along to someone else’s recording doesn’t allow the freedom of your own interpretation.

Note: in fact, some of the pieces I chose needed a flexible tempo, which a metronome would make impossible, and there weren’t any extant recordings to use.

Here’s how I made my guide recording:

  • Using my phone, I took a video of myself, clapping on the fourth beat of a metronome – beep, beep, beep, clap – followed by me conducting the piece to camera.
  • I then recorded myself playing the choral parts/accompaniment on the piano into the audio software (Cubase), while watching the video I had just made (making sure to clap along at the beginning), then exported this as a .wav file
  • In the video editor (Premiere Pro), I lined up my new piano recording with the video, by lining up where the two ‘clap’ waveforms were on the audio tracks – they’re pretty easy to spot. I then exported this video
  • Watching the new video, I repeated the piano recording process, except this time recording myself singing the vocal parts, always lining them up using the clap

Make a rough mix of the voices and piano in the audio editor by adjusting the track levels on the mixer until you’re happy. Then add it as the audio to your conducting video.

For a recent video, I actually made four different versions of this ‘guide’ mix, each one emphasising a particular voice-part by putting it forward in the mix, and the others back. This was a lot more work, but the singers found it helpful to have a strong lead on their part to sing along with.
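If you’d like to see the idea behind those part-emphasised mixes in numbers, here is a minimal Python sketch. It is not what Cubase does internally, just the arithmetic of pushing one part forward and the rest back. The part names, the gain values (1.0 for the lead, 0.4 for the others) and the function names are all assumptions made up for illustration, and the tracks are assumed to be already decoded into lists of samples between -1 and 1.

```python
def guide_mix_gains(parts, lead, lead_gain=1.0, other_gain=0.4):
    """Return one gain per part: the lead part forward, the others back.

    The default gain values are illustrative guesses, not recommendations.
    """
    if lead not in parts:
        raise ValueError(f"unknown part: {lead}")
    return {part: (lead_gain if part == lead else other_gain) for part in parts}


def mix(tracks, gains):
    """Sum the tracks sample by sample, scaling each by its gain.

    `tracks` maps a part name to a list of float samples.
    The result is cut to the length of the shortest track.
    """
    length = min(len(samples) for samples in tracks.values())
    return [
        sum(gains[name] * samples[i] for name, samples in tracks.items())
        for i in range(length)
    ]


# Hypothetical four-part line-up: one set of gains per guide version.
parts = ["soprano", "alto", "tenor", "bass"]
all_gains = {lead: guide_mix_gains(parts, lead) for lead in parts}
```

Each entry in `all_gains` describes one guide version, so `all_gains["alto"]` keeps the altos at full level and drops everyone else. In practice you would set the same ratios by ear on the mixer faders.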

After exporting, this left me with a video of me clapping, then conducting an invisible ensemble of piano and singers. By following myself conducting, instead of using a metronome, I was able to allow for breaths and a slightly more organic performance. It also forces you to learn whether you’re easy to follow or not!

Note: I asked friends to supply the voice-parts I couldn’t sing. If there’s no one around and you don’t feel like doing it, why not engage some professional singers to lay down guide tracks for a few bob – they’ll appreciate the work.

Send it to the Performers

Send the video to the group, along with detailed instructions as to how to contribute – everything from positioning the recording device, to warming up beforehand, and clapping with the guide. I based my guidelines on the excellent list available here (geared towards the a cappella tradition but mostly applicable).

Experience suggests two problems are most common, and both need highlighting in the instructions: inconsistent video orientation (I prefer landscape, but everyone has to do the same or it looks messy), and forgetting to clap in the audio, the video, or both.

Each participant records audio (with headphones in) and then video separately (no headphones), and sends you both files. Use a service that permits the transfer of large files, such as WeTransfer, wesendit, iCloud, Google Drive, etc.

Step 2: Assembling the Audio

Lining up clap waveforms in Cubase

As you receive the audio files, import them into Cubase, and line them up with the guide recording using the clap.
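Lining up by eye works well, but the logic is simple enough to sketch in code if you’re curious what you are actually doing. The Python below is a rough illustration, not a Cubase feature: it assumes each recording has already been decoded into a list of samples between -1 and 1 at the same sample rate, that the clap is the first loud thing in the file, and that a level of 0.5 counts as "loud". The threshold and the function names are assumptions for the example.

```python
def find_clap(samples, sample_rate, threshold=0.5):
    """Return the time (in seconds) of the first sample at or above the threshold.

    Returns None if nothing in the recording is that loud.
    """
    for index, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return index / sample_rate
    return None


def offset_to_guide(guide, take, sample_rate, threshold=0.5):
    """Return how many seconds to shift `take` so its clap lands on the guide's.

    A positive result means move the take later; negative means earlier.
    """
    guide_clap = find_clap(guide, sample_rate, threshold)
    take_clap = find_clap(take, sample_rate, threshold)
    if guide_clap is None or take_clap is None:
        raise ValueError("no clap found; check the recording or lower the threshold")
    return guide_clap - take_clap


# Toy data at 1000 samples per second: silence, then a single loud spike.
sample_rate = 1000
guide = [0.0] * 2000 + [0.9] + [0.0] * 100   # clap 2.0 s in
take = [0.0] * 2500 + [0.8] + [0.0] * 100    # clap 2.5 s in
shift = offset_to_guide(guide, take, sample_rate)
```

Here the singer clapped half a second later in their file than the guide did in its own, so `shift` is negative and the take needs to slide earlier. A noisy recording (a cough before the clap, say) would fool this simple threshold, which is one reason to check the waveforms by eye anyway.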

NB You might need to make sure they’re in a format Cubase can read – for example, it doesn’t like Apple’s m4a format, so I used this website to convert those to wav.
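If you end up with a lot of files to convert, a command-line tool can save some clicking. The sketch below is an alternative to the website I used, not a description of it: it assumes you have the free tool ffmpeg installed and on your path, and the function names and the example file name are made up for illustration.

```python
import subprocess
from pathlib import Path


def wav_conversion_command(source):
    """Build the ffmpeg command that converts one file to .wav alongside it."""
    src = Path(source)
    target = src.with_suffix(".wav")
    # "ffmpeg -i input output" picks the output format from the file extension.
    return ["ffmpeg", "-i", str(src), str(target)]


def convert_folder(folder):
    """Convert every .m4a file in `folder` to .wav (requires ffmpeg)."""
    for src in sorted(Path(folder).glob("*.m4a")):
        subprocess.run(wav_conversion_command(src), check=True)


# Building the command does not run anything, so it is safe to inspect first.
command = wav_conversion_command("alto_jane.m4a")
```

Calling `convert_folder` on the folder where you collect submissions would then convert the whole batch in one go, leaving the original m4a files in place.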

Hopefully this should mean they’re vaguely together with each other – you can make micro-adjustments if not. You can trim ‘rogue’ moments out, add some reverb to distance the sound a little, and use the Mixer to get the balance right between parts. Play about until you’re happy, then export to a single file. Remember to leave in the ‘clap’ so that you can synchronise it to the video in the next stage.

If you can access plugins, I’ve found the following invaluable, used on the whole mix: DeEsser (to de-emphasise those sibilants); Limiter (to prevent the audio from getting too loud and creating distortion); EQ (taking off highs and lows creates a bit of distance); Reverb. Reverb presents an interesting challenge: you want it to sound like the listener is hearing a choir at the normal distance away (10 metres or so), but must reconcile this with the fact that the singers’ faces are right up by the screen, which psychologically suggests a more intimate sound.
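To give a feel for why the mix needs taming at all, here is a very crude stand-in for a limiter in Python. A real limiter plugin reacts moment by moment; this sketch simply turns the whole mix down if its loudest peak goes over a ceiling, which is closer to peak normalisation. The ceiling of 0.9 and the function name are assumptions for illustration, and the samples are assumed to be floats where anything beyond -1 to 1 would distort.

```python
def tame_peaks(samples, ceiling=0.9):
    """Scale the whole mix down so its loudest peak sits at the ceiling.

    If the mix is already under the ceiling it is returned unchanged.
    This is uniform scaling, not true limiting.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= ceiling:
        return list(samples)
    scale = ceiling / peak
    return [s * scale for s in samples]


# Several voices summed together can easily exceed full scale.
summed_mix = [0.5, -1.8, 0.9]
safe_mix = tame_peaks(summed_mix)
```

The more voices you stack, the more likely the summed peaks are to go over, which is why the limiter sits on the whole mix and not on individual tracks.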

If you’re just making an audio virtual recording, you can stop there. If hubris hasn’t yet got the better of you, though, the final stage is video. Hold on to your hats (and spare a thought for your poor computer).

Next week:

Making a ‘Virtual Choir’ video with free* software: Part 3 – Video