This is the technical process PushPull uses to record as of June 2014. Feel free to be foolish enough to duplicate.
In a previous post, Kristin did a great job of explaining how we put together PushPull. As the lead audio engineer, I wanted to follow that up with some more about the technology and techniques that I use to make our recordings so vibrant and alive.
Look, we don’t know what we are doing. PushPull’s major early design principle was “get ‘er done.” We took the idea of minimum viable product to a whole new (low) level. In past projects, I have hung up the production trying to make a perfect recording and, after many, many, too many hours, had zero product. There is none of that frustration in PushPull production.
So, we would welcome input on how to improve the audio and presentation of our podcast. We are pretty excited about the promise of PushPull and could move past our MVP approach and amp up the game.
That said, my goal is not glimmering beautiful audio. Our goal is to have interesting chats. A couple listeners have commented “I wanted to get in on your conversation.” I like that our spontaneous, agenda-free hour strays into interesting topics that we did not foresee. I like that our guests sometimes have to shout me down because I get too ranty. That is our core value, not HD audio.
The One Mic Principle
We are recording with a Zoom H2n. It is a battery-powered portable recorder with four directional mics. It records onto an SD card.
At first we placed the microphone on the table, but we found we could cut down on the sound transmitted through the table by affixing it to a boom mic stand that totally gets in Kristin’s way when we record.
The H2n has four settings. We use the 2 channel surround (2ch) setting and put the mic in the middle of the room. This does not give me the ability to mix each voice separately. It has not been that much of a problem (to me — if you are a listener do provide feedback). The nature of these condenser mics and the normalization process seem to balance out the voices pretty well.
I have been recording PushPull at the Good Ol’ CD sample rate, 44.1 kHz, at 24-bit depth (a CD proper is 16-bit).
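For a sense of what those settings cost in SD card space, here is a back-of-the-envelope sketch. It assumes the recorder writes plain uncompressed PCM (WAV) in two channels and that an episode runs about an hour. The function name is made up for illustration.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed PCM audio in bytes (header overhead ignored)."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

# Assumed: 44.1 kHz, 24-bit, stereo, one hour of conversation.
raw_hour = pcm_bytes(44_100, 24, 2, 60 * 60)
print(f"{raw_hour / 1e6:.0f} MB per hour")  # prints "953 MB per hour"
```

So an hour of raw audio is a bit under a gigabyte, which is why the raw file never goes to listeners.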
We record in my guest bedroom. I have hung a blanket on one wall so that Kristin’s delightful laugh does not get *too* boomy. The other wall is slightly protected by this painting made of me 20 years ago. Kristin looks upon it over my shoulder while we record. Perhaps too much Joel at one time. Perhaps.
Audio Processing, the Low Touch Way
For the first several episodes, I went through a compress, normalize, and noise reduction process within Audacity, the free digital audio software for Mac that is, er, pretty OK. It does the job. I would not want to live in Audacity but I can manage it for quick projects. I am very grateful to the developers of Audacity.
My good friend and audio pro walked me through how to best take our recording from raw recording to a deliverable MP3. I made a lot of mistakes, had to back out and start again, and generally hated the process. The wrong combination of these complex algorithms can make the conversation sound like erudite robots speaking through megaphones while simultaneously crunching broken glass between their robot teeth.
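Of those three steps, normalizing is the easiest to picture: find the loudest sample and scale everything so that peak lands at a chosen level. Here is a toy sketch of that idea. It is not Audacity's or Auphonic's actual code, the function name and the -1 dBFS default are my own assumptions, and it treats audio as a plain list of floats between -1.0 and 1.0.

```python
def peak_normalize(samples, target_dbfs=-1.0):
    """Scale samples so the loudest one sits at target_dbfs (0 dBFS = 1.0)."""
    target_peak = 10 ** (target_dbfs / 20)  # convert dB to linear amplitude
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence (or nothing at all): leave untouched
    gain = target_peak / peak
    return [s * gain for s in samples]

# A quiet, made-up snippet: the loudest sample is only a quarter of full scale.
quiet = [0.1, -0.25, 0.05]
louder = peak_normalize(quiet, target_dbfs=0.0)
```

Every sample gets the same gain, so the balance between voices does not change. That is the job of the compressor, and stacking the two carelessly is where the robots come from.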
A listener who got tired of our inconsistent results pointed me to Auphonic. Auphonic is an online audio production tool. I have been using it since episode 9. I love it.
I just upload the raw file to Dropbox, apply the preset, and await the confirmation email. Auphonic processes the file and drops an MP3 and a lossless FLAC file back into Dropbox.
The MP3 is for Kristin and me to preview. Kristin is a champ about writing the show notes from this version.
The FLAC is my working copy. I import it to an Audacity project that already has Michael Sulis’ awesome introduction. I choose the 15-20 second intro and copy that to a third stereo track.
We rarely cut anything out of the recording. Occasionally we pause for a bottlecap removal and I cut that out.
The final produced file is currently a 128 kbps MP3. Is that too high quality?
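One way to frame that question is download size. A rough sketch, assuming constant-bitrate encoding and an hour-long episode, and ignoring tags and headers (the function name is made up):

```python
def mp3_megabytes(kbps, minutes):
    """Approximate size in MB of a constant-bitrate MP3."""
    bytes_per_second = kbps * 1000 / 8
    return bytes_per_second * minutes * 60 / 1e6

for kbps in (64, 96, 128):
    print(f"{kbps} kbps: {mp3_megabytes(kbps, 60):.1f} MB")
# 64 kbps: 28.8 MB
# 96 kbps: 43.2 MB
# 128 kbps: 57.6 MB
```

Dropping to 64 kbps would halve the download. Whether our voices survive that is a question for ears, not arithmetic.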
The Bluegrass Approach to Podcasting
We could do better, in terms of audio quality. I know better, really I do. Individual mics with just the right gain shoved close to people’s faces make for great signal-to-noise and dynamic response. I could mix for space, throw voices to left or right.
I avoid saying never, but I don’t foresee us moving to that model. If the audience starts telling us that we are unlistenable, then we will improve. However, I suspect that we are using just the right tech. That is because we record podcasts like it is bluegrass. This approach to capturing live, improvised music forces the band to huddle around the mic. They interact with each other and feed off of each other’s body language. I think we accomplish the same with PushPull. If we or our guests were put into a frozen position in front of a mic, we would have a different interaction.
I feel that we are sacrificing the opportunity for higher quality audio for the sake of higher quality, authentic content. All I hope is that the audience — perhaps including @tylerpoage — will simply not feel that the audio quality gets in the way of the content.
You will get cars passing, the sounds of a nervous guest tapping on the table, and Torito intruding for attention. We hope that along with that you get authentic conversation about making the web, because that is what matters to us. That and good dogs.
Sometimes, this happens in the midst of recording: