Production Expert

Exploring Spectral Audio Editing Tools And Techniques

Spectrograms have been around for decades and spectral editing has existed for over 20 years. In this article, Paul Maunder investigates the history of spectrograms, takes a look at some of the popular editing tools available today and explains a number of techniques that will ensure you get the best results for your audio.

Spectral audio editing was first introduced by Cedar in 2002 with the release of Retouch. This patented technology allows specific parts of the frequency spectrum to be edited, manipulated or removed using a visual graph called a spectrogram, which plots frequency against time, showing how the frequency content of a sound varies over time. The intensity of a sound is indicated by its colour or brightness. As the owner of the patent on spectral editing, Cedar licenses the technology to many manufacturers who now offer products based on it.

Spectrographic analysis and display of audio actually dates back all the way to 1951, when The Kay Electric Co produced the first commercially available machine under the trademark ‘Sona-Graph’. Early uses included seismology, sonar, linguistics, radar and ornithology. The image below is a 1971 spectrogram of bird calls.

Spectrograms did appear in a few audio software applications prior to 2002, but the ability to edit only came about thanks to Cedar’s research and development in this area.

Today, a wide variety of software tools incorporate spectral editing. In this article we’ll take a look at a few of the popular ones and discuss the uses and benefits of working with audio in this way.

Fast Fourier Transform

To generate a spectrogram of audio, a Fast Fourier Transform, or FFT, is applied to short, successive slices of the audio. The FFT is an efficient algorithm for computing a Fourier Transform on sampled data. A Fourier Transform is an integral transform that converts a function into a form that describes the frequencies present in the original function. Put more simply, the Fourier Transform is analogous to decomposing a sound into the intensities of its constituent pitches. The resulting spectrogram allows us to see the audio and, depending on the software used, perform a range of editing tasks.
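As a rough illustration of the idea, the sketch below builds a tiny spectrogram using only Python's standard library. It is a minimal sketch, not how any of the applications in this article work internally: it uses a slow, naive Fourier Transform for clarity (an FFT produces the same values much faster), and the function names, the 8 kHz sample rate, the frame size of 64 and the Hann window are all assumptions chosen for the example.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins of a real frame (naive DFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, fft_size=64, hop=32):
    """Return a list of frames; each frame is a list of bin magnitudes."""
    # A Hann window reduces leakage between neighbouring frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / fft_size) for t in range(fft_size)]
    frames = []
    for start in range(0, len(samples) - fft_size + 1, hop):
        frame = [samples[start + t] * window[t] for t in range(fft_size)]
        frames.append(dft_magnitudes(frame))
    return frames

sample_rate = 8000
tone_hz = 1000  # chosen to sit exactly on a bin: 8000 / 64 = 125 Hz per bin
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(samples)
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print("loudest bin:", loudest_bin, "=", loudest_bin * sample_rate / 64, "Hz")
```

Each inner list is one vertical slice of the picture. A spectral editor would map those magnitudes to colour or brightness and stack the slices left to right.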

In several of the spectral editing applications available today, the FFT size can be changed. Changing the FFT size alters how the spectrogram looks, bringing different parts of it into sharper or softer focus. Generally, smaller FFT sizes provide more detail in time-based events, while larger FFT sizes favour frequency detail. For identifying low frequency sounds, it’s worth trying a larger FFT size, while for higher frequency content, smaller FFT sizes usually work best.
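The trade-off follows from two standard relationships: each frequency bin is (sample rate / FFT size) wide, and each analysis window lasts (FFT size / sample rate). The short sketch below prints both for a few sizes, assuming a 48 kHz sample rate. The function name is purely illustrative and the sizes are examples rather than the options offered by any particular application.

```python
def fft_resolution(sample_rate, fft_size):
    """Return (width of one frequency bin in Hz, length of one window in ms)."""
    bin_width_hz = sample_rate / fft_size
    window_ms = 1000 * fft_size / sample_rate
    return bin_width_hz, window_ms

for size in (256, 1024, 4096, 16384):
    width, duration = fft_resolution(48000, size)
    print(f"{size:>6}: {width:8.2f} Hz per bin, {duration:7.2f} ms per window")
```

Quadrupling the FFT size makes each bin four times narrower, which helps separate closely spaced low frequencies, but each window then covers four times as much audio, so short events are smeared over a longer stretch of time.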

Frequency Scales

Many of the software tools which include spectral editing provide an adjustable frequency scale. To use the examples available in iZotope RX, we can choose from Linear, Mel, Bark, Log and Extended Log. The user manual provides some detail on what each of these scales means, but in essence, ‘Linear’ displays the greatest detail in higher frequencies, with less visible resolution given to lower frequencies. ‘Extended Log’, on the other hand, gives the greatest detail and screen area to lower frequencies.

A simple way to think of this is that the settings closer to the top of the list give more visible resolution to higher frequencies while the lower down the list you go, the more resolution is given to lower frequencies.
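To put some numbers on this, the sketch below works out how far up the display 1 kHz sits on three kinds of scale, for a view running from 20 Hz to 20 kHz. It is only an approximation of the idea: the mel formula used here is one widely used textbook version, and I am assuming, not claiming, that it resembles what any given application does. The function names and display range are also my own choices.

```python
import math

def hz_to_mel(f):
    """One common Hz-to-mel formula (an assumption; tools may use a different variant)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def display_fraction(f, scale, f_min=20.0, f_max=20000.0):
    """Vertical position of f, from 0.0 (bottom, f_min) to 1.0 (top, f_max)."""
    lo, hi, x = scale(f_min), scale(f_max), scale(f)
    return (x - lo) / (hi - lo)

scales = {"linear": lambda f: f, "mel": hz_to_mel, "log": math.log10}
fractions = {name: display_fraction(1000.0, scale) for name, scale in scales.items()}
for name, fraction in fractions.items():
    print(f"{name:>6}: 1 kHz sits {fraction:.1%} of the way up the display")
```

On the linear scale everything below 1 kHz is squeezed into roughly the bottom twentieth of the display, on the mel scale it gets about a quarter, and on the log scale it takes up more than half.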

Steinberg Spectralayers has similar frequency scale settings, with some differences, but the concept is the same as in iZotope RX.

Software

A variety of audio applications now include a spectrogram and allow audio to be edited and manipulated using this view and the associated tools. Some of the popular ones include:

  • Acon Digital Acoustica

  • Absentia DX

  • Adobe Audition

  • Audacity

  • Cedar Retouch

  • iZotope RX

  • Reaper

  • Samplitude Pro X

  • Steinberg Wavelab Pro

  • Steinberg Spectralayers

The tools and functions vary considerably between these applications, but let’s consider some of the common editing features often found in spectral audio editing applications.

Editing

Rather than going into the particular functions of each and every one of the applications listed above, here I’ll aim to give an overview of some of the key benefits and editing options available. I’ll use a couple of examples along the way.

With audio represented visually in the form of a spectrogram, it’s easy to identify points which require attention. In the example below, a whistle occurred in the background during an interview. This is represented in the spectrogram by the brightly coloured line which curves upwards, indicating its rising pitch over its duration. In the second screenshot, the whistle has been greatly reduced in level while the dialogue itself has been retained. In this example, the Retouch tool in Acon Digital Acoustica was used, but a similar result can be achieved in most of the applications listed above.

Acon Acoustica Spectrogram showing background whistle

Acon Acoustica Spectrogram after whistle has been reduced in level

As well as attenuating or replacing parts of the frequency spectrum, sections can also be copied and pasted. This allows very specific selections to be duplicated elsewhere in the recording. You could, for example, copy just the sound of a tolling church bell and paste it into another part of the recording, without copying the background noise or other audio which was present during the original bell sound. Another use for the copy and paste functionality of spectral editors is to replace noises which occur at particular frequencies. For example, a low frequency knock caused by bumping a boom mic could be dealt with by making a frequency and time selection elsewhere where the audio is clean and then pasting this over the bump, thereby replacing that small section of the recording within the required frequency range.
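Conceptually, both operations act on a rectangle of the time and frequency grid. The sketch below shows the idea on a toy grid of six time frames by four frequency bins. It is a simplification under stated assumptions: the function names are hypothetical, real editors also support freehand selections, typically soften the edges of a selection, and convert the edited grid back into audio, none of which is shown here.

```python
def attenuate_region(spec, frames, bins, gain_db):
    """Scale every cell in a time-frequency rectangle by gain_db (negative = quieter)."""
    gain = 10 ** (gain_db / 20)
    for t in frames:
        for k in bins:
            spec[t][k] *= gain

def paste_region(spec, src_frames, dst_start, bins):
    """Copy the chosen bins of src_frames over the same bins starting at frame dst_start.

    Assumes the source and destination frames do not overlap.
    """
    for offset, t in enumerate(src_frames):
        for k in bins:
            spec[dst_start + offset][k] = spec[t][k]

# Toy spectrogram: 6 frames x 4 bins of complex values, all 1+0j to begin with.
spec = [[1 + 0j for _ in range(4)] for _ in range(6)]
spec[4][0] = 9 + 0j  # a low frequency "bump" in frame 4, bin 0

# Turn bins 2-3 of frames 0-1 down by 20 dB, leaving the lower bins alone.
attenuate_region(spec, range(0, 2), range(2, 4), -20.0)

# Replace the bump with the same bin from clean frame 1; other bins of frame 4 are untouched.
paste_region(spec, range(1, 2), 4, range(0, 1))
print(spec[0], spec[4])
```

The key point is that the paste only touches the selected bins, so whatever else was happening at other frequencies in the destination frames is left alone.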

Noise Reduction

One of the most notable benefits of spectral audio tools is noise reduction. As mentioned, specific selections can be removed, replaced or attenuated, but a range of tools is usually also provided which cover a variety of noise reduction tasks. Recently, we’ve seen huge advancements in dialogue noise reduction in particular, with a number of plug-ins now providing exceptional results at the turn of a dial, with no need to resort to spectral editing applications. However, every so often, a particularly tricky unwanted noise still comes along which even the latest dialogue noise reduction plug-ins can’t seem to reduce. I encountered this last week on some dialogue I was working on for a documentary. There was one particular section in which a high-pitched hum with harmonics was present. I tried several of the current best plug-ins and, while they were all able to pull out the other background noise, this particular sound evaded them for some reason. I sent the audio to iZotope RX and used the brush tool to select the sound in the spectrogram, followed by its harmonics. I was then able to use the Spectral Repair module to almost completely eliminate the problematic noise. Again, this could be accomplished using most of the current crop of spectrogram-based audio applications, but I had iZotope RX to hand and it worked well for this particular task.
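Harmonics of a steady hum sit at whole-number multiples of its lowest frequency, which is why they appear as evenly spaced horizontal lines on a linear scale. The sketch below lists where those lines would fall and which FFT bins a narrow selection around each would cover. The 3 kHz fundamental, 48 kHz sample rate, FFT size and 30 Hz selection width are made-up values for illustration, not measurements from the recording described above, and the function is hypothetical and unrelated to how RX makes its selections.

```python
def harmonic_bins(fundamental_hz, sample_rate, fft_size, width_hz=30.0):
    """List (centre frequency, bin range) for each harmonic below the Nyquist frequency."""
    bin_hz = sample_rate / fft_size
    nyquist = sample_rate / 2
    selections = []
    n = 1
    while n * fundamental_hz < nyquist:
        centre = n * fundamental_hz
        lo = max(0, int((centre - width_hz / 2) // bin_hz))
        hi = min(fft_size // 2, int((centre + width_hz / 2) // bin_hz) + 1)
        selections.append((centre, range(lo, hi)))
        n += 1
    return selections

selections = harmonic_bins(3000.0, 48000, 4096)
for centre, bins in selections:
    print(f"{centre:8.0f} Hz -> bins {bins.start} to {bins.stop - 1}")
```

Once the first line has been identified, the others can be found by counting upwards in equal steps, which is a handy check when the higher harmonics are faint.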

iZotope RX

Typically, a suite of noise reduction tools is included with a spectral audio application, and most don’t require the user to edit the spectrogram directly. Instead, algorithms are included for functions such as reduction of mic pops, wind noise, clipping, breaths, broadband noise, rustling, reverb and mouth clicks, to name just a few. In these cases, the spectrogram is useful to aid in the identification of such noises and audio can then be selectively processed as required, with the effects of the processing then being visible in the spectrogram. It goes without saying that you should always use your ears when working on any audio, but a visual guide can be of great assistance.

Spectral Separation

Some tools allow for different parts of a sound to be separated. For example, you may wish to split the tonal and noise components of a sound in order to reduce string noise in a guitar recording, or the sound of the pedals in a recording of a piano.

For some time now, we’ve been able to split mixed music into separate ‘stems’, for rebalancing. An example of this can be found in iZotope RX, where the Music Rebalance module allows for independent adjustment of the Vocal, Bass, Percussion and Other components of a track. The results are not perfect, and if you try to completely eliminate a vocal, for example, artefacts can be heard. For moderate rebalancing though, it’s a useful tool.

Steinberg Spectralayers goes several steps beyond this, providing the capability to split music into its constituent parts of Vocals, Drums, Guitar, Piano, Bass and Other. The results are impressive and, while still not absolutely perfect, they’re extremely good, and excellent for rebalancing, extracting stems for remixing or creating instrumental versions of mixed music. The Unmix functionality within Spectralayers also includes algorithms for the separation of multiple voices, and for separating speech from noise. In the screenshots below, we can see a recording of some dialogue which has some very prominent insect noise present. The noise can be seen in the spectrogram as the horizontal lines which extend from just below 6 kHz to just below 12 kHz.

Steinberg Spectralayers

After using the Noisy Speech function from the Unmix menu, the speech has automatically been separated from the noise. The components now have their own colours within the spectrogram, along with controls on the right-hand side for volume, mute, solo and polarity. Muting the noise gives a remarkably good result and it’s impressive how the technology has evolved and improved over the last few years.
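One way to picture what those per-layer controls do is as a small mixer which adds the layers back together. The sketch below is a generic illustration, not Spectralayers’ implementation: the function, the settings format and the sample values are all hypothetical, solo is left out for brevity, and it assumes the separated layers simply sum back to the original recording.

```python
def mix_layers(layers, settings):
    """Sum layers sample by sample, applying per-layer gain, mute and polarity."""
    length = len(next(iter(layers.values())))
    out = [0.0] * length
    for name, samples in layers.items():
        s = settings.get(name, {})
        if s.get("mute", False):
            continue  # a muted layer contributes nothing to the output
        gain = s.get("gain", 1.0) * (-1.0 if s.get("invert", False) else 1.0)
        for i, x in enumerate(samples):
            out[i] += gain * x
    return out

# Made-up sample values standing in for two separated layers.
layers = {"speech": [0.5, -0.25, 0.125], "noise": [0.25, 0.25, -0.5]}

original = mix_layers(layers, {})                            # everything on
speech_only = mix_layers(layers, {"noise": {"mute": True}})  # noise muted
noise_down = mix_layers(layers, {"noise": {"gain": 0.5}})    # noise at half level
print(original, speech_only, noise_down)
```

Lowering the noise layer a little, as opposed to muting it outright, is often a gentler option if the separation has left artefacts in the speech.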

Final Thoughts

Spectrograms have been around for decades and spectral editing has existed for over 20 years, but the functionality it provides continues to evolve and improve. Advancements in AI have meant that we now have tools within the software which allow us to carry out advanced noise reduction, separation and other processing which wasn’t previously possible.

ARA2 integration within DAWs means that many spectral editors can be launched and accessed from the DAW timeline, saving time on round-tripping between applications. Avid have been slow off the mark with ARA2 integration in Pro Tools but we saw the start of it in 2022 with Melodyne integration. Now, in a forthcoming update, ARA2 support will be extended to integrate iZotope’s RX Spectral Editor directly into Pro Tools, which will be a welcome addition for many users.

As a point of interest, there have been a number of songs released over the years which include artwork built into the audio which can be viewed on a spectrogram. An example of this, from Aphex Twin’s ‘Equation’, released in 1999, is shown below.

What are your favourite spectral audio editing applications? Let us know in the comments!