Production Expert


Using Higher Sample Rates For Pitch Shifting And Time Stretching Audio

One of the most hotly debated (and argued over) topics relating to digital audio is sample rate. Go on any audio forum or Facebook group and express an opinion on the matter and you will invariably spark a heated discussion, with everyone weighing in with their viewpoints. Some will quote the Nyquist-Shannon sampling theorem, arguing that since we can’t hear above 20kHz, a 44.1 or 48kHz sample rate is fine. Others will state that recording at higher rates such as 88.2kHz or above will help to reduce audible aliasing distortion. Someone will make the argument that high sample rates are a waste of CPU resources and disk space. You may even get comments from hi-fi enthusiasts who say that higher sample rates just sound better.

Fabfilter oversampling options

Digital audio is a complex matter and there’s more to sample rates than meets the eye (or should that be ear?). A couple of years ago we featured the excellent video which Dan Worrall produced for Fabfilter about this very topic in our article Higher Sample Rates For Recording And Mixing - When And Why You Should Use Them And When You Shouldn't.

Most DAWs provide the option to work at sample rates up to 192kHz and a few even offer 384kHz. Whatever your viewpoint on sample rates, there’s no denying that working at rates above 48kHz will be more of a draw on your CPU, and very large sessions with lots of tracks and plug-ins are likely to become impossible to run. However, there are some cases where working at higher sample rates can be worthwhile. Let’s consider two of them.

Time Stretching

If you need to time stretch any audio after recording, having more densely packed samples can greatly reduce the audible artifacts which arise as a side effect of the processing. One example is ADR processing. If you need to time compress and expand ADR recordings to fit the timing of the original performance, a 48kHz recording can start to sound a little grainy if time stretched too much. One thing which gives you the best chance of getting a good result in most cases is to work at 96kHz or maybe even 192kHz, if practical. Recording and editing ADR at these rates and then sample rate converting the processed audio to 48kHz for inclusion in the mix can give a much more natural sounding result which is free of noticeable processing artifacts.

Pro Tools Elastic Audio algorithms

For sound design, you may also want to selectively make certain recordings at high sample rates if you intend to time stretch those sounds for creative effect. There are very audible differences between the available time stretching tools and algorithms. For example, if using Elastic Audio in Pro Tools, the various algorithms give different results, depending on the type of content you’re working with. Experiment with the different Elastic Audio options, along with time stretching plug-ins, to find the one which gives the best result for you.

Pro Tools X-form pitch shift and time stretch plug-in

Pitch Shifting

Another case for sample rates above 48kHz is pitch shifting. Traditionally, slowing down recorded audio meant pitching it down as well. Now of course, the two can still be linked, or they can be separate. You can time stretch something without affecting the pitch or you can pitch shift something without affecting the duration. As we know from Nyquist, the highest frequency which can be represented by any digital system is half the sample rate. Pitch shifting high sample rate material down will of course shift previously ultrasonic frequencies down into the audible band, assuming your microphone was capable of capturing them in the first place. This can be of use creatively if you want to create an interesting effect for sound design.

The more extreme the pitch shift, the more you’ll potentially gain a benefit from having higher frequencies present in the original recording. If you combine pitch shifting and time stretching together, the cleaner results gained from working at high sample rates will be very evident. Try taking a 48kHz recording and slowing it down to half speed, dropping the pitch an octave. It can sound quite grainy. Taking it down by two whole octaves and to 25% speed will sound very rough indeed. Do the same thing with a 192kHz recording and most of the coarseness is gone, or at least greatly reduced compared with the 48kHz version.
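To put rough numbers on this, here is a minimal Python sketch of the arithmetic. It assumes a simple linked speed and pitch change, where every frequency is divided by the same factor, and it assumes the recording really does contain content all the way up to Nyquist, which depends on the microphone and the rest of the chain. The function names are made up for illustration.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2.0


def max_content_after_shift_hz(sample_rate_hz: float, octaves_down: float) -> float:
    """Upper limit of the content once the audio is pitched down.

    Each octave down halves every frequency, so the old Nyquist limit
    is divided by 2 ** octaves_down. This is an upper bound only: it
    assumes the source captured content right up to Nyquist.
    """
    return nyquist_hz(sample_rate_hz) / (2.0 ** octaves_down)


if __name__ == "__main__":
    for rate in (48_000, 96_000, 192_000):
        for octaves in (1, 2):
            top = max_content_after_shift_hz(rate, octaves)
            print(f"{rate} Hz source, {octaves} octave(s) down: "
                  f"content up to {top:.0f} Hz")
```

By this reckoning a 48kHz recording dropped two octaves has nothing above 6kHz, while a 192kHz recording dropped by the same amount could still have content up to 24kHz, provided it was captured in the first place.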

Other Reasons To Use High Sample Rates

There are a few other possible reasons to use sample rates above 48kHz. With a less than perfect analogue to digital converter, as might be found in older or inexpensive audio interfaces, the quality of the anti-aliasing filter may not be ideal. This filter is essentially a steep, high order low pass filter which should cut everything above Nyquist. An ideal anti-aliasing filter will pass everything in the audible band, start attenuating at some frequency above 20kHz and provide a complete cut by the Nyquist frequency, i.e. half the sample rate. The lower the sample rate, the steeper the anti-aliasing filter needs to be.
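As a rough illustration of why lower rates demand steeper filters, the sketch below works out how much room the filter has between the top of the audible band and Nyquist, expressed in octaves. The 20kHz passband edge is taken from the description above; the function name is hypothetical and real converter designs will differ in the details.

```python
import math


def transition_band_octaves(sample_rate_hz: float,
                            passband_edge_hz: float = 20_000.0) -> float:
    """Width, in octaves, of the gap between the passband edge and Nyquist.

    The filter has to go from passing everything to a full cut inside
    this gap, so a smaller value means a steeper filter is needed.
    """
    nyquist = sample_rate_hz / 2.0
    if nyquist <= passband_edge_hz:
        raise ValueError("Nyquist must be above the passband edge")
    return math.log2(nyquist / passband_edge_hz)


if __name__ == "__main__":
    for rate in (44_100, 48_000, 88_200, 96_000, 192_000):
        width = transition_band_octaves(rate)
        print(f"{rate} Hz: {width:.2f} octaves between 20 kHz and Nyquist")
```

At 44.1kHz the filter has only around a seventh of an octave to do its work, at 48kHz about a quarter of an octave, and at 96kHz well over a full octave.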

In practice, it’s difficult to manufacture a perfect filter at low cost. A filter which doesn’t reach complete attenuation by Nyquist will allow some aliasing to occur. Many suboptimal filter designs will cause ripple in the passband, deviations from a flat frequency response which find their way into the audible range. There are two solutions to this: build a better, more expensive filter, or record at higher sample rates so the ripple is shifted into the ultrasonic range. Higher sample rates allow for less steep filters which can also help. In practice, it’s likely that these issues will be sufficiently subtle that most people can’t hear them but it’s quite possible that a non-ideal analogue to digital converter could sound better at 96kHz than at 48kHz because of this very issue.

No discussion of high sample rates would be complete without also mentioning oversampling. If you incorporate saturation processing or fast attack compression or limiting into your mix, you’re likely to generate high frequency harmonics. These can fold back down into the audible range and sound nasty. Oversampling is a way to improve this by running the internal processing at higher sample rates than that of the host. In a 48kHz session, for example, you might choose to oversample at 2, 4, 8, 16 or even 32x the session sample rate in order to reduce aliasing. This is a preference which is selected within individual plug-ins. The Fabfilter dynamics plug-ins provide this option. Of course, the higher the oversampling rate, the more intensive it will be on your CPU.
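The fold-back described above can be worked out with a few lines of Python. This is only a sketch of the arithmetic for a single sine component, not of how any particular plug-in implements oversampling, and the function name and the 10kHz test tone are chosen purely for illustration.

```python
def alias_frequency_hz(frequency_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a component appears after sampling.

    Anything above Nyquist (half the sample rate) is mirrored back
    into the range from 0 to Nyquist.
    """
    folded = frequency_hz % sample_rate_hz
    if folded > sample_rate_hz / 2.0:
        folded = sample_rate_hz - folded
    return folded


if __name__ == "__main__":
    fundamental = 10_000  # hypothetical 10 kHz tone hitting a saturator
    for label, rate in (("48 kHz session", 48_000),
                        ("4x oversampled", 192_000)):
        for harmonic in (3, 5, 7):
            generated = fundamental * harmonic
            heard = alias_frequency_hz(generated, rate)
            print(f"{label}: harmonic at {generated} Hz "
                  f"appears at {heard} Hz")
```

At 48kHz, the third harmonic of a 10kHz tone (30kHz) lands at 18kHz and the fifth (50kHz) lands at 2kHz, both squarely in the audible band and unrelated to the original note. At 4x oversampling the same harmonics stay where they were generated, above 20kHz, where they can be filtered out before the signal is brought back down to the session rate.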

SRC Infinite Wave comparison

Downconverting To 44.1kHz Or 48kHz

Should you choose to work at any rate above 48kHz, you’ll no doubt need to create a lower sample rate version of the work at some point, either to continue working with the processed audio in a project or for final output. It’s worth considering that the quality of sample rate conversion provided by different pieces of software can vary a lot. There’s a handy tool on the website of Canadian mastering facility Infinite Wave. They’ve taken the time to measure the effectiveness of the conversion quality in various pieces of audio software, converting from 96kHz to 44.1kHz. Take a look at the results here.

As you’ll see, many popular DAWs cause some aliasing or other non-linear distortions, so choose your software carefully when sample rate converting!
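One reason converters differ is that some conversions are simply harder than others. The sketch below uses Python's standard fractions module to reduce the ratio between two rates to its simplest form. Conceptually, a rational resampler raises the rate by the numerator, low pass filters, then drops it by the denominator, and the quality of that filter is where much of the difference lies. This is a simplified view, the function name is hypothetical, and it says nothing about how any specific DAW performs the conversion.

```python
from fractions import Fraction


def resampling_ratio(source_rate_hz: int, target_rate_hz: int) -> Fraction:
    """Target rate divided by source rate, reduced to lowest terms."""
    return Fraction(target_rate_hz, source_rate_hz)


if __name__ == "__main__":
    conversions = (
        (96_000, 48_000),
        (88_200, 44_100),
        (96_000, 44_100),
        (192_000, 44_100),
    )
    for source, target in conversions:
        ratio = resampling_ratio(source, target)
        print(f"{source} Hz -> {target} Hz: "
              f"up by {ratio.numerator}, down by {ratio.denominator}")
```

Going from 96kHz to 48kHz is a clean 1:2 relationship, whereas 96kHz to 44.1kHz works out at 147:320, which is a far more demanding test of a converter.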

Main image courtesy of Vecteezy.com
