bjorg: Why EQ Is Done In the Time Domain

Wednesday, August 8, 2012

Why EQ Is Done In the Time Domain

In my last post, I discussed how various audio processing may be best done in the frequency or time domain. Specifically, I suggested that EQ, which is a filter that alters the frequency balance of a signal, is best done in the time domain, not the frequency domain. (See my next post if you want to learn how to implement a time-domain filter.)

If this seems counter intuitive to you, rest assured you are not alone. I've been following the "audio" and "FFT" tags (among others) on stack overflow and it's clear that many people attempt to implement EQs in the frequency domain, only to find that they run into a variety of problems.

Frequency Domain Filters

Let's say you want to eliminate or reduce high frequencies from your signal. This is called a "low-pass" filter, or, less commonly, a "high-cut" filter. In the frequency domain, high frequencies get "sorted" into designated "bins", where you can manipulate them or even set them to zero. This seems like an ideal way to do low-pass filtering, but lets explore the process to see why it might not work out so well.

Our first attempt at a low-pass filter, implemented with the FFT might look something like this:

loop on audio input
if enough audio is received, perform FFT, which gives us audio in the frequency domain

in frequency domain, perform manipulations we want. In the case of eliminating high frequencies, we set the bins representing high frequencies to 0.
perform inverse FFT, to get audio back in time domain
output that chunk of audio

But there are quite a few problems with that approach:

We must wait for a chunk of audio before we can even begin processing, which means that we will incur latency in our processing. The higher quality filter we want, the more audio we need to wait for. If the input buffer size does not match the FFT size, extra buffering needs to be done.
The FFT, though efficient compared to the DFT (which is the FFT without the "fast" part), performs worse than linear time, and we need to do both the FFT and it's inverse, which is computationally similar. EQing with the FFT is therefore generally very inefficient compared to comparable time-domain filters.
Because our output chunk has been processed in the frequency domain independent of samples in neighboring chunks, the audio in neighboring chunks may not be continuous. One solution is to process the entire file as one chunk (which only works for offline, rather than real-time processing, and is computationally expensive). The better solution is the OLA or Overlap Add method but this involves complexity that many people miss when implementing a filter this way.
Filters implemented via FFT, as well as time-domain filters implemented via IFFT, often do not perform the way people expect. For example, many people expect that if they set all values in bins above a certain frequency to 0, then all frequencies above the given frequency will be eliminated. This is not the case. Instead, frequency responses at the bin values will be 0, but the frequency response between those values is free to fluctuate -- and it does fluctuate, often greatly. This fluctuation is called "ripple." There are techniques for reducing ripple but they are complex, and they don't eliminate ripple. Note that, in general, frequencies across the entire spectrum are subject to ripple, so even just manipulating a small frequency band many create ripple across the entire frequency spectrum.
FFT filters suffer from so-called "pre-echo", where the sounds can be heard before the main sound hits. In and of itself, this isn't really a problem, but sounds are "smeared" so badly by many designs, that many in the audio world feel that these filters can effect the impact of transients and stereo imaging if not implemented and used correctly.

So it's clear that FFT filters may not be right, or if they are, they involve much more complexity than many people first realize.

As a side note, one case where it might be worth all that work is a special case of so-called FIR filters (also sometimes called "Linear phase" filters). These are used sometimes in audio production and in other cases. In audio, they are usually used only in mastering because of their high latency and computational cost, but even then, many engineers don't like them (while others swear by them). FIR filters are best implemented in the time domain, as well, until the number of "taps"in the filter becomes enormous, which it sometimes does, and it actually becomes more efficient to implement using an FFT with OLA. FIR filters suffer from many of the problems mentioned above including pre-echo, high computational cost and latency, but they do have some acoustical properties that make them desirable in some applications.

Time Domain Filters

Let's try removing high frequencies in the time domain instead. In the time domain, high frequencies are represented by the parts of the signal that change quickly, and low frequencies are represented as the parts that change slowly. One simple way to remove high frequencies, then, would be to use a moving average filter:

y(n) = { x(n) + x(n-1) + .... + x(n-M) } / (M+1)

where x(i) is your input sample at time i, and y(i) is your output sample at time i. No FFT required for that (This is not the best filter for removing high frequencies -- in fact we can do WAY better -- but it is my favorite way to illustrate the point. The moving average filter is not uncommon in economics, image processing and other fields partly for this reason.). Several advantages are immediately obvious, and some are not so obvious:

Each input sample can be processed one at a time to produce one output sample without having to chunk or wait for more audio. Therefore, there are also no continuity issues and minimal latency.
It is extremely efficient, with only a few multiplies, adds and memory stores/retrievals required per sample.
These filters can be designed to closely mimic analog filters.

A major disadvantage is that it is not immediately obvious how to design a high-quality filter in the time domain. In fact, it can take some serious math to do so. It's also worth noting that many time-domain filters, like frequency domain filters, also suffer from ripple, but for many design methods, this ripple is well defined and can be limited in various ways.

In the end, the general rule is that for a given performance, you can get much better results with the time-domain than the frequency domain.

8 comments:

fliesSeptember 28, 2012 at 9:08 AM
Great article. One additional reason not to use FFT for filtering is that, especially for short windows/low latency, the IFFT output sounds weird and jangly - even when no transformation is done to the FFT data. (Longer windows attenuate the jangles but will smear the sound.) For speech, this may not be a big deal, but for music it's usually a very ugly sound.
ReplyDelete
Replies
mathmaniacMarch 17, 2013 at 8:57 PM
i was wondering how we can remove frequencies in between, ie implement a band stop filter using your time domain filtering approach. Could you explain that point? Thanks!
ReplyDelete
Replies
Eric SDecember 29, 2013 at 11:52 PM
I'm a little confused by what you've said here. You make it sound like frequency domain filters and time domain filters are two totally different things. You present moving time averages as though it's somehow completely different than a FIR filter, but it's actually just a specific FIR filter where all the taps are equally weighted. The time and frequency domains are just two perspectives on the same thing after all.
ReplyDelete
Replies

Add comment

Note: Only a member of this blog may post a comment.

bjorg

Wednesday, August 8, 2012

Why EQ Is Done In the Time Domain

Frequency Domain Filters

Time Domain Filters

8 comments:

About Me

Blog Archive

Sites

Followers