Fast Fourier TransformPosted: January 24, 2011
I used every optimization trick I could think of, but couldn’t get the FFT routines fast enough to do real-time 3D sound. A 1.125 second clip, 49612 samples at 44.1kHz, takes 3.9 seconds with integer FFT, and 5.5 seconds with floating point.
That’s surprising. My HTC Tattoo has a 528 MHz CPU, so the integer conversion is taking about 2 billion clock cycles. The HRTF converts 128 samples at a time, and calculates the left and right channels separately, so that’s about 2.7 million clock cycles per call.
Each HRTF call does an FFT-256 and inverse FFT-256, plus some overhead. FFT runs in O(NlogN), so the inner loop must be taking about 650 clock cycles. Honestly, that’s a long time for code that consists of four array reads, four array writes, eight integer adds, four integer multiplies, and a comparison. Around 30 clock cycles per command? Crazy.
Potentially the code could be sped up by optimizing for FFT-256 (instead of generic FFT) and unrolling some loops. But that wouldn’t get the necessary four-fold speedup.
The other option is to write the FFT in native code. That may be fast enough, but it wouldn’t be portable across handsets. Seriously, why isn’t there an FFT class in the standard Java distribution?
But all is not lost. It turns out that 3D sound isn’t that effective at distinguishing front/back and up/down anyway. Maybe I’ve got strangely-shaped ears or crappy headphones, but I can only guess the direction slightly better than chance. But left/right is easy.
Your head is about 17cm across, and sound covers that distance in about half a millisecond. If you delay the sound coming into your left ear by that much, then the perceived direction is clearly from the right. So much so that you can’t actually hear the sound in your left ear, even though it’s playing at the same volume in both ears.
So left/right sound works really well, and it doesn’t require a fourier transform. You can’t distinguish between front/back, but you could overcome that with an interactive system that updates the left/right phase as you turn your head. That gets you 90 percent of the benefit, and the phone’s CPU would barely notice.