I'm glad to hear that you know what you're doing with FFTs. I look
forward to seeing you demonstrate it someday, when your code is ready.
Sorry that you're offended by my posting links to code that actually
answers the OP's question, and references to real performance numbers,
which happen to be on our site.
Best of luck!
Steven G. Johnson
It was a surprise to me that the OP even asked when you find hordes of
implementations on the web. Revisiting his post, he didn't care if
the solution was C/C++. Having said that, it appears to me that it's
incumbent upon you to toot your own horn on a continuum. You'll take
issue with the standard algorithms used to re-express the DCT as a
real FFT (Numerical Recipes, FFTPACK, etc.) of the same size, calling
them unstable, and then promote your own. With respect to the
algorithms, do you have valid points? Indeed. If memory serves, FFTPACK
has O(sqrt(n)) errors for relatively large data sets. (Again,
it's been a while, but if I recall correctly that's a true statement.)
It may be that most folks use a relatively small dataset and in that
regard it's a minor issue.
Here's my issue: I'm not arguing all that. I'm interested in a C++
solution. Period!!
I might add that my knowledge of C++ is sub-par. But that's where I'm
at. I'm at the language level. void main's and mallocs don't cut it
for me. Furthermore, I don't care if malloc saves me 5 microseconds.
Bottom line: if you want to promote your 'stuff', promote it. Some of
us aren't interested in a C solution, and I'm willing to bet my bottom
dollar that when the dust settles (algorithm details aside) there's a
comparable C++ solution out there.
Am I there yet? NO.
At first I developed a template class that's fairly canonical and
general. It uses a technique where the 'twiddle' factors are stored
in a precomputed bit-reversed table: exp[0], exp[pi*i/2], exp[pi*i/4],
exp[3*pi*i/4], exp[pi*i/8] ...
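
For anyone following along, here's a rough sketch of how such a table
could be built. It is not the template class mentioned above; the names
reverse_bits and make_twiddle_table are made up for illustration, and
it assumes the table size is a power of two and that entry j holds
exp(i*pi*bitrev(j)/count), which reproduces the sequence listed above.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Reverse the lowest `bits` bits of `x` (hypothetical helper).
std::size_t reverse_bits(std::size_t x, unsigned bits) {
    std::size_t r = 0;
    for (unsigned b = 0; b < bits; ++b) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

// Build `count` twiddle factors in bit-reversed order.
// Assumption: entry j is exp(i * pi * bitrev(j) / count),
// and `count` is a power of two.
std::vector<std::complex<double>> make_twiddle_table(std::size_t count) {
    assert(count != 0 && (count & (count - 1)) == 0);

    // Number of bits needed to index `count` entries.
    unsigned bits = 0;
    while ((std::size_t{1} << bits) < count) ++bits;

    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> table(count);
    for (std::size_t j = 0; j < count; ++j) {
        const double angle = pi * static_cast<double>(reverse_bits(j, bits))
                                / static_cast<double>(count);
        table[j] = std::polar(1.0, angle);  // unit magnitude, given phase
    }
    return table;
}
```

One thing this ordering buys you: a smaller table is a prefix of a
larger one, so a single table sized for the longest transform can serve
the shorter ones too.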
I could go on, but in the interest of time I won't. I'm in a state of
flux right now, in that there are more pressing issues. I think I got
sidetracked because I needed to finalize a prediction architecture. In
essence, for a broadband (stochastic) signal with interference from a
narrowband (periodic) source, I needed to work through the
prediction architecture such that the adaptive filter attempts to find
the correlation between d(k) and y(k).
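
To make the adaptive-filter idea concrete, here's a minimal sketch. The
update rule isn't stated above, so this assumes a plain LMS update, with
d(k) taken as the desired signal and y(k) as the filter output. The
class name LmsFilter, the reference input x, and the step size mu are
all hypothetical; treat it as an illustration, not the actual
architecture.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal LMS adaptive FIR filter (an assumed update rule, for illustration).
class LmsFilter {
public:
    LmsFilter(std::size_t taps, double mu)
        : w_(taps, 0.0), x_(taps, 0.0), mu_(mu) {
        assert(taps >= 1);
    }

    // Feed one reference sample x and one desired sample d.
    // Returns the filter output y(k) computed before the weight update.
    double step(double x, double d) {
        // Shift the delay line and insert the newest sample at the front.
        for (std::size_t i = x_.size() - 1; i > 0; --i) x_[i] = x_[i - 1];
        x_[0] = x;

        // y(k) = sum of w[i] * x(k - i)
        double y = 0.0;
        for (std::size_t i = 0; i < w_.size(); ++i) y += w_[i] * x_[i];

        // e(k) = d(k) - y(k); nudge the weights along e(k) * x(k - i).
        const double e = d - y;
        for (std::size_t i = 0; i < w_.size(); ++i) w_[i] += mu_ * e * x_[i];
        return y;
    }

    const std::vector<double>& weights() const { return w_; }

private:
    std::vector<double> w_;  // filter weights
    std::vector<double> x_;  // delay line of recent reference samples
    double mu_;              // step size
};
```

The weights only grow where the error is correlated with the reference
input, so over time y(k) tracks whatever part of d(k) is predictable
from x. The step size mu trades convergence speed against stability and
would need tuning for real data.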
Of course, I'm not here to re-invent the wheel. For a general purpose
FFT, initially I ran with what's here:
http://www.relisoft.com/Science/Physics/fft.html
Today I've got a vendor library for handling the FFTs. It's pure C and
it's got the right 'flags' (if memory serves) to take advantage of the
Altivec engine on my cards. Admittedly, I've used your implementation.
Of course I don't recall what the numbers looked like but it's evident
I've - literally - been all over the place. Come to think of it, I
might bounce it (your FFTW) against the 'gold standard' ( a scientific
library from a vendor ) in my line of business.
In any event (algorithm details aside), I haven't had a chance to
solidify my implementation, but hopefully, I'll be able to continue/get
it together real soon. Again part of that is my ignorance of C++ but
...
Having said that, I think I've been way off topic since at least 2 or 3
posts ago.