Author Topic: Half zero FFT (Read 4068 times)

blueskull · « **on:** January 05, 2016, 04:27:06 am »

Is there any way to accelerate the DFT computation on a sequence that half of it are zeroes? Like x(n), where x(n~N/2-1) are valid, and x(N/2~N-1) are zeroes.
The input sequence is real, so I know I can split it into a N/2 complex sequence, but I am looking for an algorithm that further reduces it, since we already have half of the sequence to be zero.

coppice · « **Reply #1 on:** January 11, 2016, 09:22:32 am »

I don't know a generic way to speed up a single FFT of real data, but assuming you are performing a series of them over time, there is a simple trick. Try Googling for twofft.c. If you put only real data through an FFT you get a Hermitian matrix result. If you put only imaginary data through an FFT you get the corollary Hermitian matrix as the result. This means you can put two real data sets through one FFT, and easily split up the two result sets at the end. Every FFT operation processes two data sets, which isn't a bad gain in performance.

T3sl4co1l · « **Reply #2 on:** January 11, 2016, 11:04:31 am »

Isn't it just equivalent to a decimated (remove every intentional zero, i.e., every other entry) FT, scaled up and patched over?

Tim

Marco · « **Reply #3 on:** January 12, 2016, 01:40:30 am »

Quote from: T3sl4co1l on January 11, 2016, 11:04:31 am

Isn't it just equivalent to a decimated (remove every intentional zero, i.e., every other entry) FT, scaled up and patched over?

Zero padding in time/freq domain is equivalent to (circular) sinc interpolation in the freq/time domain AFAIK.

I'm not sure there are large savings to be had, you can obviously save on a few MADs on the first level of butterflies, but I think that's it.

chris_leyson · « **Reply #4 on:** January 12, 2016, 08:52:28 am »

As Marco just pointed out you can only save a few multiply adds on some of the butterflies in the early stages of the FFT. I had the same problem using 2D FFTs with overlap-save image convolution, There is just no easy way to untangle the signal flow graphs and the extra overhead needed just to remove a few butterfly calculations would probably slow things down.

You might get some savings using the PFA or Prime Factor Algorithm as there are no interstage twiddle factors but the indexing is modulo depending on the prime factors. If you don't have hardware address generators available then the modulo addressing would require extra overhead.

I think your best bet, because your data is real, is to use the double length algorithm. Found a bit of Matlab code I'd written a while back, its only half completed because it only covers FFT lengths that are modulo 4.

% Function CFFT_Double_Length(R,N)
% Calculates FFT of real signal R(1:N) using length N/2 FFT
% N must be modulo 4 as only the 2nd case from Smith & Smith is used.
% Return complex result z(1:N)
% Reference: Handbook of Real-Time Fast Fourier Transforms
% Winthrop W. Smith and Joanne M. Smith

function z = CFFT_Double_Length(R,N);

N2 = N/2;
N4 = N/4;

% Pack real signal R(1:N) into complex array a(1:N/2)
for i = 0:N2-1
a(i+1) = complex(R(2*i+1),R(2*i+2));
end

% Calculate length N/2 FFT
b = fft(a);

% Separate outputs into real and imag parts
for i = 1:N4-1
rp(i+1) = 0.5*(real(b(i+1)) + real(b(N2-i+1)));
rp(N2-i+1) = rp(i+1);
rm(i+1) = 0.5*(real(b(i+1)) - real(b(N2-i+1)));
rm(N2-i+1) = -rm(i+1);
ip(i+1) = 0.5*(imag(b(i+1)) + imag(b(N2-i+1)));
ip(N2-i+1) = ip(i+1);
im(i+1) = 0.5*(imag(b(i+1)) - imag(b(N2-i+1)));
im(N2-i+1) = -im(i+1);
end

rp(1) = real(b(1));
ip(1) = imag(b(1));
rm(1) = 0.0;
im(1) = 0.0;
rm(N4+1) = 0.0;
im(N4+1) = 0.0;
rp(N4+1) = real(b(N4+1));
ip(N4+1) = imag(b(N4+1));

% Compute FFT output
for i = 1:N2-1
ar(i+1) = rp(i+1) + cos(i*pi/N2)*ip(i+1) - sin(i*pi/N2)*rm(i+1);
ar(N-i+1) = ar(i+1);
ai(i+1) = im(i+1) - cos(i*pi/N2)*rm(i+1) - sin(i*pi/N2)*ip(i+1);
ai(N-i+1) = -ai(i+1);
end
ar(1) = rp(1) + ip(1);
ai(1) = im(1) - rm(1);
ar(N2+1) = rp(1) - ip(1);
ai(N2+1) = 0.0;

z = complex(ar,ai);

Marco · « **Reply #5 on:** January 13, 2016, 02:28:22 am »

How about doing a N/2 FFT of the non zero padded sequence and then doing SDFT based Sinc interpolation to get the remaining coefficients, would that be faster? (That would be 2 N/2 FFTs and 1 N/2 iFFT.)


EEVblog Main Site	EEVblog on Youtube	EEVblog on Twitter	EEVblog on Facebook	EEVblog on Odysee

EEVblog Electronics Community Forum

Author Topic: Half zero FFT (Read 4068 times)

blueskull

Half zero FFT

coppice

Re: Half zero FFT

T3sl4co1l

Re: Half zero FFT

Marco

Re: Half zero FFT

chris_leyson

Re: Half zero FFT

Marco

Re: Half zero FFT

Share me