MA332 lecture notes, Fall 1998 Week 11 Tuesday Topics: discrete Fourier transform (dft) Big picture ----------- 1) start with the continuous Fourier transform. 2) discretize in time 3) discretize in frequency 4) develop the Fast Fourier Transform Superposition principle: any signal can be composed by the superposition of sinusoids with various frequencies, phases and amplitudes. Spectral analysis: take a signal and break it down into its component frequency components. Applications 1) sound analysis, speech recognition 2) image recognition 3) many, many more... Fourier transform ----------------- The best example to keep in mind when you think about Fourier transforms is the spectral analysis of sound. Sound is our perception of changes in air pressure over time. Microphones convert changes in air pressure into changes in electrical charge or current. If we record this signal for a period of time, we get a waveform. Different sounds produce different sorts of waveforms (see the Physics of Music example). h(t) : air pressure as a function of time (real) H(f) : magnitude of component with frequency f (complex) Any function of time h(t) can be written as the sum of various frequency components. For a given frequency, we want to know the magnitude and phase of that component in the signal. H(f) is a complex number that indicates the magnitude and phase of that component with frequency f in the signal. (f=1 indicates one cycle per unit of time, whatever the units of time are) If H(1) = 1 and and all other H(f) = 0, then the corresponding h(t) is exp (-2Pi i t), a sinusoid with frequency 1. (that negative sign is in there for convention, and corresponds to a 180 degree phase shift) polar interpretation of complex numbers: r exp (i theta) If H(1) = i and and all other H(f) = 0, then the corresponding h(t) is a sinusoid with frequency 1, with phase shifted by 90 degrees. That suggests that in order to convert from H(f) to h(t), all we have to do is add up the contributions from all the frequency components. If there are only two components, like H(1) = 1 and H(2) = 1, then h(t) = exp (-2Pi i t) + exp (-2Pi i 2t) (check that that is really a function of t) So h at some time t is the integral over all frequencies of H times the sinusoid with the appropriate frequency (evaluated at t): inf h(t) = int H(f) exp (-2Pi i f t) df -inf (check that that is really a function of t) (look at the mutiplication of the complex number) So if you give me the spectrum of a function, H(f), I can find the function, h(t). But how do we go the other way? Interestingly, it is a virtually identical transformation: inf H(f) = int h(t) exp (2Pi i f t) dt -inf This transformation process is a Fourier transform. H(f) and h(t) should be thought of as two representations of the same function, one in the time domain, the other in the frequency domain. Getting back to the sound example, one way to analyze sound waveforms is to perform a Fourier transform. Examples: 1) tuning fork produces a nearly perfect sinusoid, so the FT is a delta 2) musical instruments produce more complex waveforms, which our ear finds more interesting; they are usually made up of a fundamental frequency which is loudest, plus harmonics that are integer multiples of the fundamental frequency, with amplitude that diminishes with higher frequency. 3) our larynx produces sound similar to a reed instrument's, but we use the resonance chambers in our mouths to amplify four frequencies called formants. Our perception of vowel sounds is based primarily on the amplitudes of the first two formants. Thus, by looking at a spectral analysis of speech, you can identify vowels easily (and consonants with some difficulty). (What do sibilants look like?) (What is white noise?) Power ----- The total power in a signal is the integral in the time domain of the amplitude: -inf power = int |h(t)|^2 dt inf The cool thing is that we get the same result in the frequency domain: -inf power = int |H(f)|^2 df inf Scaling and shifting -------------------- With sound as a running example, consider the following relationships between functions and their FT: Fourier transformation is linear, so 1) multiplying a waveform by a constant multiplies its FT by a constant (amplification) 2) adding two waveforms adds their FTs (superposition) 3) What if we scale a function in time, speeding it up by a factor of a? h(at) <==> 1/|a| H(f/a) It stretches the spectrum by a (making it sound higher), but reduces the amplitude in order to conserve power. Symmetry -------- There are a number of symmetry relations between h and H, but the one most relevant to our example is that if h(t) is real (which it is) then H(f) is (almost) symmetric H(-f) = H(f)' (complex conjugate). What that means for sound analysis is that a negative frequency is just a phase shift of a positive frequency, so the amplitude has to be the same in both. Generally when we do an FT of a waveform, we only look at half the resulting data (the positive frequencies). Convolution ----------- One of the more useful features of the Fourier transform is that it provides a nice way to compute convolutions. The convolution of two functions is -inf g * h = C(t) = int g(x) h(t - x) dx inf (Where * here means convolution) This is incredibly useful in probability and statistics, but a pain to compute. The neat thing is that if g <==> G and h <==> H, then g*h <==> GH Discrete waveforms ------------------ Let's say you want to make a CD. You have to make a digital recording, which consists of discrete samples of h(t) at (usually evenly-spaced) time intervals. t_n = n delt h_n = h (t_n), where delt is the time interval and n is integral The sampling rate is 1/delt = number of samples per second. For a CD, the rate is 44 KHz, or 44,000 samples per second. Looking at a sequence of samples, you lose the ability to discern high-frequency components. You need at least two points per cycle to resolve a frequency component, which leads to the Nyquist critical frequency, fc = 1 / 2delt. For CDs, that's 22KHz, which is just about the upper bound of human hearing. If the wavelength is less than the sample interval (delt), it comes out looking like a lower frequency signal. This masquerade is called aliasing, and it is an inescapable problem that occurs whenever you approximate a continuous signal with discrete samples. When we take the Fourier Transform of the discretized data, we get no information about frequencies outside the range -fc to fc. Discrete frequency domain ------------------------- Given N data points, we can evaluate H(f) for any values of f we want, except 1) there is no point evaluation an f outside -fc to fc 2) there is no point in evaluating more than N different frequencies, since you can't create information Sooo, the usual thing to do is evaluate N evenly spaced frequencies from -fc to fc. f_n = n / N delt as n goes from -N/2 to N/2 (Let's assume than N is even). We can replace the integral over h(t) with a summation over h_n: inf H(f_n) = int h(t) exp (2Pi i f_n t) dt -inf N-1 ~= sum h_k exp (2Pi i f_n t_k) delt k=0 N-1 = sum h_k exp (2Pi i k n/N) delt k=0 Based on this, we define the Discrete Fourier Transform as H_n = H(f_n) / delt N-1 = sum h_k exp (2Pi i k n/N) k=0 The point of defining away the delt term is that the DFT is now a mapping from a sequence of complex numbers onto a new sequence of complex numbers, with no units involved. To interpret the H_n, you have to use the units to calculate the frequencies. Instead of using values of n from -N/2 to N/2, it is conventional to go from 0 to N. Then the H_0 is the amplitude of the zero frequency component (which is the time average of the signal). From 1 to N/2, we get the positive frequencies up to fc. From N/2 to N-1, we get the negative frequencies from -fc up to 0. Example, for N=4: index -2 -1 0 index -2 -1 0 1 2 3 freq -fc -fc/2 0 fc/2 fc -fc/2 0 fc -fc Shifting the indices of interest allows us to write the inverse DFT 1 N-1 h_k = - sum H_n exp (-2Pi i k n/N) N n=0 Notice the similarity to the FT -- the difference between DFT and iDFT is a factor of 1/N in front and a negative sign in the exponent. That suggests that we could use the same program to calculate either. Next time: Fast Fourier Transform!