MA332 lecture notes, Fall 1998
Week 11 Tuesday

Topics: discrete Fourier transform  (dft)

Big picture
-----------

1) start with the continuous Fourier transform.
2) discretize in time
3) discretize in frequency
4) develop the Fast Fourier Transform

Superposition principle: any signal can be composed by
the superposition of sinusoids with various frequencies,
phases and amplitudes.

Spectral analysis: take a signal and break it down into
its component frequency components.

Applications

1) sound analysis, speech recognition
2) image recognition
3) many, many more...

Fourier transform
-----------------

The best example to keep in mind when you think about Fourier
transforms is the spectral analysis of sound.

Sound is our perception of changes in air pressure over time.
Microphones convert changes in air pressure into changes in
electrical charge or current.

If we record this signal for a period of time, we get a waveform.
Different sounds produce different sorts of waveforms (see the
Physics of Music example).

h(t) : air pressure as a function of time          (real)
H(f) : magnitude of component with frequency f       (complex)

Any function of time h(t) can be written as the sum of various
frequency components.  For a given frequency, we want to know
the magnitude and phase of that component in the signal.

    H(f) is a complex number that indicates the magnitude and phase of
    that component with frequency f in the signal.

    (f=1 indicates one cycle per unit of time, whatever the units of
    time are)

If H(1) = 1 and and all other H(f) = 0, then the corresponding
h(t) is exp (-2Pi i t), a sinusoid with frequency 1.

     (that negative sign is in there for convention, and
     corresponds to a 180 degree phase shift)

     polar interpretation of complex numbers: r exp (i theta)

If H(1) = i and and all other H(f) = 0, then the corresponding
h(t) is a sinusoid with frequency 1, with phase shifted by 90 degrees.

That suggests that in order to convert from H(f) to h(t), all
we have to do is add up the contributions from all the frequency
components.

If there are only two components, like H(1) = 1 and H(2) = 1,
then
	h(t) = exp (-2Pi i t) + exp (-2Pi i 2t)

	(check that that is really a function of t)

So h at some time t is the integral over all frequencies of
H times the sinusoid with the appropriate frequency (evaluated at t):

            inf            
  h(t) = int     H(f) exp (-2Pi i f t) df
           -inf

	(check that that is really a function of t)

	(look at the mutiplication of the complex number)

So if you give me the spectrum of a function, H(f), I can find
the function, h(t).

But how do we go the other way?  Interestingly, it is a virtually
identical transformation:

            inf            
  H(f) = int     h(t) exp (2Pi i f t) dt
           -inf

This transformation process is a Fourier transform.  H(f)
and h(t) should be thought of as two representations of the
same function, one in the time domain, the other in the
frequency domain.

Getting back to the sound example, one way to analyze sound
waveforms is to perform a Fourier transform.

Examples:

1) tuning fork produces a nearly perfect sinusoid, so the FT is a delta

2) musical instruments produce more complex waveforms, which our ear
finds more interesting; they are usually made up of a fundamental
frequency which is loudest, plus harmonics that are integer multiples
of the fundamental frequency, with amplitude that diminishes with
higher frequency.

3) our larynx produces sound similar to a reed instrument's, but we
use the resonance chambers in our mouths to amplify four frequencies
called formants.  Our perception of vowel sounds is based primarily
on the amplitudes of the first two formants.  Thus, by looking at
a spectral analysis of speech, you can identify vowels easily
(and consonants with some difficulty).

     (What do sibilants look like?)

     (What is white noise?)

Power
-----

The total power in a signal is the integral in the time domain
of the amplitude:

               -inf
    power = int     |h(t)|^2 dt
                inf

The cool thing is that we get the same result in the frequency
domain:

               -inf
    power = int     |H(f)|^2 df
                inf


Scaling and shifting
--------------------

With sound as a running example, consider the following relationships
between functions and their FT:

Fourier transformation is linear, so

1) multiplying a waveform by a constant multiplies its FT by a constant
   (amplification)

2) adding two waveforms adds their FTs (superposition)

3) What if we scale a function in time, speeding it up by a factor of a?

     h(at) <==> 1/|a| H(f/a)

It stretches the spectrum by a (making it sound higher), but
reduces the amplitude in order to conserve power.


Symmetry
--------

There are a number of symmetry relations between h and H, but
the one most relevant to our example is that if h(t) is real
(which it is) then H(f) is (almost) symmetric H(-f) = H(f)'
(complex conjugate).

What that means for sound analysis is that a negative frequency
is just a phase shift of a positive frequency, so the amplitude
has to be the same in both.

Generally when we do an FT of a waveform, we only look at half
the resulting data (the positive frequencies).

Convolution
-----------

One of the more useful features of the Fourier transform is
that it provides a nice way to compute convolutions.

The convolution of two functions is

                       -inf
    g * h  = C(t) = int     g(x) h(t - x) dx
                       inf

(Where * here means convolution)

This is incredibly useful in probability and statistics, but
a pain to compute.  The neat thing is that if

  g <==> G  and h <==> H, then g*h <==> GH


Discrete waveforms
------------------

Let's say you want to make a CD.  You have to make a digital
recording, which consists of discrete samples of h(t) at
(usually evenly-spaced) time intervals.

 t_n = n delt
 h_n = h (t_n), where delt is the time interval and n is integral

The sampling rate is 1/delt = number of samples per second.

For a CD, the rate is 44 KHz, or 44,000 samples per second.


Looking at a sequence of samples, you lose the ability to
discern high-frequency components.  You need at least two points
per cycle to resolve a frequency component, which leads to
the Nyquist critical frequency, fc = 1 / 2delt.

For CDs, that's 22KHz, which is just about the upper bound of
human hearing.

If the wavelength is less than the sample interval (delt), it
comes out looking like a lower frequency signal.  This
masquerade is called aliasing, and it is an inescapable problem
that occurs whenever you approximate a continuous signal with
discrete samples.

When we take the Fourier Transform of the discretized data,
we get no information about frequencies outside the range
-fc to fc.

Discrete frequency domain
-------------------------

Given N data points, we can evaluate H(f) for any values of
f we want, except

1) there is no point evaluation an f outside -fc to fc
2) there is no point in evaluating more than N different
   frequencies, since you can't create information

Sooo, the usual thing to do is evaluate N evenly spaced
frequencies from -fc to fc.

 f_n = n / N delt      as n goes from -N/2 to N/2

(Let's assume than N is even).

We can replace the integral over h(t) with a summation over h_n:

            inf            
 H(f_n) = int     h(t) exp (2Pi i f_n t) dt
           -inf

            N-1
       ~= sum    h_k  exp (2Pi i f_n t_k) delt
            k=0

            N-1
        = sum    h_k  exp (2Pi i k n/N) delt
            k=0

Based on this, we define the Discrete Fourier Transform as

 H_n = H(f_n) / delt

            N-1
        = sum    h_k  exp (2Pi i k n/N)
            k=0

The point of defining away the delt term is that the DFT is now
a mapping from a sequence of complex numbers onto a new sequence
of complex numbers, with no units involved.

To interpret the H_n, you have to use the units to calculate
the frequencies.

Instead of using values of n from -N/2 to N/2, it is conventional
to go from 0 to N.

Then the H_0 is the amplitude of the zero frequency component
(which is the time average of the signal).

From 1 to N/2, we get the positive frequencies up to fc.

From N/2 to N-1, we get the negative frequencies from -fc up to 0.

Example, for N=4:

index                            -2    -1     0
index    -2     -1    0     1     2     3

freq     -fc  -fc/2   0    fc/2   fc   -fc/2  0
          fc                     -fc

Shifting the indices of interest allows us to write the inverse DFT

          1    N-1
 h_k    = -  sum    H_n  exp (-2Pi i k n/N)
          N    n=0

Notice the similarity to the FT -- the difference between DFT and
iDFT is a factor of 1/N in front and a negative sign in the exponent.

That suggests that we could use the same program to calculate either.

Next time: Fast Fourier Transform!