An Intuitive Explanation of the Discrete Fourier Transform

anEvery time I thought “Now I understand Fourier Transform,” I was wrong.

Jean-Baptiste Joseph Fourier, 1768 – 1830. (Image source: Wikipedia)

Doing the Fourier Transform of a function is just seeing it from “another” point of view. The “usual” view of a function is in the standard basis \{e_1, \cdots, e_n\}. For example, f can be seen as a vector (in the basis given by the elements in the domain) whose coordinates are the evaluations of f on the elements in the domain. It can also be seen as a polynomial (in the monomial basis) whose coefficients are these evaluations. Let us call this vector u.

The same function can also be seen from the “Fourier basis”, which is just another orthogonal basis, formed by the basis vectors \{v_t\}, t. The tth coordinate in the new basis will be given by inner product between u and the tth basis vector v_t. We call these inner products the Fourier coefficients. The Discrete Fourier Transform matrix (the DFT matrix) “projects” a function from the standard basis to the Fourier basis in the usual sense of projection: taking the inner product along a given direction.

In this post, I am going to use elementary group theoretic notions, polynomials, matrices, and vectors. The ideas in this post will be similar to this Wikipedia article on Discrete Fourier Transform.

Please let me know in the comments if you find any error or confusion.

Roots of Unity. Let M_p be the the multiplicative group of pth roots of unity. That is, M_p=\{1, \omega_p, \omega_p^2, \cdots, \omega_p^{p-1} \}. \omega_p is a primitive pth root if the smallest natural exponent t that makes \omega_p^t=1 is p-1. The root \omega_p is principal if 1 + \omega_p + \omega_p^2 + \omega_p^{p-1} = 0. In what follows, we assume that \omega_p is a principal and primitive pth root of unity.

Let F be a field containing M_p. For example, F can be the set of complex numbers and \omega_p = e^{2\pi i/p} where i is the imaginary unit. It is easy to check that the roots \{\omega_p^t\}_{t=0}^{p-1} appear on the unit circle of radius 1 around the origin.

Representing Functions as Polynomials or Vectors. Let Z_p be the set of integers \{0, 1, \cdots, p-1\}. To identify a function f : Z_p  \rightarrow F, it suffices to write down its evaluations on all t\in Z_p. Thus we can treat f as a p-dimensional vector, whose tth entry is just f(t) for t \in Z_p.  This shows that f \in F^p.

We can also represent the function f (from the preceding paragraph) as a polynomial of degree at most p-1, the representation being f \mapsto \sum_{t \in Z_p}{f(t) x^t}. This shows f \in R=F[x] / (x^p - 1), where F[x] is the ring of polynomials with coefficients in F and indeterminate x, and  the ring R contains polynomials with degree at most p-1. Why? Because when we take f(x) \bmod (x^p-1), we set every x^p=1 in f(x), turning every x^{p+t} into x^t. Neat. (Why is F[x] a ring?)

The Chinese Remainder Theorem. The good news is, x^p-1 factors completely in F because by assumption, F contains all pth roots of unity. This means, x^p-1=\prod_{t\in Z_p}{(x-\omega_p^t)}. Since the factors (x-\omega_p^t) are relatively prime to each other, by the Chinese Remainder Theorem (CRT), there is a bijection T : R \rightarrow R which maps f(x) \in R to a p-tuple whose tth entry is f(x) \bmod (x - \omega_p^t).

Essentially, the CRT says that “anything” expressed “with resptct to” x^p-1 can be equivalently expressed “with respect to” all individual factors x- \omega_p^t, where

  • “anything” is a polynomial f \in R
  • expressing f(x) “with respect to” g(x) means computing f(x) \bmod g(x).

From Coefficients to Evaluations. Let us write

\hat{f}(\omega_p^t) := f(x) \bmod (x - \omega_p^t) .

By Euclid’s division algorithm, we can express

f(x) = a(x) (x - \omega_p^t) + b(x)

for some a(x) and b(x) where the degree of the remainder, b(x), is strictly less than the degree of the divisor x - \omega_p^t which means b(x) \in F is just a scalar. It is easy to check that b(x) = \hat{f}(\omega_p^t) (by our definition of \hat{f}) and f(\omega_p^t) = b(x) (by substituting f(x)= a(x) (x - \omega_p^t) + b(x) into the expression for \hat{f}), giving us

\hat{f}(\omega_p^t) = f(\omega_p^t).

To me, the above feels like magic.

Therefore, the transformed polynomial Tf can be completely described by the evaluations of f over the roots of unity \{\omega_p^t\}, the evaluations being equal to f(t) for each t\in Z_p.

Connections with Polynomial Interpolation. This way, the transformed polynomial Tf can be completely described by the evaluations of f over the roots of unity \{\omega_p^t\}. Notice how it sounds like an interpolation problem: every polynomial of degree at most p-1 can be reconstructed from p evaluations in any field. Indeed, the Lagrange interpolation theorem is a restatement of the Chinese Remainder Theorem.

Enter DFT. The transform T is the Discrete Fourier Transform. The tth coefficient of the transformed polynomial \hat{f}(x) = Tf(x) is called the tth Fourier coefficient of f. A root of unity \omega_p^t, when treated as a function from \chi_t : Z_p \rightarrow M_p defined as \chi_t(k) = \omega_p^{tk} for any k \in Z_p, is called a character of Z_p. The Fourier Transform maps a function f from the standard (monomial) basis to the (Fourier) basis of characters.

The Matrix of the Transform. So, how does the transform T look like? Let \hat{f} = Tf, where both f, \hat{f} are tread as p dimensional vectors over F, where the kth coordinate of f is f(k). Recall that in polynomial representation, f(x) = \sum_{k\in Z_p}{f(k) x^k}. We want to force every coordinate \hat{f}(t) to be equal to f(\omega_p^t) = \sum_{k \in Z_p}{f(k) (\omega_p^t)^k}. Notice that this equals \langle f , w_t \rangle where f is the vector mentioned above, w_t is the vector whose kth entry is \omega_p^{tk}, and \langle . , . \rangle is the usual inner product. Then we can readily see that $\hat{f} = Tf$ makes sense as a matrix-vector multiplication, when the tth row of  the matrix T is the transpose of the vector w_t. All told, the p \times p matrix T, containing \omega_p^{tk} at its row t and column k, is the Discrete Fourier Transform matrix. The above construction works even when p is not a prime. (Notice that we have abused the notation and used T to denote both a linear transform and its matrix.)

Actually, I think I missed a scale-factor. The coefficients of the Discrete Fourier Transform of a function f:Z_p \rightarrow F are actually scaled by a factor 1/p. I will correct my derivation later. The above discussion at least gives the structural insight behind the Fourier Transform.

Afterthoughts. The transform is discrete because of the multiplicative group of the roots of unity, M_p, is a finite group. It is isomorphic to Z_p under a map which sends the generator of Z_p (which is 1 when p is a prime) to the generator of M_p, which is \omega_p. The transform is also linear, because the inner product formulation \hat{f}(\omega_p^t) = \langle f , w_t \rangleis linear in both the coefficients of f and entries of the matrix T.

A Glimpse of the Uncertainty Principle. The weight of a polynomial is the number of its nonzero coefficients. Loosely speaking, the uncertainty principle says that the weight of f and Tf cannot both be sparse. An extreme example is that if f has only one nonzero coefficient, none of its Fourier coefficients is zero. Similarly, if f has no nonzero coefficient, then its Fourier transform has weight one. Let’s examine below why that is true.

A (Dirac) delta function \delta_k can be represented as a vector with all zero entries except at coordinate k, which contains 1. A uniform function u_m(x) = m for all x \in Z_pwill be the vector m \cdot \mathbf{1}, where \mathbf{1} is the all-ones vector.

Everyone tells us that the Fourier transform of u_m is m\delta_0, and the Fourier transform of \delta_0 is \mathbf{1}. If we see the Fourier Transform as multiplying a vector by the matrix T, it is easy to see why.

All entries in the first row and column of the matrix T are 1, becasue their values are \omega_p^{tk} where either t or k is zero. Multiplying T with \delta_k gives us the kth column of T, which is \mathbf{1} when k=0. Otherwise, all Fourier coefficients of \delta_k are nonzero. In particular, T(m\delta_0)=u_m.

In contrast, multiplying T with the vector u_m adds up all columns of T (times the scale factor m). This makes each Fourier coefficient (except the 0th coefficient) equal to the sum of all roots of unity, and this sum is zero. To see this, note that x^p-1=(x-1)\cdot q(x) where q(x)=\sum_{t=0}^{p-1}{x^t}. Since \omega_p satisfies x^p=1, this implies (\omega_p-1)\cdot q(\omega_p)=0. Since \omega_p \neq 1, it follows that q(\omega_p), which is the sum of all pth root of unity, must be zero. On the other hand, the 0th coefficient is just the sum of p ones. Hence, Tu_m = (mp, 0, 0, \cdots, 0).




One thought on “An Intuitive Explanation of the Discrete Fourier Transform

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s