Discrete Fourier Transform: the Intuition

Every time I think “Now I understand the Fourier Transform,” I am wrong.

Jean-Baptiste Joseph Fourier, 1768 – 1830. (Image source: Wikipedia)

Doing the Fourier Transform of a function is just seeing it from “another” point of view. The “usual” view of a function is in the standard basis \{e_1, \cdots, e_n\}. For example, f can be seen as a vector (in the basis given by the elements in the domain) whose coordinates are the evaluations of f on the elements in the domain. It can also be seen as a polynomial (in the monomial basis) whose coefficients are these evaluations. Let us call this vector u.

The same function can also be seen from the “Fourier basis”, which is just another orthogonal basis, formed by the basis vectors \{v_t\}, t. The tth coordinate in the new basis will be given by inner product between u and the tth basis vector v_t. We call these inner products the Fourier coefficients. The Discrete Fourier Transform matrix (the DFT matrix) “projects” a function from the standard basis to the Fourier basis in the usual sense of projection: taking the inner product along a given direction.

In this post, I am going to use elementary group theoretic notions, polynomials, matrices, and vectors. The ideas in this post will be similar to this Wikipedia article on Discrete Fourier Transform.

Please let me know in the comments if you find any error or confusion.

Roots of unity. Let M_p be the the multiplicative group of pth roots of unity. That is, M_p=\{1, \omega_p, \omega_p^2, \cdots, \omega_p^{p-1} \}. \omega_p is a primitive pth root if the smallest natural exponent t that makes \omega_p^t=1 is p-1. The root \omega_p is principal if 1 + \omega_p + \omega_p^2 + \cdots + \omega_p^{p-1} = 0. In what follows, we assume that \omega_p is a principal and primitive pth root of unity.

Let F be a field containing M_p. For example, F can be the set of complex numbers and \omega_p = e^{2\pi i/p} where i is the imaginary unit. It is easy to check that the roots \{\omega_p^t\}_{t=0}^{p-1} appear on the unit circle of radius 1 around the origin.

Representing functions as polynomials and vectors. Let Z_p be the set of integers \{0, 1, \cdots, p-1\}. To identify a function f : Z_p  \rightarrow F, it suffices to write down its evaluations on all t\in Z_p. Thus we can treat f as a p-dimensional vector, whose tth entry is just f(t) for t \in Z_p.  This shows that f \in F^p.

We can also represent the function f (from the preceding paragraph) as a polynomial of degree at most p-1, the representation being f \mapsto \sum_{t \in Z_p}{f(t) x^t}. This shows f \in R=F[x] / (x^p - 1), where F[x] is the ring of polynomials with coefficients in F and indeterminate x, and  the ring R contains polynomials with degree at most p-1. Why? Because when we take f(x) \bmod (x^p-1), we set every x^p=1 in f(x), turning every x^{p+t} into x^t. Neat. (Why is F[x] a ring?)

The Chinese Remainder Theorem (CRT). The good news is, x^p-1 factors completely in F because by assumption, F contains all pth roots of unity. This means, x^p-1=\prod_{t\in Z_p}{(x-\omega_p^t)}. Since the factors (x-\omega_p^t) are relatively prime to each other, by the Chinese Remainder Theorem (CRT), there is a bijection T : R \rightarrow R which maps f(x) \in R to a p-tuple whose tth entry is f(x) \bmod (x - \omega_p^t).

Essentially, the CRT says that “anything” expressed “with resptct to” x^p-1 can be equivalently expressed “with respect to” all individual factors x- \omega_p^t, where

  • “anything” is a polynomial f \in R
  • expressing f(x) “with respect to” g(x) means computing f(x) \bmod g(x).

From coefficients to evaluations. Given an f \in F[x], let us define the polynomials

g \in F[x]\,, g(x) \triangleq f(x) \bmod (x - \omega_p^t)\,, and

\hat{f} \in F[x] : F \rightarrow F\,, \hat{f}(t) \triangleq g(\omega_p^t) for t \in Z_p.

Note that by Euclid’s division algorithm, we can express

f(x) = a(x) (x - \omega_p^t) + b(x)

for some a(x) and b(x) where the degree of the remainder, b(x), is strictly less than the degree of the divisor x - \omega_p^t which means b(x) \in F is just a scalar. It is easy to check that \hat{f}(t) = g(\omega_p^t) = b(\omega_p^t) = f(\omega_p^t). The first equality is the definition of \hat{f}; the second equality follows from the definition of g and the Euclid expansion of f(x); the last equality follows from the Euclid expansion of f(x) if we substitute x = \omega_p^t.

To reiterate,

\hat{f}(t) = f(\omega_p^t).

This is magic.

Thus, the transformed polynomial Tf can be completely described by the evaluations of f over the roots of unity \{\omega_p^t\}, the evaluations being \left(\hat{f}(t) \right)_{ t \in Z_p}.

Connections with polynomial interpolation. Notice that the above sentence sounds like an interpolation problem: every polynomial of degree at most p-1 can be reconstructed from p evaluations in any field. Indeed, the Lagrange interpolation theorem is a restatement of the Chinese Remainder Theorem.

DFT: a change from the coefficient-basis to the evaluation-basis. The transform T is the Discrete Fourier Transform. The tth coefficient of the transformed polynomial \hat{f} = Tf,\, \hat{f}(t) = f(\omega_p^t), t \in Z_p is called the tth Fourier coefficient of f. A root of unity \omega_p^t, when treated as a function \chi_t : Z_p \rightarrow M_p defined as \chi_t(k) = \omega_p^{tk} for any k \in Z_p, is called a character of Z_p. We can also represent \chi_t in the vector form as

\chi_t \triangleq  \left( \omega_p^{tk} \right)_{k \in Z_p} .

The Fourier Transform maps a function f from the standard basis (i.e., the monomial basis or the coefficient basis) to the Fourier basis (i.e., the evaluation basis) given by the characters \{\chi_t\}_{t \in Z_p}. Recall that in polynomial representation, f(x) = \sum_{k\in Z_p}{f_k x^k}. Similarly, we can write the function \hat{f} as the p-dimensional vector (\hat{f}_t)_{t \in Z_p} where

\hat{f}_t \triangleq \hat{f}(t) = f(\omega_p^t) = \sum_{k \in Z_p} f_k \omega_p^{tk} = \langle f, \chi_t \rangle \,,

where \langle \cdot , \cdot \rangle  is the usual inner product.

The matrix of the transform. So, how does the transform T look like? Let \hat{f} = Tf, where both f, \hat{f} are tread as p dimensional vectors over F, where the kth coordinate of f is f_k. We can readily see that \hat{f} = Tf makes sense as a matrix-vector multiplication, when the tth row of  the matrix T is the transpose of the vector \chi_t. All told, the p \times p matrix T, containing \omega_p^{tk} at its row t and column k, is the Discrete Fourier Transform matrix. The above construction works even when p is not a prime. (Notice that we have abused the notation and used T to denote both a linear transform and its matrix.)

Actually, I think I missed a scale-factor. The coefficients of the Discrete Fourier Transform of a function f: Z_p \rightarrow F are actually scaled by a factor 1/p. I will correct my derivation later. The above discussion at least gives the structural insight behind the Fourier Transform.

Afterthoughts. The transform is discrete because of the multiplicative group of the roots of unity, M_p, is a finite group. It is isomorphic to Z_p under a map which sends the generator of Z_p (which is 1 when p is a prime) to the generator of M_p, which is \omega_p. The transform is also linear, because the inner product formulation \hat{f}(t) = \langle f , \chi_t \rangleis linear in both the coefficients of f and entries of the matrix T.

A glimpse of the uncertainty principle. The weight of a polynomial is the number of its nonzero coefficients. Loosely speaking, the uncertainty principle says that the weight of f and Tf cannot both be sparse. An extreme example is that if f has only one nonzero coefficient, none of its Fourier coefficients is zero. Similarly, if f has no nonzero coefficient, then its Fourier transform has weight one. Let’s examine below why that is true.

A (Dirac) delta function \delta_k can be represented as a vector with all zero entries except at coordinate k, which contains 1. A uniform function u_m(x) = m for all x \in Z_pwill be the vector m \cdot \mathbf{1}, where \mathbf{1} is the all-ones vector.

Everyone tells us that the Fourier transform of u_m is m\delta_0, and the Fourier transform of \delta_0 is \mathbf{1}. If we see the Fourier Transform as multiplying a vector by the matrix T, it is easy to see why.

All entries in the first row and column of the matrix T are 1, becasue their values are \omega_p^{tk} where either t or k is zero. Multiplying T with \delta_k gives us the kth column of T, which is \mathbf{1} when k=0. Otherwise, all Fourier coefficients of \delta_k are nonzero. In particular, T(m\delta_0)=u_m.

In contrast, multiplying T with the vector u_m adds up all columns of T (times the scale factor m). This makes each Fourier coefficient (except the 0th coefficient) equal to the sum of all roots of unity, and this sum is zero. To see this, note that x^p-1=(x-1)\cdot q(x) where q(x)=\sum_{t=0}^{p-1}{x^t}. Since \omega_p satisfies x^p=1, this implies (\omega_p-1)\cdot q(\omega_p)=0. Since \omega_p \neq 1, it follows that q(\omega_p), which is the sum of all pth root of unity, must be zero. On the other hand, the 0th coefficient is just the sum of p ones. Hence, Tu_m = (mp, 0, 0, \cdots, 0).



One thought on “Discrete Fourier Transform: the Intuition

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s