To start, let’s suppose we have a function \(f(k)\) defined on \(\mathbb{Z}\) mod \(N\). If you have no idea what this means, don’t worry, it’s not too bad: we simply have a function \(f(k)\) that takes an index as input and takes on the values \([f(0), f(1), f(2), \ldots, f(N-1)]\). The “mod \(N\)” part means that the function is periodic with period \(N\), i.e., \(f(a + m N) = f(a)\) if \(m\) is an integer.

Often we are interested in studying how the function changes with respect to \(k\). In the continuous case this would be the derivative operator, but in the discrete case we can describe it through a shift operator

\[S f(k) = f(k+1)\]

We can apply multiple shifts sequentially,

\[S^m f(k) = f(k+m)\]

and because our function is periodic with period \(N\) we must have \(S^N\) be the identity operator.

Now, since our function \(f(k)\) only has \(N\) distinct values, let’s represent it by a vector \(\mathbf{f}\) of length \(N\).

\[\mathbf{f} = \begin{pmatrix} f(0) \\ f(1) \\ \vdots \\ f(N-1) \end{pmatrix}\]

Our shift operator is now an \(N \times N\) matrix, and we can explicitly construct it as,

\[S = \begin{pmatrix} 0 & 1 & 0 & 0 &\ldots & 0 \\ 0 & 0 & 1 & 0 & \ldots & 0 \\ & & &\ddots \\ 1 & 0 & 0 & 0 & \ldots & 0 \end{pmatrix}\]
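As a quick sanity check, we can build this matrix in numpy and confirm it behaves as advertised (the size \(N = 8\) here is an arbitrary choice):

```python
import numpy as np

N = 8
# Row k of S has a single 1 in column (k+1) mod N, so (S f)[k] = f(k+1).
S = np.roll(np.eye(N, dtype=int), 1, axis=1)

f = np.arange(N)  # a stand-in signal with f(k) = k
assert np.array_equal(S @ f, np.roll(f, -1))  # one shift: f(k) -> f(k+1)
assert np.array_equal(np.linalg.matrix_power(S, N), np.eye(N, dtype=int))  # S^N = I
```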

From inspection, we can see that the transpose of the shift matrix shifts the signal in the opposite direction,

\[S^T f(k) = f(k-1)\]

and thus \(S S^T = S^T S = I\). Therefore \(S\) is a unitary matrix.
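Both claims are easy to verify numerically, continuing the numpy sketch from above:

```python
import numpy as np

N = 8
S = np.roll(np.eye(N, dtype=int), 1, axis=1)
f = np.arange(N)

# The transpose shifts the other way: (S^T f)(k) = f(k-1).
assert np.array_equal(S.T @ f, np.roll(f, 1))

# S is unitary (real entries, so in fact orthogonal): S S^T = S^T S = I.
assert np.array_equal(S @ S.T, np.eye(N, dtype=int))
assert np.array_equal(S.T @ S, np.eye(N, dtype=int))
```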

One of the fundamental results about unitary matrices is that their eigenvectors can be chosen to form an orthonormal basis for the vector space \(\mathbb{C}^N\). Without explicitly finding these eigenvectors yet, let’s define the matrix \(V\) such that each column is an eigenvector of \(S\). In other words, we have the relation

\[SV = VD\]

for a diagonal matrix \(D\). We also must have \(V^* = V^{-1}\) where \(*\) indicates conjugate transpose, resulting in the decomposition,

\[S = V D V^*\]

or that \(V\) is the transform that diagonalizes our shift operator.
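We can check that this decomposition exists numerically before working out what \(V\) actually is; a minimal numpy sketch:

```python
import numpy as np

N = 8
S = np.roll(np.eye(N), 1, axis=1)  # the shift matrix from above

# Numerically diagonalize S: the columns of V are eigenvectors, and the
# entries of w are the eigenvalues forming the diagonal of D.
w, V = np.linalg.eig(S)
D = np.diag(w)

assert np.allclose(V @ D @ V.conj().T, S)      # S = V D V*
assert np.allclose(V.conj().T @ V, np.eye(N))  # V* = V^{-1}
```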

Now let’s go back to our arbitrary shift equation \(S^m f(k) = f(k+m)\) and use the notation that \(\mathbf{f}_m\) indicates the vector \(\mathbf{f}\) shifted by \(m\) places. We can write this equation in matrix form,

\[S^m \mathbf{f}_0 = \mathbf{f}_m\]

substituting in our diagonalization,

\[V D^m V^* \mathbf{f}_0 = \mathbf{f}_m\]

Now our shift reads: apply \(V^*\) to \(\mathbf{f}_0\), multiply each term by the diagonal entries of \(D^m\), and apply the inverse transformation \(V\). In essence, we have turned shifting into multiplication.
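Here is that recipe as a numpy sketch (with arbitrary \(N\) and \(m\), and with \(V\) and \(D\) computed numerically since we haven’t constructed them explicitly yet):

```python
import numpy as np

N, m = 8, 3
S = np.roll(np.eye(N), 1, axis=1)
w, V = np.linalg.eig(S)  # S = V diag(w) V*

f = np.random.default_rng(0).standard_normal(N)

# Transform with V*, scale by the diagonal of D^m, transform back with V.
f_m = V @ (w**m * (V.conj().T @ f))

# f_m(k) = f(k+m); any imaginary part is numerical noise.
assert np.allclose(f_m, np.roll(f, -m))
```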

So what, why does this matter? Well, consider the equation

\[y(k+1) + 0.5 y(k-1) = f(k) + 2 f(k-1)\]

Suppose we know \(f(k)\) but we want to find \(y(k)\); how do we do that? Let’s rewrite this in matrix form as \(S \mathbf{y}_0 + 0.5\, S^{-1} \mathbf{y}_0 = \mathbf{f}_0 + 2 S^{-1} \mathbf{f}_0\), substitute our diagonalization, and multiply both sides by \(V^*\):

\[D V^* \mathbf{y}_0 + 0.5 D^{-1} V^* \mathbf{y}_0 = V^* \mathbf{f}_0 + 2 D^{-1}V^* \mathbf{f}_0\]

In this equation there is no “coupling” between indices; each element of the vectors can be found independently, and then we can apply \(V\) to transform back.
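To make this concrete, here is a minimal numpy sketch of the whole solve. Since we still haven’t identified \(V\) and \(D\) explicitly, it leans on numpy.linalg.eig:

```python
import numpy as np

N = 16
S = np.roll(np.eye(N), 1, axis=1)  # forward shift matrix
w, V = np.linalg.eig(S)            # S = V diag(w) V*

f = np.random.default_rng(0).standard_normal(N)

# Transform to the eigenbasis, solve each component with a scalar division,
# then transform back.
f_hat = V.conj().T @ f
y_hat = (1 + 2 / w) * f_hat / (w + 0.5 / w)
y = (V @ y_hat).real  # imaginary part is numerical noise

# Check the original difference equation (all indices mod N).
lhs = np.roll(y, -1) + 0.5 * np.roll(y, 1)
rhs = f + 2 * np.roll(f, 1)
assert np.allclose(lhs, rhs)
```

Note that no linear system is ever solved here; each component of the transformed vector costs a single division.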

The next question is: what are \(V\) and \(D\)? We could calculate them directly from the matrix form of \(S\), but there is a quicker way: finding eigenfunctions of the operator form. We seek functions where

\[f(k+1) = c f(k)\]

Or, functions that turn addition into multiplication. The only such functions are exponentials: consider \(f(k) = z^k\) for some complex number \(z\),

\[f(k+1) = z^{k+1} = z \cdot z^k,\]

and so the \(z^k\) are our eigenfunctions, with eigenvalues \(z\). The only constraint we have on our system is that it must be \(N\)-periodic, which means we must have

\[z^k = z^{k+N}\]

or equivalently that

\[z^N = 1,\]

which means that \(z\) has to be one of the \(N\)-th roots of unity, given by

\[z_l = \exp{\left(\frac{2\pi i l}{N}\right)} \text{ for } l=0, 1, \ldots, N-1\]

So we only have \(N\) possible values for \(z\), which is exactly the number of eigenvectors we need.

This gives us our desired eigenvectors,

\[\mathbf{v}_l = \begin{pmatrix} z_l^0 \\ z_l^1 \\ z_l^2 \\ \vdots \\ z_l^{N-1} \end{pmatrix}\]
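A one-off numerical check that these really are eigenvectors of \(S\) (the size \(N = 8\) and frequency index \(l = 3\) are arbitrary choices):

```python
import numpy as np

N, l = 8, 3
S = np.roll(np.eye(N), 1, axis=1)  # the shift matrix

# v_l(k) = z_l^k with z_l an N-th root of unity.
z_l = np.exp(2j * np.pi * l / N)
v_l = z_l ** np.arange(N)

# Shifting v_l just multiplies it by z_l: S v_l = z_l v_l.
assert np.allclose(S @ v_l, z_l * v_l)
```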

And our matrix \(V\) can be written in the form,

\[V = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & \ldots & 1 \\ 1 & \omega & \omega^2 & \omega^3 & \omega^4 & \ldots & \omega^{N-1} \\ 1 & \omega^2 & \omega^4 & \omega^6 & \omega^8 & \ldots & \omega^{2(N-1)} \\ &&&\ddots&&& \\ 1 & \omega^{N-1} & \omega^{2(N-1)} & \omega^{3(N-1)} & \omega^{4(N-1)} & \ldots & \omega^{(N-1)(N-1)} \end{pmatrix}\]

where \(\omega = \exp \frac{2\pi i}{N}\). Up to a normalization factor of \(1/\sqrt{N}\) (needed to make the columns unit length, so that \(V^* = V^{-1}\)), this is precisely the discrete Fourier transform matrix.
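In fact, applying the unnormalized \(V^*\) to a vector is exactly what np.fft.fft computes; a quick numpy check:

```python
import numpy as np

N = 8
k = np.arange(N)
omega = np.exp(2j * np.pi / N)
V = omega ** np.outer(k, k)  # V[k, l] = omega^(k l)

# The columns of V are eigenvectors of S with eigenvalues z_l = omega^l.
S = np.roll(np.eye(N), 1, axis=1)
assert np.allclose(S @ V, V @ np.diag(omega**k))

# Applying V* is the (unnormalized) DFT as computed by np.fft.fft.
f = np.random.default_rng(0).standard_normal(N)
assert np.allclose(V.conj().T @ f, np.fft.fft(f))
```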

Another way of thinking about this is that the Fourier transform diagonalizes the shift operator. In fact, any operator that commutes with the shift operator is diagonalized by the Fourier transform, by simultaneous diagonalization. Thus the Fourier transform is the natural basis to work in when dealing with any system that has translational invariance.
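For instance, any polynomial in \(S\) (a circulant matrix) commutes with \(S\), and a short numpy sketch shows it is diagonalized by the same \(V\):

```python
import numpy as np

N = 8
S = np.roll(np.eye(N), 1, axis=1)

# A polynomial in S (the coefficients 1, 2, 3 are arbitrary) commutes with S.
C = np.eye(N) + 2 * S + 3 * np.linalg.matrix_power(S, 2)
assert np.allclose(C @ S, S @ C)

# The DFT matrix diagonalizes C: V^{-1} C V is diagonal.
k = np.arange(N)
V = np.exp(2j * np.pi / N) ** np.outer(k, k)
D = V.conj().T @ C @ V / N  # V^{-1} = V* / N for the unnormalized V
assert np.allclose(D, np.diag(np.diag(D)))
```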

Relation to the Fourier Transform

In the continuous case we don’t have shift matrices, but we have shift operators that are built up from derivative operators. Much as discrete complex exponentials are eigenfunctions of the shift operator, continuous complex exponentials are eigenfunctions of the differentiation operator. The Fourier transform “diagonalizes” the time evolution operator, turning differentiation into multiplication. While the mathematics in the continuous case is much more complicated, the same intuition built up in the discrete case applies.
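To see the parallel explicitly, take \(f(x) = e^{i \omega x}\):

\[\frac{d}{dx} e^{i \omega x} = i \omega \, e^{i \omega x},\]

so each complex exponential is an eigenfunction of \(\frac{d}{dx}\) with eigenvalue \(i \omega\), just as \(z^k\) was an eigenfunction of the shift with eigenvalue \(z\).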