Eigenvalues as the magnitude of the matrix


Eigenvalues are a notoriously hard topic. Their definition can be unintuitive, which makes it difficult to see why they are useful.

Recently, I've started explaining eigenvalues as the magnitude of a matrix: a quantity that generalizes the notion of "big" and "small" from numbers to matrices.

Magnitude of real numbers

Take an arbitrary real number $x$. What is $x^n$ if $n$ is large? If $x=0$ or $x=1$, the answer is $0$ or $1$. If $x>1$, then $x^n$ grows without bound. If $-1<x<1$, then $x^n$ will be close to $0$. And if $x<-1$, then $x^n$ blows up in magnitude while oscillating in sign (with $x=-1$ just flipping sign forever).

As shown above, we can say quite a lot about $x^n$ without knowing $x$ exactly; its sign and magnitude are enough.
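To make this concrete, here is a quick check in Python (a minimal sketch; the sample values are arbitrary):

```python
# The magnitude of x decides the fate of x**n for large n.
for x in [0.0, 1.0, 0.5, -0.5, 2.0, -2.0]:
    print(f"{x:5.1f} -> {x**51:.3g}")
# 0.0 and 1.0 are fixed points, |x| < 1 decays towards 0,
# and |x| > 1 blows up (oscillating in sign when x < -1).
```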

How do we generalize this to matrices? And is it even useful to talk about high powers of a matrix?

Dynamical systems

Let's model the following very simple system: a car moving at a constant speed of $v$, starting at position $x_0$. After one second, the position is: $$x_1=x_0+v$$ After two seconds: $$x_2=x_1+v$$

This can be written in the following matrix form:

$$ \begin{bmatrix} x_n \\ v \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} x_{n-1} \\ v \end{bmatrix} $$

If we call $s_n=\left[\begin{smallmatrix}x_n \\ v\end{smallmatrix}\right]$, the expression can be further simplified as: $$s_n=A\cdot s_{n-1}=A\cdot A\cdot s_{n-2}=\ldots=A^{n}s_0,$$ where $$A=\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}.$$

As we can see, the behaviour of the car can be described by raising the matrix $A$ to a power. Of course, this is a rather simple example. However, many physical simulations can be written in this form, called a linear dynamical system. The linear part refers to the fact that the update from $s_{n-1}$ to $s_n$ is a linear function (a matrix multiplication). Dynamical system means that $s_n$ depends only on the previous state $s_{n-1}$, and not on the step count $n$.
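To see the matrix form in action, here is a minimal sketch using NumPy (the starting position and speed are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # the update matrix of the car system
s0 = np.array([2.0, 3.0])    # x0 = 2, v = 3 (arbitrary values)

# s_n = A^n s_0: after 10 steps the position should be x0 + 10*v = 32.
s10 = np.linalg.matrix_power(A, 10) @ s0
print(s10)  # [32.  3.]
```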

It turns out that many physical models can be represented as dynamical systems (e.g. weather prediction, planetary mechanics). Even if the update step is not linear, it can often be linearized or approximated by a linear one. Therefore, the ability to efficiently analyze powers of matrices is crucial, and eigenvalues turn out to be a great help here.

Eigenvalues for powers

Assume that the matrix $A$ has an eigendecomposition, that is, it can be written in the form $$A=QDQ^{-1},$$ where $D$ is a diagonal matrix holding the eigenvalues, and $Q$ is a matrix whose columns are the eigenvectors. Now we can write $$A^n=(QDQ^{-1})(QDQ^{-1})\ldots (QDQ^{-1})=QD^nQ^{-1},$$ since every inner $Q^{-1}Q$ pair cancels. The power $D^n$ can be calculated quickly by raising each eigenvalue to the power of $n$: $$D^n = \begin{bmatrix} \lambda_1 & & \\ & \lambda_2 & \\ & & \ddots \end{bmatrix}^n = \begin{bmatrix} \lambda_1^n & & \\ & \lambda_2^n & \\ & & \ddots \end{bmatrix} $$
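The same shortcut sketched in NumPy, on an arbitrary diagonalizable matrix (the car matrix from above happens to have no eigendecomposition, so a different example is used here):

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.1, 0.8]])     # arbitrary diagonalizable matrix
eigvals, Q = np.linalg.eig(A)  # columns of Q are the eigenvectors

n = 20
An_fast = Q @ np.diag(eigvals**n) @ np.linalg.inv(Q)  # Q D^n Q^{-1}
An_slow = np.linalg.matrix_power(A, n)                # repeated multiplication
print(np.allclose(An_fast, An_slow))                  # True
```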

What does this form tell us? If all eigenvalues are between $-1$ and $1$, then $D^n$ approaches zero as $n$ grows, and consequently so does $A^n$. If $A$ has an eigenvalue larger than $1$, the corresponding diagonal entry of $D^n$ goes to infinity, so $A^n$ blows up along that eigenvector's direction. Finally, if there is an eigenvalue less than $-1$, $A^n$ also diverges, oscillating in sign along that direction. These statements are exactly analogous to the ones we made about scalar numbers.

Sometimes the eigenvalues have different magnitudes. Depending on your problem, it may be perfectly fine that the matrix blows up in one direction (where the corresponding eigenvalue is larger than $1$) while it converges in the others (where the eigenvalues are between $-1$ and $1$).
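For example, with a diagonal matrix the eigenvalues are simply the diagonal entries, which makes the mixed behaviour easy to see (the values 1.2 and 0.5 are arbitrary):

```python
import numpy as np

A = np.diag([1.2, 0.5])  # eigenvalues 1.2 and 0.5
for n in [1, 10, 50]:
    print(n, np.diag(np.linalg.matrix_power(A, n)))
# The first direction blows up as 1.2**n while the second decays as 0.5**n.
```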

Note that in the above derivation we assumed the matrix can be decomposed nicely. However, even if the matrix does not have an eigendecomposition, the same statements can be made based on its eigenvalues: if all eigenvalues of the matrix have an absolute value less than $1$, $A^n$ still converges to $0$.
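A minimal illustration: the matrix below is a Jordan block, which has the eigenvalue $0.5$ twice but only one eigenvector, so it has no eigendecomposition; its powers nevertheless converge to zero:

```python
import numpy as np

# Defective matrix: eigenvalue 0.5 repeated, only one eigenvector.
A = np.array([[0.5, 1.0],
              [0.0, 0.5]])
print(np.linalg.matrix_power(A, 100))  # all entries ~1e-28 or smaller
```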

In summary, eigenvalues allow us to draw conclusions about the convergence of matrix powers, which in turn helps us analyze and solve such problems. These conclusions are exactly analogous to the ones we can make about scalar numbers.