What’s there?
There are basically five types of matrix decompositions that are super important. I aim to go in depth into all of them. These are:
- Eigendecomposition (This post!)
- QR decomposition
- Cholesky decomposition
- LU decomposition
- (perhaps the most important) Singular Value Decomposition
Edit (July 20, 2025): Added proof of spectral theorem.
Why do I write this?
I wanted to create an authoritative document I could refer to for later. Thought that I might as well make it public (build in public, serve society and that sort of stuff).
Eigen decomposition
Wait…
Basics!
Linear Transformation
A linear transformation is a function $T: V \to W$ between two vector spaces $V$ and $W$ that preserves the operations of vector addition and scalar multiplication. For finite-dimensional vector spaces, any linear transformation $T: \mathbb{R}^n \to \mathbb{R}^m$ can be represented by an $m \times n$ matrix $A$. The action of the transformation on a vector $x \in \mathbb{R}^n$ is given by matrix-vector multiplication: $T(x) = Ax$.
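To make this concrete, here is a minimal NumPy sketch (the matrix and vectors are made up for illustration) checking that the map $x \mapsto Ax$ respects vector addition and scalar multiplication:

```python
import numpy as np

# A hypothetical 2x3 matrix representing a linear map T: R^3 -> R^2
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])

x = np.array([1.0, 1.0, 1.0])
y = np.array([2.0, 0.0, -1.0])
c = 4.0

# T(x) is just the matrix-vector product A @ x
print(A @ x)

# Linearity: T(x + y) = T(x) + T(y) and T(c*x) = c*T(x)
assert np.allclose(A @ (x + y), A @ x + A @ y)
assert np.allclose(A @ (c * x), c * (A @ x))
```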
Ok, got it? Basic linear algebra.
Eigenvectors and eigenvalues
Let $A \in \mathbb{R}^{n \times n}$ be a square matrix. A nonzero vector $v \in \mathbb{R}^n$ is an eigenvector of $A$ if there exists a scalar $\lambda$ such that $Av = \lambda v$. The scalar $\lambda$ is called the eigenvalue corresponding to the eigenvector $v$.
I will now add a bit of geometry. The equation $Av = \lambda v$ states that the vector $Av$ (the result of the transformation) is collinear (fancy term for along the same line) with the original vector $v$. The eigenvalue $\lambda$ is the scaling factor.
- If $\lambda > 1$, the eigenvector is stretched.
- If $0 < \lambda < 1$, the eigenvector is compressed.
- If $\lambda < 0$, the eigenvector is flipped and scaled.
- If $\lambda = 1$, the eigenvector is unchanged.
- If $\lambda = 0$, the eigenvector is mapped to the zero vector, meaning it lies in the null space of $A$.
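A quick NumPy sanity check (arbitrary example matrix) that every eigenpair returned by `np.linalg.eig` satisfies $Av = \lambda v$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

for i, lam in enumerate(eigenvalues):
    v = eigenvectors[:, i]          # i-th eigenvector is the i-th column
    assert np.allclose(A @ v, lam * v)
    print(f"lambda = {lam:.3f}, v = {v}")
```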
Want a reminder on how to find these? The expression $\det(A - \lambda I)$ is a polynomial in $\lambda$ of degree $n$, called the characteristic polynomial. The roots of this polynomial are the eigenvalues of $A$. By the fundamental theorem of algebra, an $n$-th degree polynomial has $n$ roots (counting multiplicity) in the complex plane $\mathbb{C}$.
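As a sketch of the same idea in code (the $2 \times 2$ matrix is arbitrary), the roots of the characteristic polynomial match the eigenvalues NumPy computes directly:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Coefficients of det(A - lambda*I) as a polynomial in lambda
# (np.poly of a square matrix returns its characteristic polynomial)
coeffs = np.poly(A)           # here: lambda^2 - 4*lambda + 3
roots = np.roots(coeffs)      # roots of the characteristic polynomial

print(coeffs)                          # [ 1. -4.  3.]
print(np.sort(roots))                  # [1. 3.]
print(np.sort(np.linalg.eigvals(A)))   # same eigenvalues: [1. 3.]
```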
Important matrix types
Ok, this is purely LLM generated.
- Diagonal Matrices ($D$): Square matrices where all non-diagonal entries are zero ($d_{ij} = 0$ for $i \neq j$). They represent pure scaling along the coordinate axes. Their powers ($D^k$) and inverses ($D^{-1}$) are trivial to compute.
- Triangular Matrices ($L$, $U$): Square matrices that are either lower triangular ($L$, where $l_{ij} = 0$ for $i < j$) or upper triangular ($U$, where $u_{ij} = 0$ for $i > j$). Systems of equations involving triangular matrices can be solved efficiently via forward or backward substitution.
- Orthogonal Matrices ($Q$): Square matrices whose columns (and rows) form an orthonormal basis. They satisfy the property $Q^\top Q = Q Q^\top = I$, which implies $Q^{-1} = Q^\top$. Geometrically, orthogonal matrices represent pure rotations or reflections, which preserve lengths and angles of vectors: $\|Qx\| = \|x\|$.
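A small sketch of the properties above, with made-up example matrices (it uses NumPy plus SciPy's triangular solver):

```python
import numpy as np
from scipy.linalg import solve_triangular

# Diagonal: powers and inverses act entrywise on the diagonal
D = np.diag([2.0, 3.0, 5.0])
assert np.allclose(np.linalg.matrix_power(D, 3), np.diag([8.0, 27.0, 125.0]))
assert np.allclose(np.linalg.inv(D), np.diag([0.5, 1 / 3, 0.2]))

# Triangular: solve L y = b by forward substitution
L = np.array([[1.0, 0.0],
              [2.0, 3.0]])
b = np.array([1.0, 8.0])
y = solve_triangular(L, b, lower=True)
assert np.allclose(L @ y, b)

# Orthogonal: a rotation matrix preserves lengths
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x = np.array([3.0, 4.0])
assert np.allclose(Q.T @ Q, np.eye(2))
assert np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x))
```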
I wrote this (!!) –
- Also, diagonalizability. A square matrix $A \in \mathbb{R}^{n \times n}$ is said to be diagonalizable if it has $n$ linearly independent eigenvectors. Such matrices are also known as non-defective matrices.
Eigendecomposition
Basics
Ok, we can start now. For eigendecompositions, we are super concerned with endomorphisms. This is when the domain and the codomain of the linear transformation are the same space. This is basically $T: \mathbb{R}^n \to \mathbb{R}^n$, represented by a square matrix $A \in \mathbb{R}^{n \times n}$.
Why do you need to know this? You don’t. I knew this and barfed the information on you.
When can you eigendecompose?
If a matrix $A$ is diagonalizable, it can be factored as $A = P D P^{-1}$, where:
- $P$ is a matrix whose columns are the $n$ linearly independent eigenvectors of $A$, i.e., $P = [v_1 \mid v_2 \mid \cdots \mid v_n]$.
- $D$ is a diagonal matrix whose diagonal entries are the corresponding eigenvalues, $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.
- $P^{-1}$ is the inverse of the matrix $P$. The existence of $P^{-1}$ is guaranteed by the linear independence of the eigenvectors.
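A minimal NumPy sketch (arbitrary diagonalizable matrix) of this factorization:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors
D = np.diag(eigenvalues)

# Reconstruct A from its eigendecomposition A = P D P^{-1}
A_reconstructed = P @ D @ np.linalg.inv(P)
assert np.allclose(A, A_reconstructed)
```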
Ok, why can you eigendecompose?
Let $A$ have $n$ linearly independent eigenvectors $v_1, v_2, \ldots, v_n$ with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. We can write the $n$ eigenvalue equations as:

$$A v_i = \lambda_i v_i, \qquad i = 1, 2, \ldots, n.$$

We can concatenate these equations into a single matrix equation. Let $P = [v_1 \mid v_2 \mid \cdots \mid v_n]$. Then:

$$AP = [A v_1 \mid A v_2 \mid \cdots \mid A v_n] = [\lambda_1 v_1 \mid \lambda_2 v_2 \mid \cdots \mid \lambda_n v_n].$$

The right-hand side can be expressed as the product of $P$ and a diagonal matrix $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$:

$$[\lambda_1 v_1 \mid \lambda_2 v_2 \mid \cdots \mid \lambda_n v_n] = PD.$$

Thus, we have the relationship $AP = PD$. Since the eigenvectors are linearly independent, the matrix $P$ is invertible. We can right-multiply by $P^{-1}$ to obtain the eigendecomposition:

$$A = P D P^{-1}.$$
Side note: Notice how $P$ need not be orthonormal. It can be, sure. Your call. Why? Because $P^{-1}$ absorbs the norms of the individual columns: rescaling a column of $P$ inversely rescales the corresponding row of $P^{-1}$, leaving $P D P^{-1}$ unchanged. Trivial enough (math textbook, much?).
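A tiny sketch of that side note, with the same kind of made-up matrix as above: rescaling the eigenvector columns of $P$ by arbitrary nonzero factors leaves $P D P^{-1}$ unchanged.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# Rescale each eigenvector column by an arbitrary nonzero factor
# (the 1-D array broadcasts across columns)
P_scaled = P * np.array([5.0, -0.3])

# P^{-1} absorbs the rescaling, so the decomposition still reproduces A
assert np.allclose(P_scaled @ D @ np.linalg.inv(P_scaled), A)
```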
Eigendecomposition is powerful!
Eigendecomposition reveals fundamental properties of a matrix:
- Determinant: The determinant of $A$ is the product of its eigenvalues, $\det(A) = \prod_{i=1}^{n} \lambda_i$.
- Trace: The trace of $A$ is the sum of its eigenvalues, $\mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i$.
- Matrix Powers: Eigendecomposition simplifies the computation of matrix powers, since $A^k = P D^k P^{-1}$ and $D^k$ only requires raising the diagonal entries to the $k$-th power.
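A quick numerical check of all three claims, again with a made-up matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigenvalues, P = np.linalg.eig(A)

# Determinant = product of eigenvalues, trace = sum of eigenvalues
assert np.isclose(np.linalg.det(A), np.prod(eigenvalues))
assert np.isclose(np.trace(A), np.sum(eigenvalues))

# A^5 via the decomposition: only the diagonal needs to be powered
D5 = np.diag(eigenvalues ** 5)
A5 = P @ D5 @ np.linalg.inv(P)
assert np.allclose(A5, np.linalg.matrix_power(A, 5))
```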
The Spectral Theorem for Symmetric Matrices
Ok, here’s a special case – what if the matrix is symmetric? (Why do we care about symmetric matrices? Remind yourself about covariance matrices.)
Theorem (Spectral Theorem): For any real symmetric matrix $A = A^\top \in \mathbb{R}^{n \times n}$:
- All eigenvalues of
$A$ are real numbers.
- There exists an orthonormal basis of $\mathbb{R}^n$ consisting of the eigenvectors of $A$. That is, the eigenvectors can be chosen to be mutually orthogonal and have unit norm.
Proof 1: Eigenvalues are Real
Let $\lambda$ be an eigenvalue of $A$ with a corresponding eigenvector $v$. These may be complex, so we start with the standard eigenvalue equation:

$$Av = \lambda v$$

Now, take the conjugate transpose (denoted by a star, $*$) of both sides:

$$v^* A^* = \bar{\lambda} v^*$$

Since the matrix $A$ is real and symmetric, its conjugate transpose is itself ($A^* = A$). This simplifies the equation to:

$$v^* A = \bar{\lambda} v^*$$

Next, we right-multiply this equation by the original eigenvector $v$:

$$v^* A v = \bar{\lambda}\, v^* v \quad \text{(Eq. 1)}$$

Now, let's return to the original equation, $Av = \lambda v$, and left-multiply it by $v^*$:

$$v^* A v = \lambda\, v^* v \quad \text{(Eq. 2)}$$

By comparing Eq. 1 and Eq. 2, we can see that:

$$\bar{\lambda}\, v^* v = \lambda\, v^* v$$

The term $v^* v = \|v\|^2$ is the squared magnitude of the vector $v$. Since an eigenvector cannot be a zero vector, $v^* v$ must be a positive real number. Therefore, we can safely divide by it, leaving:

$$\bar{\lambda} = \lambda$$
A number is equal to its own complex conjugate if and only if it is a real number. This completes the proof that all eigenvalues of a real symmetric matrix are real.
Proof 2: Eigenvectors of Distinct Eigenvalues are Orthogonal
Let $\lambda_1$ and $\lambda_2$ be two distinct eigenvalues ($\lambda_1 \neq \lambda_2$) of $A$, with corresponding eigenvectors $v_1$ and $v_2$. We have:

$$A v_1 = \lambda_1 v_1, \qquad A v_2 = \lambda_2 v_2$$

Let's start with the first equation and left-multiply by $v_2^\top$:

$$v_2^\top A v_1 = \lambda_1 v_2^\top v_1$$

Now, let's take the transpose of the entire expression. Since the result is a scalar, it is equal to its own transpose:

$$(v_2^\top A v_1)^\top = v_1^\top A^\top v_2 = \lambda_1 v_1^\top v_2$$

Because $A$ is symmetric ($A^\top = A$), this becomes:

$$v_1^\top A v_2 = \lambda_1 v_1^\top v_2 \quad \text{(Eq. 3)}$$

Next, we return to the second original equation ($A v_2 = \lambda_2 v_2$) and left-multiply by $v_1^\top$:

$$v_1^\top A v_2 = \lambda_2 v_1^\top v_2 \quad \text{(Eq. 4)}$$

By comparing Eq. 3 and Eq. 4, we get:

$$\lambda_1 v_1^\top v_2 = \lambda_2 v_1^\top v_2 \quad\Longrightarrow\quad (\lambda_1 - \lambda_2)\, v_1^\top v_2 = 0$$

We started with the assumption that the eigenvalues are distinct, so $\lambda_1 - \lambda_2 \neq 0$. For the product to be zero, the other term must be zero:

$$v_1^\top v_2 = 0$$

This shows that the eigenvectors $v_1$ and $v_2$ are orthogonal.
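Both facts are easy to check numerically; here is a small sketch with a randomly generated symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                   # an arbitrary real symmetric matrix

eigenvalues, V = np.linalg.eig(A)   # generic solver, no symmetry assumed

# Eigenvalues come out (numerically) real, as Proof 1 guarantees
assert np.allclose(np.imag(eigenvalues), 0)

# Eigenvectors of distinct eigenvalues are orthogonal, as Proof 2 guarantees
# (a random symmetric matrix has distinct eigenvalues with probability 1)
assert np.allclose(V.T @ V, np.eye(4))
```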
Yeah, that’s it.
Cool, now think about eigendecomposition. $P$ is suddenly an orthogonal matrix (its columns are the orthonormal eigenvectors, so $P^{-1} = P^\top$). Thus, for any real symmetric matrix $A$, its eigendecomposition is:

$$A = P D P^\top$$
This form has a powerful geometric interpretation: any transformation by a symmetric matrix can be understood as a sequence of three operations:
- A rotation back to the original coordinate system ($P$).
- A rotation or reflection ($P^\top$) to align the coordinate system with the eigenvectors.
- A scaling along these new axes by the eigenvalues (multiplication by $D$).
“Huh?”, you ask. Yeah, the above order is wrong.
When you apply the transformation to a vector $x$, you calculate $Ax = P D P^\top x$. The correct sequence of geometric operations is:
- Rotation/Reflection ($P^\top$): Aligns the coordinate system with the eigenvectors. This is essentially a change of basis.
- Scaling ($D$): Scales the vector along these new axes.
- Rotation Back ($P$): Returns to the original coordinate system.
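The three steps, traced numerically on an arbitrary symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # symmetric
eigenvalues, P = np.linalg.eigh(A)    # P is orthogonal for symmetric A
D = np.diag(eigenvalues)

x = np.array([1.0, -2.0])

step1 = P.T @ x        # change of basis: coordinates in the eigenvector basis
step2 = D @ step1      # scale along the eigenvector axes
step3 = P @ step2      # rotate back to the original coordinates

assert np.allclose(step3, A @ x)
```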
Cool, ok, but where do I use it?
I point you back to the “Eigendecomposition is powerful!” subsection.
Sources
- Mathematics for Machine Learning (link)