Eigenvalues and Eigenvectors

For a square matrix $A$, an eigenvector is a non-zero vector that the matrix only stretches or shrinks: its direction is preserved (up to sign) under multiplication by $A$. The factor by which it gets scaled is the corresponding eigenvalue.

These pairs reveal how the linear map $\mathbf{x} \mapsto A\mathbf{x}$ behaves along its “natural axes”: every other vector twists toward or away from these eigendirections, but the eigenvectors themselves only get rescaled.

Definition

For a square matrix $A \in \mathbb{R}^{n \times n}$, a non-zero vector $\mathbf{v} \in \mathbb{R}^n$ is an eigenvector of $A$ with eigenvalue $\lambda \in \mathbb{R}$ if:

$$A \mathbf{v} = \lambda \mathbf{v}$$

The pair $(\lambda, \mathbf{v})$ is called an eigenpair of $A$. The requirement $\mathbf{v} \neq \mathbf{0}$ is essential: the equation is trivially satisfied by the zero vector for any $\lambda$, so without it the definition would be vacuous.

The scalar $\lambda$ may be complex even when $A$ has real entries (in which case $\mathbf{v}$ has complex entries too), but for the symmetric matrices most often encountered in this material (Hessian matrices, covariance matrices, and so on) every eigenvalue is guaranteed to be real.
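The definition translates directly into a numerical check. The following is a minimal sketch in plain Python; the helper names `mat_vec` and `is_eigenpair`, the tolerance `tol`, and the example matrix are assumptions made for illustration only.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def is_eigenpair(A, lam, v, tol=1e-9):
    """Return True if v is non-zero and A v equals lam * v within tol."""
    if all(abs(x) <= tol for x in v):
        return False  # the zero vector is excluded by definition
    Av = mat_vec(A, v)
    return all(abs(Av_i - lam * v_i) <= tol for Av_i, v_i in zip(Av, v))


# Hypothetical example matrix, chosen only for illustration.
A = [[2, 1],
     [1, 2]]

print(is_eigenpair(A, 3, [1, 1]))   # A v = (3, 3) = 3 v
print(is_eigenpair(A, 3, [1, 0]))   # A v = (2, 1), not a multiple of v
print(is_eigenpair(A, 3, [0, 0]))   # zero vector is rejected
```

The explicit zero-vector test mirrors the requirement $\mathbf{v} \neq \mathbf{0}$: without it, every scalar would pass the check.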

Finding Eigenvalues

Start from the eigenvalue equation $A\mathbf{v} = \lambda \mathbf{v}$ and move everything to one side:

$$A\mathbf{v} - \lambda \mathbf{v} = \mathbf{0} \quad \Longrightarrow \quad (A - \lambda I) \mathbf{v} = \mathbf{0}$$

(The identity matrix sneaks in because $\lambda \mathbf{v} = \lambda I \mathbf{v}$; it’s the only way to subtract a scalar from a matrix and keep the equation matrix-shaped.)

The key observation: we’re looking for a non-zero $\mathbf{v}$ that the matrix $A - \lambda I$ sends to $\mathbf{0}$. A healthy, invertible matrix never does that: multiplying any non-zero vector by an invertible matrix gives back another non-zero vector. So $A - \lambda I$ has to be non-invertible; it must collapse some direction down to $\mathbf{0}$. From the determinant properties, this is exactly the condition

$$\det(A - \lambda I) = 0$$

which gives a single equation in the unknown $\lambda$.

The characteristic polynomial of a square matrix $A \in \mathbb{R}^{n \times n}$ is

$$p_A(\lambda) = \det(A - \lambda I)$$

Its roots are exactly the eigenvalues of $A$. For an $n \times n$ matrix, $p_A$ is a polynomial of degree $n$ in $\lambda$, so $A$ has at most $n$ distinct eigenvalues, and exactly $n$ when complex roots are allowed and each is counted with multiplicity.

Picture $\lambda$ as a tuning knob you can dial. For most settings, the matrix $A - \lambda I$ is perfectly healthy: its determinant is some non-zero number, and the matrix has an inverse. But at a few special, isolated values of $\lambda$ (exactly the eigenvalues) the matrix collapses, its determinant snaps to zero, and a direction in space gets crushed to $\mathbf{0}$. The characteristic polynomial $p_A(\lambda)$ is just the determinant of $A - \lambda I$ written as a function of $\lambda$, and “finding eigenvalues” is the same as hunting for the values where this polynomial equals zero.
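The tuning-knob picture can be tried out by evaluating the determinant at a handful of settings. This is only a sketch for the $2 \times 2$ case; the function name `char_poly_2x2` and the example matrix are assumptions for illustration, and the exact comparison with zero works here only because all values are integers.

```python
def char_poly_2x2(A, lam):
    """Evaluate det(A - lam * I) for a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = A
    return (a - lam) * (d - lam) - b * c


# Hypothetical example matrix, chosen only for illustration.
A = [[2, 1],
     [1, 2]]

# Dial the knob through a few settings and watch the determinant.
for lam in [0, 1, 2, 3, 4]:
    value = char_poly_2x2(A, lam)
    status = "singular: lam is an eigenvalue" if value == 0 else "invertible"
    print(f"lam = {lam}: det(A - lam I) = {value:+d} ({status})")
```

For this matrix the determinant vanishes at exactly two of the five settings, and the matrix stays invertible at the rest.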

In practice the recipe is three steps:

  1. Form $A - \lambda I$ by subtracting $\lambda$ from every diagonal entry of $A$ (off-diagonal entries stay put).
  2. Compute its determinant. Expanding gives a polynomial in $\lambda$: the characteristic polynomial $p_A(\lambda)$.
  3. Solve $p_A(\lambda) = 0$ for $\lambda$. Each root is an eigenvalue.
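For a $2 \times 2$ matrix the three steps can be sketched in a few lines of Python. The function name `eigenvalues_2x2` and the example matrices are assumptions for illustration. Steps 1 and 2 are carried out by hand inside the docstring, and step 3 uses the quadratic formula; larger matrices need a general-purpose numerical routine instead.

```python
import cmath


def eigenvalues_2x2(A):
    """Return both roots of det(A - lam * I) = 0 for a 2x2 matrix.

    Expanding the determinant gives lam^2 - (a + d) * lam + (a*d - b*c),
    which is solved here with the quadratic formula. cmath.sqrt is used so
    that a negative discriminant yields complex roots instead of an error.
    """
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


print(eigenvalues_2x2([[2, 1], [1, 2]]))    # real roots, returned as complex numbers
print(eigenvalues_2x2([[0, -1], [1, 0]]))   # 90-degree rotation: a complex pair
```

The rotation matrix has real entries but no real eigenvalues, which is why the sketch works in complex arithmetic throughout.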

Finding Eigenvectors

Once you have an eigenvalue $\lambda$, the matching eigenvectors are the directions that $A - \lambda I$ flattens to zero, and you already know at least one such direction must exist, because that collapsing-a-direction property is exactly what made $\lambda$ an eigenvalue in the first place. To pin it down, plug the value of $\lambda$ back into $A - \lambda I$ and solve

$$(A - \lambda I) \mathbf{v} = \mathbf{0}$$

for $\mathbf{v}$. This is just a linear system: for a $2 \times 2$ matrix it boils down to a single linear relation between $v_1$ and $v_2$, leaving one free parameter to slide along.

One thing to expect: if $\mathbf{v}$ is an eigenvector, so is $2\mathbf{v}$, $-\mathbf{v}$, or any other non-zero scalar multiple, because multiplying both sides of $A\mathbf{v} = \lambda \mathbf{v}$ by a constant doesn’t change anything. That’s not a flaw in the recipe; it just means the eigenvector’s direction is what’s pinned down, not its length. By convention you pick a clean representative with simple integer entries (or unit length).
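The plug-and-solve step for a $2 \times 2$ matrix can be sketched as follows. The function name `eigenvector_2x2`, the tolerance `tol`, and the example matrix are assumptions for illustration, and the code takes for granted that the supplied `lam` really is an eigenvalue.

```python
def eigenvector_2x2(A, lam, tol=1e-9):
    """Return one non-zero solution v of (A - lam * I) v = 0 for a 2x2 matrix.

    Assumes lam really is an eigenvalue of A, so A - lam * I is singular and
    its two rows are parallel. A row (p, q) is then annihilated by (q, -p).
    """
    (a, b), (c, d) = A
    rows = [(a - lam, b), (c, d - lam)]
    for p, q in rows:
        if abs(p) > tol or abs(q) > tol:
            return (q, -p)
    # Both rows vanish: A = lam * I, so every non-zero vector is an eigenvector.
    return (1, 0)


# Hypothetical example matrix with eigenvalues 3 and 1.
A = [[2, 1],
     [1, 2]]

print(eigenvector_2x2(A, 3))   # a vector along (1, 1)
print(eigenvector_2x2(A, 1))   # a vector along (1, -1)
```

The returned vector is just one representative; any non-zero multiple of it is an equally valid answer.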

Worked Example

Find the eigenvalues and eigenvectors of $A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}$.

Step 1. Form $A - \lambda I$ by subtracting $\lambda$ from each diagonal entry:

$$A - \lambda I = \begin{pmatrix} 4 - \lambda & 1 \\ 2 & 3 - \lambda \end{pmatrix}$$

Step 2. Compute its determinant, which is the characteristic polynomial:

$$p_A(\lambda) = \det(A - \lambda I) = (4 - \lambda)(3 - \lambda) - 1 \cdot 2 = \lambda^2 - 7\lambda + 10$$

Step 3. Set $p_A(\lambda) = 0$ and solve. Factoring is easiest here:

$$\lambda^2 - 7\lambda + 10 = (\lambda - 5)(\lambda - 2) = 0 \quad \Longrightarrow \quad \lambda_1 = 5,\ \lambda_2 = 2$$

So $A$ has two eigenvalues, $5$ and $2$.

Step 4. For each eigenvalue, plug it back in and solve $(A - \lambda I)\mathbf{v} = \mathbf{0}$ to find the matching eigenvector.

For $\lambda_1 = 5$:

$$\begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \mathbf{0} \quad \Longrightarrow \quad v_1 = v_2 \quad \Longrightarrow \quad \mathbf{v}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

For $\lambda_2 = 2$:

$$\begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \mathbf{0} \quad \Longrightarrow \quad 2v_1 + v_2 = 0 \quad \Longrightarrow \quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$$

Sanity check: both pairs satisfy $A\mathbf{v} = \lambda \mathbf{v}$:

$$A \mathbf{v}_1 = \begin{pmatrix} 4 \cdot 1 + 1 \cdot 1 \\ 2 \cdot 1 + 3 \cdot 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 5 \end{pmatrix} = 5 \mathbf{v}_1$$

$$A \mathbf{v}_2 = \begin{pmatrix} 4 \cdot 1 + 1 \cdot (-2) \\ 2 \cdot 1 + 3 \cdot (-2) \end{pmatrix} = \begin{pmatrix} 2 \\ -4 \end{pmatrix} = 2 \mathbf{v}_2$$

The same recipe extends to larger matrices: for $3 \times 3$ matrices the characteristic polynomial is cubic in $\lambda$ and yields up to three eigenvalues, each with its own eigenvector(s) found by the same plug-and-solve step.

Sum and Product of Eigenvalues

For any square matrix, the eigenvalues come paired with two simple bookkeeping identities — they sum to the trace, and they multiply to the determinant.

For a square matrix $A$ with eigenvalues $\lambda_1, \ldots, \lambda_n$ (counted with multiplicity), the trace equals the sum of the eigenvalues:

$$\operatorname{tr}(A) = \lambda_1 + \lambda_2 + \cdots + \lambda_n.$$

For a square matrix $A$ with eigenvalues $\lambda_1, \ldots, \lambda_n$ (counted with multiplicity), the determinant equals the product of the eigenvalues:

$$\det(A) = \lambda_1 \cdot \lambda_2 \cdots \lambda_n.$$

Both follow from a single observation: the characteristic polynomial can be expanded directly or read off its roots, and matching coefficients recovers the identities at once. The $2 \times 2$ case spells it out.

Expanding $\det(A - \lambda I)$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ gives

$$\det(A - \lambda I) = (a - \lambda)(d - \lambda) - bc = \lambda^2 - (a + d)\lambda + (ad - bc).$$

The same polynomial, with $\lambda_1, \lambda_2$ as its roots, also factors as

$$(\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1 \lambda_2.$$

Matching coefficients (Vieta’s formulas) reads off both identities at once:

$$\lambda_1 + \lambda_2 = a + d = \operatorname{tr}(A), \qquad \lambda_1 \lambda_2 = ad - bc = \det(A).$$

For an $n \times n$ matrix the picture is the same: match coefficients of the degree-$n$ characteristic polynomial with its fully-factored form.
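Both identities are easy to confirm on the matrix from the worked example, whose eigenvalues were found to be $5$ and $2$. The snippet below is only an illustrative check; the variable names are assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math

# Matrix and eigenvalues taken from the worked example above.
A = [[4, 1],
     [2, 3]]
eigenvalues = [5, 2]

trace = A[0][0] + A[1][1]                      # sum of diagonal entries
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # ad - bc

print(trace, sum(eigenvalues))        # both are 7
print(det, math.prod(eigenvalues))    # both are 10
```

A check like this is a cheap way to catch arithmetic slips after computing eigenvalues by hand.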

Diagonal Matrices: A Free Lunch

For a diagonal matrix the entire computation collapses to inspection — no characteristic polynomial needed.

If $D \in \mathbb{R}^{n \times n}$ is diagonal with entries $d_1, d_2, \ldots, d_n$ along its diagonal, then:

  • The eigenvalues of $D$ are exactly the diagonal entries: $\lambda_i = d_i$ for $i \in \{1, \ldots, n\}$.
  • The corresponding eigenvectors are the standard basis vectors: $D \mathbf{e}_i = d_i \mathbf{e}_i$.

The reason is mechanical: multiplying $D$ by $\mathbf{e}_i$ picks out the $i$-th column of $D$, and since $D$ is diagonal, that column is $d_i \mathbf{e}_i$. So $\mathbf{e}_i$ maps to itself, scaled by the matching diagonal entry, which is the very definition of an eigenpair.

The characteristic polynomial confirms it directly: $D - \lambda I$ is also diagonal, with entries $d_i - \lambda$ along its diagonal, and the determinant of a diagonal matrix is the product of its diagonal entries, so

$$p_D(\lambda) = \det(D - \lambda I) = \prod_{i=1}^{n} (d_i - \lambda)$$

which is already in factored form: its roots are visibly $d_1, \ldots, d_n$.

For example, the matrix

$$D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 7 \end{pmatrix}$$

has eigenvalues $\lambda_1 = 3,\ \lambda_2 = -1,\ \lambda_3 = 7$ with corresponding eigenvectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$. The characteristic polynomial is

$$p_D(\lambda) = (3 - \lambda)(-1 - \lambda)(7 - \lambda)$$

with roots reading straight off the diagonal.
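The diagonal case can also be confirmed mechanically. The sketch below multiplies the matrix with diagonal entries $3, -1, 7$ by each standard basis vector; the helper names `mat_vec` and `basis_vector` are assumptions for illustration.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def basis_vector(i, n):
    """Return the i-th standard basis vector of length n (0-based index)."""
    return [1 if j == i else 0 for j in range(n)]


D = [[3, 0, 0],
     [0, -1, 0],
     [0, 0, 7]]
n = len(D)

for i in range(n):
    e_i = basis_vector(i, n)
    image = mat_vec(D, e_i)                 # picks out column i of D
    scaled = [D[i][i] * x for x in e_i]     # d_i * e_i
    print(f"D e_{i + 1} = {image}, d_{i + 1} e_{i + 1} = {scaled}")
```

Each printed pair agrees, which matches the claim that the diagonal entries are the eigenvalues and the standard basis vectors are the eigenvectors.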