Matrices

A handful of small but recurring pieces of linear algebra used throughout the notes: the transpose of a matrix, symmetric and skew-symmetric matrices, diagonal matrices (including the identity), and the trace of a square matrix.

Transpose

The transpose of an $m \times n$ matrix $A$, denoted $A^\top$, is the $n \times m$ matrix obtained by swapping its rows and columns. If $A_{ij}$ denotes the entry in row $i$, column $j$, then:

$$(A^\top)_{ij} = A_{ji} \quad \text{for all } i, j$$

Concretely, the $i$-th row of $A$ becomes the $i$-th column of $A^\top$:

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \quad \Longrightarrow \quad A^\top = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{pmatrix}$$

Geometrically, transposing a square matrix is the same as mirror-reflecting it across its main diagonal, that is, the entries with $i = j$. Each off-diagonal entry $A_{ij}$ gets sent to position $(j, i)$; the diagonal entries themselves stay put.
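The entry-wise rule $(A^\top)_{ij} = A_{ji}$ can be tried out directly. The following is a minimal sketch in plain Python, assuming a matrix is stored as a list of equal-length rows; the helper name `transpose` is an illustrative choice, not something defined in these notes.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of equal-length rows."""
    # zip(*a) groups the j-th entry of every row, i.e. the j-th column of a,
    # and that column becomes the j-th row of the transpose.
    return [list(col) for col in zip(*a)]


A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3

At = transpose(A)      # 3 x 2
print(At)              # [[1, 4], [2, 5], [3, 6]]

# Entry-wise check of (A^T)_{ij} = A_{ji}
assert all(At[i][j] == A[j][i] for i in range(3) for j in range(2))

# Transposing twice gives back the original matrix.
assert transpose(At) == A
```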

Symmetric Matrix

Some square matrices are unchanged by that diagonal reflection — the transpose does nothing to them. They earn a special name.

A square matrix $A \in \mathbb{R}^{n \times n}$ is symmetric if it equals its own transpose:

$$A \text{ symmetric} \iff A = A^\top \iff A_{ij} = A_{ji} \text{ for all } i, j \in \{1, \ldots, n\}$$

Equivalently, every entry above the main diagonal mirrors the one below it. Only square matrices can be symmetric: if $A$ is $m \times n$ with $m \neq n$, then $A^\top$ has different dimensions ($n \times m$) and the equality $A = A^\top$ cannot even be stated.

Of the two matrices below, the one on the left is symmetric, since mirror-reflecting it across the diagonal leaves it unchanged. The one on the right is not: the $(1, 2)$ entry is $2$ while the $(2, 1)$ entry is $4$.

$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 5 & 7 \\ 3 & 7 & 9 \end{pmatrix} \quad \text{vs.} \quad \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix}$$
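The definition translates into a short check. This is a sketch in plain Python, assuming matrices are stored as lists of rows; `is_symmetric` is a hypothetical helper name introduced only for illustration.

```python
def is_symmetric(a):
    """Return True if the square matrix a (list of rows) equals its transpose."""
    n = len(a)
    # A non-square matrix cannot be symmetric.
    if any(len(row) != n for row in a):
        return False
    # Comparing entries above the diagonal with their mirrors below is enough.
    return all(a[i][j] == a[j][i] for i in range(n) for j in range(i + 1, n))


left = [[1, 2, 3],
        [2, 5, 7],
        [3, 7, 9]]
right = [[1, 2],
         [4, 5]]

print(is_symmetric(left))   # True
print(is_symmetric(right))  # False: entry (1, 2) is 2 but entry (2, 1) is 4
```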

One property worth noting up front: every eigenvalue of a real symmetric matrix is itself a real number. A general real matrix can have complex eigenvalues, because its characteristic polynomial need not split into linear factors over the reals, but symmetry forces every eigenvalue onto the real line. This is what makes sign-based reasoning about symmetric matrices (positive/negative definiteness, ordering eigenvalues by size, classifying Hessians, and so on) meaningful in the first place.
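The $2 \times 2$ case makes the claim concrete. For a matrix with rows $(a, b)$ and $(c, d)$, the eigenvalues are the roots of $\lambda^2 - (a + d)\lambda + (ad - bc)$. When $c = b$ the discriminant equals $(a - d)^2 + 4b^2 \geq 0$, so both roots are real. The sketch below is an illustration for $2 \times 2$ matrices only, not a general eigenvalue routine; `eigenvalues_2x2` and the two sample matrices are assumptions made for this example.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    # cmath.sqrt accepts negative arguments and returns a complex number.
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


sym = [[2, 1],
       [1, 2]]     # symmetric
rot = [[0, -1],
       [1, 0]]     # rotation by 90 degrees, not symmetric

sym_eigs = eigenvalues_2x2(sym)
rot_eigs = eigenvalues_2x2(rot)

# Symmetric case: both eigenvalues have zero imaginary part.
assert all(ev.imag == 0 for ev in sym_eigs)

# Non-symmetric case: the eigenvalues are the purely imaginary pair +i and -i.
assert all(ev.imag != 0 for ev in rot_eigs)
```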

Skew-Symmetric Matrix

The mirror image of the symmetric case: instead of being unchanged by the diagonal reflection, the matrix flips sign.

A square matrix $A \in \mathbb{R}^{n \times n}$ is skew-symmetric (also called antisymmetric) if its transpose equals its negative:

$$A \text{ skew-symmetric} \iff A^\top = -A \iff A_{ij} = -A_{ji} \text{ for all } i, j \in \{1, \ldots, n\}$$

Setting $i = j$ in the entry-wise condition forces $A_{ii} = -A_{ii}$, that is, $2A_{ii} = 0$, so every diagonal entry of a skew-symmetric matrix must be zero. Off-diagonal entries come in opposite-sign pairs across the diagonal: whatever sits at $(i, j)$ has its negative sitting at $(j, i)$.

$$\begin{pmatrix} 0 & 2 & -3 \\ -2 & 0 & 5 \\ 3 & -5 & 0 \end{pmatrix}$$

The diagonal is all zeros, and each off-diagonal entry is the negative of its mirror across the diagonal. For instance, the $(1, 2)$ entry is $2$ while the $(2, 1)$ entry is $-2$.
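The same kind of check works for the skew-symmetric condition. This sketch assumes a list-of-rows representation, and `is_skew_symmetric` is a hypothetical helper name. Including the diagonal in the loop enforces the zero-diagonal requirement automatically.

```python
def is_skew_symmetric(a):
    """Return True if the square matrix a (list of rows) satisfies a^T = -a."""
    n = len(a)
    if any(len(row) != n for row in a):
        return False
    # j starts at i, so the diagonal condition a[i][i] == -a[i][i] is checked too.
    return all(a[i][j] == -a[j][i] for i in range(n) for j in range(i, n))


S = [[0, 2, -3],
     [-2, 0, 5],
     [3, -5, 0]]

print(is_skew_symmetric(S))                    # True
print(is_skew_symmetric([[1, 2], [-2, 0]]))    # False: non-zero diagonal entry
```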

Diagonal Matrix

Yet another structural class of square matrix — this time defined by where the non-zero entries are allowed to sit.

A square matrix $D \in \mathbb{R}^{n \times n}$ is diagonal if every off-diagonal entry is zero:

$$D \text{ diagonal} \iff D_{ij} = 0 \text{ for all } i \neq j$$

Equivalently, $D$ has the form

$$D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}$$

with all entries off the main diagonal forced to zero, and the diagonal entries $d_1, \ldots, d_n$ free to be any scalars.

A few quick observations follow from the definition:

  • Every diagonal matrix is automatically symmetric: both $D_{ij}$ and $D_{ji}$ are zero off the diagonal, so $D_{ij} = D_{ji}$ holds for free.
  • The trace (the sum of the diagonal entries, defined in the Trace section) is just $d_1 + d_2 + \cdots + d_n$, the sum of the only entries that can be non-zero.
  • Multiplying two diagonal matrices is entry-wise on the diagonal: if $D, E$ are diagonal with entries $d_i$ and $e_i$, then $DE$ is also diagonal, with entries $d_i e_i$.

For example,

$$D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 7 \end{pmatrix}$$

is diagonal with $d_1 = 3,\ d_2 = -1,\ d_3 = 7$. Its trace is $3 + (-1) + 7 = 9$.
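The product rule for diagonal matrices can be checked numerically. In this sketch `diag` and `matmul` are hypothetical helpers written for the example, and the second matrix `E` is an arbitrary choice that does not appear in the notes.

```python
def diag(entries):
    """Build the diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain row-by-column matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


D = diag([3, -1, 7])
E = diag([2, 5, -4])      # arbitrary second diagonal matrix (assumption)

# The product is diagonal, with entries d_i * e_i.
assert matmul(D, E) == diag([3 * 2, -1 * 5, 7 * -4])

# Diagonal matrices commute with each other.
assert matmul(D, E) == matmul(E, D)

# Trace of D: the sum of its diagonal entries.
print(sum(D[i][i] for i in range(3)))   # 9
```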

The most important diagonal matrix is the one whose every diagonal entry is $1$. It shows up so often that it gets its own name and symbol.

The identity matrix $I_n \in \mathbb{R}^{n \times n}$ is the diagonal matrix with every diagonal entry equal to $1$:

$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \quad\Longleftrightarrow\quad (I_n)_{ij} = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta, equal to $1$ when $i = j$ and $0$ otherwise. It acts as the multiplicative identity for square matrices: $I_n A = A I_n = A$ for every $A \in \mathbb{R}^{n \times n}$.

When the size is clear from context, the subscript is dropped: $I$ alone refers to the identity matrix of whatever size makes the surrounding expression dimensionally consistent.
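A quick numerical check of $I_n A = A I_n = A$, again as a plain-Python sketch. The helpers `identity` and `matmul` and the sample matrix `A` are assumptions made for this example.

```python
def identity(n):
    """Build I_n entry by entry from the Kronecker delta."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain row-by-column matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
I3 = identity(3)

# Multiplying by the identity on either side leaves A unchanged.
assert matmul(I3, A) == A
assert matmul(A, I3) == A
```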

Trace

A scalar quantity attached to a square matrix, collapsing the whole $n \times n$ block down to a single number by summing along the diagonal.

The trace of a square matrix $A \in \mathbb{R}^{n \times n}$, denoted $\operatorname{tr}(A)$, is the sum of its diagonal entries:

$$\operatorname{tr}(A) = \sum_{i=1}^{n} A_{ii} = A_{11} + A_{22} + \cdots + A_{nn}$$

Only diagonal entries contribute; off-diagonal entries are ignored entirely. Unlike the transpose, which is defined for any $m \times n$ matrix, the trace needs rows and columns to be indexed by the same range, so it is undefined for non-square matrices.

$$\operatorname{tr}\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = 1 + 5 + 9 = 15$$

A few useful properties for $A, B \in \mathbb{R}^{n \times n}$ and $\lambda \in \mathbb{R}$:

  • Linearity: $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(\lambda A) = \lambda \operatorname{tr}(A)$
  • Transpose invariance: $\operatorname{tr}(A^\top) = \operatorname{tr}(A)$, since transposing leaves the diagonal entries in place and therefore preserves their sum
  • Cyclic property: $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, even though $AB \neq BA$ in general; the identity also holds for non-square $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times m}$, where both products are defined
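The three properties can be spot-checked on a concrete pair of matrices. This is only a numerical illustration, not a proof. The helpers `trace`, `transpose`, and `matmul`, the matrix `B`, and the scalar `lam` are all assumptions introduced for the example.

```python
def trace(a):
    """Sum of the diagonal entries of a square matrix stored as a list of rows."""
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("trace is only defined for square matrices")
    return sum(a[i][i] for i in range(n))


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
B = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]      # arbitrary second matrix (assumption)
lam = 3              # arbitrary scalar (assumption)

print(trace(A))      # 15

# Linearity
A_plus_B = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]
assert trace(A_plus_B) == trace(A) + trace(B)
assert trace([[lam * x for x in row] for row in A]) == lam * trace(A)

# Transpose invariance
assert trace(transpose(A)) == trace(A)

# Cyclic property: the traces agree even though the products differ.
assert matmul(A, B) != matmul(B, A)
assert trace(matmul(A, B)) == trace(matmul(B, A))
```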