Matrices

This section collects a handful of small but recurring pieces of linear algebra used throughout the notes: the transpose of a matrix, symmetric and skew-symmetric matrices, diagonal matrices (including the identity), and the trace of a square matrix.

Transpose

The transpose of an $m \times n$ matrix $A$, denoted $A^\top$, is the $n \times m$ matrix obtained by swapping its rows and columns. If $A_{ij}$ denotes the entry in row $i$, column $j$, then:

(A^\top)_{ij} = A_{ji} \quad \text{for all } i, j

Concretely, the $i$-th row of $A$ becomes the $i$-th column of $A^\top$:

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \quad \Longrightarrow \quad A^\top = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{pmatrix}

Geometrically, transposing a square matrix mirror-reflects it across its main diagonal (the entries with $i = j$). Each off-diagonal entry $A_{ij}$ is sent to position $(j, i)$; the diagonal entries themselves stay put.
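The index rule $(A^\top)_{ij} = A_{ji}$ translates almost directly into code. The following is a minimal sketch, assuming a matrix is stored as a Python list of equal-length rows; the function name `transpose` is chosen here for illustration.

```python
def transpose(A):
    """Return the transpose of a matrix stored as a list of equal-length rows."""
    if not A:
        return []
    m, n = len(A), len(A[0])
    # Entry (i, j) of the result is entry (j, i) of A.
    return [[A[j][i] for j in range(m)] for i in range(n)]


A = [[1, 2, 3],
     [4, 5, 6]]            # 2 x 3
At = transpose(A)          # 3 x 2
print(At)                  # [[1, 4], [2, 5], [3, 6]]
print(transpose(At) == A)  # True: transposing twice gives back A
```

The first row of `A` becomes the first column of `At`, matching the picture above.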

Symmetric Matrix

Some square matrices are unchanged by that diagonal reflection — the transpose does nothing to them. They earn a special name.

A square matrix $A \in \mathbb{R}^{n \times n}$ is symmetric if it equals its own transpose:

A \text{ symmetric} \iff A = A^\top \iff A_{ij} = A_{ji} \text{ for all } i, j \in \{1, \ldots, n\}

Equivalently, every entry above the main diagonal equals its mirror image below it. Only square matrices can be symmetric: if $A$ is $m \times n$ with $m \neq n$, then $A^\top$ has different dimensions ($n \times m$), so the equality $A = A^\top$ is impossible.

Of the two matrices below, the one on the left is symmetric: mirror-reflecting it across the diagonal leaves it unchanged. The one on the right is not: the $(1, 2)$ entry is $2$ while the $(2, 1)$ entry is $4$.

\begin{pmatrix} 1 & 2 & 3 \\ 2 & 5 & 7 \\ 3 & 7 & 9 \end{pmatrix} \quad \text{vs.} \quad \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix}
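The entry-wise condition $A_{ij} = A_{ji}$ can be checked mechanically. This is a small sketch, assuming matrices are lists of rows with exactly comparable entries (for floating-point data a tolerance would be needed instead of `==`); `is_symmetric` is a name introduced here for illustration.

```python
def is_symmetric(A):
    """True if A (a list of rows) is square and A[i][j] == A[j][i] for all i, j."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False  # non-square matrices cannot be symmetric
    # It is enough to compare each entry above the diagonal with its mirror.
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(i + 1, n))


left = [[1, 2, 3], [2, 5, 7], [3, 7, 9]]
right = [[1, 2], [4, 5]]
print(is_symmetric(left))   # True
print(is_symmetric(right))  # False: entry (1, 2) is 2 but entry (2, 1) is 4
```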

One property worth noting up front: every eigenvalue of a real symmetric matrix is itself a real number. A general real matrix can have complex eigenvalues, because its characteristic polynomial need not split into linear factors over the reals, but symmetry forces every eigenvalue onto the real line. This is what makes sign-based reasoning about symmetric matrices (positive/negative definiteness, ordering eigenvalues by size, classifying Hessians, and so on) meaningful in the first place.
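For the $2 \times 2$ case the claim can be checked by hand. As a sketch, take a generic real symmetric matrix with entries $a, b, d$ (symbols introduced here only for illustration). Its eigenvalues are the roots of the characteristic polynomial:

\det\begin{pmatrix} a - \lambda & b \\ b & d - \lambda \end{pmatrix} = \lambda^2 - (a + d)\lambda + (ad - b^2)

The discriminant of this quadratic is

(a + d)^2 - 4(ad - b^2) = (a - d)^2 + 4b^2 \geq 0

so both roots are real. For a non-symmetric matrix with $(2, 1)$ entry $c \neq b$, the term $4b^2$ becomes $4bc$, which can be negative enough to push the discriminant below zero and produce complex eigenvalues.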

Skew-Symmetric Matrix

The mirror image of the symmetric case: instead of being unchanged by the diagonal reflection, the matrix flips sign.

A square matrix $A \in \mathbb{R}^{n \times n}$ is skew-symmetric (also called antisymmetric) if its transpose equals its negative:

A \text{ skew-symmetric} \iff A^\top = -A \iff A_{ij} = -A_{ji} \text{ for all } i, j \in \{1, \ldots, n\}

Setting $i = j$ in the entry-wise condition forces $A_{ii} = -A_{ii}$, that is, $2A_{ii} = 0$, so every diagonal entry of a skew-symmetric matrix must be zero. Off-diagonal entries come in opposite-sign pairs across the diagonal: whatever sits at $(i, j)$ has its negative sitting at $(j, i)$.

\begin{pmatrix} 0 & 2 & -3 \\ -2 & 0 & 5 \\ 3 & -5 & 0 \end{pmatrix}

The diagonal is all zeros, and each off-diagonal entry is the negative of its mirror across the diagonal. For instance, the $(1, 2)$ entry is $2$ while the $(2, 1)$ entry is $-2$.
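The condition $A_{ij} = -A_{ji}$ can be tested in the same entry-wise way as symmetry. A minimal sketch, assuming list-of-rows matrices with exactly comparable entries; `is_skew_symmetric` is a name introduced here for illustration.

```python
def is_skew_symmetric(A):
    """True if A (a list of rows) is square and A[i][j] == -A[j][i] for all i, j."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    # j starts at i so the diagonal case A[i][i] == -A[i][i] is checked too,
    # which only holds when A[i][i] is zero.
    return all(A[i][j] == -A[j][i] for i in range(n) for j in range(i, n))


S = [[0, 2, -3],
     [-2, 0, 5],
     [3, -5, 0]]
print(is_skew_symmetric(S))                 # True
print(is_skew_symmetric([[1, 2], [-2, 0]])) # False: non-zero diagonal entry
```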

Diagonal Matrix

Yet another structural class of square matrix — this time defined by where the non-zero entries are allowed to sit.

A square matrix $D \in \mathbb{R}^{n \times n}$ is diagonal if every off-diagonal entry is zero:

D \text{ diagonal} \iff D_{ij} = 0 \text{ for all } i \neq j

Equivalently, $D$ has the form

D = \begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}

with all entries off the main diagonal forced to zero, and the diagonal entries $d_1, \ldots, d_n$ free to be any scalars.

A few quick observations follow from the definition:

Every diagonal matrix is automatically symmetric: both $D_{ij}$ and $D_{ji}$ are zero off the diagonal, so $D_{ij} = D_{ji}$ holds for free.
The trace, defined below as the sum of the diagonal entries, is simply $d_1 + d_2 + \cdots + d_n$.
Multiplying two diagonal matrices is entry-wise on the diagonal: if $D, E$ are diagonal with entries $d_i$ and $e_i$, then $DE$ is also diagonal, with entries $d_i e_i$.

D = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 7 \end{pmatrix}

is diagonal with $d_1 = 3,\ d_2 = -1,\ d_3 = 7$. Its trace is $3 + (-1) + 7 = 9$.
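The product rule for diagonal matrices can be confirmed numerically on this example. The sketch below assumes list-of-rows matrices; the helpers `diag` and `matmul` and the second matrix `E` are hypothetical, introduced only for illustration.

```python
def diag(entries):
    """Build the diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices stored as lists of rows (inner sizes must match)."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]


D = diag([3, -1, 7])
E = diag([2, 5, -4])                       # illustrative second diagonal matrix
print(matmul(D, E) == diag([6, -5, -28]))  # True: entries are d_i * e_i
print(sum(D[i][i] for i in range(3)))      # 9, the trace of D
```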

The most important diagonal matrix is the one whose every diagonal entry is $1$ — it shows up so often that it gets its own name and symbol.

The identity matrix $I_n \in \mathbb{R}^{n \times n}$ is the diagonal matrix with every diagonal entry equal to $1$:

I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \quad\Longleftrightarrow\quad (I_n)_{ij} = \delta_{ij}

where $\delta_{ij}$ is the Kronecker delta. It acts as the multiplicative identity for square matrices: $I_n A = A I_n = A$ for every $A \in \mathbb{R}^{n \times n}$.

When the size is clear from context, the subscript is dropped — $I$ alone refers to the identity matrix of whatever size makes the surrounding expression dimensionally consistent.
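The Kronecker-delta description gives a one-line construction of $I_n$, and the identity property can be checked on a small example. This is a sketch assuming list-of-rows matrices; `identity` and `matmul` are helper names introduced here for illustration.

```python
def identity(n):
    """n x n identity matrix: entry (i, j) is 1 if i == j, else 0."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Product of two matrices stored as lists of rows (inner sizes must match)."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
I = identity(3)
print(matmul(I, A) == A)  # True
print(matmul(A, I) == A)  # True
```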

Trace

A scalar quantity attached to a square matrix — collapsing the whole $n \times n$ block down to a single number by summing along the diagonal.

The trace of a square matrix $A \in \mathbb{R}^{n \times n}$, denoted $\operatorname{tr}(A)$, is the sum of its diagonal entries:

\operatorname{tr}(A) = \sum_{i=1}^{n} A_{ii} = A_{11} + A_{22} + \cdots + A_{nn}

Only diagonal entries contribute; off-diagonal entries are ignored entirely. Unlike the transpose, which is defined for any $m \times n$ matrix, the trace needs rows and columns to be indexed by the same range, so it is undefined for non-square matrices.

\operatorname{tr}\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix} = 1 + 5 + 9 = 15
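In code the definition is a single sum over the diagonal. A minimal sketch, assuming list-of-rows matrices; the function name `trace` and the choice to raise `ValueError` for non-square input are illustrative decisions, not something fixed by the notes.

```python
def trace(A):
    """Sum of the diagonal entries of a square matrix stored as a list of rows."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("trace is only defined for square matrices")
    return sum(A[i][i] for i in range(n))


print(trace([[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]))  # 15
```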

A few useful properties for $A, B \in \mathbb{R}^{n \times n}$ and $\lambda \in \mathbb{R}$:

Linearity: $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$ and $\operatorname{tr}(\lambda A) = \lambda \operatorname{tr}(A)$
Transpose invariance: $\operatorname{tr}(A^\top) = \operatorname{tr}(A)$ — transposing preserves the diagonal, so it preserves the sum
Cyclic property: $\operatorname{tr}(AB) = \operatorname{tr}(BA)$ whenever both products are defined, even though $AB \neq BA$ in general
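The three properties can be spot-checked on a concrete pair of matrices. The sketch below assumes list-of-rows matrices; the values of `A`, `B`, and `lam`, and the helper names, are chosen here purely for illustration. A numerical check on one example is not a proof, only a sanity test.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows (inner sizes must match)."""
    inner = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(inner))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
lam = 3

A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
lam_A = [[lam * x for x in row] for row in A]

print(trace(A_plus_B) == trace(A) + trace(B))      # True (linearity, sum)
print(trace(lam_A) == lam * trace(A))              # True (linearity, scaling)
print(trace(transpose(A)) == trace(A))             # True (transpose invariance)
print(matmul(A, B) != matmul(B, A))                # True (the products differ)
print(trace(matmul(A, B)) == trace(matmul(B, A)))  # True (cyclic property)
```

Here $AB$ and $BA$ are different matrices, yet both have trace $5$.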