Matrices

I’ve greatly improved my knowledge of Mathematics in recent years. Poor maths skills were a huge gap in my education, and filling that gap became unavoidable when I started my quantitatively-focused Ph.D. program. Matrices in particular have always been a struggle for me. I tried to understand them time and time again, but I couldn’t seem to wrap my head around them. At first, I convinced myself I could avoid the material that used them. When that didn’t work, I convinced myself that I wasn’t smart enough to understand them. Something clicked when I began visualizing them as the math version of a spreadsheet. Here’s how I structured my thinking about them…

In mathematics, a matrix (plural: matrices) is a rectangular array or table of numbers, symbols, or expressions, arranged in rows and columns, which is used to represent a mathematical object or a property of such an object. The dimensions of a matrix are denoted as $r \times c$, with the number of rows always being expressed first. Several operations can be performed on matrices; see this post for an introduction to Matrix Operations.
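As a quick illustration (a minimal NumPy sketch; the values are made up), a $2 \times 3$ matrix has 2 rows and 3 columns:

```python
import numpy as np

# A 2 x 3 matrix: 2 rows, 3 columns (rows always come first)
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(A.shape)  # (2, 3)
```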

Notational Conventions

Some authors try to use notation that helps readers distinguish between matrices, vectors and scalars (numbers). Greek letters ($\alpha, \beta$…) might be used for numbers, lower-case letters ($a, x, f$…) might be used for vectors, and capital letters ($A, F$…) for matrices. Matrices are also often denoted in bold font ($\bf{G}$ rather than $G$), and vectors are often written with an arrow above the letter.

Useful Identities

Transpose of a product: $(AB)^T = B^T A^T$. When you transpose a matrix, you basically “flip” each element across the diagonal, so $a_{ij}$ becomes $a_{ji}$. Check out my post on Matrix Operations for more info on how to do that. Here we’re saying that the transpose of the product of two matrices A and B is the same as the transpose of B multiplied by the transpose of A, with the order reversed.

Transpose of a sum: $(A+B)^T = A^T + B^T$. Similar to the identity above, but for the sum of A and B; here the order doesn’t reverse.

Inverse of a product: $(AB)^{-1} = B^{-1} A^{-1}$ provided A and B are square and invertible.

Products of powers: $A^k A^l = A^{k+l}$ for $k, l \geq 1$ in general, and for all integers $k, l$ if $A$ is invertible.
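All four identities are easy to sanity-check numerically. Here’s a minimal NumPy sketch (randomly generated square matrices are almost surely invertible, so the inverse identity applies; this is a spot check, not a proof):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# Transpose of a product: (AB)^T == B^T A^T
print(np.allclose((A @ B).T, B.T @ A.T))  # True

# Transpose of a sum: (A + B)^T == A^T + B^T
print(np.allclose((A + B).T, A.T + B.T))  # True

# Inverse of a product: (AB)^-1 == B^-1 A^-1
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))  # True

# Products of powers: A^2 A^3 == A^5
print(np.allclose(np.linalg.matrix_power(A, 2) @ np.linalg.matrix_power(A, 3),
                  np.linalg.matrix_power(A, 5)))  # True
```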

Types of Matrices

There are many types of matrix, such as…

Symmetric Matrix

A symmetric matrix is defined to be one for which $A' = A$, meaning that $a_{ij} = a_{ji}$ for all $i$ and $j$. This property can only hold for square matrices $(m = n)$, since otherwise $A'$ and $A$ won’t be of the same order (meaning not of the same size or shape). In plain English, a symmetric matrix is one that is square, for example 3 rows and 3 columns, and is the same whether it’s transposed or not.
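For example, a quick NumPy check that a matrix is unchanged by transposition:

```python
import numpy as np

S = np.array([[1, 7, 3],
              [7, 4, 5],
              [3, 5, 9]])

# S equals its own transpose, so it is symmetric
print(np.array_equal(S, S.T))  # True
```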

Identity Matrix

Also known as the unit matrix, this is a matrix of the order $(n \times n)$ which has zeros in all places except for the main diagonal, which is all 1s.

\[\bf{I}_{3 \times 3} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\]

In the case of the identity matrix, $IA = AI = A$. The identity matrix is important, so you will encounter it frequently. It’s essentially the matrix equivalent of “1”, whereby multiplying by it changes nothing, just as $5 \times 1$ equals $1 \times 5$ equals $5$. Two use cases come up frequently involving the identity matrix…

1. Determining whether one matrix is the inverse of the other. Let’s say you have two matrices, $A$ and $B$, and you want to figure out whether $B$ is actually the inverse of $A$, which would be $A^{-1}$. If you multiply $A \times B$, you’ll get the identity matrix as a result if $B$ is in fact the inverse of $A$. If you don’t get an identity matrix as a result, then $B$ really is its own thing, and is not the inverse of $A$ (there’s a short sketch of this check after this list).

2. Simplifying matrix multiplication. To paraphrase this post, let’s say I have the term $x + ax + bx = x(1 + a + b)$. I can calculate this in one of two ways. I can do the arithmetic on the left side of the equation, $x + ax + bx$, which involves two multiplications ($a \times x$, $b \times x$) and two additions. Or I can do the arithmetic on the right side of the equation, $x(1 + a + b)$, which has two additions ($1 + a$, then $+ b$) and then one multiplication ($x \times (1 + a + b)$). The right side of the equation features one fewer instance of having to multiply something, and as matrix multiplication is computationally expensive, this is a useful simplification. The identity matrix comes into play here because instead of “1” in my example above, you’d use the Identity Matrix, which is denoted as “$I$”.
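Here’s a minimal NumPy sketch of the inverse check, with a made-up pair of matrices chosen so that $B = A^{-1}$:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
B = np.array([[ 0.6, -0.7],
              [-0.2,  0.4]])

I = np.eye(2)  # the 2 x 2 identity matrix

# Use case 1: B is the inverse of A exactly when A @ B yields the identity
print(np.allclose(A @ B, I))  # True, so B really is A^-1

# And multiplying by I leaves a matrix unchanged: IA = AI = A
print(np.allclose(I @ A, A) and np.allclose(A @ I, A))  # True
```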

Diagonal Matrix

A diagonal matrix is like the identity matrix, in that only the diagonal of the matrix is populated with values - everything else is 0. The identity matrix is a diagonal matrix, but a diagonal matrix might not necessarily be the identity matrix.
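In NumPy you can build one directly from its diagonal entries:

```python
import numpy as np

# np.diag builds a diagonal matrix from the given diagonal values
D = np.diag([2, 5, 9])
print(D)
# [[2 0 0]
#  [0 5 0]
#  [0 0 9]]

# The identity matrix is the special case where every diagonal entry is 1
print(np.array_equal(np.diag([1, 1, 1]), np.eye(3)))  # True
```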

Idempotent Matrix

An idempotent matrix is a matrix which, when multiplied by itself, yields itself. A square matrix $A$ is idempotent if $A^2 = A$, from which it follows that $A = A^2 = A^3 = \dots$
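The classic example is a projection matrix: projecting twice is the same as projecting once. A minimal NumPy check:

```python
import numpy as np

# P projects any 2D vector onto the x-axis; projecting twice changes nothing
P = np.array([[1, 0],
              [0, 0]])

print(np.array_equal(P @ P, P))      # True: P^2 = P
print(np.array_equal(P @ P @ P, P))  # True: P^3 = P
```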

Null Matrix

The null matrix is simply a matrix of the order $(m \times n)$ which is populated entirely by zeros.
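In NumPy:

```python
import numpy as np

# A 2 x 3 null matrix: every entry is zero
Z = np.zeros((2, 3))
print(Z)
# [[0. 0. 0.]
#  [0. 0. 0.]]
```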

Jacobian Matrix

In vector calculus, the Jacobian Matrix of a vector-valued function of several variables is the matrix of all its first-order partial derivatives. When this matrix is square, its determinant is the Jacobian determinant.
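Written out, for a function $f : \mathbb{R}^n \to \mathbb{R}^m$ with components $f_1, \dots, f_m$, each row of the Jacobian is the gradient of one component:

\[\mathbf J_f = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[2.2ex] \vdots & \ddots & \vdots \\[2.2ex] \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix}\]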

Hessian Matrix

A Hessian Matrix is a matrix of second-order derivatives of a scalar-valued function. The Hessian matrix of a multivariate function $f(x_1, x_2, \dots, x_n)$ organizes all second-order partial derivatives into a matrix…

\[\mathbf H_f=\begin{bmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\[2.2ex] \dfrac{\partial^2 f}{\partial x_2\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2\,\partial x_n} \\[2.2ex] \vdots & \vdots & \ddots & \vdots \\[2.2ex] \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \dfrac{\partial^2 f}{\partial x_n\,\partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{bmatrix}\]

Although a bit more advanced, the gist is that a Hessian matrix is a matrix of second-order partial derivatives of a function. They’re used extensively in Machine Learning, so if you’re headed down that path, I recommend (much) further reading on them.
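As a small worked example, take $f(x, y) = x^2 y$. Its second-order partials give

\[\mathbf H_f = \begin{bmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\[2.2ex] \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{bmatrix} = \begin{bmatrix} 2y & 2x \\ 2x & 0 \end{bmatrix}\]

Note that the mixed partials match, so the Hessian is symmetric (this holds whenever the second derivatives are continuous).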

Invertible Matrix

In linear algebra, an n-by-n square matrix A is called invertible (also nonsingular or nondegenerate), if there exists an n-by-n square matrix B such that $AB = BA = I_n$, where $I_n$ denotes the n-by-n identity matrix and the multiplication used is ordinary matrix multiplication. If this is the case, then the matrix B is uniquely determined by A, and is called the (multiplicative) inverse of A, denoted by $A^{-1}$.

Matrix inversion is the process of finding the matrix B that satisfies the prior equation for a given invertible matrix A.

A square matrix is invertible if and only if its determinant is nonzero.
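A minimal NumPy sketch tying the determinant test to the defining equation $AB = BA = I_n$:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [5.0, 3.0]])

# A nonzero determinant tells us A is invertible
print(np.linalg.det(A))  # 1.0 (up to floating-point error), nonzero

B = np.linalg.inv(A)  # B = A^-1

# B satisfies the defining equation: AB = BA = I
print(np.allclose(A @ B, np.eye(2)))  # True
print(np.allclose(B @ A, np.eye(2)))  # True
```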

Singular Matrix

A Singular Matrix is a square matrix which does not have an inverse. In other words, a Singular Matrix is one which cannot be inverted; equivalently, its determinant is zero.
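For example, a matrix whose rows are linearly dependent is singular:

```python
import numpy as np

# The second row is twice the first, so the rows are linearly dependent
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.det(S))  # 0.0: S has no inverse
# np.linalg.inv(S) would raise numpy.linalg.LinAlgError: Singular matrix
```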

Positive Semi-Definite Matrix

A Positive Semi-Definite Matrix is a symmetric matrix whose Eigenvalues are all nonnegative. Equivalently, a symmetric matrix $M$ is positive semi-definite if the real number $z M z^T$ is nonnegative for every real vector $z$. (The complex-valued analogue of a symmetric matrix is a Hermitian matrix, which is what you’ll see in more general treatments.)
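A quick NumPy check using the eigenvalue characterization (np.linalg.eigvalsh is the eigenvalue routine for symmetric/Hermitian matrices):

```python
import numpy as np

M = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])

# A symmetric matrix has real eigenvalues; all nonnegative means PSD
eigenvalues = np.linalg.eigvalsh(M)
print(eigenvalues)               # [1. 3.]

print(np.all(eigenvalues >= 0))  # True: M is positive semi-definite
```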