2.4 Determinants

You can check directly that a 2✕2 matrix \(A=\begin{pmatrix} a&b\\c&d \end{pmatrix}\) is invertible if and only if \(ad-bc\neq 0\). The quantity \(ad-bc\) is called the determinant of A, and we want to generalize this to larger matrices.
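Indeed, when \(ad-bc\neq 0\) one can check directly that \(\frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\) is a two-sided inverse of A.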


Definition 2.12 Let \(A=(a_{ij})\) be n✕n. The determinant of A, written \(\det (A)\), is the number

\[ \sum_{\sigma \in S_n} \sgn(\sigma) a_{1,\sigma(1)}a_{2,\sigma(2)}\cdots a_{n,\sigma(n)}. \]


Example 2.6

  • \(S_2=\{\id, (1,2)\}\), so the determinant of \(A=(a_{ij})\) is \[ a_{11}a_{22} - a_{12}a_{21}\] where the first term is the one arising from \(\sigma=\id\) and the second is from \(\sigma=(1,2)\). This agrees with the expression \(ad-bc\) at the start of this subsection.
  • \(S_3=\{\id, (1,2,3), (1,3,2), (1,2), (1,3), (2,3)\}\) and the determinant of \(A=(a_{ij})\) is \[ a_{11}a_{22}a_{33}+a_{12}a_{23}a_{31}+a_{13}a_{21}a_{32} -a_{12}a_{21}a_{33}-a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32}. \]
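
To make Definition 2.12 concrete, here is a minimal Python sketch of the permutation sum (an illustration only, not part of the development: the names `sign` and `det_leibniz` are our own, and indices run from 0 rather than 1):

```python
from itertools import permutations
from math import prod

def sign(perm):
    """Sign of a permutation of (0, ..., n-1), computed by counting inversions."""
    return (-1) ** sum(perm[i] > perm[j]
                       for i in range(len(perm))
                       for j in range(i + 1, len(perm)))

def det_leibniz(A):
    """det(A) straight from Definition 2.12: a sum over all n! permutations."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

# The 2x2 case reproduces the formula in Example 2.6.
assert det_leibniz([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
```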

Our aims in this subsection are to show that A is invertible if and only if \(\det A\neq 0\) for square matrices A of any size, and to investigate other ways of computing \(\det A\): summing over all elements \(\sigma \in S_n\) quickly becomes unmanageable as \(|S_n|=n!\) can be very large.


Proposition 2.7 Let \(A=(a_{ij})\) be a square matrix.

  • If A has a row or a column of zeroes, then \(\det A=0\).
  • \(\det A = \det A^T\).
  • If A has two equal columns or two equal rows, then \(\det A=0\).
  • \(\det I_n = 1\).

Proof.

  • Suppose the ith row of A is zero, so \(a_{ij}=0\) for all j. Every term in the sum defining \(\det A\) contains the factor \(a_{i,\sigma(i)}=0\), so every term is zero and \(\det A=0\). The argument for a column of zeroes is similar.
  • The i, j entry of \(A^T\) is \(a_{ji}\), so \[ \det A^T = \sum_{\sigma \in S_n} \sgn(\sigma) a_{\sigma(1),1}\cdots a_{\sigma(n),n}.\] Notice that \[ a_{\sigma(1),1}\cdots a_{\sigma(n),n} = a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)}\] as the factors on each side are the same, just written in a different order. Therefore \[\begin{align*} \det A^T& = \sum_{\sigma\in S_n} \sgn(\sigma) a_{\sigma(1),1}\cdots a_{\sigma(n),n} \\ &= \sum_{\sigma \in S_n} \sgn(\sigma) a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)} & \text{as above} \\ &= \sum_{\sigma \in S_n} \sgn(\sigma^{-1})a_{1,\sigma^{-1}(1)}\cdots a_{n,\sigma^{-1}(n)} & \text{as } \sgn(\sigma)=\sgn(\sigma^{-1}) \\ &= \sum_{\sigma \in S_n} \sgn(\sigma)a_{1,\sigma(1)}\cdots a_{n,\sigma(n)} \\ &= \det A \end{align*}\] where the second-to-last equality is because the permutations \(\sigma^{-1}\) for \(\sigma \in S_n\) are just the permutations in \(S_n\), written in a different order.
  • Suppose \(p<q\) and \(a_{pj}=a_{qj}\) for all j, so rows p and q of A are equal. Let \(\sigma \in S_n\) and consider the terms in the sum defining \(\det A\) corresponding to \(\sigma\) and \(\sigma (p,q)\). Since \(a_{p,\sigma(p)}=a_{q, \sigma(p)}= a_{q, \sigma (p,q) (q)}\) and \(a_{q,\sigma(q)}=a_{p,\sigma(q)}=a_{p,\sigma(p,q)(p)}\), the product of entries of A is exactly the same for \(\sigma (p,q)\) as it is for \(\sigma\), but \(\sgn \sigma = - \sgn (\sigma (p,q))\). It follows that the two terms add to zero. As \((p,q)^2=\id\), the map \(\sigma \mapsto \sigma(p,q)\) pairs off the elements of \(S_n\), so pairing up all terms in the sum this way we see that it is zero. If A has two equal columns, then \(\det A=\det A^T\) by the second part and \(A^T\) has two equal rows, so its determinant is zero by the argument above.
  • Let \(I_n = (\delta_{ij})\), so \(\delta_{ij}=1\) if \(i=j\) and 0 otherwise. For \[ \delta_{1,\sigma(1)}\cdots \delta_{n,\sigma(n)} \neq 0\] we must have \(\sigma(1)=1\), \(\sigma(2)=2\), and so on, so the only nonzero term in the sum defining \(\det I_n\) is the \(\sigma=\id\) term, which has the value \[\sgn(\id)\delta_{11}\cdots \delta_{nn}=1. \]
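
As a quick numerical spot check of the second part, using the `det_leibniz` sketch from above (the matrix is chosen arbitrarily):

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
At = [list(row) for row in zip(*A)]             # the transpose of A
assert det_leibniz(A) == det_leibniz(At) == -3  # det A^T = det A
```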

Proposition 2.8 Let \(A=(a_{ij})\) be a square matrix, \(\lambda\) a number, and r a row op. Then

  • if r is \(r_i \mapsto \lambda r_i\) then \(\det r(A) = \lambda \det A\),
  • if r is \(r_i \mapsto r_i + \lambda r_j\) then \(\det r(A)=\det A\), and
  • if r swaps rows i and j then \(\det r(A)=-\det A\).

The crucial point here is that doing row operations can change the determinant (in a predictable way), but it can never change the determinant from being zero to being nonzero, or from being nonzero to being zero.
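
Proposition 2.8 is also the basis of a practical way to compute determinants: row reduce, keeping track of what each operation does to the determinant. Here is a rough sketch under our own conventions (the function name is ours; it uses only swaps and \(r_i \mapsto r_i + \lambda r_j\), together with the fact, immediate from Definition 2.12 since only \(\sigma=\id\) contributes, that the determinant of an upper triangular matrix is the product of its diagonal entries):

```python
def det_by_row_ops(A):
    """det(A) via Gaussian elimination, tracking the effect of each
    row op as in Proposition 2.8.  Rows are lists of numbers."""
    A = [row[:] for row in A]   # work on a copy
    n, swaps = len(A), 1
    for col in range(n):
        # find a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return 0            # the echelon form has a zero on its diagonal, so det A = 0
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            swaps = -swaps      # a row swap negates the determinant
        for r in range(col + 1, n):
            m = A[r][col] / A[col][col]
            # r_r -> r_r - m * r_col leaves the determinant unchanged
            A[r] = [x - m * y for x, y in zip(A[r], A[col])]
    d = swaps                   # A is now upper triangular: det A is the
    for i in range(n):          # product of the diagonal entries, up to
        d *= A[i][i]            # the sign accumulated from the swaps
    return d

assert det_by_row_ops([[0, 2], [3, 4]]) == -6
```

The exact zero test `A[r][col] != 0` is fine for integer or rational entries but would need a tolerance for floating point.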


Proof.

  • Each term in the sum defining \(\det r(A)\) is the corresponding term in the sum defining \(\det A\) with the factor \(a_{i,\sigma(i)}\) replaced by \(\lambda a_{i,\sigma(i)}\). Taking the factor of \(\lambda\) outside the sum gives \(\det r(A) = \lambda \det A\).
  • Let \(r(A)=(b_{pq})\), so \(b_{pq}=a_{pq}\) if \(p \neq i\) and \(b_{iq}=a_{iq}+\lambda a_{jq}\). The term \[ b_{1,\sigma(1)} \cdots b_{n,\sigma(n)}\] from the sum defining \(\det r(A)\) then equals \[\begin{multline*} a_{1,\sigma(1)}\cdots a_{i-1, \sigma(i-1)} (a_{i, \sigma(i)}+\lambda a_{j, \sigma(i)}) a_{i+1,\sigma(i+1)}\cdots a_{n,\sigma(n)} =\\ a_{1,\sigma(1)}\cdots a_{n,\sigma(n)} + \lambda a_{1,\sigma(1)}\cdots a_{i-1,\sigma(i-1)} a_{j,\sigma(i)} a_{i+1,\sigma(i+1)}\cdots a_{n,\sigma(n)}.\end{multline*}\] The first term on the right of this equation is the term in the sum defining \(\det A\) coming from the permutation \(\sigma\). The second term is \(\lambda\) times the term coming from \(\sigma\) in the sum defining the determinant of the matrix B which is the same as A except that its ith row is a copy of the jth row of A. B has two equal rows, so \(\det B=0\) by Proposition 2.7. Summing over all \(\sigma\in S_n\) gives \[\det r(A)= \det A + \lambda \det B = \det A. \]
  • \(r(A)\) is the matrix whose p, q entry is \(a_{pq}\) if \(p\neq i,j\), \(a_{jq}\) if \(p=i\), and \(a_{iq}\) if \(p=j\). Assuming \(i<j\), the term in the sum defining \(\det r(A)\) corresponding to \(\sigma\in S_n\) is \[ \sgn(\sigma) a_{1, \sigma(1)} \cdots a_{i-1, \sigma(i-1)} a_{j,\sigma(i)} a_{i+1,\sigma(i+1)}\cdots a_{j-1, \sigma(j-1)} a_{i, \sigma(j)} a_{j+1,\sigma(j+1)} \cdots a_{n,\sigma(n)}.\] The product of entries of A appearing here is the same as the one in the term of the sum defining \(\det A\) corresponding to \(\sigma (i,j)\). But \(\sgn (\sigma) = - \sgn (\sigma (i,j))\), so each term in the sum for \(\det r(A)\) appears with the opposite sign in the sum for \(\det A\), and therefore \(\det r(A)=-\det A\).

Theorem 2.3 Let A be a square matrix. Then A is invertible if and only if \(\det A \neq 0\).

Proof. Proposition 2.8 shows that doing a row operation to A doesn’t change whether or not its determinant is zero, so applying it repeatedly shows that \(\det A \neq 0\) if and only if the determinant of the RRE form of A is nonzero.

If A is invertible its RRE form is \(I_n\), which has nonzero determinant, so A has nonzero determinant. If A is not invertible its RRE form has a row of zeroes, so its determinant is zero by Proposition 2.7, so \(\det A=0\).

The next lemma is a small step on the way to proving that \(\det(AB) = \det(A)\det(B)\) for square matrices A and B.

Lemma 2.5 Let A be an n ✕ n matrix and r a row op. Then \(\det (r(I_n) A)=\det r(I_n) \det A\).

Proof. The special case \(A=I_n\) of Proposition 2.8 says

  • \(\det r(I_n)=\lambda\det I_n = \lambda\) if \(r=r_i(\lambda)\)
  • \(\det r(I_n)=\det I_n=1\) if \(r=r_{ij}(\lambda)\), and
  • \(\det r(I_n)=-\det I_n=-1\) if \(r=r_{ij}\).

This means we could reformulate Proposition 2.8 as saying \(\det r(A) = \det(r(I_n))\det(A)\).

Now Proposition 2.3 says \(r(I_n)A = r(A)\), so \(\det(r(I_n)A) = \det(r(A)) = \det(r(I_n))\det(A)\).
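
The fact that multiplying by \(r(I_n)\) on the left performs the row op r is easy to spot check numerically. A tiny sketch, with our own naive matrix product helper, for the op r that swaps the first two rows:

```python
def mat_mul(A, B):
    """Naive matrix product of two lists-of-lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

E = [[0, 1, 0],                 # E = r(I_3), the elementary matrix
     [1, 0, 0],                 # for the swap of the first two rows
     [0, 0, 1]]
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
assert mat_mul(E, A) == [[4, 5, 6], [1, 2, 3], [7, 8, 9]]   # r(I_3)A = r(A)
```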


Theorem 2.4 Let A and B be n✕n matrices. Then \(\det (AB)=\det A \det B\).

Proof.

  • Case 1: A and B are both invertible. As in the proof of Theorem 2.2, we can write A and B as products of elementary matrices: \[\begin{align*} A &= r_1(I)\cdots r_l(I) \\ B &= s_1(I)\cdots s_m(I). \end{align*}\] Now by Lemma 2.5, for any two row ops r and s we have \(\det (r(I)s(I))=\det r(I) \det s(I)\). Using this repeatedly: \[\begin{align*} \det A &= \det r_1(I)\cdots \det r_l(I) \\ \det B &= \det s_1(I)\cdots \det s_m(I) \\ \det (AB)&= \det (r_1(I)\cdots r_l(I)s_1(I)\cdots s_m(I))\\ &= \det r_1(I) \cdots \det r_l(I) \det s_1(I) \cdots \det s_m(I) \\ &= \det A \det B. \end{align*}\]
  • Case 2: B isn’t invertible. By Corollary 2.2, there is a nonzero vector \(\mathbf{v}\) with \(B \mathbf{v}=\mathbf{0}\). Then \(AB \mathbf{v}=A\mathbf{0}=\mathbf{0}\), so \(AB\) isn’t invertible. Thus \(\det AB=0\) and \(\det B=0\) so certainly \(\det AB=\det A \det B\).
  • Case 3: B is invertible but A isn’t. By Corollary 2.2 again, there is a nonzero vector \(\mathbf{v}\) with \(A\mathbf{v}=\mathbf{0}\). Then \(B^{-1}\mathbf{v}\) is nonzero too, and \(AB (B^{-1}\mathbf{v})=\mathbf{0}\) so \(AB\) isn’t invertible and \(\det AB=\det A \det B\) as in case 2.
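
Theorem 2.4 is also easy to test numerically. A quick sanity check using `det_leibniz` and `mat_mul` from the sketches above (the matrices are arbitrary):

```python
A = [[2, 1], [5, 3]]
B = [[1, 4], [2, 9]]
# det(AB) = det(A) det(B); here both sides equal 1
assert det_leibniz(mat_mul(A, B)) == det_leibniz(A) * det_leibniz(B) == 1
```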

Corollary 2.3

  • If A is invertible then \(\det(A^{-1})=(\det A)^{-1}\).
  • If a square matrix A has a left or right inverse then it is invertible.

Proof.

  • Apply the previous result to \(AA^{-1}=I_n\).
  • Suppose A has a right inverse B, so \(AB=I_n\). Then \(\det(A)\det(B)=\det(I_n)=1\), so \(\det(A)\neq 0\) and A is invertible by Theorem 2.3. The case of a left inverse is similar.

Finally we look at computing determinants by row and column expansions.


Definition 2.13 The i, j minor of a square matrix A, written \(A_{ij}\), is the determinant of the matrix obtained from A by deleting row i and column j from A.

Definition 2.14 For \(1 \leq i \leq n\) the ith row expansion of an n✕n matrix \(A= (a_{ij})\) is \[\begin{equation*} r_i(A) =\sum_{r=1}^n (-1)^{i+r}a_{ir}A_{ir}. \end{equation*}\]

Definition 2.15 For \(1 \leq j \leq n\) the jth column expansion of an n✕n matrix \(A= (a_{ij})\) is \[\begin{equation*} c_j(A)= \sum_{r=1}^n (-1)^{r+j} a_{rj} A_{rj}. \end{equation*}\]

It turns out that each of the row and column expansions actually equals \(\det A\).


Example 2.7 Let’s look at the 2✕2 case to try and convince ourselves that these definitions really do all produce the determinant. Let \[A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \] The first row expansion is \[\begin{align*} \sum_{r=1}^2 (-1)^{1+r}a_{1r}A_{1r} &= (-1)^{1+1}a_{11} A_{11} + (-1)^{1+2}a_{12}A_{12}\\ &= a_{11}a_{22}-a_{12}a_{21} \end{align*}\] which agrees with our old definition of the determinant. Now the second column expansion: \[\begin{align*} \sum_{r=1}^2 (-1)^{r+2} a_{r2} A_{r2} &= (-1)^{1+2}a_{12} A_{12} + (-1)^{2+2}a_{22}A_{22} \\ &= -a_{12}a_{21} + a_{22}a_{11} \end{align*}\] which again agrees with our previous definition.
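
Row expansion translates directly into a recursive algorithm. A minimal sketch (the names are ours; indices run from 0 rather than 1, which leaves the sign \((-1)^{i+r}\) unchanged since only the parity of \(i+r\) matters):

```python
def minor(A, i, j):
    """The matrix obtained from A by deleting row i and column j."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det_expand(A, i=0):
    """det(A) by expansion along row i, as in Definition 2.14.
    Each minor A_{ir} is computed by the same expansion, recursively."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (i + r) * A[i][r] * det_expand(minor(A, i, r))
               for r in range(n))

assert det_expand([[1, 2], [3, 4]]) == -2
```

This is still exponential time overall, since it ends up computing all n! products, just organized differently; for actual computation the row-reduction approach above is far better.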

The general proof that row and column expansions really do equal the determinant is slightly messy. Here’s a special case:


Proposition 2.9 Let A be a square matrix. Then \(c_n(A)=\det A\).
Proof. \[\begin{align*} c_n(A)&= \sum_{r=1}^n (-1)^{n+r} a_{rn} A_{rn} \\ &= \sum_{r=1}^n (-1)^{n+r}a_{rn}\sum_{\sigma\in S_{n-1}} \sgn(\sigma) a_{1,\sigma(1)} \cdots a_{r-1,\sigma(r-1)}a_{r+1,\sigma(r)}\cdots a_{n,\sigma(n-1)} \\ &= \sum_{r=1}^n \sum_{\sigma\in S_{n-1}} (-1)^{n+r}\sgn(\sigma)a_{1,\sigma(1)}\cdots a_{r-1,\sigma(r-1)}a_{r,n}a_{r+1,\sigma(r)}\cdots a_{n,\sigma(n-1)} \end{align*}\] Regard each \(\sigma \in S_{n-1}\) as a permutation in \(S_n\) fixing n. The permutation \[ \begin{pmatrix} 1 & 2 & \cdots & r-1 & r & r+1 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(r-1) & n & \sigma(r) & \cdots & \sigma(n-1) \end{pmatrix}\] obtained by inserting an n into the bottom row of \(\sigma\) at position r equals \(\sigma (n,n-1,\ldots,r+1,r)\). The cycle has length \(n-r+1\), so the sign of \(\sigma (n,n-1,\ldots, r+1,r)\) is \(\sgn(\sigma) (-1)^{n-r}= \sgn(\sigma)(-1)^{n+r}\). Every permutation in \(S_n\) can be obtained in exactly one way by inserting n into the bottom row of a permutation from \(S_{n-1}\). So our sum is \[ \sum _{\tau \in S_n} \sgn(\tau) a_{1,\tau(1)}\cdots a_{n,\tau(n)} \] which is \(\det A\).