$\newcommand{\la}{\langle}$ $\newcommand{\ra}{\rangle}$ $\newcommand{\vu}{\mathbf{u}}$ $\newcommand{\vv}{\mathbf{v}}$ $\newcommand{\vw}{\mathbf{w}}$ $\newcommand{\vzz}{\mathbf{z}}$ $\newcommand{\nc}{\newcommand}$ $\nc{\Cc}{\mathbb{C}}$ $\nc{\Rr}{\mathbb{R}}$ $\nc{\Qq}{\mathbb{Q}}$ $\nc{\Nn}{\mathbb{N}}$ $\nc{\cB}{\mathcal{B}}$ $\nc{\cE}{\mathcal{E}}$ $\nc{\cC}{\mathcal{C}}$ $\nc{\cD}{\mathcal{D}}$ $\nc{\mi}{\mathbf{i}}$ $\nc{\span}[1]{\langle #1 \rangle}$ $\nc{\ol}[1]{\overline{#1} }$ $\nc{\norm}[1]{\left\| #1 \right\|}$ $\nc{\abs}[1]{\left| #1 \right|}$ $\nc{\vz}{\mathbf{0}}$ $\nc{\vo}{\mathbf{1}}$ $\nc{\DMO}{\DeclareMathOperator}$ $\DMO{\tr}{tr}$ $\DMO{\nullsp}{nullsp}$ $\nc{\va}{\mathbf{a}}$ $\nc{\vb}{\mathbf{b}}$ $\nc{\vx}{\mathbf{x}}$ $\nc{\ve}{\mathbf{e}}$ $\nc{\vd}{\mathbf{d}}$ $\nc{\vh}{\mathbf{h}}$ $\nc{\ds}{\displaystyle}$ $\nc{\bm}[1]{\begin{bmatrix} #1 \end{bmatrix}}$ $\nc{\gm}[2]{\bm{\mid & \cdots & \mid \\ #1 & \cdots & #2 \\ \mid & \cdots & \mid}}$ $\nc{\MN}{M_{m \times n}(K)}$ $\nc{\NM}{M_{n \times m}(K)}$ $\nc{\NP}{M_{n \times p}(K)}$ $\nc{\MP}{M_{m \times p}(K)}$ $\nc{\PN}{M_{p \times n}(K)}$ $\nc{\NN}{M_n(K)}$ $\nc{\im}{\mathrm{Im\ }}$ $\nc{\ev}{\mathrm{ev}}$ $\nc{\Hom}{\mathrm{Hom}}$ $\nc{\com}[1]{[\phantom{a}]^{#1}}$ $\nc{\rBD}[1]{ [#1]_{\cB}^{\cD}}$ $\DMO{\id}{id}$ $\DMO{\rk}{rk}$ $\DMO{\nullity}{nullity}$ $\DMO{\End}{End}$ $\DMO{\proj}{proj}$ $\nc{\GL}{\mathrm{GL}}$

Systems of Linear Equations

Linear Algebra originated from the study of systems of linear equations.

Here is an example of a system of 3 linear equations in 4 unknowns (or variables):

$$ \begin{align*} -x_1 -2x_2 +x_3 + x_4 &= -2 \\ x_1 +2x_2 -2x_3 &= 3 \\ 2x_1 + 4x_2 -x_3 -3x_4 &= 3 \end{align*} \tag{1} $$

In general, a system of $m$ linear equations in $n$ unknowns can be written as

$$ \begin{align*} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{align*} $$

Using matrix multiplication, it can be written as $A\vx = \vb$ where $A = [a_{ij}]$ and $\vb = \bm{b_1 \\ \vdots \\ b_m}$ are known, respectively, as the coefficient matrix and the constant vector of the system. The whole system can be captured succinctly by its augmented matrix $M = [A|\vb]$. The system is over $K$ if its augmented matrix is over $K$. For example, our first system is over $\Qq$.

Checkpoint. Write down the augmented matrix for System (1).
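After attempting it by hand, you can check your answer by machine. Here is a minimal sketch in Python using the sympy library (our tool of choice here, not something these notes prescribe):

```python
from sympy import Matrix

# Coefficient matrix A and constant vector b of System (1)
A = Matrix([[-1, -2,  1,  1],
            [ 1,  2, -2,  0],
            [ 2,  4, -1, -3]])
b = Matrix([-2, 3, 3])

# Augmented matrix M = [A | b]
M = A.row_join(b)
print(M)
```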

Let $A\vx = \vb$ be a system of $m$ linear equations in $n$ unknowns over a field $K$. Let $L$ be a field extending $K$ (e.g. think of $K$ as $\Qq$ and $L$ as $\Rr$).

A solution to the system in $L$ is a column vector $\vv \in L^n$ so that $A\vv = \vb$. A system is consistent if it has a solution in some field extending $K$.

Checkpoint. Check that $\bm{1 \\ 0 \\-1 \\0}$, $\bm{3\\0\\0\\1}$ and $\bm{-1\\1\\-1\\0}$ are solutions to the first system.
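A hedged sympy sketch of this verification, reusing the $A$ and $\vb$ from the previous block:

```python
from sympy import Matrix

A = Matrix([[-1, -2,  1,  1],
            [ 1,  2, -2,  0],
            [ 2,  4, -1, -3]])
b = Matrix([-2, 3, 3])

# Each candidate v should satisfy A v = b
candidates = [Matrix([1, 0, -1, 0]),
              Matrix([3, 0, 0, 1]),
              Matrix([-1, 1, -1, 0])]
print([A * v == b for v in candidates])  # [True, True, True]
```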

The most obvious inconsistent system (in $n$ unknowns) is:

$$ 0x_1 + 0x_2 + \cdots + 0x_n = 1. $$

However, it is not always easy to spot inconsistency.

Question. Is the following system consistent?

$$ \begin{align*} x_1 -2x_3 + x_4 &= 1 \\ 3x_1 +x_2 -4x_3+ 4x_4 &= 2 \\ x_2 + 2x_3 + x_4 &=0 \end{align*} $$

We will discuss how to decide consistency in the next lecture.
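If you want to experiment before then, a machine can already answer the Question above. A sketch using sympy's linsolve, which returns the empty set precisely when a system has no solution:

```python
from sympy import Matrix, linsolve, symbols

x1, x2, x3, x4 = symbols('x1 x2 x3 x4')
A = Matrix([[1, 0, -2, 1],
            [3, 1, -4, 4],
            [0, 1,  2, 1]])
b = Matrix([1, 2, 0])

# linsolve returns EmptySet exactly when A x = b is inconsistent
print(linsolve((A, b), x1, x2, x3, x4))
```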

A system (of $m$ equations in $n$ unknowns) is homogeneous if its constant vector is the zero vector, i.e. it is a system of the form $A\vx = \vz\ (\in K^m)$.

A homogeneous system must be consistent since $\vz \in K^n$ is clearly a solution; we call it the trivial solution.

It is a property of linear equations that a consistent system over $K$ must already have a solution over $K$.

(Compare this with the quadratic case: $x^2 + 1 = 0$ is a quadratic equation over $\Rr$ which does not have any real solutions.)

Structure of solution sets

As the previous checkpoint shows, a system of linear equations may have many solutions.

By solving a system we mean giving a description of its solution set.

This task is easy if the system's augmented matrix is in a special form, known as the reduced row echelon form (or rref for short).

Before we give the formal definition, let's see an example.

Example. Consider the following system of linear equations:

$$ \begin{align*} x_1 + 2x_3 - x_4 &= 3 \\ x_2 -2x_3 -x_4 &= -2 \end{align*} \tag{2} $$

Checkpoint. Write out the augmented matrix of this system.

A moment of inspection tells us that $x_1$ and $x_2$ are readily expressible in terms of $x_3$ and $x_4$: $$ \begin{align*} x_1 &= 3 - 2x_3 + x_4 \\ x_2 &= -2 + 2x_3 + x_4 \end{align*} $$ and $x_3,x_4 \in K$ can be chosen freely.

Let's rename the free variables $x_3$ to $s_1$ and $x_4$ to $s_2$ (we will explain why we do this later), and express the solutions as

$$ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 3 - 2s_1 + s_2 \\ -2 + 2s_1 + s_2 \\ s_1 \\ s_2 \end{bmatrix} = \bm{3\\-2\\0\\0} + s_1\bm{-2\\2\\1\\0} + s_2\bm{1\\1\\0\\1} $$

From this we get a parametrization of the solution set as:

$$ S = \begin{bmatrix} 3 \\ -2 \\ 0 \\ 0 \end{bmatrix}+ \left\{ s_1 \begin{bmatrix} -2 \\ 2 \\ 1 \\ 0 \end{bmatrix} +s_2 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} \colon s_1,s_2 \in K \right\} $$

It is of the form $\vv_0 + H$ where $\vv_0$ is a particular solution of the system and $H$ is a copy of, in this case, $K^2$ inside $K^4$, parametrized by

$$ \bm{s_1 \\s_2} \in K^2 \mapsto s_1 \begin{bmatrix} -2 \\ 2 \\ 1 \\ 0 \end{bmatrix} +s_2 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 1 \end{bmatrix} \in K^4 $$

We rename the free variables $x_3,x_4$ because we parametrize the vectors of $H$ by vectors $(s_1,s_2)$ from a copy of $K^2$, not by points of the $x_3x_4$-plane inside $K^4$.
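This bookkeeping can be reproduced by machine. A minimal sympy sketch (over $\Qq$): the augmented matrix of System (2) is already in rref, a particular solution comes from setting the free variables to $0$, and nullspace returns exactly the two spanning vectors of $H$ found above.

```python
from sympy import Matrix

# System (2)
A = Matrix([[1, 0,  2, -1],
            [0, 1, -2, -1]])
b = Matrix([3, -2])

print(A.row_join(b).rref())   # already in rref; pivots in columns 0 and 1

v0 = Matrix([3, -2, 0, 0])    # particular solution: free variables set to 0
print(A * v0 == b)            # True

print(A.nullspace())          # the spanning vectors (-2,2,1,0) and (1,1,0,1) of H
```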

It is not a coincidence that the solution set of System (2) takes the form described above. In fact,

Theorem. The solution set of a consistent system $A\vx = \vb$ has the form $\vv_0 + H$ where $\vv_0$ is a (any) solution of the system and $H$ is the solution set of its associated homogeneous system $A\vx = \vz$.

Proof. Suppose $\vv \in \vv_0 + H$. So, $\vv = \vv_0 + \vh$ for some $\vh \in H$. Thus, $$ A\vv = A(\vv_0 + \vh) = A\vv_0 + A\vh = \vb + \vz =\vb.$$ That means $\vv$ is a solution of $A\vx = \vb$.

Conversely, suppose $\vv$ is a solution of the system. Then $A(\vv - \vv_0) = A\vv -A\vv_0 = \vb - \vb = \vz$. Therefore, $\vv-\vv_0 \in H$ and so $\vv \in \vv_0 + H$.
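As a concrete sanity check of the theorem (an illustration, not a replacement for the proof), one can sample points of $\vv_0 + H$ for System (2) and confirm each solves the system:

```python
from sympy import Matrix

A = Matrix([[1, 0,  2, -1],
            [0, 1, -2, -1]])
b = Matrix([3, -2])

v0 = Matrix([3, -2, 0, 0])    # a particular solution
h1 = Matrix([-2, 2, 1, 0])    # spanning vectors of H
h2 = Matrix([1, 1, 0, 1])

# every sampled point v0 + s1*h1 + s2*h2 solves A x = b
assert all(A * (v0 + s1*h1 + s2*h2) == b
           for s1 in range(-3, 4) for s2 in range(-3, 4))
print("all sampled points of v0 + H are solutions")
```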

The set $H = \{\vv \in K^n \colon A\vv = \vz\}$ is called the nullspace of $A$ (aka the kernel of $A$).

The name nullspace suggests that $H$ has some extra structure (other than just being a set).

Checkpoint. We have mentioned that $H$ must be nonempty since $\vz \in H$. Check also that (see the sketch after this list):

  1. $\vh + \vh' \in H$ for any $\vh, \vh' \in H$ (so $H$ is closed under addition); and
  2. $c\vh \in H$ for any $\vh \in H$ and any scalar $c$ (i.e. $H$ is closed under scalar multiplication).
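Here is a numeric sketch of these two closure properties for the nullspace of System (2) (an illustration to accompany the checkpoint, not a proof):

```python
from sympy import Matrix, symbols

A = Matrix([[1, 0,  2, -1],
            [0, 1, -2, -1]])

h1, h2 = A.nullspace()   # two elements of H
c = symbols('c')

print(A * (h1 + h2))     # the zero vector: H is closed under addition
print(A * (c * h1))      # the zero vector for every scalar c
```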