Linear Algebra#

What are Matrices and Vectors?#

Linear algebra provides a way of compactly representing and operating on sets of linear equations. For example, consider the following system of equations:

\[\begin{split} \begin{aligned} 4x_{1} - 5x_{2} &= -13 \\ -2x_{1} + 3x_{2} &= 9 \end{aligned} \end{split}\]

In matrix and vector notation, we can write the system more compactly as:

\[A \mathbf{x} = \mathbf{b}\]

where:

\[\begin{split} A = \begin{bmatrix} 4 & -5 \\ -2 & 3 \end{bmatrix}, \quad \mathbf{x} = \begin{bmatrix} x_{1} \\ x_{2} \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} -13 \\ 9 \end{bmatrix} \end{split}\]

Here, \(A\) is a matrix, and \(\mathbf{x}\) and \(\mathbf{b}\) are vectors.
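As a quick numerical check, the system above can be solved directly. The following is a minimal sketch assuming NumPy is available; it is not part of the derivation itself:

```python
import numpy as np

# Coefficient matrix and right-hand side from the example above
A = np.array([[4.0, -5.0],
              [-2.0, 3.0]])
b = np.array([-13.0, 9.0])

# Solve A x = b for x
x = np.linalg.solve(A, b)
print(x)  # expected: [3. 5.], i.e. x1 = 3, x2 = 5
```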

Definition 1: Matrices#

A rectangular array of numbers arranged in rows and columns is called a matrix. By \(A \in \mathbb{R}^{m \times n}\), we denote a matrix with \(m\) rows and \(n\) columns. In the above example, \(A \in \mathbb{R}^{2 \times 2}\).

Definition 2: Vectors#

An ordered list of numbers arranged along a single dimension is called a vector. By \(\mathbf{x} \in \mathbb{R}^{n}\), we denote a vector with \(n\) entries. In the above example, \(\mathbf{b} \in \mathbb{R}^{2}\). By convention, an \(n\)-dimensional vector is often thought of as a matrix with \(n\) rows and 1 column (a column vector). If we want to explicitly represent a row vector—a matrix with 1 row and \(n\) columns—we typically write \(\mathbf{x}^{T}\) (pronounced “x transpose”).

Notation#

  • The \(i\)-th element of a vector \(\mathbf{x}\) is denoted \(x_{i}\):

\[\begin{split} \mathbf{x} = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{n} \end{bmatrix} \end{split}\]
  • The entry of matrix \(A\) in the \(i\)-th row and \(j\)-th column is denoted \(a_{ij}\):

\[\begin{split} A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \end{split}\]

Matrix and Vector Operations#

Addition, Subtraction, and Scalar Multiplication#

For Matrices:

  • Addition/Subtraction: Two matrices \(A\) and \(B\) can be added or subtracted if they are of the same size (\(A, B \in \mathbb{R}^{m \times n}\)). The operations are done element-wise:

\[\begin{split} A \pm B = \begin{bmatrix} a_{11} \pm b_{11} & a_{12} \pm b_{12} & \cdots & a_{1n} \pm b_{1n} \\ a_{21} \pm b_{21} & a_{22} \pm b_{22} & \cdots & a_{2n} \pm b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} \pm b_{m1} & a_{m2} \pm b_{m2} & \cdots & a_{mn} \pm b_{mn} \end{bmatrix} \end{split}\]
  • Scalar Multiplication: Each element of matrix \(A\) is multiplied by scalar \(k\):

\[\begin{split} kA = \begin{bmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{bmatrix} \end{split}\]

For Vectors:

  • Addition/Subtraction: Two vectors \(\mathbf{a}\) and \(\mathbf{b}\) can be added or subtracted if they are of the same size (\(\mathbf{a}, \mathbf{b} \in \mathbb{R}^{n}\)):

\[\begin{split} \mathbf{a} \pm \mathbf{b} = \begin{bmatrix} a_{1} \pm b_{1} \\ a_{2} \pm b_{2} \\ \vdots \\ a_{n} \pm b_{n} \end{bmatrix} \end{split}\]
  • Scalar Multiplication: Each element of vector \(\mathbf{a}\) is multiplied by scalar \(k\):

\[\begin{split} k\mathbf{a} = \begin{bmatrix} ka_{1} \\ ka_{2} \\ \vdots \\ ka_{n} \end{bmatrix} \end{split}\]
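A short NumPy sketch of the element-wise operations above (addition, subtraction, and scalar multiplication for both matrices and vectors); the array values are illustrative, not taken from the text:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
k = 2

print(A + B)   # element-wise matrix addition
print(A - B)   # element-wise matrix subtraction
print(k * A)   # scalar multiplication of a matrix
print(a + b)   # element-wise vector addition
print(k * a)   # scalar multiplication of a vector
```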

Matrix and Vector Multiplication#

Matrix-Vector Multiplication#

Given \(A \in \mathbb{R}^{m \times n}\) and \(\mathbf{x} \in \mathbb{R}^{n}\), their product \(\mathbf{y} = A\mathbf{x} \in \mathbb{R}^{m}\) can be expressed as:

\[\begin{split} \mathbf{y} = \begin{bmatrix} a_{1}^{T}\mathbf{x} \\ a_{2}^{T}\mathbf{x} \\ \vdots \\ a_{m}^{T}\mathbf{x} \end{bmatrix} \end{split}\]

Each element \(y_i\) of \(\mathbf{y}\) is the inner product of the \(i\)-th row of \(A\) and vector \(\mathbf{x}\).
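A minimal sketch of matrix-vector multiplication, computing each \(y_i\) as the inner product of the \(i\)-th row of \(A\) with \(\mathbf{x}\) and comparing against NumPy's built-in product (the example values reuse the matrix from the opening system):

```python
import numpy as np

A = np.array([[4, -5], [-2, 3]])
x = np.array([3, 5])

# y_i = (i-th row of A) . x
y_manual = np.array([A[i, :] @ x for i in range(A.shape[0])])
y_builtin = A @ x

print(y_manual, y_builtin)  # both give [-13  9]
```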

Matrix-Matrix Multiplication#

Given \(A \in \mathbb{R}^{m \times n}\) and \(B \in \mathbb{R}^{n \times p}\), their product \(C = AB \in \mathbb{R}^{m \times p}\) is defined by:

\[ C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj} \]

For example, if:

\[\begin{split} A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} \end{split}\]

\[\begin{split} AB = \begin{bmatrix} 1 \cdot 5 + 2 \cdot 7 & 1 \cdot 6 + 2 \cdot 8 \\ 3 \cdot 5 + 4 \cdot 7 & 3 \cdot 6 + 4 \cdot 8 \end{bmatrix} = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix} \end{split}\]
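The \(2 \times 2\) example above can be reproduced directly; a minimal NumPy sketch:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Entry (i, j) of C is the sum over k of A[i, k] * B[k, j]
C = A @ B
print(C)  # [[19 22]
          #  [43 50]]
```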

Vector-Vector Multiplication#

  • Inner Product (Dot Product):

For \(\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n}\):

\[ \mathbf{x}^{T}\mathbf{y} = \sum_{i=1}^{n} x_{i} y_{i} \]
  • Outer Product:

For \(\mathbf{x} \in \mathbb{R}^{m}\) and \(\mathbf{y} \in \mathbb{R}^{n}\):

\[\begin{split} \mathbf{x}\mathbf{y}^{T} = \begin{bmatrix} x_{1}y_{1} & x_{1}y_{2} & \cdots & x_{1}y_{n} \\ x_{2}y_{1} & x_{2}y_{2} & \cdots & x_{2}y_{n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m}y_{1} & x_{m}y_{2} & \cdots & x_{m}y_{n} \end{bmatrix} \end{split}\]
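A short sketch of the inner and outer products with small example vectors (values chosen only for illustration):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

inner = x @ y           # sum_i x_i * y_i  -> 32
outer = np.outer(x, y)  # 3x3 matrix with entries x_i * y_j
print(inner)
print(outer)
```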

Vector Geometry#

Representing Points as Vectors#

In geometry, points in space can be represented as vectors. A point \((x, y, z)\) in 3D space corresponds to the vector:

\[\begin{split} \mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} \end{split}\]

This representation allows for efficient manipulation of points using vector operations.

Magnitude of Vectors and Vector Norms#

The magnitude (or length) of a vector \(\mathbf{v} = [v_1, v_2, \dots, v_n]^T\) is given by:

\[ |\mathbf{v}| = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2} \]

More generally, a vector norm assigns a length to a vector; the magnitude above is exactly the Euclidean (L²) norm. Common norms include:

  • Euclidean norm (L² norm): \( \|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^n v_i^2} \)

  • Manhattan norm (L¹ norm): \( \|\mathbf{v}\|_1 = \sum_{i=1}^n |v_i| \)

  • Maximum norm (L∞ norm): \( \|\mathbf{v}\|_\infty = \max_{i} |v_i| \)
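The three norms listed above can be computed with NumPy's `norm` routine; a minimal sketch with an assumed example vector:

```python
import numpy as np

v = np.array([3.0, -4.0])

print(np.linalg.norm(v, 2))       # Euclidean (L2) norm -> 5.0
print(np.linalg.norm(v, 1))       # Manhattan (L1) norm -> 7.0
print(np.linalg.norm(v, np.inf))  # Maximum (L-infinity) norm -> 4.0
```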

Dot Product and Its Geometric Meaning#

The dot product of two vectors \(\mathbf{a}, \mathbf{b} \in \mathbb{R}^n\) is:

\[ \mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^n a_i b_i \]

Geometrically, it relates to the angle \(\theta\) between the vectors:

\[ \mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| |\mathbf{b}| \cos(\theta) \]
  • If \(\mathbf{a} \cdot \mathbf{b} = 0\), the vectors are orthogonal.

  • The dot product measures how much one vector extends in the direction of another.
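A small sketch recovering the angle \(\theta\) between two vectors from the dot-product identity above (the example vectors are assumed):

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.arccos(cos_theta)
print(np.degrees(theta))  # 45.0 degrees
```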

Orthogonality#

Two vectors are said to be orthogonal if their dot product equals zero. In geometric terms, orthogonal vectors are perpendicular to each other. Orthogonality is a critical concept in vector spaces, as it simplifies many operations, such as projections and decompositions.

Orthogonal vectors have important properties:

  • If \(\mathbf{a} \cdot \mathbf{b} = 0\), then \(\mathbf{a}\) and \(\mathbf{b}\) are perpendicular.

  • In an orthonormal basis, all vectors are orthogonal and of unit length.

Vector Projections#

The projection of vector \(\mathbf{a}\) onto vector \(\mathbf{b}\) is:

\[ \text{proj}_{\mathbf{b}}(\mathbf{a}) = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{b}|^2} \mathbf{b} \]

This represents the component of \(\mathbf{a}\) in the direction of \(\mathbf{b}\).
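A minimal sketch of the projection formula above, using assumed example vectors:

```python
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 0.0])

# Component of a in the direction of b
proj = (a @ b) / (b @ b) * b
print(proj)  # [3. 0.]
```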

Cross Product#

For vectors \(\mathbf{a}, \mathbf{b} \in \mathbb{R}^3\), the cross product is a vector perpendicular to both:

\[\begin{split} \mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix} \end{split}\]

The magnitude of the cross product equals the area of the parallelogram formed by \(\mathbf{a}\) and \(\mathbf{b}\):

\[ |\mathbf{a} \times \mathbf{b}| = |\mathbf{a}| |\mathbf{b}| \sin(\theta) \]
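A short sketch checking both the component formula and the parallelogram-area identity for the cross product (example vectors assumed):

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 2.0, 0.0])

c = np.cross(a, b)
print(c)                  # [0. 0. 2.] -- perpendicular to both a and b
print(np.linalg.norm(c))  # 2.0 = |a||b|sin(90 deg), the parallelogram area
```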

Defining Lines with Vectors#

A line in space can be defined using a point \(\mathbf{p}\) and a direction vector \(\mathbf{d}\):

\[ \mathbf{r}(t) = \mathbf{p} + t\mathbf{d} \]

Alternatively, in the form \(\mathbf{w}^T \mathbf{x} + b = 0\), a line (or hyperplane) in \(n\)-dimensional space is defined by:

  • \(\mathbf{w}\): A normal vector (the weight vector) perpendicular to the line (or hyperplane).

  • \(b\): A scalar (the bias) that shifts the line away from the origin. Note that \(b\) is not, in general, the same as the y-intercept.

A point \(\mathbf{x}\) lies on the line if \(\mathbf{w}^T \mathbf{x} + b = 0\).

Distance from a Point to a Line#

Given a point \(\mathbf{q}\) and a line defined by \(\mathbf{w}^T \mathbf{x} + b = 0\), the distance from \(\mathbf{q}\) to the line is:

\[ \text{Distance} = \frac{|\mathbf{w}^T \mathbf{q} + b|}{|\mathbf{w}|} \]

This measures the perpendicular distance from the point to the line.
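A minimal sketch of the point-to-line (hyperplane) distance formula, with an assumed line \(x + y - 1 = 0\) and query point \((1, 1)\):

```python
import numpy as np

w = np.array([1.0, 1.0])   # normal vector of the line x + y - 1 = 0
b = -1.0                   # bias term
q = np.array([1.0, 1.0])   # query point

distance = abs(w @ q + b) / np.linalg.norm(w)
print(distance)  # 1 / sqrt(2), approximately 0.7071
```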

Some Basic Matrix Properties#

Transpose#

The transpose of a matrix \(A \in \mathbb{R}^{m \times n}\) is denoted \(A^{T} \in \mathbb{R}^{n \times m}\), where each element is flipped over the diagonal:

\[ (A^{T})_{ij} = A_{ji} \]

Properties of Transposes:

  • \((A^{T})^{T} = A\)

  • \((AB)^{T} = B^{T}A^{T}\)

  • \((A + B)^{T} = A^{T} + B^{T}\)

Trace#

The trace of a square matrix \(A \in \mathbb{R}^{n \times n}\) is the sum of its diagonal elements:

\[ \text{tr}(A) = \sum_{i=1}^{n} A_{ii} \]

Properties of Trace:

  • \(\text{tr}(A) = \text{tr}(A^{T})\)

  • \(\text{tr}(A + B) = \text{tr}(A) + \text{tr}(B)\)

  • \(\text{tr}(tA) = t\text{tr}(A)\) for scalar \(t\)

  • \(\text{tr}(AB) = \text{tr}(BA)\)

Determinant#

The determinant of a square matrix \(A \in \mathbb{R}^{n \times n}\) is denoted \(|A|\) or \(\det(A)\).

Properties of Determinants:

  • \(|I| = 1\) for the identity matrix \(I\)

  • \(|A| = |A^{T}|\)

  • \(|AB| = |A||B|\)

  • \(|A| = 0\) if and only if \(A\) is singular (non-invertible)

  • \(|A^{-1}| = \frac{1}{|A|}\) for invertible \(A\)

Inverse#

The inverse of a square matrix \(A \in \mathbb{R}^{n \times n}\), denoted \(A^{-1}\), satisfies:

\[ A^{-1}A = I = AA^{-1} \]

Properties of Inverses:

  • \((A^{-1})^{-1} = A\)

  • \((AB)^{-1} = B^{-1}A^{-1}\), provided both \(A\) and \(B\) are invertible.

  • \((A^{-1})^{T} = (A^{T})^{-1}\)
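The transpose, trace, determinant, and inverse properties above are easy to spot-check numerically; a minimal sketch using assumed random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.allclose((A @ B).T, B.T @ A.T))                 # (AB)^T = B^T A^T
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))      # tr(AB) = tr(BA)
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))   # |AB| = |A||B|
print(np.allclose(np.linalg.inv(A @ B),
                  np.linalg.inv(B) @ np.linalg.inv(A)))  # (AB)^{-1} = B^{-1} A^{-1}
```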

Triangular Matrices#

A triangular matrix is a square matrix where all the entries above or below the main diagonal are zero.

  • Upper Triangular Matrix: All elements below the diagonal are zero.

  • Lower Triangular Matrix: All elements above the diagonal are zero.

Properties of Triangular Matrices:

  • The determinant of a triangular matrix is the product of its diagonal elements.

  • The eigenvalues of a triangular matrix are the entries on its diagonal.

Orthogonal Matrices#

A matrix \(Q \in \mathbb{R}^{n \times n}\) is orthogonal if:

\[ Q^{T}Q = QQ^{T} = I \]

Properties of Orthogonal Matrices:

  • The inverse of an orthogonal matrix is its transpose: \(Q^{-1} = Q^{T}\).

  • Orthogonal matrices preserve vector norms and angles.

  • The columns (and rows) of an orthogonal matrix form an orthonormal basis.
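A rotation matrix is a standard example of an orthogonal matrix; the sketch below (example values assumed) checks \(Q^{T}Q = I\) and norm preservation:

```python
import numpy as np

theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # 2D rotation matrix

v = np.array([3.0, 4.0])
print(np.allclose(Q.T @ Q, np.eye(2)))                       # Q^T Q = I
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))  # norm is preserved
```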

Rank of a Matrix#

The rank of a matrix \(A \in \mathbb{R}^{m \times n}\) is the dimension of its column space (which always equals the dimension of its row space), i.e., the maximum number of linearly independent rows or columns.

Properties of Rank:

  • \(\text{rank}(A) \leq \min(m, n)\).

  • A matrix is full rank if \(\text{rank}(A) = \min(m, n)\).

  • The rank of a matrix equals the number of non-zero singular values in its SVD.
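A short sketch computing the rank of a matrix with linearly dependent rows (assumed example):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],   # 2x the first row -> linearly dependent
              [0, 1, 1]])

print(np.linalg.matrix_rank(A))  # 2, less than min(m, n) = 3
```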

Eigenvectors and Eigenvalues#

For a square matrix \(A \in \mathbb{R}^{n \times n}\), a nonzero vector \(\mathbf{v}\) (an eigenvector) and a scalar \(\lambda\) (its eigenvalue) satisfy:

\[ A\mathbf{v} = \lambda \mathbf{v} \]

Solving for Eigenvalues#

Given matrix \(A\), the eigenvalues are found by solving the characteristic equation:

\[ |A - \lambda I| = 0 \]

Properties of Eigenvalues and Eigenvectors#

  1. If \(A\) is triangular, its eigenvalues are its diagonal elements.

  2. If \(A\) is invertible and \(\lambda\) is an eigenvalue of \(A\), then \(\frac{1}{\lambda}\) is an eigenvalue of \(A^{-1}\).

  3. The sum of the eigenvalues of \(A\) equals \(\text{tr}(A)\).

  4. The product of the eigenvalues of \(A\) equals \(\det(A)\).
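A minimal sketch computing eigenvalues and eigenvectors and checking the trace and determinant identities above (example matrix assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                                                     # e.g. [3. 1.]
print(np.allclose(A @ eigvecs[:, 0], eigvals[0] * eigvecs[:, 0]))  # A v = lambda v
print(np.isclose(eigvals.sum(), np.trace(A)))                      # sum of eigenvalues = tr(A)
print(np.isclose(eigvals.prod(), np.linalg.det(A)))                # product of eigenvalues = det(A)
```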

Eigendecomposition#

Eigendecomposition is the process of decomposing a square matrix \(A\) into its eigenvalues and eigenvectors. If \(A\) is diagonalizable, it can be written as:

\[ A = V \Lambda V^{-1} \]

where:

  • \(V\) is the matrix whose columns are the eigenvectors of \(A\).

  • \(\Lambda\) is a diagonal matrix whose diagonal entries are the corresponding eigenvalues of \(A\).

  • \(V^{-1}\) is the inverse of \(V\).

This decomposition is useful in simplifying matrix operations, solving differential equations, and understanding the behavior of linear transformations.
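A short sketch reconstructing \(A\) from its eigendecomposition \(V \Lambda V^{-1}\) (example matrix assumed to be diagonalizable):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, V = np.linalg.eig(A)
Lambda = np.diag(eigvals)

A_reconstructed = V @ Lambda @ np.linalg.inv(V)
print(np.allclose(A, A_reconstructed))  # True: A = V Lambda V^{-1}
```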

Singular Values and Singular Value Decomposition (SVD)#

Singular Value Decomposition (SVD) is a factorization of a real or complex matrix. Any \(m \times n\) matrix \(A\) can be decomposed into:

\[ A = U \Sigma V^T \]

where:

  • \(U \in \mathbb{R}^{m \times m}\) is an orthogonal matrix whose columns are the left singular vectors of \(A\).

  • \(\Sigma \in \mathbb{R}^{m \times n}\) is a rectangular diagonal matrix with non-negative real numbers on the diagonal, known as the singular values of \(A\).

  • \(V \in \mathbb{R}^{n \times n}\) is an orthogonal matrix whose columns are the right singular vectors of \(A\).

Properties of SVD:

  1. The singular values in \(\Sigma\) are the square roots of the eigenvalues of \(A^T A\).

  2. SVD provides the best low-rank approximation of a matrix, making it useful in data compression and noise reduction.

  3. The rank of \(A\) equals the number of non-zero singular values.
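A minimal sketch of SVD and of property 1 above (singular values versus the eigenvalues of \(A^{T}A\)), using an assumed rectangular matrix:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(s)  # singular values, in decreasing order

# Reconstruct A from U, Sigma, V^T (Sigma is m x n with s on its diagonal)
Sigma = np.zeros(A.shape)
Sigma[:A.shape[1], :A.shape[1]] = np.diag(s)
print(np.allclose(A, U @ Sigma @ Vt))

# Singular values are the square roots of the eigenvalues of A^T A
eigvals = np.linalg.eigvalsh(A.T @ A)
print(np.allclose(np.sort(s**2), np.sort(eigvals)))
```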