Linear Algebra#
What are Matrices and Vectors?#
Linear algebra provides a way of compactly representing and operating on sets of linear equations. For example, consider a system of two equations in two unknowns:

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 &= b_1 \\
a_{21}x_1 + a_{22}x_2 &= b_2
\end{aligned}
\]

In matrix and vector notation, we can write the system more compactly as:

\[
A\mathbf{x} = \mathbf{b}
\]

where:

\[
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \qquad
\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
\]
Here, \(A\) is a matrix and \(\mathbf{b}\) is a vector.
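As a minimal sketch of this idea in code (using NumPy, which is not part of the notes above; the coefficients are made up purely for illustration), such a system can be solved directly:

```python
import numpy as np

# Hypothetical 2x2 system A x = b; the numbers are illustrative only
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)      # solve A x = b without forming A^{-1}
print(x)                       # [1. 3.]
print(np.allclose(A @ x, b))   # True: the solution satisfies the system
```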
Definition 1: Matrices#
A rectangular array of numbers arranged in rows and columns is called a matrix. By \(A \in \mathbb{R}^{m \times n}\), we denote a matrix with \(m\) rows and \(n\) columns. In the above example, \(A \in \mathbb{R}^{2 \times 2}\).
Definition 2: Vectors#
An ordered list of numbers arranged along a single dimension is called a vector. By \(\mathbf{x} \in \mathbb{R}^{n}\), we denote a vector with \(n\) entries. In the above example, \(\mathbf{b} \in \mathbb{R}^{2}\). By convention, an \(n\)-dimensional vector is often thought of as a matrix with \(n\) rows and 1 column (a column vector). If we want to explicitly represent a row vector—a matrix with 1 row and \(n\) columns—we typically write \(\mathbf{x}^{T}\) (pronounced “x transpose”).
Notation#
The \(i\)-th element of a vector \(\mathbf{x}\) is denoted \(x_{i}\):

\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]
The entry of matrix \(A\) in the \(i\)-th row and \(j\)-th column is denoted \(a_{ij}\):

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]
Matrix and Vector Operations#
Addition, Subtraction, and Scalar Multiplication#
For Matrices:
Addition/Subtraction: Two matrices \(A\) and \(B\) can be added or subtracted if they are of the same size (\(A, B \in \mathbb{R}^{m \times n}\)). The operations are done element-wise:

\[
(A \pm B)_{ij} = a_{ij} \pm b_{ij}
\]
Scalar Multiplication: Each element of matrix \(A\) is multiplied by scalar \(k\):

\[
(kA)_{ij} = k\,a_{ij}
\]
For Vectors:
Addition/Subtraction: Two vectors \(\mathbf{a}\) and \(\mathbf{b}\) can be added or subtracted if they are of the same size (\(\mathbf{a}, \mathbf{b} \in \mathbb{R}^{n}\)):

\[
(\mathbf{a} \pm \mathbf{b})_i = a_i \pm b_i
\]
Scalar Multiplication: Each element of vector \(\mathbf{a}\) is multiplied by scalar \(k\):

\[
(k\mathbf{a})_i = k\,a_i
\]
Matrix and Vector Multiplication#
Matrix-Vector Multiplication#
Given \(A \in \mathbb{R}^{m \times n}\) and \(\mathbf{x} \in \mathbb{R}^{n}\), their product \(\mathbf{y} = A\mathbf{x} \in \mathbb{R}^{m}\) can be expressed as:

\[
y_i = \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, \dots, m
\]
Each element \(y_i\) of \(\mathbf{y}\) is the inner product of the \(i\)-th row of \(A\) and vector \(\mathbf{x}\).
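A quick NumPy sketch of this rule (NumPy and the sample values are assumptions for illustration):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # A in R^{2 x 3}
x = np.array([1.0, 0.0, -1.0])    # x in R^3

y = A @ x                         # y in R^2
# Each y_i is the inner product of the i-th row of A with x
y_rowwise = np.array([A[i, :] @ x for i in range(A.shape[0])])

print(y)                          # [-2. -2.]
print(np.allclose(y, y_rowwise))  # True
```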
Matrix-Matrix Multiplication#
Given \(A \in \mathbb{R}^{m \times n}\) and \(B \in \mathbb{R}^{n \times p}\), their product \(C = AB \in \mathbb{R}^{m \times p}\) is defined by:

\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]
For example, the entry \(c_{ij}\) is the inner product of the \(i\)-th row of \(A\) with the \(j\)-th column of \(B\); a small numerical sketch is shown below.
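A small NumPy sketch of the entry-by-entry rule (the matrices are illustrative stand-ins, not the original worked example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

C = A @ B                 # c_ij = sum_k a_ik * b_kj
# e.g. the top-left entry is 1*5 + 2*7 = 19
print(C)                  # [[19. 22.]
                          #  [43. 50.]]
```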
Vector-Vector Multiplication#
Inner Product (Dot Product):
For \(\mathbf{x}, \mathbf{y} \in \mathbb{R}^{n}\):

\[
\mathbf{x}^{T}\mathbf{y} = \sum_{i=1}^{n} x_i y_i \in \mathbb{R}
\]
Outer Product:
For \(\mathbf{x} \in \mathbb{R}^{m}\) and \(\mathbf{y} \in \mathbb{R}^{n}\):

\[
\mathbf{x}\mathbf{y}^{T} \in \mathbb{R}^{m \times n}, \qquad (\mathbf{x}\mathbf{y}^{T})_{ij} = x_i y_j
\]
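The two products in NumPy (a minimal sketch; the vectors are arbitrary examples):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

inner = np.dot(x, y)      # scalar: 1*4 + 2*5 + 3*6 = 32
outer = np.outer(x, y)    # 3x3 matrix with entries x_i * y_j

print(inner)              # 32.0
print(outer.shape)        # (3, 3)
print(outer[0, 2])        # 6.0 = x_0 * y_2
```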
Vector Geometry#
Representing Points as Vectors#
In geometry, points in space can be represented as vectors. A point \((x, y, z)\) in 3D space corresponds to the vector:

\[
\mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]
This representation allows for efficient manipulation of points using vector operations.
Magnitude of Vectors and Vector Norms#
The magnitude (or length) of a vector \(\mathbf{v} = [v_1, v_2, \dots, v_n]^T\) is given by:

\[
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}
\]
More generally, a vector norm assigns a length to a vector. Common norms include:
Euclidean norm (L² norm): \( \|\mathbf{v}\|_2 = \sqrt{\sum_{i=1}^n v_i^2} \)
Manhattan norm (L¹ norm): \( \|\mathbf{v}\|_1 = \sum_{i=1}^n |v_i| \)
Maximum norm (L∞ norm): \( \|\mathbf{v}\|_\infty = \max_{i} |v_i| \)
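The three norms computed with NumPy (a minimal sketch; `np.linalg.norm` and the sample vector are assumptions for illustration):

```python
import numpy as np

v = np.array([3.0, -4.0])

l2 = np.linalg.norm(v)                 # Euclidean norm: sqrt(9 + 16) = 5
l1 = np.linalg.norm(v, ord=1)          # Manhattan norm: |3| + |-4| = 7
linf = np.linalg.norm(v, ord=np.inf)   # Maximum norm: max(|3|, |-4|) = 4

print(l2, l1, linf)                    # 5.0 7.0 4.0
```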
Dot Product and Its Geometric Meaning#
The dot product of two vectors \(\mathbf{a}, \mathbf{b} \in \mathbb{R}^n\) is:

\[
\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^{T}\mathbf{b} = \sum_{i=1}^{n} a_i b_i
\]
Geometrically, it relates to the angle \(\theta\) between the vectors:

\[
\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|\,\|\mathbf{b}\| \cos\theta
\]
If \(\mathbf{a} \cdot \mathbf{b} = 0\), the vectors are orthogonal.
The dot product measures how much one vector extends in the direction of another.
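A short NumPy check of the angle formula and the orthogonality condition (illustrative vectors only):

```python
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.arccos(cos_theta)
print(np.degrees(theta))     # 45.0: the angle between a and b

c = np.array([0.0, 1.0])
print(np.dot(a, c))          # 0.0: a and c are orthogonal
```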
Orthogonality#
Two vectors are said to be orthogonal if their dot product equals zero. In geometric terms, orthogonal vectors are perpendicular to each other. Orthogonality is a critical concept in vector spaces, as it simplifies many operations, such as projections and decompositions.
Orthogonal vectors have important properties:
If \(\mathbf{a} \cdot \mathbf{b} = 0\), then \(\mathbf{a}\) and \(\mathbf{b}\) are perpendicular.
In an orthonormal basis, all vectors are orthogonal and of unit length.
Vector Projections#
The projection of vector \(\mathbf{a}\) onto vector \(\mathbf{b}\) is:

\[
\text{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}
\]
This represents the component of \(\mathbf{a}\) in the direction of \(\mathbf{b}\).
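A minimal NumPy sketch of the projection formula (vectors chosen only for illustration):

```python
import numpy as np

a = np.array([2.0, 3.0])
b = np.array([4.0, 0.0])

# Component of a in the direction of b
proj = (np.dot(a, b) / np.dot(b, b)) * b
print(proj)                     # [2. 0.]

# The residual a - proj is orthogonal to b
print(np.dot(a - proj, b))      # 0.0
```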
Cross Product#
For vectors \(\mathbf{a}, \mathbf{b} \in \mathbb{R}^3\), the cross product is a vector perpendicular to both:

\[
\mathbf{a} \times \mathbf{b} = \begin{bmatrix}
a_2 b_3 - a_3 b_2 \\
a_3 b_1 - a_1 b_3 \\
a_1 b_2 - a_2 b_1
\end{bmatrix}
\]
The magnitude of the cross product equals the area of the parallelogram formed by \(\mathbf{a}\) and \(\mathbf{b}\):

\[
\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\,\|\mathbf{b}\| \sin\theta
\]
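A quick NumPy illustration (the vectors are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 2.0, 0.0])

c = np.cross(a, b)
print(c)                            # [0. 0. 2.]: perpendicular to both a and b
print(np.dot(c, a), np.dot(c, b))   # 0.0 0.0
print(np.linalg.norm(c))            # 2.0: area of the parallelogram spanned by a and b
```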
Defining Lines with Vectors#
A line in space can be defined using a point \(\mathbf{p}\) and a direction vector \(\mathbf{d}\):

\[
\mathbf{r}(t) = \mathbf{p} + t\,\mathbf{d}, \qquad t \in \mathbb{R}
\]
Alternatively, in the form \(\mathbf{w}^T \mathbf{x} + b = 0\), a line (or hyperplane) in \(n\)-dimensional space is defined by:
\(\mathbf{w}\): A normal vector (weight) perpendicular to the line (or hyperplane).
\(b\): A scalar (bias) that shifts the line away from the origin. This is not always the same as the y-intercept.
A point \(\mathbf{x}\) lies on the line if \(\mathbf{w}^T \mathbf{x} + b = 0\).
Distance from a Point to a Line#
Given a point \(\mathbf{q}\) and a line defined by \(\mathbf{w}^T \mathbf{x} + b = 0\), the distance from \(\mathbf{q}\) to the line is:

\[
d = \frac{|\mathbf{w}^{T}\mathbf{q} + b|}{\|\mathbf{w}\|}
\]
This measures the perpendicular distance from the point to the line.
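A small NumPy sketch of the distance formula (the line coefficients and point are made up for illustration):

```python
import numpy as np

# Line defined by w^T x + b = 0
w = np.array([3.0, 4.0])
b = -5.0
q = np.array([2.0, 1.0])    # query point

distance = abs(np.dot(w, q) + b) / np.linalg.norm(w)
print(distance)             # 1.0: perpendicular distance from q to the line
```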
Some Basic Matrix Properties#
Transpose#
The transpose of a matrix \(A \in \mathbb{R}^{m \times n}\) is denoted \(A^{T} \in \mathbb{R}^{n \times m}\), where each element is flipped over the diagonal:

\[
(A^{T})_{ij} = a_{ji}
\]
Properties of Transposes:
\((A^{T})^{T} = A\)
\((AB)^{T} = B^{T}A^{T}\)
\((A + B)^{T} = A^{T} + B^{T}\)
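These identities are easy to spot-check numerically (a minimal NumPy sketch with arbitrary matrices):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [2.0, 3.0]])

print(np.allclose((A.T).T, A))             # True: (A^T)^T = A
print(np.allclose((A @ B).T, B.T @ A.T))   # True: (AB)^T = B^T A^T
print(np.allclose((A + B).T, A.T + B.T))   # True: (A + B)^T = A^T + B^T
```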
Trace#
The trace of a square matrix \(A \in \mathbb{R}^{n \times n}\) is the sum of its diagonal elements:

\[
\text{tr}(A) = \sum_{i=1}^{n} a_{ii}
\]
Properties of Trace:
\(\text{tr}(A) = \text{tr}(A^{T})\)
\(\text{tr}(A + B) = \text{tr}(A) + \text{tr}(B)\)
\(\text{tr}(tA) = t\text{tr}(A)\) for scalar \(t\)
\(\text{tr}(AB) = \text{tr}(BA)\)
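A quick numerical check of the trace properties (illustrative matrices, assuming NumPy):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

print(np.trace(A))                                    # 5.0 = 1 + 4
print(np.isclose(np.trace(A), np.trace(A.T)))         # True: tr(A) = tr(A^T)
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True: tr(AB) = tr(BA)
```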
Determinant#
The determinant of a square matrix \(A \in \mathbb{R}^{n \times n}\) is denoted \(|A|\) or \(\det(A)\).
Properties of Determinants:
\(|I| = 1\) for the identity matrix \(I\)
\(|A| = |A^{T}|\)
\(|AB| = |A||B|\)
\(|A| = 0\) if and only if \(A\) is singular (non-invertible)
\(|A^{-1}| = \frac{1}{|A|}\) for invertible \(A\)
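A short NumPy check of a few of these properties (arbitrary invertible matrices):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 4.0],
              [0.0, 2.0]])

print(np.linalg.det(A))                       # approximately 5.0 (= 2*3 - 1*1)
print(np.isclose(np.linalg.det(A @ B),
                 np.linalg.det(A) * np.linalg.det(B)))    # True: |AB| = |A||B|
print(np.isclose(np.linalg.det(np.linalg.inv(A)),
                 1 / np.linalg.det(A)))                   # True: |A^{-1}| = 1/|A|
```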
Inverse#
The inverse of a square matrix \(A \in \mathbb{R}^{n \times n}\), denoted \(A^{-1}\), satisfies:

\[
A^{-1}A = AA^{-1} = I
\]
Properties of Inverses:
\((A^{-1})^{-1} = A\)
\((AB)^{-1} = B^{-1}A^{-1}\). Note that both \(A\) and \(B\) must be invertible.
\((A^{-1})^{T} = (A^{T})^{-1}\)
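A minimal sketch with NumPy (the matrix is illustrative; in practice `np.linalg.solve` is usually preferred over forming the inverse explicitly):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))        # True: A A^{-1} = I
print(np.allclose(np.linalg.inv(A_inv), A))     # True: (A^{-1})^{-1} = A
print(np.allclose(np.linalg.inv(A.T), A_inv.T)) # True: (A^T)^{-1} = (A^{-1})^T
```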
Triangular Matrices#
A triangular matrix is a square matrix where all the entries above or below the main diagonal are zero.
Upper Triangular Matrix: All elements below the diagonal are zero.
Lower Triangular Matrix: All elements above the diagonal are zero.
Properties of Triangular Matrices:
The determinant of a triangular matrix is the product of its diagonal elements.
The eigenvalues of a triangular matrix are the entries on its diagonal.
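Both properties can be verified numerically (a minimal sketch with an arbitrary upper triangular matrix):

```python
import numpy as np

# Upper triangular: all entries below the diagonal are zero
U = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 5.0],
              [0.0, 0.0, 6.0]])

print(np.isclose(np.linalg.det(U), 2 * 4 * 6))   # True: determinant = product of the diagonal
print(np.sort(np.linalg.eigvals(U)))             # [2. 4. 6.]: eigenvalues = diagonal entries
```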
Orthogonal Matrices#
A matrix \(Q \in \mathbb{R}^{n \times n}\) is orthogonal if:

\[
Q^{T}Q = QQ^{T} = I
\]
Properties of Orthogonal Matrices:
The inverse of an orthogonal matrix is its transpose: \(Q^{-1} = Q^{T}\).
Orthogonal matrices preserve vector norms and angles.
The columns (and rows) of an orthogonal matrix form an orthonormal basis.
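A 2D rotation matrix is a standard example; a minimal NumPy check (the angle and test vector are illustrative):

```python
import numpy as np

theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation by 45 degrees

print(np.allclose(Q.T @ Q, np.eye(2)))            # True: Q^T Q = I, so Q^{-1} = Q^T

v = np.array([3.0, 4.0])
print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # True: norms are preserved
```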
Rank of a Matrix#
The rank of a matrix \(A \in \mathbb{R}^{m \times n}\) is the dimension of its column space, which always equals the dimension of its row space: it is the maximum number of linearly independent columns (or, equivalently, rows).
Properties of Rank:
\(\text{rank}(A) \leq \min(m, n)\).
A matrix is full rank if \(\text{rank}(A) = \min(m, n)\).
The rank of a matrix equals the number of non-zero singular values in its SVD.
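A quick NumPy illustration (the matrices are arbitrary examples):

```python
import numpy as np

# The second row is a multiple of the first, so only one row is independent
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

print(np.linalg.matrix_rank(A))   # 1
print(np.linalg.matrix_rank(B))   # 2: full rank
```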
Eigenvectors and Eigenvalues#
Eigenvectors \(\mathbf{v}\) and eigenvalues \(\lambda\) satisfy:

\[
A\mathbf{v} = \lambda \mathbf{v}, \qquad \mathbf{v} \neq \mathbf{0}
\]
Solving for Eigenvalues#
Given a matrix \(A\), the eigenvalues are the roots \(\lambda\) of the characteristic equation:

\[
\det(A - \lambda I) = 0
\]
Properties of Eigenvalues and Eigenvectors#
If \(A\) is triangular, its eigenvalues are its diagonal elements.
If \(\lambda\) is an eigenvalue of an invertible matrix \(A\), then \(\frac{1}{\lambda}\) is an eigenvalue of \(A^{-1}\).
The sum of the eigenvalues of \(A\) equals \(\text{tr}(A)\).
The product of the eigenvalues of \(A\) equals \(\det(A)\).
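These facts can be spot-checked with NumPy (using an illustrative lower triangular matrix):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                                        # eigenvalues 2 and 3 (the diagonal entries)
print(np.isclose(eigvals.sum(), np.trace(A)))         # True: sum of eigenvalues = trace
print(np.isclose(eigvals.prod(), np.linalg.det(A)))   # True: product of eigenvalues = determinant

# Each eigenpair satisfies A v = lambda v
v = eigvecs[:, 0]
print(np.allclose(A @ v, eigvals[0] * v))             # True
```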
Eigendecomposition#
Eigendecomposition is the process of decomposing a square matrix \(A\) into its eigenvalues and eigenvectors. If \(A\) is diagonalizable, it can be written as:

\[
A = V \Lambda V^{-1}
\]
where:
\(V\) is the matrix whose columns are the eigenvectors of \(A\).
\(\Lambda\) is a diagonal matrix whose diagonal entries are the corresponding eigenvalues of \(A\).
\(V^{-1}\) is the inverse of \(V\).
This decomposition is useful in simplifying matrix operations, solving differential equations, and understanding the behavior of linear transformations.
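A minimal NumPy sketch of building and using the decomposition (the matrix is an arbitrary diagonalizable example):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, V = np.linalg.eig(A)       # columns of V are eigenvectors of A
Lambda = np.diag(eigvals)           # diagonal matrix of eigenvalues

# Reconstruct A = V Lambda V^{-1}
print(np.allclose(V @ Lambda @ np.linalg.inv(V), A))      # True

# Matrix powers become cheap: A^3 = V Lambda^3 V^{-1}
A_cubed = V @ np.diag(eigvals**3) @ np.linalg.inv(V)
print(np.allclose(A_cubed, A @ A @ A))                    # True
```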
Singular Values and Singular Value Decomposition (SVD)#
Singular Value Decomposition (SVD) is a factorization of a real or complex matrix. Any \(m \times n\) matrix \(A\) can be decomposed into:

\[
A = U \Sigma V^{T}
\]
where:
\(U \in \mathbb{R}^{m \times m}\) is an orthogonal matrix whose columns are the left singular vectors of \(A\).
\(\Sigma \in \mathbb{R}^{m \times n}\) is a diagonal matrix with non-negative real numbers on the diagonal, known as the singular values of \(A\).
\(V \in \mathbb{R}^{n \times n}\) is an orthogonal matrix whose columns are the right singular vectors of \(A\).
Properties of SVD:
The singular values in \(\Sigma\) are the square roots of the eigenvalues of \(A^T A\).
SVD provides the best low-rank approximation of a matrix, making it useful in data compression and noise reduction.
The rank of \(A\) equals the number of non-zero singular values.
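A short NumPy sketch of these properties (the matrix is an arbitrary rank-2 example):

```python
import numpy as np

A = np.array([[ 3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A)     # s holds the singular values in descending order
print(s)                        # approximately [3.464 3.162]

# Singular values are the square roots of the eigenvalues of A^T A
eig_AtA = np.linalg.eigvalsh(A.T @ A)     # ascending; includes one (near-)zero eigenvalue
print(np.allclose(np.sort(s), np.sqrt(eig_AtA[-2:])))     # True

# Best rank-1 approximation: keep only the largest singular value
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0, :])
print(np.linalg.matrix_rank(A_rank1))     # 1
```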