Lab 02 – Analytic Geometry
Based on Chapter 3 of Mathematics for Machine Learning (Deisenroth, Faisal, Ong)
This lab explores the geometric concepts that arise when we equip a vector space with an inner product: norms, distances, angles, orthogonality, projections, and rotations. These ideas are foundational for machine learning algorithms such as PCA (Ch. 10), linear regression (Ch. 9), and SVMs (Ch. 12).
Topics covered (MML Sections 3.1–3.9):
Norms
Inner Products
Lengths and Distances
Angles and Orthogonality
Orthonormal Basis (Gram-Schmidt)
Orthogonal Complement
Inner Product of Functions
Orthogonal Projections
Rotations
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import FancyArrowPatch
from mpl_toolkits.mplot3d import Axes3D
np.set_printoptions(precision=4, suppress=True)
%matplotlib inline
plt.rcParams['figure.figsize'] = (7, 5)
plt.rcParams['font.size'] = 12
print('Setup complete.')
1. Norms (MML 3.1)
A norm on a vector space \(V\) is a function \(\|\cdot\| : V \to \mathbb{R}\) that assigns each vector \(\mathbf{x}\) a length \(\|\mathbf{x}\|\) satisfying:
Absolutely homogeneous: \(\|\lambda\mathbf{x}\| = |\lambda|\,\|\mathbf{x}\|\)
Triangle inequality: \(\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|\)
Positive definite: \(\|\mathbf{x}\| \geq 0\) and \(\|\mathbf{x}\| = 0 \iff \mathbf{x} = \mathbf{0}\)
Common norms for \(\mathbf{x} \in \mathbb{R}^n\):
| Norm | Formula |
|---|---|
| \(\ell_1\) (Manhattan) | \(\|\mathbf{x}\|_1 = \sum_{i=1}^n \lvert x_i \rvert\) |
| \(\ell_2\) (Euclidean) | \(\|\mathbf{x}\|_2 = \sqrt{\sum_{i=1}^n x_i^2}\) |
| \(\ell_\infty\) (Max) | \(\|\mathbf{x}\|_\infty = \max_i \lvert x_i \rvert\) |
# Compute L1, L2, Linf norms of a vector
x = np.array([3.0, -4.0])
l1_norm = np.linalg.norm(x, ord=1)
l2_norm = np.linalg.norm(x, ord=2)
linf_norm = np.linalg.norm(x, ord=np.inf)
print(f'x = {x}')
print(f'||x||_1 = {l1_norm:.4f} (sum of absolute values: |3| + |-4| = 7)')
print(f'||x||_2 = {l2_norm:.4f} (sqrt(9 + 16) = 5)')
print(f'||x||_inf = {linf_norm:.4f} (max(|3|, |-4|) = 4)')
# Verify triangle inequality for L2 norm
y = np.array([1.0, 2.0])
lhs = np.linalg.norm(x + y)
rhs = np.linalg.norm(x) + np.linalg.norm(y)
print(f'\nTriangle inequality check: ||x+y|| = {lhs:.4f} <= ||x|| + ||y|| = {rhs:.4f} => {lhs <= rhs + 1e-12}')
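The remaining axiom, absolute homogeneity, can be spot-checked the same way (a numerical sketch, not a proof):

```python
import numpy as np

# Absolute homogeneity: ||lambda * x|| = |lambda| * ||x||
x = np.array([3.0, -4.0])
for lam in [2.0, -0.5, 0.0]:
    lhs = np.linalg.norm(lam * x)           # ||lambda * x||
    rhs = np.abs(lam) * np.linalg.norm(x)   # |lambda| * ||x||
    print(f'lambda = {lam:+.1f}: ||lam*x|| = {lhs:.4f} == |lam|*||x|| = {rhs:.4f}')
```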
# Visualize unit balls in 2D for L1, L2, Linf norms (MML Figure 3.3)
theta = np.linspace(0, 2 * np.pi, 1000)
fig, axes = plt.subplots(1, 3, figsize=(14, 4.5))
# L1 unit ball: |x1| + |x2| = 1 (diamond), parametrized by angle
l1_x = np.cos(theta) / (np.abs(np.cos(theta)) + np.abs(np.sin(theta)))
l1_y = np.sin(theta) / (np.abs(np.cos(theta)) + np.abs(np.sin(theta)))
axes[0].plot(l1_x, l1_y, 'r-', linewidth=2)
axes[0].set_title(r'$\ell_1$ unit ball: $\|x\|_1 = 1$')
axes[0].set_aspect('equal')
axes[0].grid(True, alpha=0.3)
axes[0].axhline(0, color='k', lw=0.5)
axes[0].axvline(0, color='k', lw=0.5)
axes[0].set_xlim(-1.3, 1.3)
axes[0].set_ylim(-1.3, 1.3)
# L2 unit ball: circle
axes[1].plot(np.cos(theta), np.sin(theta), 'r-', linewidth=2)
axes[1].set_title(r'$\ell_2$ unit ball: $\|x\|_2 = 1$')
axes[1].set_aspect('equal')
axes[1].grid(True, alpha=0.3)
axes[1].axhline(0, color='k', lw=0.5)
axes[1].axvline(0, color='k', lw=0.5)
axes[1].set_xlim(-1.3, 1.3)
axes[1].set_ylim(-1.3, 1.3)
# Linf unit ball: max(|x1|, |x2|) = 1 (square)
linf_x = np.array([1, 1, -1, -1, 1])
linf_y = np.array([1, -1, -1, 1, 1])
axes[2].plot(linf_x, linf_y, 'r-', linewidth=2)
axes[2].set_title(r'$\ell_\infty$ unit ball: $\|x\|_\infty = 1$')
axes[2].set_aspect('equal')
axes[2].grid(True, alpha=0.3)
axes[2].axhline(0, color='k', lw=0.5)
axes[2].axvline(0, color='k', lw=0.5)
axes[2].set_xlim(-1.3, 1.3)
axes[2].set_ylim(-1.3, 1.3)
plt.tight_layout()
plt.suptitle('Unit Balls for Different Norms in $\\mathbb{R}^2$', y=1.03, fontsize=14)
plt.show()
2. Inner Products (MML 3.2)
An inner product on a vector space \(V\) is a positive definite, symmetric bilinear mapping \(\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}\).
The standard dot product in \(\mathbb{R}^n\) is:

$$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^\top \mathbf{y} = \sum_{i=1}^n x_i y_i$$
More generally, for a symmetric positive definite matrix \(\mathbf{A}\):

$$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^\top \mathbf{A} \mathbf{y}$$
Cauchy–Schwarz Inequality (Eq. 3.17):

$$|\langle \mathbf{x}, \mathbf{y} \rangle| \leq \|\mathbf{x}\| \, \|\mathbf{y}\|$$
# Standard dot product
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])
dot_manual = np.sum(x * y)
dot_numpy = np.dot(x, y)
print(f'x = {x}')
print(f'y = {y}')
print(f'Dot product (manual): {dot_manual}')
print(f'Dot product (np.dot): {dot_numpy}')
# General inner product with SPD matrix A (MML Example 3.3)
# <x, y> = x^T A y where A is symmetric positive definite
A = np.array([[2, 1],
[1, 3]])
# Check A is SPD: eigenvalues must be positive
eigvals = np.linalg.eigvalsh(A)
print(f'\nSPD matrix A:\n{A}')
print(f'Eigenvalues of A: {eigvals} (all positive => SPD)')
x2 = np.array([1.0, 1.0])
y2 = np.array([-1.0, 1.0])
ip_standard = x2 @ y2
ip_general = x2 @ A @ y2
print(f'\nx = {x2}, y = {y2}')
print(f'Standard dot product: <x,y> = {ip_standard}')
print(f'General inner product: <x,y>_A = x^T A y = {ip_general}')
# Cauchy-Schwarz inequality verification
np.random.seed(42)
num_tests = 5
print('Cauchy-Schwarz Inequality: |<x,y>| <= ||x|| * ||y||')
print('-' * 55)
for i in range(num_tests):
    x = np.random.randn(4)
    y = np.random.randn(4)
    lhs = np.abs(np.dot(x, y))
    rhs = np.linalg.norm(x) * np.linalg.norm(y)
    print(f'Test {i+1}: |<x,y>| = {lhs:.4f} <= ||x||*||y|| = {rhs:.4f} => {lhs <= rhs + 1e-12}')
# Equality holds when x = alpha * y (collinear)
x_col = np.array([1.0, 2.0, 3.0])
y_col = 3.0 * x_col
lhs_eq = np.abs(np.dot(x_col, y_col))
rhs_eq = np.linalg.norm(x_col) * np.linalg.norm(y_col)
print(f'\nCollinear case: |<x, 3x>| = {lhs_eq:.4f}, ||x||*||3x|| = {rhs_eq:.4f} => Equal: {np.isclose(lhs_eq, rhs_eq)}')
3. Lengths and Distances (MML 3.3)
An inner product induces a norm: \(\|\mathbf{x}\| := \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}\) (Eq. 3.16)
The distance (metric) between \(\mathbf{x}\) and \(\mathbf{y}\) is:

$$d(\mathbf{x}, \mathbf{y}) := \|\mathbf{x} - \mathbf{y}\| = \sqrt{\langle \mathbf{x} - \mathbf{y},\; \mathbf{x} - \mathbf{y} \rangle}$$
With the dot product, this gives the Euclidean distance. The \(\ell_1\) norm gives the Manhattan distance.
A metric satisfies: (1) positive definiteness, (2) symmetry, (3) triangle inequality.
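These three axioms can be spot-checked numerically for the Euclidean distance (a sketch on a few random points, not a proof; the points and seed here are arbitrary):

```python
import numpy as np

# Spot-check the metric axioms for d(x, y) = ||x - y||
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))  # three random points in R^3
d = lambda u, v: np.linalg.norm(u - v)

print(f'(1) positive definite: d(a,a) = {d(a, a):.4f}, d(a,b) = {d(a, b):.4f} > 0')
print(f'(2) symmetric:         d(a,b) = {d(a, b):.4f} == d(b,a) = {d(b, a):.4f}')
print(f'(3) triangle:          d(a,c) = {d(a, c):.4f} <= d(a,b) + d(b,c) = {d(a, b) + d(b, c):.4f}')
```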
# Euclidean and Manhattan distances between vectors
a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 0.0, -1.0])
d_euclidean = np.linalg.norm(a - b) # L2 distance
d_manhattan = np.linalg.norm(a - b, ord=1) # L1 distance
print(f'a = {a}')
print(f'b = {b}')
print(f'Euclidean distance: d_2(a, b) = {d_euclidean:.4f}')
print(f'Manhattan distance: d_1(a, b) = {d_manhattan:.4f}')
# Length using a different inner product (MML Example 3.5)
# <x, y> = x^T [[1, -0.5], [-0.5, 1]] y
x = np.array([1.0, 1.0])
A = np.array([[1, -0.5],
[-0.5, 1]])
norm_dot = np.sqrt(x @ x)
norm_A = np.sqrt(x @ A @ x)
print(f'\nx = {x}')
print(f'||x|| with dot product = {norm_dot:.4f}')
print(f'||x|| with inner prod A = {norm_A:.4f} (shorter: the negative off-diagonals of A shrink same-sign components)')
# Distance matrix: pairwise Euclidean distances between a set of points
points = np.array([[0, 0],
[1, 0],
[0, 1],
[1, 1],
[0.5, 0.5]])
n = len(points)
labels = ['O(0,0)', 'A(1,0)', 'B(0,1)', 'C(1,1)', 'M(0.5,0.5)']
dist_matrix = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        dist_matrix[i, j] = np.linalg.norm(points[i] - points[j])
print('Pairwise Euclidean Distance Matrix:')
print(f'{"":>12s}', end='')
for lbl in labels:
    print(f'{lbl:>12s}', end='')
print()
for i, lbl in enumerate(labels):
    print(f'{lbl:>12s}', end='')
    for j in range(n):
        print(f'{dist_matrix[i, j]:12.4f}', end='')
    print()
# Verify metric properties
print(f'\nSymmetric: d(A,B) = {dist_matrix[1,2]:.4f}, d(B,A) = {dist_matrix[2,1]:.4f}')
print(f'Triangle inequality: d(O,C) = {dist_matrix[0,3]:.4f} <= d(O,A) + d(A,C) = {dist_matrix[0,1] + dist_matrix[1,3]:.4f}')
4. Angles and Orthogonality (MML 3.4)
The angle \(\omega\) between two non-zero vectors is defined via:

$$\cos \omega = \frac{\langle \mathbf{x}, \mathbf{y} \rangle}{\|\mathbf{x}\|\,\|\mathbf{y}\|}$$
Two vectors are orthogonal (\(\mathbf{x} \perp \mathbf{y}\)) if \(\langle \mathbf{x}, \mathbf{y} \rangle = 0\).
If additionally \(\|\mathbf{x}\| = \|\mathbf{y}\| = 1\), they are orthonormal.
Cosine similarity is widely used in ML (e.g., word embeddings, recommendation systems):

$$\text{sim}(\mathbf{x}, \mathbf{y}) = \frac{\mathbf{x}^\top \mathbf{y}}{\|\mathbf{x}\|\,\|\mathbf{y}\|} \in [-1, 1]$$
# MML Example 3.6: angle between x=[1,1] and y=[1,2]
x = np.array([1.0, 1.0])
y = np.array([1.0, 2.0])
cos_omega = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
omega_rad = np.arccos(np.clip(cos_omega, -1, 1))
omega_deg = np.degrees(omega_rad)
print(f'x = {x}, y = {y}')
print(f'cos(omega) = {cos_omega:.4f}')
print(f'omega = {omega_rad:.4f} rad = {omega_deg:.2f} degrees')
print(f'(Book: arccos(3/sqrt(10)) ~ 0.32 rad ~ 18 degrees)')
# Orthogonality check
x_orth = np.array([1.0, 1.0])
y_orth = np.array([-1.0, 1.0])
dot_orth = np.dot(x_orth, y_orth)
print(f'\nOrthogonality: x = {x_orth}, y = {y_orth}')
print(f'<x, y> = {dot_orth} => Orthogonal: {np.isclose(dot_orth, 0)}')
# Cosine similarity examples
print('\n--- Cosine Similarity ---')
vecs = {
'parallel': (np.array([1, 2, 3]), np.array([2, 4, 6])),
'anti-parallel': (np.array([1, 2, 3]), np.array([-1, -2, -3])),
'orthogonal': (np.array([1, 0, 0]), np.array([0, 1, 0])),
'arbitrary': (np.array([1, 2, 3]), np.array([4, -1, 2])),
}
for name, (a, b) in vecs.items():
    sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    print(f'  {name:15s}: cos_sim = {sim:+.4f}')
# Visualize angle between vectors in 2D
fig, ax = plt.subplots(1, 1, figsize=(6, 6))
x = np.array([1.0, 1.0])
y = np.array([1.0, 2.0])
# Draw vectors as arrows from origin
ax.annotate('', xy=x, xytext=(0, 0),
arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('', xy=y, xytext=(0, 0),
arrowprops=dict(arrowstyle='->', color='red', lw=2))
ax.text(x[0] + 0.05, x[1] - 0.15, r'$\mathbf{x} = [1, 1]^\top$', fontsize=12, color='blue')
ax.text(y[0] + 0.05, y[1] + 0.05, r'$\mathbf{y} = [1, 2]^\top$', fontsize=12, color='red')
# Draw angle arc
angle_x = np.arctan2(x[1], x[0])
angle_y = np.arctan2(y[1], y[0])
arc_theta = np.linspace(angle_x, angle_y, 50)
r_arc = 0.4
ax.plot(r_arc * np.cos(arc_theta), r_arc * np.sin(arc_theta), 'g-', lw=1.5)
mid_angle = (angle_x + angle_y) / 2
ax.text(0.5 * np.cos(mid_angle), 0.5 * np.sin(mid_angle),
r'$\omega \approx 18.4^\circ$', fontsize=11, color='green')
ax.set_xlim(-0.3, 2.3)
ax.set_ylim(-0.3, 2.5)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.axhline(0, color='k', lw=0.5)
ax.axvline(0, color='k', lw=0.5)
ax.set_title('Angle Between Vectors (MML Example 3.6)')
ax.set_xlabel(r'$x_1$')
ax.set_ylabel(r'$x_2$')
plt.tight_layout()
plt.show()
5. Orthonormal Basis – Gram-Schmidt Process (MML 3.5, 3.8.3)
An orthonormal basis (ONB) \(\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\) satisfies:

$$\langle \mathbf{b}_i, \mathbf{b}_j \rangle = 0 \text{ for } i \neq j, \quad \langle \mathbf{b}_i, \mathbf{b}_i \rangle = 1$$
The Gram-Schmidt process (Eqs. 3.67–3.68) iteratively constructs an orthogonal basis \(\{\mathbf{u}_1, \ldots, \mathbf{u}_n\}\) from any basis \(\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}\):

$$\mathbf{u}_1 := \mathbf{b}_1, \qquad \mathbf{u}_k := \mathbf{b}_k - \sum_{j=1}^{k-1} \frac{\langle \mathbf{u}_j, \mathbf{b}_k \rangle}{\langle \mathbf{u}_j, \mathbf{u}_j \rangle}\,\mathbf{u}_j, \quad k = 2, \ldots, n$$

Then normalize: \(\hat{\mathbf{u}}_k = \mathbf{u}_k / \|\mathbf{u}_k\|\).
def gram_schmidt(V):
    """Gram-Schmidt orthonormalization.

    Parameters
    ----------
    V : ndarray, shape (n, k)
        Columns are the input basis vectors.

    Returns
    -------
    Q : ndarray, shape (n, k)
        Columns are the orthonormal basis vectors.
    """
    n, k = V.shape
    Q = np.zeros_like(V, dtype=float)
    for i in range(k):
        # Start with the i-th input vector
        u = V[:, i].astype(float).copy()
        # Subtract projections onto all previously computed basis vectors
        for j in range(i):
            u -= np.dot(Q[:, j], V[:, i]) * Q[:, j]
        # Normalize
        Q[:, i] = u / np.linalg.norm(u)
    return Q
# MML Example 3.12: b1 = [2, 0], b2 = [1, 1]
B = np.array([[2, 1],
[0, 1]], dtype=float)
Q = gram_schmidt(B)
print('Input basis vectors (columns):')
print(B)
print('\nOrthonormal basis (Gram-Schmidt):')
print(Q)
# Verify orthonormality: Q^T Q should be identity
print('\nQ^T Q (should be identity):')
print(Q.T @ Q)
# 3D example: MML Exercise 3.8
print('\n--- 3D Gram-Schmidt (Exercise 3.8) ---')
B3 = np.array([[1, -1],
[1, 2],
[1, 0]], dtype=float)
Q3 = gram_schmidt(B3)
print('Input basis (columns):')
print(B3)
print('\nONB (columns):')
print(Q3)
print('\nQ^T Q (should be 2x2 identity):')
print(Q3.T @ Q3)
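As a cross-check, `np.linalg.qr` performs the same orthonormalization via a numerically stable Householder algorithm; its Q can differ from the Gram-Schmidt output only by column signs, which the signs of `diag(R)` fix. A self-contained sketch on the 2D example above (whose Gram-Schmidt result is the identity):

```python
import numpy as np

# QR decomposition orthonormalizes the columns of B, like Gram-Schmidt
B = np.array([[2.0, 1.0],
              [0.0, 1.0]])
Q, R = np.linalg.qr(B)        # B = Q R, Q has orthonormal columns
signs = np.sign(np.diag(R))   # flip columns so that diag(R) > 0
Q_fixed = Q * signs

print('Q^T Q (should be identity):')
print(Q.T @ Q)
print('Sign-fixed Q (equals the Gram-Schmidt ONB for this B):')
print(Q_fixed)
```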
# Visualize Gram-Schmidt in 2D
b1 = np.array([2.0, 0.0])
b2 = np.array([1.0, 1.0])
u1 = b1.copy()
proj = (np.dot(u1, b2) / np.dot(u1, u1)) * u1 # projection of b2 onto u1
u2 = b2 - proj
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
# (a) Original basis
ax = axes[0]
ax.annotate('', xy=b1, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('', xy=b2, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='red', lw=2))
ax.text(b1[0] + 0.05, b1[1] + 0.1, r'$\mathbf{b}_1$', fontsize=14, color='blue')
ax.text(b2[0] + 0.05, b2[1] + 0.05, r'$\mathbf{b}_2$', fontsize=14, color='red')
ax.set_title('(a) Original basis')
ax.set_xlim(-0.5, 2.5)
ax.set_ylim(-0.5, 1.5)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
# (b) u1 and projection of b2 onto span[u1]
ax = axes[1]
ax.annotate('', xy=u1, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('', xy=b2, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='red', lw=2))
ax.annotate('', xy=proj, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='orange', lw=2, ls='--'))
ax.plot([b2[0], proj[0]], [b2[1], proj[1]], 'g--', lw=1)
ax.text(u1[0] + 0.05, u1[1] + 0.1, r'$\mathbf{u}_1$', fontsize=14, color='blue')
ax.text(proj[0], proj[1] - 0.15, r'$\pi(\mathbf{b}_2)$', fontsize=11, color='orange')
ax.set_title(r'(b) $\mathbf{u}_1$ and projection of $\mathbf{b}_2$')
ax.set_xlim(-0.5, 2.5)
ax.set_ylim(-0.5, 1.5)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
# (c) Orthogonal basis u1, u2
ax = axes[2]
ax.annotate('', xy=u1, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('', xy=u2, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='green', lw=2))
ax.text(u1[0] + 0.05, u1[1] + 0.1, r'$\mathbf{u}_1$', fontsize=14, color='blue')
ax.text(u2[0] + 0.05, u2[1] + 0.05, r'$\mathbf{u}_2$', fontsize=14, color='green')
ax.set_title(r'(c) Orthogonal basis $(\mathbf{u}_1, \mathbf{u}_2)$')
ax.set_xlim(-0.5, 2.5)
ax.set_ylim(-0.5, 1.5)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
plt.suptitle('Gram-Schmidt Orthogonalization (MML Example 3.12 / Figure 3.12)', y=1.02, fontsize=13)
plt.tight_layout()
plt.show()
print(f'u1 . u2 = {np.dot(u1, u2):.6f} (orthogonal: {np.isclose(np.dot(u1, u2), 0)})')
6. Orthogonal Complement (MML 3.6)
For a \(D\)-dimensional vector space \(V\) and an \(M\)-dimensional subspace \(U \subseteq V\), the orthogonal complement \(U^\perp\) is a \((D - M)\)-dimensional subspace containing all vectors orthogonal to every vector in \(U\):
Key property: \(V = U \oplus U^\perp\) (every vector can be uniquely decomposed).
For a plane \(U\) in \(\mathbb{R}^3\), \(U^\perp\) is the line spanned by the normal vector \(\mathbf{w}\) of the plane.
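For the two-vector case in \(\mathbb{R}^3\), the normal can also be read off directly from the cross product, which is orthogonal to both of its arguments:

```python
import numpy as np

u1 = np.array([1.0, 0.0, 1.0])
u2 = np.array([0.0, 1.0, 1.0])
w = np.cross(u1, u2)  # orthogonal to both u1 and u2, so it spans U^perp
print(f'u1 x u2 = {w}')  # [-1. -1.  1.]
print(f'u1 . w = {np.dot(u1, w)}, u2 . w = {np.dot(u2, w)}')  # both 0
```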
# Orthogonal complement in R^3: subspace U spanned by two vectors
# U^perp is the null space of U^T (the set of vectors orthogonal to all columns of U)
u1 = np.array([1, 0, 1], dtype=float)
u2 = np.array([0, 1, 1], dtype=float)
# The orthogonal complement is the null space of [u1, u2]^T
U_mat = np.column_stack([u1, u2]) # 3x2
print(f'Subspace U basis vectors (columns of U):')
print(U_mat)
# Find normal vector: solve u1^T w = 0 and u2^T w = 0
# U^T w = 0 => [[1,0,1],[0,1,1]] w = 0
# w1 + w3 = 0, w2 + w3 = 0 => w = t * [-1, -1, 1]
w = np.array([-1, -1, 1], dtype=float)
w_normalized = w / np.linalg.norm(w)
print(f'\nNormal vector (orthogonal complement basis): w = {w}')
print(f'Normalized: w_hat = {w_normalized}')
print(f'\nVerification:')
print(f' u1 . w = {np.dot(u1, w):.4f} (should be 0)')
print(f' u2 . w = {np.dot(u2, w):.4f} (should be 0)')
# Decompose an arbitrary vector into U and U^perp components
x = np.array([3.0, 2.0, 1.0])
# Project x onto U using projection formula
B = U_mat # 3x2
lam = np.linalg.solve(B.T @ B, B.T @ x)
x_U = B @ lam # component in U
x_perp = x - x_U # component in U^perp
print(f'\nDecomposition of x = {x}:')
print(f' x_U = {x_U} (projection onto U)')
print(f' x_perp = {x_perp} (component in U^perp)')
print(f' x_U + x_perp = {x_U + x_perp} (should equal x)')
print(f' u1 . x_perp = {np.dot(u1, x_perp):.6f} (should be 0)')
print(f' u2 . x_perp = {np.dot(u2, x_perp):.6f} (should be 0)')
# Visualize orthogonal complement in R^3
fig = plt.figure(figsize=(8, 7))
ax = fig.add_subplot(111, projection='3d')
# Draw the plane U (spanned by u1 and u2) as a mesh
s = np.linspace(-1.5, 1.5, 10)
t = np.linspace(-1.5, 1.5, 10)
S, T = np.meshgrid(s, t)
X_plane = S * u1[0] + T * u2[0]
Y_plane = S * u1[1] + T * u2[1]
Z_plane = S * u1[2] + T * u2[2]
ax.plot_surface(X_plane, Y_plane, Z_plane, alpha=0.2, color='cyan')
# Draw basis vectors of U
origin = np.zeros(3)
ax.quiver(*origin, *u1, color='blue', arrow_length_ratio=0.1, linewidth=2, label=r'$\mathbf{u}_1$')
ax.quiver(*origin, *u2, color='green', arrow_length_ratio=0.1, linewidth=2, label=r'$\mathbf{u}_2$')
# Draw normal vector (orthogonal complement)
ax.quiver(*origin, *w_normalized * 1.5, color='red', arrow_length_ratio=0.1, linewidth=2.5, label=r'$\mathbf{w}$ (normal)')
ax.set_xlabel('$e_1$')
ax.set_ylabel('$e_2$')
ax.set_zlabel('$e_3$')
ax.set_title(r'Orthogonal Complement: plane $U$ and its normal $\mathbf{w}$')
ax.legend(loc='upper left')
plt.tight_layout()
plt.show()
7. Inner Product of Functions (MML 3.7)
The inner product generalizes from finite-dimensional vectors to functions. For \(u, v : \mathbb{R} \to \mathbb{R}\):

$$\langle u, v \rangle := \int_a^b u(x)\,v(x)\,dx$$
Two functions are orthogonal if \(\langle u, v \rangle = 0\).
Example (MML 3.9): \(\sin\) and \(\cos\) are orthogonal on \([-\pi, \pi]\):

$$\int_{-\pi}^{\pi} \sin(x) \cos(x) \, dx = 0$$
The collection \(\{1, \cos(x), \cos(2x), \cos(3x), \ldots\}\) is orthogonal on \([-\pi, \pi]\), which is the basis of Fourier series.
# Inner product of functions via numerical integration (trapezoidal rule)
def func_inner_product(f, g, a, b, n_points=10000):
    """Compute <f, g> = integral_a^b f(x)*g(x) dx via trapezoidal rule."""
    x = np.linspace(a, b, n_points)
    return np.trapz(f(x) * g(x), x)
# Show sin and cos are orthogonal on [-pi, pi]
ip_sin_cos = func_inner_product(np.sin, np.cos, -np.pi, np.pi)
print(f'<sin, cos> on [-pi, pi] = {ip_sin_cos:.6f} (should be ~0)')
# Show cos(x) and cos(2x) are orthogonal on [-pi, pi]
cos2x = lambda x: np.cos(2 * x)
ip_cos_cos2 = func_inner_product(np.cos, cos2x, -np.pi, np.pi)
print(f'<cos(x), cos(2x)> on [-pi, pi] = {ip_cos_cos2:.6f} (should be ~0)')
# Show cos(x) is NOT orthogonal to itself
ip_cos_cos = func_inner_product(np.cos, np.cos, -np.pi, np.pi)
print(f'<cos(x), cos(x)> on [-pi, pi] = {ip_cos_cos:.4f} (should be pi = {np.pi:.4f})')
# Polynomial inner product: <p, q> = integral_0^1 p(x)*q(x) dx
# p(x) = 1, q(x) = x => <p,q> = integral_0^1 x dx = 1/2
p = lambda x: np.ones_like(x)
q = lambda x: x
r = lambda x: x**2 - 1/3 # chosen so that <p, r> = 0 on [0,1]
print(f'\nPolynomial inner products on [0, 1]:')
print(f' <1, x> = {func_inner_product(p, q, 0, 1):.4f} (expected: 0.5)')
print(f' <1, x^2 - 1/3> = {func_inner_product(p, r, 0, 1):.6f} (should be ~0: orthogonal)')
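To connect this to the Fourier remark above: projecting a function onto the cosine basis recovers its Fourier coefficients, since \(\langle \cos(kx), \cos(kx)\rangle = \pi\) on \([-\pi, \pi]\). A numerical sketch with a hand-built trapezoidal rule (kept self-contained to stay independent of NumPy version differences around `np.trapz`):

```python
import numpy as np

def trapezoid(y, x):
    """Basic trapezoidal rule."""
    return np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0

# f has known cosine coefficients: a_1 = 2, a_2 = 0, a_3 = 0.5
x = np.linspace(-np.pi, np.pi, 20001)
f = 2.0 * np.cos(x) + 0.5 * np.cos(3 * x)
for k in [1, 2, 3]:
    # a_k = <f, cos(kx)> / <cos(kx), cos(kx)>
    a_k = trapezoid(f * np.cos(k * x), x) / np.pi
    print(f'a_{k} = {a_k:.4f}')
```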
# Visualize sin(x)*cos(x) and its integral (MML Example 3.9 / Figure 3.8)
x = np.linspace(-np.pi, np.pi, 500)
fig, axes = plt.subplots(1, 2, figsize=(13, 5))
# Left: sin, cos, and their product
ax = axes[0]
ax.plot(x, np.sin(x), 'b-', label=r'$\sin(x)$', lw=1.5)
ax.plot(x, np.cos(x), 'r-', label=r'$\cos(x)$', lw=1.5)
ax.plot(x, np.sin(x) * np.cos(x), 'g-', label=r'$\sin(x)\cos(x)$', lw=2)
ax.fill_between(x, np.sin(x) * np.cos(x), alpha=0.15, color='green')
ax.axhline(0, color='k', lw=0.5)
ax.set_xlabel('$x$')
ax.set_title(r'$f(x) = \sin(x)\cos(x)$: positive and negative areas cancel')
ax.legend()
ax.grid(True, alpha=0.3)
# Right: orthogonality of {1, cos(x), cos(2x), cos(3x)}
ax = axes[1]
funcs = [
(lambda x: np.ones_like(x), '1'),
(np.cos, r'$\cos(x)$'),
(lambda x: np.cos(2*x), r'$\cos(2x)$'),
(lambda x: np.cos(3*x), r'$\cos(3x)$'),
]
names = ['1', 'cos(x)', 'cos(2x)', 'cos(3x)']
n_funcs = len(funcs)
gram = np.zeros((n_funcs, n_funcs))
for i in range(n_funcs):
    for j in range(n_funcs):
        gram[i, j] = func_inner_product(funcs[i][0], funcs[j][0], -np.pi, np.pi)
im = ax.imshow(gram, cmap='RdBu_r', vmin=-np.pi, vmax=2*np.pi)
ax.set_xticks(range(n_funcs))
ax.set_yticks(range(n_funcs))
ax.set_xticklabels(names)
ax.set_yticklabels(names)
for i in range(n_funcs):
    for j in range(n_funcs):
        ax.text(j, i, f'{gram[i,j]:.2f}', ha='center', va='center', fontsize=10)
ax.set_title(r'Gram matrix $\langle f_i, f_j \rangle$ on $[-\pi, \pi]$')
plt.colorbar(im, ax=ax, shrink=0.8)
plt.tight_layout()
plt.show()
8. Orthogonal Projections (MML 3.8)
8.1 Projection onto a Line (1D subspace)
For a line through the origin with basis vector \(\mathbf{b}\), the projection of \(\mathbf{x}\) is:

$$\pi_U(\mathbf{x}) = \frac{\mathbf{b}^\top \mathbf{x}}{\mathbf{b}^\top \mathbf{b}}\,\mathbf{b}$$

The projection matrix is: \(\mathbf{P}_\pi = \frac{\mathbf{b}\mathbf{b}^\top}{\mathbf{b}^\top \mathbf{b}}\)
8.2 Projection onto General Subspaces
For a subspace \(U\) with basis \(\mathbf{B} = [\mathbf{b}_1, \ldots, \mathbf{b}_m]\):

$$\pi_U(\mathbf{x}) = \mathbf{B}(\mathbf{B}^\top \mathbf{B})^{-1}\mathbf{B}^\top \mathbf{x}$$

This derives from the normal equation \(\mathbf{B}^\top \mathbf{B} \boldsymbol{\lambda} = \mathbf{B}^\top \mathbf{x}\) and is directly connected to least squares.
# 8.1 Projection onto a line (MML Example 3.10)
b = np.array([1.0, 2.0, 2.0])
x = np.array([1.0, 1.0, 1.0])
# Projection formula: pi(x) = (b^T x / b^T b) * b
lam = np.dot(b, x) / np.dot(b, b)
proj_x = lam * b
# Projection matrix: P = bb^T / b^T b
P_line = np.outer(b, b) / np.dot(b, b)
proj_x_mat = P_line @ x
print('=== Projection onto a Line (MML Example 3.10) ===')
print(f'b = {b}')
print(f'x = {x}')
print(f'lambda = b^T x / b^T b = {lam:.4f}')
print(f'pi(x) = lambda * b = {proj_x}')
print(f'pi(x) via P*x = {proj_x_mat}')
print(f'\nProjection matrix P = bb^T / b^T b:')
print(np.round(P_line, 4))
# Verify P^2 = P
print(f'\nP^2 == P: {np.allclose(P_line @ P_line, P_line)}')
# Verify orthogonality of residual
residual = x - proj_x
print(f'Residual (x - pi(x)) . b = {np.dot(residual, b):.6f} (should be 0)')
# 8.2 Projection onto a 2D subspace (MML Example 3.11)
B = np.array([[1, 0],
[1, 1],
[1, 2]], dtype=float)
x = np.array([6.0, 0.0, 0.0])
# Normal equation: B^T B lambda = B^T x
BtB = B.T @ B
Btx = B.T @ x
lam = np.linalg.solve(BtB, Btx)
# Projection
proj_x = B @ lam
# Projection matrix
P_sub = B @ np.linalg.inv(BtB) @ B.T
print('=== Projection onto a 2D Subspace (MML Example 3.11) ===')
print(f'B =\n{B}')
print(f'x = {x}')
print(f'\nB^T B =\n{BtB}')
print(f'B^T x = {Btx}')
print(f'lambda = {lam}')
print(f'pi_U(x) = B*lambda = {proj_x}')
print(f'\nProjection matrix P:')
print(np.round(P_sub, 4))
# Verify
residual = x - proj_x
print(f'\nResidual: x - pi_U(x) = {residual}')
print(f'||residual|| = {np.linalg.norm(residual):.4f} (book: sqrt(6) = {np.sqrt(6):.4f})')
print(f'Residual . b1 = {np.dot(residual, B[:, 0]):.6f} (should be 0)')
print(f'Residual . b2 = {np.dot(residual, B[:, 1]):.6f} (should be 0)')
print(f'P^2 == P: {np.allclose(P_sub @ P_sub, P_sub)}')
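The same projection drops out of NumPy's least-squares solver, which minimizes \(\|\mathbf{B}\boldsymbol{\lambda} - \mathbf{x}\|\) and hence solves the same normal equation; a quick cross-check on Example 3.11:

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
x = np.array([6.0, 0.0, 0.0])
# lstsq returns the lambda minimizing ||B lambda - x||, i.e. the projection coordinates
lam, *_ = np.linalg.lstsq(B, x, rcond=None)
proj = B @ lam
print(f'lambda  = {lam}')   # [ 5. -3.]
print(f'pi_U(x) = {proj}')  # [ 5.  2. -1.]
```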
# Visualize projection onto a line in 2D
b = np.array([2.0, 1.0])
x = np.array([1.0, 2.0])
proj = (np.dot(b, x) / np.dot(b, b)) * b
fig, axes = plt.subplots(1, 2, figsize=(13, 5.5))
# Left: projection onto a line
ax = axes[0]
# Draw the line (extended)
t_line = np.linspace(-0.5, 1.5, 100)
ax.plot(t_line * b[0], t_line * b[1], 'k--', alpha=0.3, label='Subspace $U$')
# Vectors
ax.annotate('', xy=b, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='blue', lw=2))
ax.annotate('', xy=x, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='red', lw=2))
ax.annotate('', xy=proj, xytext=(0, 0), arrowprops=dict(arrowstyle='->', color='orange', lw=2))
# Residual
ax.annotate('', xy=x, xytext=proj, arrowprops=dict(arrowstyle='->', color='green', lw=1.5, ls='--'))
# Right angle marker
sq_size = 0.12
b_unit = b / np.linalg.norm(b)
perp = np.array([-b_unit[1], b_unit[0]])
sq = proj + sq_size * perp
ax.plot([proj[0], sq[0]], [proj[1], sq[1]], 'k-', lw=0.8)
ax.plot([sq[0], sq[0] + sq_size * b_unit[0]], [sq[1], sq[1] + sq_size * b_unit[1]], 'k-', lw=0.8)
ax.text(b[0] + 0.05, b[1] + 0.1, r'$\mathbf{b}$', fontsize=13, color='blue')
ax.text(x[0] + 0.05, x[1] + 0.05, r'$\mathbf{x}$', fontsize=13, color='red')
ax.text(proj[0] - 0.15, proj[1] - 0.25, r'$\pi_U(\mathbf{x})$', fontsize=12, color='orange')
ax.set_xlim(-0.5, 2.5)
ax.set_ylim(-0.5, 2.5)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.set_title('Projection onto a Line')
# Right: least-squares connection
# Fit y = mx to data using projection
np.random.seed(7)
n_pts = 20
x_data = np.random.uniform(-3, 3, n_pts)
y_data = 0.7 * x_data + np.random.randn(n_pts) * 0.5
# Least squares: project y onto column space of X
X = x_data.reshape(-1, 1) # n x 1
# P = X (X^T X)^{-1} X^T
m = np.linalg.solve(X.T @ X, X.T @ y_data) # scalar slope
y_hat = X @ m
ax = axes[1]
ax.scatter(x_data, y_data, color='steelblue', s=40, zorder=3, label='Data')
sort_idx = np.argsort(x_data)
ax.plot(x_data[sort_idx], y_hat[sort_idx], 'r-', lw=2, label=f'Least squares: y = {m[0]:.3f}x')
# Draw residuals
for xi, yi, yhi in zip(x_data, y_data, y_hat.ravel()):
    ax.plot([xi, xi], [yi, yhi], 'g--', alpha=0.4, lw=0.8)
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Projections and Least Squares')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
9. Rotations (MML 3.9)
2D Rotations
A rotation by angle \(\theta\) (counterclockwise) in \(\mathbb{R}^2\) is given by the matrix (Eq. 3.76):

$$\mathbf{R}(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

3D Rotations
In \(\mathbb{R}^3\), rotations about the three coordinate axes are (Eqs. 3.77–3.79):

$$\mathbf{R}_1(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{bmatrix}, \quad
\mathbf{R}_2(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{bmatrix}, \quad
\mathbf{R}_3(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Properties: Rotation matrices are orthogonal (\(\mathbf{R}^\top = \mathbf{R}^{-1}\), \(\det(\mathbf{R}) = 1\)). They preserve lengths and angles.
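Length preservation is verified in the code below; the companion claim that rotations also preserve angles can be sketched the same way (self-contained, with its own copy of the Eq. 3.76 matrix):

```python
import numpy as np

def rot2d(theta):
    """2D rotation matrix (Eq. 3.76)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def angle(a, b):
    """Angle between two non-zero vectors via the dot product."""
    return np.arccos(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

x = np.array([2.0, 3.0])
y = np.array([0.0, -1.0])
R = rot2d(np.radians(30))
print(f'angle(x, y)   = {np.degrees(angle(x, y)):.4f} deg')
print(f'angle(Rx, Ry) = {np.degrees(angle(R @ x, R @ y)):.4f} deg')  # unchanged
```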
# 2D rotation matrix
def rotation_2d(theta):
    """2D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

# 3D rotation matrices about each axis
def rotation_3d_x(theta):
    """Rotation about e1 (x-axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1, 0,  0],
                     [0, c, -s],
                     [0, s,  c]])

def rotation_3d_y(theta):
    """Rotation about e2 (y-axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[ c, 0, s],
                     [ 0, 1, 0],
                     [-s, 0, c]])

def rotation_3d_z(theta):
    """Rotation about e3 (z-axis)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]])
# MML Exercise 3.10: Rotate x1=[2,3], x2=[0,-1] by 30 degrees
theta = np.radians(30)
R = rotation_2d(theta)
x1 = np.array([2.0, 3.0])
x2 = np.array([0.0, -1.0])
x1_rot = R @ x1
x2_rot = R @ x2
print('2D Rotation by 30 degrees:')
print(f'R(30) =\n{np.round(R, 4)}')
print(f'\nx1 = {x1} => R*x1 = {np.round(x1_rot, 4)}')
print(f'x2 = {x2} => R*x2 = {np.round(x2_rot, 4)}')
# Verify properties
print(f'\nR^T R = I: {np.allclose(R.T @ R, np.eye(2))}')
print(f'det(R) = {np.linalg.det(R):.4f} (should be 1)')
print(f'||x1|| preserved: {np.linalg.norm(x1):.4f} -> {np.linalg.norm(x1_rot):.4f}')
# 3D rotation example
theta_3d = np.radians(45)
R3 = rotation_3d_z(theta_3d)
v = np.array([1.0, 0.0, 0.0])
v_rot = R3 @ v
print(f'\n3D Rotation about z-axis by 45 degrees:')
print(f'R3(45) =\n{np.round(R3, 4)}')
print(f'[1,0,0] -> {np.round(v_rot, 4)}')
print(f'det(R3) = {np.linalg.det(R3):.4f}')
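In 2D, rotations also compose additively, \(\mathbf{R}(\alpha)\mathbf{R}(\beta) = \mathbf{R}(\alpha + \beta)\), so they commute — a useful contrast with the 3D non-commutativity demonstrated at the end of this section. A self-contained check:

```python
import numpy as np

def rot2d(theta):
    """2D rotation matrix (Eq. 3.76)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

alpha, beta = np.radians(30), np.radians(45)
lhs = rot2d(alpha) @ rot2d(beta)
print(f'R(30)R(45) == R(75):      {np.allclose(lhs, rot2d(alpha + beta))}')
print(f'R(30)R(45) == R(45)R(30): {np.allclose(lhs, rot2d(beta) @ rot2d(alpha))}')
```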
# Visualize 2D rotation: rotating a triangle
fig, axes = plt.subplots(1, 2, figsize=(13, 6))
# Triangle vertices
triangle = np.array([[0, 0],
[2, 0],
[1, 1.5],
[0, 0]]).T # 2 x 4
# Left: multiple rotation angles
ax = axes[0]
angles_deg = [0, 30, 60, 90, 120, 150]
colors = plt.cm.viridis(np.linspace(0, 0.9, len(angles_deg)))
for angle, color in zip(angles_deg, colors):
    R = rotation_2d(np.radians(angle))
    rotated = R @ triangle
    ax.plot(rotated[0], rotated[1], '-', color=color, lw=1.5, label=f'{angle}$^\\circ$')
    ax.fill(rotated[0], rotated[1], color=color, alpha=0.1)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.axhline(0, color='k', lw=0.5)
ax.axvline(0, color='k', lw=0.5)
ax.legend(title='Angle', loc='lower left', fontsize=9)
ax.set_title('2D Rotation: Triangle at Various Angles')
ax.set_xlim(-3, 3)
ax.set_ylim(-2, 3)
# Right: rotating a unit circle point continuously
ax = axes[1]
thetas = np.linspace(0, 2*np.pi, 100)
point = np.array([1.0, 0.0])
trajectory = np.array([rotation_2d(t) @ point for t in thetas])
ax.plot(trajectory[:, 0], trajectory[:, 1], 'b-', lw=1.5, alpha=0.5, label='Trajectory')
# Mark a few positions
for deg in [0, 45, 90, 135, 180, 270]:
    rad = np.radians(deg)
    p = rotation_2d(rad) @ point
    ax.plot(p[0], p[1], 'ro', markersize=6)
    ax.annotate(f'{deg}$^\\circ$', xy=p, xytext=(p[0]+0.1, p[1]+0.1), fontsize=9)
ax.set_aspect('equal')
ax.grid(True, alpha=0.3)
ax.axhline(0, color='k', lw=0.5)
ax.axvline(0, color='k', lw=0.5)
ax.set_title('Rotation of [1, 0] Through 360$^\\circ$')
ax.set_xlim(-1.5, 1.5)
ax.set_ylim(-1.5, 1.5)
plt.tight_layout()
plt.show()
# 3D rotation visualization: rotate a set of points about the z-axis
fig = plt.figure(figsize=(9, 7))
ax = fig.add_subplot(111, projection='3d')
# Create a small house-like shape in the x-z plane
house_pts = np.array([
[0, 0, 0],
[1, 0, 0],
[1, 0, 1],
[0.5, 0, 1.5],
[0, 0, 1],
[0, 0, 0],
]).T # 3 x 6
# Plot original and rotated versions
angles_3d = [0, 45, 90, 135]
colors_3d = ['blue', 'red', 'green', 'purple']
for angle, color in zip(angles_3d, colors_3d):
    R = rotation_3d_z(np.radians(angle))
    rotated = R @ house_pts
    ax.plot(rotated[0], rotated[1], rotated[2], '-o', color=color,
            markersize=3, lw=1.5, label=f'{angle}$^\\circ$ about $e_3$')
# Draw coordinate axes
ax_len = 1.8
ax.quiver(0, 0, 0, ax_len, 0, 0, color='k', arrow_length_ratio=0.05, lw=1)
ax.quiver(0, 0, 0, 0, ax_len, 0, color='k', arrow_length_ratio=0.05, lw=1)
ax.quiver(0, 0, 0, 0, 0, ax_len, color='k', arrow_length_ratio=0.05, lw=1)
ax.text(ax_len + 0.1, 0, 0, '$e_1$', fontsize=11)
ax.text(0, ax_len + 0.1, 0, '$e_2$', fontsize=11)
ax.text(0, 0, ax_len + 0.1, '$e_3$', fontsize=11)
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_zlabel('$z$')
ax.set_title('3D Rotation About the $e_3$-Axis')
ax.legend(loc='upper left', fontsize=9)
plt.tight_layout()
plt.show()
# Demonstrate non-commutativity of 3D rotations
theta_a, theta_b = np.radians(90), np.radians(90)
Rx_then_Rz = rotation_3d_z(theta_b) @ rotation_3d_x(theta_a)
Rz_then_Rx = rotation_3d_x(theta_a) @ rotation_3d_z(theta_b)
print('Non-commutativity of 3D rotations (90 deg each):')
print(f'R_z * R_x =\n{np.round(Rx_then_Rz, 4)}')
print(f'\nR_x * R_z =\n{np.round(Rz_then_Rx, 4)}')
print(f'\nR_z*R_x == R_x*R_z? {np.allclose(Rx_then_Rz, Rz_then_Rx)}')
Summary
This lab covered the key concepts of Analytic Geometry from MML Chapter 3:
| Concept | Key Formula | Section |
|---|---|---|
| Norms | \(\|\mathbf{x}\|_p = \left(\sum_i \lvert x_i \rvert^p\right)^{1/p}\) | 3.1 |
| Inner Products | \(\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x}^\top \mathbf{A} \mathbf{y}\) | 3.2 |
| Cauchy–Schwarz | \(\lvert\langle \mathbf{x}, \mathbf{y} \rangle\rvert \leq \|\mathbf{x}\|\,\|\mathbf{y}\|\) | 3.2 |
| Angles | \(\cos\omega = \langle \mathbf{x}, \mathbf{y} \rangle / (\|\mathbf{x}\|\,\|\mathbf{y}\|)\) | 3.4 |
| Orthogonality | \(\mathbf{x} \perp \mathbf{y} \iff \langle \mathbf{x}, \mathbf{y} \rangle = 0\) | 3.4 |
| Gram-Schmidt | \(\mathbf{u}_k = \mathbf{b}_k - \sum_{j<k} \text{proj}_{\mathbf{u}_j}(\mathbf{b}_k)\) | 3.8.3 |
| Orthogonal Complement | \(U^\perp = \{\mathbf{v} : \langle \mathbf{v}, \mathbf{u}\rangle = 0 \;\forall \mathbf{u} \in U\}\) | 3.6 |
| Function Inner Product | \(\langle u, v \rangle = \int_a^b u(x)v(x)\,dx\) | 3.7 |
| Projection (line) | \(\pi_U(\mathbf{x}) = \frac{\mathbf{b}^\top \mathbf{x}}{\|\mathbf{b}\|^2}\,\mathbf{b}\) | 3.8.1 |
| Projection (subspace) | \(\mathbf{P} = \mathbf{B}(\mathbf{B}^\top\mathbf{B})^{-1}\mathbf{B}^\top\) | 3.8.2 |
| 2D Rotation | \(\mathbf{R}(\theta) = [\cos\theta, -\sin\theta;\; \sin\theta, \cos\theta]\) | 3.9.1 |
| 3D Rotations | \(\mathbf{R}_1, \mathbf{R}_2, \mathbf{R}_3\) about coordinate axes | 3.9.2 |
ML connections:
Projections underpin least-squares regression (Ch. 9) and PCA (Ch. 10)
Inner products define kernels for SVMs (Ch. 12)
Orthogonality is key to Gram-Schmidt, QR decomposition, and numerical stability
Cosine similarity is used throughout NLP and recommendation systems
Rotations are central to computer graphics, robotics, and data augmentation