Least Squares - A Whirlwind Introduction

Thomas J. Kennedy

Go to Google (or another search engine) and look for Least Squares Approximation. You will find a number of pages on the topic (as expected). One of these is a Wikipedia article, https://en.wikipedia.org/wiki/Least_squares. Unless you are accustomed to a bunch of unfamiliar math terms presented all at once, this page (and others like it) may seem overwhelming.

However, if we scroll through the page, we eventually find a link to Linear Least Squares. Scrolling through that page, we eventually find a somewhat approachable example, although its notation is a bit unfamiliar.

Let us start our discussion by taking a few steps back…

1 A Line Connecting Two Points

Let us go back a few years… to the equation of a line:

\[ y = mx + b \]

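From algebra, we know that \(m\) is the slope of the line and \(b\) is its \(y\)-intercept.

By definition, given two points \((x_0, y_0)\) and \((x_1, y_1)\) on the line, we can write \(m\) as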

\[ m = \frac{y_1 - y_0}{x_1 - x_0} \]

These definitions allow us to write:

\[ y = \frac{y_1 - y_0}{x_1 - x_0} x + b \]

After a quick look at the provided diagram (the line passes through \((x_0, y_0)\), so \(y_0 = m x_0 + b\)), we can write:

\[ \begin{align} b & = y_0 - m x_0 \\ & = y_0 - \frac{y_1 - y_0}{x_1 - x_0} x_0 \\ \end{align} \]

That leads us to

\[ \begin{align} y & = \frac{y_1 - y_0}{x_1 - x_0}x + \left(y_0 - \frac{y_1 - y_0}{x_1 - x_0} x_0\right) \\ & = \frac{y_1 - y_0}{x_1 - x_0}x - \frac{y_1 - y_0}{x_1 - x_0} x_0 + y_0 \\ & = \frac{y_1 - y_0}{x_1 - x_0}(x-x_0) + y_0 \\ \end{align} \]
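The final form of the equation translates almost directly into code. The following is a minimal sketch in Python, not part of the original notes; the function name `line_through` and the sample points are hypothetical and chosen only for illustration.

```python
def line_through(x0, y0, x1, y1):
    """Return a function f(x) for the line through (x0, y0) and (x1, y1).

    Hypothetical helper for illustration; vertical lines are rejected
    because the slope would require dividing by zero.
    """
    if x0 == x1:
        raise ValueError("x0 and x1 must differ (vertical line)")

    m = (y1 - y0) / (x1 - x0)  # slope

    def f(x):
        # y = m (x - x0) + y0
        return m * (x - x0) + y0

    return f


# Made-up sample points: (1, 2) and (3, 6)
f = line_through(1.0, 2.0, 3.0, 6.0)
print(f(1.0), f(3.0), f(2.0))  # 2.0 6.0 4.0
```

The line reproduces both input points exactly, and any other value of x is interpolated (or extrapolated) along the same line.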

1.1 Trend Analysis

We now have a reusable equation that allows us to draw a line between any pair of distinct points (excluding pairs with \(x_0 = x_1\), which form vertical lines). However, data is not always so clean. When working with more than two points, it is very unlikely that a single line will pass through all input points.

This moves us towards approximation (and away from linear interpolation).
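To make the problem concrete, here is a small Python sketch using three made-up data points (they are assumptions for illustration, not data from these notes). The line through the first two points misses the third.

```python
# Three made-up sample points that do not lie on one line
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 4.0)]

# Build the line through the first two points only
(x0, y0), (x1, y1) = points[0], points[1]
m = (y1 - y0) / (x1 - x0)


def line(x):
    return m * (x - x0) + y0


# Difference between each observed y and the line's prediction
residuals = [y - line(x) for x, y in points]
print(residuals)  # [0.0, 0.0, -1.0]
```

The first two differences are zero by construction. The third is not, so no single line fits all three points exactly, and the best we can do is choose a line that keeps such differences small.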

1.2 Change of Notation

Before moving on, let us rewrite

\[ y = mx + b \]

as

\[ y = b + mx \]

and change to a more general notation

\[ \varphi = c_0 + c_1 x \]

This allows us to generalize our approximation function to

\[ \begin{align} \varphi & = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots + c_{n-2} x^{n-2} + c_{n-1} x^{n-1} + c_n x^n \\ & = \sum_{i=0}^{n} c_i x^i \\ \end{align} \]

With this form, we can work with polynomial approximation functions of any degree \(n\) (i.e., we are no longer limited to lines).
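The summation above can be evaluated directly once the coefficients are known. The following Python sketch assumes the coefficients are stored in a list ordered from \(c_0\) to \(c_n\); the function name `phi` and the sample coefficients are hypothetical.

```python
def phi(coefficients, x):
    """Evaluate sum(c_i * x**i) for coefficients ordered [c_0, c_1, ..., c_n]."""
    return sum(c * x**i for i, c in enumerate(coefficients))


# A line: 1 + 2x at x = 3
print(phi([1.0, 2.0], 3.0))  # 7.0

# A quadratic: 1 + 0x + 2x^2 at x = 3
print(phi([1.0, 0.0, 2.0], 3.0))  # 19.0
```

The same function handles a line, a quadratic, or any higher degree. Only the length of the coefficient list changes.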

2 The \(X^T X | X^T Y\) Method

Later in this course, we will discuss a more robust form of Least Squares Approximation. However, we will start with the \(X^T X | X^T Y\) (Wikipedia) method for two reasons:

  1. the accessibility of web-based resources on the method.
  2. the ability to apply the method immediately.