Least Squares - More Formal Notation
Thomas J. Kennedy
1 TL;DR
Given a collection of discrete points, we need to find a line (a polynomial of degree one) of best fit. Since more than two points will be included, it is extremely unlikely that a single line will pass exactly through every point (i.e., that all points are collinear). Instead, a line of best fit is computed. While most cases use a line, it is possible
- for a polynomial of any degree to be used as the approximation function (provided a sufficient number of points).
- to extend the problem to three or more spatial dimensions.
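As a concrete illustration of a line of best fit, the following is a minimal sketch that fits $c_0 + c_1 x$ to a handful of points using the standard closed-form solution of the degree-one normal equations. The function name `fit_line` and the sample points are made up for this example; they are assumptions, not part of the module.

```python
def fit_line(points):
    """Return (c0, c1) for the least-squares line y = c0 + c1 * x.

    `points` is a sequence of (x, y) pairs. This is a sketch for the
    degree-one case only; the name and interface are hypothetical.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Sums that appear in the degree-one normal equations.
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    sum_xx = sum(x * x for x, _ in points)
    sum_xy = sum(x * y for x, y in points)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; no unique line exists")

    c1 = (n * sum_xy - sum_x * sum_y) / denom  # slope
    c0 = (sum_y - c1 * sum_x) / n              # intercept
    return c0, c1


# Four made-up points that are not collinear.
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 4.0), (3.0, 8.0)]
c0, c1 = fit_line(points)
print(f"best-fit line: y = {c0:.3f} + {c1:.3f} x")
```

For these sample points the fitted coefficients are approximately $c_0 = 0.7$ and $c_1 = 2.2$. No single line passes through all four points, so the result is a compromise that minimizes the sum of squared vertical errors.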
2 Formalizing Notation
So far we have discussed three functions:
- $f(x)$ or $f$ - the unknown function to be approximated
- $\varphi$ - a possible approximation function
- $\hat{\varphi}$ - the best possible approximation function
We need to find the best possible approximation function $\hat{\varphi}$, out of the set of all approximation functions $\Phi$, that minimizes error. In the next module we will define the weighted $L_2$-norm
\[ ||f - \hat{\varphi}|| \]
and use it to formally derive a more rigorous form of Least Squares Approximation.
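Stated compactly, and leaving the norm itself undefined until the next module, the goal described above can be sketched as a minimization over the candidate functions in $\Phi$:

\[ \hat{\varphi} = \underset{\varphi \in \Phi}{\arg\min} \; ||f - \varphi|| \]

This is only a restatement of the sentence above in symbols; it says nothing yet about how the minimum is computed.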
3 Basis Functions
So far, we have defined $\varphi$ as
\[ \begin{align} \varphi & = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \cdots + c_{n-2} x^{n-2} + c_{n-1} x^{n-1} + c_n x^n \\ & = \sum_{i=0}^{n} c_i x^i \end{align} \]
This can be generalized by factoring out the basis functions.
\[ \begin{align} \varphi & = c_0 \pi_0 + c_1 \pi_1 + c_2 \pi_2 + c_3 \pi_3 + \cdots + c_{n-2} \pi_{n-2} + c_{n-1} \pi_{n-1} + c_n \pi_n \\ & = \sum_{i=0}^{n} c_i \pi_i \end{align} \]
For the problems solved thus far the polynomial basis functions would be defined as:
\[ \begin{align} \pi_0 & = x^0 \\ \pi_1 & = x^1 \\ \pi_2 & = x^2 \\ \pi_3 & = x^3 \\ \pi_4 & = x^4 \\ \vdots & = \vdots \\ \pi_{n-1} & = x^{n-1} \\ \pi_{n} & = x^n \\ \end{align} \]
Note the qualifier "polynomial" before the most recent use of the term "basis functions". Thus far, examples have focused on linear combinations of monomials (i.e., sums of terms of the form $c_i x^i$). However, different basis functions (e.g., $\sin(x)$ or $e^x$) can be used instead, if appropriate for the function being approximated.
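To make the generalized form $\varphi = \sum_{i=0}^{n} c_i \pi_i$ concrete, here is a minimal sketch in which each basis function is an ordinary Python callable. The helper names `make_phi` and `monomial_basis` and the coefficient values are assumptions chosen for illustration; they do not come from the module.

```python
import math


def make_phi(coefficients, basis):
    """Build phi(x) = sum of c_i * pi_i(x) from coefficients and basis functions.

    Hypothetical helper: `basis` is a list of one-argument callables and
    `coefficients` is a list of numbers of the same length.
    """
    if len(coefficients) != len(basis):
        raise ValueError("need exactly one coefficient per basis function")

    def phi(x):
        return sum(c * pi(x) for c, pi in zip(coefficients, basis))

    return phi


def monomial_basis(n):
    """Return the polynomial basis [x^0, x^1, ..., x^n] as callables."""
    # The default argument i=i binds each exponent when the lambda is created.
    return [lambda x, i=i: x ** i for i in range(n + 1)]


# Polynomial basis: phi(x) = 1 + 2x + 3x^2
poly = make_phi([1.0, 2.0, 3.0], monomial_basis(2))

# Non-polynomial basis: phi(x) = 0.5 sin(x) + 2 e^x
mixed = make_phi([0.5, 2.0], [math.sin, math.exp])

print(poly(2.0))   # 1 + 2*2 + 3*4
print(mixed(0.0))  # 0.5*sin(0) + 2*e^0
```

The same `make_phi` function serves both cases, which is the point of factoring out the basis: only the list of $\pi_i$ changes, while the linear-combination structure stays fixed.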