Curve fitting using unconstrained and constrained linear least squares methods

This online calculator builds a regression model to fit a curve using the linear least squares method. If additional constraints on the approximating function are entered, the calculator uses Lagrange multipliers to find the solution.

This page exists due to the efforts of the following people: Timur, Karen Luckhurst

Created: 2020-06-04 10:02:07, Last updated: 2021-01-18 15:45:38

The calculator below uses the linear least squares method for curve fitting, in other words, to approximate a one-variable function using regression analysis, just like the calculator Function approximation with regression analysis. But, unlike that calculator, this one can find an approximating function that is additionally constrained by particular points, which means that the computed curve fit must pass through those points.

Lagrange multipliers are used to find the curve fit in the constrained case. This imposes some limitations on the regression models that can be used: only models that are linear in their parameters qualify. That's why, unlike the above-mentioned calculator, this one does not include power and exponential regressions. However, it does include higher-order polynomial regressions, up to the 8th order. Formulas and a brief theory recap can be found below the calculator, as usual.

Note that if the x-values field is left empty, the calculator assumes that x changes starting from zero with a +1 increment.

[Calculator: Curve Fitting using Unconstrained and Constrained Linear Least Squares Methods. Inputs include the x- and y-values, the particular points the function must pass through, and the number of digits after the decimal point. For each model (linear, quadratic, cubic, 4th through 8th order polynomial, logarithmic, and hyperbolic regression) the results list the correlation coefficient, the coefficient of determination, and the average relative error, %.]

Linear least squares (LLS)

Linear least squares (LLS) is the least squares approximation of linear functions to data. The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations with more equations than unknowns) by minimizing the sum of the squares of the residuals of the individual equations.

You can find more information, including formulas, about the least squares approximation at Function approximation with regression analysis.

Here we will demonstrate it with linear regression models, where the approximating function is a linear combination of parameters to be determined. The determined values, of course, should minimize the sum of the squares of the residuals.

Suppose we have a set of data points $(x_1,y_1), ..., (x_m,y_m)$.

Our approximating function is the linear combination of parameters to be determined, for example
y(x;a_1,a_2,a_3,a_4,a_5,a_6)=a_1+a_2x+a_3 \ln(x) + ... + a_6x^{10}

We can use matrix notation to express the values of this function

\begin{bmatrix} \hat{y}_1 \\ ... \\ \hat{y}_m \end{bmatrix} = \begin{bmatrix} 1 & x_1 & \ln(x_1) & ... & x_{1}^{10} \\ ... & ... & ... & ... & ... \\ 1 & x_m & \ln(x_m) & ... & x_{m}^{10} \end{bmatrix} \begin{bmatrix} a_1 \\ ... \\ a_6 \end{bmatrix}

Or, in short notation:

\mathbf{\hat{y}=Xa}
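For illustration, here is a minimal Python sketch (assuming NumPy is available; the sample points and the basis choice are made up) showing how X is built column by column from the basis functions evaluated at the data points:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])                          # made-up sample points
X = np.column_stack([np.ones_like(x), x, np.log(x), x**2])  # basis: 1, x, ln(x), x^2
a = np.array([0.5, 1.0, 2.0, 0.1])                          # some parameter vector
y_hat = X @ a                                               # values of y-hat = Xa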

Since we are using the least squares approximation, we should minimize the following function

f(\mathbf{a})=\sum_{i=1}^m[\hat{y}(x_i;\mathbf{a})-y_i]^2,

or, in matrix form

f(\mathbf{a})=|\mathbf{Xa-y}|^2

This value is the squared distance between vector y and vector Xa. To minimize it, Xa should be the projection of y onto the column space of X, and the vector Xa - y should be orthogonal to that space.

This is possible when
(X\mathbf{v})^T(X{\mathbf{a}}-\mathbf{y})=\mathbf{v}^T(X^TX{\mathbf{a}}-X^T\mathbf{y})=0,

where v is an arbitrary vector, so that Xv ranges over the whole column space. Since the equality must hold for every v, the only way to satisfy it is to have

X^TX{\mathbf{a}}-X^T\mathbf{y}=0,

or

X^TX{\mathbf{a}}=X^T\mathbf{y},

hence

\mathbf{a}=(X^TX)^{-1}X^T\mathbf{y}

The calculator uses the formula above in the case of the unconstrained linear least squares method.
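As a sketch of the unconstrained case in Python (assuming NumPy is available; the data points are made up for illustration), here is the normal-equations solution for a quadratic model:

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])          # made-up data
y = np.array([1.1, 1.9, 4.2, 8.9, 17.3])
X = np.column_stack([np.ones_like(x), x, x**2])  # quadratic model y = a1 + a2*x + a3*x^2
# a = (X^T X)^{-1} X^T y; solve() is used instead of an explicit inverse for stability
a = np.linalg.solve(X.T @ X, X.T @ y)
print("coefficients:", a)

In practice, np.linalg.lstsq (based on orthogonal decompositions) is numerically preferable to forming X^TX explicitly, but the code above mirrors the formula derived here.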

Lagrange multipliers

Now let's talk about constraints. These could be:
– curve-fit must pass through particular points (this is supported by the calculator)
– slope of the curve at particular points must be equal to particular values.

So, we need to find an approximating function that, on the one hand, minimizes the sum of the squares,

f(\mathbf{a})=|\mathbf{Xa-y}|^2

and, on the other hand, satisfies the constraints

\begin{bmatrix} y_{c_1} \\ ... \\ y_{c_k} \end{bmatrix} = \begin{bmatrix} 1 & x_{c_1} & \ln(x_{c_1}) & ... & x_{c_1}^{10} \\ ... & ... & ... & ... & ... \\ 1 & x_{c_k} & \ln(x_{c_k}) & ... & x_{c_k}^{10} \end{bmatrix} \begin{bmatrix} a_1 \\ ... \\ a_6 \end{bmatrix}

or, in matrix form,

\mathbf{b = Ca}

This is known as a constrained (conditional) extremum problem, and it is solved by constructing the Lagrangian F(\mathbf{a}, \boldsymbol{\lambda}) using Lagrange multipliers

F(\mathbf{a}, \boldsymbol{\lambda})=f(\mathbf{a})+\boldsymbol{\lambda}^T\varphi(\mathbf{a})

In our case, the Lagrangian is

F(\mathbf{a}, \boldsymbol{\lambda})=|\mathbf{Xa-y}|^2+\boldsymbol{\lambda}^T(\mathbf{Ca - b})

and the task is to find its stationary point. Setting the partial derivatives of F with respect to \mathbf{a} and \boldsymbol{\lambda} to zero gives the conditions 2X^TX\mathbf{a}+C^T\boldsymbol{\lambda}=2X^T\mathbf{y} and C\mathbf{a}=\mathbf{b}; written as a single block system, the formula to find the parameters is

\begin{bmatrix} a \\ \lambda \end{bmatrix} = \begin{bmatrix} 2X^TX & C^T \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} 2X^Ty \\ b \end{bmatrix}

The calculator uses the formula above in the case of the constrained linear least squares method.
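Here is a matching Python sketch of the constrained case (same made-up data as above; the constraint point (2, 5) is hypothetical), solving the block system directly:

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 4.2, 8.9, 17.3])
X = np.column_stack([np.ones_like(x), x, x**2])  # quadratic model
# Constraint: the curve must pass through (2, 5), i.e. a1 + 2*a2 + 4*a3 = 5 (Ca = b)
C = np.array([[1.0, 2.0, 4.0]])
b = np.array([5.0])
n, k = X.shape[1], C.shape[0]
# Block system: [[2X^TX, C^T], [C, 0]] [a; lambda] = [2X^Ty; b]
K = np.block([[2 * X.T @ X, C.T], [C, np.zeros((k, k))]])
rhs = np.concatenate([2 * X.T @ y, b])
sol = np.linalg.solve(K, rhs)
a, lam = sol[:n], sol[n:]
print("coefficients:", a)
print("constraint check C @ a =", C @ a)  # should print [5.]

Note that the fitted coefficients now trade some least-squares accuracy for satisfying the constraint exactly.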
