Least-Squares Polynomial Approximation
Theory
If it is known that the measured quantity $y$ (dependent variable) is a linear function of $x$ (independent variable), i.e.
$$y = a_0 + a_1 x$$
the most probable values of $a_0$ (intercept) and $a_1$ (slope) can be estimated from a set of $n$ pairs of experimental data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where the $y$-values are contaminated with a normally distributed, zero-mean random error (e.g. noise, experimental uncertainty). This estimation is known as least-squares linear regression. Least-squares linear regression is only a special case of least-squares polynomial regression analysis. With this analysis, it is straightforward to fit any polynomial of degree $m$
$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m$$
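to experimental data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ (provided that $n \ge m+1$) so that the sum of squared residuals $S$ is minimized:
$$S = \sum_{i=1}^{n} \left[ y_i - \left( a_0 + a_1 x_i + a_2 x_i^2 + \cdots + a_m x_i^m \right) \right]^2$$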
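By taking the partial derivatives of $S$ with respect to $a_0, a_1, \ldots, a_m$ and setting these derivatives equal to zero, the following system of $m+1$ equations in $m+1$ unknowns ($a_0, a_1, \ldots, a_m$) is obtained:
$$\begin{aligned}
s_0 a_0 + s_1 a_1 + \cdots + s_m a_m &= t_0 \\
s_1 a_0 + s_2 a_1 + \cdots + s_{m+1} a_m &= t_1 \\
&\;\;\vdots \\
s_m a_0 + s_{m+1} a_1 + \cdots + s_{2m} a_m &= t_m
\end{aligned}$$
where:
$$s_k = \sum_{i=1}^{n} x_i^k, \qquad t_k = \sum_{i=1}^{n} x_i^k y_i$$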
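As a brief sketch of the differentiation step (the row index $k$ and column index $j$ are introduced here only for illustration), setting the derivative of $S$ with respect to a generic coefficient $a_k$ to zero gives one equation for each $k = 0, 1, \ldots, m$:
$$\frac{\partial S}{\partial a_k} = -2 \sum_{i=1}^{n} x_i^k \left[ y_i - \sum_{j=0}^{m} a_j x_i^j \right] = 0 \quad\Longrightarrow\quad \sum_{j=0}^{m} a_j \sum_{i=1}^{n} x_i^{j+k} = \sum_{i=1}^{n} x_i^k y_i$$
The inner sums on the left depend only on the $x$-values (sums of powers of $x_i$), and the right-hand side is a sum of $x_i^k y_i$, so the system is linear in the unknown coefficients.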
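(Note that $s_0 = n$ always.) This system is known as the system of normal equations. The set of coefficients $a_0, a_1, \ldots, a_m$ is the unique solution of this system, provided that at least $m+1$ of the $x_i$ values are distinct. For $m = 1$, the familiar expressions used in the linear least-squares fit are obtained:
$$a_0 = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}, \qquad a_1 = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}$$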
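The $m = 1$ case can be illustrated with a minimal Python sketch. It assumes NumPy is available; the function name `linear_fit` and the sample data are hypothetical and chosen only for illustration. It evaluates the standard closed-form intercept and slope from the sums of $x_i$, $y_i$, $x_i^2$ and $x_i y_i$.

```python
import numpy as np

def linear_fit(x, y):
    """Return (a0, a1) for the least-squares line y = a0 + a1*x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    sx, sy = x.sum(), y.sum()                # sum of x_i, sum of y_i
    sxx, sxy = (x * x).sum(), (x * y).sum()  # sum of x_i^2, sum of x_i*y_i
    denom = n * sxx - sx**2                  # zero only if all x-values coincide
    if denom == 0:
        raise ValueError("need at least two distinct x-values")
    a0 = (sxx * sy - sx * sxy) / denom       # intercept
    a1 = (n * sxy - sx * sy) / denom         # slope
    return a0, a1

# Hypothetical data lying exactly on the line y = 1 + 2x
a0, a1 = linear_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```

Because the sample points lie exactly on a line, the recovered intercept and slope should match that line up to rounding error.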
Similar (but far more complicated) expressions are obtained for the coefficients of polynomials of higher degree. These expressions are almost never used directly for $m > 1$. Instead, the system of normal equations is set up and the solution vector of coefficients $a_0, a_1, \ldots, a_m$ is calculated, usually with the aid of a computer.
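One way to carry out this computation is sketched below in Python with NumPy. This is an illustrative sketch, not a reference implementation: the function name `polyfit_normal_equations` and the sample data are hypothetical. It forms the sums of powers of $x_i$ and of $x_i^k y_i$, assembles the $(m+1) \times (m+1)$ coefficient matrix, and solves the system with `numpy.linalg.solve`.

```python
import numpy as np

def polyfit_normal_equations(x, y, m):
    """Fit a degree-m polynomial by solving the normal equations.

    Returns the coefficients in increasing-degree order: a0, a1, ..., am.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.size < m + 1:
        raise ValueError("need n >= m + 1 data points")
    # s[k] = sum of x_i**k for k = 0..2m (s[0] equals n)
    s = np.array([np.sum(x**k) for k in range(2 * m + 1)])
    # t[k] = sum of x_i**k * y_i for k = 0..m
    t = np.array([np.sum(x**k * y) for k in range(m + 1)])
    # Entry (j, k) of the coefficient matrix is s[j + k]
    idx = np.arange(m + 1)
    A = s[idx[:, None] + idx[None, :]]
    return np.linalg.solve(A, t)

# Hypothetical data generated exactly from y = 1 - x + 2x^2
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 7.0, 16.0, 29.0]
coeffs = polyfit_normal_equations(x, y, 2)
print(coeffs)
```

The normal-equation matrix tends to become ill-conditioned as the degree grows, so for higher degrees a routine based on orthogonal factorization, such as `numpy.polynomial.polynomial.polyfit`, is usually a safer choice than solving the system directly.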
The quality of fit is judged by the coefficient of determination ($r^2$):
$$r^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $\hat{y}_i$ is the theoretical $y$-value corresponding to $x_i$ (calculated through the polynomial) and $\bar{y}$ is the mean value of all experimental $y$-values. For a least-squares fit it is always $0 \le r^2 \le 1$. $r^2 = 1$ indicates a perfect fit (the curve passes exactly through all data points), whereas $r^2$ becomes progressively less than 1 as the scatter of points about the best-fit curve increases. Another measure of the quality of fit is the sum of squared residuals itself: $S = 0$ again indicates a perfect fit.
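Both quality measures can be computed directly from the fitted coefficients. The Python sketch below assumes NumPy; the function name `fit_quality` and the sample data are hypothetical. It evaluates the fitted polynomial at each data point, then returns the sum of squared residuals together with the coefficient of determination computed as one minus the ratio of the residual sum of squares to the total sum of squares about the mean.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_quality(x, y, coeffs):
    """Return (S, r2) for a polynomial with coefficients a0, a1, ..., am."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # polyval expects coefficients in increasing-degree order
    y_hat = P.polyval(x, coeffs)
    S = np.sum((y - y_hat) ** 2)          # sum of squared residuals
    total = np.sum((y - y.mean()) ** 2)   # zero if all y-values are equal
    return S, 1.0 - S / total

# Hypothetical noisy data scattered around a straight line
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
coeffs = P.polyfit(x, y, 1)               # least-squares fit of degree 1
S, r2 = fit_quality(x, y, coeffs)
print(S, r2)
```

If all $y$-values are identical, the denominator is zero and $r^2$ is undefined; the sketch does not guard against that case.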