Curve Fitting
by Richard Reid
(reference "Schaum's Outline Series: Theory and Problems of Probability and Statistics"
by Murray R. Spiegel, Ph.D.)
Very often in practice a relationship is found to exist between two (or more) variables, and one wishes to express this relationship in mathematical form by determining an equation connecting the variables.
A first step is the collection of data showing corresponding values of the variables. For example, suppose x and y denote respectively the point count and the advantage in a game of blackjack. Then a sample of n hands of blackjack would reveal the point counts x1, x2, x3, ..., xn and the corresponding advantages y1, y2, y3, ..., yn.
The next step is to plot the points (x1,y1), (x2,y2), ..., (xn,yn) on a rectangular coordinate system. The resulting set of points is sometimes called a scatter diagram.
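As a rough illustration of these two steps, the sketch below pairs up x and y values and draws a crude character-grid scatter diagram using only the Python standard library. The data values are invented for illustration and are not real blackjack results; the names `points` and `text_scatter` are hypothetical. The sketch assumes the sample contains at least two distinct x values and two distinct y values.

```python
# Made-up sample: point counts (x) and corresponding advantages (y).
# These numbers are illustrative only, not measured blackjack data.
xs = [-2, -1, 0, 1, 2, 3]
ys = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0]

# Pair corresponding values into the points (x1,y1), ..., (xn,yn).
points = list(zip(xs, ys))


def text_scatter(points, width=30, height=10):
    """Return a crude scatter diagram as a list of text rows.

    Assumes at least two distinct x values and two distinct y values,
    so that the scaling below never divides by zero.
    """
    x_vals = [p[0] for p in points]
    y_vals = [p[1] for p in points]
    x_min, x_max = min(x_vals), max(x_vals)
    y_min, y_max = min(y_vals), max(y_vals)

    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        # Scale each coordinate onto the grid; row 0 is the top line,
        # so larger y values are drawn nearer the top.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return ["".join(line) for line in grid]


for line in text_scatter(points):
    print(line)
```

In practice a plotting library would normally be used; the text grid is only meant to show how the data pairs map onto a coordinate system.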
From the scatter diagram it is often possible to visualize a smooth curve approximating the data. Such a curve is called an "approximating curve." In Figure 1, for example, the data appear to be approximated well by a straight line, and we say that a linear relationship exists between the variables.
In Figure 2, however, although a relationship exists between the two variables, it is not linear, and so we call it a nonlinear relationship.
In Figure 3, there appears to be no relationship between the variables.
The general problem of finding equations of approximating curves that fit given sets of data is called curve fitting. In practice the type of equation is often suggested by the scatter diagram. Thus, for Figure 1, we could use the equation for a straight line:
y = a + b*x
while for Figure 2, we could try an equation for a parabola or quadratic curve:
y = a + b*x + c*x^2
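Once a type of equation has been chosen, the constants a, b (and c) still have to be determined from the data. The text above does not say how; one common criterion is least squares, and the sketch below assumes that criterion. It builds the normal equations for a polynomial of a given degree and solves them by Gaussian elimination. The function name `fit_polynomial` and the sample values are hypothetical, and the sketch assumes there are more distinct x values than the degree of the polynomial.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a, b, c, ...] for
    y = a + b*x + c*x^2 + ... (lowest power first).

    Assumes more distinct x values than `degree`, so the normal
    equations have a unique solution.
    """
    m = degree + 1
    # Normal equations: for each power k,
    #   sum_j coef_j * sum_i x_i^(j+k) = sum_i y_i * x_i^k
    A = [[sum(x ** (j + k) for x in xs) for j in range(m)] for k in range(m)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(m)]

    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]

    # Back substitution.
    coef = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(A[r][c] * coef[c] for c in range(r + 1, m))
        coef[r] = (rhs[r] - known) / A[r][r]
    return coef


# Made-up data that look roughly linear, as in Figure 1.
line_x = [0, 1, 2, 3, 4]
line_y = [1.1, 2.9, 5.2, 6.8, 9.0]
a, b = fit_polynomial(line_x, line_y, 1)
print(f"straight line: y = {a:.3f} + {b:.3f}*x")

# Made-up data that look roughly parabolic, as in Figure 2.
curve_x = [-2, -1, 0, 1, 2]
curve_y = [9.2, 1.8, 1.1, 6.1, 16.9]
a, b, c = fit_polynomial(curve_x, curve_y, 2)
print(f"parabola: y = {a:.3f} + {b:.3f}*x + {c:.3f}*x^2")
```

Solving the normal equations directly keeps the sketch free of dependencies; for high degrees or poorly scaled data a numerical library's least-squares routine would be a more robust choice.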