Mathematics Projects
Regression Analysis (relates to Statistics; preferable for actuary students)
a) Please install Maple and be ready to use it (stat library).
b) See this link for how to use the leastsquare function from the stat library to fit curves to data.
Start with some perfectly linear data of your own creation using y = mx + b (m and b your choice) and plugging in values of x ([...] or so should do it). See if you can get Maple to fit the same linear equation to your data. If so, you are off to a good start.
Next, try to get the same results as in the handout for the parabolic data on page [...] page 614 (handout), problem 10.
Additionally, in each case, use Maple to plot both the points given and the least squares polynomial you have generated.
c) Use calculus to derive the equations for the least squares linear fit (see me to [...]
d) Can your TI calculator perform a least squares linear fit? How??
Project #2 Regression Analysis
Overview: in the first project, you got familiar with some of the technical aspects of regression analysis – basic concepts, using [...]
Basic concept:
The goal of regression analysis is as follows: given a set of n data points
{ (x1, y1), (x2, y2), . . . , (xn, yn) }
we wish to find a curve, y = f(x), which is the best possible fit to that data.
That curve, for the time being [...] this is mathematics, part of which seeks to be precise and quantitative.
The word "best" thus needs clarification. What does it mean for one curve to be better than another?? Once that is determined, then one can optimize and seek the "best" of all options. So we need a way to measure how "good" a curve is.
No curve is perfect unless we only have 2 or 3 data points. To be perfect would imply that the curve went exactly through each point, which is usually not possible. So there is an error associated with any curve. We have to define that and then try to minimize it.
For n data points { (x1, y1), (x2, y2), . . . , (xn, yn) } and a curve, y = f(x), its error, E, is defined as
$$E = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$$
Comments: Take a piece of paper and sketch a line as well as some data points not on it. The quantity yi – f(xi) is the vertical distance by which the curve misses the data point at each x value. Since we don't care if it is positive or negative (above or below), and we don't want any cancellation of positive and negative terms, we square all terms. It doesn't [...]
So now we have a way to measure the error associated with a given curve. Regression analysis seeks to find the f(x) which produces the smallest such error, or the "least squares error".
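To make the error measure concrete, here is a minimal Python sketch that computes E as the sum of squared vertical misses. The function name `squared_error`, the data points, and the two candidate lines are all hypothetical and chosen only for illustration.

```python
def squared_error(f, points):
    """Return E: the sum of (y - f(x))**2 over all data points."""
    return sum((y - f(x)) ** 2 for x, y in points)


# Hypothetical data points (x, y).
points = [(0, 1.0), (1, 2.9), (2, 5.2), (3, 7.1)]


def line_1(x):
    """First hypothetical candidate: y = 2x + 1."""
    return 2 * x + 1


def line_2(x):
    """Second hypothetical candidate: y = x + 3."""
    return x + 3


e1 = squared_error(line_1, points)
e2 = squared_error(line_2, points)
print("E for line 1:", e1)
print("E for line 2:", e2)
```

The line with the smaller E is the better fit in the least squares sense; here that is the first line.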
Part One: Linear Regression
Let's suppose we plot the data and find it to be fairly straight and
thus want a straight line. Then y=f(x) = ax+b. The problem thus reduces to:
find slope and intercept, a and b, so that E is a minimum.
(note: keep in mind in what follows that a and b are the variables, not
the x's and y's, which are known data values)
So, find a and b so that
$$E = \sum_{i=1}^{n} \left( y_i - (a x_i + b) \right)^2$$
is a minimum. This is now the problem. The good news is that elementary calculus can be used to solve this! Recall from calculus I that a function may be at a minimum if its derivative is 0. The only new twist is that there are two variables, a and b, so two derivatives are needed. These are then called partial derivatives, but their concept is the same.
Your job:
Compute the derivatives of E with respect to a and b.
Use rules of differentiation from Calc I.
Set your derivatives to 0.
You now have two equations and two unknowns – put them in standard form.
Note the equations are linear! This is why this material appears in this [...]
Feel free to show me your equations when you are done. Do the algebra carefully!!
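As a check on your algebra, here is one way the two equations can come out. It assumes E is the sum of squared vertical misses for the line y = ax + b; the summation index i is notation chosen here.

```latex
% Assumes E(a, b) = \sum_{i=1}^{n} (y_i - (a x_i + b))^2,
% with a and b as the variables and the x_i, y_i as known data.
\begin{align*}
\frac{\partial E}{\partial a} &= -2 \sum_{i=1}^{n} x_i \left( y_i - a x_i - b \right) = 0 \\
\frac{\partial E}{\partial b} &= -2 \sum_{i=1}^{n} \left( y_i - a x_i - b \right) = 0
\end{align*}
% Standard form: unknowns a and b on the left, known sums on the right.
\begin{align*}
a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i &= \sum_{i=1}^{n} x_i y_i \\
a \sum_{i=1}^{n} x_i + b\,n &= \sum_{i=1}^{n} y_i
\end{align*}
```

Every sum on both sides is a number computed from the data, so the system is linear in a and b.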
Next: let's try it out!
1. pick a half dozen data points which are neither perfectly linear nor terribly non linear. Get Maple to plot them.
2. have Maple produce the least squares linear fit and plot it (see Project #1)
3. for the data points you picked, solve your equations (feel free to use Maple) and determine a and b
4. plot your function along with [...]
You should come up with the same thing [...]
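The "solve your equations" step can also be sketched in Python as a cross-check. The sketch assumes that setting both partial derivatives of E to zero gives the 2x2 linear system written in the comments; the function name `linear_fit` and the six data points are hypothetical.

```python
def linear_fit(points):
    """Solve the two least squares equations for slope a and intercept b.

    Assumed system (from setting both partial derivatives of E to 0):
        a * sum(x^2) + b * sum(x) = sum(x*y)
        a * sum(x)   + b * n      = sum(y)
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)

    # Cramer's rule for the 2x2 system above.
    det = sxx * n - sx * sx
    a = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return a, b


# Hypothetical data: roughly linear, but not perfectly so.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 11.9)]
a, b = linear_fit(points)
print("a =", a, "b =", b)
```

Because the least squares line for a given data set is unique, Maple's fit for the same six points should report the same slope and intercept up to rounding.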
Part Two: Quadratic Regression
Same idea as Part One, only we wish to fit a parabola. Not all data is linear!!! This time assume that
y = f(x) = ax^2 + bx + c
and seek to once again minimize the error. Repeat all 5 steps, only this time you will have 3 partial derivatives and 3 equations. Carefully set them up in standard form (variables on left side, constants on right side).
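For checking your work, here is one way the three equations can come out. It assumes the same error measure as in Part One, the sum of squared vertical misses, now with f(x) = ax^2 + bx + c; the summation index i is notation chosen here.

```latex
% Assumes E(a, b, c) = \sum_{i=1}^{n} (y_i - (a x_i^2 + b x_i + c))^2.
\begin{align*}
\frac{\partial E}{\partial a} &= -2 \sum_{i=1}^{n} x_i^2 \left( y_i - a x_i^2 - b x_i - c \right) = 0 \\
\frac{\partial E}{\partial b} &= -2 \sum_{i=1}^{n} x_i \left( y_i - a x_i^2 - b x_i - c \right) = 0 \\
\frac{\partial E}{\partial c} &= -2 \sum_{i=1}^{n} \left( y_i - a x_i^2 - b x_i - c \right) = 0
\end{align*}
% Standard form: unknowns a, b, c on the left, known sums on the right.
\begin{align*}
a \sum_{i=1}^{n} x_i^4 + b \sum_{i=1}^{n} x_i^3 + c \sum_{i=1}^{n} x_i^2 &= \sum_{i=1}^{n} x_i^2 y_i \\
a \sum_{i=1}^{n} x_i^3 + b \sum_{i=1}^{n} x_i^2 + c \sum_{i=1}^{n} x_i &= \sum_{i=1}^{n} x_i y_i \\
a \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i + c\,n &= \sum_{i=1}^{n} y_i
\end{align*}
```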
Next, try it out again!
1. pick a half dozen data points which are fairly quadratic. Get Maple to plot them.
2. have Maple produce the least squares quadratic fit and plot it (see Project #1)
3. for the data points you picked, solve your equations (feel free to use Maple) and determine a, b and c
4. plot your function along with [...]
Comments: at this point it ought to be clear that to fit an nth degree polynomial to data using [...]
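The pattern from Parts One and Two can be sketched for any degree in Python. The sketch assumes the pattern carries over directly: one partial derivative per coefficient, giving degree + 1 linear equations in degree + 1 unknowns. The function name `poly_fit` and the sample data are hypothetical. The data is exactly quadratic so the expected coefficients are known in advance.

```python
def poly_fit(points, degree):
    """Least squares polynomial coefficients, highest power first.

    Builds the (degree + 1) x (degree + 1) linear system obtained by
    setting each partial derivative of E to 0, then solves it by
    Gaussian elimination with partial pivoting.
    """
    m = degree + 1
    # Power sums: s[k] = sum of x**k, t[k] = sum of x**k * y.
    s = [sum(x ** k for x, _ in points) for k in range(2 * degree + 1)]
    t = [sum((x ** k) * y for x, y in points) for k in range(m)]

    # Augmented matrix; unknowns ordered constant term first.
    aug = [[float(s[r + c]) for c in range(m)] + [float(t[r])]
           for r in range(m)]

    # Forward elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]

    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        known = sum(aug[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = (aug[r][m] - known) / aug[r][r]
    return coeffs[::-1]


# Hypothetical data lying exactly on y = 2x^2 - 3x + 1.
quad_points = [(x, 2 * x * x - 3 * x + 1) for x in range(-2, 4)]
quad_coeffs = poly_fit(quad_points, 2)
print("a, b, c =", quad_coeffs)
```

Calling `poly_fit(points, 1)` gives a linear fit from the same code, which is a quick way to confirm that the general pattern reduces to Part One.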
What to hand in:
a paper with
a cover
the work outlined above
Conclusion summarizing [...]
Project #3 Regression Analysis
Your job here is to put your knowledge of Maple and Regression Analysis together and produce an interactive Maple worksheet which illustrates the fundamental concepts of Regression Analysis. You will want to review your first 2 projects and how one puts text into a Maple worksheet.
The final product should have two major parts:
Part One – Linear Regression
Part Two – Quadratic Regression
Each part should have the following:
a) sample data (at least a half dozen points – your choice)
b) a plot of the data
c) a place for the user (not you but a hypothetical person) to enter slope and intercept for the linear case or 3 coefficients for the quadratic. You might want to review for them the purpose of the coefficients in the shape of the graph
d) a graph
e) an explanation of Least Squares best fit
f) a plot of the sample data, their function and the Least Squares best fit function
In general, it should be informative, easy to use and educational.
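The project itself calls for a Maple worksheet; the Python below is only a language-neutral sketch of the logic behind the linear part, comparing a user's line with the least squares line by their errors. All names (`squared_error`, `least_squares_line`, `compare_guess`), the sample data, and the guessed slope and intercept are hypothetical.

```python
def squared_error(f, points):
    """Sum of squared vertical misses of the curve f over the data."""
    return sum((y - f(x)) ** 2 for x, y in points)


def least_squares_line(points):
    """Slope and intercept of the least squares line for the data."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b


def compare_guess(points, slope, intercept):
    """Return (error of the user's line, error of the least squares line)."""
    a, b = least_squares_line(points)
    guess_error = squared_error(lambda x: slope * x + intercept, points)
    best_error = squared_error(lambda x: a * x + b, points)
    return guess_error, best_error


# Hypothetical sample data and a hypothetical user entry.
sample = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1), (6, 11.9)]
guess_error, best_error = compare_guess(sample, slope=2.0, intercept=0.0)
print("error of the user's line:      ", guess_error)
print("error of the least squares fit:", best_error)
```

No line the user enters can have a smaller error than the least squares line, which is one way a worksheet could show what "best fit" means.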