MA332 lecture notes, Fall 1998 Week 10 Tuesday Topics: Curve-fitting Discussion of midterm evaluation Exam review Problem, how much can you program ahead of time? How to invert a matrix. What is the angle between two vectors. (From Strang page 106) One handy bit of linear algebra. What is ata? It's the sum of the squares of the elements of a, which is the square of the 2-norm. Form a triangle of vectors a, b, and the vector (a-b) that connects them. By the law of cosines (generalization of Pythogoras) ||b-a||^2 = ||b||^2 + ||a||^2 - 2 ||b|| ||a|| cos theta ||b-a||^a = (b-a)t (b-a) = btb - 2atb + ata = btb + ata - 2 ||b|| ||a|| cos theta cos theta = atb / ||a|| ||b|| We can think of that as normalizing a and b and then measuring the projection of one onto the other. Curve-fitting ------------- So far, all the matrices have been square, which means that we have the same number of equations and unknowns, which often means that there is a unique solution. Lots of times, though, we have many equations and only a few unknowns. For example, consider the two parameter model of network performance. transit time = latency + packet size / bandwidth Now imagine that we measure a bunch of transit times, for various packet sizes, and we want to estimate the latency and bandwidth. First problem: is this a linear system. Well, no, but by solving for 1/bandwidth it is. Second problem: what if there are more than two data points? yi = b0 + b1 * xi for many values of i Solution: choose the values of the parameters that yield the best fit to the data, where best fit means smallest error. But the error is a vector yi - b0 + b1 * xi for all values of i So what do we mean smallest? 1) one-norm: absolute deviation 2) two-norm: least-squares 3) infinity-norm: minimax By far the most popular is least-squares fitting, because 1) it gives reasonable weight to various errors (small errors are ok, larger errors gradually cost more) 2) it can be solved analytically (So far this is based on Chapter 8.2, but now we switch to the Strang handout) In matrix notation, we are trying to solve Ax = b, where x = (b0, b1)t, b = (y1, y2, ... yn)t, and A is the matrix made up of a column of 1s and a column of xs. E = ||Ax - b|| E^2 = (Ax - b)t (Ax - b) = AtAx^2 - 2 Atbx + btb The minimum of E^2 occurs when dE2/dx = 0 = 2 AtAx - 2 Atb AtA x = At b In other words, if Ax = b is over constrained, multiply both sides by At! Example: size transit time 1 19 ms 2 32 ms 3 38 ms 1 1 19 A = 1 2 b = 32 x = (latency, 1/bandwidth)t 1 3 38 See Maple worksheet for solution.