Horner’s Method: Polynomial at a Point
Multiplications are more expensive to compute than additions. Additionally, rounding error can accumulate when additions and multiplications are mixed incorrectly. To evaluate a polynomial anxn+an−1xn−1+⋯+a2x2+a1x+a0 at some specific x, the most efficient and accurate solution is to use Horner’s Method: re-write the polynomial as ((⋯((an)x+an−1)x+⋯+a2)x+a1)x+a0 and evaluate left-to-right as n multiplications and n additions. On hardware with a fused-multiply-add instruction, this reduces further to just n FMA operations.
Finite Difference
The finite difference of some function f(x) at x is defined to be b−af(x+b)−f(x+a). For our purposes, we will always use b=1 and a=0 to get the simpler f(x+b)−f(x), which is called the forward difference for f at x and write it as Δ[f](x).
The forward difference of the polynomial f(x)=7x3−2x2−8x+3 can be evaluated by expanding f(x+1)−f(x) and combining like terms to get 21x2+17x−3. Taking the forward difference of that, we get 42x+38, and the forward difference of that is simply 42. The forward difference of a constant like 42 is always 0.
Just as in this example, the finite difference of any polynomial is a polynomial of one lesser degree.
Difference addition
If we wish to evaluate a polynomial at a sequence of adjacent integer arguments, we can use forward differences to compute these points very efficiently. In particular, if we know f(x) and Δ[f](x), then we can compute f(x+1)=f(x)+Δ[f](x). If Δ[f] is a constant we can keep adding that same value to get f(x+2), $f(x+3), and so on. If it is not a constant than it is another polynomial we can use this same method on to find the sequence of Δ[f](x)s we need.
Consider the 1st-order polynomial f(x)=42x+38. Use Horner’s rule, we find f(−5)=−172 and f(−4)=−130.
Subtracting these, we find Δ[f](−5)=42. Because f is linear, Δ[f] is constant.
We now find other values of f:
- f(−3)=f(−4)+Δ[f]=−130+42=−88
- f(−2)=f(−3)+Δ[f]=−88+42=−46
- f(−1)=f(−2)+Δ[f]=−46+42=−4
- f(0)=f(−1)+Δ[f]=−4+42=38
We can also move backwards:
- f(−6)=f(−5)−Δ[f]=−172−42=−214
- f(−7)=f(−6)−Δ[f]=−214−42=−266
Consider the 2nd-order polynomial f(x)=21x2+17x−3. Use Horner’s rule, we find
- f(−5)=437
- f(−4)=265
- f(−3)=135
Subtracting these, we find
- Δ[f](−5)=−172
- Δ[f](−4)=−130
and subtracting those we get
- Δ[Δ[f]](−5)=42.
Because f is quadratic, Δ[f] is linear and Δ[Δ[f]] is constant.
Now, we know that f(−2)=f(−3)+Δ[f](−3), but we don’t yet know Δ[f](−3). But we know Δ[f](−3)=Δ[f](−4)+Δ[Δ[f]]=−130+42=−88, which means f(−2)=f(−3)−88=138−88=50. And we got that with just two additions, no multiplications. Similarly
−5 |
42 |
−172 |
437 |
−4 |
42 |
−130 |
265 |
−3 |
42 |
−88 |
135 |
−2 |
42 |
−46 |
47 |
−1 |
42 |
−4 |
1 |
0 |
|
38 |
−3 |
1 |
|
|
35 |
As polynomials get bigger, the Δ[⋯] notation becomes awkward. Sometimes primes like f′ and f′′, or dots like f˙ and f¨, or subscripts like fx and fxx, are used instead.
Let f(x)=7x3−2x2−8x+3. Evaluating using Horner’s rule, we get
- f(0)=3
- f(1)=0 meaning f′(0)=−3
- f(2)=35 meaning f′(1)=35 and f′′(0)=38
- f(3)=150 meaning f′(1)=115 and f′′(0)=80 and f′′′(0)=42
We can now find f larger x by adding forward differences:
0 |
42 |
38 |
−3 |
3 |
1 |
42 |
80 |
35 |
0 |
2 |
42 |
122 |
115 |
35 |
3 |
42 |
164 |
237 |
150 |
4 |
42 |
|
401 |
387 |
5 |
42 |
|
|
788 |
And at smaller x by subtracting forward differences
0 |
42 |
38 |
−3 |
3 |
−1 |
42 |
−4 |
1 |
2 |
−2 |
42 |
−46 |
47 |
−45 |
−3 |
42 |
−88 |
135 |
−180 |
Summary
The method of forward differences lets us evaluate a single-variable polynomial efficiently at integer arguments. All we need to do is
- Evaluate the nth-order polynomial at n+1 adjacent arguments. In the last example above, these were (3,0,35,150). We’ll discard these numbers shortly.
- Subtract, then subtract again, and so on to get a list of forward differences. In the last example above, these were (3,−3,38,42). These numbers we’ll keep, and the first one is the function at the first evaluated x.
- To increase x, we add values in the difference list left-to-right: (3−3,−3+38,38+42,42)=(0,35,80,42).
- To decrease x, we subtract values in the difference list right-to-left: (?,?,38−42,42)→(?,−3−(−4),−4,42)→(3−1,1,−4,42)=(2,1,−4,42).
Multivariate forward differences
This also generalizes naturally to multivariate polynomials. If we have f(x,y) we can find both fx(x,y)=f(x+1,y)−f(x,y) and fy(x,y)=f(x,y+1)−f(y). Conveniently, the order of differencing does not matter:
fyx(x,y)======fx(x,y+1)−fx(x,y)(f(x+1,y+1)−f(x,y+1))−(f(x+1,y)−f(x,y))f(x+1,y+1)−f(x,y+1)−f(x+1,y)+f(x,y)(f(x+1,y+1)−f(x+1,y))−(f(x,y+1)−f(x,y))fy(x+1,y)−fy(x,y)fxy(x,y)
Consider the circle equation f(x,y)=(x−cx)2+(y−cy)2−r2, so called because f(x,y)=0 is a circle of radius r centered at point (cx,cy). Computing the finite differences, we have:
- fx(x,y)=1+2(x−cx)
- fxx(x,y)=2
- fy(x,y)=1+2(y−cy)
- fyy(x,y)=2
- fxy(x,y)=0
Consider drawing a circle of radius 6 centered at (2,3).
We know that (−4,3) is on the circle, and that it is symmetric; if we can find the points between 0° and 45°, we can mirror them to get the rest of the points.
We find our initial value and differences:
- f(−4,3)=0
- fx(−4,3)=−11
- fy(−4,3)=1
- fxx=fyy=2.
Let’s abbreviate this as (f,fx,fy) so our starting state at (−4,3) is (0,−11,1).
Now we’ll repeatedly move up in y and decide if it’s better to move right in x or not.
up to (−4,4) gives us (1,−11,3)
right to (−3,4) would give us (−10,−9,3) but that’s further from 0 so let’s not
plot (−4,4) and its symmetric neighbors.
up to (−4,5) gives us (4,−11,5)
right to (−3,5) would give us (−7,−9,5) but that’s further from 0 so let’s not
plot (−4,5) and its symmetric neighbors.
up to (−4,6) gives us (9,−11,7)
right to (−3,6) gives us (−2,−9,7) which is closer to 0 so let’s use that
plot (−3,6) and its symmetric neighbors.
up to (−3,7) gives us (5,−9,9)
right to (−2,7) gives us (−4,−7,9) which is closer to 0 so let’s use that
plot (−2,7) and its symmetric neighbors.
fy now has larger magnitude than fx, meaning we have reached the 45° point and are done.
The exact details of how we decide to pick between moving in x and not moving depends on if we want to plot points inside, near, or outside the circle. For inside points, always keep f(x,y)≤0; this is what we’d do to fill it in. For outside points, always keep f(x,y)>0; this is what we’d do to mask the circle, coloring things outside it. For nearest points, keep the f(x,y) with the smaller magnitude; this is what we’d do to draw the circle as a single-pixel-width ring.