1 Horner’s Method: Polynomial at a Point

Multiplications are more expensive to compute than additions. Additionally, rounding error can accumulate when additions and multiplications are mixed incorrectly. To evaluate a polynomial anxn+an1xn1++a2x2+a1x+a0a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0 at some specific xx, the most efficient and accurate solution is to use Horner’s Method: re-write the polynomial as ((((an)x+an1)x++a2)x+a1)x+a0((\cdots((a_n) x + a_{n-1}) x + \cdots + a_2) x + a_1) x + a_0 and evaluate left-to-right as nn multiplications and nn additions. On hardware with a fused-multiply-add instruction, this reduces further to just nn FMA operations.

2 Finite Difference

The finite difference of some function f(x)f(x) at xx is defined to be f(x+b)f(x+a)ba\dfrac{f(x+b)-f(x+a)}{b-a}. For our purposes, we will always use b=1b=1 and a=0a=0 to get the simpler f(x+b)f(x)f(x+b)-f(x), which is called the forward difference for ff at xx and write it as Δ[f](x)\Delta[f](x).

The forward difference of the polynomial f(x)=7x32x28x+3f(x) = 7 x^3 - 2 x^2 - 8x + 3 can be evaluated by expanding f(x+1)f(x)f(x+1)-f(x) and combining like terms to get 21x2+17x321x^2 + 17x - 3. Taking the forward difference of that, we get 42x+3842x+38, and the forward difference of that is simply 4242. The forward difference of a constant like 4242 is always 00.

Just as in this example, the finite difference of any polynomial is a polynomial of one lesser degree.

3 Difference addition

If we wish to evaluate a polynomial at a sequence of adjacent integer arguments, we can use forward differences to compute these points very efficiently. In particular, if we know f(x)f(x) and Δ[f](x)\Delta[f](x), then we can compute f(x+1)=f(x)+Δ[f](x)f(x+1) = f(x) + \Delta[f](x). If Δ[f]\Delta[f] is a constant we can keep adding that same value to get f(x+2)f(x+2), $f(x+3), and so on. If it is not a constant than it is another polynomial we can use this same method on to find the sequence of Δ[f](x)\Delta[f](x)s we need.

Consider the 1st-order polynomial f(x)=42x+38f(x) = 42x + 38. Use Horner’s rule, we find f(5)=172f(-5) = -172 and f(4)=130f(-4) = -130.

Subtracting these, we find Δ[f](5)=42\Delta[f](-5) = 42. Because ff is linear, Δ[f]\Delta[f] is constant.

We now find other values of ff:

  • f(3)=f(4)+Δ[f]=130+42=88f(-3) = f(-4) + \Delta[f] = -130+42 = -88
  • f(2)=f(3)+Δ[f]=88+42=46f(-2) = f(-3) + \Delta[f] = -88+42 = -46
  • f(1)=f(2)+Δ[f]=46+42=4f(-1) = f(-2) + \Delta[f] = -46+42 = -4
  • f(0)=f(1)+Δ[f]=4+42=38f(0) = f(-1) + \Delta[f] = -4+42 = 38

We can also move backwards:

  • f(6)=f(5)Δ[f]=17242=214f(-6) = f(-5) - \Delta[f] = -172-42 = -214
  • f(7)=f(6)Δ[f]=21442=266f(-7) = f(-6) - \Delta[f] = -214-42 = -266

Consider the 2nd-order polynomial f(x)=21x2+17x3f(x) = 21x^2 + 17x - 3. Use Horner’s rule, we find

  • f(5)=437f(-5) = 437
  • f(4)=265f(-4) = 265
  • f(3)=135f(-3) = 135

Subtracting these, we find

  • Δ[f](5)=172\Delta[f](-5) = -172
  • Δ[f](4)=130\Delta[f](-4) = -130

and subtracting those we get

  • Δ[Δ[f]](5)=42\Delta\big[\Delta[f]\big](-5) = 42.

Because ff is quadratic, Δ[f]\Delta[f] is linear and Δ[Δ[f]]\Delta\big[\Delta[f]\big] is constant.

Now, we know that f(2)=f(3)+Δ[f](3)f(-2) = f(-3) + \Delta[f](-3), but we don’t yet know Δ[f](3)\Delta[f](-3). But we know Δ[f](3)=Δ[f](4)+Δ[Δ[f]]=130+42=88\Delta[f](-3) = \Delta[f](-4) + \Delta\big[\Delta[f]\big] = -130 + 42 = -88, which means f(2)=f(3)88=13888=50f(-2) = f(-3) - 88 = 138 - 88 = 50. And we got that with just two additions, no multiplications. Similarly

xx Δ[Δ[f]](x)\Delta\big[\Delta[f]\big](x) Δ[f](x)\Delta[f](x) f(x)f(x)
5-5 4242 172-172 437437
4-4 4242 130-130 265265
3-3 4242 88-88 135135
2-2 4242 46-46 4747
1-1 4242 4-4 11
00 3838 3-3
11 3535

As polynomials get bigger, the Δ[]\Delta[\cdots] notation becomes awkward. Sometimes primes like ff\prime and ff\prime\prime, or dots like f˙\dot{f} and f¨\ddot{f}, or subscripts like fxf_x and fxxf_{xx}, are used instead.

Let f(x)=7x32x28x+3f(x) = 7 x^3 - 2 x^2 - 8x + 3. Evaluating using Horner’s rule, we get

  • f(0)=3f(0) = 3
  • f(1)=0f(1) = 0 meaning f(0)=3f'(0) = -3
  • f(2)=35f(2) = 35 meaning f(1)=35f'(1) = 35 and f(0)=38f''(0) = 38
  • f(3)=150f(3) = 150 meaning f(1)=115f'(1) = 115 and f(0)=80f''(0) = 80 and f(0)=42f'''(0) = 42

We can now find ff larger xx by adding forward differences:

xx f(x)f'''(x) f(x)f''(x) f(x)f'(x) f(x)f(x)
00 4242 3838 3-3 33
11 4242 8080 3535 00
22 4242 122122 115115 3535
33 4242 164164 237237 150150
44 4242 401401 387387
55 4242 788788

And at smaller xx by subtracting forward differences

xx f(x)f'''(x) f(x)f''(x) f(x)f'(x) f(x)f(x)
00 4242 3838 3-3 33
1-1 4242 4-4 11 22
2-2 4242 46-46 4747 45-45
3-3 4242 88-88 135135 180-180

4 Summary

The method of forward differences lets us evaluate a single-variable polynomial efficiently at integer arguments. All we need to do is

  • Evaluate the nnth-order polynomial at n+1n+1 adjacent arguments. In the last example above, these were (3,0,35,150)(3,0,35,150). We’ll discard these numbers shortly.
  • Subtract, then subtract again, and so on to get a list of forward differences. In the last example above, these were (3,3,38,42)(3,-3,38,42). These numbers we’ll keep, and the first one is the function at the first evaluated xx.
  • To increase xx, we add values in the difference list left-to-right: (33,3+38,38+42,42)=(0,35,80,42)(3-3,-3+38,38+42,42) = (0,35,80,42).
  • To decrease xx, we subtract values in the difference list right-to-left: (?,?,3842,42)(?,3(4),4,42)(31,1,4,42)=(2,1,4,42)(?,?,38-42,42) \rightarrow (?,-3-(-4),-4,42) \rightarrow (3-1,1,-4,42) = (2,1,-4,42).

5 Multivariate forward differences

This also generalizes naturally to multivariate polynomials. If we have f(x,y)f(x,y) we can find both fx(x,y)=f(x+1,y)f(x,y)f_x(x,y) = f(x+1,y) - f(x,y) and fy(x,y)=f(x,y+1)f(y)f_y(x,y) = f(x,y+1) - f(y). Conveniently, the order of differencing does not matter:

fyx(x,y)=fx(x,y+1)fx(x,y)=(f(x+1,y+1)f(x,y+1))(f(x+1,y)f(x,y))=f(x+1,y+1)f(x,y+1)f(x+1,y)+f(x,y)=(f(x+1,y+1)f(x+1,y))(f(x,y+1)f(x,y))=fy(x+1,y)fy(x,y)=fxy(x,y)\begin{array}{rcl} f_{yx}(x,y) &=& f_x(x,y+1) - f_x(x,y) \\ &=& \big(f(x+1,y+1) - f(x,y+1)\big) - \big(f(x+1,y)-f(x,y)\big) \\ &=& f(x+1,y+1) - f(x,y+1) - f(x+1,y) + f(x,y) \\ &=& \big(f(x+1,y+1) - f(x+1,y)\big) - \big(f(x,y+1) - f(x,y)\big) \\ &=& f_y(x+1,y) - f_y(x,y) \\ &=& f_{xy}(x,y) \end{array}

Consider the circle equation f(x,y)=(xcx)2+(ycy)2r2f(x,y) = (x-c_x)^2 + (y-c_y)^2 - r^2, so called because f(x,y)=0f(x,y) = 0 is a circle of radius rr centered at point (cx,cy)(c_x,c_y). Computing the finite differences, we have:

  • fx(x,y)=1+2(xcx)f_x(x,y) = 1+2(x-c_x)
  • fxx(x,y)=2f_{xx}(x,y) = 2
  • fy(x,y)=1+2(ycy)f_y(x,y) = 1+2(y-c_y)
  • fyy(x,y)=2f_{yy}(x,y) = 2
  • fxy(x,y)=0f_{xy}(x,y) = 0

Consider drawing a circle of radius 6 centered at (2,3).

We know that (4,3)(-4,3) is on the circle, and that it is symmetric; if we can find the points between 0° and 45°, we can mirror them to get the rest of the points.

We find our initial value and differences:

  • f(4,3)=0f(-4,3) = 0
  • fx(4,3)=11f_{x}(-4,3) = -11
  • fy(4,3)=1f_{y}(-4,3) = 1
  • fxx=fyy=2f_{xx} = f_{yy} = 2.

Let’s abbreviate this as (f,fx,fy)(f,f_x,f_y) so our starting state at (4,3)(-4,3) is (0,11,1)(0,-11,1).

Now we’ll repeatedly move up in yy and decide if it’s better to move right in xx or not.

  1. up to (4,4)(-4,4) gives us (1,11,3)(1,-11,3)
    right to (3,4)(-3,4) would give us (10,9,3)(-10,-9,3) but that’s further from 0 so let’s not
    plot (4,4)(-4,4) and its symmetric neighbors.

  2. up to (4,5)(-4,5) gives us (4,11,5)(4,-11,5)
    right to (3,5)(-3,5) would give us (7,9,5)(-7,-9,5) but that’s further from 0 so let’s not
    plot (4,5)(-4,5) and its symmetric neighbors.

  3. up to (4,6)(-4,6) gives us (9,11,7)(9,-11,7)
    right to (3,6)(-3,6) gives us (2,9,7)(-2,-9,7) which is closer to 0 so let’s use that
    plot (3,6)(-3,6) and its symmetric neighbors.

  4. up to (3,7)(-3,7) gives us (5,9,9)(5,-9,9)
    right to (2,7)(-2,7) gives us (4,7,9)(-4,-7,9) which is closer to 0 so let’s use that
    plot (2,7)(-2,7) and its symmetric neighbors.

    fyf_y now has larger magnitude than fxf_x, meaning we have reached the 45° point and are done.

The exact details of how we decide to pick between moving in xx and not moving depends on if we want to plot points inside, near, or outside the circle. For inside points, always keep f(x,y)0f(x,y) \le 0; this is what we’d do to fill it in. For outside points, always keep f(x,y)>0f(x,y) > 0; this is what we’d do to mask the circle, coloring things outside it. For nearest points, keep the f(x,y)f(x,y) with the smaller magnitude; this is what we’d do to draw the circle as a single-pixel-width ring.