3  Setting Up the Tools

Every method in this book — Euler, RK2, RK3, and RK4 — is derived from one idea: expand a function of two variables as a power series in the step size \(h\), then choose coefficients so that the numerical approximation matches the exact Taylor expansion through as many powers of \(h\) as possible. This chapter establishes the three tools that make that program precise: the two-variable Taylor expansion in the form we will actually use, Big-O notation for measuring what we throw away, and a formal definition of “order \(p\).” At the end of the chapter we introduce the Python pipeline that will carry all the symbolic algebra from Chapter 4 onward.

3.1 The One Formula That Drives Everything: Expanding \(f(x + ah, y + bk)\)

Every Runge–Kutta scheme evaluates the right-hand side \(f\) at a point that is displaced from \((x, y)\) by some multiple of the step size \(h\) in the \(x\)-direction and some multiple in the \(y\)-direction. We need to know how \(f\) changes at such a displaced point. That question is answered by the two-variable Taylor series.

You already know the single-variable Taylor series: if \(g\) is smooth, then \(g(x + \varepsilon) = g(x) + \varepsilon\, g'(x) + \tfrac{\varepsilon^2}{2!} g''(x) + \cdots\) The two-variable version works the same way, but we expand in two directions at once. Given a smooth function \(f(x, y)\), a displacement \(a h\) in \(x\), and a displacement \(b k\) in \(y\):

\[f(x + a h,\ y + b k) = f(x, y) + a h\, f_x + b k\, f_y + \tfrac{1}{2}\!\left(a^2 h^2 f_{xx} + 2 a b\, h k\, f_{xy} + b^2 k^2 f_{yy}\right) + \cdots\]

Here the subscripts denote partial derivatives: \(f_x = \partial f/\partial x\), \(f_{xx} = \partial^2 f/\partial x^2\), \(f_{xy} = \partial^2 f/(\partial x\, \partial y)\), and so on, all evaluated at \((x, y)\). This is exactly the formula you saw when you studied multivariable Taylor series; no new mathematics is required.

In our IVP setting the independent variable is \(t\) (or \(x\)), the dependent variable is \(y\), and \(f(x, y)\) is the right-hand side of \(y' = f(x, y)\). The displacement in \(x\) will always be a multiple of \(h\) (the step size), and the displacement in \(y\) will always be a multiple of \(k\) where \(k = h\, f(x, y)\) or some related slope estimate. So \(a\) and \(b\) are just rational numbers whose values we will solve for in each chapter. Collecting by power of \(h\) is how we read off the order conditions.

Let’s verify this expansion in Python so there is nothing to take on faith. We use SymPy to manipulate expressions symbolically. In the RK context \(k\) is itself proportional to \(h\) (we will set \(k = h\, f(x, y)\)), so the displacement in \(y\) is also a multiple of \(h\). We make that explicit here by writing the displacement as \(b \cdot h \cdot g\), where \(g\) is an abstract slope estimate independent of \(h\). Then we ask SymPy’s series() to expand \(f(x + a h,\ y + b h g)\) around \(h = 0\) up to order \(h^2\):

import sympy as sp
x, y, h, a, b, g = sp.symbols('x y h a b g')
f = sp.Function('f')
expand_F = sp.series(f(x + a*h, y + b*h*g), h, 0, 3).removeO()

# Manual collect-by-h. SymPy's sp.collect can fail on expressions with
# multivariate Derivatives, so we group by h^k by extracting coefficients
# directly.
def collect_h(expr, sym, n_max=8):
    expr = sp.expand(expr)
    return sum(expr.coeff(sym, k) * sym**k for k in range(n_max + 1))

collect_h(expand_F, h)

\(\displaystyle h^{2} \left(\frac{a^{2} \left. \frac{\partial^{2}}{\partial \xi_{1}^{2}} f{\left(\xi_{1},y \right)} \right|_{\substack{ \xi_{1}=x }}}{2} + \frac{a b g \left. \frac{\partial^{2}}{\partial _b*g*xi + y\partial \xi_{1}} f{\left(\xi_{1},_b*g*xi + y \right)} \right|_{\substack{ \xi_{1}=x }}}{2} + \frac{a b g \left. \frac{\partial^{2}}{\partial _a*xi + x\partial \xi_{2}} f{\left(_a*xi + x,\xi_{2} \right)} \right|_{\substack{ \xi_{2}=y }}}{2} + \frac{b^{2} g^{2} \left. \frac{\partial^{2}}{\partial \xi_{2}^{2}} f{\left(x,\xi_{2} \right)} \right|_{\substack{ \xi_{2}=y }}}{2}\right) + h \left(a \left. \frac{\partial}{\partial \xi_{1}} f{\left(\xi_{1},y \right)} \right|_{\substack{ \xi_{1}=x }} + b g \left. \frac{\partial}{\partial \xi_{2}} f{\left(x,\xi_{2} \right)} \right|_{\substack{ \xi_{2}=y }}\right) + f{\left(x,y \right)}\)

The output shows the constant term \(f(x, y)\), the first-order term in \(h\) with coefficient \(a\, f_x + b g\, f_y\) (using SymPy’s Derivative notation for the partial derivatives), and the second-order term in \(h^2\) involving \(f_{xx}\), \(f_{xy}\), \(f_{yy}\). The slope estimate \(g\) stands in for whatever specific slope we will use in each chapter (typically \(g = f(x, y)\), sometimes a previously-computed stage value).

When we substitute \(g = f(x, y)\) in later chapters, the entire first-order piece becomes proportional to \(h\), the second-order piece becomes proportional to \(h^2\), and so on. Every term groups cleanly by power of \(h\), which is what makes the order-condition approach work.

3.2 Big-O Notation and Truncation Error

When we truncate the Taylor series after the \(h^p\) term, all remaining terms have \(h^{p+1}\) or higher as a factor. Big-O notation is a compact way to say exactly that.

\[g(h) = O(h^p) \quad\text{means}\quad \left|\frac{g(h)}{h^p}\right| \le C \quad\text{for}\quad h \to 0\]

In plain language: \(g(h)\) is \(O(h^p)\) if \(g\) shrinks at least as fast as \(h^p\) as \(h \to 0\), for some constant \(C\) that does not depend on \(h\). The constant \(C\) may depend on derivatives of \(f\), on the interval length, and on other fixed quantities — but not on \(h\) itself.

When a numerical method approximates one step of the ODE, the difference between the numerical answer and the exact answer is called the local truncation error (LTE). The LTE is the error committed in a single step, assuming we start the step with the exact value of \(y\). If the LTE is \(O(h^{p+1})\), we say the method has order \(p\).

Why \(p+1\) in the LTE but \(p\) for the order? Because the LTE is the error per step, but we take roughly \(1/h\) steps to cover a fixed interval. Multiplying the per-step error \(O(h^{p+1})\) by the \(1/h\) steps gives \(O(h^p)\) for the total accumulated error over the interval. That accumulated (global) error is the number that actually matters for practical accuracy, and it is \(O(h^p)\). So a method with LTE \(O(h^{p+1})\) gives global error \(O(h^p)\), and we call it an order-\(p\) method.

Example: if a method has order 4, and we halve the step size, the global error is multiplied by \((1/2)^4 = 1/16\). That is a 16-fold accuracy gain from a 2-fold increase in computation. This is the essential payoff of high-order methods.

Let’s make this concrete. SymPy’s series() returns an object that includes a symbolic remainder term \(O(h^{p+1})\). We can use .removeO() to strip it off and inspect just the polynomial part:

y_func = sp.Function('y')
euler_expansion = sp.series(y_func(x + h), h, 0, 4)
euler_expansion

\(\displaystyle y{\left(x \right)} + h \left. \frac{d}{d \xi_{1}} y{\left(\xi_{1} \right)} \right|_{\substack{ \xi_{1}=x }} + \frac{h^{2} \left. \frac{d^{2}}{d \xi_{1}^{2}} y{\left(\xi_{1} \right)} \right|_{\substack{ \xi_{1}=x }}}{2} + \frac{h^{3} \left. \frac{d^{3}}{d \xi_{1}^{3}} y{\left(\xi_{1} \right)} \right|_{\substack{ \xi_{1}=x }}}{6} + O\left(h^{4}\right)\)

The output displays \(y(x) + h\, y'(x) + \tfrac{h^2}{2}\, y''(x) + \tfrac{h^3}{6}\, y'''(x) + O(h^4)\). Euler’s method keeps only \(y(x) + h\, y'(x)\). The error per step is therefore \(\tfrac{h^2}{2}\, y''(x) + \cdots = O(h^2)\). Global error is \(O(h)\). This is order 1.

3.3 What “Order” Means: A Method Is Order \(p\) If the Global Error Is \(O(h^p)\)

We now state the definition precisely, because every derivation in this book is ultimately an argument that a certain method satisfies it.

Definition. A one-step numerical method is said to have order \(p\) (or to be an order-\(p\) method) if its global error satisfies:

\[\left|y_N - y(t_N)\right| \le C\, h^p\]

for all step sizes \(h\) smaller than some threshold, where \(C\) is a constant independent of \(h\), \(y_N\) is the numerical solution after \(N\) steps, and \(y(t_N)\) is the exact solution at the same final time. (\(N = (t_N - t_0)/h\) so \(N\) grows as \(h\) shrinks.)

Equivalently, and more practically: a method is order \(p\) if and only if its local truncation error is \(O(h^{p+1})\). The local condition is what we can actually verify algebraically, step by step, which is why all our derivations will focus on the LTE.

The strategy for deriving any RK method is now clear:

  1. Write down the general form of the method (a weighted sum of slopes \(k_1, k_2, \ldots\)).
  2. Expand each slope \(k_i\) as a two-variable Taylor series in \(h\).
  3. Collect by power of \(h\).
  4. Demand that every coefficient through \(h^p\) matches the corresponding coefficient in the exact Taylor expansion of \(y(x + h)\).
  5. Solve the resulting system of equations for the free parameters.

Each step in that list is mechanical once the formulas are set up. In Chapter 4 we do it entirely by hand for Euler’s method. By Chapter 7 the algebra is too large for a human to track reliably, and SymPy takes over.

A critical observation about condition counting. At order \(p\) the exact Taylor expansion has a certain number of independent derivative terms, each of which must be matched. These are the order conditions. The number of conditions grows rapidly with \(p\):

import pandas as pd
counts = [1, 1, 2, 4, 8, 17, 37]
df = pd.DataFrame({"Order p": list(range(1, 8)), "# conditions": counts})
df
Order p # conditions
0 1 1
1 2 1
2 3 2
3 4 4
4 5 8
5 6 17
6 7 37

The explosion from 8 conditions at order 4 to 17 at order 5 is not a coincidence — it reflects a deep combinatorial structure (rooted trees) that Chapter 9 explains. For now, note that order 4 is the last level where the system is small enough to solve by hand. That is exactly why RK4 is the method everyone uses.

3.4 Introducing the SymPy Symbolic Pipeline: series(), collect(), and coeff()

Starting in Chapter 4 we will use SymPy not just to compute numerical answers, but to carry the symbolic algebra of the derivations. Three functions form the backbone of that pipeline. This section introduces them on a simple example so that when they appear mid-derivation you will already know what each one does.

sp.series(expr, var, point, n) expands expr as a power series in var around the given point, up to (but not including) order n. The result includes both the polynomial part and the remainder term \(O((\text{var} - \text{point})^n)\).

sp.series(sp.sin(h), h, 0, 6)

\(\displaystyle h - \frac{h^{3}}{6} + \frac{h^{5}}{120} + O\left(h^{6}\right)\)

The output is \(h - h^3/6 + h^5/120 + O(h^6)\). We recognize the first three terms of the Taylor series for sine. series() is exact for functions SymPy knows, and it also works on user-defined symbolic functions like f(x, y) where the partial derivatives appear as Derivative(...) symbols.

expr.removeO() strips the \(O(\cdots)\) remainder and returns the polynomial part as an ordinary expression. This is useful when we want to do further algebra on the truncated expansion without carrying the remainder.

sp.series(sp.sin(h), h, 0, 6).removeO()

\(\displaystyle \frac{h^{5}}{120} - \frac{h^{3}}{6} + h\)

sp.collect(expr, var) rewrites an expanded polynomial expression by grouping terms that share the same power of var. This is the tool we use after a large expansion to read off the coefficient of each \(h^k\) separately.

expr = 3*h + 5*h**2 + 2*h + h**2 + 7*h**3
sp.collect(expr, h)

\(\displaystyle 7 h^{3} + 6 h^{2} + 5 h\)

The result is \(5 h + 6 h^2 + 7 h^3\). collect() has combined like powers. In the RK derivations the expressions will involve symbolic partial derivatives of \(f\) rather than numbers, so collect() will group those derivative terms under each \(h^k\) heading.

expr.coeff(var, n) extracts the coefficient of \(\text{var}^n\) in expr. This is how we will isolate individual order conditions: the coefficient of \(h^2\) must match between the numerical expansion and the exact expansion.

(5*h + 6*h**2 + 7*h**3).coeff(h, 2)

\(\displaystyle 6\)

The result is 6. In the RK context the extracted coefficient will be a symbolic expression involving the free parameters (\(b_1, b_2, a_{21}, \ldots\)) and we will set it equal to the corresponding coefficient from the exact expansion, producing one algebraic equation per order condition.

Now let’s run the three-function pipeline together on a preview of what we will do in Chapter 4. The exact Taylor expansion of \(y(x + h)\) begins:

import sympy as sp
x, y_sym, h = sp.symbols('x y h')
f = sp.Function('f')
exact_expansion = (y_sym
                   + h * f(x, y_sym)
                   + (h**2/2) * (f(x, y_sym).diff(x)
                                 + f(x, y_sym).diff(y_sym) * f(x, y_sym)))

def collect_h(expr, sym, n_max=8):
    expr = sp.expand(expr)
    return sum(expr.coeff(sym, k) * sym**k for k in range(n_max + 1))

collect_h(exact_expansion, h)

\(\displaystyle h^{2} \left(\frac{f{\left(x,y \right)} \frac{\partial}{\partial y} f{\left(x,y \right)}}{2} + \frac{\frac{\partial}{\partial x} f{\left(x,y \right)}}{2}\right) + h f{\left(x,y \right)} + y\)

The output shows the exact Taylor expansion through \(h^2\). The \(h^1\) term is \(h\, f(x, y)\) — that is the slope times the step. The \(h^2\) term involves the partial derivatives of \(f\). In Chapter 4 we will show that Euler’s method matches the \(h^1\) term and nothing more, giving order 1. In Chapter 5 we will match both terms, giving order 2.

Let’s also verify the coeff() extraction on this expression:

coeff1 = sp.expand(exact_expansion).coeff(h, 1)
coeff1

\(\displaystyle f{\left(x,y \right)}\)

The result is f(x, y) — the slope. .coeff(h, 2) would give the second-order term. These are exactly the quantities we will match against the numerical method’s own expansion.

Summary of the pipeline. For every RK derivation in Chapters 4 through 7 we will follow exactly these steps in SymPy:

  1. Write the numerical method symbolically: \(y_{n+1} = y_n + h (b_1 k_1 + b_2 k_2 + \ldots)\) where \(k_i\) are expressed in terms of \(f\) evaluated at displaced points.
  2. Expand each \(k_i\) using sp.series() around \(h = 0\).
  3. Collect the full expression by powers of \(h\) using removeO() and sp.collect().
  4. Extract the coefficient of each \(h^k\) using .coeff().
  5. Form the order conditions: set each extracted coefficient equal to the corresponding coefficient from the exact expansion.
  6. Solve the system with sp.solve().

Every derivation in this book is an instance of this six-step program. The next chapter runs it for Euler’s method, where the algebra is minimal and every step is transparent.

One final note on Python conventions. Throughout the book we write f(x, y) (using parentheses, as Python requires) when we mean the mathematical function \(f(x, y)\). With f = sp.Function('f') declared, f(x, y) is a symbolic application. Partial derivatives are expressed with .diff(): f(x, y).diff(x) gives \(f_x\), and f(x, y).diff(x).diff(y) (or equivalently f(x, y).diff(x, y)) gives \(f_{xy}\). When a formula appears in display math we use standard mathematical notation (parentheses, subscripts); when it appears in a code cell we use Python syntax. The correspondence is exact.

import sympy as sp
x, y_sym = sp.symbols('x y')
f = sp.Function('f')
f(x, y_sym).diff(x)

\(\displaystyle \frac{\partial}{\partial x} f{\left(x,y \right)}\)

f(x, y_sym).diff(y_sym)

\(\displaystyle \frac{\partial}{\partial y} f{\left(x,y \right)}\)

f(x, y_sym).diff(x, 2)

\(\displaystyle \frac{\partial^{2}}{\partial x^{2}} f{\left(x,y \right)}\)

f(x, y_sym).diff(x, y_sym)

\(\displaystyle \frac{\partial^{2}}{\partial y\partial x} f{\left(x,y \right)}\)

We now have everything we need to begin the derivations. The next chapter applies this machinery to the simplest possible case: Euler’s method, which is Runge–Kutta of order 1.