Integration by substitution: the ultimate fudge?

I have often worked with students on the topic of integration by substitution. This isn’t much of a surprise – it’s a fiddly topic with plenty of room for error and, conceptually, it contains challenging ideas. The thing that interests me, though, is that the challenges faced by students on this topic are never the same. In part, this is because there are many different ways of teaching the topic in the first place – and most of them involve some sort of fudge.

If you’re not familiar with the term, a fudge is when something is presented in a vague way to avoid an underlying issue, perhaps because the author feels the truth is too complicated for the reader – but sometimes because the truth is too complicated for the author and they lack the skills to explain it! In maths, a fudge might occur when a student knows where to start and where they should finish, but has to gloss over a step in the middle. Or it may occur when an answer is obtained which is obviously wrong – for example an inequality is pointing in the wrong direction – but finding the source of the error proves too tricky. In these cases, a student might fudge the matter by correcting the answer and hoping that nobody notices the gap in the logic en route.

Another common fudge in mathematics is the abuse of notation, where valid notation is manipulated in a non-standard way in order to obtain a conclusion. Some abuses of notation are straight-up errors, but others are justifiable either as an appropriate shorthand or because they rely on an idea which is true but which is yet to be developed in a particular context. Consider the following example:

Evaluate  \int_0^3 x(2x+1)^3~\mathrm{d}x.

There are many ways you may have been taught to solve this – some not involving substitution at all – but the most obvious approach is to use the substitution  u=2x+1. You might therefore write:

 u=2x+1  so   \dfrac{\mathrm{d}u}{\mathrm{d}t} = 2 \dfrac{\mathrm{d}x}{\mathrm{d}t}  and therefore   \frac12\mathrm{d}u=\mathrm{d}x.

 u=2x+1  so   x=\dfrac{u-1}2.
When  x=0,  u=1. When  x=3,  u=7.

Hence  \displaystyle{\int_0^3\! x(2x+1)^3~\mathrm{d}x = \int_1^7\frac{u-1}2\times u^3~\frac12\mathrm{d}u = \frac14\int_1^7 \!\left(u^4-u^3\right)~\mathrm{d}u = \frac14\left[\frac15 u^5 -\frac14 u^4\right]_1^7,}

so  \displaystyle\int_0^3 x(2x+1)^3~\mathrm{d}x = 690.3.
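If you would like to check that arithmetic without trusting the substitution, here is a minimal sketch in Python using exact fractions. The helper names `F_x` and `F_u` are my own labels for the two antiderivatives, not anything standard: `F_x` comes from expanding the original integrand, and `F_u` is the bracketed expression above.

```python
from fractions import Fraction

def F_x(x):
    # Antiderivative of x(2x+1)^3 = 8x^4 + 12x^3 + 6x^2 + x (expanded by hand).
    x = Fraction(x)
    return Fraction(8, 5) * x**5 + 3 * x**4 + 2 * x**3 + Fraction(1, 2) * x**2

def F_u(u):
    # Antiderivative of (1/4)(u^4 - u^3), as obtained after substituting u = 2x + 1.
    u = Fraction(u)
    return Fraction(1, 4) * (u**5 / 5 - u**4 / 4)

direct = F_x(3) - F_x(0)        # limits in x
substituted = F_u(7) - F_u(1)   # limits in u

print(direct, substituted, float(direct))
```

Both routes give the same exact fraction, which is reassuring even if it doesn’t yet explain why the manipulation of  \mathrm{d}u and  \mathrm{d}x was legitimate.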

And actually, if a student wrote this, I’d be pretty happy. But the first line is problematic. Where did the  t come from? And then where did the  \mathrm{d}t vanish to? It makes it look like  \frac{\mathrm{d}x}{\mathrm{d}t} is a fraction – but most students are taught (correctly) that derivatives aren’t fractions and that you can’t just split them into numerators and denominators… so why make an exception now?

A formal derivation

For many students, an informal understanding of how this works is enough. But for those looking to pursue mathematics at a higher level, a more thorough understanding can be beneficial.

So: consider functions  \mathrm{F}, \mathrm{f} and  \mathrm{u} such that  \mathrm{F}^\prime(x)=\mathrm{f}(x), and values  x_1, x_2, u_1 and  u_2 such that  \mathrm{u}(x_i) = u_i for  i=1,2.

Since  \mathrm{f} is the derivative of  \mathrm{F}, it follows from the fundamental theorem of calculus that

 \displaystyle{\int_a^b\! \mathrm{f}(t) ~\mathrm{d} t = \left[\mathrm{F}(t)\right]_a^b = \mathrm{F}(b)-\mathrm{F}(a)}

for any limits a and b, whatever name we give to the variable of integration (here, t).

Specifically, we note that  \displaystyle\int_{u_1}^{u_2}\! \mathrm{f}(u) ~\mathrm{d}u = \mathrm{F}(u_2)-\mathrm{F}(u_1).

However, since  \mathrm{u} is a function of  x, we can apply the chain rule to  \mathrm{F}(\mathrm{u}(x)). This shows us that

 \frac{\mathrm{d}}{\mathrm{d}x} \mathrm{F} (\mathrm{u} (x) ) = \mathrm{f}(\mathrm{u}(x)) \frac{\mathrm{d}u}{\mathrm{d}x}.

Applying the fundamental theorem of calculus again, it follows that

 \displaystyle{\int_{x_1}^{x_2} \mathrm{f}(\mathrm{u}(x)) \frac{\mathrm{d}u}{\mathrm{d}x} ~\mathrm{d}x = \left[\mathrm{F} (\mathrm{u} (x) ) \right]_{x_1}^{x_2} = \mathrm{F}(\mathrm{u}(x_2))-\mathrm{F}(\mathrm{u}(x_1))}.

Finally, recall that  \mathrm{u}(x_1) = u_1 and  \mathrm{u}(x_2) = u_2. We can substitute these into the previous result to show that

 \displaystyle{\int_{x_1}^{x_2} \mathrm{f}(\mathrm{u}(x)) \frac{\mathrm{d}u}{\mathrm{d}x} ~\mathrm{d}x = \mathrm{F}(u_2)-\mathrm{F}(u_1)}.

We have therefore formed two integrals, each of which is equal to  \mathrm{F}(u_2)-\mathrm{F}(u_1). Since both are equal to the same quantity, they must be equal to one another. Therefore,

 \displaystyle{\int_{x_1}^{x_2} \mathrm{f}(\mathrm{u}(x)) \frac{\mathrm{d}u}{\mathrm{d}x} ~\mathrm{d}x = \int_{u_1}^{u_2} \mathrm{f}(u) ~\mathrm{d}u}.
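A numerical sanity check of this identity can be sketched in a few lines of Python. Everything specific here is an assumption chosen for illustration: I have taken  \mathrm{f}=\cos and  \mathrm{u}(x)=x^2 with limits 0.5 and 2, and the `simpson` helper is a plain composite Simpson’s rule written for this purpose rather than a library routine.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson's rule on [a, b] using n subintervals (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        # Odd-indexed interior points get weight 4, even-indexed get weight 2.
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

# Illustrative choices (assumptions, not from the article):
f = math.cos                 # f(u)
u = lambda x: x * x          # u(x)
du_dx = lambda x: 2 * x      # derivative of u with respect to x
x1, x2 = 0.5, 2.0

lhs = simpson(lambda x: f(u(x)) * du_dx(x), x1, x2)  # integral in x
rhs = simpson(f, u(x1), u(x2))                       # integral in u
exact = math.sin(u(x2)) - math.sin(u(x1))            # F(u2) - F(u1), with F = sin

print(lhs, rhs, exact)
```

The three printed values should agree to many decimal places. That is not a proof, of course, but it is a quick way to see both integrals landing on  \mathrm{F}(u_2)-\mathrm{F}(u_1).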

This result is the core of integration by substitution, and students can use it directly should they wish. The formula shows us that a complicated integral (the left-hand side of the equation) can be replaced with a simpler integral (the right-hand side). In principle, the simpler integral should be easier to work with. Let’s return to our original example:

Evaluate  \displaystyle{\int_0^3 x(2x+1)^3~\mathrm{d}x}.

We will again use the substitution  u=2x+1, so  \frac{\mathrm{d}u}{\mathrm{d}x} = 2. Halving both sides gives  \frac{1}{2}\frac{\mathrm{d}u}{\mathrm{d}x} = 1.

Forming an expression with value 1 is important for the next stage: because 1 is the multiplicative identity, we can multiply the integrand by this expression without changing the value of the integral. Hence

 \displaystyle{\int_0^3 \!x(2x+1)^3~\mathrm{d}x = \int_0^3 \!x(2x+1)^3 \frac{1}{2}\frac{\mathrm{d}u}{\mathrm{d}x}~\mathrm{d}x}

and by the result proved earlier, we can write

 \displaystyle{\int_0^3 \! x(2x+1)^3 \frac{1}{2}\frac{\mathrm{d}u}{\mathrm{d}x}~\mathrm{d}x = \int_1^7 \!\frac{u-1}2 \times u^3 \times \frac{1}{2}~\mathrm{d}u},

which we can integrate as before to obtain 690.3.
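
If it isn’t obvious how the earlier result applies here, it may help to name the pieces. The labelling below is just one convenient choice, not the only possible one. Take

 \mathrm{f}(u) = \dfrac{u-1}{2}\times u^3 \times \dfrac12  and  \mathrm{u}(x) = 2x+1,

so that

 \mathrm{f}(\mathrm{u}(x)) = \dfrac{(2x+1)-1}{2}\times (2x+1)^3 \times \dfrac12 = x(2x+1)^3 \times \dfrac12,

which is exactly what multiplies  \frac{\mathrm{d}u}{\mathrm{d}x} in the integrand on the left-hand side. The limits  x_1=0 and  x_2=3 become  u_1=\mathrm{u}(0)=1 and  u_2=\mathrm{u}(3)=7.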

So why do it this way? Is it essential to learn the technique of integration by substitution rigorously, or is it enough to have an informal understanding of why it works? Really, that depends on the student. Whilst some will be perfectly happy to apply methods without fully deriving them, others prefer to know exactly where the rules have come from.

The product rule

In fact, there are plenty of other techniques and methods that teachers tend to state when they could instead be derived – many of them to do with calculus. For instance, the product rule for differentiation, which states that

 \frac {\mathrm{d}} {\mathrm{d}x} (uv) = u v^\prime + v u^\prime,

can be easily derived if students have already covered implicit differentiation and the differentiation of the natural logarithm – both quite approachable topics. Start by letting  y = uv, where  y,  u and