The Chain Rule

We have already met one chain rule, for differentiating a function of a function. For example, the derivative of cos(x^2) is -2x sin(x^2). In that case we first differentiate the cosine, which is the outer function, then multiply by the derivative of x^2, which is the inner function.
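Written out step by step, the single-variable example looks like this. The symbol v for the inner function is just a label introduced here for illustration:

```latex
\frac{d}{dx}\cos(x^2)
  = \frac{d\cos v}{dv}\cdot\frac{dv}{dx}
  = -\sin(v)\cdot 2x
  = -2x\sin(x^2),
  \qquad v = x^2
```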

We can also use the chain rule when we have functions of more than one variable, as we'll see in this section.

Suppose we have some function u(x,y). But now suppose that those variables x and y are themselves functions of other variables, s and t. Then we can differentiate u with respect to x or y, but we can also differentiate u with respect to s or t, by using the chain rule.

Let's look at a specific example to make it clearer. Suppose these are the functions in question, u(x,y) and x(s,t) and y(s,t):

Now we can differentiate u with respect to x. What does that give us?

And we can differentiate u with respect to y. What does that give us?

What about the derivatives of x and y with respect to s and t? That's 4 derivatives in all. What are they?

Now here's the point. If we change s, then x and y both change, since they both depend on s. However, if x and y change, then u must change too, since u depends on x and y.

So, by changing the value of s, we change the value of u (in an indirect sort of way, through x and y).

But that means that u depends on s, so we should be able to work out how u changes as s changes, i.e. the derivative of u with respect to s.
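The indirect dependence can be seen numerically with a minimal sketch. The three functions below are hypothetical stand-ins chosen purely for illustration; they are not the functions used in this section.

```python
# Hypothetical example functions (assumptions for illustration only,
# NOT the functions from this section).
def x(s, t):
    return s * t

def y(s, t):
    return s + t

def u(x_val, y_val):
    return x_val**2 * y_val

def u_of_st(s, t):
    # u never sees s or t directly; it only sees x and y.
    return u(x(s, t), y(s, t))

t = 2.0
before = u_of_st(1.0, t)   # s = 1.0
after = u_of_st(1.1, t)    # s nudged to 1.1, t held fixed

# Changing s changes x and y, which in turn changes u.
print("u before:", before)
print("u after: ", after)
```

Holding t fixed and nudging s is exactly the situation a derivative of u with respect to s describes.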

That's where the chain rule comes in. Here's how we can work out the derivative of u with respect to s:
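In standard notation, for u(x, y) with x = x(s, t) and y = y(s, t), the multivariable chain rule for the derivative with respect to s takes the form:

```latex
\frac{\partial u}{\partial s}
  = \frac{\partial u}{\partial x}\,\frac{\partial x}{\partial s}
  + \frac{\partial u}{\partial y}\,\frac{\partial y}{\partial s}
```

There is one term for each route from s to u: one through x and one through y.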

You can see the "indirect" way that u depends on s, purely through x and y.

Now we can simply substitute in the results we got above for the 4 derivatives on the right-hand side to find the result. Try this, then check your answer.

For practice, now try to do the same thing to find the derivative of u with respect to t. Have a go, then check your answer.
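One way to check answers of this kind is to compare the chain-rule result against a direct numerical derivative. The sketch below does this for a set of hypothetical functions (u = x^2 y, x = st, y = s + t), which are assumptions made for illustration and not the functions from this section. The helper name central_diff is likewise made up here.

```python
import math

# Hypothetical example functions (assumptions, NOT the ones in this section).
def x(s, t):
    return s * t

def y(s, t):
    return s + t

def u(x_val, y_val):
    return x_val**2 * y_val

def du_ds_chain(s, t):
    # Chain rule: du/ds = (du/dx)(dx/ds) + (du/dy)(dy/ds)
    xv, yv = x(s, t), y(s, t)
    du_dx = 2 * xv * yv   # partial of x^2 * y with respect to x
    du_dy = xv**2         # partial of x^2 * y with respect to y
    dx_ds = t             # partial of s * t with respect to s
    dy_ds = 1.0           # partial of s + t with respect to s
    return du_dx * dx_ds + du_dy * dy_ds

def du_dt_chain(s, t):
    # Chain rule: du/dt = (du/dx)(dx/dt) + (du/dy)(dy/dt)
    xv, yv = x(s, t), y(s, t)
    du_dx = 2 * xv * yv
    du_dy = xv**2
    dx_dt = s             # partial of s * t with respect to t
    dy_dt = 1.0           # partial of s + t with respect to t
    return du_dx * dx_dt + du_dy * dy_dt

def central_diff(f, a, h=1e-6):
    # Symmetric finite-difference estimate of f'(a).
    return (f(a + h) - f(a - h)) / (2 * h)

s0, t0 = 1.5, 0.5
numeric_ds = central_diff(lambda s: u(x(s, t0), y(s, t0)), s0)
numeric_dt = central_diff(lambda t: u(x(s0, t), y(s0, t)), t0)

print("du/ds agrees:", math.isclose(du_ds_chain(s0, t0), numeric_ds, rel_tol=1e-6))
print("du/dt agrees:", math.isclose(du_dt_chain(s0, t0), numeric_dt, rel_tol=1e-6))
```

To check your own working, swap in the section's functions and the partial derivatives you found by hand; if the two values disagree, one of the four inner derivatives or the two outer ones is likely wrong.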