Automatic Differentiation

I discovered automatic differentiation a few weeks ago, and I can’t believe I had never heard of it before. Although I believe everybody who has more than a passing knowledge of algorithms (especially numerical algorithms!) should know about it, apparently very few do.

I will just give a very brief introduction here before pointing out a few good sources of information.

First of all, automatic differentiation (“autodiff”) is neither numerical differentiation nor symbolic differentiation, although it does calculate derivatives that are exact up to floating-point rounding!

In the formulation that I find most astonishing, autodiff uses object-oriented programming techniques and operator overloading to simultaneously and transparently augment every (differentiable) function calculation with an evaluation of the function’s first derivative.

More concretely, instead of calculating with a floating point variable x, we calculate with an object x that has two data components, the value and the derivative (x.value and x.deriv, say), and whose methods overload all of the mathematical functions and operators in the language. For example, when one calculates y = cos x with x as the independent variable (so x.deriv = 1), one is automatically calculating both $y = \cos x$ and $y' = -\sin x$, with the results stored in y.value and y.deriv! Operator overloading covers cases like x ^ 3, 3 ^ x, or even x ^ x using the standard rules of differentiation. Once one sees how that works, expressions like x + y, x * y, and x / y become easy as well. In this way, every calculation built from these operations can be handled.
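To make the idea concrete, here is a minimal sketch in Python. It is only an illustration, not the implementation from any of the sources mentioned in this post: the class name Dual, the helper lift, and the free function cos are all names I made up, and only a handful of operators are covered (subtraction, negation, and most math functions are left out).

```python
import math


class Dual:
    """A number that carries its value and its derivative w.r.t. one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(other):
        # Plain numbers are treated as constants, i.e. derivative 0.
        return other if isinstance(other, Dual) else Dual(float(other), 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: (f g)' = f' g + f g'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        # Quotient rule: (f / g)' = (f' g - f g') / g^2
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value ** 2)

    def __pow__(self, other):
        other = Dual.lift(other)
        value = self.value ** other.value
        if other.deriv == 0.0:
            # Constant exponent: (f^k)' = k f^(k-1) f'
            deriv = other.value * self.value ** (other.value - 1) * self.deriv
        else:
            # General case: (f^g)' = f^g (g f'/f + log(f) g'), needs f > 0
            deriv = value * (other.value * self.deriv / self.value
                             + math.log(self.value) * other.deriv)
        return Dual(value, deriv)

    def __rpow__(self, base):
        # Handles expressions like 3 ** x, where the base is a plain number.
        return Dual.lift(base) ** self


def cos(x):
    # Chain rule: (cos f)' = -sin(f) f'
    return Dual(math.cos(x.value), -math.sin(x.value) * x.deriv)


x = Dual(2.0, 1.0)  # independent variable: dx/dx = 1
for y in (cos(x), x ** 3, 3 ** x, x ** x, x / (x + 1)):
    print(y.value, y.deriv)
```

Seeding x.deriv = 1 marks x as the variable being differentiated with respect to; every result then carries its own derivative along in y.deriv without any extra work by the caller.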

It should be noted that this works for general numerical code (loops, branches, and subroutine calls included), not just calculations involving closed-form mathematical formulas, as long as each elementary step is differentiable. It can also be easily generalized to calculating arbitrary higher-order derivatives.
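As a small illustration of differentiating through a loop rather than a formula, the sketch below pushes a (value, derivative) pair through Newton's iteration for the square root. The names Dual and newton_sqrt are hypothetical, and the class defines only the three operators this loop needs.

```python
class Dual:
    """Minimal (value, derivative) pair; only the operators used below."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __truediv__(self, other):
        # Quotient rule: (f / g)' = (f' g - f g') / g^2
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value ** 2)

    def __rmul__(self, c):
        # Multiplication by a plain constant c, as in 0.5 * r.
        return Dual(c * self.value, c * self.deriv)


def newton_sqrt(a, steps=20):
    """Approximate sqrt(a) with Newton's iteration r <- (r + a / r) / 2."""
    r = Dual(1.0)  # starting guess: a constant, so its derivative is 0
    for _ in range(steps):
        r = 0.5 * (r + a / r)
    return r


a = Dual(2.0, 1.0)  # seed da/da = 1
root = newton_sqrt(a)
print(root.value, root.deriv)  # compare with sqrt(2) and 1 / (2 * sqrt(2))
```

Nothing in newton_sqrt knows about derivatives; the loop is ordinary numerical code, and the derivative of its result with respect to a simply rides along in root.deriv.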

By the way, the technique outlined above is “forward mode” autodiff; it is less obvious that there is also “reverse mode” autodiff. Forward mode is more efficient for functions of a single input variable, since it needs one pass per input; reverse mode is more efficient for functions with a single output value (i.e., real-valued as opposed to vector-valued functions), since it needs one backward pass per output. It turns out that reverse mode autodiff is a generalization of neural net back-propagation and was actually discovered before backprop!
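For contrast, here is a rough sketch of reverse mode in Python. It is a toy under my own assumptions, not how any particular library does it: the names Var, sin, and backward are made up, only + and * are overloaded, and both operands are assumed to be Var objects. Each operation records its inputs together with the local partial derivatives, and a single backward sweep from the output then fills in the derivative with respect to every input.

```python
import math


class Var:
    """A node in the computation graph: a value plus links to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (input Var, local partial derivative)
        self.grad = 0.0         # d(output)/d(this node), filled in by backward()

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    order, seen = [], set()

    def visit(node):
        # Depth-first post-order: inputs are listed before the nodes using them.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output towards the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)       # f(x, y) = x*y + sin(x)
backward(f)
print(x.grad, y.grad)    # compare with y + cos(x) and x
```

One call to backward yields the derivatives with respect to both x and y, which is why reverse mode pays off when there are many inputs and a single output.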

Justin Domke wrote a great blog post on automatic differentiation in 2009, and “Introduction to Automatic Differentiation and MATLAB Object-Oriented Programming” is a very accessible paper on actually implementing autodiff in MATLAB.

Finally, seems to be the home of all things autodiff on the web.


  1. Young

    Hi, I also discovered automatic differentiation today, and it was really interesting. I studied it for a while, but I couldn’t figure out how to handle the function y = x^x. Could you explain it, or point me to any sites that cover this function?

    1. hundalhh

      The derivative of x^x with respect to x is x^x *(log(x) + 1).
      The derivative of x^k with respect to x is k*x^(k-1).
      The derivative of k^x with respect to x is k^x *log(k).

      If f(x) and g(x) are functions of x with derivatives f'(x) and g'(x),
      then the derivative of f(x)^g(x) with respect to x is

      f(x)^g(x) * (g(x)*f'(x)/f(x) + log(f(x))*g'(x))

      The derivative of f(x)^k is k*f(x)^(k-1)*f'(x).
      The derivative of k^(f(x)) is k^f(x) *log(k)*f'(x).

      You might want to check out
