Three Ways to Take a Gradient: Tape, Trace, and Source Transform
Training is gradient descent, and gradient descent needs gradients, so virtually every deep learning framework has an automatic differentiation engine at its core. What is less obvious is that the representation a fram...