The purpose is purely educational. Students who jump straight to mid- or high-level libraries like TensorFlow, Keras, or Theano don't have to compute the gradients themselves. On the one hand, this saves a lot of time; on the other hand, it makes it very easy to abstract away the learning process.
Here's how Andrej Karpathy (who lectured earlier offerings of the CS231n class at Stanford) puts it:
When we offered CS231n (Deep Learning class) at Stanford, we intentionally designed the programming assignments to include explicit calculations involved in backpropagation on the lowest level. The students had to implement the forward and the backward pass of each layer in raw numpy. Inevitably, some students complained on the class message boards:
“Why do we have to write the backward pass when frameworks in the real world, such as TensorFlow, compute them for you automatically?”
...
The problem with Backpropagation is that it is a leaky abstraction.
In other words, it is easy to fall into the trap of abstracting away the learning process — believing that you can simply stack arbitrary layers together and backprop will “magically make them work” on your data.
I recommend reading the whole post; it's very interesting.
So you try to compute gradients manually. When you do, you find it's pretty hard to tell whether the code is right: it's just a raw formula that takes a bunch of floating-point numbers and returns another bunch of floating-point numbers. This is where an alternative, numerical method to compare against becomes very useful.
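For instance, a centered finite-difference estimate can be written in a few lines of numpy. This is just a sketch; the function name `numerical_gradient` and the step size `h` are my own choices for illustration, not part of any particular framework.

```python
import numpy as np

def numerical_gradient(f, x, h=1e-5):
    """Estimate the gradient of f at x with centered finite differences.

    f: a function taking a numpy array and returning a scalar
    x: a float numpy array, the point at which to estimate the gradient
    h: step size; too large loses accuracy, too small loses numerical precision
    """
    grad = np.zeros_like(x, dtype=float)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = x[idx]
        x[idx] = old + h               # evaluate f(x + h) along this coordinate
        f_plus = f(x)
        x[idx] = old - h               # evaluate f(x - h) along this coordinate
        f_minus = f(x)
        x[idx] = old                   # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
        it.iternext()
    return grad
```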
Of course, analytical formulas are faster and more precise, and they are used in practice whenever possible. But while studying neural networks and backpropagation, it's very useful to work through the manual computation at least once. Besides, it sometimes helps to find bugs.
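As a quick illustration of such a check (the variable names and the error threshold here are my own, just for the sketch): for f(x) = Σ xᵢ², the analytical gradient is 2x, and the numerical estimate from the `numerical_gradient` sketch above should agree with it to many decimal places.

```python
# Compare the analytical gradient of f(x) = sum(x**2) with the numerical estimate.
x = np.random.randn(3, 4)
f = lambda x: np.sum(x ** 2)

analytical = 2 * x                      # d/dx sum(x^2) = 2x
numerical = numerical_gradient(f, x)

# Relative error is the usual comparison; values around 1e-7 or smaller are a good sign.
rel_error = np.max(np.abs(analytical - numerical) /
                   np.maximum(np.abs(analytical) + np.abs(numerical), 1e-8))
print(rel_error)
```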