In this post we’ll improve our training algorithm from the previous post. When we’re done we’ll be able to achieve 98% precision on the MNIST data set, after just 9 epochs of training—which only takes about 30 seconds to run on my laptop. For comparison, last time we only achieved 92% precision after 2,000 epochs of training, which took over an hour! The main driver in this improvement is just switching from batch gradient descent to mini-batch gradient descent.... Read more
In this post we’re going to build a neural network from scratch. We’ll train it to recognize hand-written digits, using the famous MNIST data set. We’ll use just basic Python with NumPy to build our network (no high-level stuff like Keras or TensorFlow). We will dip into scikit-learn, but only to get the MNIST data and to assess our model once its built. We’ll start with the simplest possible “network”: a single node that recognizes just the digit 0.... Read more