Image Gallery

Click on an image to read more about it.

How it Works

The model that predicts digits is a convolutional neural network that is trained on the MNIST dataset, which is dataset of images of handwritten digits. The neural network was trained from scratch using the PyTorch Python library.

I converted the neural network to ONNX after training, which was small enough to deploy via AWS Lambda with reasonably fast inference time. The frontend is a Vite app, which compiles to a static website. I deployed this to Google Cloud Storage. As a result, the entire end-to-end application is hosted essentially for free.

This project was an valuable learning experience for me. I learned how to troubleshoot and problem solve in a data-heavy system. I could no longer rely on debuggers and code tracing because even if the code is correct the model was still underperforming. I had to add many skills to my toolbox, like visualizing data, setting up experiments to find root causes, and creatively searching for new solutions.

One of the main problems I had during development was the model being unable to handle transformations of the image. For example, it could guess the digit if it were directly in the middle of the image, but any slight translation and the model didn't know what to do. After much visualization and testing, I decided the best way forward was to aggressively train on transformed images from the start. So, in the training loop, random transformations would be applied without any gradual increase. This forced the model to learn actual features regardless of their scale or position, rather than memorizing the data distribution.

×