Is it or isn't it?
I decided to build and train a pothole image detector as part of following along with Fast AI's Practical Deep Learning course. I chose a pothole detector because at a previous company we had built something similar, training it entirely from scratch ourselves. Potholes are also something I come across very regularly on the British roads I cycle around. Having come off my bicycle once in the last year, I am not in a hurry to repeat the experience.
The thing that immediately struck me was how quickly the model trained and how easy it was to do. This is mainly because we are fine-tuning an existing image model rather than training one from nothing: the base model already knows a lot about images before we start. The project where we trained the pothole detector from scratch took at least five weeks to get to something reliable, whereas I have trained something that seems to work in about five minutes. It's not quite a fair comparison, as the original project had a wider scope: it also needed to slice videos into images and extract metadata from them.
Fast AI is designed to make the development and training of models quick and easy. It wraps a lot of the complexities of the underlying libraries, and provides utilities that allow you to download images from an internet search. There are data types that are designed to provide categories and labels, test and training data for the model you are training or tuning. I am sure that there is much more in the Fast AI library that I haven't got to yet.
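To make the idea of categories, labels and training versus held-back data a bit more concrete, here is a minimal sketch in plain Python. It is not Fast AI's API, just an illustration of the two jobs those data types take care of: deriving a label from the folder an image sits in, and setting aside a random portion of the images for checking the model. The folder names, file names, the 20% split and the helper names are all assumptions made up for the example.

```python
import random
from pathlib import PurePosixPath


def label_from_parent(path):
    """Use the name of the folder containing the image as its label."""
    return PurePosixPath(path).parent.name


def split_train_valid(paths, valid_fraction=0.2, seed=42):
    """Shuffle reproducibly, then hold back a fraction for validation."""
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    # The first n_valid items are held back, the rest are used for training.
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical file layout: one folder per category (names are assumptions).
image_paths = [f"images/pothole/{i:03d}.jpg" for i in range(10)]
image_paths += [f"images/road/{i:03d}.jpg" for i in range(10)]

train, valid = split_train_valid(image_paths)
labels = sorted({label_from_parent(p) for p in image_paths})

print("categories:", labels)
print("training images:", len(train), "validation images:", len(valid))
```

Fixing the seed means the same images are held back each time, which makes it easier to compare one training run against another.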
The other big difference from the earlier project, which was several years ago, is that we are now able to use free compute and deployment resources to evaluate and tune the model. The earlier project used the boss’s video gaming machine because it had some GPUs in it, and he went through Call of Duty withdrawal whilst the training was done.
The specific tools we used were Jupyter Notebooks hosted by Kaggle and a frontend generated by Gradio and hosted at Hugging Face. The code basically came from chapters 1 and 2 of the Fast AI course. The end result of that is a public demonstration that you can see here:
So how did it do? I have only tried it out on one or two images downloaded from the internet and it performs well on those, but they might well be in the training set. I have tried it on one photograph of a pothole that I took and it got that wrong, possibly because it was more of a pot crack than a pothole. I'm planning to take more photographs and carry out more of a formal evaluation of how the model is doing, and maybe tune it again on images that are more relevant to the roads that I cycle along. Fast AI seems to provide some tools and utilities to do this, so look out for another exciting instalment.
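As a rough idea of what that more formal evaluation could look like, here is a small sketch in plain Python that scores a list of (actual, predicted) pairs. It does not use Fast AI's own tooling. The label names and the sample pairs are placeholders invented purely to show the calculation; they are not results from my model.

```python
from collections import Counter


def evaluate(results, positive="pothole"):
    """Score (actual, predicted) label pairs for one category of interest."""
    counts = Counter()
    for actual, predicted in results:
        if actual == positive:
            counts["tp" if predicted == positive else "fn"] += 1
        else:
            counts["fp" if predicted == positive else "tn"] += 1

    total = sum(counts.values())
    predicted_positive = counts["tp"] + counts["fp"]
    actual_positive = counts["tp"] + counts["fn"]
    return {
        # Share of all photos the model labelled correctly.
        "accuracy": (counts["tp"] + counts["tn"]) / total if total else 0.0,
        # Of the photos the model called a pothole, how many really were.
        "precision": counts["tp"] / predicted_positive if predicted_positive else 0.0,
        # Of the real potholes, how many the model spotted.
        "recall": counts["tp"] / actual_positive if actual_positive else 0.0,
    }


# Placeholder pairs for illustration only, NOT real results.
sample = [
    ("pothole", "pothole"),
    ("pothole", "road"),
    ("road", "road"),
    ("road", "pothole"),
    ("pothole", "pothole"),
]

scores = evaluate(sample)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

For a cyclist, recall is probably the number to watch: a missed pothole is worse than a false alarm.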