This project provides a solution for detecting and recognizing license plates in Kenya, motivated by a traffic monitoring system that cannot automatically detect and recognize license plates.

Presently, whenever a vehicle is flagged on the road, whether for violating traffic rules, a hit-and-run accident, or an emergency response, a police officer has to write the plate number down manually, then type it back into the system to retrieve owner and registration information or to relay it to colleagues.

My solution seeks to automate the detection of license plates in vehicle images captured by the system, segment and recognize the plate characters, and produce a textual output that can be processed further digitally.

An automatic license plate detection and recognition solution reduces human error and makes information retrieval more efficient.

Automatic license plate detection and recognition relies heavily on computer vision, a field of artificial intelligence that trains machines to mimic human vision, for example, teaching a machine to estimate a person's age or to identify different fruits in an image.


Computer vision tasks are well suited to deep neural networks, because typical machine learning models cannot produce accurate computer vision algorithms without long, expensive computation and a lot of resources. I shall therefore explore a deep learning approach: train and test a convolutional neural network to recognize the characters on a license plate.

A Brief Introduction to Deep Learning:

Deep learning and deep neural networks create powerful computer vision algorithms because of the design of their neural network architectures. Deep learning algorithms automatically extract complex linear and non-linear feature relationships to discover good representations at multiple levels of abstraction.

The key aspect of deep learning is that the layers of features are not human-designed but learned from data using a general-purpose learning procedure by the neural network.

On to the project:

This article documents the implementation of the deep learning algorithms and metrics used, and the collection and verification of results, for extracting license plates, segmenting their characters, and training a convolutional neural network; together these build an automatic license plate recognition system.

I will use the pretrained Wpod-Net model and computer vision with OpenCV.

There are 3 parts to this project:

Part 1: Implementation of the pre-trained model, Wpod-Net, to detect and extract license plates of Kenyan vehicle images.

Part 2: Character segmentation with OpenCV

Part 3: Train a Neural Network to predict the segmented characters obtained from the extracted license plate

Here are the major tools and libraries I used in Jupyter Notebook:

· Python

· NumPy

· Matplotlib

· TensorFlow

· Keras

· OpenCV

Also, here is a brief description of the functionality of these and a few supporting libraries:

· os.path / glob: An operating system interface package for Python, used to work with directories and the file system.

· local_utils: A Python script that contains several functions which will be used to process data from Wpod-Net.

· NumPy: A library that supports operations on multi-dimensional arrays and matrices, used for processing in neural networks and in OpenCV.

· Matplotlib: A library that enables plotting and visualization of data and images.

· Keras: An open source neural network library working with TensorFlow.

I will use the model_from_json function from the Keras library to load the model architecture from JSON format.

· cv2: OpenCV is a Computer Vision library that will be used to perform image processing techniques.

Part 1: Preprocessing and Extraction of Images

In this first part, I’ll preprocess vehicle images and obtain the license plates from the images using the pretrained Wpod-Net model, created by Sergio Montazzolli Silva and Claudio Rosito Jung.

Wpod-Net is a powerful Keras-based computer vision model that can reliably detect and extract license plates. It performs best on standard, clearly visible plates; if the image is blurry or the plate is obscured by obstacles, the model may fail to detect and recognize it accurately.

To load this model, I’ll create a helper function called load_model that opens and reads the pretrained Wpod-Net model, which is stored in JSON form, and also loads its weights.

Loading the Wpod-Net model.
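Such a loader might look like the following sketch; the side-by-side .json and .weights.h5 file layout is an assumption based on how the model is distributed, and the path below is illustrative:

```python
from tensorflow.keras.models import model_from_json

def load_model(path):
    """Load a Keras model whose architecture is stored as JSON and
    whose weights are stored in a matching HDF5 file next to it."""
    with open(path + ".json", "r") as f:
        model = model_from_json(f.read())
    model.load_weights(path + ".weights.h5")
    return model

# wpod_net = load_model("wpod-net")  # illustrative path to the model files
```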

Once the model has been loaded successfully, I will preprocess the images using a helper function called preprocess_image, which takes an image path and a Boolean resize option as arguments. This function uses OpenCV to read the image, converts the color from BGR to RGB, and normalizes the pixel values to the range 0 to 1. If resize is True, the function resizes the image to 224×224 pixels. The preprocessed images are returned for visualization.

I will visualize 9 images from my dataset of Kenyan vehicle images scraped from the internet.

Preprocessing of vehicle images.
Kenyan license plates

A function called get_plate is used to forward an image through the model and return the plate’s image and coordinates. If no plate is detected, the message “No license plate has been found!” is output.

Here is the output of the get_plate function on an image:

Detected Kenyan license plate image_1

Summary of Part 1: I have extracted the license plate and its coordinates from a vehicle image using the pretrained Wpod-Net model.

Part 2: Character Segmentation

Here I will segment the characters from the license plate extracted in Part 1.

Implementation of some image processing on the extracted license plate:

1. Converting it to a standard 8-bit image to reduce the computational power required for processing.

2. Converting it to a grayscale image since color is insignificant in character segmentation.

3. Blur the image to remove noise and unnecessary information.

4. Perform inverse binary thresholding, so that any pixel value below the threshold is converted to 255, and any pixel value above the threshold is converted to 0.

5. Implementing dilation to thicken the white contours of each character that were produced by the inverse binary thresholding.

License plate preprocessing.
Images of the preprocessed license plate

To determine the contours of the license plate characters, I will use OpenCV’s function called findContours. A contour is a simple curve that joins all continuous points along a boundary that share the same color and intensity. At this point, I appreciate why I converted the image to grayscale, and performed image thresholding and dilation.

The sort_contours helper function, as the name suggests, will sort contours found from left to right, for legibility. Height and width filters are then created to select only contours with the given set of height and width limits.

Select and sort contours of each character from left to right.

Finally, I’ll draw boundaries using OpenCV on all contours that pass these filters, apply binary thresholding on each determined contour and append them to a list. This function will output the length of the list as the number of characters segmented. I will also output the image of the segmented characters in the list.

Detected 7 characters.
Segmented characters in binary form.

Summary of Part 2: I have segmented the characters of the license plate and drawn bounding boxes on them, and displayed the segmented characters in a grid.

Part 3: Character Recognition by training a Convolutional Neural Network

Here is a visualization of the characters dataset, which contains 37623 images in 36 classes. The dataset is large because each class includes many font-type variations of an individual character. The classes represent the 26 alphabetic characters and 10 digits, and they will be used to train the convolutional neural network to predict the segmented license plate characters.

A sample of dataset characters for training and validation.

I will perform several data preprocessing techniques;

1. Reduce the image size to 80×80 to allow for a higher accuracy.

2. Perform label encoding on the class labels so that the model can work with them numerically and every class is represented uniformly.

3. Split the dataset into training and validation sets to detect and avoid overfitting.

4. Perform image augmentation using small, basic transformations, e.g. rotating, shifting height and width, zooming, and stretching.

Pay careful attention during image augmentation to avoid vertical or horizontal flips, which would turn a number such as “6” into a “9”. The model would be trained on wrong labels!
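Steps 2 and 3 above can be sketched with scikit-learn; the images and labels below are synthetic stand-ins for the real character dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-ins for the 80x80 character images and their labels.
X = np.random.rand(120, 80, 80, 3).astype("float32")
y = np.repeat(list("ABC123"), 20)

# Encode the string labels as integers, then one-hot encode them.
encoder = LabelEncoder()
y_int = encoder.fit_transform(y)
y_onehot = np.eye(len(encoder.classes_))[y_int]

# Hold out 10% for validation, stratified so every class appears in both splits.
X_train, X_val, y_train, y_val = train_test_split(
    X, y_onehot, test_size=0.10, random_state=42, stratify=y_int)
```

The fitted encoder is kept around so that its inverse_transform can map one-hot predictions back to characters in Part 3.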

Then I will construct the MobileNet architecture with weights pre-trained on the ImageNet dataset. The model is imported directly from the Keras library.

· I have changed the last output layer to have the size of the encoded labels columns, which is 36. This layer will output the predicted probability of each of the 36 classes through the softmax activation function.

· Then I initialize the learning rate and decay value, and compile the model with categorical cross-entropy as the loss function and accuracy as the metric.
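A sketch of this architecture follows; note that I pass weights=None here so the example is self-contained (the actual project loads ImageNet weights, which triggers a download), and a fixed learning rate stands in for the decay schedule:

```python
import tensorflow as tf

NUM_CLASSES = 36  # 26 letters + 10 digits

# Base network; weights=None keeps this sketch self-contained.
base = tf.keras.applications.MobileNet(
    input_shape=(80, 80, 3), include_top=False, weights=None)

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    # Replacement output layer: one probability per character class.
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"])
```

The softmax output gives one probability per class, so the predicted character is simply the argmax over the 36 outputs.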

Model to predict the segmented characters.

I will train and evaluate the model with these hyperparameters:

· I selected a large batch size (300) to reduce my computational power expense, but at the risk of reducing the performance of my model. This is a trade-off one must contend with during training.

· I used 20 epochs to extend training and counter the effect of the large batch size, and hence improve my model’s performance.

· I will implement some callback functions to monitor the metrics:

o EarlyStopping will stop the training process if validation loss does not improve after 5 epochs, while ModelCheckpoint will save weights of the model every time the validation loss is improved.

o The model achieved 97% accuracy on the validation set at the end of the training process.
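The two callbacks described above can be sketched as follows; the checkpoint filename is illustrative:

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Stop if validation loss has not improved for 5 epochs.
    EarlyStopping(monitor="val_loss", patience=5),
    # Save the weights whenever validation loss improves.
    ModelCheckpoint("character_model.weights.h5",  # illustrative filename
                    monitor="val_loss",
                    save_best_only=True,
                    save_weights_only=True),
]
# model.fit(..., epochs=20, batch_size=300, callbacks=callbacks)
```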

Training the model.
97% accuracy achieved on the last epoch.

Here is a visualization of the training and validation loss and accuracy, which also shows no overfitting:

High accuracy and minimum loss for the validation curves.

I will save the model architecture and the weights achieved during training, to avoid having to build it up from scratch again.

Saving the model in JSON format.
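A sketch of the saving step, mirroring the JSON-plus-weights layout used for Wpod-Net; the helper name and path are illustrative:

```python
from tensorflow.keras.models import Model

def save_model(model: Model, path: str) -> None:
    """Save the architecture as JSON and the trained weights
    separately, side by side on disk."""
    with open(path + ".json", "w") as f:
        f.write(model.to_json())
    model.save_weights(path + ".weights.h5")

# save_model(model, "character_model")  # illustrative path
```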

To obtain a prediction, I will first load the saved model architecture, the saved weights after training and the original label classes (not the encoded labels).

Trained model, character recognition weights and classes loaded.

I will then use the loaded label classes to inverse-transform the one-hot encoded predictions from the model back into characters, loop over each segmented character image in the list crop_characters, and display each segmented character in a grid with its corresponding predicted label as the title.

100% accurate prediction.
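The prediction loop can be sketched like this; predict_characters is a hypothetical helper name, and it assumes the original classes live in a fitted scikit-learn LabelEncoder:

```python
import numpy as np

def predict_characters(model, encoder, crop_characters):
    """Predict each segmented character and map the one-hot output
    back to a letter or digit via the fitted label encoder."""
    result = ""
    for char_img in crop_characters:
        # The model expects a batch, so add a leading batch dimension.
        batch = char_img[np.newaxis, ...]
        probs = model.predict(batch, verbose=0)
        result += encoder.inverse_transform([np.argmax(probs)])[0]
    return result
```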

Here are other successful predictions:

Kenyan license plate image_2
image_2 100% recognition
Kenyan license plate image_3
image_3 100% recognition


This model clearly has room for improvement.

Kenyan license plate image_4
An inaccurate recognition.

It can be seen that the model has either under-trained on “4”s or over-trained on “A”s.

One way to improve it would be to perform class balancing on the characters dataset before training, so that the model doesn’t train more on some characters and less on others.

Any other improvement suggestions are welcome in the comments section. Thank you!

In conclusion, this project has achieved its objective of automatically detecting and recognizing Kenyan license plates.

What’s next?

Further applications of this project could automatically trigger events, e.g. message or email notifications, alarms, incrementing statistical records, or updating payment records, when particular vehicle license plates are recognized against a database.

More applications? Great! Let’s hear them out in the comments.

Resources and References

  1. Wpod-Net Paper
  2. Deep Learning
  3. Convolution Neural Network



Vanessa Alumasa

I am a constant learner in Data Science, Machine Learning and Artificial Intelligence. I enjoy logical and strategically-charged conversations.