Arabic Handwritten Digits Dataset
Abstract
In recent years, handwritten digit recognition has become an important research area because of its applications in several fields. This work focuses on the recognition stage of Arabic handwritten digit recognition, which faces several challenges, including the unlimited variation in human handwriting and large public databases. The paper presents a deep learning technique that can be applied effectively to recognizing Arabic handwritten digits. LeNet-5, a Convolutional Neural Network (CNN), is trained and tested on the MADBase database (Arabic handwritten digit images), which contains 60,000 training and 10,000 testing images. The results are compared, and the comparison shows that the CNN gives significant improvements over other machine-learning classification algorithms. Moreover, the CNN achieves an average recognition accuracy of 99.15%.
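LeNet-5 is named above, but its layers are not described on this page. As a rough sketch only, the feature-map sizes of the classic LeNet-5 layout (5x5 convolutions followed by 2x2 subsampling) can be traced as shown here. The 32x32 input size, the layer names C1/S2/C3/S4, and the helper names `conv_out` and `lenet5_shapes` are assumptions taken from the standard LeNet-5 description or introduced for illustration; the network actually trained on MADBase may differ.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet5_shapes(input_size=32):
    """Trace (layer, channels, width) for the classic LeNet-5 feature layers.

    Assumption: 5x5 convolutions without padding and 2x2 subsampling with
    stride 2, as in the standard LeNet-5 description.
    """
    shapes = []
    size = conv_out(input_size, kernel=5)      # C1: 6 feature maps
    shapes.append(("C1", 6, size))
    size = conv_out(size, kernel=2, stride=2)  # S2: subsampling
    shapes.append(("S2", 6, size))
    size = conv_out(size, kernel=5)            # C3: 16 feature maps
    shapes.append(("C3", 16, size))
    size = conv_out(size, kernel=2, stride=2)  # S4: subsampling
    shapes.append(("S4", 16, size))
    return shapes


shapes = lenet5_shapes(32)
# Number of values fed to the fully connected layers after S4.
flattened = shapes[-1][1] * shapes[-1][2] ** 2

for name, channels, width in shapes:
    print(f"{name}: {channels} maps of {width}x{width}")
print("flattened features:", flattened)
```

In the standard layout, the flattened features then pass through fully connected layers ending in 10 outputs, one per digit class.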
Context
The motivation of this study is to use cross knowledge learned from multiple works to enhance the performance of Arabic handwritten digit recognition. In recent years, Arabic handwritten digit recognition has also had to cope with different handwriting styles, making it important to find and work on new and advanced solutions for handwriting recognition. A deep learning system needs a large amount of data (images) to be able to make good decisions.
Content
MADBase is a modified Arabic handwritten digits database that contains 60,000 training images and 10,000 test images. MADBase was written by 700 writers. Each writer wrote each digit (from 0 to 9) ten times. To ensure that different writing styles are included, the database was gathered from different institutions: Colleges of Engineering and Law, School of Medicine, the Open University (whose students span a wide range of ages), a high school, and a governmental institution.
MADBase is available for free and can be downloaded from (http://datacenter.aucegypt.edu/shazeem/).
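The counts given in this section are consistent with each other: 700 writers times 10 digits times 10 repetitions gives 70,000 images, which matches the 60,000 training plus 10,000 test images. A minimal check, using constant names chosen here only for illustration:

```python
# Figures stated in the dataset description.
WRITERS = 700
DIGITS = 10          # digits 0 to 9
REPETITIONS = 10     # each writer wrote each digit ten times
TRAIN_IMAGES = 60_000
TEST_IMAGES = 10_000

total_images = WRITERS * DIGITS * REPETITIONS
images_per_digit = WRITERS * REPETITIONS

# The collected samples should account for the whole train/test split.
assert total_images == TRAIN_IMAGES + TEST_IMAGES

print("total images:", total_images)
print("images per digit (train and test combined):", images_per_digit)
```

How the 7,000 images of each digit are divided between the training and test sets is not stated here, so the sketch does not assume a per-digit split.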
Acknowledgements
CNN for Handwritten Arabic Digits Recognition Based on LeNet-5
http://link.springer.com/chapter/10.1007/978-3-319-48308-5_54
Ahmed El-Sawy, Hazem El-Bakry, Mohamed Loey
Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016
Volume 533 of the series Advances in Intelligent Systems and Computing pp 566-575
Inspiration
Creating the proposed database presents additional challenges because it involves many issues, such as writing style, stroke thickness, and the number and position of dots. Some characters have different shapes even when written in the same position. For example, the teh character has different shapes in the isolated position.
Arabic Handwritten Characters Dataset
https://www.kaggle.com/mloey1/ahcd1
Benha University