OCR with a neural network: data extraction
- by Sebastian Hoitz
I'm using the AForge.NET framework and its neural network.
To build my training set, I currently render lots of images (one image per letter per font) at a large size (30 pt), crop out the actual letter, scale it down to a smaller size (10x10 px), and save it to my hard disk. I then read all those images back and create my double[] input arrays from them. At the moment I do this on a per-pixel basis.
Once the network is trained, I test it by running it on a sample image containing the alphabet at different sizes (uppercase and lowercase).
But the results are not really promising. I trained the network until RunEpoch reported an error of about 1.5 (so almost no error), but some letters in my test image are still not identified correctly.
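The pixel-based preprocessing described above can be sketched roughly as follows. This is an illustration in plain Python, not the actual C#/AForge code: the helper names (`crop_to_ink`, `scale_nearest`, `to_input_vector`) are hypothetical, the image is assumed to be a binarized 2D list (1 = ink, 0 = background), and nearest-neighbour sampling is an assumption, since the scaling method is not stated.

```python
def crop_to_ink(img):
    """Return the smallest sub-image that contains every ink pixel (value 1)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    if not rows:
        raise ValueError("image contains no ink pixels")
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def scale_nearest(img, size=10):
    """Resample to size x size with nearest-neighbour sampling (assumed method)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def to_input_vector(img, size=10):
    """Crop, scale, and flatten row by row: one float per pixel."""
    scaled = scale_nearest(crop_to_ink(img), size)
    return [float(pixel) for row in scaled for pixel in row]


# Tiny example: an "L"-like shape surrounded by background.
letter = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
vector = to_input_vector(letter)  # 100 values for a 10x10 grid
```

Whatever the real implementation looks like, the same crop-and-scale steps have to be applied to the letters segmented from the test image, otherwise the network sees inputs that are framed differently from the training data.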
Now my question is: is this caused by a faulty learning method (pixel-based input vs. the receptors suggested in this article: http://www.codeproject.com/KB/cs/neural_network_ocr.aspx), or could it happen because my segmentation algorithm, which extracts the letters from the image to look at, is bad? Are there other methods I can use to extract the data for the network?
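For comparison, the receptor idea can be sketched like this. As I understand the linked article, a receptor is a line segment laid over the letter image that counts as "on" when it crosses the letter, and the set of receptor states is the network input. The code below is only a sketch under that assumption: the function names are hypothetical, the receptors are placed at random with a fixed seed, and the crossing test simply samples points along the segment.

```python
import random


def receptor_crosses_ink(img, receptor, steps=20):
    """True if any sampled point on the segment lands on an ink pixel (value 1)."""
    x1, y1, x2, y2 = receptor
    for i in range(steps + 1):
        t = i / steps
        x = round(x1 + t * (x2 - x1))
        y = round(y1 + t * (y2 - y1))
        if img[y][x]:
            return True
    return False


def random_receptors(count, width, height, seed=0):
    """Generate `count` random segments (x1, y1, x2, y2) inside the image bounds."""
    rng = random.Random(seed)
    return [(rng.randrange(width), rng.randrange(height),
             rng.randrange(width), rng.randrange(height))
            for _ in range(count)]


def receptor_features(img, receptors):
    """One input value per receptor: 1.0 if it crosses ink, else 0.0."""
    return [1.0 if receptor_crosses_ink(img, r) else 0.0 for r in receptors]


# A vertical bar in the middle column of a 5x5 image.
bar = [[0, 0, 1, 0, 0] for _ in range(5)]
features = receptor_features(bar, [(0, 2, 4, 2), (0, 0, 0, 4)])
```

The same receptor set has to be used for training and for recognition, and the input size then depends on the number of receptors rather than on the pixel count.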
Does anyone have ideas on how to improve it?