A reader recently e-mailed to ask whether OCR software can interpret handwriting. OCR software is really meant to work with printed text, not handwritten text, so I didn’t expect it to work very well. It sounded like a fun experiment though, so I gave it a shot.
Quick OCR Intro
OCR stands for Optical Character Recognition. OCR software is able to recognize printed text in a document and convert it into a format that a computer can understand. Using OCR software, you can scan a printed document and then edit it in a word processor like Microsoft Word without having to re-type the entire contents of the original text.
I conducted the following tests using this web interface to Tesseract OCR. Tesseract has been around for a while and was long regarded as one of the most accurate OCR programs available.
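The tests in this post went through a web interface, but the same kind of experiment can be tried locally. The following is only a rough sketch, assuming the `tesseract` command-line program (version 4 or later) is installed and on your PATH. The helper names `build_tesseract_command` and `ocr_image` are made up for this illustration and are not part of Tesseract.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng", psm=None):
    """Build the argument list for a Tesseract run that prints text to stdout."""
    # Passing "stdout" as the output base makes Tesseract print the text
    # instead of writing it to a .txt file.
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # --psm selects the page segmentation mode (7 = a single line of text).
        cmd += ["--psm", str(psm)]
    return cmd


def ocr_image(image_path, lang="eng", psm=None):
    """Run Tesseract on an image file and return whatever text it recognizes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With a screenshot saved under a hypothetical name such as `control.png`, calling `ocr_image("control.png", psm=7)` would return the recognized text as a string. A local install may not give exactly the same results as the web interface.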
The Control
In order to make sure that the web version of Tesseract OCR was working properly, I took a screenshot of some text from Microsoft Word and converted it. Here is the input:
Output: “The quick brown fox jumped over the lazy dog.”
Looks like everything is working fine.
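Eyeballing the output is enough for a quick experiment, but a result can also be given a rough score. The sketch below uses Python's standard `difflib` module to compare the text you expected with the text the OCR program returned. The `similarity` helper is a name invented here, and the `expected` string is an assumed ground truth rather than something taken from the test image.

```python
from difflib import SequenceMatcher


def similarity(expected, recognized):
    """Return a ratio from 0.0 (nothing in common) to 1.0 (identical text)."""
    return SequenceMatcher(None, expected, recognized).ratio()


# Assumed ground truth: the sentence the OCR program should have produced.
expected = "The quick brown fox jumped over the lazy dog."

print(similarity(expected, expected))  # a perfect read scores 1.0
print(similarity(expected, ""))        # an empty result scores 0.0
```

A garbled result lands somewhere between the two extremes, which makes it easier to compare one handwriting sample against another.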
Printing Test
First, I tested hand-printed (non-cursive) text. Here is the input:
Output: “{M12 at/mgk brown wO0>
Not so good.
Cursive Test
Since the software was unable to read my printing, I did not have much hope for my cursive. I tested two different images: a sentence formed as perfectly as I was able to manage (which took 3 attempts!), and a sentence written in my normal handwriting. Here is (my best attempt at) proper cursive handwriting:
And my normal handwriting:
Output: absolutely nothing. Tesseract OCR wasn’t able to get anything from either sample of cursive handwriting.




