[Mageia-discuss] Reading payment forms with a scanner

Juergen Harms juergen.harms at unige.ch
Thu Feb 14 09:21:36 CET 2013


I just had a nasty experience with an ebanking payment that got rejected
because of a typo (and nobody sent me a note about the rejection).

Does anybody have experience/advice on using a normal scanner for 
reading the essential fields of payment forms to make ebanking more 
efficient and less error-prone?

I just did some googling and quick checks along the following lines:
- tile the payment forms on the scanner (I have an Epson 1260), so that
only the reading zone at the bottom of each form is visible,
- scan with xsane (selecting adequate settings - different from those I
ordinarily use)
- if necessary, use gimp to cut away zones with garbage that upsets the
OCR conversion (i.e. tesseract; this can be avoided by properly setting
the reading area in xsane)
- use tesseract to do OCR
- filter the output to throw away garbage lines, and to correct 
characters that frequently get mis-interpreted (e.g. B->8, Z->2, O->0, 
D->0 etc.)
- output the data thus produced, formatted for copy-paste into the 
ebanking form
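
The filtering step could be sketched roughly as follows. This is only a
minimal illustration, not a tested tool: the substitution table covers
just the confusions listed above, the function name clean_ocr_lines and
the thresholds min_digits and min_ratio are made up for the example, and
it assumes that the reading zone contains essentially digits plus a few
separator characters.

```python
# Substitutions for characters that OCR tends to confuse with digits.
# Assumption: the reading zone is numeric, so a letter is always a misread.
DIGIT_FIXES = str.maketrans({"B": "8", "Z": "2", "O": "0", "D": "0"})


def clean_ocr_lines(text, min_digits=10, min_ratio=0.8):
    """Return the plausible code lines found in raw OCR output.

    min_digits and min_ratio are illustrative thresholds (assumptions):
    a line is kept only if, after correction, it has at least min_digits
    digits and digits make up at least min_ratio of its characters.
    """
    cleaned = []
    for line in text.splitlines():
        fixed = line.strip().translate(DIGIT_FIXES)
        if not fixed:
            continue  # skip empty lines
        digit_count = sum(ch.isdigit() for ch in fixed)
        if digit_count < min_digits:
            continue  # too few digits: garbage
        if digit_count / len(fixed) < min_ratio:
            continue  # mostly non-digits: garbage
        cleaned.append(fixed)
    return cleaned


if __name__ == "__main__":
    import sys

    # Usage: tesseract scan.png stdout | python clean_ocr.py
    for code_line in clean_ocr_lines(sys.stdin.read()):
        print(code_line)
```

The thresholds would need tuning against real scans. Blindly mapping
letters to digits is only safe as long as the scanned zone never contains
legitimate letters.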

That works surprisingly well, but is excessively complicated to handle
(easy to make handling mistakes, not fit to hand over to my wife). Are
there tools that help automate these steps and integrate them into a
single tool? If not, it should not be too difficult to do some
scripting (but I don't want to re-invent things). (And yes, some years
ago I tried one of those small reading sticks that you slide over the
form. I ditched it: it only works on Windows and produces an excessive
number of errors.)
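
As a starting point for such scripting, the OCR call could be wrapped
in a few lines of Python. This is a sketch under assumptions: the image
is assumed to have been saved from xsane already, the function names are
hypothetical, and the only tesseract feature relied upon is that giving
"stdout" as the output base prints the recognized text to standard
output.

```python
import subprocess


def tesseract_command(image_path):
    """Build the command line for one OCR run.

    Giving "stdout" as the output base makes tesseract print the
    recognized text instead of writing a .txt file.
    """
    return ["tesseract", str(image_path), "stdout"]


def ocr_image(image_path, run=subprocess.run):
    """Run tesseract on a saved scan and return the recognized text.

    The run parameter exists only so that the call can be replaced
    in tests; by default the real tesseract binary is executed.
    """
    result = run(
        tesseract_command(image_path),
        capture_output=True,  # collect stdout/stderr instead of printing
        text=True,            # decode the output to str
        check=True,           # raise if tesseract exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    # Usage: python ocr_form.py scan.png
    print(ocr_image(sys.argv[1]), end="")
```

Scanning itself and the clean-up of the recognized text would still have
to be added around this to get a single tool.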

Juergen

