Difference between revisions of "OCR"
(New page: While researching your family tree, you will find textual publications or administrative documents. You may avoid long and annoying work into GRAMPS by using optical character recognition ...)
Line 6:
==How does this work?==
+
+ * The picture needs good contrast.
+ * OCR programs read the picture and, using libraries of character shapes, match each detected shape to the character it most likely represents.
+ * Dictionaries are used to minimize errors: the recognized result is compared against existing words.
+ * Some programs can also recognize bold, italic, or custom font sizes.
+
+ ==Using with GRAMPS==
+
+ There are not many open-source OCR programs.
+ Intelligent Word Recognition (IWR) and Intelligent Character Recognition (ICR), which handle handwritten certificates, are more advanced techniques. They are used in the financial and historical sectors. Some programs may be used as third-party tools alongside GRAMPS.
+ * [http://code.google.com/p/tesseract-ocr/ Tesseract] may be a good solution for English-language documents, but it currently recognizes only US-ASCII characters ...
+ * [http://www.geocities.com/claraocr/ claraocr] seems to be able to learn, but I could not find any documentation.
+ * [http://jocr.sourceforge.net/ GOCR/JOCR] may generate a custom character database with:
Revision as of 09:18, 5 April 2007
While researching your family tree, you will find textual publications or administrative documents. You can avoid long, tedious retyping into GRAMPS by using optical character recognition (OCR).
This page shows how to turn a picture of a document into text.
How does this work?
- The picture needs good contrast.
- OCR programs read the picture and, using libraries of character shapes, match each detected shape to the character it most likely represents.
- Dictionaries are used to minimize errors: the recognized result is compared against existing words.
- Some programs can also recognize bold, italic, or custom font sizes.
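The first three steps listed above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular OCR program is implemented: the 3x3 shape library `SHAPES`, the threshold of 128, and the function names are all assumptions made for the example. Only the standard `difflib` module is used for the dictionary step.

```python
import difflib

# Hypothetical shape library: each character is 9 pixels (3x3), 1 = ink.
SHAPES = {
    "I": (0, 1, 0,
          0, 1, 0,
          0, 1, 0),
    "L": (1, 0, 0,
          1, 0, 0,
          1, 1, 1),
    "T": (1, 1, 1,
          0, 1, 0,
          0, 1, 0),
    "O": (1, 1, 1,
          1, 0, 1,
          1, 1, 1),
}


def binarize(gray_pixels, threshold=128):
    """Step 1 (contrast): map grayscale values (0 = black, 255 = white)
    to 1 for ink and 0 for background."""
    return tuple(1 if value < threshold else 0 for value in gray_pixels)


def match_shape(glyph):
    """Step 2 (shape library): return the character whose stored shape
    differs from the glyph in the fewest pixels."""
    def distance(char):
        return sum(a != b for a, b in zip(SHAPES[char], glyph))
    return min(SHAPES, key=distance)


def correct_with_dictionary(word, dictionary):
    """Step 3 (dictionary): replace the word with its closest dictionary
    entry, or keep it unchanged if nothing is similar enough."""
    candidates = difflib.get_close_matches(word, dictionary, n=1, cutoff=0.6)
    return candidates[0] if candidates else word


# A "T" scanned with one stray dark pixel in the bottom-right corner.
scanned = [30, 40, 20, 200, 35, 220, 240, 90, 100]
print(match_shape(binarize(scanned)))
# A misread word ("1" instead of "i") checked against a small word list.
print(correct_with_dictionary("b1rth", ["birth", "death", "marriage"]))
```

Real programs work on much larger character images and richer shape libraries, but the order of operations is the same: clean up the picture, match shapes, then check the result against known words.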
Using with GRAMPS
There are not many open-source OCR programs. Intelligent Word Recognition (IWR) and Intelligent Character Recognition (ICR), which handle handwritten certificates, are more advanced techniques. They are used in the financial and historical sectors. Some programs may be used as third-party tools alongside GRAMPS.
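As an illustration of running an OCR program as a third-party tool, the sketch below calls the Tesseract command-line program from Python and reads back the text it produces. It assumes the `tesseract` executable is installed and on the PATH, and relies on its basic invocation `tesseract <image> <output_base>`, which writes the result to `<output_base>.txt`. The function names and the file name `certificate.tif` are hypothetical, and the sketch does not use any GRAMPS interface: the recognized text would still be pasted into GRAMPS by hand.

```python
import shutil
import subprocess
from pathlib import Path


def tesseract_command(image_path, output_base):
    """Build the command line `tesseract <image> <output_base>`.

    Tesseract writes the recognized text to <output_base>.txt.
    """
    return ["tesseract", str(image_path), str(output_base)]


def ocr_to_text(image_path, output_base="ocr_result"):
    """Run Tesseract on a scanned image and return the recognized text.

    Raises RuntimeError if the `tesseract` executable is not on the PATH.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(tesseract_command(image_path, output_base), check=True)
    return Path(f"{output_base}.txt").read_text(encoding="utf-8")
```

A call such as `ocr_to_text("certificate.tif")` would return the recognized text as a string, which can then be proofread and copied into a note or source record.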