Menasseh: small portrait

OCR and Hebrew on the Web

  Contemporary text as a preliminary to 17th-century printing

A page of clear text in contemporary Hebrew, scanned with HP DeskScan II

The text consisted of 1596 characters. After being trained on 42 patterns (including 2 coupled patterns) at accuracy 2, proLector was able to recognize the text, including the spacing. This took approximately 15 minutes.

To get the same result, Omnipage had to be trained on 86 patterns, including 19 coupled patterns. This took approximately 45 minutes.

The result of OCR can be compared to the original.
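For readers who want to put a number on such a comparison, the following is a minimal sketch in Python. It is not part of the proLector or Omnipage workflow described here. It assumes both the original and the recognized text are available as plain Unicode strings, and it uses the character-level edit distance as the measure. The function names and the sample strings are hypothetical and chosen only for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(original: str, recognized: str) -> float:
    """Edit distance divided by the length of the original text."""
    if not original:
        raise ValueError("original text must not be empty")
    return edit_distance(original, recognized) / len(original)


# Hypothetical sample: the final letter is misrecognized.
original = "שלום עולם"
recognized = "שלום עולס"
print(character_error_rate(original, recognized))
```

Because spaces are ordinary characters in this measure, spacing errors count in the same way as misread letters.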

To read the Hebrew text, it is necessary to install a Hebrew font.
All necessary information can be found at including a link to the ftp-site.
Connecting to the site may take some time.

The results of OCR on books from the Menasseh collection using proLector can also be viewed:

