With each new development cycle, ABBYY enhances the recognition accuracy, document analysis, export, and synthesis capabilities of the Engine by adding innovations and features designed to meet the needs of our customers. FineReader Engine 7.1 inherits all improvements implemented in the 7.0 development cycle. Customers of FineReader Engine 7.1 will benefit from the following enhancements:
Improvements of Core Technology
Recognition
Overall recognition accuracy improved by 25%
FineReader Engine 7.1 is based on an entirely new recognition platform that includes enhancements to ABBYY's proprietary IPA Technology and other tools for fine-tuning the recognition process. A major contributor to the overall letter and word accuracy is the new set of "structural character models." These models, created by ABBYY researchers, are stored in the program's memory and are used for recognizing individual text characters. The structural character models have proved to be extremely useful in one of the most complex ICR tasks: hand-print character recognition. FineReader Engine 7.1 is the first ABBYY program to use such models for machine-printed characters. FineReader technology now delivers higher recognition accuracy, better tolerates changes in letter shapes, and can even restore missing or blotted-out parts of letters.
New structural character models are used to analyze letters and words.
Recognition accuracy on "difficult-to-read" documents improved by 33%
Today's OCR technologies can already recognize "simple texts" accurately. The real challenge that remains is to deliver high accuracy for what are considered "difficult-to-read" documents and documents with complex formats.
Difficult-to-read documents contain not just simple text but also features that hinder the recognition process. Examples include text printed over an image, low-contrast documents with color text printed on a color background, and poorly scanned pages.
In FineReader 7.1, ABBYY has further enhanced two image preprocessing technologies that help recognize such documents. The first is Adaptive Binarization. Introduced in FineReader Engine 6.0, Adaptive Binarization uses a "dynamic" or "intelligent" threshold technique to adjust the contrast of a document when converting grayscale and color document images to black and white, a necessary step prior to recognition. The dynamic thresholding tunes the image contrast on a line-by-line basis, optimizing character quality to achieve the most accurate recognition results.
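The general idea behind a dynamic threshold can be illustrated with a minimal sketch. It is not ABBYY's algorithm: it substitutes a simple local-mean threshold computed per pixel rather than per line, and the function name `adaptive_binarize` and the `radius` and `offset` parameters are assumptions made for illustration only.

```python
def adaptive_binarize(gray, radius=1, offset=0):
    """Binarize a grayscale image with a local-mean threshold.

    `gray` is a list of rows of 0-255 integers. The result uses
    0 for ink (black) and 1 for background (white).
    Hypothetical helper, not part of any ABBYY API.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Neighbourhood around the pixel, clipped at the image border.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            local_mean = sum(window) / len(window)
            # A pixel darker than its surroundings is treated as ink.
            row.append(0 if gray[y][x] < local_mean - offset else 1)
        result.append(row)
    return result


# Low-contrast sample: a dark character pixel (60) on a dark background (90).
low_contrast = [
    [90, 90, 90],
    [90, 60, 90],
    [90, 90, 90],
]
binary = adaptive_binarize(low_contrast)
```

In this sample a fixed global threshold of 128 would turn every pixel black, while the local threshold keeps only the darker center pixel as ink.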
Additionally, ABBYY has further enhanced its "texture filtering" technologies. These processes remove texture and background "noise" that could interfere with text recognition. In FineReader Engine 7.1, ABBYY has introduced a new filtering technique that filters out excess dots and background images at multiple levels.
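As a rough illustration of what removing background "noise" can mean at the simplest level, the sketch below drops isolated black dots from an already binarized image. It is an assumption-laden stand-in, not the multi-level filter described above; `remove_isolated_dots` and `min_neighbours` are hypothetical names.

```python
def remove_isolated_dots(binary, min_neighbours=1):
    """Remove black pixels that have too few black neighbours.

    `binary` is a list of rows where 0 is ink and 1 is background.
    A black pixel with fewer than `min_neighbours` black pixels among
    its 8 neighbours is considered noise and turned white.
    Hypothetical helper for illustration only.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    cleaned = [row[:] for row in binary]  # work on a copy
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 0:
                continue
            neighbours = 0
            for j in range(max(0, y - 1), min(height, y + 2)):
                for i in range(max(0, x - 1), min(width, x + 2)):
                    if (j, i) != (y, x) and binary[j][i] == 0:
                        neighbours += 1
            if neighbours < min_neighbours:
                cleaned[y][x] = 1
    return cleaned


# One stray dot in the top-left corner and a short vertical stroke on the right.
page = [
    [0, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
]
cleaned_page = remove_isolated_dots(page)
```

The stray dot is removed while the two-pixel stroke survives, because each stroke pixel has a black neighbour.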
Recognition accuracy on specialized legal and medical texts improved by 30-40%
ABBYY FineReader Engine 7.1 includes specialized legal and medical dictionaries for the English and German languages, which greatly improve recognition accuracy of legal and medical texts.
New recognition module for barcode recognition
The new module supports the most popular 1D barcodes: Code 39, Checked Code 39, Interleaved 25, Checked Interleaved 25, EAN 8, EAN 13, Code 128, CODABAR (without checksum), UCC Code 128, Code 2 of 5 (Industrial, IATA, Matrix), Code 93, UPC-A, UPC-E and Postnet. This fast-working module can automatically find and recognize barcodes on a document.
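Several of the listed symbologies carry a check digit. As a hedged illustration, the sketch below computes the standard modulo-10 check digit used by EAN 8, EAN 13 and UPC-A, which can be useful for validating a decoded value in calling code. How the Engine itself validates barcodes is not described here, and the function names `gs1_check_digit` and `is_valid_code` are assumptions.

```python
def gs1_check_digit(payload: str) -> int:
    """Return the modulo-10 check digit for an EAN 8, EAN 13 or UPC-A payload.

    `payload` is the code without its final check digit.
    Hypothetical helper, not part of any ABBYY API.
    """
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must contain ASCII digits only")
    total = 0
    # Walk from the rightmost payload digit; weights alternate 3, 1, 3, 1, ...
    for index, char in enumerate(reversed(payload)):
        weight = 3 if index % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_code(code: str) -> bool:
    """Check that the last digit of a full code matches the computed check digit."""
    if len(code) < 2 or not (code.isascii() and code.isdigit()):
        return False
    return gs1_check_digit(code[:-1]) == int(code[-1])


print(is_valid_code("4006381333931"))  # EAN 13 sample, prints True
print(is_valid_code("4006381333932"))  # last digit altered, prints False
```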
Document and Image Analysis
Thanks to MDA and IPA Technology, FineReader Engine 7.1 retains the exact look of the printed document, be it wrap-around text, columns, tables, non-rectangular pictures, varying fonts, or varying spacing between characters.
Improved accuracy of layout analysis and format retention
ABBYY FineReader Engine 7.1 better retains document layouts, such as placement of images and columns, formatting of tables, font sizes, and more. Key improvements include:
- complex tables (such as tables without printed grid lines and tables with colored cells)
- multi-column documents with images (such as magazine articles)
- HTML formatting
- bullet points (when exporting to Microsoft Word)
FineReader Engine 7.1 preserves the shapes of certain standard Microsoft Word bullets when creating lists and outlines.
Improvements in recognition accuracy and format retention are largely due to a new algorithm used for Multilevel Document Analysis (MDA). The document is analyzed "top-down" at multiple levels: the system first divides the page image into blocks, then blocks into paragraphs, lines, and so on, down to individual characters.
After recognition is complete, the electronic copy is reconstructed in the reverse, "bottom-up" sequence. At the same time, a feedback mechanism that controls all levels of analysis substantially reduces the probability of recognition errors at the higher levels.
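The top-down and bottom-up passes can be pictured with a toy model. The sketch below uses plain text as a stand-in for a page image and simplifies the hierarchy to blocks, lines, words and characters; it does not model the feedback between levels, and `analyze` and `synthesize` are hypothetical names rather than Engine functions.

```python
def analyze(page):
    """Top-down pass: page -> blocks -> lines -> words -> characters.

    Blocks are separated by blank lines, lines by newlines,
    words by single spaces.
    """
    return [
        [[list(word) for word in line.split(" ")] for line in block.split("\n")]
        for block in page.split("\n\n")
    ]


def synthesize(tree):
    """Bottom-up pass: characters -> words -> lines -> blocks -> page."""
    return "\n\n".join(
        "\n".join(" ".join("".join(chars) for chars in line) for line in block)
        for block in tree
    )


page = "Title\n\nfirst line\nsecond line"
tree = analyze(page)           # nested lists, one level per layer of the hierarchy
rebuilt = synthesize(tree)     # reassembled in the reverse order
```

Because each level is split and rejoined with the same separator, the rebuilt page equals the original text in this toy model.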
Export and Synthesis
Creates PDF files optimized for Web publishing
All PDF documents created by the FineReader Engine are now created as "linearized PDFs," which means that users will be able to see the first pages before they have downloaded the entire document.
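One way to spot-check an output file is to look for the linearization dictionary, which the PDF specification requires to sit within the first 1024 bytes of a linearized file. The sketch below is only a heuristic under that assumption: it does not validate hint tables or offsets, and `looks_linearized` is a hypothetical helper name.

```python
def looks_linearized(data: bytes) -> bool:
    """Heuristically test whether PDF bytes begin with a linearization dictionary.

    Only the first 1024 bytes are inspected. A True result means the
    /Linearized key is present, not that the file is fully valid.
    """
    head = data[:1024]
    return head.startswith(b"%PDF-") and b"/Linearized" in head


# Hand-written fragments for illustration, not real Engine output.
linearized_sample = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"
plain_sample = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(looks_linearized(linearized_sample))  # prints True
print(looks_linearized(plain_sample))       # prints False
```

For a file on disk, the same check can be applied to the bytes returned by reading the first 1024 bytes in binary mode.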
Improved WYSIWYG HTML output
The retention of complex formatting elements in HTML has been improved (e.g. text flowing around non-rectangular images). HTML files are now smaller, which is important for documents published on the Internet.
Output to Microsoft PowerPoint
ABBYY FineReader Engine 7.1 now supports output to Microsoft PowerPoint. You can quickly convert presentation handouts or any other documents to create new presentations or edit existing ones. (PowerPoint XP and 2003 are supported.)
Smaller file sizes when exporting to Microsoft Word
When exporting results to Microsoft Word, the resulting DOC file is smaller than in previous versions. The retention of formatting in documents with various separators has been improved, and new image saving options have been added that allow you to specify the sizes of saved images.
Image Formats
JPEG 2000 support
Image files based on the latest JPEG file format can now be opened and saved.