Nicomsoft OCR SDK – List of Features
- Supports 26 languages: Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, and Turkish.
- Includes an Asian OCR module that supports 5 more languages: Simplified Chinese, Traditional Chinese, Arabic, Japanese, and Korean.
- Supports many image formats, including such popular ones as BMP, JPEG, PNG, TIFF, and GIF.
- Supports multipage image formats (TIFF, GIF).
- Supports PDF files (Ghostscript is used by default).
- OCR'ed documents can be saved in PDF, PDF/A (PDF/A-1a or PDF/A-1b), RTF, plain-text, or XML format.
- Can load an image from a file, memory, or raw pixel data.
- Can scan documents: both TWAIN and WIA interfaces are supported.
- Supports bar codes: EAN-13/UPC-A, UPC-E, EAN-8, Code 128, Code 39, Interleaved 2 of 5, and QR codes.
- Supports MRZ (machine-readable zone) recognition for passports, visas, identity cards, etc.
- Uses an advanced deskewing algorithm.
- Can detect misoriented pages (90/180/270 degrees rotation) and fix the orientation automatically.
- Can invert/rotate/mirror the entire image or some text block(s) before processing.
- Can automatically scale an image for better recognition.
- Thanks to robust adaptive image binarization, supports images with poor brightness or low contrast.
- Can configure image binarization parameters for specific images.
- Uses an advanced page layout analysis algorithm (zoning).
- Can perform OCR step by step and produce intermediate results.
- Uses an advanced line detection and removal algorithm.
- Can perform zonal OCR, that is, allows you to select one or multiple areas for OCR.
- Allows you to specify different OCR options for different areas.
- Allows you to specify multiple languages for OCR.
- Allows you to specify different languages for different areas in the same image.
- Can detect and process inverted text.
- Can use several CPU cores to speed up OCR, even when only a single page is processed.
- Is thread-safe, that is, can process multiple images at once by using multiple threads.
- Is based on a unique character analysis technology that can accurately recognize text in any font.
- Uses advanced algorithms for analyzing poor-quality images, which may contain distorted, connected, and broken characters.
- Uses dictionaries for the best recognition.
- Allows you to use user-defined dictionaries.
- Can format text automatically: remove unnecessary line breaks, combine divided words, detect bulleted lists, etc.
- Is highly optimized for SSE instructions to speed up the OCR process.
- Allows you to change the default options and improve OCR results if some image parameters are known.
- Allows you to exclude certain characters from the character set if necessary.
- Can return additional information about text lines, words, and characters: position, size, quality, etc.
- Allows you to specify regular expressions to improve recognition of formatted data.
- Fully supports Unicode.
- Can load images from memory and save recognized documents to memory, instead of using files.
- Is compact: the OCR binaries and data files take less than 10 MB per language.
- Has a simple API and includes sample projects for various programming languages: C#, C/C++, VB.NET, Delphi, C++ Builder, Visual Basic, VBScript, JScript.
- Supports a wide range of frameworks and technologies: .NET, WPF, WCF, ASP.NET, Silverlight, etc.
- Both x86 and x64 native binaries are available for Windows.
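
Two of the data types listed above, EAN-13 bar codes and MRZ fields, carry check digits, so an application can verify a recognized value after OCR. The following is a minimal Python sketch of that check. It does not use the Nicomsoft OCR SDK API, and the function names `ean13_is_valid` and `mrz_check_digit` are hypothetical helpers introduced here for illustration only. It assumes the recognized value is already available as a plain string.

```python
# Illustrative post-OCR validation helpers. These are NOT part of the
# Nicomsoft OCR SDK; the names are hypothetical.

def ean13_is_valid(code: str) -> bool:
    """Return True if `code` is 13 ASCII digits with a correct EAN-13 check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def mrz_check_digit(field: str) -> int:
    """Compute the ICAO 9303 check digit for one MRZ field."""
    weights = (7, 3, 1)  # repeating weights applied left to right
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


if __name__ == "__main__":
    print(ean13_is_valid("4006381333931"))  # sample EAN-13 code
    print(mrz_check_digit("L898902C3"))     # document number from the ICAO specimen passport
```

A failed check is a cheap signal that a character was misread, so the application can re-run recognition on that area with different options or flag the value for manual review.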
Download Nicomsoft OCR SDK.