Smart and powerful OCR tools

Nicomsoft OCR SDK – List of Features

  • Supports 26 languages: Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Hungarian, Indonesian, Italian, Latvian, Lithuanian, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish and Turkish
  • Includes an Asian OCR module that supports 5 additional languages: Simplified Chinese, Traditional Chinese, Arabic, Japanese, and Korean.
  • Supports many image formats, including such popular ones as BMP, JPEG, PNG, TIFF, and GIF.
  • Supports multipage image formats (TIFF, GIF).
  • Supports PDF files (by default, Ghostscript is used).
  • OCR'ed documents can be saved in the PDF, PDF/A (PDF/A-1a or PDF/A-1b), RTF, Text or XML formats.
  • Can load an image from a file, memory, or raw pixel data.
  • Can scan documents: both TWAIN and WIA interfaces are supported.
  • Supports bar codes: EAN-13/UPC-A, UPC-E, EAN-8, Code 128, Code 39, Interleaved 2 of 5, and QR codes.
  • Supports MRZ (machine-readable zone) recognition for passports, visas, identity cards, etc.
  • Uses an advanced deskewing algorithm.
  • Can detect misoriented pages (90/180/270 degrees rotation) and fix the orientation automatically.
  • Can invert/rotate/mirror the entire image or some text block(s) before processing.
  • Can automatically scale an image for better recognition.
  • Thanks to robust adaptive image binarization, supports images with poor brightness or low contrast.
  • Can configure image binarization parameters for specific images.
  • Uses an advanced page layout analysis algorithm (zoning).
  • Can perform OCR step by step and produce intermediate results.
  • Uses an advanced line detection and removal algorithm.
  • Can perform zonal OCR, that is, allows you to select one or multiple areas for OCR.
  • Allows you to specify different OCR options for different areas.
  • Allows you to specify multiple languages for OCR.
  • Allows you to specify different languages for different areas in the same image.
  • Can detect and process inverted text.
  • Can use several CPU cores to speed up OCR, even when only a single page is processed.
  • Is thread-safe, that is, can process multiple images at once by using multiple threads.
  • Is based on a unique character analysis technology that can accurately recognize any font.
  • Uses advanced algorithms for analyzing poor-quality images, which may contain distorted, connected, and broken characters.
  • Uses dictionaries for the best recognition.
  • Allows you to use user-defined dictionaries.
  • Can format text automatically: remove unnecessary line breaks, combine divided words, detect bulleted lists, etc.
  • Is highly optimized for SSE instructions to speed up the OCR process.
  • Allows you to change the default options and improve OCR results if some image parameters are known.
  • Allows you to exclude certain characters from the character set if necessary.
  • Can get additional information about text lines, words, and characters: position, size, quality, etc.
  • Allows you to specify regular expressions to improve recognition of formatted data.
  • Fully supports Unicode.
  • Can load images from memory and save recognized documents to memory, instead of using files.
  • Has a small footprint: the OCR binaries and data files take less than 10 MB per language.
  • Has a simple API and includes sample projects for various programming languages: C#, C/C++, VB.NET, Delphi, C++ Builder, Visual Basic, VBScript, and JScript.
  • Supports a wide range of frameworks and technologies: .NET, WPF, WCF, ASP.NET, Silverlight, etc.
  • Both x86 and x64 native binaries are available for Windows.
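
Several of the data types listed above (EAN-13 bar codes and MRZ fields) carry built-in check digits, so recognized values can be verified after OCR. The following is a minimal sketch in Python of both checks. It is not part of the Nicomsoft OCR SDK API; the function names `ean13_is_valid` and `mrz_check_digit` are hypothetical and chosen only for illustration.

```python
# Illustrative post-OCR validation helpers (not part of the Nicomsoft OCR SDK).

def ean13_is_valid(code: str) -> bool:
    """Return True if a 13-digit string has a correct EAN-13 check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # The first 12 digits are weighted 1, 3, 1, 3, ... from left to right.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def mrz_check_digit(field: str) -> int:
    """Compute the check digit of an MRZ field (weights 7, 3, 1, modulo 10)."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if "0" <= ch <= "9":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # the filler character counts as zero
        else:
            raise ValueError(f"invalid MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


if __name__ == "__main__":
    print(ean13_is_valid("4006381333931"))
    print(mrz_check_digit("L898902C3"))
```

A failed check usually means at least one character was misread, so the affected area could be recognized again with different options or flagged for manual review.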

Download a trial version of the Nicomsoft OCR SDK.