General description of Nicomsoft OCR architecture

The Nicomsoft OCR engine (NSOCR) consists of the following modules:

  1. Main engine module: Engine_XXXXX functions. It is used for initialization/uninitialization of the library and general library configuration.
  2. Configuration module: Cfg_XXXXX functions. It is used to create, destroy, and manage CFG objects. A CFG object stores all settings that are used in the OCR process.
  3. OCR module: Ocr_XXXXX functions. It is used to create, destroy, and manage OCR objects. An OCR object performs optical character recognition and handles all resources that are related to the OCR process, such as character template bases, dictionaries, recognition threads, etc. When a new OCR object is created, a related CFG object must be specified, which will be used to retrieve settings.
  4. Image module: Img_XXXXX functions. It is used to create, destroy, and manage IMG objects. An IMG object handles an image for OCR. When a new IMG object is created, a related OCR object must be specified, which will be used for the OCR process.
  5. Block module: Blk_XXXXX functions. It is used to manage BLK objects. A BLK object handles an area in the related IMG object. The area can be a text block, a picture, a clear area, etc. Each BLK object has a related IMG object, is characterized by position and type, and also has other settings that define how OCR is performed for the block.
  6. Saver module: Svr_XXXXX functions. It is used to save OCR'ed documents as PDF, RTF, or TXT files.
  7. Scan module: Scan_XXXXX functions. It is used to work with scanners and scanned documents via the TWAIN and WIA interfaces.
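The parent/child relations described in the module list can be sketched as a small data model. This is only an illustration in Python, not part of the library: the class names Cfg, Ocr, Img, and Blk, the field names, and the helper cfg_of are hypothetical, and the rectangle layout for a block's position is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Cfg:
    # Stores the settings used in the OCR process.
    options: Dict[str, str] = field(default_factory=dict)


@dataclass
class Ocr:
    # An OCR object is created against a CFG object and reads settings from it.
    cfg: Cfg


@dataclass
class Img:
    # An IMG object is created against the OCR object that will process it.
    ocr: Ocr


@dataclass
class Blk:
    # A BLK object describes one area of its IMG object.
    img: Img
    kind: str                          # e.g. "text", "picture", "clear"
    rect: Tuple[int, int, int, int]    # assumed layout: left, top, right, bottom


def cfg_of(blk: Blk) -> Cfg:
    """Walk up the chain BLK -> IMG -> OCR -> CFG."""
    return blk.img.ocr.cfg
```

In this model, several Ocr instances may point at the same Cfg, while every Img belongs to exactly one Ocr.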

In general, to OCR an image, you need to do the following:

  1. Initialize the library: Engine_Initialize.
  2. Create a CFG object and load the configuration file: Cfg_Create, Cfg_LoadOptions.
  3. Create an OCR object: Ocr_Create.
  4. Create an IMG object and load an image for OCR: Img_Create, Img_LoadFile.
  5. Create one or more BLK objects to define areas for OCR: Img_AddBlock. (Skip this step if you want to perform auto-zoning for the entire image and create BLK objects automatically.)
  6. Call the Img_OCR function to perform OCR, and then get results by using the Img_GetImgText function (or a different function).
  7. Release all created objects and uninitialize the library: Img_Destroy, Ocr_Destroy, Cfg_Destroy, Engine_Uninitialize.
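The call order above can be sketched as follows. This is a minimal Python sketch, not the library's real binding: StubNsocr is a hypothetical stand-in that only records call names, and the argument lists (for example, passing the CFG handle to Ocr_Create) are assumptions inferred from the text, so check the NSOCR reference for the actual signatures.

```python
class StubNsocr:
    """Hypothetical stand-in for the library; it only records call names."""

    def __init__(self):
        self.calls = []
        self._last_handle = 0

    def __getattr__(self, name):
        # Any Engine_/Cfg_/Ocr_/Img_ call is logged. *_Create calls also
        # hand out the next integer handle (1, 2, 3, ...).
        def call(*args):
            self.calls.append(name)
            if name.endswith("_Create"):
                self._last_handle += 1
                return self._last_handle
            if name == "Img_GetImgText":
                return "recognized text (stub)"
            return None
        return call


def ocr_single_image(lib, image_path, config_path):
    lib.Engine_Initialize()                    # initialize the library
    cfg = lib.Cfg_Create()                     # CFG object + configuration file
    lib.Cfg_LoadOptions(cfg, config_path)
    ocr = lib.Ocr_Create(cfg)                  # OCR object tied to the CFG object
    img = lib.Img_Create(ocr)                  # IMG object tied to the OCR object
    try:
        lib.Img_LoadFile(img, image_path)
        # Img_AddBlock is skipped here, which corresponds to auto-zoning.
        lib.Img_OCR(img)
        return lib.Img_GetImgText(img)
    finally:
        # Release objects in reverse order of creation, then uninitialize.
        lib.Img_Destroy(img)
        lib.Ocr_Destroy(ocr)
        lib.Cfg_Destroy(cfg)
        lib.Engine_Uninitialize()
```

The try/finally block is a design choice of this sketch: it keeps the release calls from being skipped if loading or recognition fails.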

If you need to recognize two or more images one by one:

  1. Initialize the library: Engine_Initialize.
  2. Create a CFG object and load the configuration file: Cfg_Create, Cfg_LoadOptions.
  3. Create an OCR object: Ocr_Create.
  4. Create an IMG object: Img_Create.
  5. Load an image for OCR: Img_LoadFile (or use a different function).
  6. Create one or more BLK objects to define areas for OCR: Img_AddBlock. (Skip this step if you want to perform auto-zoning for the entire image and create BLK objects automatically.)
  7. Call the Img_OCR function to perform OCR, and then get results by using the Img_GetImgText function (or a different function).
  8. Go to step 5 to load the next image.
  9. Release all created objects and uninitialize the library: Img_Destroy, Ocr_Destroy, Cfg_Destroy, Engine_Uninitialize.
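To make the difference from the single-image case visible, the following Python sketch builds the sequence of calls for a list of images without touching the library. The function name sequential_call_plan and the file names are hypothetical; only the NSOCR function names come from the steps above.

```python
def sequential_call_plan(image_paths):
    """Return the ordered list of library calls for processing images one by one."""
    # Setup happens once, no matter how many images follow.
    plan = [
        "Engine_Initialize",
        "Cfg_Create",
        "Cfg_LoadOptions",
        "Ocr_Create",
        "Img_Create",
    ]
    # Only loading, recognition, and text retrieval repeat per image;
    # the same IMG object is reused for every file.
    for path in image_paths:
        plan.append("Img_LoadFile(" + path + ")")
        plan.append("Img_OCR")
        plan.append("Img_GetImgText")
    # Cleanup also happens once, after the last image.
    plan += ["Img_Destroy", "Ocr_Destroy", "Cfg_Destroy", "Engine_Uninitialize"]
    return plan
```

For three images, the plan contains one Img_Create and three Img_OCR entries, which is the point of this scenario.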

As you can see, in this scenario the CFG, OCR, and IMG objects are created only once, and the library is initialized only once, too. In both scenarios, the object hierarchy is the same:

[Figure: Hierarchy]

By default, an OCR object can use several CPU cores (the option "Main/NumKernels"), so every image can be processed faster on a multi-core CPU. However, because some OCR algorithms cannot be parallelized, the CPU will not be fully utilized. If you have many images and need to process them as fast as possible, the best way is to process several images at once, using one thread per image. In this case, you should do the following:

  1. Initialize the library: Engine_Initialize.
  2. Create a CFG object and load the configuration file: Cfg_Create, Cfg_LoadOptions.
  3. Call the Cfg_SetOption function to set the option "Main/NumKernels" to "1", in order to force each OCR object to use only one thread during the OCR process. If you use a .NET language, also make sure that your .NET application has the "[MTAThread]" attribute so that the NSOCR COM object can be used correctly from multiple threads.
  4. Create several threads in your application, ideally as many as the number of CPU cores.
  5. In every thread, create an OCR object: Ocr_Create.
  6. In every thread, create an IMG object: Img_Create.
  7. In every thread, load an image for OCR: Img_LoadFile (or a different function).
  8. In every thread, create one or more BLK objects to define areas for OCR: Img_AddBlock. (Skip this step if you want to perform auto-zoning for the entire image and create BLK objects automatically.)
  9. In every thread, call the Img_OCR function to perform OCR, and then get results by using the Img_GetImgText function (or a different function).
  10. In every thread, go to step 7 to load the next image.
  11. Release all created objects and uninitialize the library: Img_Destroy, Ocr_Destroy, Cfg_Destroy, Engine_Uninitialize.

In this scenario, the object hierarchy looks like this:

[Figure: Hierarchy]

Note: This approach requires more memory because several images are processed at the same time. You can get the ERROR_NOMEMORY error if NSOCR cannot allocate enough memory when simultaneously processing many large images (especially in x86 applications).
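The multi-threaded scenario can be sketched in Python as shown below. Everything except the NSOCR function and option names is an assumption: StubNsocr is a hypothetical thread-safe stand-in rather than the real binding, the argument lists are guesses based on the text, and sharing the work through a queue is just one possible design.

```python
import os
import queue
import threading


class StubNsocr:
    """Hypothetical, thread-safe stand-in for the library (not the real API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last_handle = 0
        self._loaded = {}       # IMG handle -> path of the currently loaded image
        self.options = {}
        self.ocr_created = 0

    def _new_handle(self):
        with self._lock:
            self._last_handle += 1
            return self._last_handle

    def Engine_Initialize(self): pass
    def Engine_Uninitialize(self): pass
    def Cfg_Create(self): return self._new_handle()
    def Cfg_LoadOptions(self, cfg, path): pass
    def Cfg_SetOption(self, cfg, name, value): self.options[name] = value
    def Cfg_Destroy(self, cfg): pass
    def Ocr_Destroy(self, ocr): pass
    def Img_Create(self, ocr): return self._new_handle()
    def Img_Destroy(self, img): pass
    def Img_OCR(self, img): pass

    def Ocr_Create(self, cfg):
        with self._lock:
            self.ocr_created += 1
        return self._new_handle()

    def Img_LoadFile(self, img, path):
        with self._lock:
            self._loaded[img] = path

    def Img_GetImgText(self, img):
        with self._lock:
            return "text of " + self._loaded[img]


def worker(lib, cfg, jobs, results, results_lock):
    ocr = lib.Ocr_Create(cfg)      # one OCR object per thread, shared CFG object
    img = lib.Img_Create(ocr)      # one IMG object per thread
    try:
        while True:
            try:
                path = jobs.get_nowait()
            except queue.Empty:
                break              # no images left for this thread
            lib.Img_LoadFile(img, path)
            lib.Img_OCR(img)
            text = lib.Img_GetImgText(img)
            with results_lock:
                results[path] = text
    finally:
        lib.Img_Destroy(img)
        lib.Ocr_Destroy(ocr)


def ocr_in_parallel(lib, image_paths, config_path, num_threads=None):
    lib.Engine_Initialize()                          # once per application
    cfg = lib.Cfg_Create()
    lib.Cfg_LoadOptions(cfg, config_path)
    lib.Cfg_SetOption(cfg, "Main/NumKernels", "1")   # one thread per OCR object
    jobs = queue.Queue()
    for path in image_paths:
        jobs.put(path)
    results, results_lock = {}, threading.Lock()
    count = num_threads or os.cpu_count() or 1       # ideally one per CPU core
    threads = [
        threading.Thread(target=worker, args=(lib, cfg, jobs, results, results_lock))
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    lib.Cfg_Destroy(cfg)
    lib.Engine_Uninitialize()
    return results
```

Each worker keeps its OCR and IMG objects for its whole lifetime and only reloads the image, so the number of OCR objects equals the number of threads rather than the number of images.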

Tips for developers:

  • When using the NSOCR library as a COM object, do not create several instances of NSOCRLib. Only one instance of the NSOCR library can be created.
  • Do not initialize NSOCR multiple times. The library must be initialized only once during the lifetime of your application; do not initialize it before every image. In other words, make sure you call the "Engine_Initialize" function only once, even if you process a lot of images, use multiple threads, etc.
  • Do not create several CFG objects if you do not need them. Even if you create multiple OCR objects, you can specify the same CFG object for them. Several CFG objects are necessary only if you need to have different settings for different OCR objects.
  • Do not create several OCR objects if your application has only one thread. Multiple OCR objects are necessary only if you need to process multiple images at once. If you process images one by one, you do not need to create multiple OCR objects; create one OCR object and use it for all images.
  • Do not create multiple IMG objects if you have only one OCR object and do not need to have several images loaded.
  • Do not forget to release objects. Every XXX_Create function returns an object handle, which is an integer. When you create the first object, the handle will be 1; then the XXX_Create function will return 2, etc. There is a simple way to check whether you have forgotten to release objects: just check the object handles while your application is working. Unusually high values (say, more than 100) mean that objects are probably not being released and that you need to check the code. Otherwise, the number of objects will continue to grow, and after a while you will get the ERROR_TOOMANYOBJECTS error.
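The handle check from the last tip can be automated with a small helper. The Python sketch below is hypothetical and independent of the library: HandleTracker and its method names are invented for illustration, the threshold of 100 is the rule of thumb mentioned above, and tracking the set of live handles is an extra check this sketch adds on its own.

```python
HANDLE_WARN_THRESHOLD = 100  # the "say, more than 100" rule of thumb


class HandleTracker:
    """Records handles returned by XXX_Create and released by XXX_Destroy."""

    def __init__(self):
        self.live = set()    # handles created but not yet destroyed
        self.highest = 0     # largest handle value seen so far

    def created(self, handle):
        # Call with the integer handle that an XXX_Create function returned.
        self.live.add(handle)
        self.highest = max(self.highest, handle)
        return handle

    def destroyed(self, handle):
        # Call after the matching XXX_Destroy function.
        self.live.discard(handle)

    def suspicious(self):
        # A very high handle value suggests objects are not being released.
        return self.highest > HANDLE_WARN_THRESHOLD
```

A typical use would be to wrap every create and destroy call with the tracker during testing and to log a warning when suspicious() returns True or when live is not empty at shutdown.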