Nicomsoft OCR: Developer's Guide


Performance Tips


Disable some algorithms

     OCR is a complex task that involves many algorithms. By default, NSOCR uses all available algorithms and settings to achieve good recognition for all possible image types. In many cases, you can disable the algorithms that your images do not need and reduce the recognition time severalfold (see the Configuration section for a description of the options).

     Note that some algorithms are interrelated. To reduce the recognition time, you need to disable all interrelated algorithms. First, disable the algorithms one by one, checking the recognition quality for your images. Second, disable all algorithms that are not important for your images, and then measure the recognition time and compare it with the original recognition time.

Note:
           The first image is recognized more slowly than any subsequent images due to OCR initialization.
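
     The measuring step described above can be sketched as a small timing harness. This is only an illustration under stated assumptions: recognize_fn, average_seconds_per_image and fake_recognize are hypothetical names, not part of the NSOCR API. In a real program the callback would wrap the actual load and OCR calls for one image. The first file is processed once without timing, so that the one-time initialization cost does not distort the comparison.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical callback (not an NSOCR function): recognizes one image
   and returns 0 on success, nonzero on failure. */
typedef int (*recognize_fn)(const char *file_name, void *user_data);

/* Returns the average wall-clock time per image in seconds,
   or -1.0 on bad arguments or if any recognition fails.
   The first file is processed once, untimed, as a warm-up run. */
double average_seconds_per_image(recognize_fn recognize, void *user_data,
                                 const char *const *file_names, int files_cnt)
{
    struct timespec start, end;
    int i;

    if (recognize == NULL || file_names == NULL || files_cnt <= 0)
        return -1.0;

    /* Warm-up run: absorbs the slower first recognition. */
    if (recognize(file_names[0], user_data) != 0)
        return -1.0;

    timespec_get(&start, TIME_UTC);
    for (i = 0; i < files_cnt; i++) {
        if (recognize(file_names[i], user_data) != 0)
            return -1.0;
    }
    timespec_get(&end, TIME_UTC);

    return ((double)(end.tv_sec - start.tv_sec)
            + (double)(end.tv_nsec - start.tv_nsec) / 1e9) / files_cnt;
}

/* Stand-in for a real OCR call, present only so the sketch compiles and
   runs: it counts invocations through user_data and always succeeds. */
int fake_recognize(const char *file_name, void *user_data)
{
    (void)file_name;
    ++*(int *)user_data;
    return 0;
}
```

     Run the harness once with the default configuration and once after disabling the candidate algorithms, using the same set of images, and compare the two averages.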


Select zones for OCR

     If you need to recognize only a part of the image, you can select one or more zones for recognition. Selecting zones manually disables the auto-zoning algorithm and reduces the overall recognition time.


Use a multi-core CPU efficiently

     By default, NSOCR will use several CPU cores to process even a single image. However, not all algorithms can be parallelized, so the fastest way to recognize text in a lot of images is to process them in several threads, one thread per image. For more information, see the article "General description of Nicomsoft OCR SDK architecture", and also check the "C# Multithreading Sample" project in the OCR SDK.
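
     The idea of processing many images in several threads, with each thread handling one image at a time, can be sketched with POSIX threads as follows. This is a minimal illustration, not the SDK's own sample: recognize_fn, job_queue, recognize_in_parallel and demo_recognize are hypothetical names, and the callback stands in for the real NSOCR calls. Which NSOCR objects each thread needs is not shown here; the architecture article mentioned above is the place to check that.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 64

/* Hypothetical callback (not an NSOCR function): recognizes one image
   on behalf of the given worker and returns 0 on success. */
typedef int (*recognize_fn)(const char *file_name, int worker_id);

typedef struct {
    const char *const *file_names;
    int files_cnt;
    int next_index;        /* next unprocessed file, guarded by lock */
    int failures;          /* failed images, guarded by lock */
    pthread_mutex_t lock;
    recognize_fn recognize;
} job_queue;

typedef struct {
    job_queue *queue;
    int worker_id;
} worker_arg;

/* Each worker repeatedly takes the next file index and processes it. */
static void *worker_main(void *p)
{
    worker_arg *arg = p;
    job_queue *q = arg->queue;

    for (;;) {
        int i;
        pthread_mutex_lock(&q->lock);
        i = (q->next_index < q->files_cnt) ? q->next_index++ : -1;
        pthread_mutex_unlock(&q->lock);
        if (i < 0)
            break;                      /* no files left */
        if (q->recognize(q->file_names[i], arg->worker_id) != 0) {
            pthread_mutex_lock(&q->lock);
            q->failures++;
            pthread_mutex_unlock(&q->lock);
        }
    }
    return NULL;
}

/* Returns the number of images that failed, or -1 on bad arguments
   or if no worker thread could be started. */
int recognize_in_parallel(recognize_fn recognize,
                          const char *const *file_names, int files_cnt,
                          int workers)
{
    pthread_t threads[MAX_WORKERS];
    worker_arg args[MAX_WORKERS];
    job_queue q;
    int started = 0, i;

    if (recognize == NULL || file_names == NULL || files_cnt < 0 ||
        workers < 1 || workers > MAX_WORKERS)
        return -1;

    q.file_names = file_names;
    q.files_cnt = files_cnt;
    q.next_index = 0;
    q.failures = 0;
    q.recognize = recognize;
    pthread_mutex_init(&q.lock, NULL);

    for (i = 0; i < workers; i++) {
        args[i].queue = &q;
        args[i].worker_id = i;
        if (pthread_create(&threads[i], NULL, worker_main, &args[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&q.lock);

    return (started == 0) ? -1 : q.failures;
}

/* Stand-in for a real OCR call, present only so the sketch runs:
   it treats an empty file name as a failure. */
int demo_recognize(const char *file_name, int worker_id)
{
    (void)worker_id;
    return file_name[0] == '\0';
}
```

     A shared index protected by a mutex keeps all workers busy even when images differ in size. A reasonable starting point for the number of workers is the number of CPU cores, adjusted after measuring.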


Use NSOCR objects correctly

     If you need to process multiple images, do not create and destroy objects, and do not initialize and uninitialize NSOCR, each time you process an image. Initialization takes time, which is also why the first image is recognized more slowly than any subsequent images. The code that processes several images should look like this:
//FileNames, FilesCnt definitions
//...
int CfgObj, OcrObj, ImgObj, i;
Engine_InitializeAdvanced(&CfgObj, &OcrObj, &ImgObj); //called only once
for (i = 0; i < FilesCnt; i++)
{
  Img_LoadFile(ImgObj, FileNames[i]); //load image into the existing Img object
  Img_OCR(ImgObj, OCRSTEP_FIRST, OCRSTEP_LAST, OCRFLAG_NONE); //perform OCR
  //call Img_GetImgText or other function(s) to get results
}
Engine_Uninitialize(); //called only once