Smart and powerful OCR tools

OCR steps in detail

Any OCR process consists of several steps, such as image scaling, deskewing, noise removal, binarization, line detection, and so on. Sometimes it is very useful to have some control over these steps: for example, to pause the OCR process after a step, analyze the intermediate results and change the settings of the next step before proceeding to it, or execute a step again with different settings. NSOCR provides this kind of control and defines the following OCR steps:

  • OCRSTEP_FIRST
  • OCRSTEP_PREFILTERS
  • OCRSTEP_BINARIZE
  • OCRSTEP_POSTFILTERS
  • OCRSTEP_REMOVELINES
  • OCRSTEP_ZONING
  • OCRSTEP_PREOCR
  • OCRSTEP_OCR
  • OCRSTEP_LAST

These steps are used in the "Img_OCR" function. It has the parameters "FirstStep" and "LastStep", so you can specify a range of OCR steps to execute. The simplest case is to specify the full range OCRSTEP_FIRST ... OCRSTEP_LAST, so all steps are executed and the entire OCR process is done. But you can also call the "Img_OCR" function several times and execute steps one by one, or execute a few steps. Note that you cannot skip any steps, though you can execute some steps twice or more if necessary. So, in fact, both code samples below do exactly the same thing:

NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_FIRST, TNSOCR.OCRSTEP_LAST, TNSOCR.OCRFLAG_NONE);

and

NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_FIRST, TNSOCR.OCRSTEP_BINARIZE, TNSOCR.OCRFLAG_NONE);
//... you can get a binarized image here, check if it is good and re-binarize it with different settings if necessary
NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_BINARIZE + 1, TNSOCR.OCRSTEP_ZONING, TNSOCR.OCRFLAG_NONE);
//... you can analyze the detected zones here, remove some of them, or create new ones
NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_ZONING + 1, TNSOCR.OCRSTEP_LAST, TNSOCR.OCRFLAG_NONE);
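
The range rule described above (start at OCRSTEP_FIRST, never jump over a step, repeats allowed, finish at OCRSTEP_LAST) can be checked on paper before any call is made. The following is a minimal sketch of such a check, not part of NSOCR: the "OcrStep" enum and the "StepPlan" class are hypothetical names introduced for illustration, and the enum only mirrors the order of the OCRSTEP_* constants listed earlier, not their actual values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the TNSOCR.OCRSTEP_* constants.
// Only the order matters here; the real numeric values are not assumed.
public enum OcrStep
{
    First, Prefilters, Binarize, Postfilters, RemoveLines,
    Zoning, PreOcr, Ocr, Last
}

public static class StepPlan
{
    // Returns true if the planned (FirstStep, LastStep) ranges begin at First,
    // never jump over a step that has not been executed, and reach Last.
    public static bool IsValid(IEnumerable<(OcrStep First, OcrStep Last)> calls)
    {
        int highestDone = -1; // nothing executed yet
        foreach (var (first, last) in calls)
        {
            if (first > last) return false;                 // reversed range
            if ((int)first > highestDone + 1) return false; // would skip a step
            highestDone = Math.Max(highestDone, (int)last); // repeats are fine
        }
        return highestDone == (int)OcrStep.Last;
    }
}
```

With this model, the three-call sequence from the second sample (FIRST..BINARIZE, BINARIZE + 1..ZONING, ZONING + 1..LAST) is accepted, while a plan that goes from FIRST..BINARIZE straight to ZONING..LAST is rejected because OCRSTEP_POSTFILTERS and OCRSTEP_REMOVELINES would be skipped. How NSOCR itself reports such a call is not modeled here.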

Another example: an image contains both barcodes and text, and you need to recognize both. Let's do it, assuming that the image is already loaded into the ImgObj object:

int w, h, BlkObj;
string txt1, txt2;
NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_FIRST, TNSOCR.OCRSTEP_ZONING - 1, TNSOCR.OCRFLAG_NONE); //execute steps before zoning
NsOCR.Img_GetSize(ImgObj, out w, out h);
NsOCR.Img_AddBlock(ImgObj, 0, 0, w, h, out BlkObj); //create a zone that covers the entire image
NsOCR.Blk_SetType(BlkObj, TNSOCR.BT_BARCODE); //mark it as a barcode zone
NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_ZONING, TNSOCR.OCRSTEP_LAST, TNSOCR.OCRFLAG_NONE); //recognize all barcodes on the image
NsOCR.Img_GetImgText(ImgObj, out txt1, TNSOCR.FMT_EDITCOPY); //get barcodes text
NsOCR.Img_DeleteAllBlocks(ImgObj); //remove our barcode zone
NsOCR.Img_OCR(ImgObj, TNSOCR.OCRSTEP_ZONING, TNSOCR.OCRSTEP_LAST, TNSOCR.OCRFLAG_NONE); //execute the same steps again; there are no defined zones now, so OCR will find text/picture zones automatically and recognize them.
NsOCR.Img_GetImgText(ImgObj, out txt2, TNSOCR.FMT_EDITCOPY); //get OCR'ed text

 

Detailed description of OCR steps:

  • OCRSTEP_FIRST - The first step, which must be specified as the "FirstStep" parameter value for the first "Img_OCR" function call every time a new image is loaded or a new image page is selected with the "Img_SetPage" function. This step does nothing; it only marks the OCR process as started.
  • OCRSTEP_PREFILTERS - An important step in which many operations are performed on the loaded (original) image. Most importantly, at the end of this step the (final) image can be rescaled for best recognition results. All functions that get or set the position/size of blocks/words/characters use the coordinates of the rescaled image. You can use the "Img_CalcPointPosition" function to convert coordinates between the original and final images.

    The following actions are done at this step: image rotation, mirroring, inversion, deskewing, rescaling, and noise removal. This step can be executed only once. Any subsequent calls will be ignored, because at the first call, the original image is removed to reduce memory usage.

  • OCRSTEP_BINARIZE - At this step, the binarized image is generated.
  • OCRSTEP_POSTFILTERS - At this step, some additional noise-removal algorithms are applied.
  • OCRSTEP_REMOVELINES - At this step, NSOCR detects and removes lines and frames from the image. After this step, you can get the parameters of detected lines by using the "Img_GetPixLineCnt" and "Img_GetPixLine" functions.
  • OCRSTEP_ZONING - If no blocks (zones) were created before this step, NSOCR performs autozoning, that is, detects text and picture zones in the image. This step is ignored if any blocks were created before this step.
  • OCRSTEP_PREOCR - Additional preparations are applied at this step. For example, some blocks can be binarized again.
  • OCRSTEP_OCR - The main step. The OCR process itself, when NSOCR recognizes text in defined zones.
  • OCRSTEP_LAST - The last step, which must be specified as the "LastStep" parameter value at the last "Img_OCR" call. This step does nothing; it only marks the OCR process as finished.

 

Tips for developers:

  • Some functions cannot be called until the required OCR steps have been executed. Otherwise, the ERROR_MISSEDSTEP error code is returned. For example, you cannot call the "Img_GetImgText" function if the OCRSTEP_OCR step was not executed. For more details, please refer to the API documentation.
  • Skipping OCR steps is not allowed. All steps must be executed. If some step is not necessary, you can disable it in the configuration.
  • If you load a new image or select a different image page, the OCR steps must be executed from the beginning, that is, starting from the OCRSTEP_FIRST step.
  • The OCRSTEP_PREFILTERS step can be executed only once; any subsequent calls will be ignored. The reason is that at the first call, the original image is removed to reduce memory usage.
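
One way to avoid running into ERROR_MISSEDSTEP is to keep a small piece of bookkeeping next to each image object. The sketch below is an illustration only and does not call NSOCR: "StepGuard", its nested "Step" enum, and its methods are hypothetical names, and the enum mirrors only the order of the OCRSTEP_* constants. The caller is assumed to invoke "Record" after each successful "Img_OCR" call and "Reset" after loading a new image or calling "Img_SetPage".

```csharp
using System;

// Hypothetical helper that remembers how far the OCR process has advanced
// for one image object. It does not talk to NSOCR itself.
public sealed class StepGuard
{
    // Illustrative mirror of the OCRSTEP_* order (values are not the real ones).
    public enum Step
    {
        First, Prefilters, Binarize, Postfilters, RemoveLines,
        Zoning, PreOcr, Ocr, Last
    }

    private int highestDone = -1; // -1 means "no step executed yet"

    // Call after a new image is loaded or another page is selected:
    // the steps must then start again from OCRSTEP_FIRST.
    public void Reset() => highestDone = -1;

    // Call after a successful Img_OCR call whose LastStep was 'last'.
    public void Record(Step last) =>
        highestDone = Math.Max(highestDone, (int)last);

    // True if 'step' has already been executed for the current image.
    public bool HasRun(Step step) => highestDone >= (int)step;
}
```

For example, checking "guard.HasRun(StepGuard.Step.Ocr)" before calling "Img_GetImgText" makes the dependency described in the first tip explicit, and "guard.HasRun(StepGuard.Step.Prefilters)" indicates that a further OCRSTEP_PREFILTERS request would be ignored.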