Nicomsoft OCR: Developer's Guide


Nicomsoft OCR Constants


Block (zone) type constants:
BT_DEFAULT   0x00  Used only with the Cfg_GetOption and Cfg_SetOption functions to access the "Default" configuration section. See the NSOCR Configuration section for details.
BT_OCRTEXT   0x01  The block contains machine-printed text.
BT_ICRDIGIT  0x02  The block contains handwritten digits.
BT_CLEAR     0x03  The block clears its image area, which excludes that area from recognition.
BT_PICTURE   0x04  The block contains a picture.
BT_ZONING    0x05  The block is used to detect zones (text and picture blocks).
BT_OCRDIGIT  0x06  The block contains machine-printed digits.
BT_BARCODE   0x07  The block contains a barcode.
BT_TABLE     0x08  The block contains a table.
BT_MRZ       0x09  The block contains an MRZ, the machine-readable zone (ISO/IEC 7501-1).


Constants for the Img_LoadBmpData function:
BMP_24BIT      0x00   The image is 24-bit (color).
BMP_8BIT       0x01   The image is 8-bit (grayscale).
BMP_1BIT       0x02   The image is 1-bit (black and white).
BMP_32BIT      0x03   The image is 32-bit (color).
BMP_BOTTOMTOP  0x100  The image is stored bottom-up, starting at the bottom-left corner.
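
The following is a minimal sketch of how a flags value for Img_LoadBmpData might be assembled. It assumes that BMP_BOTTOMTOP is combined with one of the bit-depth constants using bitwise OR, which the value 0x100 suggests but the table does not state. The helper name bmp_flags is hypothetical and is not part of NSOCR.

```python
# Values copied from the table above.
BMP_24BIT = 0x00
BMP_8BIT = 0x01
BMP_1BIT = 0x02
BMP_32BIT = 0x03
BMP_BOTTOMTOP = 0x100

# Maps a bit depth to its constant.
_DEPTH_FLAGS = {24: BMP_24BIT, 8: BMP_8BIT, 1: BMP_1BIT, 32: BMP_32BIT}


def bmp_flags(bits_per_pixel, bottom_up=False):
    """Hypothetical helper: build a flags value for a raw bitmap.

    Assumes the orientation flag is OR-ed with the bit-depth constant.
    """
    if bits_per_pixel not in _DEPTH_FLAGS:
        raise ValueError("unsupported bit depth: %r" % (bits_per_pixel,))
    flags = _DEPTH_FLAGS[bits_per_pixel]
    if bottom_up:
        flags |= BMP_BOTTOMTOP
    return flags
```

For example, bmp_flags(8, bottom_up=True) describes an 8-bit grayscale bitmap whose rows start at the bottom-left corner.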


Constants for the Img_GetImgText, Blk_GetText and Svr_AddPage functions:
FMT_EDITCOPY   0x00  The text will be formatted for editing: unnecessary line breaks will be removed, divided words will be combined, etc.
FMT_EXACTCOPY  0x01  The text will be returned exactly as it appears in the image.


Constants for the Img_OCR function:
OCRSTEP_FIRST        0x00  The first step in the OCR process. It only marks the OCR process as started.
OCRSTEP_PREFILTERS   0x10  Applies image filters: image scaling, inversion, rotation, mirroring, and deskewing algorithms. See the "ImgAlizer" section in the configuration file for possible options. This step is performed only once; you cannot call it twice.
OCRSTEP_BINARIZE     0x20  Calculates the binarized image. See the "Binarizer" section in the configuration file for possible options. If necessary, this step can be called several times with different parameters.
OCRSTEP_POSTFILTERS  0x50  Applies filters to the binarized image. See options such as "BigGarbageMinWidth" or "SmallGarbageMaxPixCnt". If necessary, this step can be called several times with different parameters.
OCRSTEP_REMOVELINES  0x60  Finds lines and removes them from the image. If necessary, this step can be called several times with different parameters.
OCRSTEP_ZONING       0x70  If the image doesn’t have any defined blocks or has BT_ZONING blocks, the page will be analyzed, and text and picture blocks will be created automatically. If necessary, this step can be called several times.
OCRSTEP_OCR          0x80  Performs OCR of the image. If necessary, this step can be called several times with different parameters.
OCRSTEP_LAST         0xFF  The last step in the OCR process. It only marks the OCR process as finished.
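
The step values increase in processing order, so a range of steps can be described by its first and last constant. The sketch below assumes that Img_OCR is given such a first/last pair (the parameter layout is not shown in this section) and only lists which steps a range would cover and which of them the table marks as repeatable. The names steps_between and REPEATABLE_STEPS are illustrative, not NSOCR identifiers.

```python
# Values copied from the table above, in processing order.
OCR_STEPS = [
    ("OCRSTEP_FIRST", 0x00),
    ("OCRSTEP_PREFILTERS", 0x10),
    ("OCRSTEP_BINARIZE", 0x20),
    ("OCRSTEP_POSTFILTERS", 0x50),
    ("OCRSTEP_REMOVELINES", 0x60),
    ("OCRSTEP_ZONING", 0x70),
    ("OCRSTEP_OCR", 0x80),
    ("OCRSTEP_LAST", 0xFF),
]

# Steps the table describes as callable several times.
REPEATABLE_STEPS = {
    "OCRSTEP_BINARIZE",
    "OCRSTEP_POSTFILTERS",
    "OCRSTEP_REMOVELINES",
    "OCRSTEP_ZONING",
    "OCRSTEP_OCR",
}


def steps_between(first, last):
    """Return the names of all steps whose value lies in [first, last]."""
    if first > last:
        raise ValueError("first step must not come after last step")
    return [name for name, value in OCR_STEPS if first <= value <= last]
```

Under these assumptions, a range from OCRSTEP_BINARIZE to OCRSTEP_REMOVELINES covers three steps, all of which may be repeated with different parameters, while any range that starts at OCRSTEP_FIRST includes OCRSTEP_PREFILTERS, which runs only once.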


Constants for the Img_OCR and Ocr_ProcessPages functions, the "Flags" parameter:
OCRFLAG_NONE         0x00  Performs OCR in blocking mode. Img_OCR returns when the OCR process is complete.
OCRFLAG_THREAD       0x01  Performs OCR in non-blocking mode. Img_OCR returns immediately. You need to call Img_OCR again with the OCRFLAG_GETRESULT flag repeatedly until the function returns a value different from ERROR_PENDING.
OCRFLAG_GETRESULT    0x02  Gets the status of the OCR process. Returns ERROR_PENDING if OCR is not complete.
OCRFLAG_GETPROGRESS  0x03  Gets the OCR progress as a percentage. Returns a value from 0 to 100 or an error code.
OCRFLAG_CANCEL       0x04  Cancels the OCR process. Returns immediately. You need to call Img_OCR again with the OCRFLAG_GETRESULT flag repeatedly until the function returns a value different from ERROR_PENDING.
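
The non-blocking flow described for OCRFLAG_THREAD and OCRFLAG_GETRESULT can be sketched as a polling loop. The example below does not call NSOCR: FakeOcrJob is a stand-in written for this illustration, the numeric value of ERROR_PENDING is a placeholder (the real value is not given in this section), and treating 0 as success is an assumption.

```python
import time

# Values copied from the table above.
OCRFLAG_THREAD = 0x01
OCRFLAG_GETRESULT = 0x02

# Placeholder only: the real ERROR_PENDING value is defined by NSOCR.
ERROR_PENDING = -1000


class FakeOcrJob:
    """Stand-in for Img_OCR so the loop can run without the library."""

    def __init__(self, polls_until_done=3):
        self._remaining = polls_until_done

    def img_ocr(self, flags):
        if flags == OCRFLAG_THREAD:
            return 0  # assumed "started" result
        if flags == OCRFLAG_GETRESULT:
            if self._remaining > 0:
                self._remaining -= 1
                return ERROR_PENDING
            return 0  # assumed "finished" result
        raise ValueError("flag not handled by this stand-in")


def run_nonblocking(job, poll_interval=0.0):
    """Start OCR, then poll until the result is no longer ERROR_PENDING.

    Returns (final_result, number_of_polls).
    """
    job.img_ocr(OCRFLAG_THREAD)
    polls = 0
    while True:
        result = job.img_ocr(OCRFLAG_GETRESULT)
        polls += 1
        if result != ERROR_PENDING:
            return result, polls
        time.sleep(poll_interval)
```

The same loop applies after OCRFLAG_CANCEL, since the table asks for repeated OCRFLAG_GETRESULT calls in that case as well. In a real application a non-zero poll interval avoids busy-waiting.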


Constants for the Img_DrawToDC and Img_GetBmpData functions:
DRAW_NORMAL  0x00   Draws the original image. After the OCRSTEP_PREFILTERS step, it draws an intermediate image with possible scaling, inversion, rotation, etc.
DRAW_BINARY  0x01   Draws a binarized image. Can be used only after the OCRSTEP_BINARIZE step.
DRAW_GETBPP  0x100  Retrieves the bits-per-pixel value for the selected mode (use this flag with DRAW_NORMAL or DRAW_BINARY to specify the mode). The possible return values are 8 and 24. In the DRAW_BINARY mode, it always returns 8. In the DRAW_NORMAL mode, it returns 24 for color images and 8 for black-and-white or grayscale images.


Constants for the Blk_Inversion function:
BLK_INVERSE_GET     -1     Gets the current block’s inversion state.
BLK_INVERSE_SET0    0x00   Disallows inversion of the block (black text and white background).
BLK_INVERSE_SET1    0x01   Inverts the block (white text and black background).
BLK_INVERSE_DETECT  0x100  Detects the inversion state automatically. Note that the OCRSTEP_BINARIZE step must be done before using this value.


Constants for the Blk_Rotation function:
BLK_ROTATE_GET     -1        Gets the current block’s rotation state.
BLK_ROTATE_NONE    0x00      Disallows rotation of the block.
BLK_ROTATE_90      0x01      Rotates the block 90° clockwise.
BLK_ROTATE_180     0x02      Rotates the block 180° clockwise.
BLK_ROTATE_270     0x03      Rotates the block 270° clockwise.
BLK_ROTATE_ANGLE   0x100000  Rotates the block clockwise through the specified angle. The angle is specified in degrees, multiplied by 1000. For example, to rotate the block 10 degrees clockwise, use Blk_Rotation(BlkObj, BLK_ROTATE_ANGLE | (10 * 1000)). Negative values are not allowed; to rotate the block 20 degrees counterclockwise, use Blk_Rotation(BlkObj, BLK_ROTATE_ANGLE | ((360-20) * 1000)).
BLK_ROTATE_DETECT  0x100     Detects the rotation (0°/90°/180°/270°) automatically. Note that the OCRSTEP_BINARIZE step must be done before using this value.
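
The angle encoding for BLK_ROTATE_ANGLE can be wrapped in a small helper. The sketch below follows the two examples in the table: the angle is multiplied by 1000, and a counterclockwise (negative) angle is first converted to its clockwise equivalent in the range 0 to 360 degrees. The function name rotation_flag is hypothetical; only the constant comes from the table.

```python
# Value copied from the table above.
BLK_ROTATE_ANGLE = 0x100000


def rotation_flag(degrees_clockwise):
    """Hypothetical helper: encode an arbitrary angle for Blk_Rotation.

    Negative (counterclockwise) angles are mapped to the equivalent
    clockwise angle, because negative values are not allowed.
    """
    # Python's % always yields a non-negative result for a positive modulus.
    normalized = degrees_clockwise % 360
    # Thousandths of a degree; the final % guards against rounding up to 360.
    millidegrees = round(normalized * 1000) % 360000
    return BLK_ROTATE_ANGLE | millidegrees
```

With this helper, rotation_flag(10) yields BLK_ROTATE_ANGLE | 10000 and rotation_flag(-20) yields BLK_ROTATE_ANGLE | 340000, matching the two Blk_Rotation examples in the table. The largest encoded angle (359999) is below 0x100000, so the angle never overlaps the BLK_ROTATE_ANGLE bit.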


Constants for the Blk_Mirror function:
BLK_MIRROR_GET   -1    Gets the current block’s mirror state.
BLK_MIRROR_NONE  0x00  Disallows mirroring of the block.
BLK_MIRROR_H     0x01  Mirrors the block horizontally.
BLK_MIRROR_V     0x02  Mirrors the block vertically.


Constants for the Blk_GetWordFontStyle function:
FONT_STYLE_UNDERLINED  0x01  Underlined text
FONT_STYLE_STRIKED     0x02  Strikethrough text
FONT_STYLE_BOLD        0x04  Bold text (currently not supported in the public release)
FONT_STYLE_ITALIC      0x08  Italic text (currently not supported in the public release)
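
Because the font style values are distinct powers of two, a style value can presumably carry several of them at once. The sketch below assumes that Blk_GetWordFontStyle returns such a bitwise combination (this section does not say so explicitly) and decodes it into readable names. The helper style_names is illustrative only.

```python
# Values copied from the table above.
FONT_STYLE_UNDERLINED = 0x01
FONT_STYLE_STRIKED = 0x02
FONT_STYLE_BOLD = 0x04
FONT_STYLE_ITALIC = 0x08

_STYLES = [
    ("underlined", FONT_STYLE_UNDERLINED),
    ("striked", FONT_STYLE_STRIKED),
    ("bold", FONT_STYLE_BOLD),
    ("italic", FONT_STYLE_ITALIC),
]


def style_names(style):
    """Return the names of all style bits set in the given value."""
    return [name for name, bit in _STYLES if style & bit]
```

A value of FONT_STYLE_UNDERLINED | FONT_STYLE_STRIKED would decode to both names, and 0 would decode to an empty list.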


Constants for the Svr_Create function:
SVR_FORMAT_PDF          0x01  Adobe PDF format (PDF)
SVR_FORMAT_RTF          0x02  Microsoft Rich Text format (RTF)
SVR_FORMAT_TXT_ASCII    0x03  ASCII Text format (TXT)
SVR_FORMAT_TXT_UNICODE  0x04  Unicode Text format (TXT)
SVR_FORMAT_XML          0x05  XML format
SVR_FORMAT_PDFA         0x06  Adobe PDF/A-1a or PDF/A-1b format (PDF/A)


Constants for the Scan_Enumerate function:
SCAN_GETDEFAULTDEVICE  0x01   The function will return the default TWAIN scanner index.
SCAN_SETDEFAULTDEVICE  0x100  The function will set the default TWAIN scanner (Flags = SCAN_SETDEFAULTDEVICE | ScannerIndex).


Constants for the Scan_ScanToImg and Scan_ScanToFile functions:
SCAN_NOUI           0x01  Scans without displaying the scanner preview dialog. Always enabled for WIA devices.
SCAN_SOURCEADF      0x02  Uses an ADF (Automatic Document Feeder) as the document source.
SCAN_SOURCEAUTO     0x04  Detects the document source automatically.
SCAN_DONTCLOSEDS    0x08  Prevents the TWAIN Document Source (DS) from being closed after scanning. In most cases, you do not need to use this option.
SCAN_FILE_SEPARATE  0x10  Used for the Scan_ScanToFile function only: when an ADF is used and several pages are scanned, every page is saved to a separate file.


Constants for the Img_GetProperty function, the "PropertyID" parameter:
IMG_PROP_DPIX       0x01  Resolution (DPI) X
IMG_PROP_DPIY       0x02  Resolution (DPI) Y
IMG_PROP_BPP        0x03  Color depth (bits per pixel)
IMG_PROP_WIDTH      0x04  Original image width
IMG_PROP_HEIGHT     0x05  Original image height
IMG_PROP_INVERTED   0x06  The image inversion flag after the OCRSTEP_PREFILTERS step
IMG_PROP_SKEW       0x07  The image skew angle, multiplied by 1000, after the OCRSTEP_PREFILTERS step
IMG_PROP_SCALE      0x08  The image scale factor, multiplied by 1000, after the OCRSTEP_PREFILTERS step
IMG_PROP_PAGEINDEX  0x09  The image page index for a multi-page document
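
IMG_PROP_SKEW and IMG_PROP_SCALE are reported multiplied by 1000, so the raw value has to be divided by 1000 to recover the actual angle or factor. The sketch below assumes the raw value arrives as an integer; the helper name from_thousandths is hypothetical.

```python
# Values copied from the table above.
IMG_PROP_SKEW = 0x07
IMG_PROP_SCALE = 0x08

# Properties whose raw value is the real value multiplied by 1000.
SCALED_PROPERTIES = {IMG_PROP_SKEW, IMG_PROP_SCALE}


def from_thousandths(property_id, raw_value):
    """Convert a raw property value to its real value where applicable."""
    if property_id in SCALED_PROPERTIES:
        return raw_value / 1000.0
    return raw_value
```

For instance, a raw IMG_PROP_SCALE value of 1500 corresponds to a scale factor of 1.5, while properties such as IMG_PROP_WIDTH are passed through unchanged.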


Constants for the Blk_SetWordRegEx function:
REGEX_SET           0x00  Sets the regular expression.
REGEX_CLEAR         0x01  Clears the regular expression.
REGEX_CLEAR_ALL     0x02  Clears all regular expressions for the block.
REGEX_DISABLE_DICT  0x04  Disallows the use of the dictionary; only the regular expression will be checked.
REGEX_CHECK         0x08  Checks if the specified string matches the current regular expression.


Constants for the Svr_SetDocumentInfo function:
INFO_PDF_AUTHOR    0x01  Sets the "author" info field for a PDF file.
INFO_PDF_CREATOR   0x02  Sets the "creator" info field for a PDF file.
INFO_PDF_PRODUCER  0x03  Sets the "producer" info field for a PDF file.
INFO_PDF_TITLE     0x04  Sets the "title" info field for a PDF file.
INFO_PDF_SUBJECT   0x04  Sets the "subject" info field for a PDF file.
INFO_PDF_KEYWORDS  0x04  Sets the "keywords" info field for a PDF file.


Constants for the Blk_GetBarcodeType function:
BARCODE_TYPE_EAN8      0x01  EAN8 barcode
BARCODE_TYPE_UPCE      0x02  UPCE barcode
BARCODE_TYPE_ISBN10    0x03  ISBN10 barcode
BARCODE_TYPE_UPCA      0x04  UPCA barcode
BARCODE_TYPE_EAN13     0x05  EAN13 barcode
BARCODE_TYPE_ISBN13    0x06  ISBN13 barcode
BARCODE_TYPE_ZBAR_I25  0x07  ZBAR_I25 barcode
BARCODE_TYPE_CODE39    0x08  CODE39 barcode
BARCODE_TYPE_QRCODE    0x09  QRCODE barcode
BARCODE_TYPE_CODE128   0x0A  CODE128 barcode


Constants for the "BarCode/TypesMask" configuration option:
BARCODE_TYPE_MASK_EAN8      0x01   EAN8 barcode
BARCODE_TYPE_MASK_UPCE      0x02   UPCE barcode
BARCODE_TYPE_MASK_ISBN10    0x04   ISBN10 barcode
BARCODE_TYPE_MASK_UPCA      0x08   UPCA barcode
BARCODE_TYPE_MASK_EAN13     0x10   EAN13 barcode
BARCODE_TYPE_MASK_ISBN13    0x20   ISBN13 barcode
BARCODE_TYPE_MASK_ZBAR_I25  0x40   ZBAR_I25 barcode
BARCODE_TYPE_MASK_CODE39    0x80   CODE39 barcode
BARCODE_TYPE_MASK_QRCODE    0x100  QRCODE barcode
BARCODE_TYPE_MASK_CODE128   0x200  CODE128 barcode
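
Comparing the two barcode tables shows that every mask value equals 1 shifted left by (type value - 1). This relationship is read off the listed values and is not a documented rule. The sketch below uses it to build a value for the "BarCode/TypesMask" option, assuming that masks are combined with bitwise OR; the helper names are hypothetical.

```python
# Type values copied from the Blk_GetBarcodeType table.
BARCODE_TYPES = {
    "EAN8": 0x01, "UPCE": 0x02, "ISBN10": 0x03, "UPCA": 0x04,
    "EAN13": 0x05, "ISBN13": 0x06, "ZBAR_I25": 0x07, "CODE39": 0x08,
    "QRCODE": 0x09, "CODE128": 0x0A,
}

# Mask values copied from the "BarCode/TypesMask" table.
BARCODE_MASKS = {
    "EAN8": 0x01, "UPCE": 0x02, "ISBN10": 0x04, "UPCA": 0x08,
    "EAN13": 0x10, "ISBN13": 0x20, "ZBAR_I25": 0x40, "CODE39": 0x80,
    "QRCODE": 0x100, "CODE128": 0x200,
}


def type_to_mask(barcode_type):
    """Derive the mask bit for a type value (observed pattern only)."""
    return 1 << (barcode_type - 1)


def types_mask(*names):
    """OR together the masks of the named barcode types."""
    mask = 0
    for name in names:
        mask |= BARCODE_MASKS[name]
    return mask
```

For example, types_mask("QRCODE", "CODE128") restricts the mask to the two types with values 0x100 and 0x200.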


Constants for the Img_SaveToFile function:
IMG_FORMAT_BMP             00     BMP format
IMG_FORMAT_JPEG            02     JPEG format
IMG_FORMAT_PNG             13     PNG format
IMG_FORMAT_TIFF            18     TIFF format
IMG_FORMAT_FLAG_BINARIZED  0x100  Combine this flag with a format constant to save the binarized image.
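
A minimal sketch of how a format value for Img_SaveToFile might be built follows. It assumes that the format values listed without a 0x prefix are decimal and that IMG_FORMAT_FLAG_BINARIZED is merged using bitwise OR. The helper names save_format and split_format are hypothetical.

```python
# Values copied from the table above; unprefixed values are assumed decimal.
IMG_FORMAT_BMP = 0
IMG_FORMAT_JPEG = 2
IMG_FORMAT_PNG = 13
IMG_FORMAT_TIFF = 18
IMG_FORMAT_FLAG_BINARIZED = 0x100


def save_format(image_format, binarized=False):
    """Build a format value, optionally requesting the binarized image."""
    if binarized:
        return image_format | IMG_FORMAT_FLAG_BINARIZED
    return image_format


def split_format(value):
    """Split a combined value back into (image_format, binarized)."""
    binarized = bool(value & IMG_FORMAT_FLAG_BINARIZED)
    return value & ~IMG_FORMAT_FLAG_BINARIZED, binarized
```

Under these assumptions, save_format(IMG_FORMAT_TIFF, binarized=True) gives 0x112, which is 18 combined with 0x100.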