Asian languages OCR module of the Nicomsoft OCR SDK
We receive many requests for support of Chinese, Arabic and other languages whose scripts are not covered by the main alphabet supported by Nicomsoft OCR.
Since v7.0, the NSOCR SDK has included an additional OCR module for Asian languages. It currently supports the following languages:
Arabic.
Chinese simplified.
Chinese traditional.
Japanese.
Korean.
The Asian OCR module is not part of the main OCR module of NSOCR; it is based on Tesseract with some modifications and important bug fixes. Some notes about this module:
Currently, this module is available for the Windows platform only.
This module can be excluded from the NSOCR binaries if necessary: just remove the "asian" folder from the "Bin" and "Bin_64" folders.
Using both the main and the Asian OCR modules for the same zone is not supported. For example, do not select the "German" and "Japanese" languages for the same zone. However, you can use "German" in one zone and "Japanese" in another.
Using several Asian languages for the same zone is currently not supported. For example, do not select "Korean" and "Japanese" for the same zone. However, you can use "Korean" in one zone and "Japanese" in another.
Some options do not work with the Asian OCR module: the entire "Linezer", "Spacer" and "WordAlizer" sections.
Initialization and recognition take longer than with the main OCR module.
The optimal character height is about 40-50 pixels.
It is recommended to select the recognition language before the OCRSTEP_PREFILTERS step, so that NSOCR can choose the best scale for the selected language.
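
The per-zone language rules and the character-height guideline from the notes above can be checked in application code before the SDK is called. The following is a minimal Python sketch, not NSOCR code: the function names, the language-name strings (taken from the list above; the SDK may spell its language options differently) and the 45-pixel target (the midpoint of the 40-50 pixel range) are assumptions made for illustration only.

```python
# Hypothetical helpers; nothing here calls or mirrors the NSOCR API.

# Languages handled by the Asian OCR module, as listed in this document.
ASIAN_LANGUAGES = {
    "Arabic",
    "Chinese simplified",
    "Chinese traditional",
    "Japanese",
    "Korean",
}


def check_zone_languages(languages):
    """Return a list of problems with one zone's language selection.

    An empty list means the selection respects both rules:
    no mixing of main and Asian languages, and at most one Asian language.
    """
    selected = set(languages)
    asian = sorted(selected & ASIAN_LANGUAGES)
    main = sorted(selected - ASIAN_LANGUAGES)

    problems = []
    if asian and main:
        problems.append(
            "main and Asian languages mixed in one zone: "
            + ", ".join(main + asian)
        )
    if len(asian) > 1:
        problems.append(
            "more than one Asian language in one zone: " + ", ".join(asian)
        )
    return problems


def suggested_scale(char_height_px, target_px=45.0):
    """Scale factor that would bring characters to roughly target_px in height."""
    if char_height_px <= 0:
        raise ValueError("character height must be positive")
    return target_px / char_height_px


if __name__ == "__main__":
    print(check_zone_languages(["German"]))
    print(check_zone_languages(["German", "Japanese"]))
    print(check_zone_languages(["Korean", "Japanese"]))
    print(suggested_scale(15))
```

A selection that returns problems should be split into separate zones, one per module or per Asian language. The scale helper is only a rough manual estimate for pre-scaled images; when the language is selected before the OCRSTEP_PREFILTERS step, NSOCR chooses the scale itself.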