This pack contains the infrastructure that enables document processing flows, with the goal of achieving full digitization, classification and data extraction capabilities.
It incorporates ABBYY FlexiCapture technology by integrating the FCE SDK into our product. To this end, the IntelligentOCR Scope container activity initializes the FlexiCapture engine, enabling you to use all the other FlexiCapture activities within it. We have engineered the proprietary FCDocument variable type, which is compatible with the FCE SDK and enables the analysis of text once a document is converted to this format. Any document can be converted to this variable type via the Process Document activity. The Classify Document activity compares a document against templates and decides where it belongs, while the Validate Document activity can be used to validate it. The Get Field and Get Table activities retrieve field and table data from the document, while the Export Document activity wraps up any changes you've made to the FCDocument variable and exports it to a format of your choice. By harnessing the full power of this engine, you can perform complex tasks that require Intelligent OCR capabilities.
The IntelligentOCR package also contains the infrastructure for enabling document processing through other means:
- The Digitize Document activity retrieves the text from any PDF or image, using the OCR engine of your choice if necessary.
- The Classify Document Scope activity allows the use of any classification algorithm to identify a file's document type. The Keyword Based Classifier activity is the first such classifier, targeting the classification of titled documents.
- The Train Classifiers and Extractors Scope activity allows you to close the feedback loop for any classifiers and data extraction algorithms capable of learning (the Keyword Based Classifier, for example).
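
To make the classify-and-train loop above more concrete, here is a minimal Python sketch of how a keyword-based classifier for titled documents might work in principle. It is not the implementation of the Keyword Based Classifier activity: the class name `KeywordTitleClassifier`, its methods, and the `title_lines` parameter are all hypothetical, and the input is assumed to be plain text already produced by a digitization step.

```python
from collections import defaultdict


class KeywordTitleClassifier:
    """Toy keyword-based classifier for titled documents (illustrative only)."""

    def __init__(self, title_lines=3):
        # Assumption: the first few non-empty lines hold the document title.
        self.title_lines = title_lines
        # Maps a document type to the set of lowercase keywords learned for it.
        self.keywords = defaultdict(set)

    def _title_text(self, text):
        # Collect the leading non-empty lines and normalize them to lowercase.
        lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
        return " ".join(lines[: self.title_lines]).lower()

    def train(self, text, confirmed_type, keyword):
        """Feedback step: remember a human-confirmed keyword for a type.

        The keyword is stored only if it occurs in the title area of the text.
        Returns True when something was learned, False otherwise.
        """
        keyword = keyword.strip().lower()
        if keyword and keyword in self._title_text(text):
            self.keywords[confirmed_type].add(keyword)
            return True
        return False

    def classify(self, text):
        """Return (document_type, matched_keyword), or (None, None).

        Longer keywords win, so 'credit invoice' beats plain 'invoice'.
        """
        title = self._title_text(text)
        best = (None, None)
        for doc_type, words in self.keywords.items():
            for word in words:
                if word in title and (best[1] is None or len(word) > len(best[1])):
                    best = (doc_type, word)
        return best


if __name__ == "__main__":
    clf = KeywordTitleClassifier()
    clf.train("INVOICE\nNo. 1001\nTotal due: 10", "Invoice", "invoice")
    print(clf.classify("Invoice\nACME Corp\nTotal: 10"))  # ('Invoice', 'invoice')
```

In this sketch, `train` plays the role of the feedback loop: each confirmed classification adds a keyword, so later calls to `classify` recognize similar titles without further input.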