Author: Vikrant Korde, Technical Architect, Aurionpro's Oracle Implementation Services team
My wedding photos are stored in several empty shoeboxes. Yes...I got married before digital photography was mainstream...which means I'm
old. But my parents are really
old. They have shoeboxes filled with vacation photos on slides (I doubt many of you have even seen a home slide projector...and I hope you never do!). Neither me nor my parents should have shoeboxes filled with any form of photographs whatsoever. They should obviously live in the digital world...with no physical versions in sight (other than a few framed on our walls).
Businesses grapple with similar challenges. But instead of shoeboxes, they have file cabinets and warehouses jam packed with paper invoices, legal documents, human resource files, material safety data sheets, incident reports, and the list goes on and on. In fact, regulatory and compliance rules govern many industries, requiring that this paperwork is available for any number of years. It's a real challenge...especially trying to find archived documents quickly and many times with no backup. Which brings us to a set of technologies called Image Process Management (or simply Imaging or Image Processing) that are transforming these antiquated, paper-based processes.
Oracle's WebCenter Content Imaging solution is a combination of their WebCenter suite, which offers a robust set of content and document management features, and their Business Process Management (BPM) suite, which helps to automate business processes through the definition of workflows and business rules. Overall, the solution provides an enterprise-class platform for end-to-end management of document images within transactional business processes. It's a solution that provides all of the capabilities needed - from document capture and recognition, to imaging and workflow - to effectively transform your ‘shoeboxes’ of files into digitally managed assets that comply with strict industry regulations. The terminology can be quite overwhelming if you're new to the space, so we've provided a summary of the primary components of the solution below, along with a short description of the two paths that can be executed to load images of scanned documents into Oracle's WebCenter suite.
WebCenter Imaging (WCI): the electronic document repository that provides security, annotations, and search capabilities, and is the primary user interface for managing work items in the imaging solution
SOA & BPM Suites (workflow): provide business process management capabilities, including human tasks, workflow management, service integration, and all other standard SOA features. It's interesting to note that there a number of 'jumpstart' processes available to help accelerate the integration of business applications, such as the accounts payable invoice processing solution for E-Business Suite that facilitates the processing of large volumes of invoices
WebCenter Enterprise Capture (WEC): expedites the capture process of paper documents to digital images, offering high volume scanning and importing from email, and allows for flexible indexing options
WebCenter Forms Recognition (WFR): automatically recognizes, categorizes, and extracts information from paper documents with greatly reduced human intervention
WebCenter Content: the backend content server that provides versioning, security, and content storage
There are two paths that can be executed to send data from WebCenter Capture to WebCenter Imaging, both of which are described below:
1. Direct Flow - This is the simplest and quickest way to push an image scanned from WebCenter Enterprise Capture (WEC) to WebCenter Imaging (WCI), using the bare minimum metadata. The WEC activities are defined below:
The paper document is scanned (or imported from email).
The scanned image is indexed using a predefined indexing profile.
The image is committed directly into the process flow
2. WFR (WebCenter Forms Recognition) Flow - This is the more complex process, during which data is extracted from the image using a series of operations including Optical Character Recognition (OCR), Classification, Extraction, and Export. This process creates three files (Tiff, XML, and TXT), which are fed to the WCI Input Agent (the high speed import/filing module). The WCI Input Agent directory is a standard ingestion method for adding content to WebCenter Imaging, the process for doing so is described below:
WEC commits the batch using the respective commit profile. A TIFF file is created, passing data through the file name by including values separated by "_" (underscores).
WFR completes OCR, classification, extraction, export, and pulls the data from the image. In addition to the TIFF file, which contains the document image, an XML file containing the extracted data, and a TXT file containing the metadata that will be filled in WCI, are also created. All three files are exported to WCI's Input agent directory.
Based on previously defined "input masks", the WCI Input Agent will pick up the seeding file (often the TXT file).
Finally, the TIFF file is pushed in UCM and a unique web-viewable URL is created. Based on the mapping data read from the TXT file, a new record is created in the WCI application.
Although these processes may seem complex, each Oracle component works seamlessly together to achieve a high performing and scalable platform. The solution has been field tested at some of the largest enterprises in the world and has transformed millions and millions of paper-based documents to more easily manageable digital assets. For more information on how an Imaging solution can help your business, please contact
[email protected] (for U.S. West inquiries) or
[email protected] (for U.S. East inquiries).
About the Author:
Vikrant is a Technical Architect in Aurionpro's Oracle Implementation Services team, where he delivers WebCenter-based Content and Imaging solutions to Fortune 1000 clients. With more than twelve years of experience designing, developing, and implementing Java-based software solutions, Vikrant was one of the founding members of Aurionpro's WebCenter-based offshore delivery team. He can be reached at
[email protected].