Author: Vikrant Korde, Technical Architect, Aurionpro's Oracle Implementation Services team
My wedding photos are stored in several shoeboxes. Yes...I got married before digital photography was mainstream...which means I'm old. But my parents are really old. They have shoeboxes filled with vacation photos on slides (I doubt many of you have even seen a home slide projector...and I hope you never do!). Neither my parents nor I should have shoeboxes filled with any form of photographs whatsoever. They should obviously live in the digital world...with no physical versions in sight (other than a few framed on our walls).
Businesses grapple with similar challenges. But instead of shoeboxes, they have file cabinets and warehouses jam-packed with paper invoices, legal documents, human resource files, material safety data sheets, incident reports, and the list goes on and on. In fact, regulatory and compliance rules govern many industries, requiring that this paperwork remain available for years. It's a real challenge...especially when you need to find archived documents quickly, and in many cases no backup copy exists. Which brings us to a set of technologies called Image Process Management (or simply Imaging or Image Processing) that are transforming these antiquated, paper-based processes.
Oracle's WebCenter Content Imaging solution is a combination of its WebCenter suite, which offers a robust set of content and document management features, and its Business Process Management (BPM) suite, which helps to automate business processes through the definition of workflows and business rules. Overall, the solution provides an enterprise-class platform for end-to-end management of document images within transactional business processes. It provides all of the capabilities needed - from document capture and recognition, to imaging and workflow - to effectively transform your ‘shoeboxes’ of files into digitally managed assets that comply with strict industry regulations.
The terminology can be quite overwhelming if you're new to the space, so we've provided a summary of the primary components of the solution below, along with a short description of the two paths that can be executed to load images of scanned documents into Oracle's WebCenter suite.
WebCenter Imaging (WCI): the electronic document repository that provides security, annotations, and search capabilities, and is the primary user interface for managing work items in the imaging solution.
SOA & BPM Suites (workflow): provide business process management capabilities, including human tasks, workflow management, service integration, and all other standard SOA features. It's interesting to note that there are a number of 'jumpstart' processes available to help accelerate the integration of business applications, such as the accounts payable invoice processing solution for E-Business Suite that facilitates the processing of large volumes of invoices.
WebCenter Enterprise Capture (WEC): expedites the capture of paper documents as digital images, offering high-volume scanning, importing from email, and flexible indexing options.
WebCenter Forms Recognition (WFR): automatically recognizes, categorizes, and extracts information from paper documents with greatly reduced human intervention.
WebCenter Content: the backend content server that provides versioning, security, and content storage.
There are two paths that can be executed to send data from WebCenter Enterprise Capture to WebCenter Imaging, both of which are described below:
1. Direct Flow - This is the simplest and quickest way to push an image scanned from WebCenter Enterprise Capture (WEC) to WebCenter Imaging (WCI), using only the bare minimum of metadata.
The WEC activities are defined below:
The paper document is scanned (or imported from email).
The scanned image is indexed using a predefined indexing profile.
The image is committed directly into the process flow.
2. WFR (WebCenter Forms Recognition) Flow - This is the more complex process, during which data is extracted from the image using a series of operations including Optical Character Recognition (OCR), Classification, Extraction, and Export. This process creates three files (TIFF, XML, and TXT), which are fed to the WCI Input Agent (the high-speed import/filing module).
The WCI Input Agent directory is a standard ingestion method for adding content to WebCenter Imaging. The process for doing so is described below:
WEC commits the batch using the respective commit profile. A TIFF file is created, passing data through the file name by including values separated by "_" (underscores).
WFR performs OCR, classification, extraction, and export, pulling the data from the image. In addition to the TIFF file, which contains the document image, it creates an XML file containing the extracted data and a TXT file containing the metadata that will populate WCI. All three files are exported to WCI's Input Agent directory.
Based on previously defined "input masks", the WCI Input Agent will pick up the seeding file (often the TXT file).
Finally, the TIFF file is pushed into UCM (WebCenter Content) and a unique web-viewable URL is created. Based on the mapping data read from the TXT file, a new record is created in the WCI application.
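To make the file handoff in the steps above a little more concrete, here is a minimal Python sketch of two ideas: reading index values out of an underscore-delimited file name, and choosing the seeding file with a mask. It is an illustration only, not Oracle's implementation. The file names, the field names (doc_type, invoice_no, vendor), and the glob-style "*.txt" mask are all assumptions made up for the example; the actual commit profile and input mask are whatever you configure in WEC and WCI.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath


def parse_commit_filename(filename, field_names):
    """Split an underscore-delimited file name into named index values.

    The field order is hypothetical; in practice it is whatever the
    commit profile was configured to write into the file name.
    """
    stem = PurePath(filename).stem  # drop the extension, e.g. ".tif"
    values = stem.split("_")
    if len(values) != len(field_names):
        raise ValueError(
            f"expected {len(field_names)} values, got {len(values)}: {filename!r}"
        )
    return dict(zip(field_names, values))


def select_seeding_files(filenames, input_mask):
    """Return the file names matching a glob-style mask (an assumed mask syntax)."""
    return sorted(name for name in filenames if fnmatchcase(name, input_mask))


# Hypothetical export of one document: image, extracted data, and metadata.
exported = ["INV_10045_ACME.tif", "INV_10045_ACME.xml", "INV_10045_ACME.txt"]

fields = parse_commit_filename(exported[0], ["doc_type", "invoice_no", "vendor"])
seeds = select_seeding_files(exported, "*.txt")

print(fields)
print(seeds)
```

One caveat with passing data through a file name: a value that itself contains an underscore would break the split, which is why the sketch rejects names with an unexpected number of parts instead of guessing.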
Although these processes may seem complex, the Oracle components work together seamlessly to deliver a high-performing, scalable platform. The solution has been field tested at some of the largest enterprises in the world and has transformed millions of paper-based documents into more easily manageable digital assets. For more information on how an Imaging solution can help your business, please contact [email protected] (for U.S. West inquiries) or [email protected] (for U.S. East inquiries).
About the Author:
Vikrant is a Technical Architect in Aurionpro's Oracle Implementation Services team, where he delivers WebCenter-based Content and Imaging solutions to Fortune 1000 clients. With more than twelve years of experience designing, developing, and implementing Java-based software solutions, Vikrant was one of the founding members of Aurionpro's WebCenter-based offshore delivery team. He can be reached at [email protected].