Founded 1977 | HQ Redwood City, CA | 143,000 employees | $43B revenues
Oracle is one of the best-known software companies in the world. Founded in 1977, the company has grown to 143,000 employees with revenues near $43 billion. Best known for its eponymous relational database and big data solutions, Oracle also sells enterprise business applications such as ERP, supply chain management, and human capital management.
The focus of this report is Oracle Cloud Infrastructure (OCI) Vision, announced in February 2022. It is part of the company’s OCI AI Services, a collection of services with prebuilt machine learning (ML) models that developers can use to apply AI to their applications and business operations. The OCI Vision product team is led by Elad Ziklik, formerly of Microsoft, where he led the Azure Cognitive Services product team.
OCI Vision is an AI service for performing deep-learning-based image and document analysis. As a leader in enterprise databases, Oracle has access to a staggering amount of unstructured and structured data, not to mention a large cohort of data scientists, and has leveraged this data advantage to train AI models. Prebuilt, pretrained models help developers add image recognition and text recognition into their applications without the need for ML or data science expertise.
Developers can also create custom models and train on their own data. Examples of how these models can be used in the real world include detecting visual anomalies during a manufacturing process, extracting text from documents to automate business workflows, and tagging items in images to keep count of products or shipments.
Structurally, OCI Vision sits in the ML/AI Services stack of the broader Oracle Cloud Infrastructure (see Figure 1). The OCI Vision application programming interface (API) integrates with the Oracle Cloud and all its associated applications, databases, and AI services.
OCI Vision provides two main functions: image analysis (object detection, classification) and document analysis. For the purposes of this report, we will focus on the document analysis features.
Through the analyzeDocument API, the document models expose a core set of document AI services:
- Text recognition and OCR that locates and digitizes text information from images at the word or line level
- Key-value extraction that can extract a predefined list of key-value pairs
- Table extraction to identify and extract content in tabular format, while maintaining the row/column relationships of cells
- Document classification models to categorize documents into different types based on a combination of visual appearance, high-level features, and extracted keywords
- Asynchronous or batch support on all document analysis APIs
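To make the list above concrete, here is a minimal sketch of how a developer might assemble a request body that selects some of these services for a single document. It only builds a dictionary and does not call Oracle. The helper name `build_analyze_document_request`, the short service keys, and the placeholder compartment ID are hypothetical, and the field names and feature-type strings are assumptions modeled on the public OCI Vision REST API, so they should be checked against Oracle's current API reference.

```python
import base64
import json

# Assumed feature-type strings, modeled on the OCI Vision REST API.
# Verify against Oracle's API reference before relying on them.
FEATURE_TYPES = {
    "text": "TEXT_DETECTION",
    "key_value": "KEY_VALUE_DETECTION",
    "table": "TABLE_DETECTION",
    "classification": "DOCUMENT_CLASSIFICATION",
}


def build_analyze_document_request(document_bytes, services, compartment_id):
    """Build a JSON-serializable request body (hypothetical helper).

    `services` is a list of keys from FEATURE_TYPES, one per document AI
    service the caller wants applied to the document.
    """
    unknown = [s for s in services if s not in FEATURE_TYPES]
    if unknown:
        raise ValueError(f"Unsupported services: {unknown}")
    return {
        "compartmentId": compartment_id,
        "features": [{"featureType": FEATURE_TYPES[s]} for s in services],
        "document": {
            # Inline documents are sent as base64-encoded bytes (assumed).
            "source": "INLINE",
            "data": base64.b64encode(document_bytes).decode("ascii"),
        },
    }


# Example: request text recognition and table extraction for one document.
request_body = build_analyze_document_request(
    b"%PDF-1.4 placeholder bytes",          # stand-in for real file contents
    ["text", "table"],
    "ocid1.compartment.oc1..example",       # placeholder identifier
)
print(json.dumps(request_body, indent=2))
```

Each requested service becomes one entry in the `features` list, so a single call can combine, for example, OCR and table extraction on the same document.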
For document classification, Oracle provides out-of-the-box, pretrained document models for categories including invoice, receipt, tax form, resume, passport, driver’s license, bank statement, pay slip, check, and more. Developers can train the models with their own data to achieve acceptable recognition rates. Forms processing in general is hard to get right. The Oracle classification functionality is a general, all-purpose forms processor based on deep learning. In our experience, only intelligent document processing (IDP) vendors that specialize in and configure their services around industry- or use-case-specific form types can achieve the highest levels of accuracy.
The bottom line is that OCI Vision classification should, in theory, work well. We must stress, however, that at the time of our briefing the product was still in beta testing with select customers. As with any AI/ML service, accuracy should improve over time. Given enough time and training, we expect the Oracle document AI engine to reach levels of proficiency similar to those of competing products.
The OCR text extraction is unremarkable but does the job. It covers the basics: it recognizes printed or handwritten text, works with poor-quality images such as faxes, corrects the orientation of images, and reads tilted images. The table extraction function ticks the usual boxes.
For custom model creation, OCI Vision includes a basic user interface for model testing and training prior to deployment (see Figure 2).
As noted above, OCI Vision also supports batch processing, which is the ability to process documents as a group rather than one at a time. This first release of OCI Vision supports up to 2,000 pages per document, or 2,000 documents per job. Oracle was smart to include batch processing and to position OCI Vision as an enterprise-grade engine, particularly because Oracle’s customers include the largest data processors.
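As a rough illustration of what these limits mean for a large backlog, the sketch below splits a set of documents into batch jobs. It assumes both limits apply independently (no document over 2,000 pages, no job over 2,000 documents), which is our reading of the figures above rather than a confirmed rule. The function `plan_batch_jobs` and the sample file names are hypothetical, and nothing here calls an Oracle API.

```python
# Limits quoted for the first release of OCI Vision.
MAX_PAGES_PER_DOC = 2000
MAX_DOCS_PER_JOB = 2000


def plan_batch_jobs(page_counts):
    """Group documents into jobs that respect both limits (hypothetical helper).

    `page_counts` maps a document name to its page count. Returns a pair:
    a list of jobs (each a list of document names) and a list of documents
    that exceed the per-document page limit and must be split beforehand.
    """
    oversized = [name for name, pages in page_counts.items()
                 if pages > MAX_PAGES_PER_DOC]
    eligible = [name for name, pages in page_counts.items()
                if pages <= MAX_PAGES_PER_DOC]
    jobs = [eligible[i:i + MAX_DOCS_PER_JOB]
            for i in range(0, len(eligible), MAX_DOCS_PER_JOB)]
    return jobs, oversized


# Example: 4,500 three-page invoices plus one 2,500-page archive scan.
backlog = {f"invoice_{i}.pdf": 3 for i in range(4500)}
backlog["archive_scan.pdf"] = 2500

jobs, oversized = plan_batch_jobs(backlog)
print([len(job) for job in jobs])  # [2000, 2000, 500]
print(oversized)                   # ['archive_scan.pdf']
```

Under these assumptions, a backlog of 4,500 short documents needs three jobs, and any document above the page limit has to be split before submission.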
Oracle has created a competent set of document AI services that can be used by anyone for general-purpose IDP applications. Yet we suspect Oracle is aiming much higher.
First, we think the true value of OCI Vision comes from its integration within Oracle Cloud Infrastructure. The briefing team spoke of OCI Vision’s role in creating the “Intelligent Lakehouse” by extracting information from unstructured data stored in a data lake. Unlike a data warehouse, which stores structured data, a data lake is a repository for raw data of all types from all sources, much of it unstructured, stored without a predefined organization or hierarchy. A data lakehouse adds the data structures and management features of a data warehouse to a data lake. Data lakehouses are essential tools for data scientists, enabling ML and business intelligence.
Second, this is a classic Oracle “build versus buy” play. Oracle customers are running document-intensive applications such as financial services, supply chain management, ERP, patient healthcare (through the recent acquisition of Cerner), and human resources, and will benefit from the ease of implementing internally developed and delivered IDP tools. This could disrupt the IDP vendors who have specialized in marketing within the Oracle ecosystem.
Advice to Buyers
If you are an Oracle house with Oracle applications and databases, there is no downside or risk to asking your dev team to try OCI Vision. Buyers with existing complex IDP applications may want to wait for more robust versions. Keep in mind this is version 0.9 and expect Oracle to add more functionality as it receives feedback from the first wave of users.
The standalone pricing for OCI Vision is competitive with other cloud document AI services from Microsoft, Amazon, and Google. However, one must consider the total cost of ownership (TCO) for the Oracle stack, including the cost of cloud services, storage, database, and CPU/GPU.
- Integration within Oracle Cloud infrastructure
- Data scientists and their access to training data
- Enable the Intelligent Lakehouse
- Become the go-to document AI for Oracle customers
- Develop more industry- and form-specific modules
- Expand rapidly across the Oracle ecosystem
- Some early enterprise adopters
- Document AI is part of a much broader AI services suite
Attribution-NonCommercial-NoDerivatives 4.0 International
CC BY-NC-ND 4.0 license