Artificial intelligence has the potential to unlock real value from documentation. In this second part of his series on applied AI, TEKenable’s Mohammad Zeeshan Khan explains how Azure AI Document Intelligence can augment search and automate document processing.
Azure Cognitive Services Form Recognizer, now known as Azure AI Document Intelligence¹, is a cloud-based Azure AI service that uses machine-learning models to automate your data processing in applications and workflows⁴. It applies advanced machine learning to extract text, key-value pairs, tables, and structures from documents automatically and accurately². This service is essential for enhancing data-driven strategies and enriching document search capabilities⁴.
What are the key features of Azure AI Document Intelligence?
Azure AI Document Intelligence offers three types of models¹:
1. Document Analysis Models:
These models extract text from forms and documents and return structured, business-ready content that your organization can act on¹. Depending on the model, they can extract printed and handwritten text only, text together with document structure, or text, structure, and key-value pairs¹.
2. Prebuilt Models:
These models enable you to add intelligent document processing to your apps and flows without having to train and build your own models¹. They can extract customer and vendor details from invoices, sales transaction details from receipts, identification and verification details from identity cards, health insurance details from health insurance cards, business contact details from business cards, agreement and party details from contracts, taxable compensation details from W2 forms, student loan interest details from US Tax 1098-E forms, mortgage interest details from US Tax 1098 forms, and qualified tuition details from US Tax 1098-T forms¹.
3. Custom Models:
These models are trained using your labelled datasets to extract distinct data from forms and documents, specific to your use cases¹. Standalone custom models can be combined to create composed models¹. Custom extraction models are trained to extract labelled fields from documents¹.
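As a rough illustration of how an application might select one of these models, the sketch below builds (but does not send) an analyze request. The model IDs, the URL path, the API version, and the header name are assumptions based on the service's v3.1 REST conventions and should be checked against the documentation; `build_analyze_request` is a hypothetical helper, and the endpoint and key are placeholders.

```python
from urllib.parse import quote

# Assumed model IDs; verify them against the documentation for your API version.
MODEL_IDS = {
    "read": "prebuilt-read",        # document analysis: text only
    "layout": "prebuilt-layout",    # document analysis: text and structure
    "invoice": "prebuilt-invoice",  # prebuilt model
    "receipt": "prebuilt-receipt",  # prebuilt model
}

API_VERSION = "2023-07-31"  # assumed API version


def build_analyze_request(endpoint: str, key: str, model: str, document_url: str) -> dict:
    """Describe the HTTP request for analysing a document (nothing is sent)."""
    # Names not in MODEL_IDS are passed through, e.g. the ID of a custom model.
    model_id = MODEL_IDS.get(model, model)
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{quote(model_id, safe='')}:analyze?api-version={API_VERSION}"
    )
    return {
        "method": "POST",
        "url": url,
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        "json": {"urlSource": document_url},
    }


if __name__ == "__main__":
    request = build_analyze_request(
        "https://example.cognitiveservices.azure.com",  # placeholder endpoint
        "<your-key>",                                   # placeholder key
        "invoice",
        "https://example.com/sample-invoice.pdf",       # placeholder document
    )
    print(request["url"])
```

Keeping the model name as a parameter means the same code path can serve a document analysis, prebuilt, or custom model.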
An example of solution architecture
Azure AI Document Intelligence can be used to build an automated document processing pipeline⁶. Here’s an example of how it can be integrated into a typical business process:
1. Data Ingestion and Extraction:
Documents are ingested through a browser at the front end of a web application⁶. The back-end application posts a request to a Form Recognizer REST API endpoint that uses one of the models mentioned above⁶. The response from Form Recognizer contains raw OCR data and structured extractions⁶. The App Service back-end application uses the confidence values to check the extraction quality⁶. When the extraction quality meets requirements, the data enters Azure Cosmos DB for downstream application consumption⁶.
2. Data Enrichment:
The pipeline used for data enrichment depends on the use case⁶. Data enrichment can include named entity recognition (NER), the extraction of personal information, key phrases, health information, and other domain-dependent entities⁶.
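The confidence check in the ingestion step could look something like the sketch below. It assumes each extracted field arrives as a dictionary with `content` and `confidence` keys. The 0.8 threshold, the field names, and both helper functions are illustrative choices, not values from the service or the reference architecture.

```python
def split_by_confidence(fields: dict, threshold: float = 0.8) -> tuple:
    """Split extracted fields into accepted values and values needing review."""
    accepted, needs_review = {}, {}
    for name, field in fields.items():
        # A field without a confidence score is treated as low confidence.
        confidence = field.get("confidence", 0.0)
        target = accepted if confidence >= threshold else needs_review
        target[name] = field.get("content")
    return accepted, needs_review


def meets_requirements(fields: dict, required: list, threshold: float = 0.8) -> bool:
    """True when every required field was extracted with enough confidence."""
    accepted, _ = split_by_confidence(fields, threshold)
    return all(name in accepted for name in required)


if __name__ == "__main__":
    # Hypothetical extraction result for an invoice.
    extracted = {
        "VendorName": {"content": "Contoso", "confidence": 0.95},
        "InvoiceTotal": {"content": "$110.00", "confidence": 0.42},
    }
    if meets_requirements(extracted, ["VendorName", "InvoiceTotal"]):
        print("Store in the database")
    else:
        print("Route to manual review")
```

In a real pipeline the threshold would typically be tuned per field, since a misread total usually costs more than a misread vendor name.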
How to use Azure AI to extract text from images in SharePoint?
Let’s look at how Azure AI Document Intelligence fits into a larger solution architecture to solve a real-world business use case.
Have you ever wanted to search for text that’s embedded in images, such as diagrams, charts, or shapes? If you have a lot of documents that contain such images, you might find it hard to manually scan them for relevant information. Fortunately, there is a solution that can help you automate this process and make your documents more searchable and accessible.
We can use Azure AI to extract text from images stored in SharePoint. By using AI Builder and Azure Form Recognizer, you can configure a Power Automate workflow to use a trained model to extract text from an image. Once you’ve configured a workflow, you can quickly search documents for meaningful text that’s embedded in shapes and objects.
The architecture
The following diagram shows the architecture of the solution:
The solution consists of the following components:
- AI Builder: A Power Platform capability that lets you train models to recognise objects in images. You can also use prebuilt models for object detection.
- Form Recognizer: An Azure Cognitive Service that uses machine-learning models to extract and analyse form fields, text, and tables from your documents.
- Power Automate: An online workflow service that automates actions across apps and services.
- Azure Functions: An event-driven serverless compute platform that runs on demand and at scale in the cloud.
- PnP Modern Search: A set of SharePoint Online modern web parts that let you create highly flexible and personalised search-based experiences.
The solution works as follows:
- An object detection model is trained in AI Builder to recognise objects that you specify, such as pumps, valves, switches, etc.
- A new document enters a SharePoint document library, OneDrive, or Teams.
- Power Automate runs the AI Builder model on the document and returns a JSON file that contains the pixel coordinates of any detected objects.
- Power Automate sends the document to Form Recognizer for a full optical character recognition (OCR) scan and returns a JSON file that contains scanned-in text and pixel coordinates of the text.
- Power Automate runs a function in Azure Functions that analyses the pixel coordinates in the AI Builder and Form Recognizer output files. If detected objects intersect with scanned-in text, the function returns the matched data in a JSON file.
- Power Automate enters the metadata, or the text from detected objects, into a document library.
- Users search for the metadata by using PnP Modern Search web parts.
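The matching step that runs in Azure Functions can be sketched as a rectangle intersection test. This is a minimal illustration, assuming both outputs have already been converted to pixel rectangles in the same coordinate space. The `Box` type, the function names, and the sample labels are hypothetical and not taken from the actual solution.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def intersects(a: Box, b: Box) -> bool:
    """True when the two rectangles overlap; touching edges do not count."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def match_text_to_objects(objects: list, words: list) -> list:
    """Pair each detected object with the OCR text that overlaps it.

    objects: list of (label, Box) pairs from the object detection model.
    words:   list of (text, Box) pairs from the OCR scan.
    """
    matches = []
    for label, object_box in objects:
        texts = [text for text, word_box in words if intersects(object_box, word_box)]
        if texts:
            matches.append({"object": label, "text": " ".join(texts)})
    return matches


if __name__ == "__main__":
    detected = [("pump", Box(100, 100, 200, 200)), ("valve", Box(400, 400, 450, 450))]
    scanned = [("P-101", Box(120, 150, 180, 170)), ("NOTE", Box(300, 10, 350, 30))]
    print(match_text_to_objects(detected, scanned))
```

Objects with no overlapping text are dropped from the result, so only text that sits on a detected component becomes searchable metadata.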
The benefits of Azure AI Document Intelligence
By using this solution, you can:
- Save time and effort by automating the extraction of text from images in your documents.
- Improve the searchability and accessibility of your documents by adding metadata that reflects the content of the images.
- Enhance your document management and analysis by using AI to identify and extract relevant information from complex diagrams.
The use cases for this approach
This solution can be applied to various types of documents that contain images with embedded text, such as:
- Complicated engineering schematic diagrams that show various types of components. By using this solution, you can quickly search for specific components on a diagram. This can help you with investigations, exposing shortages, or looking for recall and failure notices.
- Industrial diagrams that show the components in a manufacturing assembly. This solution can help you identify pumps, valves, automated switches, and other components. This can help you with preventative maintenance, isolating hazardous components, and increasing the visibility of risk management in your organization.
The steps to implement Azure AI
To implement this solution, you need to follow these steps:
- Train an object detection model in AI Builder by using your own images or prebuilt models.
- Create a Power Automate workflow that triggers when a new document is added to a document library, OneDrive, or Teams.
- Add an action to run the AI Builder model on the document and store the output JSON file in a variable.
- Add an action to send the document to Form Recognizer for OCR scan and store the output JSON file in another variable.
- Add an action to call an Azure Function that takes the two JSON files as input and returns the matched data as output.
- Add an action to update the document properties with the metadata from the Azure Function output.
- Configure PnP Modern Search web parts to display the metadata in SharePoint.
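Before the Azure Function can compare the two JSON outputs, their coordinates need a common form, and its result has to fit into a document property. The sketch below shows one way this might be done. It assumes the OCR output describes each text region as a flat list of x, y corner values, and that the object detection output uses a box with `left`, `top`, `width`, and `height` expressed as fractions of the page size. Both shapes are assumptions to verify against your actual outputs, and all function names are hypothetical.

```python
def polygon_to_box(polygon: list) -> tuple:
    """Reduce a flat [x1, y1, x2, y2, ...] polygon to (left, top, right, bottom)."""
    xs = polygon[0::2]  # every even index is an x value
    ys = polygon[1::2]  # every odd index is a y value
    return (min(xs), min(ys), max(xs), max(ys))


def relative_to_pixel_box(relative: dict, page_width: float, page_height: float) -> tuple:
    """Scale a fractional box to pixel coordinates (left, top, right, bottom)."""
    left = relative["left"] * page_width
    top = relative["top"] * page_height
    right = (relative["left"] + relative["width"]) * page_width
    bottom = (relative["top"] + relative["height"]) * page_height
    return (left, top, right, bottom)


def to_metadata_value(matches: list) -> str:
    """Flatten matched pairs into one string for a text column, without repeats."""
    entries = [f"{match['object']}: {match['text']}" for match in matches]
    # dict.fromkeys removes duplicates while preserving the original order.
    return "; ".join(dict.fromkeys(entries))


if __name__ == "__main__":
    print(polygon_to_box([10, 20, 110, 22, 110, 42, 10, 40]))
    print(relative_to_pixel_box(
        {"left": 0.25, "top": 0.5, "width": 0.25, "height": 0.25}, 800, 400))
    print(to_metadata_value([
        {"object": "pump", "text": "P-101"},
        {"object": "valve", "text": "V-7"},
    ]))
```

A single delimited string is only one option; a multi-value or managed metadata column may suit search refiners better.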
Conclusion: AI-driven efficiency in document processing
Azure AI Document Intelligence is a game-changer for businesses looking to automate their document processing workflows. It not only reduces manual labour but also increases efficiency by providing accurate and structured data extraction. By integrating this service into their business processes, organisations can focus more on acting on information rather than compiling it².
I showed you how to use Azure AI to extract text from images in SharePoint. This solution can help you make your documents more searchable and accessible by using AI Builder and Azure Form Recognizer to identify and extract relevant information from complex diagrams.
I hope you found this useful and interesting. If you have any questions or feedback, please leave a comment below. Thanks for reading!
Sources:
- What is Azure AI Document Intelligence (formerly Form Recognizer …. https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/overview?view=doc-intel-3.1.0.
- Azure AI Document Intelligence documentation. https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/?view=doc-intel-3.1.0.
- Form Recognizer – Automated Data Processing Systems | Microsoft Azure. https://azure.microsoft.com/en-in/products/form-recognizer/.
- Automate document processing with Azure Form Recognizer – Azure …. https://learn.microsoft.com/en-us/azure/architecture/ai-ml/architecture/automate-document-processing-azure-form-recognizer.
- azure – ai-form-recognizer vs. cognitiveservices-computervision – Stack …. https://stackoverflow.com/questions/71071309/ai-form-recognizer-vs-cognitiveservices-computervision.
Frequently Asked Questions
1. What is Azure AI Document Intelligence?
Azure AI Document Intelligence (formerly known as Azure Form Recognizer) is a cloud-based service that leverages machine learning models to extract data from documents. It can automatically identify text, tables, and key-value pairs from a variety of document types, helping businesses automate document processing and improve search capabilities.
2. How does Azure AI Document Intelligence work?
Azure AI Document Intelligence uses three main types of models: Document Analysis Models, Prebuilt Models, and Custom Models. These models extract and analyze text, structure, and data from documents. Once extracted, the data can be integrated into business processes, improving automation and decision-making.
3. What are the key features of Azure AI Document Intelligence?
Key features include:
- Document Analysis Models: Extract printed and handwritten text, key-value pairs, and document structures.
- Prebuilt Models: Pretrained to recognize common document types like invoices, receipts, and contracts.
- Custom Models: Tailored to your specific business needs by training models with your data.
4. How does Azure AI Document Intelligence improve document search?
By extracting and structuring text from documents, Azure AI Document Intelligence allows businesses to enrich metadata and make documents more searchable. This is especially beneficial for documents with embedded text in images, such as diagrams or charts.
5. What are some use cases for Azure AI Document Intelligence?
This service is ideal for processing documents with complex layouts or embedded text in images. Use cases include:
- Analyzing engineering diagrams or industrial schematics.
- Automating invoice and receipt processing.
- Improving document searchability for regulatory and compliance purposes.
6. How can I use Azure AI Document Intelligence to extract text from images in SharePoint?
You can integrate Azure AI Document Intelligence with SharePoint by using Power Automate to create workflows that automatically detect and extract text from images. This solution enhances document searchability by transforming complex images into accessible text.
7. What is the architecture of the Azure AI solution for document processing?
The solution involves several components:
- AI Builder: Trains models for object recognition.
- Form Recognizer: Performs optical character recognition (OCR) on documents.
- Power Automate: Automates workflows based on document events.
- Azure Functions: Analyzes data and processes metadata.
- PnP Modern Search: Allows for custom document search experiences.
8. How do I implement Azure AI Document Intelligence?
To implement the solution, you need to:
- Train an object detection model using AI Builder.
- Set up Power Automate to trigger actions when new documents are added.
- Use Form Recognizer for OCR and combine data from AI Builder and Form Recognizer using Azure Functions.
- Enrich the document metadata and integrate it into SharePoint for better search functionality.
9. What are the benefits of using Azure AI Document Intelligence?
The main benefits include:
- Time-saving automation: Automates text extraction, reducing manual effort.
- Improved searchability: Makes documents more searchable by adding structured metadata.
- Enhanced document management: Allows businesses to extract and analyze critical data from complex documents efficiently.
10. How can Azure AI Document Intelligence help with regulatory and compliance documents?
By automating the extraction and structuring of information from regulatory documents, businesses can ensure they meet compliance standards more efficiently, with improved accuracy and speed in processing documentation.
11. Can Azure AI Document Intelligence be customized for my business needs?
Yes, Azure AI Document Intelligence offers custom models that can be trained on your specific data to meet your business needs. This flexibility allows you to tailor the solution to your particular industry and document types.