Enrich your RAG data with AI-generated context using Azure AI Search and Azure AI Foundry SLMs and LLMs. This demo shows how to integrate custom skills into your Azure AI Search indexing pipeline and adapt prompts to improve response accuracy. Leverage Azure AI Search's indexing capabilities for efficient data transformations, with practical applications like image captioning and document summarization, tailored for precise and relevant responses.
Retrieval Augmented Generation (RAG) applications often require data preparation processes, including extracting text and images, chunking, and vectorizing data, to enhance chatbot interactions with more insightful responses. Azure AI Search's AI enrichment feature enables these capabilities by allowing advanced data transformations directly within the indexing pipeline. This process includes not only natively supported operations such as optical character recognition (OCR) and integrated vectorization but also supports the addition of personalized logic tailored to specific business needs via custom skills. Organizations can leverage Azure AI Foundry's Large Language Models (LLMs) and Small Language Models (SLMs) to enrich RAG-required data chunks with detailed descriptions, summaries, and metadata. The demo discussed in this blog post showcases an end-to-end implementation of incorporating LLMs/SLMs custom prompts during the data preparation phase to enhance these data chunks, providing a practical guide to adjusting your Azure AI Search index content to meet your RAG apps business needs.
Why Integrate LLMs and SLMs into Azure AI Search Indexing Pipeline?
Adaptable Data Transformation and Customization: Advanced AI models, such as Azure OpenAI's GPT-4o and Microsoft's Phi35-vision, enable businesses to transform data in adaptable ways tailored to specific scenarios. The ability to customize prompts and leverage the Azure AI Model Catalog empowers organizations to tune data transformation processes to meet unique needs. This flexibility supports diverse applications, from image captioning to document summarization, with effectiveness varying by scenario.
Flexible Model Selection: Azure AI Foundry provides a range of models to choose from, allowing users to select the most suitable options based on their data type and transformation requirements.
Key Scenarios and Use Cases
Here are some scenarios where a custom skill calling an LLM/SLM can help you.
- Image Captioning: Automatically generating descriptive captions for images enhances searchability, particularly useful in industries where technical diagrams need translation into text.
- Document Summarization: Summarizing lengthy documents saves time and effort. Concise summaries enable users to quickly grasp key elements, though effectiveness may differ based on document complexity.
- Entity Extraction: Extracting key entities from documents augments and enriches search indexes, essential for contexts where metadata classification is crucial.
- Metadata Generation: Using models like Phi-3 or Azure OpenAI GPT-4o to generate metadata tags enhances document classification and retrieval.
- Image Verbalization: Converting images into descriptive text provides an additional layer of information, helping to improve accessibility and context understanding.
Understanding Custom Skills in Azure AI Search
Azure AI Search skills are powerful tools, components of AI enrichment, integrated into the Azure AI Search indexing pipeline, designed to perform various data transformations such as text analysis, image processing, and more. These skills are defined through a skillset, which can include both natively supported skills and custom skills. Custom skills in Azure AI Search allow developers to define targeted tasks for precise data enrichment using their own logic. By creating a skillset, users can utilize natively supported skills and also leverage Azure Functions to deploy custom logic.
One of the key advantages of custom skills is that they enable developers to run their own custom code as part of the ingestion pipeline, while still benefiting from the robust features of the Azure AI Search built-in indexers. This includes support for a variety of data sources, incremental change tracking, and scheduling. This integration allows developers to focus on specific enrichment tasks without having to manage the entire indexing process manually. By using custom skills, developers can efficiently address specific data processing needs and scenarios, leveraging the Azure AI Search platform to create tailored solutions. This approach is more efficient than building a custom indexing pipeline from scratch, as it takes advantage of existing infrastructure and capabilities provided by Azure AI Search, ensuring a seamless and scalable data indexing process.
How to Get Started
Get ready to explore the Azure AI Search with custom skills enabling LLMs/SLMs demo. It demonstrates practical scenarios such as image captioning, document summarization, and entity extraction, offering flexibility through the Azure AI Model Catalog LLMs/SLMs.
This demo also showcases the integration of Azure's built-in indexers, chunking and vectorization to enhance search capabilities. Follow the step-by-step process outlined in the GitHub repo demo sample. Here's a quick overview of the setup process and the features leveraged.
Key Steps
- Environment Setup: Clone the repository and install dependencies using the provided requirements file.
- Azure Services:
- Azure AI Search: Set up an Azure AI Search service to index and search your data.
- Blob Storage: Use an Azure Blob Storage container as your data source.
- Azure Functions: Deploy Azure Functions for custom skills from the Azure AI Search Power Skills repository.
- Configuration: Set environment variables to connect your setup with Azure services. This includes configuring endpoints and API keys for the models you'll use.
Leveraged Features
- Built-in Indexers: Automatically extract, enrich and index data from Azure AI Search supported data sources.
- Integrated Vectorization: Enables document chunking and vector embedding creation, facilitating out-of-the-box vector similarity search.
Next Steps
What’s new in Azure AI Search?
Explore GitHub Power skills repo for examples of custom skills for Azure AI Search.