Focus on search: Set your Generative AI projects up for success

When designing Retrieval Augmented Generation (RAG) AI solutions, search is paramount. A robust and accurate search mechanism ensures that the most relevant information from your knowledge base is retrieved, enhancing the quality and effectiveness of your AI application. So, how can you get the most out of your search capabilities in Azure AI Search? Let's explore this topic further.

Importance of Search in Generative AI Solutions

Imagine your generative AI app is a knowledgeable assistant responding to user queries. For this assistant to provide insightful answers, it needs access to the right information.

This is where search comes in. If your search mechanism is poorly designed, your AI might retrieve irrelevant or incomplete data, leading to inaccurate and unhelpful responses. Investing time in fine-tuning your search process is therefore central to a successful RAG solution.

Organisations building generative AI solutions often underestimate the effort required in search design. The model gets the attention, but search quality determines whether the model has the right material to work with.

Azure AI Search: What It Is and What It Does

Azure AI Search is a cloud-based search service that allows you to quickly and easily add a robust search experience to your applications.

Some of the key features of Azure AI Search that promote good search quality and performance include:

  • Semantic Ranker. This feature uses AI to reorder search results based on how semantically relevant they are to the query, so results are ranked by meaning rather than by keyword matching alone. The semantic ranker is generally available (GA). It is a premium feature: a limited free monthly quota is offered and usage beyond that is billed separately, so check the current pricing page and tier availability before relying on it.
  • Scoring Profiles. These allow you to control how documents are ranked based on factors such as freshness, location, or price. For example, you can create a scoring profile that boosts the ranking of recently updated documents or those near the user's location. You can use scoring profiles in combination with both keyword search and vector search.
  • AI Enrichment. This allows you to extract text and information from content that can't be indexed for full-text search in its raw form, such as images and scanned documents. AI enrichment makes your search results more comprehensive and relevant.
  • Vector Search. This feature allows you to search for documents that are semantically similar to a query. You can use vector search to find relevant documents even if they don't contain the exact keywords. Vector search is generally available.
  • Hybrid Search. This feature combines keyword search and vector search, giving you the best of both approaches. Use keyword search for documents containing specific terms, and vector search for documents semantically similar to a query.
  • Integrations. Azure AI Search integrates with other Azure services, such as Azure Blob Storage, Azure SQL Database, Azure Cosmos DB, and Azure OpenAI, making it straightforward to search your data regardless of where it is stored.
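
To make the scoring profile idea above more concrete, here is a minimal sketch of a freshness-boosting profile, written as a plain Python dictionary in the shape an index definition uses. The profile name, the `lastUpdated` field, the boost value, and the 30-day window are all assumptions chosen for illustration. The field referenced would need to be a filterable date field in your own index.

```python
import json

def freshness_profile(field_name: str, boost: float, duration: str) -> dict:
    """Build a scoring profile that boosts recently updated documents.

    duration is a day-time duration string such as "P30D" (30 days).
    """
    return {
        "name": "boost-recent",  # hypothetical profile name
        "functions": [
            {
                "type": "freshness",
                "fieldName": field_name,  # assumed to be a filterable date field
                "boost": boost,
                "interpolation": "linear",
                "freshness": {"boostingDuration": duration},
            }
        ],
    }

# "lastUpdated" is a hypothetical field name used only for this example.
profile = freshness_profile("lastUpdated", boost=2.0, duration="P30D")
print(json.dumps(profile, indent=2))
```

The dictionary would sit in the scoringProfiles array of an index definition and be selected at query time by name.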

Four Stages of Building a Search Pipeline

Building an effective search pipeline involves four stages, each of which affects the quality of what the model eventually sees:

Stage 1: Data Ingestion

Before you can search your data, you need to ingest it into Azure AI Search. Data can come from a variety of sources such as Azure Blob Storage, Azure SQL Database, Azure Cosmos DB, and OneLake. Indexers, a feature of Azure AI Search, automate this process by extracting searchable content from your data sources and transforming it into JSON documents for indexing. Indexers can also perform change and deletion detection to keep your search index current.
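
The change and deletion detection mentioned above can be pictured with a small, self-contained sketch. It is not how indexers are implemented internally. It only illustrates the general idea of a high-water mark (process rows modified since the last run) combined with a soft-delete flag. The `SourceRow` type, the `sync` function, and the in-memory `index` dictionary are hypothetical names invented for this example.

```python
from dataclasses import dataclass

@dataclass
class SourceRow:
    key: str
    content: str
    modified: int             # increasing version or timestamp
    is_deleted: bool = False  # soft-delete marker

def sync(rows: list[SourceRow], index: dict, high_water_mark: int) -> int:
    """Apply rows changed since high_water_mark and return the new mark."""
    new_mark = high_water_mark
    for row in rows:
        if row.modified <= high_water_mark:
            continue  # unchanged since the previous run
        if row.is_deleted:
            index.pop(row.key, None)  # remove soft-deleted documents
        else:
            index[row.key] = {"id": row.key, "content": row.content}
        new_mark = max(new_mark, row.modified)
    return new_mark

index: dict = {}
rows = [SourceRow("1", "alpha", 1), SourceRow("2", "beta", 2)]
mark = sync(rows, index, 0)                      # first run indexes both rows
rows[1] = SourceRow("2", "beta", 3, is_deleted=True)
rows.append(SourceRow("3", "gamma", 4))
mark = sync(rows, index, mark)                   # second run picks up changes only
print(sorted(index), mark)  # ['1', '3'] 4
```

Only rows newer than the stored mark are touched on the second run, which is what keeps repeated indexing runs cheap.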

Stage 2: Chunking Strategy

Large documents often need to be divided into smaller, manageable units called chunks. This process, known as chunking, is important for optimising RAG responses and performance. Chunking lets several retrieved passages fit within an LLM's context window at once, gives the ranker passage-level units to score for relevance, and keeps each piece of text within the input limit of the embedding model used for vector search.

The Document Layout skill available in Azure AI Search offers a structure-aware chunking approach, breaking content into headings and semantically coherent chunks like paragraphs and sentences. This skill uses the layout model in Document Intelligence to identify document structure and represent it in JSON using Markdown syntax. Choosing a suitable chunk size and overlap, and preserving sentence boundaries for semantic coherence, is key for effective chunking.
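
As a rough illustration of chunk size, overlap, and sentence boundaries working together, the sketch below groups whole sentences into chunks and repeats the last sentence of each chunk at the start of the next. It is a simplified stand-in, not the Document Layout skill: it measures size in characters rather than tokens, and the `chunk_text` function and its parameters are assumptions made for this example.

```python
import re

def chunk_text(text: str, max_chars: int = 200, overlap_sentences: int = 1) -> list[str]:
    """Group whole sentences into chunks of at most max_chars characters.

    The last overlap_sentences sentences of a chunk are repeated at the start
    of the next chunk when they fit. A single sentence longer than max_chars
    is kept whole as its own chunk rather than being cut mid-sentence.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence])) > max_chars:
            chunks.append(" ".join(current))
            carried = current[-overlap_sentences:] if overlap_sentences > 0 else []
            # Drop the overlap if it would push the next chunk over the limit.
            if len(" ".join(carried + [sentence])) > max_chars:
                carried = []
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
for chunk in chunk_text(text, max_chars=32):
    print(chunk)
# One two three. Four five six.
# Four five six. Seven eight nine.
# Ten eleven twelve.
```

In practice you would size chunks against the token limit of your embedding model and tune the overlap against retrieval results, rather than pick the numbers up front.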

Stage 3: Types of Indexes and Search Strategies

Azure AI Search offers three main types of indexes and corresponding search strategies:

  • Keyword Search: This traditional approach breaks content into terms, creating inverted indexes for quick retrieval. Keyword search excels at finding exact matches but can struggle with semantic understanding. It uses the BM25 probabilistic model for scoring, which rates a document as more relevant when it contains the search terms often, when those terms are rare across the corpus, and when the document is not unusually long.
  • Vector Search: This technique uses embeddings (mathematical representations of text that capture semantic meaning) to find documents similar in meaning to a query. Approximate Nearest Neighbor (ANN) search algorithms, such as Hierarchical Navigable Small World (HNSW), are commonly used to efficiently find similar vectors in a large dataset. An alternative algorithm, exhaustive K-Nearest Neighbors (KNN), performs a brute-force comparison against every vector, which returns the exact nearest neighbours but requires more computational resources as the index grows.
  • Hybrid Search: This approach combines the strengths of both keyword and vector search, offering comprehensive results. Azure AI Search uses Reciprocal Rank Fusion (RRF) to merge results from both methods and produce a unified, highly relevant result set.
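
The three strategies above can be sketched end to end in a few lines of plain Python. This is a toy model under stated assumptions, not the service's implementation: `bm25_rank` uses a textbook BM25 formula with whitespace tokenisation, `knn_rank` is an exhaustive cosine-similarity search over made-up two-dimensional vectors, and `rrf` uses a constant of 60, which is a commonly cited default rather than a confirmed service setting.

```python
import math
from collections import Counter

def bm25_rank(query: str, docs: dict[str, str], k1: float = 1.2, b: float = 0.75) -> list[str]:
    """Rank matching document ids by a textbook BM25 score."""
    tokens = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    avg_len = sum(len(t) for t in tokens.values()) / len(tokens)
    scores = {}
    for doc_id, terms in tokens.items():
        tf = Counter(terms)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for t in tokens.values() if term in t)
            idf = math.log(1 + (len(tokens) - df + 0.5) / (df + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    matches = [d for d in scores if scores[d] > 0]  # drop non-matching documents
    return sorted(matches, key=scores.get, reverse=True)

def knn_rank(query_vec: list[float], vectors: dict[str, list[float]], k: int) -> list[str]:
    """Exhaustive KNN: score every vector by cosine similarity, keep the top k."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    ranked = sorted(vectors, key=lambda d: cosine(query_vec, vectors[d]), reverse=True)
    return ranked[:k]

def rrf(rankings: list[list[str]], c: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: sum 1 / (c + rank) across all rankings."""
    fused: Counter = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (c + rank)
    return sorted(fused, key=fused.get, reverse=True)

docs = {
    "a": "reset your password from the login page",
    "b": "recover account access when credentials are forgotten",
    "c": "password policy requires twelve characters",
}
vectors = {"a": [0.9, 0.1], "b": [0.8, 0.3], "c": [0.1, 0.9]}  # invented embeddings

keyword = bm25_rank("forgot password", docs)   # ['c', 'a']
vector = knn_rank([1.0, 0.2], vectors, k=2)    # ['a', 'b']
print(rrf([keyword, vector]))                  # ['a', 'c', 'b']
```

Document "a" wins because it appears in both rankings, while "b" is found only by the vector side and "c" only by the keyword side. That is the behaviour hybrid search is meant to deliver.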

Stage 4: Querying and Prompting

Crafting effective queries or prompts is the final step in retrieving the most relevant information. Experimenting with different query formulations, utilising search operators, and leveraging features like filters and facets can significantly affect the accuracy and relevance of your results. For example, you can use the search, filter, and vectorQueries parameters in a hybrid search query to refine results based on both keyword and vector criteria.
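
A hybrid request combining the search, filter, and vectorQueries parameters might be assembled as follows. This is a sketch of the request body only. The `hybrid_query` helper, the `contentVector` and `category` field names, and the three-element vector are placeholders invented for the example, and a real query vector would come from your embedding model and have many more dimensions.

```python
import json
from typing import Optional

def hybrid_query(text: str, vector: list[float], vector_field: str,
                 filter_expr: Optional[str] = None, k: int = 5, top: int = 10) -> dict:
    """Build a request body that runs a keyword query and a vector query together."""
    body = {
        "search": text,            # keyword side of the hybrid query
        "vectorQueries": [         # vector side of the hybrid query
            {"kind": "vector", "vector": vector, "fields": vector_field, "k": k}
        ],
        "top": top,                # number of fused results to return
    }
    if filter_expr:
        body["filter"] = filter_expr  # OData filter expression
    return body

body = hybrid_query(
    "historic hotels with a spa",
    vector=[0.01, -0.02, 0.03],       # placeholder, not a real embedding
    vector_field="contentVector",     # hypothetical vector field
    filter_expr="category eq 'Hotel'",  # hypothetical filterable field
)
print(json.dumps(body, indent=2))
```

Keeping the body in one helper makes it easier to vary a single parameter at a time, such as k or the filter, when comparing query formulations.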

Methods for Bulk Testing Search Results

Bulk testing of search results is essential to ensure accuracy and relevance. You can use a variety of tools and techniques for this, including:

  • Azure Search Performance Testing Tool: This open-source tool provides a framework for benchmarking the performance of your Azure AI Search service, including both query and data ingestion workloads.
  • REST Clients: Use REST clients, such as Visual Studio Code with the REST Client extension, to create and execute a range of queries for testing various search scenarios.
  • Search Explorer: The built-in Search Explorer in the Azure portal allows you to interactively test queries, refine scoring profiles, and analyse search results.
  • Azure AI Prompt Flow: With Prompt Flow, you can evaluate and compare variations of prompts, gather user feedback, and measure metrics including groundedness, relevance, and retrieval score.

By using these methods, you can thoroughly evaluate the effectiveness of your search implementation and make necessary adjustments for optimal performance.
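
Whichever tool you use, bulk testing usually comes down to running a labelled set of queries and scoring what comes back. The sketch below computes two common retrieval metrics, recall@k and mean reciprocal rank, against any search function you pass in. The `evaluate` helper, the canned results, and the document ids are all hypothetical. In practice the `search` callable would wrap a call to your own index.

```python
from typing import Callable

def evaluate(search: Callable[[str], list[str]],
             test_set: dict[str, set[str]], k: int = 5) -> dict:
    """Compute mean recall@k and mean reciprocal rank over labelled queries."""
    recalls, reciprocal_ranks = [], []
    for query, relevant in test_set.items():
        results = search(query)[:k]
        # Share of the relevant documents that appear in the top k results.
        recalls.append(len(set(results) & relevant) / len(relevant))
        # Position of the first relevant result, if any.
        first = next((i for i, d in enumerate(results, start=1) if d in relevant), None)
        reciprocal_ranks.append(1 / first if first else 0.0)
    n = len(test_set)
    return {"recall@k": sum(recalls) / n, "mrr": sum(reciprocal_ranks) / n}

# Canned results stand in for a real search call in this example.
canned = {
    "reset password": ["d2", "d1", "d9"],
    "refund policy": ["d7", "d4", "d5"],
}
test_set = {
    "reset password": {"d1", "d3"},  # ids judged relevant by a human
    "refund policy": {"d4"},
}
metrics = evaluate(lambda q: canned[q], test_set, k=3)
print(metrics)  # {'recall@k': 0.75, 'mrr': 0.5}
```

Running the same test set before and after a change to chunking, scoring profiles, or query shape gives you a like-for-like comparison instead of an impression.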

Improving the Quality and Query Performance of Search Results

Beyond the core features, there are several practices worth considering to enhance the quality and performance of your search results:

  • Optimise Your Index. Maintaining a lean and efficient index is important for performance. Periodically review your index size and schema for content reduction opportunities. Simplify your schema by limiting field attributes and using alternatives to complex types like Collections, or flatten field hierarchies where possible.
  • Optimise Your Queries. Craft your queries carefully to avoid unnecessary complexity. Limit the number of searchable fields, reduce the amount of data returned, avoid partial term searches, and simplify filters. Use search functions such as search.in instead of long chains of filter conditions, and break down complex regular expressions for better performance.
  • Leverage Hybrid Search and Semantic Ranking. Combine keyword and vector search, and enable semantic ranking to reorder results based on semantic relevance. This ensures the most relevant content appears at the top. Hybrid search plus semantic re-ranking offers higher answer recall, broader query coverage, and increased precision.
  • Refine Your Chunking Strategy. Experiment with different chunk sizes, overlaps, and boundary strategies to find the optimal configuration for your data and embedding model. The goal is to create chunks that are both semantically coherent and comprehensive, improving search relevance and recall. You can specify token chunking in the Text Split skill, allowing you to chunk by token length and set the tokeniser and any tokens that should not be split.
  • Explore Advanced Features. Query rewriting in the semantic ranker can generate more relevant results. Rescoring options for compressed vectors help balance index size and retrieval quality. The weight property on each entry in vectorQueries lets you fine-tune the influence of individual vector queries in multi-query requests.
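
As a small illustration of per-query weighting, the sketch below builds a vectorQueries array in which each entry carries its own weight. The `weighted_vector_queries` helper, the `titleVector` and `contentVector` field names, and the weight values are assumptions for the example. The check that weights are positive reflects our understanding of the service's requirement and is worth confirming against the current API reference.

```python
def weighted_vector_queries(queries: list[tuple[list[float], str, float]],
                            k: int = 10) -> list[dict]:
    """Build a vectorQueries array where each entry has its own weight.

    Each input tuple is (vector, target field, weight).
    """
    built = []
    for vector, field, weight in queries:
        if weight <= 0:
            raise ValueError("weight must be a positive number")
        built.append(
            {"kind": "vector", "vector": vector, "fields": field, "k": k, "weight": weight}
        )
    return built

embedding = [0.01, -0.02, 0.03]  # placeholder, not a real embedding
vector_queries = weighted_vector_queries([
    (embedding, "titleVector", 0.5),    # hypothetical field, reduced influence
    (embedding, "contentVector", 2.0),  # hypothetical field, increased influence
])
print([q["weight"] for q in vector_queries])  # [0.5, 2.0]
```

Suitable weights are best found empirically, by changing one value at a time and re-running a labelled test set.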

By carefully implementing these strategies, you can create a highly efficient and relevant search experience for your RAG AI solutions, leading to more accurate, informative, and valuable results for your users. A well-tuned search mechanism is the cornerstone of a successful RAG AI application.

Thinking about RAG or AI search for your organisation?

Getting search right is one of the most consequential decisions in any RAG implementation. If you are exploring how to build or improve AI-powered search for your organisation, our team has hands-on experience designing and deploying Azure AI Search solutions across a range of industries. From chunking strategy through to semantic ranking and hybrid search, we can help you build a search pipeline that delivers accurate, relevant results.

Explore our AI Accelerate Workshop for a structured way to get started, or learn more about why organisations choose Hypergen as their AI partner.
