
Revolutionizing Conversational AI: Microsoft Unveils Agentic Retrieval in Azure AI Search
2025-05-31
Author: Charlotte
Microsoft's Groundbreaking Leap in AI Conversational Systems
In a game-changing announcement, Microsoft has unveiled the public preview of agentic retrieval within Azure AI Search, a revolutionary query engine designed to tackle complex questions with unparalleled efficiency. This innovative technology promises to boost answer relevance in conversational AI by up to a staggering 40% compared with traditional Retrieval-Augmented Generation (RAG) methods.
It’s not just about providing answers; this multi-turn system taps into conversation history and utilizes Azure OpenAI to deconstruct queries into targeted subqueries, executing them simultaneously across text and vector embeddings.
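The decompose-and-parallelize pattern can be sketched in a few lines. This is a minimal illustration, not the service's implementation: the planner and subquery executor are hypothetical stubs standing in for the Azure OpenAI planning call and the text/vector index lookups.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the LLM planner and the search index;
# in Azure AI Search, agentic retrieval performs these steps server-side.
def plan_subqueries(chat_history, query):
    """Deconstruct a complex query into focused subqueries (stubbed)."""
    # A real planner would call an Azure OpenAI model with the chat thread.
    return [f"{query} - definition", f"{query} - examples"]

def run_subquery(subquery):
    """Execute one subquery against text and vector indexes (stubbed)."""
    return [{"subquery": subquery, "doc": f"result for {subquery!r}"}]

def agentic_retrieve(chat_history, query):
    subqueries = plan_subqueries(chat_history, query)
    # Subqueries execute simultaneously, as in the service.
    with ThreadPoolExecutor() as pool:
        result_lists = pool.map(run_subquery, subqueries)
    return [hit for hits in result_lists for hit in hits]

hits = agentic_retrieve(["user: tell me about RAG"], "agentic retrieval")
```

The key point is that the planner, not the user, decides how many subqueries to fan out, and the results are merged before ranking.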
How It Works: Streamlined Retrieval Process
With its robust new features, agentic retrieval is accessible through a newly introduced Knowledge Agents object in the 2025-05-01-preview data plane REST API and Azure SDK prerelease packages. This functionality builds on the established index of Azure AI Search, linking a dedicated "Agent" resource with Azure OpenAI, while the retrieval engine orchestrates the entire process.
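As a rough sketch of what a call to the preview API looks like, the snippet below builds (but does not send) a retrieval request. The endpoint path, agent name, and request-body shape here are assumptions for illustration only; consult the published 2025-05-01-preview REST reference for the actual contract.

```python
import json

# Hypothetical service endpoint; the path and payload below are assumed,
# not copied from the official preview reference.
ENDPOINT = "https://my-search-service.search.windows.net"
API_VERSION = "2025-05-01-preview"

def build_retrieve_request(agent_name, messages):
    """Assemble the URL and JSON body for a knowledge-agent retrieval call."""
    url = f"{ENDPOINT}/agents/{agent_name}/retrieve?api-version={API_VERSION}"
    body = {"messages": messages}  # the full chat thread, not just the last turn
    return url, json.dumps(body)

url, payload = build_retrieve_request(
    "earth-knowledge-agent",  # hypothetical agent name
    [{"role": "user", "content": "Why is the sky blue, and how does that relate to sunsets?"}],
)
```

Note that the whole conversation is sent, since the engine uses chat history when planning subqueries.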
Matthew Gotteiner from Microsoft elaborated during a recent Build conference, shedding light on the intricacies of the agentic retrieval approach. The journey begins with a large language model (LLM) analyzing the entire chat thread to pinpoint the essential information. This is followed by a meticulous planning stage that incorporates both chat history and the original query.
Speed vs. Complexity: The Balancing Act
One intriguing aspect of this process is the relationship between speed and the number of subqueries generated. While the system is designed to boost efficiency by running subqueries in parallel, Gotteiner noted that a complex, multi-faceted query could take longer to resolve. Interestingly, utilizing a more straightforward query planner that produces fewer broad subqueries may yield quicker results than a detailed planner aimed at generating numerous highly focused subqueries.
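A back-of-envelope latency model makes the trade-off concrete: because subqueries run in parallel, retrieval time is roughly the planning time plus the slowest subquery, so a heavyweight planner can dominate end-to-end latency. The numbers below are illustrative, not measured figures.

```python
# Illustrative latency model (all values in milliseconds, made up for the sketch).
def total_latency(planning_ms, subquery_latencies_ms):
    """Parallel subqueries: retrieval is bounded by the slowest one."""
    return planning_ms + max(subquery_latencies_ms)

# A detailed planner: slower planning, many narrowly focused subqueries.
detailed = total_latency(planning_ms=900, subquery_latencies_ms=[120, 150, 110, 140, 130])

# A straightforward planner: quicker planning, a couple of broad subqueries.
simple = total_latency(planning_ms=300, subquery_latencies_ms=[200, 220])
```

Under these assumed numbers the simple planner finishes first, even though each of its subqueries is individually slower, which matches Gotteiner's observation.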
Enhanced Results and Insights
Once the subqueries are executed, the results are refined using Azure's semantic ranker, resulting in a cohesive grounding payload that includes the top findings alongside structured metadata. Additionally, the API provides a comprehensive activity log of the entire retrieval process, ensuring transparency and insight into how the system operates.
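The shape of that output can be sketched as follows. The scoring here is a trivial stand-in for Azure's semantic ranker, and the payload field names are assumptions for illustration, but the structure mirrors what the article describes: top-ranked results with metadata, plus an activity log of the retrieval steps.

```python
# Sketch: rerank merged subquery hits, keep the top results, and record
# an activity log. The "score" field stands in for the semantic ranker.
def build_grounding_payload(hits, top_k=3):
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    activity = [{"step": "semantic_rank",
                 "input_count": len(hits),
                 "output_count": min(top_k, len(ranked))}]
    return {
        "results": [{"content": h["content"], "metadata": h["metadata"]}
                    for h in ranked[:top_k]],
        "activity": activity,
    }

hits = [
    {"content": "Rayleigh scattering...", "score": 2.9, "metadata": {"doc_id": "a1"}},
    {"content": "Sunset colors...", "score": 2.4, "metadata": {"doc_id": "b2"}},
    {"content": "Unrelated passage", "score": 0.7, "metadata": {"doc_id": "c3"}},
    {"content": "Ozone layer", "score": 1.1, "metadata": {"doc_id": "d4"}},
]
payload = build_grounding_payload(hits)
```

The activity log is what gives developers the transparency mentioned above: each step's inputs and outputs can be inspected after the fact.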
The Future of Intelligent Knowledge Retrieval
In a compelling summary, Akshay Kokane, a Software Engineer at Microsoft, shared insights in a blog post, emphasizing the shortcomings of traditional RAG systems in meeting the demands of complex enterprise scenarios. He argued that while these systems are excellent for enhancing large language models (LLMs) with specific domain knowledge, their static, linear workflows are often insufficient.
Enter Agentic RAG (ARAG). This dynamic approach fills the gaps by introducing intelligent reasoning, adaptive tool selection, and iterative refinement. This allows agents to modify their search strategies, evaluate results, and craft more precise, context-rich answers — making them perfectly suited for the ever-evolving requirements of business, compliance processes, and multi-source data environments.
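The retrieve-evaluate-refine loop at the heart of agentic RAG can be sketched minimally. Everything below is a toy: the corpus, the sufficiency check, and the refinement step are stubs where a real agent would delegate judgment to an LLM and real search tools.

```python
# Toy corpus standing in for an enterprise knowledge base.
def retrieve(query):
    corpus = {
        "rag basics": ["RAG augments LLMs with retrieved context."],
        "rag limitations enterprise": ["Static RAG pipelines struggle with multi-step questions."],
    }
    return corpus.get(query, [])

def is_sufficient(results):
    """Stub evaluator; a real agent would ask an LLM to judge the results."""
    return len(results) > 0

def refine(query):
    """Stub refinement; a real agent would rewrite the query or switch tools."""
    return query + " enterprise"

def agentic_answer(query, max_rounds=3):
    for _ in range(max_rounds):
        results = retrieve(query)
        if is_sufficient(results):
            return results
        query = refine(query)  # adapt the search strategy and try again
    return []

answer = agentic_answer("rag limitations")
```

The first retrieval misses, the agent refines its query, and the second attempt succeeds: exactly the iterative, self-correcting behavior that distinguishes ARAG from a static, single-pass pipeline.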