Google Cloud has expanded the grounding capabilities of Vertex AI to help the platform generate more accurate and reliable responses. The updates aim to reduce AI hallucinations and improve the quality of generative AI applications and agents.
One key addition is dynamic retrieval for Grounding with Google Search, which is now generally available. The feature lets Gemini, Google’s large language model, decide whether to ground a user query in Google Search or answer from its own knowledge. Because grounding with Google Search incurs additional processing costs, this helps balance response quality against cost. Gemini bases the decision on whether the requested information is likely to be static, slowly changing, or rapidly evolving.
For example, when asked about recent movies, Gemini uses Google Search for the latest information. Conversely, for general questions like “What is the capital of France?” it provides an answer from its existing knowledge base without external grounding. This dynamic approach not only enhances response accuracy but also optimizes resource usage.
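The routing decision described above can be illustrated with a toy sketch. It is not Gemini's actual logic or a Vertex AI API: the keyword hints, the `classify_volatility` and `should_ground` functions, and the `ground_slow_changing` flag are all hypothetical stand-ins for a judgment that the model makes internally.

```python
from enum import Enum


class Volatility(Enum):
    STATIC = "static"
    SLOWLY_CHANGING = "slowly changing"
    RAPIDLY_EVOLVING = "rapidly evolving"


# Hypothetical keyword hints, used only for illustration.
# A real system would rely on the model's own assessment of the query.
RAPID_HINTS = ("latest", "recent", "today", "this week", "now playing")
SLOW_HINTS = ("current", "price", "population")


def classify_volatility(query: str) -> Volatility:
    """Guess how quickly the requested information is likely to change."""
    text = query.lower()
    if any(hint in text for hint in RAPID_HINTS):
        return Volatility.RAPIDLY_EVOLVING
    if any(hint in text for hint in SLOW_HINTS):
        return Volatility.SLOWLY_CHANGING
    return Volatility.STATIC


def should_ground(query: str, ground_slow_changing: bool = False) -> bool:
    """Return True when the query should be grounded in a search call.

    Rapidly evolving topics are always grounded, static ones never are,
    and slowly changing ones follow a cost-versus-freshness preference.
    """
    volatility = classify_volatility(query)
    if volatility is Volatility.RAPIDLY_EVOLVING:
        return True
    if volatility is Volatility.SLOWLY_CHANGING:
        return ground_slow_changing
    return False


if __name__ == "__main__":
    for question in (
        "What are the most recent movies in theaters?",
        "What is the capital of France?",
    ):
        print(question, "->", should_ground(question))
```

The `ground_slow_changing` flag reflects the cost trade-off: an application that wants fresher answers can afford more search calls, while a cost-sensitive one can ground only the rapidly evolving queries.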
Google Cloud is also introducing a “high-fidelity” mode for grounding, currently in the experimental phase. This mode targets industries such as healthcare and financial services, where precision and reliability are paramount.
Additionally, Google plans to enable grounding with third-party datasets, a capability expected to launch in Q3. Working with specialized data providers such as Moody’s, MSCI, Thomson Reuters and ZoomInfo, Google will offer access to their datasets through Vertex AI. Enterprises will be able to ground their models in highly specific, authoritative information, further improving the accuracy and relevance of generated responses.
For enterprises aiming to ground their AI models in private data, Google Cloud provides Vertex AI Search and a suite of APIs for retrieval-augmented generation (RAG). These tools help businesses create custom RAG workflows, build semantic search engines, or enhance existing search capabilities. The APIs, now generally available, cover document parsing, embedding generation, semantic ranking and grounded answer generation, along with a fact-checking service called check-grounding.
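To show how the stages listed above fit together, here is a minimal, self-contained sketch of a RAG workflow. It does not call the Vertex AI APIs: the bag-of-words `embed`, the cosine-based `rank`, the `build_prompt` helper and the token-overlap `support_score` are simplified, hypothetical stand-ins for embedding generation, semantic ranking, grounded answer generation and grounding checks.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-count vector."""
    return Counter(tokenize(text))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k passages most similar to the query."""
    query_vec = embed(query)
    ordered = sorted(passages, key=lambda p: cosine(query_vec, embed(p)), reverse=True)
    return ordered[:top_k]


def build_prompt(query: str, facts: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from the facts."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts)
    return f"Answer using only these facts:\n{fact_lines}\n\nQuestion: {query}"


def support_score(answer: str, facts: list[str]) -> float:
    """Crude grounding check: share of answer tokens found in the facts."""
    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    fact_tokens = set(tokenize(" ".join(facts)))
    return sum(token in fact_tokens for token in answer_tokens) / len(answer_tokens)


if __name__ == "__main__":
    # Hypothetical private documents, already parsed into passages.
    passages = [
        "The refund window is 30 days from delivery.",
        "Our office is closed on public holidays.",
        "Refund requests need an order number.",
    ]
    question = "How long is the refund window?"
    facts = rank(question, passages)
    print(build_prompt(question, facts))
    print(support_score("The refund window is 30 days", facts))
```

A production pipeline would replace each helper with a real service and use learned embeddings rather than word counts, but the flow stays the same: retrieve, rank, generate from the retrieved facts, then check the answer against them.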
These enhancements are part of Google Cloud’s broader strategy to make generative AI more reliable and suitable for enterprise use. By connecting AI models to diverse and reliable information sources—including web data, company documents, operational databases and enterprise applications—Google aims to ground AI in what it calls “enterprise truth.”
The focus on grounding responds to growing industry concern over AI hallucinations. As AI models become more complex, the risk of producing faulty or unreliable outputs increases. Grounding techniques such as RAG mitigate this risk by supplying models with facts from external knowledge sources, improving the accuracy and trustworthiness of their responses.
By letting enterprises draw on both public and private data sources, Google is supporting the development of more robust and trustworthy AI applications across industries.