If you’ve ever used a generative artificial intelligence tool, it has probably lied to you. Probably several times.
These recurring fabrications are often referred to as AI hallucinations, and developers are working feverishly to make generative AI tools more reliable by curbing these unfortunate errors. One of the most popular approaches to reducing AI hallucinations, and one that’s gaining traction across Silicon Valley, is called retrieval-augmented generation, or RAG.
The RAG process is quite complicated, but at a basic level it augments your prompts by gathering information from a custom database, and then the large language model generates a response based on that data. For example, a company could upload all of its HR policies and benefits into a RAG database and have the AI chatbot answer questions using only what can be found in those documents.
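In code, that two-step pipeline looks roughly like the sketch below: pull the best-matching documents from a store, then fold them into the prompt before it reaches the model. The keyword scorer and the sample HR snippets here are illustrative assumptions, not a real retriever; a production system would use a proper search index or embeddings and send the final prompt to an actual model.

```python
# A minimal sketch of the RAG flow described above: retrieve the most relevant
# snippets from a small in-memory "database", then build a prompt that asks the
# model to answer only from those snippets.

def score(query: str, document: str) -> int:
    """Crude relevance score: count query words that appear in the document."""
    doc_words = set(document.lower().split())
    return sum(1 for word in query.lower().split() if word in doc_words)

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(documents, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user's question with the retrieved context."""
    joined = "\n\n".join(context)
    return (
        "Answer the question using only the documents below. "
        "If the answer is not in them, say you don't know.\n\n"
        f"Documents:\n{joined}\n\nQuestion: {query}"
    )

# Hypothetical HR snippets standing in for a company's real policy database.
hr_documents = [
    "Employees accrue 1.5 vacation days per month, capped at 30 days.",
    "Health coverage begins on the first day of the month after hiring.",
    "Remote work requires written approval from a direct manager.",
]

question = "How many vacation days do I get?"
prompt = build_prompt(question, retrieve(question, hr_documents))
print(prompt)  # In practice, this augmented prompt is what gets sent to the LLM.
```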
So how is this process different from a standard ChatGPT output? I asked Pablo Arredondo, Vice President of CoCounsel at Thomson Reuters, who has been using the RAG method to develop aspects of an AI tool for legal professionals. “Instead of responding based on the memories encoded during initial model training,” he says, “you use the search engine to pull real documents, be it case law, articles, or whatever, and then anchor the model’s response to these documents.”
For example, we could upload the entire history of WIRED, every print magazine and web article since 1993, to a private database and create a RAG implementation that references those documents when answering reader questions. By giving the AI tool a narrow focus and quality information, the RAG-augmented chatbot would be more adept than a general-purpose chatbot at answering questions about WIRED and related topics. Would it still make mistakes and sometimes misinterpret the data? Absolutely. But the odds of it making up whole articles that never existed would be greatly reduced.
“You’re rewarding it, the way you train the model, to try to write something where every statement of fact can be attributed to a source,” says Patrick Lewis, an AI modeling lead at Cohere who helped develop the RAG concept a few years ago. If you teach the model to effectively sort through the data it’s given and cite sources in each output, the AI tool is less likely to make serious mistakes.
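Here is a rough sketch of what that citation discipline can look like on the output side, under the assumption that sources are handed to the model with simple numeric IDs like [1] and [2]. The sample answer is invented for illustration; in a real system this behavior is shaped during training and prompting, with checks like this one serving only as a sanity pass.

```python
import re

def cited_ids(answer: str) -> set[int]:
    """Collect the numeric source IDs referenced in an answer, e.g. [2]."""
    return {int(m) for m in re.findall(r"\[(\d+)\]", answer)}

def unsupported_citations(answer: str, sources: dict[int, str]) -> set[int]:
    """Return any cited IDs that don't correspond to a provided source."""
    return cited_ids(answer) - sources.keys()

# Hypothetical sources passed to the model alongside the question.
sources = {
    1: "Employees accrue 1.5 vacation days per month.",
    2: "Vacation balances are capped at 30 days.",
}

# A made-up model answer in which every factual claim carries a citation.
sample_answer = "You earn 1.5 vacation days each month [1], up to a cap of 30 days [2]."
print(unsupported_citations(sample_answer, sources))  # set() means every citation points to a real source
```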
However, exactly how much RAG reduces AI hallucinations remains a point of contention among researchers and developers. Lewis chose his words carefully during our conversation, describing RAG outputs as “low hallucination” rather than hallucination-free. The process is by no means a panacea that eliminates every AI error.