Glossary

Retrieval Augmented Generation (RAG)

RAG: Supercharging LLMs with Precision Knowledge Retrieval

What is RAG?

Retrieval Augmented Generation ("RAG") is a technique for optimizing the way large language models respond to questions, particularly those that require subject-matter expertise. RAG systems direct the LLM toward proprietary information instead of relying solely on its general training parameters. As a result, answers focus on more domain-specific information, addressing a current limitation of the broad training data behind most LLMs.

RAG applies an LLM's ability to understand contextual relevance to subject-matter-specific proprietary information in a traceable way. That helps alleviate many (not all) issues surrounding hallucination and false confidence, empowering the user to verify the source. As Marco Argenti, Chief Information Officer at Goldman Sachs, noted in August 2024:

you have the RAG, which is the retrieval-augmented generation, which is actually interesting because you tell the model that, instead of using its own internal knowledge in order to give you an answer, which sometimes, as I said, is like a representation of reality, but it's often not accurate, you point them to the right sections of the document that actually is more likely to answer your question, okay? And that's the key. It needs to point to the right sections, and then you get the citations back. So that took a lot of effort, but we're using that in many, many cases because then we expanded the use-case from purely, like, banker assistant in a way to more, okay, document management, you know, we process millions of documents. Think of credit documents, loan documents.

Because RAG points the LLM toward information within the existing entitlements architecture, appropriate data security and governance oversight can be implemented. While RAG is often viewed as an elegant solution (especially when answers are paired with their sources), accuracy and hallucination issues still need to be dealt with (after all, the G in RAG stands for Generation!), as examined by research on applying RAG to workflows such as legal research.
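The retrieve-then-generate flow described above can be sketched in a few lines. This is a toy illustration, not a production implementation: the document store, the keyword-overlap scoring, and the source ids are all invented for the example, and the final LLM call is left as a placeholder. Real systems typically replace the scoring step with vector embeddings and enforce entitlements at retrieval time.

```python
# Minimal sketch of the RAG flow: retrieve the most relevant passages,
# then assemble a grounded prompt that carries source ids so the user
# can trace and verify each citation in the model's answer.

def score(query: str, passage: str) -> int:
    """Toy relevance score: count query words appearing in the passage."""
    query_words = set(query.lower().split())
    return sum(1 for w in set(passage.lower().split()) if w in query_words)

def retrieve(query: str, store: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return the top-k (source_id, passage) pairs ranked by relevance."""
    ranked = sorted(store.items(), key=lambda item: score(query, item[1]), reverse=True)
    return ranked[:k]

def build_prompt(query: str, hits: list[tuple[str, str]]) -> str:
    """Ground the model in retrieved passages; keep source ids for citation."""
    context = "\n".join(f"[{source_id}] {text}" for source_id, text in hits)
    return (
        "Answer using ONLY the passages below and cite the [source id].\n"
        f"{context}\n\nQuestion: {query}"
    )

# Hypothetical proprietary document store (ids and contents are invented).
store = {
    "loan-doc-17": "The loan matures in 2031 with a fixed 5.2% coupon.",
    "credit-memo-3": "Credit approval requires two senior sign-offs.",
    "hr-policy-9": "Employees accrue 20 vacation days per year.",
}

question = "When does the loan mature and what is the coupon?"
hits = retrieve(question, store)
prompt = build_prompt(question, hits)
# `prompt` would now be sent to the LLM; the [source id] tags in its
# answer let the user trace the claim back to the underlying document.
```

Because the prompt carries the source ids alongside the passages, the generated answer can cite its sources, which is the traceability property discussed above.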

To learn more about RAG within the context of Investment Banking and the industry's roll-out of Generative AI functionality, read our blog post The Future is Now (September 2024).

 


Defined by others: 

Amazon: Retrieval-Augmented Generation (RAG) is the process of optimizing the output of a large language model, so it references an authoritative knowledge base outside of its training data sources before generating a response.

Databricks: Retrieval augmented generation, or RAG, is an architectural approach that can improve the efficacy of large language model (LLM) applications by leveraging custom data. This is done by retrieving data/documents relevant to a question or task and providing them as context for the LLM. RAG has shown success in supporting chatbots and Q&A systems that need to maintain up-to-date information or access domain-specific knowledge.