RAG Resources
- Files from my "Using Retrieval Augmented Generation (RAG), Langchain, and LLMs for Cybersecurity Operations" Course
- LangFlow - a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
Disadvantages of RAG
RAG Patterns
- Generative AI Lifecycle Patterns
- Why do RAG pipelines fail? Advanced RAG Patterns — Part1 Ozgur Guler
- How to improve RAG performance — Advanced RAG Patterns — Part2
- Patterns for Building LLM-based Systems & Products
- AI Engineer Summit - Building Blocks for LLM Systems & Products
- Technical Considerations for Complex RAG
Dialogue Routing
Retrieval
Vector Retrieval
- Boosting RAG: Picking the Best Embedding & Reranker models
- What We Need to Know Before Adopting a Vector Database
Chunking
- Chunking Strategies for LLM Applications
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
- How to Chunk Text Data — A Comparative Analysis
Embeddings
Vector Search
Not Vector Retrieval
- Vector Search Is Not All You Need
- Build a search engine, not a vector DB
- Improving RAG (Retrieval Augmented Generation) Answer Quality with Re-ranker
- From Search to Synthesis: Enhancing RAG with BM25 and Reciprocal Rank Fusion
Generation
Prompts
- Emerging RAG & Prompt Engineering Architectures for LLMs
- How to Cut RAG Costs by 80% Using Prompt Compression
Prompting strategies
Multi-Modal RAG
Multi-index RAG
Multi-Document
FLARE
Chain-of-Verification
Chain-Of-Thought
Context
Long context RAG
Knowledge and Knowledge Graphs
- Graph RAG: Unleashing the Power of Knowledge Graphs with LLM
- Embeddings + Knowledge Graphs: The Ultimate Tools for RAG Systems
- The Practical Benefits to Grounding an LLM in a Knowledge Graph Daniel Bukowski
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
Automated prompt optimization
Hallucination
Guardrails
LLM Models
Finetuning and Pretraining
- Fine-Tuning Llama 2.0 with Single GPU Magic
- Practitioners guide to fine-tune LLMs for domain-specific use case
- Are You Pre-training your RAG Models on Your Raw Text?
- Combine Multiple LoRA Adapters for Llama 2
- RAG vs Finetuning — Which Is the Best Tool to Boost Your LLM Application?
Evaluation of RAGs
- RAG Evaluation
- Evaluating RAG: A journey through metrics
- Exploring End-to-End Evaluation of RAG Pipelines
- Evaluation Driven Development, the Swiss Army Knife for RAG Pipelines
- Evaluating the Ideal Chunk Size for a RAG System using LlamaIndex
Performance and cost
Privacy
Security
Applications of RAG
Chatbots
Tools
DSPy
AutoRAG
- AutoRAG - AutoML tool for RAG. Automatically optimize a RAG pipeline with a single YAML file.
AutoGPT
Langchain
LlamaIndex
- Building Production-Ready LLM Apps with LlamaIndex: Document Metadata for Higher Accuracy Retrieval
- Building Production-Ready LLM Apps With LlamaIndex: Recursive Document Agents for Dynamic Retrieval
Vendor-specific examples
Elasticsearch + OpenAI
OpenAI and ChatGPT
Tools and functions
- Unlocking the Power of the OpenAI API: Master Function-Calling with Practical Examples
- OpenAI/Chat-GPT Function Calling : for Enhanced AI Interactions
Vespa
Qdrant
Running RAGs in production
Vectors corner
- Similarity Search, Part 2: Product Quantization
- Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval
- Cohere int8 & binary Embeddings - Scale Your Vector Database to Large Datasets
Research Papers
Survey and Benchmark
Benchmarking Large Language Models in Retrieval-Augmented Generation \ Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun \ arXiv 2023. [Paper][Github] \ 4 Sep 2023
Retrieval-enhanced LLMs
Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models \ Wenhao Yu, Hongming Zhang, Xiaoman Pan, Kaixin Ma, Hongwei Wang, Dong Yu \ arXiv - Nov 2023 [Paper]
REST: Retrieval-Based Speculative Decoding \ Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D Lee, Di He \ arXiv - Nov 2023 [Paper][Github]
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Anonymous
ICLR 24 – Oct 2023 [paper]
Self-Knowledge Guided Retrieval Augmentation for Large Language Models \ Yile Wang, Peng Li, Maosong Sun, Yang Liu \ arXiv - Oct 2023 [Paper]
Retrieval meets Long Context Large Language Models \ Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro \ arXiv - Oct 2023 [Paper]
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan, Saiful Haq, Ashutosh Sharma, Thomas T. Joshi, Hanna Moazam, Heather Miller, Matei Zaharia, Christopher Potts
arXiv – Oct 2023 [paper] [code]
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Jian Xie, Kai Zhang, Jiangjie Chen, Renze Lou, Yu Su
ICLR 24 – May 2023 [paper] [code]
Active Retrieval Augmented Generation
Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig
arXiv – May 2023 [paper] [code]
REPLUG: Retrieval-Augmented Black-Box Language Models
Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, Wen-tau Yih
arXiv – Jan 2023 [paper]
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, Douwe Kiela
NeurIPS 2020 - May 2020 [Paper]
RAG Instruction Tuning
RA-DIT: Retrieval-Augmented Dual Instruction Tuning
Anonymous
ICLR 24 – Oct 23 [paper]
InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining
Boxin Wang, Wei Ping, Lawrence McAfee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro
arXiv - Oct 23 [paper]
RAG In-Context Learning
In-Context Retrieval-Augmented Language Models
Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, Yoav Shoham
AI21 Labs – Jan 2023 [paper] [code]
RAG Embeddings
RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder for Language Modeling \ Jingcheng Deng, Liang Pang, Huawei Shen, Xueqi Cheng \ EMNLP 2023 - Oct 2023 [Paper][Github]
Text Embeddings Reveal (Almost) As Much As Text \ John X. Morris, Volodymyr Kuleshov, Vitaly Shmatikov, Alexander M. Rush \ EMNLP 2023 - Oct 2023 [Paper][Github]
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents \ Michael Günther, Jackmin Ong, Isabelle Mohr, Alaeddine Abdessalem, Tanguy Abel, Mohammad Kalim Akram, Susana Guzman, Georgios Mastrapas, Saba Sturua, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao \ arXiv - Oct 2023. [Paper][Model]
RAG Simulators
KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants \ Kaustubh D. Dhole \ Simulation of Conversational Intelligence in Chat, EACL 2024 [Paper]
RAG Search
RAG Long-text and Memory
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models \ Bernal Jiménez Gutiérrez, Yiheng Shu, Yu Gu, Michihiro Yasunaga, Yu Su \ arXiv - May 2024 [paper] [GitHub]
Understanding Retrieval Augmentation for Long-Form Question Answering \ Hung-Ting Chen, Fangyuan Xu, Shane A. Arora, Eunsol Choi \ arXiv - Oct 2023 [Paper]
RAG Evaluation
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems \ Jon Saad-Falcon, Omar Khattab, Christopher Potts, Matei Zaharia \ arXiv - Nov 2023. [Paper] [Github]
RAG Optimization
Learning to Filter Context for Retrieval-Augmented Generation \ Zhiruo Wang, Jun Araki, Zhengbao Jiang, Md Rizwan Parvez, Graham Neubig \ arXiv - Nov 2023 [Paper][Github]
Large Language Models Can Be Easily Distracted by Irrelevant Context \ Freda Shi, Xinyun Chen, Kanishka Misra, Nathan Scales, David Dohan, Ed Chi, Nathanael Schärli, Denny Zhou \ ICML 2023 - Jan 2023 [Paper][Github]
Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks \ Akari Asai, Matt Gardner, Hannaneh Hajishirzi \ NAACL 2022 - Dec 2021 [Paper][Github]
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories \ Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi \ ACL 2023 - Dec 2022 [Paper][Github]
RAG Application
Deficiency of Large Language Models in Finance: An Empirical Examination of Hallucination \ Haoqiang Kang, Xiao-Yang Liu \ arXiv - Nov 2023 [Paper]
Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature \ Alejandro Lozano, Scott L Fleming, Chia-Chun Chiang, Nigam Shah \ arXiv - Oct 2023. [Paper]
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers \ Sheshera Mysore, Zhuoran Lu, Mengting Wan, Longqi Yang, Steve Menezes, Tina Baghaee, Emmanuel Barajas Gonzalez, Jennifer Neville, Tara Safavi \ arXiv - Nov 2023. [Paper]