# RAG Technique Selection Guide
Use this guide after you understand baseline RAG. The goal is to answer one practical question:
What should you add next to improve your system, and what problem is it actually solving?
This guide is informed by the patterns collected in the RAG_Techniques repository, but organized for curriculum use rather than as a long catalog.
## 1. Start with Failure Mode, Not Hype
Most RAG teams choose the wrong upgrade because they choose by trend instead of failure mode.
Use this sequence:

1. Identify how the system fails.
2. Decide whether the failure is query-side, document-side, retrieval-side, or control-loop related.
3. Add the smallest technique that directly fixes that failure.
4. Benchmark against the simpler baseline before keeping it.
## 2. Technique Selection Matrix

| If your system fails like this | Start with these techniques | Why they help | Cost / complexity | Hands-on notebooks |
|---|---|---|---|---|
| User questions are vague, short, or ambiguous | Query rewriting, multi-query retrieval, HyDE | Better query-document alignment | Low to medium | |
| Relevant facts are split across document sections | Parent-child retrieval, hierarchical retrieval, RAPTOR | Preserves both local detail and broader context | Medium to high | |
| Top-k results are close but noisy | Hybrid retrieval, reranking, contextual compression | Improves ranking quality before generation | Medium | |
| Retrieved chunks miss structure or context | Semantic chunking, proposition chunking, contextual headers, window expansion | Improves chunk quality and context boundaries | Medium | |
| System answers with weak evidence | Reliable RAG, CRAG, Self-RAG, retrieval feedback loops | Adds correction and abstention behavior | Medium to high | |
| Questions require entity relationships or multi-hop reasoning | GraphRAG, hierarchical indices, RAPTOR | Better reasoning over relationships and summaries | High | |
| Inputs contain charts, screenshots, or scanned PDFs | Visual RAG, multimodal retrieval, caption-based retrieval | Makes non-text evidence retrievable | High | |
| You need tool use and dynamic routing | Agentic RAG, retrieval orchestration | Lets the system choose retrieval tools dynamically | High | Phase 15 later builds on this |
## 3. Best Next Upgrade by System Stage

### Stage A: Your first baseline works

Add these first:

- Better chunking
- Hybrid retrieval
- Reranking
Reason: these usually produce the biggest quality gain per unit of complexity.
### Stage B: Retrieval quality is okay, but query understanding is weak

Add these next:

- Query rewriting
- Multi-query retrieval
- HyDE
Reason: these help when user intent is not expressed in the same language as your indexed content.
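Of these, multi-query retrieval is the simplest to sketch. In this toy version (Jaccard token overlap stands in for embedding similarity, and the query variants are hardcoded where a real system would generate them with an LLM), each phrasing is retrieved independently and the result sets are merged:

```python
def tokenize(text):
    return set(text.lower().split())

def score(query, doc):
    # Jaccard overlap between query tokens and document tokens.
    q, d = tokenize(query), tokenize(doc)
    return len(q & d) / len(q | d) if q | d else 0.0

def retrieve(query, corpus, k=1):
    # Cheap single-query retriever: rank the whole corpus by overlap.
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def multi_query_retrieve(variants, corpus, k=1):
    # Retrieve per variant, then merge result sets in first-seen order.
    seen, merged = set(), []
    for variant in variants:
        for doc in retrieve(variant, corpus, k):
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged

corpus = [
    "reset your password from the account settings page",
    "billing invoices are emailed monthly",
    "recover a locked account by resetting credentials",
]
variants = [
    "how do I reset my password",
    "account locked cannot log in",
]
results = multi_query_retrieve(variants, corpus)
print(results)
```

Neither variant alone surfaces both relevant documents; the union does, which is the whole point of the technique.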
### Stage C: You have partially good retrieval, but answers are still unreliable

Add these next:

- Contextual compression
- Evidence grading
- CRAG-style retry or abstention
Reason: the problem is often not finding documents, but passing too much or too little evidence into generation.
### Stage D: Your corpus is large, structured, or multi-hop

Add these next:

- Parent-child retrieval
- RAPTOR or hierarchy-based retrieval
- GraphRAG
Reason: flat chunk retrieval does not capture long-range document structure well.
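Parent-child retrieval addresses this directly: match the query against small, precise child chunks, but hand the generator the larger parent section. A minimal sketch, with a handwritten toy corpus and token overlap standing in for real similarity search:

```python
def overlap(query, text):
    # Count shared lowercase tokens between query and text.
    return len(set(query.lower().split()) & set(text.lower().split()))

# Parents are full sections; children are small chunks pointing back to them.
parents = {
    "p1": "Section on auth: reset passwords, rotate API keys, enable 2FA.",
    "p2": "Section on billing: invoices, refunds, and payment methods.",
}
children = [
    {"parent": "p1", "text": "reset passwords"},
    {"parent": "p1", "text": "rotate API keys"},
    {"parent": "p2", "text": "invoices and refunds"},
]

def parent_child_retrieve(query, children, parents):
    # Match on the precise child, return the broader parent context.
    best = max(children, key=lambda c: overlap(query, c["text"]))
    return parents[best["parent"]]

context = parent_child_retrieve("how to rotate api keys", children, parents)
print(context)
```

The small child chunk wins the match, but the generator receives the whole auth section, so sibling facts (2FA, password resets) stay in context.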
## 4. Recommended Study Order

If you want a disciplined path through the advanced material, use this order:

1. `02_document_processing.ipynb` → chunking strategies
2. `07_evaluation.ipynb` → metrics and benchmarking (run this early so you can measure everything that follows)
3. `08_hyde_reranking.ipynb` → HyDE query expansion + reranking (self-contained, no API keys)
4. `11_corrective_rag.ipynb` → CRAG-style retrieval grading, retry, abstention (self-contained)
5. `12_parent_child_retrieval.ipynb` → chunk-to-parent expansion (self-contained)
6. `13_raptor_retrieval.ipynb` → hierarchical summary-tree retrieval (self-contained)
7. `09_advanced_retrieval.ipynb` → ColBERT, Cohere reranking, full pipeline (requires OpenAI + Cohere API keys)
8. `10_graphrag_visual_rag.ipynb` → entity-relationship and multimodal retrieval
9. `challenges.md` / `assignment.md`
Notebooks 3-6 above are self-contained (TF-IDF + toy data, no API keys needed), so you can run them immediately. Notebook 7 (`09_advanced_retrieval.ipynb`) requires API keys but covers production-grade libraries (LangChain, ChromaDB, Cohere) for the same concepts.
This ordering mirrors how strong systems are actually built: first improve chunks, then set up evaluation, then improve query understanding and ranking, then add reliability controls, then introduce hierarchical structure, then expand into graph and multimodal patterns.
## 5. Technique Tradeoffs That Matter

### HyDE

Use when:

- queries are abstract
- semantic similarity is weak
- users ask concept-heavy questions

Avoid when:

- latency and query-time LLM cost are strict constraints
- your baseline query rewriting already works well
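The mechanism behind these tradeoffs fits in a few lines. HyDE retrieves with a hypothetical answer document instead of the raw query; here the `hypothesize` function is a hardcoded stub standing in for the query-time LLM call (which is exactly where the latency and cost concerns above come from):

```python
def hypothesize(query):
    # Stub standing in for an LLM call that drafts a plausible answer.
    return ("A password can be reset from the account settings page "
            "by requesting a reset email and choosing new credentials.")

def jaccard(a, b):
    # Toy similarity standing in for embedding cosine similarity.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

corpus = [
    "our pricing tiers are described in the billing overview",
    "reset email and new credentials are handled in account settings",
]

query = "forgot login"
# The terse query shares no vocabulary with either document...
direct = max(corpus, key=lambda d: jaccard(query, d))
# ...but the hypothetical answer shares vocabulary with the right one.
hyde = max(corpus, key=lambda d: jaccard(hypothesize(query), d))
print(hyde)
```

The abstract query matches nothing directly, while the hypothetical document lands on the correct passage: document-to-document similarity is easier than query-to-document similarity.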
### Reranking

Use when:

- top-k contains near misses
- you can afford a second-stage ranker
- precision matters more than raw recall

Avoid when:

- latency budgets are extremely tight
- your corpus is small and clean
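A minimal sketch of the two-stage pattern, with a toy phrase-match scorer standing in for a cross-encoder: a cheap recall stage returns top-k candidates, then a more expensive scorer reorders just those k.

```python
def recall_score(query, doc):
    # Cheap first stage: shared-token count.
    return len(set(query.lower().split()) & set(doc.lower().split()))

def rerank_score(query, doc):
    # Toy "cross" signal: reward documents containing the query as a phrase.
    bonus = 2.0 if query.lower() in doc.lower() else 0.0
    return recall_score(query, doc) + bonus

corpus = [
    "connection errors and timeout errors are logged with error codes",
    "to fix a connection timeout raise the client limit",
    "billing questions are answered in the pricing faq",
]

query = "connection timeout"
# Stage 1: recall. Both relevant-looking docs tie; the near miss ranks first.
candidates = sorted(corpus, key=lambda d: recall_score(query, d), reverse=True)[:2]
# Stage 2: rerank only the candidates with the stronger scorer.
reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
print(reranked[0])
```

The first stage puts the "right words, wrong document" candidate on top; the reranker promotes the document that actually answers the query. That is the near-miss situation the tradeoff list describes.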
### Contextual Compression

Use when:

- retrieved chunks are too long or noisy
- model context windows are being wasted
- answers degrade because too much irrelevant text is included

Avoid when:

- your base chunks are already short and precise
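The core idea can be sketched as sentence-level filtering; in production this filter is usually an LLM or embedding-based relevance check, but token overlap shows the shape:

```python
def relevant(query, sentence, threshold=1):
    # Keep a sentence if it shares at least `threshold` tokens with the query.
    q = set(query.lower().split())
    s = set(sentence.lower().split())
    return len(q & s) >= threshold

def compress(query, chunk):
    # Split the retrieved chunk into sentences and drop the off-topic ones.
    sentences = [s.strip() for s in chunk.split(".") if s.strip()]
    kept = [s for s in sentences if relevant(query, s)]
    return ". ".join(kept) + "."

chunk = ("Our service started in 2015. Password resets are sent by email. "
         "The office is closed on holidays. Resets expire after one hour.")
query = "password resets"
compressed = compress(query, chunk)
print(compressed)
```

Half of the retrieved chunk is company trivia; compression passes only the two sentences the generator actually needs.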
### RAPTOR / Hierarchical Retrieval

Use when:

- documents are long and structured
- queries require section-level synthesis
- flat chunk retrieval misses broad context

Avoid when:

- corpus is small and shallow
- the overhead is not justified by question complexity
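A loose, RAPTOR-flavored sketch of why the hierarchy helps: leaf chunks and section-level summaries are indexed side by side, so narrow queries match a leaf while broad queries match a summary node. The summaries here are handwritten; RAPTOR builds them by recursive clustering and LLM summarization, which is where the overhead comes from.

```python
# Leaves and their section summary live in the same index.
nodes = [
    {"level": "leaf", "text": "step one: export the table as csv"},
    {"level": "leaf", "text": "step two: load the csv into the warehouse"},
    {"level": "summary", "text": "overview of the full data migration process"},
]

def retrieve(query, nodes):
    # Token-overlap retrieval over all tree levels at once.
    q = set(query.lower().split())
    return max(nodes, key=lambda n: len(q & set(n["text"].split())))

narrow = retrieve("how do I load the csv", nodes)   # detail question
broad = retrieve("migration process overview", nodes)  # synthesis question
print(narrow["level"], broad["level"])
```

A flat chunk index could only ever return leaves; the summary node is what lets the broad question retrieve section-level context.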
### CRAG / Self-RAG

Use when:

- you need reliability and abstention
- hallucinations are costly
- retrieval quality varies heavily across questions

Avoid when:

- the system is still missing basic evaluation and retrieval baselines
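The control loop itself is small. In this sketch, grading is a token-support fraction and the rewrite table is hardcoded; a real CRAG-style system would use an LLM grader and LLM query rewriting, but the grade → retry → abstain structure is the same:

```python
def grade(query, doc):
    # Fraction of query tokens supported by the document.
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q) if q else 0.0

def retrieve(query, corpus):
    return max(corpus, key=lambda d: grade(query, d))

def answer_with_crag(query, corpus, rewrites, min_grade=0.5):
    # Try the original query, then each rewrite; abstain if all grades are weak.
    for attempt in [query] + rewrites.get(query, []):
        doc = retrieve(attempt, corpus)
        if grade(attempt, doc) >= min_grade:
            return doc
    return None  # abstain instead of answering from weak evidence

corpus = ["password resets are sent to your registered email address"]
rewrites = {"creds borked": ["password resets email"]}

found = answer_with_crag("creds borked", corpus, rewrites)
abstained = answer_with_crag("refund policy", corpus, rewrites)
print(found, abstained)
```

The slangy query fails its grade, succeeds after a rewrite; the out-of-scope query fails every attempt and the system abstains rather than hallucinating an answer.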
### GraphRAG

Use when:

- answers depend on entities and relationships
- multi-hop reasoning is common
- your domain already has graph-like structure

Avoid when:

- you have not yet validated that simpler retrieval methods fail
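What "graph-like structure" buys you can be shown with a toy triple store. Full GraphRAG pipelines extract entities and relations with an LLM and use community summaries; this stripped-down sketch just walks hardcoded triples to answer a question no single flat chunk contains:

```python
# Facts as (subject, relation, object) triples, as a knowledge graph stores them.
triples = [
    ("alice", "manages", "platform-team"),
    ("platform-team", "owns", "billing-service"),
    ("billing-service", "depends_on", "postgres"),
]

def neighbors(entity):
    return [(rel, obj) for subj, rel, obj in triples if subj == entity]

def multi_hop(entity, hops):
    # Follow one outgoing edge per hop, collecting the chain of facts.
    path = []
    for _ in range(hops):
        edges = neighbors(entity)
        if not edges:
            break
        relation, entity = edges[0]
        path.append((relation, entity))
    return path

# "What database does the team Alice manages ultimately depend on?"
path = multi_hop("alice", hops=3)
print(path)
```

Each hop is trivially retrievable on its own, but only the chained traversal connects "alice" to "postgres", which is exactly the multi-hop case where flat chunk retrieval breaks down.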
## 6. Mapping to the Cloned Repository

Use the cloned RAG_Techniques repo as an idea bank, not as a checklist to blindly implement.

Good references to study:

- Query-side: `query_transformations.ipynb`, `HyDe_Hypothetical_Document_Embedding.ipynb`
- Chunking/context: `semantic_chunking.ipynb`, `proposition_chunking.ipynb`, `contextual_compression.ipynb`
- Retrieval quality: `fusion_retrieval.ipynb`, `reranking.ipynb`, `relevant_segment_extraction.ipynb`
- Structure: `hierarchical_indices.ipynb`, `raptor.ipynb`, `graph_rag.ipynb`, `Microsoft_GraphRag.ipynb`
- Reliability: `reliable_rag.ipynb`, `crag.ipynb`, `self_rag.ipynb`, `retrieval_with_feedback_loop.ipynb`
- Multimodal: `multi_model_rag_with_captioning.ipynb`, `multi_model_rag_with_colpali.ipynb`
Public repo link:
## 7. Practical Build Recipes

### Recipe 1: Best general-purpose upgrade path

1. Improve chunking
2. Add hybrid retrieval
3. Add reranking
4. Add evaluation and failure analysis
This is the default recommendation for most production RAG systems.
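The hybrid retrieval step of this recipe can be sketched with reciprocal rank fusion (RRF). Both scorers here are toys: shared-token count stands in for BM25, and character trigram overlap stands in for embedding cosine similarity; RRF itself is the real fusion formula.

```python
def lexical_rank(query, corpus):
    # Toy lexical stage: rank by shared-token count (BM25 stand-in).
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: len(q & set(d.lower().split())), reverse=True)

def ngrams(text, n=3):
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def semantic_rank(query, corpus):
    # Toy "semantic" stage: character trigram overlap (embedding stand-in).
    q = ngrams(query)
    return sorted(corpus, key=lambda d: len(q & ngrams(d)), reverse=True)

def rrf(rankings, k=60):
    # Reciprocal rank fusion: each list contributes 1 / (k + rank).
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

corpus = [
    "resetting credentials restores account access",
    "reset your password in settings",
    "invoices are emailed monthly",
]
query = "password reset"
fused = rrf([lexical_rank(query, corpus), semantic_rank(query, corpus)])
print(fused[0])
```

Rank-based fusion sidesteps the score-calibration problem: lexical and semantic scores live on different scales, but ranks are always comparable.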
### Recipe 2: Best path for enterprise search

1. Metadata-aware chunking
2. Hybrid retrieval
3. Reranking
4. Contextual compression
5. Abstention and evidence checks
### Recipe 3: Best path for research or long reports

1. Semantic or proposition chunking
2. Parent-child retrieval
3. RAPTOR or hierarchy-based retrieval
4. GraphRAG if multi-hop reasoning still fails
### Recipe 4: Best path for support bots and copilots

1. Query rewriting
2. Multi-query retrieval
3. Reranking
4. Conversational context handling
5. Reliability loop for weak retrieval cases
## 8. What Not to Do

- Do not jump to GraphRAG before baseline retrieval is measured.
- Do not add agentic orchestration before reranking and evaluation are working.
- Do not assume a more advanced architecture is better for a small or clean corpus.
- Do not evaluate only answer fluency; evaluate retrieval and faithfulness separately.
## 9. Suggested Capstone Extensions
If you want to extend the Phase 8 capstone, pick one of these paths:
- Retrieval quality path: hybrid search + reranking + compression → start with `08_hyde_reranking.ipynb`.
- Reliability path: evidence grading + CRAG-style retry + abstention → start with `11_corrective_rag.ipynb`.
- Structured reasoning path: parent-child retrieval → RAPTOR → start with `12_parent_child_retrieval.ipynb`, then `13_raptor_retrieval.ipynb`.
- Advanced architecture path: GraphRAG or multimodal RAG → `10_graphrag_visual_rag.ipynb`.
Pick one path and measure it properly. That is better than adding five advanced techniques without evidence.