RAG prompting with Claude XML
Inject retrieved passages with <documents>.
Retrieval-augmented generation (RAG) means fetching relevant passages and giving them to the model as context. Claude's canonical RAG pattern uses <documents> with one <document> per passage, each carrying a <source>.
How to apply the pattern
- Put documents before instructions for long contexts. Anthropic's guidance: place large reference material first, then your instructions, then the user's question. Claude tends to follow instructions more reliably when they sit close to the query.
- Always include <source>. Without it, Claude has nothing to cite and may invent citations.
- Tell Claude what to do when the answer isn't in the documents, for example: "Say 'I don't know based on the provided sources'". Without this fallback, it is more likely to hallucinate an answer.
- Ask for cited claims. "Cite each claim in [doc-N] format." Then verify in code that every cited ID matches a document you actually supplied.
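The "verify in code" step in the last bullet can be sketched roughly as follows. This is a minimal illustration, not a definitive checker: the function names `unknown_citations` and `uncited_sentences` are hypothetical, the regex assumes citations look exactly like `[doc-1]`, and the sentence splitter is deliberately naive.

```python
import re

# Assumption: citations appear as [doc-N], matching the format requested in the prompt.
CITATION = re.compile(r"\[(doc-\d+)\]")


def unknown_citations(answer: str, known_ids: set[str]) -> set[str]:
    """Return cited IDs that do not match any document that was supplied."""
    return set(CITATION.findall(answer)) - known_ids


def uncited_sentences(answer: str) -> list[str]:
    """Return sentences that carry no [doc-N] marker (naive split on ., !, ?)."""
    sentences = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [s for s in sentences if s and not CITATION.search(s)]


if __name__ == "__main__":
    answer = (
        "The Pro plan costs $20/month [doc-1]. "
        "Refunds take 14 days [doc-7]. "
        "Contact support."
    )
    print(unknown_citations(answer, {"doc-1", "doc-2"}))
    print(uncited_sentences(answer))
```

A non-empty result from either function is a signal to retry, flag the answer, or fall back to the "I don't know" response. It does not prove that a cited passage actually supports the claim; that would need a separate check.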
Worked example
<documents>
<document>
<source>doc-1: pricing.md</source>
<document_content>
The Pro plan is $20/month and includes 100 GB of storage.
</document_content>
</document>
<document>
<source>doc-2: refund-policy.md</source>
<document_content>
All paid plans include a 14-day money-back guarantee.
</document_content>
</document>
</documents>
<instructions>
Answer the user's question using only the documents above. Cite the source
in [doc-N] brackets for each claim. If the answer is not in the documents,
say "I don't have that information in the provided sources."
</instructions>
<question>How much does the Pro plan cost and can I get a refund?</question>
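In practice the prompt above is usually assembled from whatever the retriever returns. The sketch below shows one way to do that, under some assumptions: the `Passage` class and `build_prompt` function are hypothetical names, passages arrive already ranked, and no escaping is applied to passage text.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    name: str  # hypothetical field: a human-readable source name, e.g. "pricing.md"
    text: str  # the retrieved passage body


INSTRUCTIONS = (
    "Answer the user's question using only the documents above. Cite the source\n"
    "in [doc-N] brackets for each claim. If the answer is not in the documents,\n"
    'say "I don\'t have that information in the provided sources."'
)


def build_prompt(passages: list[Passage], question: str) -> str:
    """Assemble documents first, then instructions, then the question."""
    parts = ["<documents>"]
    for n, passage in enumerate(passages, start=1):
        parts += [
            "<document>",
            f"<source>doc-{n}: {passage.name}</source>",
            "<document_content>",
            passage.text.strip(),
            "</document_content>",
            "</document>",
        ]
    parts += [
        "</documents>",
        "<instructions>",
        INSTRUCTIONS,
        "</instructions>",
        f"<question>{question}</question>",
    ]
    return "\n".join(parts)


if __name__ == "__main__":
    docs = [
        Passage("pricing.md", "The Pro plan is $20/month and includes 100 GB of storage."),
        Passage("refund-policy.md", "All paid plans include a 14-day money-back guarantee."),
    ]
    print(build_prompt(docs, "How much does the Pro plan cost and can I get a refund?"))
```

Numbering the sources at build time keeps the `doc-N` IDs in the prompt aligned with the list you hold in code, which makes later citation checks straightforward.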
Tips
- Chunk before retrieving — passages should be 200–800 tokens, not whole files.
- Deduplicate retrieved passages; near-duplicates pull attention without adding signal.
- Cap total documents at the smallest count that still answers the question well.
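The deduplication and capping tips can be combined in one small filter. This is only a sketch under stated assumptions: `jaccard` and `select_passages` are hypothetical names, similarity is word-set Jaccard rather than embedding distance, and the `max_docs` and `threshold` defaults are illustrative values to tune, not recommendations.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity between two passages (case-insensitive)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def select_passages(
    ranked: list[str], max_docs: int = 5, threshold: float = 0.8
) -> list[str]:
    """Keep the best-ranked passages, skipping near-duplicates, up to max_docs.

    Assumes `ranked` is already sorted from most to least relevant.
    """
    kept: list[str] = []
    for passage in ranked:
        if len(kept) >= max_docs:
            break
        # Keep a passage only if it is not too similar to anything already kept.
        if all(jaccard(passage, other) < threshold for other in kept):
            kept.append(passage)
    return kept


if __name__ == "__main__":
    ranked = [
        "The Pro plan is $20/month and includes 100 GB of storage.",
        "The Pro plan is $20/month and includes 100 GB of storage. Thanks.",
        "All paid plans include a 14-day money-back guarantee.",
        "Contact support by email.",
    ]
    for passage in select_passages(ranked, max_docs=2):
        print(passage)
```

Because the input is already ranked, the first copy of any near-duplicate pair is the one that survives, and the cap is applied after duplicates are removed rather than before.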
Cite this page
RAG prompting with Claude XML. claudexml.com. https://claudexml.com/patterns/rag-context/