RAG prompting with Claude XML

Inject retrieved passages with // and ground the answer.

Retrieval-augmented generation (RAG) means fetching relevant passages and giving them to the model as context. Claude's canonical RAG pattern uses <documents> with one <document> per passage, each carrying a <source>.

How to apply the pattern

Put documents before instructions for long contexts. Anthropic's guidance: place large reference material first, then your instructions, then the user's question. Claude attends to instructions more strongly when they're closer to the query.
Always include <source>. Otherwise Claude invents citations.
Tell Claude what to do when the answer isn't in the documents. "Say 'I don't know based on the provided sources'" — without this, it'll hallucinate.
Ask for cited claims. "Cite each claim in [doc-N] format." Then verify in code.

Worked example

<documents>
  <document>
    <source>doc-1: pricing.md</source>
    <document_content>
    The Pro plan is $20/month and includes 100 GB of storage.
    </document_content>
  </document>
  <document>
    <source>doc-2: refund-policy.md</source>
    <document_content>
    All paid plans include a 14-day money-back guarantee.
    </document_content>
  </document>
</documents>

<instructions>
Answer the user's question using only the documents above. Cite the source
in [doc-N] brackets for each claim. If the answer is not in the documents,
say "I don't have that information in the provided sources."
</instructions>

<question>How much does the Pro plan cost and can I get a refund?</question>

Tips

Chunk before retrieving — passages should be 200–800 tokens, not whole files.
Deduplicate retrieved passages; near-duplicates pull attention without adding signal.
Cap total documents at the smallest count that still answers the question well.

Cite this page
RAG prompting with Claude XML. claudexml.com. https://claudexml.com/patterns/rag-context/