Complex · advanced

Map-reduce summarization over many chunks

Summarize each chunk independently, then synthesize a single coherent summary — two prompts, structured handoff.

Use this for documents that exceed your context budget, or where a single-pass summary loses detail. Map each chunk to a small summary, then reduce the summaries into one.
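
Before either pass can run, the document has to be split into chunks with stable IDs. A minimal sketch in Python (3.9+), assuming blank-line paragraph boundaries are acceptable split points and that a character budget is a good enough proxy for tokens. `Chunk`, `split_into_chunks`, and the 6,000-character default are illustrative names and values, not part of the original example:

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    text: str


def split_into_chunks(document: str, max_chars: int = 6000) -> list[Chunk]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own
    oversized chunk rather than being cut mid-sentence.
    """
    chunks: list[Chunk] = []
    current: list[str] = []
    size = 0
    for para in document.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and size + len(para) + 2 > max_chars:
            chunks.append(Chunk(str(len(chunks) + 1), "\n\n".join(current)))
            current, size = [], 0
        current.append(para)
        size += len(para) + 2  # +2 accounts for the paragraph separator
    if current:
        chunks.append(Chunk(str(len(chunks) + 1), "\n\n".join(current)))
    return chunks
```

Sequential IDs ("1", "2", ...) keep citations such as [c:3] mappable to a position in the source.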

The prompt

Copy this verbatim. Replace the {{ … }} placeholders with your values.

<!-- PASS 1 (map): run this once per chunk, in parallel -->
<instructions>
You are summarizing one chunk of a larger document. Produce JSON inside <chunk_summary> tags:

{
  "key_claims": ["string"],          // up to 5; one fact per item, verbatim where possible
  "entities": ["string"],            // people, orgs, products mentioned
  "open_questions": ["string"],      // things this chunk raises but does not answer
  "chunk_id": "{{ chunk_id }}"
}

Do not generalize beyond this chunk. Do not reference "the document"; you only see one chunk. Output valid JSON only: the // comments above are guidance and must not appear in your output.
</instructions>

<chunk id="{{ chunk_id }}">{{ chunk_text }}</chunk>

Return inside <chunk_summary> tags.

<!-- PASS 2 (reduce): one call, taking all chunk_summary outputs as input -->
<instructions>
You will receive N per-chunk summaries inside <chunk_summary> tags. Synthesize a final summary.

Output three sections, each in its own tag:

<tldr>One paragraph, ≤120 words.</tldr>
<key_findings>Bulleted, deduplicated across chunks. Cite chunk_id in [c:ID] format.</key_findings>
<unanswered>Bulleted list of open questions that no chunk answered.</unanswered>

Rules:
- Deduplicate claims that appear in multiple chunks.
- Surface contradictions explicitly: "[c:3] states X but [c:7] states Y."
- Do not invent facts not present in any chunk_summary.
</instructions>

{{ chunk_summaries_concatenated }}

Sample input

Pass 1: one 6 KB chunk of a research report. Pass 2: the 12 chunk_summary blobs produced by pass 1.
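
Each pass 1 reply should be checked before it is handed to pass 2. A hedged sketch of one way to do that, assuming the model returns strict JSON (no `//` comments) inside the tags. `parse_chunk_summary` and `REQUIRED_KEYS` are hypothetical helper names, and trimming to 5 claims is a design choice rather than something the prompt mandates:

```python
import json
import re

REQUIRED_KEYS = {"key_claims", "entities", "open_questions", "chunk_id"}
_SUMMARY_RE = re.compile(r"<chunk_summary>(.*?)</chunk_summary>", re.DOTALL)


def parse_chunk_summary(reply: str, expected_chunk_id: str) -> dict:
    """Extract and validate the JSON object from one pass 1 reply."""
    match = _SUMMARY_RE.search(reply)
    if match is None:
        raise ValueError("no <chunk_summary> block in reply")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed JSON.
    summary = json.loads(match.group(1))
    if not isinstance(summary, dict):
        raise ValueError("chunk summary is not a JSON object")
    missing = REQUIRED_KEYS - summary.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if str(summary["chunk_id"]) != expected_chunk_id:
        raise ValueError("chunk_id does not match the chunk that was sent")
    # The prompt asks for up to 5 claims; trim instead of failing on overshoot.
    summary["key_claims"] = summary["key_claims"][:5]
    return summary
```

A reply that fails validation can be retried for that chunk alone, since pass 1 calls are independent.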

Expected output

<tldr>
The report finds that small fine-tuned retrievers (110M params) match 7B baselines on domain queries
at one-tenth the cost, with most gains coming from query-log diversity rather than model size...
</tldr>
<key_findings>
- Fine-tuned 110M retriever beats 7B baseline by 4 nDCG@10 points on the in-domain test set [c:3, c:8]
- 8× latency reduction with no measurable recall loss [c:4]
- Generalization to long-tail queries is unverified [c:9, c:11]
</key_findings>
<unanswered>
- How does the approach hold up on multilingual queries?
- What is the cost of building the 50k query training set in domains without query logs?
</unanswered>
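
Because the final summary cites chunks by ID, it is cheap to check that every citation points at a chunk that was actually sent. A small sketch in Python (3.9+), assuming citations look like `[c:3]` or `[c:3, c:8]` as in the output above; `cited_chunk_ids` and `unknown_citations` are illustrative helper names:

```python
import re

# Matches [c:ID] and comma-separated groups such as [c:3, c:8].
_CITATION_RE = re.compile(r"\[((?:c:\w+)(?:\s*,\s*c:\w+)*)\]")


def cited_chunk_ids(final_summary: str) -> set[str]:
    """Collect every chunk ID cited in the pass 2 output."""
    ids: set[str] = set()
    for group in _CITATION_RE.findall(final_summary):
        for part in group.split(","):
            ids.add(part.strip().removeprefix("c:"))
    return ids


def unknown_citations(final_summary: str, known_ids: set[str]) -> set[str]:
    """Return cited IDs that do not correspond to any chunk that was sent."""
    return cited_chunk_ids(final_summary) - known_ids
```

A non-empty result from `unknown_citations` suggests the reduce pass cited a chunk that does not exist, which is worth flagging before the summary is used.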

Notes & tuning tips

  • Two prompts, not one. Parallelize pass 1 across chunks; pass 2 runs once on the concatenated outputs.
  • Per-chunk JSON output is what makes deduplication tractable in pass 2 — free prose doesn't deduplicate cleanly.
  • If you can't run two passes, single-pass summarization with all chunks inline is acceptable up to ~50k tokens; beyond that, accuracy degrades sharply.
  • Chunk IDs are load-bearing — they're how the final summary cites back to source positions.
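
The two-pass flow described in these notes can be sketched as follows. This is a minimal outline under stated assumptions, not a definitive implementation: `call_model`, `render_map_prompt`, and `render_reduce_prompt` are hypothetical callables you supply (a client wrapper that sends a prompt and returns the reply text, and two functions that fill the {{ … }} placeholders in the pass 1 and pass 2 templates).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def map_reduce_summarize(
    chunks: dict[str, str],                         # chunk_id -> chunk_text
    render_map_prompt: Callable[[str, str], str],   # hypothetical: fills pass 1 template
    render_reduce_prompt: Callable[[str], str],     # hypothetical: fills pass 2 template
    call_model: Callable[[str], str],               # hypothetical: prompt in, reply text out
    max_workers: int = 8,
) -> str:
    """Run pass 1 once per chunk in parallel, then pass 2 once on all outputs."""
    ids = list(chunks)
    prompts = [render_map_prompt(cid, chunks[cid]) for cid in ids]
    # Pass 1 (map): calls are independent, so run them concurrently.
    # Executor.map returns results in input order, preserving chunk order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        replies = list(pool.map(call_model, prompts))
    # Pass 2 (reduce): a single call over the concatenated pass 1 outputs.
    concatenated = "\n\n".join(replies)
    return call_model(render_reduce_prompt(concatenated))
```

Threads are used here on the assumption that each call is I/O-bound (a network request); `max_workers` should stay within whatever rate limit applies to your model endpoint.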

What this example uses

Tags: <instructions> <format>

Patterns: multi-document, structured output

Cite this page
Map-reduce summarization over many chunks. claudexml.com. https://claudexml.com/examples/map-reduce-summarize/