Transformation · intermediate

Transcript cleaner with speaker labels and summary

Multi-stage pipeline: clean fillers, label speakers consistently, produce a 5-bullet summary.

Raw meeting transcripts (Zoom, Otter) are noisy. You want three sequential views in one call: cleaned, speaker-labeled, summarized.

The prompt

Copy this verbatim, then replace the {{ raw_transcript }} placeholder with your transcript.

<instructions>
Process the raw transcript in <transcript> in three stages. Output each stage in its own tag.

1. <cleaned>: remove filler words (um, uh, like, you know, sort of), false starts,
   and timestamps. Preserve meaning. Preserve all decisions and action items.
2. <speakers>: label speakers as Speaker 1, Speaker 2, ... consistently across the whole transcript.
   Use the cleaned version from stage 1.
3. <summary>: a 5-bullet summary of decisions and action items, with the owner if stated.

Do all three stages in one response.
</instructions>

<transcript>{{ raw_transcript }}</transcript>
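
If you fill the placeholder from code rather than by hand, a minimal sketch might look like the following. The names PROMPT_TEMPLATE and build_prompt are hypothetical, not part of the prompt itself, and the check for a stray closing tag is an added assumption about what could break the wrapper.

```python
# Hypothetical helper: fills the {{ raw_transcript }} placeholder in the prompt above.
PROMPT_TEMPLATE = """<instructions>
Process the raw transcript in <transcript> in three stages. Output each stage in its own tag.

1. <cleaned>: remove filler words (um, uh, like, you know, sort of), false starts,
   and timestamps. Preserve meaning. Preserve all decisions and action items.
2. <speakers>: label speakers as Speaker 1, Speaker 2, ... consistently across the whole transcript.
   Use the cleaned version from stage 1.
3. <summary>: a 5-bullet summary of decisions and action items, with the owner if stated.

Do all three stages in one response.
</instructions>

<transcript>{{ raw_transcript }}</transcript>"""

PLACEHOLDER = "{{ raw_transcript }}"


def build_prompt(raw_transcript: str) -> str:
    """Return the prompt with the transcript substituted for the placeholder."""
    # Assumption: a literal closing tag inside the transcript would end the
    # <transcript> wrapper early, so reject it instead of sending a broken prompt.
    if "</transcript>" in raw_transcript:
        raise ValueError("transcript contains a literal </transcript> tag")
    return PROMPT_TEMPLATE.replace(PLACEHOLDER, raw_transcript.strip())
```

Plain string replacement is used instead of str.format because the template's double braces would otherwise need escaping.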

Sample input

00:00:01 [voice A]: so um yeah we should like ship the auth fix today
00:00:06 [voice B]: uh yeah i'll do that
00:00:10 [voice A]: cool, also let's like push the design review to thursday
00:00:14 [voice B]: works for me

Expected output

<cleaned>
We should ship the auth fix today.
Yeah, I'll do that.
Also, let's push the design review to Thursday.
Works for me.
</cleaned>

<speakers>
Speaker 1: We should ship the auth fix today.
Speaker 2: Yeah, I'll do that.
Speaker 1: Also, let's push the design review to Thursday.
Speaker 2: Works for me.
</speakers>

<summary>
- Decision: ship auth fix today
- Action: Speaker 2 will ship the auth fix today
- Decision: design review moved to Thursday
- (no other decisions)
- (no other action items)
</summary>
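
Because each stage arrives in its own tag, the response can be split with a small parser. This is a sketch that assumes the model returns each tag exactly once and well-formed, as in the expected output above; parse_stages is a hypothetical name.

```python
import re

STAGE_TAGS = ("cleaned", "speakers", "summary")


def parse_stages(response_text: str) -> dict[str, str]:
    """Extract the body of each stage tag from a model response."""
    stages = {}
    for tag in STAGE_TAGS:
        # Non-greedy match across newlines between the opening and closing tag.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", response_text, re.DOTALL)
        if match is None:
            raise ValueError(f"missing <{tag}> section in response")
        stages[tag] = match.group(1).strip()
    return stages


if __name__ == "__main__":
    sample = (
        "<cleaned>\nWe should ship the auth fix today.\n</cleaned>\n\n"
        "<speakers>\nSpeaker 1: We should ship the auth fix today.\n</speakers>\n\n"
        "<summary>\n- Decision: ship auth fix today\n</summary>"
    )
    print(parse_stages(sample)["summary"])
```

Raising on a missing tag, rather than returning an empty string, makes a truncated or malformed response visible to the caller.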

Notes & tuning tips

Three-stage output in one call is cheaper than three sequential calls: the transcript is sent once instead of three times, and each stage can build on the output of the one before it.
If "voice A / voice B" labels are reliable in your source, ask for those to be preserved rather than re-labeling.
For long transcripts, chunk by speaker turn or by 5-minute windows; quality degrades on very long single prompts.
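
The 5-minute windowing mentioned above can be sketched as follows. It assumes every turn starts with an HH:MM:SS timestamp, as in the sample input; chunk_by_window is a hypothetical name, and lines without a timestamp are kept with the preceding turn. Note that Speaker numbers assigned in separate calls may not line up across chunks, so preserving the source labels is likely the safer choice when chunking.

```python
import re

# Assumed line format: "HH:MM:SS [label]: text", as in the sample input.
TIMESTAMP = re.compile(r"^(\d{2}):(\d{2}):(\d{2})\s")


def chunk_by_window(raw_transcript: str, window_seconds: int = 300) -> list[str]:
    """Group transcript lines into consecutive fixed-length time windows."""
    chunks: dict[int, list[str]] = {}
    current_window = 0
    for line in raw_transcript.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = TIMESTAMP.match(line)
        if match:
            hours, minutes, seconds = (int(part) for part in match.groups())
            current_window = (hours * 3600 + minutes * 60 + seconds) // window_seconds
        # Lines without a timestamp stay in the window of the preceding turn.
        chunks.setdefault(current_window, []).append(line)
    return ["\n".join(chunks[key]) for key in sorted(chunks)]
```

Each returned chunk can then be passed through the prompt separately. Empty windows produce no chunk, so a long silence does not generate an empty call.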

What this example uses

Tags: <instructions>

Cite this page
Transcript cleaner with speaker labels and summary. claudexml.com. https://claudexml.com/examples/transcript-cleaner/