Step 4: View the transcription - Minute AI Walkthrough
See the AI-generated transcript with speaker identification
Step 4 of 6 • 3 minutes
View the transcription
Once AWS Transcribe has finished processing your recording, review the transcript with automatic speaker identification.
Expected outcome
- Transcription has completed successfully
- You can see the full transcript with speaker labels
- You understand how AWS Transcribe identifies different speakers
Waiting for transcription
After uploading your recording, AWS Transcribe processes the audio in the background. The time depends on the length of your recording:
| Recording length | Approximate processing time |
|---|---|
| 5 minutes | 2 to 3 minutes |
| 15 minutes | 4 to 6 minutes |
| 30 minutes | 5 to 10 minutes |
| 60 minutes | 10 to 15 minutes |
You can leave and come back. The transcription runs in the background on AWS. You do not need to keep the browser tab open. When you return, the transcript will be ready if processing has completed.
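If you prefer to check on a job from a script rather than the browser, the background wait described above can be sketched as a simple polling loop. This is a minimal sketch, not the demo's own code: it assumes you know the transcription job name (the walkthrough does not show how the demo names its jobs) and that you pass in a boto3 Transcribe client. The function name `wait_for_transcription` and its parameters are illustrative.

```python
import time


def wait_for_transcription(client, job_name, poll_seconds=30, max_polls=40,
                           sleep=time.sleep):
    """Poll a transcription job until it finishes or the poll budget runs out.

    `client` is expected to behave like boto3.client("transcribe").
    `job_name` is an assumption: use whatever name your deployment gave the job.
    Returns "COMPLETED", "FAILED", or "TIMED_OUT".
    """
    for _ in range(max_polls):
        response = client.get_transcription_job(TranscriptionJobName=job_name)
        status = response["TranscriptionJob"]["TranscriptionJobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return status
        # Job is still QUEUED or IN_PROGRESS: wait before asking again.
        sleep(poll_seconds)
    return "TIMED_OUT"


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   status = wait_for_transcription(boto3.client("transcribe"), "my-job-name")
```

With the defaults, the loop gives up after roughly 20 minutes, which matches the upper end of the processing times listed for long recordings.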
Viewing the transcript
- Check the meeting status
  Navigate to your meeting. The page will show the current status. When the transcription is complete, the status will update automatically.
- Click the "Transcript" tab
  Once transcription is complete, click the "Transcript" tab to see the full text of your meeting recording.
- Review speaker labels
  AWS Transcribe automatically identifies different speakers in the recording. Each speaker is labelled (e.g. "Speaker 1", "Speaker 2") with their dialogue grouped together.
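To see how dialogue ends up grouped by speaker, the sketch below rebuilds speaker turns from a Transcribe JSON result. It assumes the standard output layout with `results.items` and `results.speaker_labels.segments`, where raw labels look like `spk_0`. The mapping from `spk_0` to "Speaker 1" in `display_label` is an assumption about how the app presents labels, and both function names are illustrative.

```python
def speaker_turns(transcript):
    """Group a Transcribe result into (speaker_label, text) turns."""
    results = transcript["results"]

    # Map each spoken word's start time to the speaker who said it.
    speaker_at = {}
    for segment in results["speaker_labels"]["segments"]:
        for item in segment["items"]:
            speaker_at[item["start_time"]] = item["speaker_label"]

    turns = []
    current = None
    for item in results["items"]:
        text = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation has no timestamp: attach it to the previous word.
            if turns:
                turns[-1][1] += text
            continue
        speaker = speaker_at.get(item["start_time"], current)
        if not turns or speaker != turns[-1][0]:
            turns.append([speaker, text])  # a new speaker starts a new turn
        else:
            turns[-1][1] += " " + text
        current = speaker
    return [(speaker, text) for speaker, text in turns]


def display_label(raw_label):
    """Assumed presentation: 'spk_0' becomes 'Speaker 1'."""
    return "Speaker " + str(int(raw_label.split("_")[1]) + 1)
```

Each returned tuple holds one uninterrupted stretch of speech, which is the grouping the "Transcript" tab presents.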
How speaker identification works
AWS Transcribe uses machine learning to distinguish between different voices in an audio recording. This is known as "speaker diarisation". The service:
- Detects voice patterns -- identifies unique vocal characteristics for each participant
- Labels speakers consistently -- the same person gets the same label throughout the transcript
- Handles overlapping speech -- manages cases where people talk over each other
- Works with multiple speakers -- handles meetings with many participants
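For readers curious how diarisation is switched on, the sketch below builds the request parameters for a Transcribe job with speaker labels enabled. The walkthrough does not show the demo's actual settings, so treat this as an illustration only: the helper name `diarisation_request`, the `en-GB` default, and the example values are assumptions. `ShowSpeakerLabels` and `MaxSpeakerLabels` are the Transcribe settings that control speaker identification.

```python
def diarisation_request(job_name, media_uri, max_speakers, language_code="en-GB"):
    """Build keyword arguments for start_transcription_job with diarisation on.

    All values are supplied by the caller; nothing here reflects the demo's
    real configuration.
    """
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "Media": {"MediaFileUri": media_uri},
        "Settings": {
            "ShowSpeakerLabels": True,         # turn speaker identification on
            "MaxSpeakerLabels": max_speakers,  # upper bound on distinct speakers
        },
    }


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3
#   request = diarisation_request("demo-job", "s3://my-bucket/meeting.wav", 4)
#   boto3.client("transcribe").start_transcription_job(**request)
```

Setting the maximum close to the real number of participants is a reasonable starting point, since the service uses it as a ceiling when separating voices.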
Note: Speaker labels are generic ("Speaker 1", "Speaker 2") because Transcribe cannot know participants' names. You can edit the transcript to add real names if needed, which will improve the quality of the generated minutes.
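In the demo you add names by editing the transcript in the browser. As an offline illustration of the same idea, the sketch below swaps generic labels for real names in a list of (label, text) turns. The function name `apply_names` and the sample names are hypothetical.

```python
def apply_names(turns, names):
    """Replace generic speaker labels with real names where one is known.

    `turns` is a list of (label, text) pairs; `names` maps a label such as
    "Speaker 1" to a participant's name. Unknown labels are left unchanged.
    """
    return [(names.get(label, label), text) for label, text in turns]


# Example usage with made-up participants:
named = apply_names(
    [("Speaker 1", "Shall we start?"), ("Speaker 2", "Yes, go ahead.")],
    {"Speaker 1", "Speaker 2"} and {"Speaker 1": "Alex"},
)
```

Leaving unmatched labels untouched means a partly named transcript stays readable.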
Troubleshooting: Transcription taking a long time
If the transcription seems stuck:
- Long recordings (over 60 minutes) can take 15 to 20 minutes to process
- Check the CloudWatch Logs URL from CloudFormation Outputs for any error messages
- Try uploading a shorter recording (5 to 10 minutes) to verify the system is working
- Refresh the page to check for status updates
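If you would rather search the logs from a script than open the CloudWatch console, the check above can be sketched with the CloudWatch Logs `filter_log_events` call. This is a sketch under assumptions: the log group name must be taken from your own CloudFormation Outputs (it is not given in this walkthrough), the function name `recent_error_messages` is illustrative, and only the first page of results is read.

```python
def recent_error_messages(logs_client, log_group_name, since_epoch_ms,
                          pattern="ERROR"):
    """Return messages matching `pattern` logged since `since_epoch_ms`.

    `logs_client` is expected to behave like boto3.client("logs").
    `log_group_name` is whatever your deployment created; it is not
    specified in this walkthrough.
    """
    response = logs_client.filter_log_events(
        logGroupName=log_group_name,
        filterPattern=pattern,
        startTime=since_epoch_ms,  # milliseconds since the Unix epoch
    )
    # Only the first page is returned here; follow "nextToken" for more.
    return [event["message"] for event in response.get("events", [])]


# Example usage (requires AWS credentials and the boto3 package):
#   import boto3, time
#   one_hour_ago = int((time.time() - 3600) * 1000)
#   for line in recent_error_messages(boto3.client("logs"),
#                                     "<your-log-group>", one_hour_ago):
#       print(line)
```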
Troubleshooting: Poor transcription quality
Transcription quality depends on audio quality. For the best results:
- Use recordings with clear audio and minimal background noise
- Recordings where speakers take turns produce better results than those with lots of overlapping speech
- Higher quality audio formats (WAV) may produce slightly better results than compressed formats (MP3)
- Transcribe works best with English audio, though it supports many languages