Step 3: Upload & Process Document - FOI Redaction Walkthrough
Upload your sample document and watch AI detect PII in real time
Step 3 of 5 • 2 minutes
Upload & Process Document
Upload the FOI document you downloaded and watch AI automatically detect personal information in real time.
Expected outcome
- Document uploads successfully to the interface
- Processing completes within 15-30 seconds
- PII detection results appear automatically
- You see the wow moment of instant PII identification
Upload your document
- Locate the upload area in the redaction interface. You should see a drag-and-drop zone or a "Choose file" button.
- Upload the sample document you downloaded. Either drag the PDF file onto the drop zone, or click "Choose file" and select it from your downloads folder.
- Wait for upload confirmation. You'll see a progress indicator while the file uploads. For our sample documents (under 2MB), this takes 1-3 seconds.
- Watch the processing indicator. After upload, AI processing begins automatically. You'll see a status message like "Analyzing document for PII..."
Processing time: Sample documents process in 15-30 seconds. The AI is reading the entire document, extracting text, identifying PII entities, calculating confidence scores, and preparing the redacted output.
What's happening during processing
Behind the scenes, the AI performs these steps:
Document upload to S3
Your sample document is securely uploaded to an encrypted S3 bucket for processing. (1-3 seconds)
Text extraction (if needed)
If your document is a scanned PDF, Amazon Textract extracts text using OCR. Typed PDFs skip this step. (5-10 seconds for scanned documents)
PII entity detection
Amazon Comprehend analyzes the text and identifies PII entities (names, addresses, phone, email, etc.). Each detection gets a confidence score. (8-15 seconds)
Results preparation
The system prepares both the detection results (with locations and confidence scores) and the redacted output document. (2-5 seconds)
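The detection and results-preparation steps can be sketched roughly as follows. This is a minimal illustration, not the demo's actual Lambda code: it assumes the entity list has the shape returned by Amazon Comprehend's DetectPiiEntities (each entity carrying Type, Score, BeginOffset, and EndOffset), and the redact helper, the 0.8 confidence threshold, and the sample text are all hypothetical.

```python
def redact(text, entities, min_score=0.8):
    """Replace each sufficiently confident PII span with a [TYPE] placeholder."""
    kept = [e for e in entities if e["Score"] >= min_score]
    # Work from the end of the text so earlier offsets stay valid.
    for e in sorted(kept, key=lambda e: e["BeginOffset"], reverse=True):
        text = text[: e["BeginOffset"]] + f"[{e['Type']}]" + text[e["EndOffset"]:]
    return text


sample = "Contact Jane Doe at jane@example.org."
# Hand-written entities in the DetectPiiEntities response shape (illustrative values).
entities = [
    {"Type": "NAME", "Score": 0.99, "BeginOffset": 8, "EndOffset": 16},
    {"Type": "EMAIL", "Score": 0.98, "BeginOffset": 20, "EndOffset": 36},
    {"Type": "ADDRESS", "Score": 0.40, "BeginOffset": 0, "EndOffset": 7},
]
redacted = redact(sample, entities)
print(redacted)  # Contact [NAME] at [EMAIL].
```

The low-confidence ADDRESS entry is left untouched here, which is one way confidence scores can be used to route uncertain detections to a human reviewer instead of redacting them automatically.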
The wow moment: AI reads and understands your FOI document
In about 30 seconds, AI has read your entire FOI response, identified personal information across all pages, calculated a confidence score for each detection, and prepared a redacted version. The same work would take an FOI officer 20-30 minutes of careful line-by-line review.
Technical detail
Processing status messages
During processing, you'll see status updates:
- Uploading...: Document is being uploaded to secure storage (1-3 seconds)
- Analyzing document...: AI is reading the document and detecting PII entities (15-25 seconds)
- Processing complete: PII detection finished successfully. Results are ready to review.
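To see how the per-stage timings quoted in the processing steps add up, here is a small sketch. The stage names are hypothetical labels; the second ranges are copied from the walkthrough text and should be read as approximate.

```python
# (min_seconds, max_seconds) per stage, copied from the walkthrough text.
STAGES = {
    "upload": (1, 3),
    "text_extraction": (5, 10),  # scanned PDFs only
    "pii_detection": (8, 15),
    "results_preparation": (2, 5),
}


def total_range(scanned):
    """Sum the per-stage ranges, skipping OCR for typed PDFs."""
    stages = [s for name, s in STAGES.items() if scanned or name != "text_extraction"]
    return sum(lo for lo, _ in stages), sum(hi for _, hi in stages)


typed_total = total_range(scanned=False)
scanned_total = total_range(scanned=True)
print(typed_total, scanned_total)  # (11, 23) (16, 33)
```

On these figures a scanned document could run slightly past the 30-second estimate, so a spinner that lasts a few seconds longer is not necessarily a fault.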
While you wait...
Processing takes 15-30 seconds. Consider:
- In production: This would run automatically when FOI responses are prepared
- Volume processing: The system can handle multiple documents simultaneously
- Time savings: 30 seconds vs 20-30 minutes of manual review, roughly 40-60× faster
- Consistency: AI doesn't get tired, so it avoids the fatigue-related misses that affect long manual reviews
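The 40-60× figure follows directly from the two timings. A quick check, using a hypothetical speedup helper:

```python
def speedup(manual_minutes, automated_seconds=30):
    """How many times faster the automated run is than manual review."""
    return manual_minutes * 60 / automated_seconds


low, high = speedup(20), speedup(30)
print(low, high)  # 40.0 60.0
```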
Troubleshooting
Upload fails immediately
If the upload fails before processing starts:
- Check that the file size is under 10MB (our samples are all under 2MB)
- Verify that the file format is PDF, DOCX, or TXT
- Check that the file isn't corrupted (try opening it locally first)
- Check browser console for CORS or network errors
- Try refreshing the redaction interface and uploading again
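The size and format checks can be run locally before retrying the upload. The sketch below is illustrative only: upload_problems is a hypothetical helper, and it assumes "10MB" means 10 × 1024 × 1024 bytes, which the interface may define differently.

```python
import tempfile
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # assumed meaning of the 10MB limit
ALLOWED_SUFFIXES = {".pdf", ".docx", ".txt"}


def upload_problems(path):
    """Return a list of reasons the file would likely be rejected (empty if none)."""
    p = Path(path)
    if not p.is_file():
        return [f"{p} does not exist or is not a file"]
    problems = []
    if p.suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")
    size = p.stat().st_size
    if size == 0:
        problems.append("file is empty")
    elif size >= MAX_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES}")
    return problems


# Demonstration with a throwaway file instead of a real sample document.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.pdf"
    sample.write_bytes(b"%PDF-1.4 placeholder")
    ok_result = upload_problems(sample)
    missing_result = upload_problems(Path(tmp) / "missing.pdf")
```

This only checks size and extension. It does not confirm that the file opens correctly, so opening it locally is still worth doing.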
Processing never completes (timeout)
If the processing spinner continues beyond 60 seconds:
- Refresh the page, since processing may have completed without the UI updating
- Check the S3 bucket for uploaded files (confirms upload succeeded)
- Check CloudWatch logs for Lambda function errors
- Verify Comprehend API is responding (check AWS service health dashboard)
- Try with a different sample document to isolate the issue
- Check that the Lambda function timeout is set to at least 60 seconds
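The Lambda timeout check can also be scripted. This sketch assumes you pass in a boto3 Lambda client, whose get_function_configuration call returns the configured Timeout in seconds. The helper name and the function name in the usage comment are hypothetical, so substitute the function created by your CloudFormation stack.

```python
def lambda_timeout_ok(lambda_client, function_name, minimum_seconds=60):
    """Return (ok, configured_timeout_seconds) for the named Lambda function."""
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    timeout = config["Timeout"]  # seconds
    return timeout >= minimum_seconds, timeout


# Usage against a real account (requires boto3 and AWS credentials):
#   import boto3
#   ok, timeout = lambda_timeout_ok(boto3.client("lambda"), "your-redaction-function")
#   print("OK" if ok else f"Timeout is only {timeout}s; raise it to at least 60s")
```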
Processing fails with error message
If you see an error message after processing starts:
- Check the error message for specifics (API quota, permission denied, etc.)
- Verify that the Lambda execution role has IAM permission for the Comprehend DetectPiiEntities API (IAM action comprehend:DetectPiiEntities)
- Check that Comprehend API quotas have not been exceeded (1,000 requests per second by default; confirm the current limit for your account in the Service Quotas console)
- Review CloudWatch logs for detailed error stack traces
- Wait 1-2 minutes and retry, in case of a temporary AWS service issue
- Check that the document content isn't triggering Comprehend content filters
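The "wait and retry" advice applies only to temporary failures; a permission or quota configuration error will fail the same way on every attempt. A minimal sketch of that distinction follows. The retry helper and the is_transient predicate are hypothetical, and which errors count as transient is an assumption you would adapt to the messages you actually see.

```python
import time


def retry(operation, is_transient, attempts=3, delay_seconds=60, sleep=time.sleep):
    """Call operation(); on a transient failure wait and try again, up to attempts times."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt, or at once if retrying cannot help.
            if attempt == attempts or not is_transient(exc):
                raise
            sleep(delay_seconds)


# Demonstration: an operation that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("temporary service issue")
    return "processed"


waits = []  # record the waits instead of actually sleeping
result = retry(flaky, is_transient=lambda e: isinstance(e, TimeoutError), sleep=waits.append)
```

Passing sleep as a parameter keeps the demonstration instant. In real use the default time.sleep would pause for the full delay between attempts.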
Something went wrong? Get help
If you're stuck or encounter unexpected behavior:
- Contact the NDX:Try support team
- Check AWS Service Health Dashboard for Comprehend outages
- Review the FOI Redaction scenario troubleshooting guide
- Try deleting and redeploying the CloudFormation stack