How It Works
Understand the architecture and data flows powering Simply Readable’s document translation and Easy Read features
How It Works
Simply Readable orchestrates 12 AWS services to power document translation into 75 languages and Easy Read conversion using generative AI. Here's how it all fits together.
Architecture overview
The application uses an event-driven architecture built on AWS Step Functions. When a user uploads a document, the system automatically routes it through the appropriate processing pipeline — either translation via Amazon Translate or Easy Read conversion via Amazon Bedrock.
High-level flow
- User uploads a document through the React web app (hosted on CloudFront + S3)
- Document is stored in an identity-scoped S3 path (
private/{cognito_sub}/*) - A DynamoDB stream triggers an EventBridge Pipe, which starts a Step Functions workflow
- For translation: Step Functions calls Amazon Translate directly
- For Easy Read: The document is parsed (DOCX to HTML to Markdown), then sent to Amazon Bedrock (Claude 3 Haiku for text simplification, Nova Canvas for image generation)
- Results are written back to S3 and the user is notified via AppSync subscriptions
AWS services used
Amazon Translate
Translates documents into 75 languages. Supports .docx, .xlsx, .pptx, .txt, and .html formats.
Amazon Bedrock
Powers Easy Read conversion. Claude 3 Haiku simplifies text; Nova Canvas generates supporting images.
AWS Step Functions
Orchestrates processing workflows with error handling and parallel processing.
Amazon Cognito
Authentication and identity-scoped S3 access ensures users only see their own documents.
AWS AppSync
GraphQL API with real-time subscriptions for live progress updates.
Amazon DynamoDB
Stores job metadata and status. DynamoDB Streams trigger workflows via EventBridge Pipes.
AWS Lambda
12 functions for document parsing, Bedrock invocation, and AppSync mutations.
Amazon CloudFront + S3
Serves the React app with TLS 1.2+. Separate buckets for translation and readable content.
AWS WAF
Protects the AppSync API with AWS Managed Rules.
Amazon Comprehend
Detects document language automatically for translation source language selection.
Easy Read processing pipeline
- Document import — Upload triggers a DynamoDB Stream event via EventBridge Pipes
- DOCX to HTML — Lambda uses mammoth.js to extract structured content
- HTML to Markdown — Lambda uses TurndownService to create clean text
- Split into sections — Markdown is split into chunks for AI processing
- Text simplification — Claude 3 Haiku rewrites each section in plain English
- Image generation — Nova Canvas generates supporting images
- Assembly — Results are combined and stored in S3 for download
Security patterns
- Identity-scoped storage — Documents stored under Cognito identity ID in S3
- Dual authentication — AppSync uses Cognito User Pools and IAM authorization
- WAF protection — AWS Managed Rules protect the API
- Encryption — AES-256 at rest, TLS 1.2+ in transit
- Least privilege — Purpose-specific IAM roles for each Lambda