Walkthrough complete — Paperless-ngx with AI

You've explored a working AI-classified document archive on AWS

You've just put 36 parish documents through an OCR and Bedrock classification pipeline, browsed the archive Bedrock built, and asked the documents plain-English questions. In 15 minutes you've seen a single deployment do what would otherwise take a multi-vendor procurement covering an EDMS, an OCR engine and an AI provider.

What you've learned

📄 Open-source, env-var configured

Upstream Paperless-ngx is configured entirely through environment variables: no fork, no patch, no licence fee. The same image you'd download for a home server runs unchanged on Fargate.

Value: zero licence cost, no vendor lock-in, an active community of tens of thousands.
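
To make the environment-variable point concrete, here is a minimal sketch of how a settings dictionary could be turned into the name/value list that an ECS container definition expects. The variable names are documented Paperless-ngx settings, but every value, the hostnames and the `container_definition` helper are placeholders assumed for illustration, not the configuration this demo actually uses.

```python
import json

# Documented Paperless-ngx settings. The values are placeholders (assumptions),
# not the demo's real configuration.
PAPERLESS_ENV = {
    "PAPERLESS_URL": "https://archive.example.gov.uk",
    "PAPERLESS_DBHOST": "db.example.internal",
    "PAPERLESS_REDIS": "redis://cache.example.internal:6379",
    "PAPERLESS_OCR_LANGUAGE": "eng",
    "PAPERLESS_TIME_ZONE": "Europe/London",
}


def container_definition(image, env):
    """Return an ECS-style container definition with env vars in name/value form."""
    return {
        "name": "paperless",
        "image": image,
        # 8000 is the port the upstream image listens on by default.
        "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
        "environment": [{"name": k, "value": v} for k, v in sorted(env.items())],
    }


definition = container_definition(
    "ghcr.io/paperless-ngx/paperless-ngx:latest", PAPERLESS_ENV
)
print(json.dumps(definition, indent=2))
```

In a real deployment, secrets such as the database password would normally come from a secrets store rather than plain environment values.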

🧠 AI applied at the right layer

Bedrock doesn't replace OCR or search. It enriches the metadata Paperless already holds: title, tags, document type, correspondent and summary. Everything else (the search box, the inbox tag, the document detail view) is the upstream UI that Paperless users already know.

Value: AI where it adds value, conventional UI everywhere else.
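
The enrichment step can be pictured as a small validation layer between the model and the archive. The sketch below assumes the model is asked to reply in JSON with the five fields named above. The `to_metadata` helper, the sample reply and the tag names are all invented for illustration, and how this demo actually prompts Bedrock or writes back to Paperless is not shown here.

```python
import json

# The five metadata fields named in the text above.
REQUIRED = ("title", "tags", "document_type", "correspondent", "summary")


def to_metadata(model_text, known_tags):
    """Validate the model's JSON reply and keep only the five metadata fields."""
    data = json.loads(model_text)
    missing = [field for field in REQUIRED if field not in data]
    if missing:
        raise ValueError(f"model reply is missing: {missing}")
    metadata = {field: data[field] for field in REQUIRED}
    # Drop tags the archive does not already define, so the model cannot invent new ones.
    metadata["tags"] = [tag for tag in metadata["tags"] if tag in known_tags]
    return metadata


# Invented example reply, for illustration only.
reply = json.dumps({
    "title": "Parish Council Meeting Minutes, March",
    "tags": ["minutes", "finance", "not-a-real-tag"],
    "document_type": "Minutes",
    "correspondent": "Parish Clerk",
    "summary": "Council approved the annual budget.",
    "confidence": 0.93,
})
metadata = to_metadata(reply, known_tags={"minutes", "finance", "planning"})
print(metadata["tags"])  # ['minutes', 'finance']
```

Restricting tags to a known set is one possible design choice: it keeps the archive's taxonomy under human control while still letting the model do the sorting.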

🔍 Retrieval-augmented chat with citations

Every chat answer is grounded in the documents you've actually consumed and cites them by source. The Knowledge Base + S3 Vectors + Guardrails stack is fully managed — there's no vector DB cluster to size, no embedding pipeline to maintain, no separate moderation system.

Value: answers you can trace back to source documents, FOI-defensible, with far less room for hallucinated content.
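
For readers new to retrieval-augmented generation, the toy sketch below shows the general shape of the pattern: pick the most relevant passages, number them, and instruct the model to answer only from those numbered sources. It deliberately swaps in simple keyword overlap for the vector search that the managed Knowledge Base performs, and the documents, file names and helper functions are invented for illustration.

```python
import re

# Invented sample passages; the real archive holds the 36 parish documents.
DOCUMENTS = [
    {"source": "minutes-2024-03.pdf",
     "text": "The council approved the playground repair budget of 4,000 pounds."},
    {"source": "allotments.pdf",
     "text": "Allotment rents are reviewed every April."},
    {"source": "newsletter.pdf",
     "text": "The summer fete takes place on the village green."},
]


def tokens(text):
    """Lower-case word set, used as a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Return up to k documents sharing the most words with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(d["text"])), d) for d in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(question, passages):
    """Number each passage so the model can cite it as [1], [2], and so on."""
    sources = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(passages, 1)
    )
    return (
        "Answer using only the numbered sources and cite them like [1].\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


question = "What budget was approved for the playground repair?"
passages = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

Because the prompt carries the source file name alongside each passage, any citation in the answer can be mapped back to a specific document in the archive.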

🛡️ Guardrails as a first-class control

Bedrock Guardrails block political opinion, medical and legal advice, and anonymise UK PII (names, addresses, phone, email, NI numbers, NHS numbers, payment cards) at the answer layer — before the response leaves AWS, regardless of what the model would otherwise say.

Value: defensible PII handling, consistent content safety across every chat.
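
The idea of masking at the answer layer can be illustrated with a few lines of code. This is not how Bedrock Guardrails works internally, and it covers only two of the PII types listed above. It is a regex-based sketch, with patterns and placeholder labels chosen for this example, showing a response being rewritten after the model has produced it and before anyone sees it.

```python
import re

# Illustrative patterns only (assumptions); a managed service covers far more cases.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    # UK National Insurance number: two letters, six digits, suffix letter A-D.
    "UK_NI_NUMBER": re.compile(r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b"),
}


def anonymise(answer):
    """Replace each matched PII value with a placeholder naming its type."""
    for label, pattern in PATTERNS.items():
        answer = pattern.sub("{" + label + "}", answer)
    return answer


raw = "Contact the clerk at clerk@example.org, NI number QQ 12 34 56 C."
print(anonymise(raw))
# Contact the clerk at {EMAIL}, NI number {UK_NI_NUMBER}.
```

The key property is where the masking sits: it applies to the finished answer, so it holds regardless of what the model was asked or how it was prompted.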

Production readiness

This demo runs the same upstream image used in production by tens of thousands of households and small organisations. For a council deployment you'd additionally want:

| Feature | Demo status | Production requirement |
| --- | --- | --- |
| Authentication | Built-in admin user | SSO via SAML or OIDC, MFA |
| Custom domain | CloudFront URL | archive.yourcouncil.gov.uk + ACM certificate |
| Document ingestion | 36 sample parish documents | IMAP scrape, scanner-direct upload, batch import |
| Backup & retention | Aurora 1-day automated | Point-in-time recovery, retention policy aligned with statutory requirements |
| Audit log | CloudWatch Logs | Centralised SIEM, immutable logging, FOI-defensible record |
| Bedrock cost controls | Per-document classification | Budget alerts, model usage caps, on-prem fallback for sensitive content |
| Guardrails | Default content / topic / PII set | Topic and word lists tuned to your council's policies |
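
As one illustration of what a model usage cap might look like in practice, the sketch below tracks estimated classification spend against a monthly limit, raising an alert at a threshold and refusing further calls once the cap would be exceeded. The `UsageCap` class, the 80% alert threshold and the per-token prices are all hypothetical. Real prices depend on the model and region, and a production setup would more likely pair this with the cloud provider's own budget alerts.

```python
class UsageCap:
    """Track estimated classification spend against a monthly cap."""

    def __init__(self, monthly_cap, alert_fraction=0.8):
        self.monthly_cap = monthly_cap
        self.alert_at = monthly_cap * alert_fraction
        self.spent = 0.0

    def record(self, input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
        """Return 'ok', 'alert' or 'blocked' for one classification call."""
        cost = (input_tokens / 1000 * price_in_per_1k
                + output_tokens / 1000 * price_out_per_1k)
        if self.spent + cost > self.monthly_cap:
            # Over the cap: do not record the spend, so the caller can skip or defer.
            return "blocked"
        self.spent += cost
        return "alert" if self.spent >= self.alert_at else "ok"


# Hypothetical prices and cap, chosen to make the arithmetic easy to follow.
cap = UsageCap(monthly_cap=1.0)
statuses = [cap.record(1000, 0, 0.5, 1.0) for _ in range(3)]
print(statuses)  # ['ok', 'alert', 'blocked']
```

A blocked call is a natural point to queue the document for later or route it to a fallback path instead of dropping it.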

The point isn't that this scenario is production-ready out of the box. The point is that the heavy lifting — OCR, classification, RAG, Guardrails, multi-format conversion — is already done. The bits left are integration with your council's identity, network, retention rules and procurement framework.

Next steps

Generate Evidence Pack

Create your business case with what you've learned. Perfect for committee papers.

Return to Paperless-ngx with AI

Review deployment options, costs, and technical details.

Try Another

Explore more AI scenarios for local government.
