Skip to main content

This is a new service. Help us improve it and give your feedback by email.

Paperless-ngx with AI Walkthrough

Step-by-step guide to a searchable, AI-classified parish document archive on AWS

Try Paperless-ngx with AI

15 minutes to see your filing cabinet, searchable

Sign in to a working Paperless-ngx archive, see how Amazon Bedrock has tagged and titled every document, then ask plain-English questions of the archive.
No technical knowledge required

This walkthrough takes you through Paperless-ngx (opens in new tab) as a parish clerk or council records officer would use it. You'll sign in, browse documents that Amazon Bedrock has already classified, search the OCR'd text, and chat with your archive. Perfect for clerks, records managers, FOI officers, and digital transformation leads.

Paperless-ngx (opens in new tab) is the open-source document archive trusted by tens of thousands of households and organisations. This NDX:Try scenario lifts the upstream Docker stack onto AWS Fargate (Django + Postgres + Redis + Apache Tika + Gotenberg) and adds Amazon Bedrock on top, so every document that lands in the consume folder is OCR'd, auto-titled, auto-tagged, classified, the correspondent extracted, and a 1-2 sentence summary stored against it. A separate chat UI lets you ask questions of the archive using Amazon Bedrock Knowledge Base over Amazon S3 Vectors with Bedrock Guardrails for content safety and PII anonymisation.

What you'll do

In this 15-minute walkthrough, you'll:

  1. Sign in to Paperless-ngx — Find your URL and admin password from the stack outputs and access the dashboard (2 minutes)
  2. Browse the AI-classified archive — See how Amazon Nova Pro has tagged, titled, classified and extracted correspondents from 36 sample parish documents (5 minutes)
  3. Open a document and search — Drop into a single document to see the OCR'd content, then run a full-text search (3 minutes)
  4. Chat with the archive — Ask plain-English questions of the documents and watch Bedrock Knowledge Base return answers with citations and Guardrails redact PII (5 minutes)

Before you start

Important You must have already requested the Paperless-ngx with AI scenario via NDX:Try.

Deploy the Paperless-ngx scenario if you haven't already. Deployment takes about 25 minutes (Aurora Serverless, ElastiCache, Bedrock Knowledge Base and CloudFront all provision in parallel).

Sample data check

Ready Sample data is pre-loaded and ready to use.

What you'll learn

For records officers and clerks

  • What an AI-classified document archive looks like in practice
  • How OCR + AI tagging change retrieval time vs. a paper filing cabinet
  • How chat-with-archive supports FOI requests and audit
  • How PII anonymisation can be applied at the answer layer

For technical staff

  • The Paperless-ngx post-consume hook pattern with Amazon Nova Pro
  • Amazon Bedrock Knowledge Base over Amazon S3 Vectors
  • Bedrock Guardrails (content / topic / PII filters)
  • Multi-container Fargate (Django + Tika + Gotenberg sidecars)
  • How an upstream image is configured entirely via env vars (no fork)
Start walkthrough

Takes approximately 15 minutes

Build: 38afc52 (opens in new tab)