I Built a Document Triage with Telegram, n8n, and AWS Bedrock — 6 Decisions That Shaped a Self-Hosted AI Document Analyst

The volume of documents landing on my mobile outpaces my ability to read them. Research papers, articles, books — shared as PDFs across channels faster than any individual can process. The question isn't 'how do I read more?' — it's 'how do I decide what deserves my attention in 30 seconds, not 30 minutes?'

So I built this triage system.

Telegram mandates HTTPS for webhooks. I had no domain to attach a certificate to. Here's how a self-signed cert, an Nginx proxy, and a reverse-engineered secret token got a PDF-summarising bot into production on EC2 with AWS Bedrock doing the thinking.

TL;DR: No domain → self-signed cert uploaded to Telegram API → Nginx TLS termination → n8n orchestration → Bedrock summarisation. Zero access keys. Full architecture below.

Constraints
Architecture: Telegram to Bedrock on EC2
Key Design Decisions
What Failed First
The Implementation That Shipped
What I'd Do Differently
Takeaways for Architects

Constraints

Constraint	Value
Domain	None available — no Route 53 hosted zone, no ACM certificate possible
HTTPS	Mandatory — Telegram rejects webhook registration without TLS
Auth model	No long-lived credentials in config files or environment variables
Orchestration	Visual workflow preferred — needs to be modifiable without redeployment
LLM	AWS-native, no external API keys beyond what IAM provides

Architecture: Telegram to Bedrock on EC2

Data flow: User sends PDF → Telegram delivers webhook POST to EC2:443 → Nginx terminates TLS, proxies to n8n:5678 → n8n downloads file, extracts text, invokes Bedrock → response sent back via Telegram Bot API.

Key Design Decisions

Every architecture is a set of trade-offs made explicit. Here are the ones that shaped this system:

#	Decision	Why	Trade-off Accepted
1	Self-signed cert + Nginx (no domain needed)	Telegram accepts uploaded certs via `setWebhook` API; eliminates domain dependency entirely	Browser shows cert warnings; webhook registration is manual
2	n8n over custom code	Visual workflow with built-in Telegram, PDF extraction, and LLM chain nodes. Hours to build, not weeks.	Undocumented webhook secret behaviour; version-pinning required
3	EC2 direct deployment	Cloud infrastructure with native internet connectivity. Predictable networking for webhook delivery.	Monthly compute cost; single point of failure without ASG
4	IAM role, not access keys	Zero rotation burden, no exposure risk, automatic credential refresh via instance metadata	None — strictly superior for EC2-hosted workloads
5	Manual webhook with computed secret	n8n can't upload self-signed certs when registering webhooks; manual `setWebhook` call bridges the gap	Must re-run registration script after workflow changes
6	Encryption key in Secrets Manager	Key loss = total credential loss. Secrets Manager provides audit logging, durability, and prevents accidental exposure in config files	Extra API call at startup (~100ms latency)

What Failed First

Tunnel services: unreliable for webhook delivery

I tried localhost.run, cloudflared, and ngrok to expose a local n8n instance. All three established a tunnel, but HTTP traffic never arrived reliably. Tunnel services that hand out subdomains on their own domains (*.lhr.life, *.trycloudflare.com) introduce a dependency outside your control: DNS resolution, uptime, and connection stability are all delegated to a third party. For a webhook endpoint that needs to be reachable 24/7, that's a risk I wasn't willing to accept.

Webhook 403s: proxy trust misconfiguration

After deploying to EC2 with Nginx in front of n8n, every Telegram webhook returned 403 Forbidden. The fix: N8N_PROXY_HOPS=1. Without it, n8n doesn't trust the X-Forwarded-For header from Nginx and rejects requests as spoofed.

Encryption key mismatch: silent data loss

I recreated the Docker container with a different encryption key. n8n showed 'Set up owner account' — all existing workflows and credentials were gone. The encryption key and the Docker volume are a coupled pair. Change one without the other and you lose everything.
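One way to guard against this mismatch is a pre-flight check before starting the container. The sketch below assumes n8n records its key in a JSON file named config at the root of the data volume, under an encryptionKey field, and that the named volume sits at Docker's default host path. Both are assumptions to verify on your own instance, and stored_key and key_matches_volume are hypothetical helper names.

```shell
# Pre-flight sketch: refuse to start n8n when the key from Secrets Manager
# differs from the key already recorded in the data volume.

# Print the encryptionKey value found in an n8n config file (empty if absent).
stored_key() {
  sed -n 's/.*"encryptionKey"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" 2>/dev/null
}

# Succeed when the volume has no key yet (first run) or the keys are identical.
key_matches_volume() {
  existing_key="$(stored_key "$1")"
  [ -z "$existing_key" ] || [ "$existing_key" = "$2" ]
}

# Usage (the path is an assumption; adjust to your volume's mount point):
#   CONFIG=/var/lib/docker/volumes/n8n_data/_data/config
#   key_matches_volume "$CONFIG" "$N8N_KEY" || { echo "key mismatch" >&2; exit 1; }
```

Running the check before docker run turns a silent reset into an explicit failure, which is easier to notice and to roll back.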

The undocumented secret token

n8n v2.22.5 enforces a secret token on Telegram webhook requests. After extensive testing and reading the source code, I found the formula follows this pattern:

secret_token = {workflowId}_{nodeId}

The workflowId is visible in the browser URL. The nodeId appears when you click the Telegram Trigger node. This isn't surfaced in the n8n documentation or UI — it required empirical discovery. Without the correct secret, Telegram receives a 403 on every delivery and your bot stays silent.
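The pattern above can be sketched as a small helper. Telegram's Bot API only accepts A-Z, a-z, 0-9, underscore, and hyphen in secret_token, so the helper drops anything else. Whether n8n sanitises the value the same way is an assumption here, and telegram_secret_token is a hypothetical name.

```shell
# Sketch: derive the Telegram secret token from the two IDs described above.
# Characters outside Telegram's allowed set (A-Z, a-z, 0-9, "_", "-") are
# removed; this mirrors the API constraint and is assumed to match n8n.
telegram_secret_token() {
  printf '%s_%s' "$1" "$2" | tr -cd 'A-Za-z0-9_-'
}

# Usage with placeholder IDs (replace with the values from your own workflow):
#   telegram_secret_token "<workflowId>" "<nodeId>"
```

Computing the token in a script rather than by hand avoids copy-paste slips when the workflow or node is recreated and the IDs change.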

The Implementation That Shipped

Docker: n8n with IAM role access

# Fetch encryption key from Secrets Manager (never hardcode)
N8N_KEY=$(aws secretsmanager get-secret-value \
  --secret-id <your-secret-id> \
  --region <your-region> \
  --query SecretString \
  --output text)

sudo docker run -d \
  --name n8n \
  --restart always \
  -e WEBHOOK_URL=https://<your-ec2-public-ip> \
  -e N8N_EDITOR_BASE_URL=https://<your-ec2-public-ip> \
  -e N8N_ENCRYPTION_KEY="${N8N_KEY}" \
  -e GENERIC_TIMEZONE=Europe/London \
  -e N8N_AWS_SYSTEM_CREDENTIALS_ACCESS_ENABLED=true \
  -e N8N_PROXY_HOPS=1 \
  -e NODE_TLS_REJECT_UNAUTHORIZED=0 \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n:2.22.5

Key decisions embedded in this config:

N8N_AWS_SYSTEM_CREDENTIALS_ACCESS_ENABLED=true — uses EC2 IAM role, no access keys
N8N_PROXY_HOPS=1 — trusts exactly one proxy layer (Nginx)
N8N_ENCRYPTION_KEY fetched from Secrets Manager at runtime, never hardcoded in a script or config file (it is still visible to anyone who can run docker inspect on the host)
Version pinned to 2.22.5 — webhook secret behaviour changes between versions
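A quick way to confirm that the host really is running on the instance role is to look at the caller identity and at the environment. This is a minimal sketch: is_assumed_role_arn and no_static_keys_in_env are hypothetical helpers, and the check only covers the shell it runs in, not files such as ~/.aws/credentials.

```shell
# Sketch: confirm the host is using the instance role rather than static keys.

# Succeeds when an ARN looks like temporary role credentials.
is_assumed_role_arn() {
  case "$1" in
    arn:aws*:sts::*:assumed-role/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds when no static access keys are exported in the current environment.
no_static_keys_in_env() {
  [ -z "${AWS_ACCESS_KEY_ID:-}" ] && [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]
}

# Usage on the EC2 host (requires the AWS CLI):
#   arn="$(aws sts get-caller-identity --query Arn --output text)"
#   no_static_keys_in_env && is_assumed_role_arn "$arn" && echo "instance role in use"
```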

Nginx: TLS termination

server {
    listen 443 ssl;
    server_name _;

    ssl_certificate     /etc/ssl/certs/n8n.pem;
    ssl_certificate_key /etc/ssl/private/n8n.key;

    client_max_body_size 50m;

    location / {
        proxy_pass http://localhost:5678;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
        proxy_buffering off;
        proxy_request_buffering off;
    }
}

proxy_request_buffering off is critical — without it, Nginx buffers the request body and n8n fails to parse multipart uploads (PDFs).
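The certificate and key that the Nginx config points at can be produced with openssl. The sketch below follows the command shape from Telegram's webhook guide, with the certificate CN set to the address Telegram connects to. The 365-day validity and the helper names cert_subject and make_self_signed_cert are illustrative assumptions.

```shell
# Sketch: create the self-signed pair referenced by the Nginx config.
# Telegram's webhook guide requires the certificate CN to match the host it
# connects to, which here is the instance's public IP.

# Build the -subj argument for a given IP or hostname.
cert_subject() {
  printf '/CN=%s' "$1"
}

# Generate key ($2) and certificate ($3); 365 days is an arbitrary choice.
make_self_signed_cert() {
  openssl req -x509 -newkey rsa:2048 -sha256 -nodes -days 365 \
    -keyout "$2" -out "$3" -subj "$(cert_subject "$1")"
}

# Usage (run as root; <your-ec2-public-ip> is a placeholder):
#   make_self_signed_cert "<your-ec2-public-ip>" /etc/ssl/private/n8n.key /etc/ssl/certs/n8n.pem
```

Because the CN is tied to the IP, the certificate has to be regenerated and re-uploaded if the instance's public address changes. An Elastic IP avoids that on stop and start.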

Webhook registration: the self-signed cert trick

curl -s \
  -F "url=https://<your-ec2-public-ip>/webhook/<your-webhook-id>/webhook" \
  -F "certificate=@/etc/ssl/certs/n8n.pem" \
  -F "secret_token=<workflowId>_<nodeId>" \
  "https://api.telegram.org/bot<your-bot-token>/setWebhook"

Three things happen in this single call:

Telegram learns the webhook URL
The self-signed public cert is uploaded — Telegram will trust it for future deliveries
The secret token is registered. Telegram sends it in the X-Telegram-Bot-Api-Secret-Token header on every delivery, and n8n validates it
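Whether the registration took effect can be checked with Telegram's getWebhookInfo method, which reports the stored URL, whether a custom certificate is on file, and the last delivery error. The sketch below parses the response with grep and sed to avoid a jq dependency. The helper names are hypothetical.

```shell
# Sketch: confirm the registration using Telegram's getWebhookInfo.

# Fetch the raw WebhookInfo JSON for a bot token.
webhook_info() {
  curl -s "https://api.telegram.org/bot$1/getWebhookInfo"
}

# Succeeds when the JSON reports that a custom certificate is registered.
has_custom_certificate() {
  printf '%s' "$1" | grep -Eq '"has_custom_certificate"[[:space:]]*:[[:space:]]*true'
}

# Print last_error_message if Telegram recorded a delivery failure.
last_error_message() {
  printf '%s' "$1" | sed -n 's/.*"last_error_message"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage:
#   info="$(webhook_info "<your-bot-token>")"
#   has_custom_certificate "$info" || echo "certificate was not uploaded" >&2
#   last_error_message "$info"
```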

Workflow node chain

The LLM prompt:

Analyse the following document and provide:
1. Title/Subject
2. Key Insights (3-5 bullet points)
3. Summary (2-3 paragraphs)

Document text:
{{ $json.text }}

System message: 'You are a document analyst. Provide clear summaries. Format for Telegram.'

What I'd Do Differently

Register a domain. Self-signed certs work but add operational friction: manual webhook re-registration after cert renewal, browser warnings on the editor. A domain costs as little as $5/year, and a free certificate (from ACM behind a load balancer or CloudFront, or from Let's Encrypt directly on the instance) makes the entire self-signed complexity disappear. For an MVP this was acceptable; for anything beyond, it's the first thing I'd change.

Switch to PostgreSQL for n8n's backend. n8n officially supports PostgreSQL as its production database (SQLite is the default; MySQL/MariaDB are deprecated). SQLite lives inside the Docker volume — it locks on writes, doesn't support safe hot backups, and is incompatible with n8n's queue mode for horizontal scaling. Amazon RDS for PostgreSQL or Aurora Serverless v2 would give managed backups, point-in-time recovery, and a path to multi-worker deployments without touching the application layer.

Add an Auto Scaling Group with min=1 and a launch template. The current architecture is a single EC2 instance — one availability zone, one point of failure. An ASG with the same user-data script gives self-healing (automatic replacement on health check failure) without adding architectural complexity. Combined with a domain and ACM, this moves the system from 'working prototype' to 'production-grade' with minimal additional cost.

Takeaways for Architects

Self-signed certificates are a valid pattern — but only when the webhook consumer explicitly supports certificate upload. Telegram does. Most services (Stripe, GitHub, Slack) do not. Validate this before committing to the architecture.

IAM roles are strictly superior to access keys for EC2 workloads. No rotation, no exposure risk, automatic refresh. There's no trade-off here — just use them.

Pin your versions. n8n's webhook secret enforcement appeared between versions without a migration path. n8n:latest is a liability. Pin, test upgrades in staging, keep a rollback plan.

Debug webhooks layer by layer. Five distinct failure modes: network, TLS, authentication, application logic, response formatting. Each must pass before testing the next. Skipping layers leads to circular troubleshooting.
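The layer-by-layer approach can be sketched as a probe that imitates a Telegram delivery and maps the HTTP status to the layer most likely at fault. The mapping is a heuristic drawn from the failures described in this article, not an exhaustive diagnosis, and suspect_layer and probe_webhook are hypothetical helper names.

```shell
# Sketch: map the HTTP status of a simulated delivery to a suspect layer.
suspect_layer() {
  case "$1" in
    000) echo "network or TLS: port 443 unreachable or handshake failed" ;;
    403) echo "authentication: check secret token and N8N_PROXY_HOPS" ;;
    404) echo "application: webhook path not registered or workflow inactive" ;;
    5??) echo "application: n8n or Nginx upstream error" ;;
    2??) echo "ok: delivery accepted" ;;
    *)   echo "unclassified: inspect Nginx and n8n logs" ;;
  esac
}

# POST an empty update to URL ($1) with secret ($2) and print the status code.
# -k skips certificate verification because the cert is self-signed;
# curl reports 000 when no HTTP response was received at all.
probe_webhook() {
  curl -sk -o /dev/null -w '%{http_code}' -X POST \
    -H 'Content-Type: application/json' \
    -H "X-Telegram-Bot-Api-Secret-Token: $2" \
    -d '{}' "$1"
}

# Usage:
#   suspect_layer "$(probe_webhook "https://<your-ec2-public-ip>/webhook/<your-webhook-id>/webhook" "<workflowId>_<nodeId>")"
```

Running the probe from outside the instance exercises the security group and Nginx as well as n8n, which is closer to what Telegram sees than a request to localhost:5678.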

Put intelligence in the prompt, not the infrastructure. The entire 'AI' part of this system is a 5-line prompt template. The other 95% of effort was infrastructure — certificates, secrets, networking, Docker volumes. The model is a commodity. Getting data to and from it reliably is the craft.

What's your approach when you need HTTPS webhooks but can't get a domain? I'd be curious whether others have hit the same self-signed cert pattern — or found a better workaround. Drop a comment below.
