How it works
Every external source implements one TypeScript interface,
Connector, with optional search() and
start(sink) capabilities. Search-capable connectors
respond to the Saga search box in real time. Feed-capable
connectors poll on their own schedule and write normalized
FeedItem records into Postgres on AWS RDS.
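
As a rough illustration of the contract described above, the interface might look something like the sketch below. Only Connector, FeedItem, search() and start(sink) come from the text; the field names, the FeedSink type, the capabilities() helper and the demo connector are assumptions made for the example.

```typescript
// Illustrative shapes only: field names are assumptions, not the project's schema.
interface FeedItem {
  source: string;      // connector id, e.g. "politiloggen"
  externalId: string;  // upstream identifier, useful for de-duplication
  title: string;
  publishedAt: string; // ISO 8601 timestamp
  priority?: string;   // e.g. "P1", when the source provides one
}

// The sink is whatever persists items; in production it would write to Postgres.
type FeedSink = (item: FeedItem) => void;

interface Connector {
  id: string;
  // Search capability: answers an interactive query.
  search?(query: string): Promise<FeedItem[]>;
  // Feed capability: begins polling and returns a function that stops it.
  start?(sink: FeedSink): () => void;
}

// Report which optional capabilities a connector implements.
function capabilities(c: Connector): string[] {
  const caps: string[] = [];
  if (typeof c.search === "function") caps.push("search");
  if (typeof c.start === "function") caps.push("feed");
  return caps;
}

// Hypothetical feed-only connector. It emits one hard-coded item on start;
// a real connector would fetch from upstream and repeat on its own timer.
const demoConnector: Connector = {
  id: "demo",
  start(sink) {
    sink({
      source: "demo",
      externalId: "1",
      title: "Example item",
      publishedAt: "2024-01-01T00:00:00Z",
    });
    return () => {}; // nothing to stop in this sketch
  },
};
```

Making both methods optional lets the host check at runtime which capabilities a connector offers, so a search-only source never has to stub out a feed.
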
A separate dispatcher worker reads the central FeedItem table on a
5-minute tick, applies configurable routing rules, and POSTs
the matches to Saga's Feed REST API as ninjs items. Decoupling ingest from
dispatch means we can change rules ("send all Politiloggen P1
to Saga main feed") on the fly without re-running connectors.
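
A minimal sketch of how such a routing rule could be evaluated, assuming rules match on source and an optional priority. The RoutingRule shape, the feed name "saga-main", the URI scheme and the FeedItem fields are all hypothetical. The ninjs mapping covers only a handful of properties; Saga's API may well require more.

```typescript
// Hypothetical shapes: field and feed names are assumptions for illustration.
interface FeedItem {
  source: string;
  externalId: string;
  title: string;
  publishedAt: string; // ISO 8601 timestamp
  priority?: string;
}

interface RoutingRule {
  source: string;     // match on connector id
  priority?: string;  // when set, match only this priority
  targetFeed: string; // Saga feed that receives matching items
}

// Return every target feed whose rule matches the item, without duplicates.
function route(item: FeedItem, rules: RoutingRule[]): string[] {
  const targets = rules
    .filter((r) => r.source === item.source)
    .filter((r) => r.priority === undefined || r.priority === item.priority)
    .map((r) => r.targetFeed);
  return targets.filter((t, i) => targets.indexOf(t) === i);
}

// Map an item to a small subset of ninjs properties.
function toNinjs(item: FeedItem) {
  return {
    uri: "urn:example:" + item.source + ":" + item.externalId, // made-up URI scheme
    type: "text",
    headline: item.title,
    versioncreated: item.publishedAt,
  };
}

// "Send all Politiloggen P1 to Saga main feed", expressed as data.
const rules: RoutingRule[] = [
  { source: "politiloggen", priority: "P1", targetFeed: "saga-main" },
];
```

Because the rules are plain data evaluated at dispatch time, editing them changes where already-ingested items go on the next tick without touching the connectors.
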
Attached media (PDFs, images) is queued at ingest time and
pulled to S3 by a downloader worker. Editorial desks get
permanent access to source documents even when the upstream
source takes them down. Objects move to S3 Glacier Instant
Retrieval after 90 days, which keeps storage costs bounded.
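
The media flow above could be sketched roughly as follows: one download job per attachment at ingest time, plus a lifecycle rule for the 90-day transition. The MediaJob shape, the "media/" key prefix and the rule ID are assumptions. The lifecycle object follows the shape S3 accepts for bucket lifecycle configurations, with GLACIER_IR as the storage class for Glacier Instant Retrieval.

```typescript
// Hypothetical job record written to the download queue at ingest time.
interface MediaJob {
  itemId: string; // FeedItem the attachment belongs to
  url: string;    // upstream location to fetch from
  s3Key: string;  // destination key in the bucket
}

// Build one job per attachment URL. The key layout is an assumption.
function mediaJobs(itemId: string, urls: string[]): MediaJob[] {
  return urls.map((url, index) => ({
    itemId,
    url,
    s3Key: "media/" + itemId + "/" + index,
  }));
}

// Lifecycle configuration: move objects under media/ to
// Glacier Instant Retrieval 90 days after creation.
const mediaLifecycle = {
  Rules: [
    {
      ID: "media-to-glacier-ir", // assumed rule name
      Status: "Enabled",
      Filter: { Prefix: "media/" },
      Transitions: [{ Days: 90, StorageClass: "GLACIER_IR" }],
    },
  ],
};
```

Queuing the download separately from ingest means a slow or failing media host does not hold up the item itself.
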
Everything runs on AWS Fargate, ARM64, behind CloudFront, in
eu-north-1. Total infrastructure spend at idle: under €40/month.
Schema migrations apply automatically at container startup, so
deploys don't need a separate migration step.
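
One way a startup migration step could work is sketched below, assuming migrations carry sortable ids and the database records which ones have been applied. The Migration shape and the exec callback are hypothetical stand-ins; a real runner would execute SQL against Postgres and record each id in the same transaction.

```typescript
// Hypothetical migration record; ids are assumed to sort in apply order.
interface Migration {
  id: string;  // e.g. "001_create_feed_items"
  sql: string;
}

// Return migrations that have not been applied yet, in id order.
function pendingMigrations(all: Migration[], applied: string[]): Migration[] {
  return all
    .filter((m) => applied.indexOf(m.id) === -1)
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
}

// Apply pending migrations in order and return the updated list of applied ids.
// `exec` stands in for a database call.
function migrate(
  all: Migration[],
  applied: string[],
  exec: (sql: string) => void
): string[] {
  const done = applied.slice();
  for (const m of pendingMigrations(all, applied)) {
    exec(m.sql);
    done.push(m.id);
  }
  return done;
}
```

Running this before the service starts accepting work makes the step idempotent: a container that finds nothing pending simply starts.
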