The full stack in one table
| Layer | Technology | EU residency |
|---|---|---|
| API runtime | Cloudflare Workers (Hono framework, TypeScript strict) | weur region |
| Database | Cloudflare D1 (SQLite, append-only billing ledger) | location_hint=weur |
| Object storage | Cloudflare R2 (scan reports, PDF exports) | jurisdiction=eu |
| Cache / rate-limit | Cloudflare KV (OTP tokens, intel tool cache, rate-limit counters) | EU region |
| Async jobs | Cloudflare Queues (scan dispatch, event processing) | EU Workers consumer |
| Vector search | Cloudflare Vectorize (finding semantic search) | EU Workers |
| Scan compute | Fly Machines (Node.js 24, Frankfurt, fra region) | Frankfurt, Germany |
| AI inference | Anthropic Claude via Cloudflare AI Gateway | CF AI Gateway EU edge; Anthropic DPA |
| Transactional email | Resend (EU region endpoint) | EU |
| Billing | Polar.sh (Merchant of Record) | Polar.sh DPA; payment data not in AssurePort infra |
| Edge / WAF / DDoS | Cloudflare WAF + Turnstile (bot protection) | Global edge, EU data processing |
| DNS | Cloudflare DNS (assureport.com) | Global |
The Cloudflare Workers + Fly split
AssurePort has two compute layers. The Cloudflare Workers layer handles all API requests: authentication, scan management, billing, threat intel tools, and webhook processing. Workers are stateless, serverless, and constrained to the weur region for data handling. They are fast (V8 isolates cold-start in a few milliseconds) and cheap at the request volumes we currently operate.
The Fly Machines layer handles scan execution. This is compute-intensive work: running 13-agent pipelines, invoking AI models, generating Chromium-based PDF exports, executing Nuclei templates, and decompiling APKs. Fly Frankfurt machines run with auto_stop_machines set to off and min_machines_running=1, so there is always a warm machine ready to accept a scan dispatch from the Workers queue.
The split exists because Workers have a 30-second CPU time limit per request and a 128 MB memory limit. A full Web Pentest pipeline takes 15-45 minutes and uses several gigabytes of working memory at peak. Fly Machines have no such constraints. The dispatch model uses a Cloudflare Queue: the Worker enqueues the scan job, the Fly machine dequeues and executes it, then POSTs the results back to the Worker API with HMAC authentication.
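The HMAC-authenticated callback can be sketched as follows. This is a minimal illustration, not the production code: the envelope fields, the `timestamp.body` signing string, and the five-minute replay window are all assumptions, and it uses `node:crypto` for brevity (on Workers this would need Node.js compatibility enabled or the Web Crypto equivalent).

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical envelope for a result callback; field names are illustrative.
interface SignedResult {
  body: string;      // JSON-serialized scan result
  timestamp: number; // Unix seconds at signing time
  signature: string; // hex HMAC-SHA256 over `${timestamp}.${body}`
}

function hmacHex(secret: string, timestamp: number, body: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${body}`).digest("hex");
}

// Runner side: sign the result before POSTing it back to the API.
function signResult(secret: string, result: unknown, now: number): SignedResult {
  const body = JSON.stringify(result);
  return { body, timestamp: now, signature: hmacHex(secret, now, body) };
}

// API side: reject stale messages, then recompute and compare in constant time.
function verifyResult(
  secret: string,
  msg: SignedResult,
  now: number,
  maxSkewSeconds = 300, // assumed replay window
): boolean {
  if (Math.abs(now - msg.timestamp) > maxSkewSeconds) return false;
  const expected = Buffer.from(hmacHex(secret, msg.timestamp, msg.body), "hex");
  const received = Buffer.from(msg.signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Binding the timestamp into the signed string means a captured callback cannot be replayed after the window closes, which matters for a job that may run for tens of minutes.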
Multi-tenant isolation: the D1 design
Every query in the AssurePort D1 layer includes a tenant_id filter. This is enforced at the repository layer in TypeScript — not at the application routing layer, not at the middleware layer, but at the point where SQL is constructed. A cross-tenant data leak would require a bug in the repository layer, which is the smallest and most-reviewed surface in the codebase.
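One way to enforce the filter at the point where SQL is constructed is to make the tenant-scoped builder the only way to obtain a query. The sketch below is an assumption about shape, not the actual repository code: the table names, the `Query` type, and the `tenantRepository` helper are hypothetical.

```typescript
// Hypothetical query shape: SQL text plus positional bind parameters.
interface Query {
  sql: string;
  params: (string | number)[];
}

// Illustrative table names; the real schema is not shown in this article.
type Table = "scans" | "findings";

function tenantRepository(tenantId: string) {
  if (tenantId.length === 0) throw new Error("tenantId is required");
  return {
    // Every SELECT is built here, so callers cannot omit the tenant_id predicate.
    select(table: Table, where: Record<string, string | number> = {}): Query {
      const columns = Object.keys(where);
      for (const column of columns) {
        // Reject unsafe identifiers and any attempt to override the tenant filter.
        if (!/^[a-z_]+$/.test(column) || column === "tenant_id") {
          throw new Error(`invalid filter column: ${column}`);
        }
      }
      const clauses = ["tenant_id = ?", ...columns.map((c) => `${c} = ?`)];
      const values = columns.map((c) => where[c] as string | number);
      return {
        sql: `SELECT * FROM ${table} WHERE ${clauses.join(" AND ")}`,
        params: [tenantId, ...values],
      };
    },
  };
}
```

With a D1 binding, the result would be executed along the lines of `db.prepare(q.sql).bind(...q.params).all()`. Because values are always bound and never interpolated, the only string concatenation involves the fixed table name and validated column names.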
The billing ledger is append-only: no UPDATE operations on credit or charge rows. Every credit allocation, scan reservation, and charge is a new INSERT row. The current balance for a tenant is computed by summing the ledger — a pattern borrowed from double-entry bookkeeping. This means the billing history is immutable and auditable: we can reconstruct any account’s state at any point in time from the ledger.
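A minimal sketch of the balance computation, assuming signed amounts and a timestamp on each row. The `LedgerEntry` fields are hypothetical, since the real schema is not shown here.

```typescript
// Hypothetical ledger row; rows are only ever inserted, never updated.
type EntryKind = "credit" | "reservation" | "charge";

interface LedgerEntry {
  tenantId: string;
  kind: EntryKind;
  amount: number;    // signed integer credits: positive adds, negative consumes
  createdAt: number; // Unix milliseconds
}

// The balance at any point in time is the sum of all rows up to that time.
function balanceAt(ledger: readonly LedgerEntry[], tenantId: string, asOf: number): number {
  return ledger
    .filter((e) => e.tenantId === tenantId && e.createdAt <= asOf)
    .reduce((sum, e) => sum + e.amount, 0);
}

// Roughly equivalent SQL, again with assumed table and column names:
//   SELECT COALESCE(SUM(amount), 0) FROM ledger
//   WHERE tenant_id = ? AND created_at <= ?
```

Passing an earlier `asOf` value reconstructs a historical balance without any extra snapshot table, which is the auditability property described above.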
Cloudflare D1 regional limitation: D1 does not yet support per-row replication guarantees to a specific region. The location_hint=weur flag is a hint, not a guarantee: it tells D1 to prefer Western Europe when placing the database. For our use case (security findings, billing records), Western Europe placement is sufficient. For use cases requiring contractual EU-only guarantees at the row level, D1 is not the right choice today. A Neon or Supabase EU-hosted Postgres would be.
The scan pipeline: 6 live engines
AssurePort runs six production scan engines, each implemented as a multi-agent pipeline in runner/src/:
- Web Pentest (13 agents, 5 phases): Recon, auth analysis, exploit attempt, validation, remediation synthesis. Covers OWASP Top 10 against live web applications.
- API Pentest (7 agents): OWASP API Top 10 2023 including BOLA, mass assignment, resource-level auth, rate limiting bypass, JWT algorithm confusion.
- GitHub SAST (7 agents): Secrets, dependency CVEs, IaC misconfiguration, auth anti-patterns, OWASP source patterns.
- Mobile APK (6 agents): Android APK analysis via apktool and jadx, OWASP MASVS L1/L2 coverage, hardcoded credentials, exported components.
- Cloud Pentest (5 agents): AWS, Azure, GCP and Kubernetes. CIS Benchmark alignment, IAM misconfiguration, exposed storage buckets, K8s API surface.
- Email Security: SPF, DKIM and DMARC validation, phishing-kit detection, passive mail infrastructure reconnaissance.
All six engines run on the Fly Frankfurt machine pool. The Dockerfile includes Node.js 24, Chromium (for PDF export), nuclei (for network-layer vulnerability templates), and apktool/jadx (for APK analysis). The image is approximately 2.1GB at build time.
AI model usage: three models, one pipeline
Each scan pipeline uses three Anthropic Claude models in different roles:
- claude-sonnet-4-6 (primary): The workhorse of the pentest pipeline. Handles exploit reasoning, BOLA detection, auth flaw analysis, remediation code generation. Every agent that requires multi-step reasoning uses Sonnet.
- claude-haiku-4-5 (report / validation): Generates the natural-language sections of the scan report, validates RoE (Rules of Engagement) signatures, and processes high-volume structured output. Haiku is 5-10x cheaper per token than Sonnet — using it for structured tasks reduces cost significantly.
- claude-opus-4-7 (fallback): Used when Sonnet hits rate limits on high-concurrency scans. Also used for the rare complex reasoning task that benefits from Opus-class depth (e.g., novel vulnerability chain analysis).
All model calls use cache_control: { type: "ephemeral" } for prompt caching, reducing token cost on repeated system prompt content by 70-90% across a pipeline run. All model calls route through Cloudflare AI Gateway (EU edge) for caching, observability, and rate-limit management.
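As a rough illustration, a request body with a cached system prompt and a simple rate-limit fallback rule might look like the sketch below. The model IDs and the `cache_control` value come from the text above; `buildRequest`, `fallbackFor`, the `max_tokens` value, and the choice of HTTP 429 as the trigger are assumptions for illustration.

```typescript
// Model IDs are taken from the text; everything else here is illustrative.
const PRIMARY_MODEL = "claude-sonnet-4-6";
const FALLBACK_MODEL = "claude-opus-4-7";

interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: { role: "user" | "assistant"; content: string }[];
}

// Mark the shared system prompt as cacheable so repeated agent calls can reuse it.
function buildRequest(model: string, systemPrompt: string, userContent: string): MessagesRequest {
  return {
    model,
    max_tokens: 4096, // hypothetical output budget
    system: [{ type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }],
    messages: [{ role: "user", content: userContent }],
  };
}

// Decide whether to retry on another model: only a rate-limited primary call falls back.
function fallbackFor(status: number, attemptedModel: string): string | null {
  return status === 429 && attemptedModel === PRIMARY_MODEL ? FALLBACK_MODEL : null;
}
```

Keeping the fallback decision as a pure function of the response status makes it easy to unit-test separately from the HTTP client that sends the request through the gateway.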
The trade-offs we accepted
EU-first architecture has costs. We accepted three material trade-offs:
- Latency for non-EU users. Users in the US, APAC, and MENA experience higher API latency than they would from a US-East deployment. For synchronous API calls (auth, dashboard loads), this is perceptible — 80-120ms vs 20-40ms for a US-East baseline. For scan dispatch (async), latency is irrelevant because the response is immediate and the result arrives via polling.
- Higher compute cost per scan. Fly Frankfurt machines cost slightly more than comparable US-based Fly machines. The premium is approximately 8% on our current compute bill. We consider this the cost of EU residency compliance.
- Cloudflare Workers ecosystem constraints. Workers do not support Node.js native modules, file system access, or arbitrary background tasks. Every feature we add must be compatible with the Workers runtime. This excludes some tooling that requires native binaries (e.g., Nuclei runs on Fly, not Workers).
What we got in return: a platform that EU enterprise security teams can procure without a lengthy data residency evaluation, a compliance posture that satisfies GDPR and DORA data handling requirements at the infrastructure level, and a billing model (Polar.sh as Merchant of Record) that handles VAT collection in 47 jurisdictions without a separate tax infrastructure.