Compliance · May 26, 2026 · 5 min read

EU Data Residency for AI Platforms: The Trade-Offs Your Architecture Has to Settle

TL;DR

The three EU data residency architectures (US redaction, EU models, self-hosted) compared on capability, cost, latency, compliance, and vendor risk. Includes the per-workflow decision framework and the platform features that make the mixed architecture work.

Data residency was a niche topic five years ago. Schrems II, the EU AI Act, sector-specific rules (DORA, NIS2), and customer contractual requirements have made it a defining architecture decision for any AI platform serving EU customers. The choice is not binary; it is a set of trade-offs between model capability, cost, latency, and compliance posture.

This is the framework that gets the decision right per workflow, not per platform.

The three architectures, plainly stated

Architecture A: EU-hosted gateway, US-hosted models with redaction.

Your platform runs in the EU. When an agent needs an LLM call, the gateway replaces PII and other sensitive data with placeholder tokens before sending the prompt to a US-hosted model (OpenAI, Anthropic, or Google via their US endpoints). When the response comes back, the platform swaps the placeholders for the original values and passes the response to the user. The model never sees the personal data in identified form.
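The redact-then-re-inject round trip can be sketched in a few lines. This is a minimal illustration, not a production design: the two regular expressions, the `ORD-` order ID format, and the `[LABEL_n]` placeholder scheme are all assumptions made up for the example, and a real gateway would rely on a dedicated PII detector.

```python
import re

# Hypothetical patterns for illustration only; two regular expressions are
# not a substitute for a real PII detection service.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ORDER": re.compile(r"\bORD-\d{6}\b"),
}


def redact(text):
    """Replace sensitive values with placeholders; return text and mapping."""
    mapping = {}  # placeholder -> original value, kept on the EU gateway
    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the placeholder if the same value was already seen.
            for token, seen in mapping.items():
                if seen == value:
                    return token
            token = f"[{label}_{len(mapping) + 1}]"
            mapping[token] = value
            return token
        text = pattern.sub(substitute, text)
    return text, mapping


def reinject(text, mapping):
    """Restore the original values in the model's response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


prompt = "Customer anna@example.eu asks about ORD-123456."
safe_prompt, mapping = redact(prompt)  # only safe_prompt leaves the EU
```

The mapping never leaves the gateway, which is what lets the response be re-identified without the model provider holding the original values.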

Architecture B: EU-hosted gateway, EU-jurisdiction models.

Your platform runs in the EU. The gateway routes to models hosted in the EU and operated by EU-jurisdiction entities (Mistral, Aleph Alpha, certain Azure OpenAI deployments in EU regions, open-weight models you run yourself). No cross-border transfer of personal data occurs.

Architecture C: Self-hosted.

Your platform and the models run in your own infrastructure, on-premises or in your own cloud tenancy. No third party processes the data. Full control, full responsibility.

These are not mutually exclusive. Mature platforms route per workflow: routine tasks to Architecture A for cost, sensitive tasks to B or C.

The trade-offs

Capability

US frontier models (GPT-4o class, Claude Opus class, Gemini Ultra) still lead on many benchmarks. EU-jurisdiction models have closed the gap for many use cases: for routine reasoning, coding, and structured extraction, the difference is often invisible. For nuanced multi-language generation, multi-step agentic reasoning, and long-context work, US frontier models still have an edge that matters for some workflows.

Self-hosted open-weight models (Llama, Qwen, Mistral open releases, smaller fine-tuned models) are genuinely useful for narrow tasks. For broad enterprise agent work, you typically need a frontier model somewhere in the routing path.

Cost

US-hosted frontier models have aggressive pricing because of scale. EU-jurisdiction models are competitive on average and sometimes cheaper, but the long tail of model choices is smaller. Self-hosted models have a different cost structure entirely: capex for the infrastructure, opex for engineering, near-zero per-token cost. Self-hosted wins on cost at very high volumes and loses badly at low volumes.

For a typical mid-market team with mixed workloads, the cost ordering is usually A < B < C, with C only winning above a few million prompts per day.
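The break-even point behind that ordering is simple arithmetic. The sketch below assumes self-hosting is a fixed monthly cost with a per-token cost of zero, and every figure in it (the fixed cost, tokens per prompt, and hosted price) is a made-up placeholder rather than a quoted price.

```python
def monthly_cost_hosted(prompts_per_day, tokens_per_prompt, eur_per_million_tokens):
    """Monthly spend on a hosted model, assuming a 30-day month."""
    tokens = prompts_per_day * 30 * tokens_per_prompt
    return tokens / 1_000_000 * eur_per_million_tokens


def breakeven_prompts_per_day(fixed_monthly_eur, tokens_per_prompt, eur_per_million_tokens):
    """Daily volume at which hosted spend equals the fixed self-hosting cost."""
    return fixed_monthly_eur * 1_000_000 / (
        tokens_per_prompt * eur_per_million_tokens * 30
    )


# Hypothetical inputs: EUR 120,000 per month for hardware and engineering,
# 2,000 tokens per prompt, EUR 0.50 per million tokens on the hosted model.
breakeven = breakeven_prompts_per_day(120_000, 2_000, 0.5)
```

With these placeholder inputs the break-even lands at 4 million prompts per day. Substituting your own infrastructure cost and negotiated token price moves that number substantially, which is why the threshold is worth recomputing rather than assuming.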

Latency

For typical loads, inference latency is dominated by model size and architecture, not by geography. Cross-region network latency adds 30-150 ms, which is usually invisible in agent workflows that take seconds end to end. For real-time voice or sub-second interactive workflows, geography starts to matter.
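A quick calculation shows why the same network overhead matters in one case and not the other. The 150 ms figure is the upper bound quoted above; the 4-second and 400-millisecond processing times are hypothetical examples.

```python
def network_share(network_ms, processing_ms):
    """Fraction of end-to-end latency spent on the cross-region hop."""
    return network_ms / (network_ms + processing_ms)


# Hypothetical workloads: a multi-step agent run and an interactive reply.
agent_share = network_share(150, 4_000)
interactive_share = network_share(150, 400)
```

Under these assumptions the cross-region hop is under 4% of the agent run but more than a quarter of the interactive reply.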

Compliance posture

  • Architecture A: can pass a serious Transfer Impact Assessment (TIA) if the redaction is rigorous and the model provider has signed Standard Contractual Clauses (SCCs) or operates under the EU-US Data Privacy Framework. The documentation burden is meaningful but tractable. The DPO conversation centres on the redaction guarantee.
  • Architecture B: the simplest compliance story. No cross-border transfer of personal data occurs, so the DPO conversation is short and the documentation is light. Some EU customers contractually require this for their data.
  • Architecture C: maximum control, maximum operational burden. You are now responsible for the model security, the operational reliability, and the maintenance. Required by some heavily regulated customers (defence, certain financial services, certain healthcare contexts).

Vendor risk

  • Architecture A: dependent on the US model provider's stability, pricing, and policy
  • Architecture B: dependent on EU providers who are smaller and earlier in their commercial maturity
  • Architecture C: dependent on your own engineering team's ability to operate model infrastructure

There is no zero-risk option. Each shifts the risk to a different place.

The decision framework per workflow

For each AI workflow you deploy, walk these questions:

  1. What personal data is in the prompt or context? If none, Architecture A is fine and you skip the rest.

  2. Is the personal data redactable without breaking the workflow? A customer support agent processing a ticket can redact the customer name and order ID for the LLM call and re-inject them into the response. A workflow that summarises personal communications cannot meaningfully redact and still produce a useful summary.

  3. Does any contractual obligation require EU-only processing? Some customer contracts (especially in financial services and government) require EU-only data residency for personal data. If yes, B or C only.

  4. Does any regulatory obligation require EU-only or self-hosted? Some national rules require certain processing within the country or on certified infrastructure. If yes, B or C with the right certifications.

  5. What is the capability requirement? If the workflow needs a frontier model and no EU-jurisdiction model is good enough, A with redaction is the path. If a smaller model handles it, B or C becomes viable.

  6. What is the volume? Above several million prompts per day, C becomes economically interesting. Below that, A or B is usually right.

The output is a per-workflow routing rule that the platform enforces. Routine internal productivity goes to A. Customer-facing personal data goes to B. Regulated workloads go to C.
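The six questions can be encoded as a small routing function. This is one possible reading of the framework rather than a definitive rule set: the `Workflow` fields, the 3-million-prompt threshold, the default to B when both A and B would be acceptable, and the choice to raise an error when a workflow needs a frontier model but A is ruled out are all assumptions made for the sketch.

```python
from dataclasses import dataclass

# Hypothetical threshold standing in for "several million prompts per day".
SELF_HOST_VOLUME = 3_000_000


@dataclass
class Workflow:
    has_personal_data: bool = False
    redactable: bool = True
    eu_only_contract: bool = False
    eu_only_regulation: bool = False
    needs_frontier_model: bool = False
    prompts_per_day: int = 0


def choose_architecture(w):
    """Return "A", "B", or "C" for a workflow, following questions 1 to 6."""
    if not w.has_personal_data:
        return "A"  # question 1: nothing to protect
    # Questions 2 to 4: A needs redactable data and no EU-only obligation.
    a_allowed = w.redactable and not (w.eu_only_contract or w.eu_only_regulation)
    if w.needs_frontier_model:  # question 5
        if a_allowed:
            return "A"
        # Assumption: an unresolvable conflict goes to a human, not a default.
        raise ValueError("frontier model required but Architecture A is ruled out")
    if w.prompts_per_day >= SELF_HOST_VOLUME:
        return "C"  # question 6: volume justifies self-hosting
    return "B"  # assumption: prefer B for personal data when a smaller model suffices


support = Workflow(has_personal_data=True, needs_frontier_model=True)
decision = choose_architecture(support)
```

Keeping the rule as a plain function makes the choice easy to document and to re-run when the model landscape or a contract changes.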

What the platform has to support

A platform that handles all three architectures cleanly needs:

  • Model routing per workflow, per agent, per workspace, per data category
  • PII detection and redaction at the gateway (not as a post-hoc cleanup)
  • Audit logging of which model handled which prompt, with the residency information attached
  • The ability to add or remove models without rewriting agents
  • A clear sub-processor list per architecture so the DPO can review

The platform should not force a single architecture choice. The right answer for a 200-agent estate is almost always a mix.
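Two of the requirements above, routing by data category and audit logging with residency attached, might look like the following. The model names, data categories, and `AuditRecord` fields are hypothetical and do not describe any particular platform's schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical model registry: where each model runs and which architecture it is.
MODELS = {
    "us-frontier": {"region": "us", "architecture": "A"},
    "eu-hosted": {"region": "eu", "architecture": "B"},
    "self-hosted": {"region": "eu", "architecture": "C"},
}

# Routing rules live in data, so models can be swapped without rewriting agents.
ROUTES = {
    "internal": "us-frontier",
    "customer_personal": "eu-hosted",
    "regulated": "self-hosted",
}


@dataclass
class AuditRecord:
    timestamp: str
    workflow: str
    data_category: str
    model: str
    region: str
    architecture: str
    redacted: bool


def route(workflow, data_category):
    """Pick a model for the data category and build the matching audit record."""
    model = ROUTES[data_category]
    info = MODELS[model]
    record = AuditRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        workflow=workflow,
        data_category=data_category,
        model=model,
        region=info["region"],
        architecture=info["architecture"],
        redacted=info["architecture"] == "A",  # only A sends prompts out of the EU
    )
    return model, record


model, record = route("ticket-triage", "customer_personal")
audit_line = json.dumps(asdict(record))  # one JSON line per LLM call
```

Because the record is produced in the same step as the routing decision, every call is logged with the region that actually handled it.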

Where AgentWorks lands

AgentWorks supports all three. Customers route most internal productivity workflows through US frontier models with EU-side redaction (Architecture A). Customer-data workflows go to EU-jurisdiction models (Architecture B). Regulated and high-volume specialised workflows can run on self-hosted open-weight models (Architecture C) through the same agent platform. The audit log is unified across all three so the DPO and the compliance team see one source of truth.

For the deep technical story see the AI workforce platform and models documentation. The point of this article is the decision framework: choose per workflow, document the choice, and revisit when the model landscape moves.

About the author

Erwin Berkouwer · Founder, AgentWorks

Erwin Berkouwer is the founder of AgentWorks — an AI agent platform purpose-built for European teams that need EU AI Act-ready governance, multi-LLM choice across OpenAI, Anthropic, Google and Mistral, and transparent per-token € pricing.
