PII Redaction at the LLM Gateway: Stop Data Leaks

A support agent pastes a customer email into a chat tool to summarise it. The email contains a name, an address, a credit card complaint, and a medical reference. Three seconds later, the full message is sitting in a US provider's logs for the next 30 days, or worse, queued for their training pipeline. The data protection officer finds out three months later when an auditor pulls vendor sub-processor agreements.

This is the dominant pattern of GDPR risk in AI deployments. Not malicious exfiltration, just normal employees doing normal work with tools that send personal data to processors who are not on the approved list. Blocking the tool is not a real answer. Redacting at the gateway is.

The problem: every LLM call is a cross-border transfer

EU data protection treats sending personal data to a US-hosted LLM as a transfer under Chapter V of GDPR. That triggers Standard Contractual Clauses, sub-processor disclosure, transfer impact assessments, and explicit purpose limitation. Most teams have none of this in place for ad-hoc model usage. Even those that do find the controls collapse the moment an employee uses a non-approved tool, or the approved tool routes through a model whose hosting region changes mid-quarter.

The risk is not theoretical. Italian, French, and German regulators have all opened proceedings against AI tool deployments in the past 18 months. The standard pattern: a fine in the range of 50,000 to 300,000 euro for a mid-sized company, plus a public order to suspend the service until controls are fixed. Bigger names face bigger numbers. The Italian DPA fined OpenAI 15 million euro in December 2024 over how ChatGPT processed personal data, including training on it without an adequate legal basis.

Look at what actually leaves the network during a typical AI workflow:

Names and email addresses in support tickets fed to a summariser
Salary figures and performance notes in HR queries routed to coaching agents
Patient identifiers in medical record drafts going through a transcription model
Customer financial data inside RAG retrieval context
API keys and internal hostnames in code review prompts
Free-text employee opinions in survey analysis runs

Each of those is either personal data or confidential in its own right. Each goes to the LLM provider as plain text inside a JSON payload. The provider may or may not retain it, may or may not train on it, may or may not store it in a region you control. From the regulator's perspective, the controller (your company) is on the hook regardless of the provider's promises.

The most painful incident pattern is the slow one. A model provider changes their data handling policy in a release note. The change covers a region you did not realise you were using. Six months later, an audit reveals 4 million prompts sent to that region, none of which were covered by your transfer impact assessment. Now you are not arguing about future controls. You are documenting historic exposure.

The solution: redact at the gateway, route deterministically, log the redaction

The pattern that works at scale puts a policy enforcement point between your applications and the model providers. Every prompt passes through it. Every response passes through it. Three things happen in that hop:

Detection: pattern matching plus a small specialised model identifies personal data, sensitive categories, secrets, and customer-confidential terms. Use deterministic regex for known formats (IBAN, BSN, credit cards, email, phone) and a fast NER model for free-text names and addresses.
Redaction or substitution: replace each detected entity with a stable placeholder. Names become [PERSON_1], addresses become [ADDRESS_1], customer IDs become [CUSTOMER_TKN_abc123]. The mapping is stored encrypted in your gateway database, indexed by the run ID.
Reversal on response: when the model replies, the gateway substitutes the placeholders back to the original values before the response reaches the requesting application. The model never sees the real data; the user never sees a placeholder.
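The three steps above can be sketched in a few lines. This is a minimal illustration, not a production detector: the two regular expressions are simplified assumptions, the NER pass for free-text names and addresses is left out, and the function names redact and restore are hypothetical.

```python
import re

# Illustrative patterns only; real deployments need stricter, validated formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
}


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace detected entities with stable placeholders; return text and mapping."""
    mapping: dict[str, str] = {}           # placeholder -> original value
    seen: dict[tuple[str, str], str] = {}  # (label, value) -> placeholder
    counters: dict[str, int] = {}

    for label, pattern in PATTERNS.items():
        def substitute(match: re.Match, label: str = label) -> str:
            value = match.group(0)
            key = (label, value)
            if key not in seen:
                # First time this value appears: mint the next placeholder.
                counters[label] = counters.get(label, 0) + 1
                placeholder = f"[{label}_{counters[label]}]"
                seen[key] = placeholder
                mapping[placeholder] = value
            return seen[key]

        text = pattern.sub(substitute, text)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Substitute placeholders in a model response back to the original values."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text
```

The same value always maps to the same placeholder within one call, so the model can still tell that two mentions refer to the same entity.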

Five details that most homegrown solutions miss:

First, the reversal must be scoped to the original requester. If user A submits a prompt containing customer X's data, only the response to user A may have customer X's data unredacted. A multi-tenant gateway needs a per-run keyring, not a global mapping table. Otherwise a future prompt by user B might surface customer X's redacted name in an unrelated context.
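One way to picture a per-run keyring is a lookup that needs both the run ID and the requester to match. The sketch below uses an in-memory dictionary as a stand-in for the encrypted gateway database; the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RunKeyring:
    """Placeholder mapping that belongs to exactly one run and one requester."""
    run_id: str
    requester_id: str
    mapping: dict[str, str] = field(default_factory=dict)


class KeyringStore:
    """In-memory stand-in for the encrypted gateway database."""

    def __init__(self) -> None:
        self._by_run: dict[str, RunKeyring] = {}

    def save(self, keyring: RunKeyring) -> None:
        self._by_run[keyring.run_id] = keyring

    def mapping_for(self, run_id: str, requester_id: str) -> dict[str, str]:
        keyring = self._by_run.get(run_id)
        # Unknown run or a different requester: nothing is reversed.
        if keyring is None or keyring.requester_id != requester_id:
            raise PermissionError("no keyring for this run and requester")
        return keyring.mapping
```

Because there is no global table, a placeholder from user A's run cannot be resolved while serving user B.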

Second, the gateway must also redact retrieval context. Most teams remember to clean the user prompt but forget that the RAG pipeline injects retrieved document chunks straight into the system prompt. If your knowledge base contains call transcripts, those chunks arrive at the model with full PII intact. Redact at the chunking step or at the gateway; pick one, so that exactly one layer owns it.

Third, tool calls leak as much as prompts. When an agent calls a CRM tool to look up customer data and feeds the result back to the model, the JSON response often carries the entire customer record. Apply redaction to tool outputs before they re-enter the model context window. Treat every input boundary, not just the first one, as a potential leak.
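Tool results are usually nested JSON, so the redaction has to walk the whole structure rather than a single string. A minimal sketch, assuming the tool output has already been parsed into Python dicts and lists; mask_emails is a deliberately simple, non-reversible stand-in for the real redaction function.

```python
import re
from typing import Any, Callable

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mask_emails(text: str) -> str:
    """Stand-in redactor: replaces every email address with a fixed token."""
    return EMAIL.sub("[EMAIL]", text)


def redact_payload(value: Any, redact_text: Callable[[str], str]) -> Any:
    """Apply redact_text to every string inside a parsed JSON value."""
    if isinstance(value, str):
        return redact_text(value)
    if isinstance(value, dict):
        return {key: redact_payload(item, redact_text) for key, item in value.items()}
    if isinstance(value, list):
        return [redact_payload(item, redact_text) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value
```

The same function can be pointed at retrieved RAG chunks, since they are just another list of strings entering the context window.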

Fourth, attachments are usually invisible to the redaction layer. PDFs, screenshots, voice transcripts, and spreadsheets get uploaded straight to multimodal endpoints with the personal data perfectly intact. A serious gateway runs OCR on images, transcribes audio, parses PDFs, and applies the same redaction rules to the extracted text before the file ever leaves the perimeter. Without this, you have a glass wall: airtight against prompts, transparent to file uploads.

Fifth, watch the streaming response. When tokens arrive incrementally, the placeholder reversal must apply as the stream is assembled, not only at the end. If the application renders partial responses to users (typing animation, live drafting), reverse placeholders inline and hold back only the few characters that might be the start of a placeholder split across two tokens. Buffering the entire response before reversal looks safe but adds 3 to 5 seconds of perceived latency, which the product team will route around within a week.
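Streaming reversal needs a small buffer, because a placeholder such as [PERSON_1] can arrive split across two chunks. The sketch below is one possible approach, assuming placeholders are bracketed tokens and that original values contain no square brackets; restore_stream is a hypothetical name.

```python
from typing import Iterable, Iterator


def restore_stream(chunks: Iterable[str], mapping: dict[str, str]) -> Iterator[str]:
    """Yield restored text as chunks arrive, holding back only unfinished placeholders."""
    max_len = max((len(placeholder) for placeholder in mapping), default=0)
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        for placeholder, value in mapping.items():
            buffer = buffer.replace(placeholder, value)
        # An unclosed "[" near the end may be the start of a split placeholder.
        cut = buffer.rfind("[")
        if cut != -1 and "]" not in buffer[cut:] and len(buffer) - cut < max_len:
            ready, buffer = buffer[:cut], buffer[cut:]
        else:
            ready, buffer = buffer, ""
        if ready:
            yield ready
    # Flush whatever is left once the stream ends.
    if buffer:
        yield buffer
```

Only the tail after the last unclosed bracket is delayed, so the added latency is bounded by the length of the longest placeholder rather than the length of the response.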

Expert tip: never store the placeholder mapping in the same row as the prompt. Use a separate table with a short TTL (default 30 days for non-regulated, six months for high-risk). After the TTL expires, the mapping is cryptographically erased. The prompt log keeps the redacted version for compliance, the original data is unrecoverable, and your right-to-erasure flow becomes a single delete.
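The separation described in the tip can be sketched with two tables. This example uses an in-memory SQLite database purely for illustration; table and function names are hypothetical, and encryption is omitted. In a real deployment the mapping would be stored encrypted, and cryptographic erasure would typically mean destroying the per-run key rather than deleting plaintext rows.

```python
import sqlite3

THIRTY_DAYS = 30 * 24 * 3600  # default TTL in seconds


def open_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    # The redacted prompt is kept for compliance.
    db.execute("CREATE TABLE prompt_log (run_id TEXT PRIMARY KEY, redacted_prompt TEXT)")
    # The mapping lives in its own table with its own expiry.
    db.execute(
        "CREATE TABLE placeholder_map ("
        "run_id TEXT, placeholder TEXT, original TEXT, expires_at REAL)"
    )
    return db


def record_run(db: sqlite3.Connection, run_id: str, redacted_prompt: str,
               mapping: dict[str, str], now: float, ttl: int = THIRTY_DAYS) -> None:
    db.execute("INSERT INTO prompt_log VALUES (?, ?)", (run_id, redacted_prompt))
    db.executemany(
        "INSERT INTO placeholder_map VALUES (?, ?, ?, ?)",
        [(run_id, placeholder, value, now + ttl) for placeholder, value in mapping.items()],
    )


def purge_expired(db: sqlite3.Connection, now: float) -> int:
    """Delete mappings past their TTL; returns the number of rows removed."""
    cursor = db.execute("DELETE FROM placeholder_map WHERE expires_at <= ?", (now,))
    return cursor.rowcount


def erase_run(db: sqlite3.Connection, run_id: str) -> None:
    """Right-to-erasure for one run: a single delete on the mapping table."""
    db.execute("DELETE FROM placeholder_map WHERE run_id = ?", (run_id,))
```

After a purge, the prompt log still holds the redacted text, but nothing in the database can turn [PERSON_1] back into a name.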

For regulated workflows where redaction is not enough, the gateway also routes deterministically. A prompt that contains health data is forced through a model hosted in the EU under a signed Data Processing Agreement, regardless of cost optimisation logic. A prompt with no sensitive data can route freely to the cheapest capable provider. The routing decision is logged alongside the redaction map, giving you a per-request audit trail that holds up to scrutiny.
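Deterministic routing can be as simple as a pure function from detected categories to a route, evaluated before any cost logic runs. In this sketch the category labels and provider names are placeholders, not real endpoints, and the returned reason is what would be written to the audit log.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    region: str
    reason: str  # logged with the request for the audit trail


# Hypothetical labels for categories that must stay on EU-hosted models.
SENSITIVE = {"HEALTH", "BIOMETRIC", "FINANCIAL"}


def choose_route(detected: set[str]) -> Route:
    """Force EU hosting when any sensitive category is detected; otherwise route freely."""
    hits = sorted(detected & SENSITIVE)
    if hits:
        return Route("eu-hosted-model", "eu", "sensitive categories: " + ", ".join(hits))
    return Route("cheapest-capable-model", "any", "no sensitive categories detected")
```

Keeping the function free of side effects means the same detected categories always produce the same route, which is what makes the log entry defensible later.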

Practical applications and ROI

Gateway redaction is not a defensive line item. Teams that ship it correctly see hard, measurable returns:

Workflow	Without gateway redaction	With gateway redaction	Outcome
Customer support summarisation	Blocked by DPO until DPIA done (8 weeks)	Approved within 5 days	Time-to-production cut by roughly 90%
HR survey analysis	Manual anonymisation by analyst	Automatic redaction at gateway	6 hours per survey saved
Medical records intake	Cannot use US-hosted LLMs at all	Pseudonymised data flows freely	New use case unlocked
Sales call analysis	Sampling only, 5% of calls	Full coverage, 100% of calls	20x more training data
Internal helpdesk	Shadow IT using ChatGPT.com	Centralised, governed gateway	Audit findings closed

The medical row is the unlock. Most healthcare and insurance teams cannot use any US-hosted LLM directly, full stop. Gateway redaction with EU routing turns a hard no into a normal procurement decision. The same logic applies to public sector, financial services, and legal verticals.

Another often-missed return: redaction logs themselves are valuable. When the gateway records what it redacted (categories and counts, not the data), the security team gets a real-time map of where personal data flows inside the company. That heat map informs DPIAs, retention policies, and access-control decisions far better than annual self-reported surveys. Several enterprise customers describe this telemetry as the first time they actually knew what personal data their AI workloads touched.
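Recording categories and counts without the data is straightforward if the placeholders carry the category in their name. A minimal sketch, assuming placeholders follow the [CATEGORY_N] shape used earlier; summarise_redactions is a hypothetical helper.

```python
from collections import Counter


def summarise_redactions(mapping: dict[str, str]) -> dict[str, int]:
    """Count redactions per category from placeholder names; values are never read."""
    counts = Counter(
        placeholder.strip("[]").rsplit("_", 1)[0] for placeholder in mapping
    )
    return dict(counts)
```

Only the dictionary keys are inspected, so the telemetry record can be shipped to a dashboard without carrying any personal data with it.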

The cost side is small. A production-grade redaction gateway adds 80 to 150 milliseconds of latency per request. For chat workloads where the model itself takes 2 to 8 seconds to respond, that overhead is invisible to the user. CPU cost is roughly 0.0003 euro per request at scale. Compared to a single DPO escalation, the lifetime cost is rounding noise.

How to get started

Four steps move you from no gateway to full enforcement.

Step 1: inventory current outbound LLM traffic. Set up a one-week egress log on outbound calls to OpenAI, Anthropic, Google, Mistral, Cohere, and the dozen other endpoints employees actually use. The findings are always larger than people expect. Pair this with a survey: which tools do teams use that nobody has approved?
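A first pass over the egress log can be a short script. The sketch below assumes a hypothetical log format of "timestamp user host" per line and a starter list of API hostnames; extend the list with whatever your proxy actually sees.

```python
from collections import Counter
from typing import Iterable

# Starter list of API hosts; not exhaustive.
LLM_HOSTS = {
    "api.openai.com": "OpenAI",
    "api.anthropic.com": "Anthropic",
    "generativelanguage.googleapis.com": "Google",
    "api.mistral.ai": "Mistral",
}


def inventory(log_lines: Iterable[str]) -> Counter:
    """Count outbound requests per provider from whitespace-separated log lines."""
    counts: Counter = Counter()
    for line in log_lines:
        parts = line.split()
        # The third field is assumed to be the destination host.
        if len(parts) >= 3 and parts[2] in LLM_HOSTS:
            counts[LLM_HOSTS[parts[2]]] += 1
    return counts
```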

Step 2: pick a platform that includes a redaction gateway natively. Building it from scratch is 3 to 6 months for a strong team, plus ongoing model updates as new PII patterns emerge. The AgentWorks AI workforce platform ships the gateway as a default component with EU-hosted detection models and configurable per-tenant policies.

Step 3: define your redaction policy per data category. Names: redact. Addresses: redact. Internal customer IDs: pseudonymise reversibly. Free text: scan and flag. Document the policy once, store it as a versioned artefact in the gateway, and it applies across every workflow your team ships.
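A policy like the one above fits in a small, versioned data structure. This is a sketch of what such an artefact might look like, not any platform's actual schema; the category labels, the version string, and the fail-closed default for unknown categories are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REDACT = "redact"
    PSEUDONYMISE = "pseudonymise_reversibly"
    FLAG = "scan_and_flag"


@dataclass(frozen=True)
class RedactionPolicy:
    version: str
    rules: dict[str, Action]
    default: Action = Action.REDACT  # assumption: unknown categories fail closed

    def action_for(self, category: str) -> Action:
        return self.rules.get(category, self.default)


POLICY = RedactionPolicy(
    version="1.0.0",  # hypothetical version label
    rules={
        "NAME": Action.REDACT,
        "ADDRESS": Action.REDACT,
        "CUSTOMER_ID": Action.PSEUDONYMISE,
        "FREE_TEXT": Action.FLAG,
    },
)
```

Logging the policy version next to each request makes it possible to show later which rules were in force when a given prompt was processed.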

Step 4: roll out by use case, not by tool. Start with one high-value workflow (support summarisation, sales call analysis) and prove the redaction does not break the model's usefulness. Measure quality before and after. Once the team trusts the gateway, expand to the next workflow. Trying to switch every AI tool overnight produces resistance and shortcuts.

Teams that follow this sequence reach gateway-enforced compliance in 4 to 8 weeks. The DPO stops being a blocker. The legal team stops re-litigating each new use case. The CFO stops watching incident reserves grow.

Closing

PII redaction at the gateway is the single highest-leverage compliance investment you can make in your AI stack. It does not slow down the team, it does not require the legal department to approve every new use case, and it turns regulator visits from existential threats into routine evidence pulls. The platforms that get this right become the obvious choice for any company that has a data protection officer on the org chart. The ones that do not will spend the next two years fighting the same fire over and over.

See how it works in practice. Book a 15-minute platform walkthrough at agent-works.ai/contact.
