AI Vendor Due Diligence for EU Buyers: 12 Questions That Save You a Year of Pain
TL;DR
12 AI-specific procurement questions every EU buyer should ask before signing: model routing, data residency, PII handling, AI Act classification, audit logs, deprecation, injection defence, training data, no-training commitment, incident response, change management, and exit patterns.
Procurement teams have mature templates for SaaS vendor due diligence: SOC 2, ISO 27001, GDPR Article 28 processor terms, data residency. Those questions still matter, but they are not enough for AI vendors. The AI-specific risks (model provider chains, training data exposure, output liability, model deprecation, EU AI Act conformity) need their own due diligence pass.
These are the 12 questions to add to every AI vendor evaluation, with what good answers look like and what red-flag answers sound like.
1. Who are your model providers and what is your routing logic?
A serious AI vendor uses multiple model providers and routes per task. They name the providers (OpenAI, Anthropic, Google, Mistral, plus their own self-hosted options), describe the routing logic per workflow, and let you configure it for your data sensitivity.
Red flag: "We use the best models for the job." That tells you nothing.
Good answer: a sub-processor list per workflow type, a routing policy you can review, and the ability to constrain routing to specific providers or geographies for your data.
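To make "a routing policy you can review" concrete, here is a minimal sketch of what a per-workflow constraint could look like. The `RoutingRule` shape, the workflow name, and the provider and region labels are all hypothetical illustrations, not any vendor's actual schema.

```python
from dataclasses import dataclass


# Hypothetical policy shape: field names and values are illustrative only.
@dataclass(frozen=True)
class RoutingRule:
    workflow: str
    allowed_providers: frozenset[str]
    allowed_regions: frozenset[str]


def is_route_allowed(rule: RoutingRule, provider: str, region: str) -> bool:
    """Return True only if both the provider and the region are permitted."""
    return provider in rule.allowed_providers and region in rule.allowed_regions


# Example: a sensitive workflow restricted to two providers and EU regions.
hr_rule = RoutingRule(
    workflow="hr-screening",
    allowed_providers=frozenset({"mistral", "self-hosted"}),
    allowed_regions=frozenset({"eu-west", "eu-central"}),
)

print(is_route_allowed(hr_rule, "mistral", "eu-west"))   # permitted route
print(is_route_allowed(hr_rule, "mistral", "us-east"))   # region not permitted
```

A vendor that can show you something equivalent, and show where it is enforced, is giving you a reviewable policy rather than a promise.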
2. What is your data residency story per data category?
"We are EU-hosted" is not an answer. Ask per data category: where does the data live at rest, where does it move in transit, and which sub-processors see it?
Red flag: a single bullet point saying EU-hosted without specifying which infrastructure, which sub-processors, and what happens when a model provider's region is unavailable.
Good answer: a data flow diagram per workflow, named regions, named sub-processors, failover behaviour documented, and the ability to constrain residency per workspace.
3. How is PII detected and handled before reaching third-party models?
For any vendor using third-party LLMs to process your personal data, the detection and redaction (or routing) of PII is the single biggest question for your transfer impact assessment (TIA).
Red flag: "We rely on the model provider's data protection commitments."
Good answer: gateway-side PII detection with measurable precision and recall, redaction or routing rules per data category, and audit log entries showing the redaction operation per prompt.
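"Measurable precision and recall" means the vendor can run their detector against a labelled sample and report two numbers. The sketch below shows the arithmetic, assuming PII spans are represented as simple string identifiers; the sample data is invented for illustration.

```python
def precision_recall(predicted: set[str], actual: set[str]) -> tuple[float, float]:
    """Compute precision and recall for a set of flagged PII spans.

    precision: share of flagged spans that really were PII
    recall:    share of real PII spans that were flagged
    Empty sets are scored as 1.0 here; that is a convention chosen for this sketch.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical labelled sample: what a reviewer marked as PII vs. what the detector flagged.
actual = {"name:1", "email:1", "iban:1", "phone:1"}
predicted = {"name:1", "email:1", "iban:1", "order_id:1"}

precision, recall = precision_recall(predicted, actual)
print(precision, recall)  # 0.75 0.75
```

For this risk, recall is the number to press on: a missed phone number reaches the third-party model, while a falsely flagged order ID only costs some output quality.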
4. What is your EU AI Act risk classification posture?
The vendor should know whether their system is high-risk under Annex I (products covered by EU harmonisation legislation) or Annex III (the listed high-risk use cases), whether you (the deployer) inherit obligations, and what conformity assessment work they have done.
Red flag: "We are working on it" or "The AI Act doesn't really apply to us."
Good answer: a classification analysis per use case the vendor enables, the conformity assessment status (planned, in progress, completed) where applicable, the technical documentation file structure, and the deployer guidance that helps you with your own obligations.
5. What audit-log evidence do you produce per inference?
For both AI Act Article 12 record-keeping and your own GDPR accountability, the audit log content matters.
Red flag: "We log API calls."
Good answer: per-inference records with timestamp, input (or PII-redacted hash of input), prompt template version, model and version, output, the human approval or override, retention policy, and export format. See our Article 12 logging guide for the full content checklist.
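As a rough illustration of the fields listed above, a per-inference record could be modelled as follows. This is a sketch under assumptions: the field names, the SHA-256 choice, and the JSON-lines export are illustrative, not a format prescribed by the AI Act or used by any particular vendor.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class InferenceRecord:
    timestamp: str                  # ISO 8601, UTC
    input_sha256: str               # hash of the PII-redacted input
    prompt_template_version: str
    model: str
    model_version: str
    output: str
    human_decision: Optional[str]   # e.g. "approved", "overridden", or None
    retention_days: int


def make_record(redacted_input: str, template_version: str, model: str,
                model_version: str, output: str,
                human_decision: Optional[str], retention_days: int) -> InferenceRecord:
    """Build one audit record; the input is stored only as a hash."""
    digest = hashlib.sha256(redacted_input.encode("utf-8")).hexdigest()
    return InferenceRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        input_sha256=digest,
        prompt_template_version=template_version,
        model=model,
        model_version=model_version,
        output=output,
        human_decision=human_decision,
        retention_days=retention_days,
    )


def to_json_line(record: InferenceRecord) -> str:
    """Serialise a record as one JSON line for export."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical values throughout.
record = make_record("Summarise ticket for [NAME]", "v3", "example-model",
                     "2025-01", "Summary text", "approved", 365)
print(to_json_line(record))
```

When you review a vendor's sample log export, check it against the field list in the good answer above: anything missing is a gap to raise.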
6. How do you handle model deprecation?
Model providers deprecate models. Your vendor's agents are tuned to specific models. When the model goes away, what happens to the agent?
Red flag: "We will figure it out when it happens."
Good answer: a documented deprecation handling process, a model abstraction layer that allows substitution without rewriting agents, prior experience with deprecations (the vendor names them and describes how they handled them), and a notice period to you for any deprecation that affects your workflows.
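A model abstraction layer, in its simplest form, means agents refer to a stable alias rather than a concrete model. The sketch below illustrates the idea with hypothetical names (`ChatModel`, `ModelRegistry`, `EchoModel`); a real vendor's layer would also handle re-tuning and regression testing after a swap.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Anything that can complete a prompt; concrete providers implement this."""
    def complete(self, prompt: str) -> str: ...


class ModelRegistry:
    """Maps stable aliases to whichever concrete model currently backs them."""

    def __init__(self) -> None:
        self._aliases: dict[str, ChatModel] = {}

    def bind(self, alias: str, model: ChatModel) -> None:
        self._aliases[alias] = model

    def resolve(self, alias: str) -> ChatModel:
        return self._aliases[alias]


class EchoModel:
    """Stand-in for a real provider client, used only for illustration."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def summarise(registry: ModelRegistry, text: str) -> str:
    # The agent only knows the alias, never the concrete model.
    return registry.resolve("summariser").complete(text)


registry = ModelRegistry()
registry.bind("summariser", EchoModel("model-a"))
before = summarise(registry, "hi")

# When model-a is deprecated, rebinding the alias swaps the model without touching the agent.
registry.bind("summariser", EchoModel("model-b"))
after = summarise(registry, "hi")
print(before, after)
```

Substitution at the code level is the easy part; ask the vendor how they verify that agent behaviour stays acceptable after the swap.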
7. What is your prompt and tool injection defence?
Agentic systems that can be tricked into running unintended actions are a real risk, especially agents with tool access (writing to systems, sending messages, executing code).
Red flag: "Our prompts are robust" with no specifics.
Good answer: documented defence layers (prompt isolation, output validation, tool access controls, content moderation, anomaly detection), a published responsible disclosure program, and recent test results from internal or external red-teaming.
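Of the defence layers listed, tool access control is the easiest to picture in code. The following is a minimal sketch of a per-agent allowlist check, assuming hypothetical tool names; it is one layer only and does not replace prompt isolation or output validation.

```python
from typing import Any, Callable, Mapping


class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


def call_tool(allowlist: frozenset[str], tools: Mapping[str, Callable[..., Any]],
              name: str, **kwargs: Any) -> Any:
    """Run a tool only if the agent is explicitly permitted to use it."""
    if name not in allowlist:
        raise ToolNotAllowed(name)
    return tools[name](**kwargs)


# Hypothetical tools for illustration.
def search_kb(query: str) -> str:
    return f"results for {query}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


TOOLS = {"search_kb": search_kb, "send_email": send_email}
SUPPORT_AGENT_ALLOWLIST = frozenset({"search_kb"})

print(call_tool(SUPPORT_AGENT_ALLOWLIST, TOOLS, "search_kb", query="refund"))

try:
    # Even if injected text convinces the model to request this, the gateway refuses.
    call_tool(SUPPORT_AGENT_ALLOWLIST, TOOLS, "send_email", to="a@example.com", body="x")
except ToolNotAllowed as exc:
    print(f"blocked: {exc}")
```

The point of the design is that the check sits outside the model: a successful injection can change what the agent asks for, but not what it is allowed to do.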
8. What is the training data provenance for models you operate?
For self-hosted models or fine-tunes the vendor provides, the training data provenance affects your copyright exposure and your bias profile.
Red flag: vague answers about "publicly available data."
Good answer: documented training data sources, opt-out compliance status, licence audit, and the bias evaluation methodology the vendor uses. Less applicable if the vendor only routes to third-party models, more critical if they ship their own.
9. How do you handle customer data in training?
Your data should not train someone else's model unless you explicitly opted in. This is a common gotcha in the standard terms.
Red flag: silence on this in the DPA, or an opt-out buried in a sub-clause.
Good answer: explicit no-training-on-customer-data commitment as a default, opt-in (not opt-out) for any use of customer data in model improvement, and audit rights to verify.
10. What does your incident response look like for AI-specific incidents?
Standard cyber incident response is well-defined. AI-specific incidents (prompt injection that exfiltrates data, model hallucination causing material harm to a customer, bias issues surfaced in production) have their own response patterns.
Red flag: AI incidents handled by the generic incident response with no AI-specific runbook.
Good answer: AI-specific incident classification, a runbook that includes model rollback / disable, customer notification timeline, regulator notification triggers under the AI Act, and post-incident reporting.
11. What is your model versioning and change-management practice?
When the vendor updates a model or a prompt, what changes do you see? When is your DPIA or conformity assessment invalidated?
Red flag: "We continuously improve our models" with no notification practice.
Good answer: documented versioning of models and prompts, change notifications to customers for material changes, change classification (immaterial vs material), and the policy that gives you time to re-assess before a material change rolls out.
12. What is your exit pattern?
Can you get your data out, including the agent definitions and audit logs, in a usable format if you leave?
Red flag: data exports limited to the agent outputs, no export of the agent definitions or knowledge base.
Good answer: documented data portability covering customer data, agent definitions, knowledge base content, audit logs, and the export format and timeline. Plus a transition plan that does not leave you stranded.
How to use the 12 questions
Send them in the RFI. The answers tell you two things: whether the vendor has thought about AI-specific risks, and whether they have built the controls or are bluffing. A vendor that struggles with these questions in 2026 is a vendor whose product was built before the regulatory landscape shifted; they may catch up, but their roadmap is the risk you are buying.
The vendors who answer well will not be the cheapest. They will be the ones whose contract you actually sign without a six-month security and compliance negotiation, and whose system survives your first regulator inspection without rework.
About the author
Erwin Berkouwer · Founder, AgentWorks
Erwin Berkouwer is the founder of AgentWorks — an AI agent platform purpose-built for European teams that need EU AI Act-ready governance, multi-LLM choice across OpenAI, Anthropic, Google and Mistral, and transparent per-token € pricing.