EU AI Act Annex III: Classify Your AI System Before the Auditor Does
TL;DR
The decision tree for EU AI Act risk classification: walking Annex I and Annex III, applying the carve-outs correctly, and producing the documentation that defends the classification under regulator inspection. Includes the misclassifications that most commonly lead to fines.
The EU AI Act applies different obligations to different risk classes. Most of the operational pain is in the high-risk category, defined by Annex III plus the products covered in Annex I. The question that decides everything for your AI system is the classification: high-risk, limited-risk, or minimal-risk. Get it wrong upward and you over-engineer compliance for systems that do not need it. Get it wrong downward and the regulator finds the gap during inspection.
The Annex III categorisation is not always obvious. This is the decision tree that handles 80% of the calls without a legal opinion.
The four risk classes in plain language
Prohibited (Article 5): AI systems that are simply not allowed in the EU. Includes manipulative techniques and the exploitation of vulnerabilities, social scoring, real-time remote biometric identification in publicly accessible spaces for law enforcement (with narrow exceptions), and predictive policing based solely on profiling. If your system might fall here, stop and get specialist advice immediately.
High-risk (Annex I products + Annex III use cases): AI systems that, while not prohibited, carry significant risk to safety or fundamental rights. The full compliance regime applies: risk management, data governance, technical documentation, record-keeping, transparency, human oversight, accuracy and robustness, conformity assessment, registration, post-market monitoring.
Limited-risk (Article 50): AI systems that interact with humans, perform emotion recognition or biometric categorisation, or generate synthetic content (chatbots, deepfakes). Transparency obligations apply: users are informed that they are interacting with AI, and synthetic content is labelled.
Minimal-risk: everything else. No specific AI Act obligations, though GDPR and other laws still apply.
The Annex III categories that catch most enterprise deployments
Annex III is the list of high-risk use cases under the AI Act. The categories that most enterprise teams need to understand:
- Biometrics (where not prohibited): remote biometric identification, biometric categorisation by sensitive characteristics, emotion recognition.
- Critical infrastructure: safety components in road traffic management, water/gas/heating/electricity supply, digital infrastructure.
- Education and vocational training: AI for admissions, evaluation of learning outcomes, assessment of appropriate level of education, monitoring of prohibited behaviour during tests.
- Employment, workers management, and access to self-employment: recruitment, evaluation, promotion, termination decisions, task allocation based on individual behaviour or personal traits, performance and behaviour monitoring.
- Access to and enjoyment of essential private services and essential public services and benefits: eligibility assessment for public benefits, creditworthiness assessment, credit scoring (with exception for fraud detection), risk assessment and pricing for life and health insurance, emergency call dispatch and triage.
- Law enforcement: profiling, polygraphs, evidence reliability assessment, criminal analytics, several other narrowly defined uses.
- Migration, asylum, and border control management: polygraphs and similar, security risk assessment, asylum and visa examination, migration related profiling.
- Administration of justice and democratic processes: AI assisting judicial authorities in researching and interpreting facts and law, AI used to influence the outcome of elections or referenda (note: AI tools used to organise/optimise political campaigns are not in scope).
The carve-out you want to know, and the rule that overrides it:
- An AI system in Annex III is not high-risk if it does not pose a significant risk of harm to health, safety, or fundamental rights, because it performs a narrow procedural task, improves the result of a previously completed human activity, detects decision-making patterns (or deviations from them) without being meant to replace or influence the prior human assessment unless there is proper human review, or performs a preparatory task. The provider must document this assessment.
- AI systems in the Annex III categories that perform profiling of natural persons are always high-risk (no carve-out).
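The interaction between the carve-out conditions and the profiling rule can be sketched in a few lines of Python. This is an illustrative sketch, not legal logic: the `CarveOutAssessment` class and its field names are hypothetical, and each boolean stands for a documented human judgement, not something a program can decide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CarveOutAssessment:
    """Hypothetical record of the carve-out answers for one Annex III system."""
    narrow_procedural_task: bool
    improves_completed_human_activity: bool
    pattern_detection_with_human_review: bool
    preparatory_task: bool
    performs_profiling: bool


def carve_out_applies(a: CarveOutAssessment) -> bool:
    """True only if at least one condition holds and the system does no profiling."""
    if a.performs_profiling:
        # Profiling rule: the system stays high-risk whatever else is true.
        return False
    return any((
        a.narrow_procedural_task,
        a.improves_completed_human_activity,
        a.pattern_detection_with_human_review,
        a.preparatory_task,
    ))
```

The profiling check comes first on purpose, so a claimed carve-out can never mask it.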
The decision tree for a new agent
For each AI agent you are deploying, walk this tree:
1. Is the system one of the prohibited practices in Article 5? If yes, stop deployment and seek legal advice. If no, continue.
2. Is the system a safety component of a product covered by Annex I (EU product safety law: machinery, toys, medical devices, etc.)? If yes, the system is high-risk under the relevant product law overlay; full conformity assessment applies. If no, continue.
3. Does the system fall under any Annex III category? Walk the categories above. If yes, the system is presumptively high-risk. If no, the system is limited or minimal-risk and you skip to step 6.
4. Does the Annex III carve-out apply? Is the system genuinely narrow procedural, an improvement on prior human work, pattern detection only, or preparatory? If yes, document the assessment and, subject to step 5, the system is not high-risk (but be honest: most agents do not qualify). If no, the system is high-risk.
5. Does the system perform profiling of natural persons? If yes, regardless of carve-out, the system is high-risk.
6. Does the system interact with humans, generate synthetic content, do biometric categorisation, or do emotion recognition? If yes, the limited-risk transparency obligations apply. If no, minimal-risk obligations only.
This tree handles the majority of calls. The edge cases (when does "preparatory task" apply? what counts as "essential service"?) need specialist input, but the bulk of the classification work is mechanical.
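The mechanical part of the tree can be written down as a small function. The sketch below is a hypothetical illustration: `SystemFacts`, `RiskClass`, and `classify` are invented names, the six booleans are answers a human reviewer supplies, and it assumes that a system which clears the carve-out still goes through the final transparency question.

```python
from dataclasses import dataclass
from enum import Enum


class RiskClass(Enum):
    PROHIBITED = "prohibited"
    HIGH = "high-risk"
    LIMITED = "limited-risk"
    MINIMAL = "minimal-risk"


@dataclass(frozen=True)
class SystemFacts:
    """Hypothetical yes/no answers collected for one AI system."""
    prohibited_practice: bool        # Article 5 question
    product_safety_component: bool   # product safety law question
    annex_iii_category: bool         # Annex III question
    carve_out_applies: bool          # carve-out question
    profiles_natural_persons: bool   # profiling question
    transparency_trigger: bool       # interaction / synthetic content question


def classify(f: SystemFacts) -> RiskClass:
    if f.prohibited_practice:
        return RiskClass.PROHIBITED
    if f.product_safety_component:
        return RiskClass.HIGH
    if f.annex_iii_category:
        # Profiling overrides any claimed carve-out.
        if f.profiles_natural_persons or not f.carve_out_applies:
            return RiskClass.HIGH
    # Assumption: systems that clear the carve-out still face this question.
    if f.transparency_trigger:
        return RiskClass.LIMITED
    return RiskClass.MINIMAL
```

For example, a CV screening agent with no valid carve-out would be entered as `SystemFacts(False, False, True, False, False, True)` and come out as `RiskClass.HIGH`. The function only encodes the ordering of the questions; the answers themselves remain the hard part.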
What good classification documentation looks like
For each agent, record:
- The agent's purpose and intended use, in business language
- The Annex I / Annex III categories considered and the conclusion per category, with reasoning
- Any carve-out claimed and the supporting analysis
- The risk class conclusion and the date of assessment
- The name and role of the person who signed the classification
- A trigger list for re-classification: changes that would require revisiting (model swap, new use case, new data source)
The documentation lives in the system's technical file and updates when the system materially changes. The DPO and the AI Act compliance lead share ownership.
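One way to keep these fields consistent across agents is a structured record. The following is a minimal sketch under assumed names (`ClassificationRecord`, `needs_review`, and every field name are hypothetical); the default trigger set simply mirrors the three examples listed above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Optional, Set


def _default_triggers() -> Set[str]:
    # Mirrors the example triggers in the text; extend per system.
    return {"model swap", "new use case", "new data source"}


@dataclass
class ClassificationRecord:
    """Hypothetical classification record for one agent."""
    agent_name: str
    purpose: str                            # intended use, in business language
    categories_considered: Dict[str, str]   # category -> conclusion with reasoning
    carve_out_claimed: Optional[str]        # supporting analysis, or None
    risk_class: str
    assessed_on: date
    signed_by: str
    signer_role: str
    reclassification_triggers: Set[str] = field(default_factory=_default_triggers)

    def needs_review(self, change: str) -> bool:
        """True if the described change is on the re-classification trigger list."""
        return change in self.reclassification_triggers
```

A change-management hook could call `needs_review("model swap")` before a release and block it until the record is re-signed.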
The most common misclassifications
The patterns that get teams in trouble:
- Recruitment screening tools classified as "limited-risk" because "the human makes the decision." The Annex III text is explicit; if the system is intended for recruitment selection, it is high-risk regardless of how the output is used. The human-in-the-loop is part of the oversight regime, not a get-out clause.
- Credit scoring for fraud detection classified as not in scope because "we are doing fraud detection." The fraud-detection exception in Annex III for creditworthiness is narrow; many systems that look like fraud detection actually use the score for credit eligibility too. The classification needs to cover the actual use, not the named purpose.
- Performance monitoring of employees classified as limited-risk because "we are not making employment decisions with it." Annex III explicitly covers monitoring the "performance and behaviour" of workers. If the agent watches what employees do, it is high-risk.
- Multi-purpose agents classified wholesale by their highest-risk use, when they should be classified per use case. If one agent does three things, two of which are minimal-risk and one of which is high-risk, the agent is high-risk for the regulated use, and the other two inherit some of the obligations through proximity.
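The per-use-case approach in the last item can be sketched as follows. This is a hypothetical illustration: `Risk`, `split_by_regime`, and the use-case names are invented, and treating the non-high-risk uses as "to be reviewed for shared components" is one possible reading of obligations inherited through proximity, not something the Act spells out in these terms.

```python
from enum import Enum
from typing import Dict, List


class Risk(Enum):
    MINIMAL = "minimal-risk"
    LIMITED = "limited-risk"
    HIGH = "high-risk"


def split_by_regime(use_cases: Dict[str, Risk]) -> Dict[str, List[str]]:
    """Keep the classification per use case instead of collapsing to one label."""
    high = sorted(name for name, risk in use_cases.items() if risk is Risk.HIGH)
    other = sorted(name for name, risk in use_cases.items() if risk is not Risk.HIGH)
    return {
        "full_regime": high,
        # Flagged only when they share an agent with a high-risk use.
        "review_shared_components": other if high else [],
    }
```

Called with `{"cv_screening": Risk.HIGH, "meeting_summaries": Risk.MINIMAL, "faq_chat": Risk.LIMITED}`, the function puts only `cv_screening` under the full regime and flags the other two for review.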
Why this matters more than people think
Classification decides the entire compliance scope. High-risk systems need a conformity assessment before deployment, registration in the EU database (for most categories), and continuous post-market monitoring. Limited-risk systems just need transparency. Minimal-risk systems need nothing AI-Act-specific.
A misclassified high-risk system that ships without conformity assessment is in non-compliance from day one, with fines up to EUR 15 million or 3 percent of global annual turnover, whichever is higher. A correctly classified high-risk system that ships with full conformity assessment is a managed deployment. The work to get the classification right is measured in hours; the cost of getting it wrong is measured in a percentage of annual revenue.
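To make the exposure concrete, the ceiling can be computed directly. This is a back-of-the-envelope sketch: `max_fine_eur` is a hypothetical helper, it assumes the "whichever is higher" rule that applies to larger undertakings (SMEs are treated differently), and it returns the upper bound, not a predicted fine.

```python
def max_fine_eur(global_turnover_eur: int,
                 fixed_cap_eur: int = 15_000_000,
                 turnover_percent: int = 3) -> int:
    """Upper bound of the fine: the higher of the fixed cap and the turnover share."""
    # Integer arithmetic avoids floating-point rounding on large amounts.
    turnover_based = global_turnover_eur * turnover_percent // 100
    return max(fixed_cap_eur, turnover_based)


print(max_fine_eur(2_000_000_000))  # 60000000: the 3 percent share dominates
print(max_fine_eur(100_000_000))    # 15000000: the fixed cap dominates
```

Below EUR 500 million in turnover the fixed cap is the binding number; above it, the turnover share is.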
This is the work the compliance team earns its keep on. Worth doing carefully, and worth documenting in a way the regulator can read on first request.
About the author
Erwin Berkouwer · Founder, AgentWorks
Erwin Berkouwer is the founder of AgentWorks — an AI agent platform purpose-built for European teams that need EU AI Act-ready governance, multi-LLM choice across OpenAI, Anthropic, Google and Mistral, and transparent per-token € pricing.