If your CFO asks, “What does AI cost us?” and the answer is “it depends,” you are not alone. Token pricing - charging by the language model’s input and output length - is the de facto meter for frontier models, but few teams translate tokens into budget lines they can defend. This guide demystifies tokens, walks through example calculations, and shows how AgentWorks helps operators connect usage to outcomes.

Tokens in plain language

Models process text as tokens, rough subword chunks - not strictly “words.” A short English sentence might be 15–40 tokens; a page of dense prose can be 500–800. Providers bill input tokens (prompt + context you attach) and output tokens (the completion). Retrieval-heavy workflows inflate input tokens because you prepend document snippets.

Why tokens spike silently

Large system prompts duplicated on every call.
Verbose JSON passed between chained agents.
Retries on flaky tools that re-send the same context.
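To see how these three effects compound, here is a minimal sketch. It assumes every call and every retry re-sends the full system prompt plus context; the function name and all token counts are hypothetical placeholders, not measurements from any real workflow.

```python
def input_tokens_per_task(system_prompt: int, context: int, calls: int, retries: int) -> int:
    """Input tokens billed for one task when each call and retry re-sends everything."""
    attempts = calls + retries
    return attempts * (system_prompt + context)


# Hypothetical numbers: a 1,500-token system prompt, 2,000 tokens of context,
# a chain of 3 agent calls, with and without 2 tool retries.
baseline = input_tokens_per_task(system_prompt=1_500, context=2_000, calls=3, retries=0)
with_retries = input_tokens_per_task(system_prompt=1_500, context=2_000, calls=3, retries=2)

print(baseline, with_retries)
```

In this toy case two retries add two full re-sends of prompt and context, which is why the spike stays invisible if you only watch the number of tasks.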

Worked examples you can copy

Example A - Support macro (short reply)
Input 800 tokens + output 200 tokens at a blended effective rate of €0.003 per 1K tokens → about €0.003 per interaction. At 5,000 tickets/month → ~€15 in variable model cost before platform fees - tiny compared to labor, but measurable.

Example B - Research + long draft pipeline
Input 6,000 tokens + output 1,500 tokens per run at a blended rate of €0.006 per 1K tokens → ~€0.045 per article. One hundred articles/month → ~€4.50 in variable model cost - again modest until you add image tools, embeddings, or multiple retries.
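The arithmetic in Examples A and B can be reproduced with a few lines of Python. This is a sketch that assumes a single blended rate for input and output tokens, as the examples do; real provider price lists usually quote input and output separately, and the function name is illustrative.

```python
def cost_per_run(input_tokens: int, output_tokens: int, blended_eur_per_1k: float) -> float:
    """Variable model cost of one run at a single blended rate per 1K tokens."""
    return (input_tokens + output_tokens) / 1000 * blended_eur_per_1k


support = cost_per_run(800, 200, 0.003)     # Example A
article = cost_per_run(6_000, 1_500, 0.006)  # Example B

print(f"Example A: EUR {support:.4f} per ticket, EUR {support * 5_000:.2f} per month")
print(f"Example B: EUR {article:.4f} per article, EUR {article * 100:.2f} per month")
```

Swap in your own token counts and rates to turn the same function into a budget line.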

Example C - High-context RAG
If each query ships 12k tokens of retrieved text, your input side dominates. Cutting chunk size by 20% can save more than swapping models - a reminder to optimize architecture, not only unit price.
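A quick comparison makes the point concrete. Only the 12k retrieved tokens and the 20% trim come from the example above; the 500-token answer, the €0.006 blended rate, and the 10% cheaper alternative model are assumptions chosen for illustration.

```python
def query_cost(input_tokens: int, output_tokens: int, eur_per_1k: float) -> float:
    """Cost of one RAG query at a blended rate per 1K tokens."""
    return (input_tokens + output_tokens) / 1000 * eur_per_1k


RETRIEVED = 12_000   # from Example C
OUTPUT = 500         # assumed answer length
RATE = 0.006         # assumed blended EUR per 1K tokens

baseline = query_cost(RETRIEVED, OUTPUT, RATE)
trimmed = query_cost(int(RETRIEVED * 0.8), OUTPUT, RATE)   # 20% less retrieved text
swapped = query_cost(RETRIEVED, OUTPUT, RATE * 0.9)        # assumed 10% cheaper model

for label, cost in [("baseline", baseline), ("trimmed chunks", trimmed), ("cheaper model", swapped)]:
    print(f"{label}: EUR {cost:.4f} per query")
```

Under these assumptions trimming wins, but the result flips if the alternative model is much cheaper, so run the comparison with your actual rates before deciding.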

How AgentWorks surfaces real-time costs

AgentWorks focuses on per-template and per-run visibility so owners can see which workflows consume tokens and how approvals affect throughput. Pair that discipline with the worked examples in this guide when you socialize numbers internally.

Budgeting tips that actually stick

Attach monthly caps per workspace or template.
Separate sandbox spend from production keys.
Review top 10 expensive runs weekly for the first quarter after launch.
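If you export run-level cost data, the cap check and the weekly review are simple to script. The sketch below assumes a hypothetical export of `(run_id, template, cost_eur)` records; the record shape, template names, and cap values are illustrative and do not describe an AgentWorks API.

```python
import heapq
from collections import defaultdict

# Hypothetical run records: (run_id, template, cost_eur)
runs = [
    ("r1", "support-macro", 0.003),
    ("r2", "research-draft", 0.045),
    ("r3", "research-draft", 0.210),
    ("r4", "support-macro", 0.004),
    ("r5", "rag-query", 0.075),
]

# Hypothetical monthly caps in EUR (kept tiny so the toy data trips one)
caps = {"support-macro": 0.005, "research-draft": 1.00, "rag-query": 0.50}


def spend_by_template(records):
    """Sum cost per template."""
    totals = defaultdict(float)
    for _run_id, template, cost in records:
        totals[template] += cost
    return dict(totals)


def over_cap(totals, limits):
    """Templates whose spend exceeds their cap; templates without a cap are ignored."""
    return sorted(t for t, spent in totals.items() if spent > limits.get(t, float("inf")))


def top_runs(records, n=10):
    """The n most expensive runs, highest cost first."""
    return heapq.nlargest(n, records, key=lambda r: r[2])


print(over_cap(spend_by_template(runs), caps))
print(top_runs(runs, n=2))
```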

Negotiating vendor bills beyond tokens

Remember egress, storage, embeddings, and tool calls (search APIs, CRM writes). A token bill is only one line in the spreadsheet. For packaging comparisons - including bundled assistants versus standalone platforms - see Copilot vs AgentWorks.

Ready to model costs with templates instead of ad-hoc chats? Open an account and connect your first workflow with metering you can explain in a finance review.

Tokens and quality: the hidden tradeoff

Cheaper models lower the price per token but often require more iterative prompting - which can erase the savings. Conversely, premium models may reduce rework. The right answer is task-specific: classify workflows, run a two-week bake-off on real prompts, and lock choices per template.
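One way to score a bake-off is cost per accepted output rather than cost per call. The sketch below assumes you have measured the average number of attempts each model needs before a result is accepted; the rates and attempt counts are made-up placeholders, not benchmarks of any real model.

```python
def cost_per_accepted_output(tokens_per_attempt: int, eur_per_1k: float, avg_attempts: float) -> float:
    """Expected model cost to reach one accepted result."""
    return tokens_per_attempt / 1000 * eur_per_1k * avg_attempts


# Placeholder bake-off results for one template
cheap = cost_per_accepted_output(tokens_per_attempt=2_000, eur_per_1k=0.003, avg_attempts=3.0)
premium = cost_per_accepted_output(tokens_per_attempt=2_000, eur_per_1k=0.006, avg_attempts=1.2)

print(f"cheap model:   EUR {cheap:.4f} per accepted output")
print(f"premium model: EUR {premium:.4f} per accepted output")
```

With these placeholder numbers the premium model is cheaper per accepted output despite costing twice as much per token. Your own measurements may point the other way, which is the reason to run the bake-off per template.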

Embeddings and retrieval costs

Vector indexes incur compute and storage beyond raw chat tokens. Budget line items should include reindex jobs after major document updates. If finance only watches chat completions, RAG programs look artificially cheap until month-end surprises land.

Showing finance a dashboard they trust

Translate tokens into FTE hours saved using conservative assumptions. If a support macro saves four minutes per ticket and you deflect two hundred tickets weekly, that is about thirteen hours a week (800 minutes) - before you count happier customers. Pair those narratives with pricing conversations so procurement sees both unit economics and outcomes.
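The hours-saved figure is simple enough to keep in a script next to the cost numbers. This sketch uses the four minutes and two hundred tickets from the paragraph above and reuses the roughly €0.003 per-interaction cost from Example A; the function names are illustrative.

```python
def hours_saved_per_week(minutes_per_ticket: float, tickets_per_week: int) -> float:
    """Staff time saved per week, in hours."""
    return minutes_per_ticket * tickets_per_week / 60


def model_cost_per_week(cost_per_ticket_eur: float, tickets_per_week: int) -> float:
    """Variable model cost for the same tickets."""
    return cost_per_ticket_eur * tickets_per_week


hours = hours_saved_per_week(minutes_per_ticket=4, tickets_per_week=200)
cost = model_cost_per_week(cost_per_ticket_eur=0.003, tickets_per_week=200)

print(f"{hours:.1f} hours saved per week for about EUR {cost:.2f} in model cost")
```

Multiply the hours by a loaded hourly rate that your finance team agrees on to get the outcome side of the comparison.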

Governance costs money too - budget it

Logging, retention, and redaction are not free, but they are cheaper than regulatory findings or brand incidents. Treat compliance features as insurance with a measurable premium - especially when operating under EU expectations summarized in EU AI Act 2026 requirements.

Next step

Run a thirty-day metering pilot on one template, publish the results internally, then expand. Sign up, connect data sources, and let the numbers tell the story finance wants to hear.

Related articles

AI Total Cost of Ownership: The 12-Month Model That Catches the Surprises

AI Workforce Sizing: How Many Agents Do You Actually Need

CFO Guide to AI Agent ROI: A Calculation That Survives Board Review