Does using an MCP server always mean the LLM has broad access to my systems?

No, but by default many implementations end up that way because they authenticate the whole server with one shared credential and expose broad, free-form tools. Scoping each tool to a narrow, named operation with its own limited credential prevents this without changing how the agent calls the tool.

What is the difference between prompt injection and tool poisoning in MCP?

Prompt injection is malicious text that coerces the model into taking an unintended action, often arriving through content the agent retrieves, such as a document, a web page, or an API response. Tool poisoning is a specific case where the malicious content is embedded in a tool's description or output to manipulate which tool the model chooses or what arguments it sends.

Do I need human approval on every MCP tool call?

No. Reserve human-in-the-loop approval for actions with real consequences, such as financial transactions, bulk data changes, external communications, or destructive operations. Read-only and low-risk tools can run without a checkpoint, provided they are properly scoped and logged.

Can I trust MCP servers pulled from public registries?

Treat them as unreviewed third-party code, not vetted infrastructure. Run them in an isolated container, restrict their network egress to only what the integration needs, and review the server's source or maintainer reputation before granting it any write-capable credential.

Securing MCP Servers in Production: Least Privilege

Your agents now call tools through MCP servers instead of hardcoded API integrations. That flexibility comes with a new attack surface: every MCP server is a bridge between an LLM and a real system, and most teams wire that bridge with a single shared credential and no scoping. If the agent is compromised through a malicious prompt, the blast radius is whatever that credential can touch.

Least privilege is the fix. It applies at four layers, though most teams enforce it at only one: the server process, the tool schema, the credential, and the network path.

Why MCP breaks the old trust model

A traditional API integration has a fixed set of calls, reviewed once at build time. An MCP server exposes tools dynamically, and the LLM decides at runtime which tool to call and with what arguments, based on text it read. That text can come from a user, a document, a web page, or the output of another tool. If any of it contains instructions, the model may follow them.

This is a confused deputy problem: the agent has legitimate access to a tool, but an attacker manipulates it into using that access on the attacker's behalf. Security researchers have documented this as tool poisoning and resource poisoning: malicious content embedded in a file, ticket, or API response that the agent later retrieves and treats as trusted instructions rather than as data. The MCP specification does not require capability attestation, so a server can claim broader permissions than a client expects, and there is no built-in origin check on server-initiated requests. None of this is theoretical; it shows up in red-team reports against production deployments.

The practical consequence: you cannot secure an MCP integration by securing the LLM's judgment. You secure it by shrinking what each tool call is physically capable of doing, so a bad decision has a small consequence.

Four layers of least privilege

1. Process isolation

Run each MCP server in its own sandboxed process or container, not as a library loaded into your main application. Container isolation means a compromised server cannot read your application's memory, reach other MCP servers on the same host, or write to the host filesystem. This also limits what a malicious or unmaintained third-party MCP server, pulled from a public registry, can do before you have vetted it. Treat every external MCP server as untrusted code until it has been through review, the same way you would treat a dependency with install scripts.
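As a rough illustration of the isolation described above, the sketch below builds a locked-down `docker run` command line for a single MCP server. The image name, the uid, and the resource limits are placeholders, and it assumes the server speaks MCP over stdio; treat it as a starting point to adapt, not a hardened baseline.

```python
import shlex


def sandboxed_docker_argv(image: str, server_name: str) -> list[str]:
    """Build a `docker run` command line for one MCP server (illustrative only)."""
    return [
        "docker", "run", "--rm",
        "-i",                                   # keep stdin open (assumes stdio transport)
        "--name", f"mcp-{server_name}",
        "--read-only",                          # container root filesystem is not writable
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation via setuid binaries
        "--user", "65534:65534",                # unprivileged uid:gid (placeholder value)
        "--memory", "256m",                     # cap memory (placeholder value)
        "--pids-limit", "64",                   # cap process count (placeholder value)
        "--network", "none",                    # no network until an allowlisted path is added
        image,
    ]


# Hypothetical image and server name, shown only to print the resulting command.
argv = sandboxed_docker_argv("example.invalid/ticket-mcp:1.0", "tickets")
print(shlex.join(argv))
```

Starting from `--network none` and adding back only the path a server needs keeps the default closed. A server that must reach a backend would instead join a dedicated network whose egress is filtered.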

2. Scoped tool schemas, not omnibus tools

The single most common finding in MCP security reviews is a tool that accepts free-form input and dispatches to arbitrary backend operations at runtime: a "run_query" tool that takes any SQL string, or a "call_api" tool that takes any endpoint. These collapse your entire permission model into "whatever the model decides to send."

Replace them with narrow, named tools: get_invoice_by_id, update_ticket_status, list_open_orders. Each tool's input schema, output shape, and downstream credential should be the minimum the task requires. This is more work upfront, but it is the only version of MCP tooling that survives a security audit.
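To make the contrast with an omnibus tool concrete, here is a minimal sketch of what get_invoice_by_id might look like. The INV-000123 identifier format, the in-memory `_FAKE_DB`, and the returned fields are all assumptions made for illustration; the point is that the tool accepts exactly one validated argument and returns one fixed shape.

```python
import re
from dataclasses import dataclass

INVOICE_ID = re.compile(r"INV-[0-9]{6}")  # hypothetical ID format

# JSON Schema the tool could advertise: one field, one pattern, nothing extra.
GET_INVOICE_INPUT_SCHEMA = {
    "type": "object",
    "properties": {"invoice_id": {"type": "string", "pattern": "^INV-[0-9]{6}$"}},
    "required": ["invoice_id"],
    "additionalProperties": False,
}


@dataclass(frozen=True)
class Invoice:
    invoice_id: str
    status: str
    total_cents: int


# Stand-in for a real backend lookup.
_FAKE_DB = {"INV-000123": Invoice("INV-000123", "paid", 129900)}


def get_invoice_by_id(arguments: dict) -> dict:
    """Narrow tool: one lookup, validated input, fixed output shape."""
    if set(arguments) != {"invoice_id"}:
        raise ValueError("expected exactly one argument: invoice_id")
    invoice_id = arguments["invoice_id"]
    if not isinstance(invoice_id, str) or not INVOICE_ID.fullmatch(invoice_id):
        raise ValueError("invoice_id must look like INV-000123")
    invoice = _FAKE_DB.get(invoice_id)
    if invoice is None:
        return {"found": False}
    return {
        "found": True,
        "invoice_id": invoice.invoice_id,
        "status": invoice.status,
        "total_cents": invoice.total_cents,
    }
```

Validating again inside the handler, rather than trusting the advertised schema alone, means a malformed or injected argument is rejected even if the client skips schema checks.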

3. Scoped tokens per tool, not one key for the server

Most early MCP deployments authenticate the whole server with a single service-account API key, shared across every user and every tool the server exposes. That means you cannot tell which user triggered which downstream action, and a leaked key grants access to everything the server can do.

Two changes fix this. First, use per-request identity: pass the calling user's authenticated identity through to the downstream call, not a static service credential, so every action is attributable. Second, mint scoped tokens per tool rather than per server: the tool that reads a calendar should hold a token that can only read that calendar, not one that can also send email. Start tools at read-only or discovery scope by default and require an explicit elevation step, logged and reviewable, before granting write or destructive scopes.
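One way to picture both changes together is a short-lived token bound to one user, one tool, and one scope. The sketch below uses an HMAC-signed payload from the standard library purely for illustration; the scope names, the tool-to-scope table, and the hard-coded signing key are assumptions, and a real deployment would more likely rely on its identity provider's token format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SIGNING_KEY = b"demo-only-key"  # assumption: in practice this comes from a secrets manager

# Hypothetical mapping from tool name to the single scope it holds by default.
DEFAULT_TOOL_SCOPES = {
    "read_calendar": "calendar:read",
    "send_email": "email:send",
}


def mint_token(user_id: str, tool: str, ttl_seconds: int = 300,
               now: Optional[float] = None) -> str:
    """Issue a short-lived token bound to one user, one tool, and one scope."""
    issued = time.time() if now is None else now
    claims = {"sub": user_id, "tool": tool,
              "scope": DEFAULT_TOOL_SCOPES[tool], "exp": issued + ttl_seconds}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{sig}"


def authorize(token: str, required_scope: str, now: Optional[float] = None) -> dict:
    """Verify signature, expiry, and scope; return the claims for attribution."""
    payload, _, sig = token.partition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload))
    current = time.time() if now is None else now
    if current >= claims["exp"]:
        raise PermissionError("token expired")
    if claims["scope"] != required_scope:
        raise PermissionError(f"scope {claims['scope']!r} cannot perform {required_scope!r}")
    return claims
```

Because the claims carry the calling user's identity, the downstream service can log who triggered each action, and a token minted for read_calendar is rejected if it is presented for email:send.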

4. Network egress allowlisting

An MCP server that can reach anything on the internet is a data exfiltration channel waiting to be used. Put each server behind an explicit egress allowlist: the specific hosts it needs, such as a CRM API, a ticketing system, or an internal database, and nothing else. This turns a successful prompt injection into a contained failure: the agent might be tricked into attempting a bad call, but it has nowhere to send the result except the systems you already approved.
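The allowlist check itself is small. The sketch below matches on the exact host and port and requires https; the hostnames are placeholders. An in-process check like this complements network-level enforcement (a firewall or egress proxy) but does not replace it, since a compromised process can skip its own checks.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for one server: exact hostnames plus the port each is reached on.
EGRESS_ALLOWLIST = {
    ("crm.example.com", 443),
    ("tickets.example.com", 443),
}


def egress_allowed(url: str) -> bool:
    """Return True only for https URLs whose exact host and port are allowlisted."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme != "https" or parts.hostname is None:
        return False
    if port is None:
        port = 443  # default https port
    return (parts.hostname, port) in EGRESS_ALLOWLIST
```

Matching the parsed hostname exactly, rather than searching the URL string, is what keeps lookalikes such as crm.example.com.evil.test or a URL that merely mentions an approved host in its query string from passing.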

Treat tool output as untrusted input

The layer teams skip most often is what happens after a tool call returns. Output from a tool, such as a web page, a document, or a database row, flows back into the model's context and can itself contain instructions telling the model to ignore its prior steps and take a new action. The model has no reliable way to distinguish data it retrieved from instructions it should follow unless you build that distinction for it.

Two practical mitigations: strip or escape markup and instruction-like phrasing from tool outputs before they re-enter the context, and add per-tool audit logging that records the exact arguments and the exact output for every call, independent of what the model reports doing. When something goes wrong, the audit log, not the model's own narration, is what tells you what actually happened.
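Both mitigations can live in one wrapper around the tool call, as in the sketch below. The phrase patterns, the `tool_output` delimiter, and the in-memory `AUDIT_LOG` are assumptions for illustration. Pattern filtering in particular is a weak heuristic that is easy to evade, so the escaping and the explicit "untrusted" labeling do most of the work here.

```python
import html
import json
import re
from typing import Callable

# Hypothetical patterns; a real deployment would tune these and not rely on them alone.
SUSPICIOUS = re.compile(
    r"\b(ignore (all |any )?(previous|prior) instructions|system prompt)\b",
    re.IGNORECASE,
)

AUDIT_LOG: list[str] = []  # stand-in for an append-only store outside the agent's control


def wrap_untrusted(tool_name: str, output: str) -> str:
    """Escape markup and label the text as data before it re-enters the context."""
    escaped = html.escape(output)  # neutralizes <, >, &, and quotes
    flagged = SUSPICIOUS.sub("[removed: instruction-like text]", escaped)
    return f'<tool_output tool="{tool_name}" trust="untrusted">\n{flagged}\n</tool_output>'


def call_tool_audited(tool_name: str, tool_fn: Callable[[dict], str], arguments: dict) -> str:
    """Run a tool, record exact arguments and raw output, return the sanitized text."""
    raw = tool_fn(arguments)
    # Log the raw output, not the sanitized one, so the record shows what actually came back.
    AUDIT_LOG.append(json.dumps(
        {"tool": tool_name, "arguments": arguments, "output": raw}, sort_keys=True))
    return wrap_untrusted(tool_name, raw)
```

The log entry is written by the wrapper, not by the model, so it stays accurate even when the model's own account of what it did is wrong.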

Rotate credentials issued to MCP servers on a fixed schedule, and immediately on any server update or dependency change. A long-lived key on a server whose code you don't control is a standing liability, not a one-time risk you accepted at setup.
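The rotation rule can be expressed as a simple check, sketched below. The 30-day maximum age and the idea of recording a digest of the server build (for example an image digest or lockfile hash) at issue time are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class IssuedCredential:
    server: str
    issued_at: datetime
    server_digest: str  # hypothetical: digest of the server build when the credential was issued


def needs_rotation(cred: IssuedCredential, current_digest: str, now: datetime,
                   max_age: timedelta = timedelta(days=30)) -> bool:
    """Rotate when the server build changed or the credential reached its maximum age."""
    if cred.server_digest != current_digest:
        return True  # server update or dependency change
    return now - cred.issued_at >= max_age  # fixed schedule


issued = datetime(2025, 1, 1, tzinfo=timezone.utc)
cred = IssuedCredential("tickets", issued, "sha256:aaa")
print(needs_rotation(cred, "sha256:aaa", issued + timedelta(days=5)))
```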

What this looks like in a managed platform

Building all of this by hand, meaning the four layers (sandboxing, scoped tool schemas, per-tool token minting, and egress control) plus output sanitization and audit logging, is a real engineering program, not a checklist you finish in an afternoon. It is also exactly the boundary between an agent that is safe to give to non-engineers and one that requires a security team's sign-off for every new integration.

This is the case for handling tool access at the platform layer rather than reimplementing it per agent. AgentWorks masks PII at the gateway before it reaches a model, keeps an append-only audit trail of every tool call, and gates sensitive actions behind human-in-the-loop approval. These are the same kind of control points this article walks through, applied consistently across every agent instead of once per project. See the AgentWorks agent platform for how tool access, approval gates, and audit logging are wired together by default.

Getting started

Inventory every MCP server your agents currently call, and list the credential each one uses today.
Split any omnibus "do anything" tool into named, narrow-scope tools before adding new capabilities.
Move from one shared server credential to per-tool scoped tokens, starting with read-only scopes and elevating only the scopes a tool actually needs.
Add an egress allowlist for each server, and an output-sanitization step that every tool response passes through before it re-enters the model's context.

None of these steps require rewriting your agent logic. They require treating each MCP server the way you would treat a new microservice with access to production data: reviewed, scoped, sandboxed, and logged, before it goes live.
