MCP vs. custom tool integrations — what we learned building Agenhost
MCP is new, the trade-offs are real, and we made the call in production. Here’s the decision, documented.
We didn't plan to use MCP. The protocol was six months old, the documentation was thin, and half our team's initial reaction was “let's just build a custom tool layer and not depend on something that might not survive.” Three months and two production deployments later, we're glad we ignored that instinct.
This is the story of why we switched, where the switch paid off, and where a custom tool layer is still the right call. It's grounded in one real system—Agenhost—and one real migration, not a benchmark we ran to write a blog post.
What we built first, and why it hurt
Agenhost started life with a hand-rolled tool layer. Every integration—the database reader, the file writer, the external API callers—was a bespoke function wired directly into the agent's execution loop. It worked. For about five tools.
The problem wasn't any single integration. It was the combinatorics. Each new tool meant a new schema to define, a new error-handling path, a new place for the agent's function-calling contract to drift out of sync with the actual implementation. By the time we had a dozen tools, a third of our agent bugs were really integration-contract bugs in disguise.
TOOLS = {}  # global registry: tool name -> implementation and schema

def register_tool(name, fn, schema):
    # every tool: its own schema, its own validation,
    # its own error path, its own drift risk
    TOOLS[name] = {"fn": fn, "schema": schema}

# ...repeated 12x, each subtly different
Why MCP changed the math
MCP—the Model Context Protocol—gives agents a standardized, secure way to access tools, data sources, and APIs. Instead of building a custom connector for every integration, you run MCP servers that are modular, reusable, and model-agnostic.
The win wasn't that MCP let us do something we couldn't do before. It's that it moved the integration contract out of our agent loop and into a standard the tooling already understood. The schema, the validation, and the transport stopped being our code to maintain.
The win wasn't new capability. It was deleting the code most likely to break.
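To make "the schema stopped being our code to maintain" concrete: one way a tool layer avoids hand-written schemas is to derive them from the function signature, so the implementation is the only source of truth. The sketch below is a stdlib-only illustration of that idea, not the MCP SDK's API and not Agenhost's actual code. `schema_from_signature` and `read_rows` are hypothetical names, and the type mapping covers only a few primitives.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Illustrative subset: Python annotation -> JSON Schema type name.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_signature(fn: Callable[..., Any]) -> dict:
    """Build a JSON-Schema-style input schema from fn's type hints."""
    hints = get_type_hints(fn)
    hints.pop("return", None)  # only parameters belong in the input schema
    properties = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Assumption for this sketch: unannotated or unknown types fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {"type": "object", "properties": properties, "required": required}


def read_rows(table: str, limit: int = 100) -> list:
    """Hypothetical database-reader tool."""
    return [{"table": table, "row": i} for i in range(limit)]


schema = schema_from_signature(read_rows)
```

Because the schema is computed from `read_rows` itself, renaming a parameter or changing its type cannot leave a stale schema behind, which is the class of bug the hand-rolled layer kept producing.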
Two things got measurably better. First, adding a tool became a server-registration task, not agent-loop surgery. Second, and this mattered more than we expected, swapping the underlying model from gpt-4o to claude-sonnet stopped touching the tool layer at all. The MCP boundary made the model a configuration detail.
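The reason the model drops out of the picture is visible in the wire format. MCP is built on JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying a tool name and its arguments. The sketch below builds such a request with the standard library only. The tool name `read_rows`, its arguments, and the `AGENT_CONFIG` dictionary are hypothetical, and a real client would normally use an MCP SDK rather than assemble messages by hand.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def tools_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP-style `tools/call` request as JSON-RPC 2.0."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Hypothetical agent configuration: the model lives here, not in the tool layer.
AGENT_CONFIG = {"model": "claude-sonnet"}  # previously "gpt-4o"

request = tools_call_request("read_rows", {"table": "orders", "limit": 10})
```

Nothing in the request names the model, so changing `AGENT_CONFIG["model"]` leaves every tool call byte-for-byte the same apart from the request id.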
The migration itself
We migrated incrementally, one tool category at a time, running the legacy layer and the MCP layer side by side behind a flag. No big-bang rewrite. The riskiest tools—the ones with write access—went last, after we trusted the read-only ones in production for two weeks.
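A side-by-side migration like this can be sketched as a router that picks a layer per tool category from a flag table. This is a minimal illustration under assumed names, not Agenhost's code: the categories (`db_read`, `api_call`, `file_write`), the `MCP_ENABLED` flags, and the stub handlers are all hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[dict], dict]

# Hypothetical per-category flags; the write-capable category stays on legacy longest.
MCP_ENABLED: Dict[str, bool] = {
    "db_read": True,
    "api_call": True,
    "file_write": False,
}


def make_router(
    legacy: Dict[str, Handler],
    mcp: Dict[str, Handler],
    flags: Dict[str, bool],
) -> Callable[[str, dict], dict]:
    """Return a function that dispatches each call to the layer its flag selects."""
    def route(category: str, args: dict) -> dict:
        # Flags are read at call time, so flipping one takes effect immediately.
        layer = mcp if flags.get(category, False) else legacy
        return layer[category](args)
    return route


def _stub(layer: str, category: str) -> Handler:
    """Stand-in handler that reports which layer served the call."""
    return lambda args: {"layer": layer, "category": category, "args": args}


categories = ["db_read", "api_call", "file_write"]
legacy = {c: _stub("legacy", c) for c in categories}
mcp = {c: _stub("mcp", c) for c in categories}
route = make_router(legacy, mcp, MCP_ENABLED)
```

Defaulting unknown categories to the legacy layer keeps the rollout conservative: a category only moves to MCP when someone explicitly sets its flag, and setting it back to `False` is the rollback.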
Where a custom layer is still right
MCP works best for systems where the tool surface is well-defined at design time. If your tool set changes frequently at runtime, as it does for some agentic applications, the overhead of managing MCP servers becomes significant, and a more dynamic registry approach may serve you better.
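For readers wondering what a dynamic registry might look like, here is a minimal in-process sketch in which tools can be added and removed while the agent runs. The `ToolRegistry` class and the `echo` tool are hypothetical, and a production version would also need schemas, access control, and concurrency handling that this sketch leaves out.

```python
from typing import Any, Callable, Dict, List


class ToolRegistry:
    """In-process registry whose tool set can change at runtime."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = fn

    def unregister(self, name: str) -> None:
        # Removing a tool that is not registered is a no-op.
        self._tools.pop(name, None)

    def names(self) -> List[str]:
        """Tools currently available, in a stable order for prompting."""
        return sorted(self._tools)

    def call(self, name: str, **kwargs: Any) -> Any:
        try:
            fn = self._tools[name]
        except KeyError:
            raise LookupError(f"unknown tool: {name}") from None
        return fn(**kwargs)


registry = ToolRegistry()
registry.register("echo", lambda text: text)  # hypothetical tool
result = registry.call("echo", text="hello")
```

The trade-off is the one the rest of this post describes: the registry is cheap to change at runtime, but its contract is once again yours to maintain.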
Two more honest caveats. MCP is young: the ecosystem moved under us twice during the build, and we had to pin versions aggressively. And for a system with three tools that will never grow, the protocol is overhead you don't need—the custom layer we started with was the correct choice at five tools. It only became wrong at twelve.
The lesson we took away wasn't “use MCP.” It was that the integration contract is the part of an agentic system most likely to rot—and the architecture that survives is the one that makes that contract someone else's problem to maintain. MCP did that for us. At a smaller scale, a disciplined custom layer does it too.
If you're standing where we were six months ago—protocol looks immature, instinct says build your own—the question to ask isn't “is MCP ready?” It's “how many tools will this system have in a year?” Our answer was “more than twelve.” That made the decision for us.