I read through several of the top-level pages, then SQLite, but still had no idea what was meant by "context": it's a highly ambiguous word and is never given a concrete definition, an example, or the scope of capability it is meant to imply.
After reading the Python server tutorial, it looks like there is some tool calling going on, in the old terminology. That makes more sense. But none of the examples seem to indicate what the protocol actually is, whether it's a RAG sort of thing, whether I need to prompt, etc.
It would be nice to provide a bit more concrete info about capabilities and what the purpose is before getting into call diagrams. What do the arrows represent? That's more important to know than the order in which a host talks to a server talks to a remote resource.
I think this is something that I really want and want to build a server for, but it's unclear to me how much more time I will have to invest before getting the basic information about it!
The gist of it is: you have an LLM application such as Claude Desktop. You want to have it interact (read or write) with some system you have. MCP solves this.
For example, you can give the application the database schema as a "resource", effectively saying: here is a bunch of text, do whatever you want with it during my chat with the LLM. Or you can give the application a tool such as "query my database". Now the model itself can decide when it wants to query (usually because you said: hey, tell me what's in the accounts table, or something similar).
It's "bring the things you care about" to any LLM application with an MCP client.
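To make that concrete, the same database could be exposed both ways. Roughly (illustrative shapes, not exact SDK code):

// A resource: a chunk of content the client can read and drop into the chat's context.
const schemaResource = {
  uri: "db://myapp/schema",
  name: "Database schema",
  description: "DDL for all tables in the app database",
  mimeType: "text/plain",
};

// A tool: an action the model can decide to invoke during the conversation.
const queryTool = {
  name: "query_database",
  description: "Run a read-only SQL query against the app database.",
  inputSchema: {
    type: "object",
    properties: { sql: { type: "string" } },
    required: ["sql"],
  },
};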
We definitely hope this will solve the NxM problem.
On tools specifically, we went back and forth about whether the other primitives of MCP ultimately just reduce to tool use, but concluded that separate concepts of "prompts" and "resources" are extremely useful to express different _intentions_ for server functionality. They all have a part to play!
I think this is where the real question is for me. When I read about MCP, the topmost question in my mind is "Why isn't this just tool calling?" I had difficulty finding an answer to this. Below, you have someone else asking "Why not just use GraphQL?" And so on.
It would probably be helpful for many of your readers if you had a focused document that addressed specifically that motivating question, together with illustrated examples. What does MCP provide, and what does it intend to solve, that a tool calling interface or RPC protocol can't?
Yeah, even I don't understand how exactly it solves the N×M problem (which translates to having M different prompts for N different LLMs; correct me if I'm wrong, please).
N (LLM clients/vendors) × M (tools/tool suppliers).
The N×M problem may simply be moved rather than solved:
- Instead of N×M direct integrations,
- we now have N MCP client implementations
- and M MCP server implementations.
This feels similar to SOAP, but it might be more of a lower-level protocol, similar to HTTP itself. Hard to tell with the implementation examples being pretty subjective programs in Python.
It seems to support your ask, as much as a protocol can. Having read all the docs and looked through some code, my mental model is:
- A host never talks to a server directly, only via a Client (which is presumably a human). The host has or is the LLM (app).
- A server only supplies context data (read-only), in the form of a tool call, a direct resource URL, or a pre-populated prompt. It can call back to a client directly, for example to request something from the host's LLM.
- A client sits in the middle, representing the human in the loop. It manages the requests bidirectionally.
It seems mostly modeled around security boundaries, rather than just AI capability domains. The client is always in the loop; the host and server do not directly communicate.
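That model also shows up on the wire: everything is JSON-RPC 2.0, with the client calling the server for context and the server able to call back through the client. Roughly (payload illustrative):

// Client -> server: ask the server to run one of its tools.
const toolCall = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "list_directory", arguments: { path: "/tmp" } },
};
// The reverse direction (server -> client, e.g. sampling) uses the same
// JSON-RPC shape, just initiated from the server side.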
How can an add-on that works with arbitrary "servers" tell the difference between these two tools? Without being able to tell the difference, you can't really build a generic way to ask for confirmation in the application that is using the server:
{
name: "create_directory",
description:
"Create a new directory or ensure a directory exists. Can create multiple " +
"nested directories in one operation. If the directory already exists, " +
"this operation will succeed silently. Perfect for setting up directory " +
"structures for projects or ensuring required paths exist. Only works within allowed directories.",
inputSchema: zodToJsonSchema(CreateDirectoryArgsSchema) as ToolInput,
},
{
name: "list_directory",
description:
"Get a detailed listing of all files and directories in a specified path. " +
"Results clearly distinguish between files and directories with [FILE] and [DIR] " +
"prefixes. This tool is essential for understanding directory structure and " +
"finding specific files within a directory. Only works within allowed directories.",
inputSchema: zodToJsonSchema(ListDirectoryArgsSchema) as ToolInput,
},
Great work on the protocol!!
I am looking for some examples of creating my own custom client with the Anthropic API leveraging MCP, but I could not find any. I pretty much want to understand how Claude Desktop integrates MCP servers with the Anthropic API.
Can you provide some pointers about the integration?
e.g.
At first glance it seems to be a proposed standard interface and protocol for describing and offering an external system to the function-calling faculty of an LLM.
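For the question above about building a custom client, here's a rough sketch of how that bridging might look. The Anthropic Messages API calls are real; the mcp.listTools()/mcp.callTool() helpers are stand-ins for whatever your MCP client actually exposes:

import Anthropic from "@anthropic-ai/sdk";

// Stand-in for an MCP client connection; the real SDK call names may differ.
declare const mcp: {
  listTools(): Promise<Array<{ name: string; description?: string; inputSchema: any }>>;
  callTool(name: string, args: unknown): Promise<unknown>;
};

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// 1. Ask the connected MCP server what it offers, reshaped into the
//    Anthropic tool-use format.
const tools = (await mcp.listTools()).map((t) => ({
  name: t.name,
  description: t.description,
  input_schema: t.inputSchema,
}));

// 2. Let the model decide whether it wants to use one of them.
const response = await anthropic.messages.create({
  model: "claude-3-5-sonnet-20241022",
  max_tokens: 1024,
  tools,
  messages: [{ role: "user", content: "What's in the accounts table?" }],
});

// 3. Forward any tool_use blocks to the MCP server, then feed the results
//    back to the model in a follow-up message (omitted here).
for (const block of response.content) {
  if (block.type === "tool_use") {
    await mcp.callTool(block.name, block.input);
  }
}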
> had no idea what was meant by "context" as it's a highly ambiguous word and is never mentioned with any concrete definition
(forgive me if you know this and are asking a different question, but:)
I don't know how familiar you are with LLMs, but "context" used in that context generally has the pretty clear meaning of "the blob of text you give in between (the text of) the system prompt and (the text of) the user prompt"[1], which acts as context for the user's request (hence the name). Very often this is the conversation history in chatbot-style LLMs, but it can include stuff like the content of text files you're working with, or search/function results.
[1] If you want to be pedantic, technically each instance of "text" should say "tokens" there, and the maximum "context" length includes the length of both prompts.
1. The sampling documentation is confusing. "Sampling" means something very specific in statistics, and I'm struggling to see any connection between the term's typical usage and the usage here. Perhaps "prompt delegation" would be a more obvious term to use.
Another thing that's confusing about the sampling concept is that it's initiated by a server instead of a client, a reversal of how client/server interactions normally work. Without concrete examples, it's not obvious why or how a server might trigger such an exchange.
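From what I can piece together, on the wire it's just a JSON-RPC request travelling in the "reverse" direction, from server to client, asking the client to run a completion on its behalf; something roughly shaped like this (payload illustrative):

const samplingRequest = {
  jsonrpc: "2.0",
  id: 7,
  method: "sampling/createMessage", // sent by the server, answered by the client
  params: {
    messages: [
      { role: "user", content: { type: "text", text: "Summarize the last 10 commits." } },
    ],
    maxTokens: 300,
  },
};
// The client/host decides whether to honor it, which model to use,
// and whether the human gets to review it first.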
2. Some information on how resources are used would be helpful. How do resources get pulled into the context for queries? How are clients supposed to determine which resources are relevant? If the intention is that clients are to use resource descriptions to determine which to integrate into prompts, then that purpose should be more explicit.
Perhaps a bigger problem is that I don't see how clients are to take a resource's content into account when analyzing its relevance. Is this framework intentionally moving away from the practice of comparing content and query embeddings? Or is this expected to be done by indices maintained on the client?
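My current (possibly wrong) reading is that the protocol only provides the plumbing: the client can enumerate and fetch resources, and relevance, embedding and indexing are all left to the client. Roughly (payloads illustrative):

// Client -> server: what resources do you have?
const listResources = { jsonrpc: "2.0", id: 1, method: "resources/list" };

// Client -> server: fetch one of them by URI.
const readResource = {
  jsonrpc: "2.0",
  id: 2,
  method: "resources/read",
  params: { uri: "db://myapp/schema" },
};
// Whether the client stuffs the returned text straight into the prompt, embeds
// and indexes it, or lets the user pick resources manually seems to be out of scope.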
I just want to say kudos for the design of the protocol. Seems inspired by https://langserver.org/ in all the right ways. Reading through it is a delight; there are so many tasteful little decisions.
One bit of constructive feedback: the TypeScript API isn't using the TypeScript type system to its fullest. For example, for tool providers, you could infer the type of a tool request handler's params from the json schema of the corresponding tool's input schema.
I guess that would be assuming that the model is doing constrained sampling correctly, such that it would never generate JSON that does not match the schema, which you might not want to bake into the reference server impl. It'd mean changes to the API too, since you'd need to connect the tool declaration and the request handler for that tool in order to connect their types.
This is a great idea! There's also the matter of requests' result types not being automatically inferred in the SDK right now, which would be great to fix.
Could I convince you to submit a PR? We'd love to include community contributions!
If you were willing to bring additional zod tooling or move to something like TypeBox (https://github.com/sinclairzx81/typebox), the json schema would be a direct derivation of the tools' input schemas in code.
The json-schema-to-ts npm package has a FromSchema type operator that converts the type of a json schema directly to the type of the values it describes. Zod and TypeBox are good options for users, but for the reference implementation I think a pure type solution would be better.
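For example, assuming the tool's input schema is declared `as const` (the schema below is a stand-in for the real CreateDirectoryArgsSchema):

import { FromSchema } from "json-schema-to-ts";

const createDirectoryInputSchema = {
  type: "object",
  properties: { path: { type: "string" } },
  required: ["path"],
  additionalProperties: false,
} as const;

// Inferred as { path: string } -- no runtime schema library needed for typing.
type CreateDirectoryArgs = FromSchema<typeof createDirectoryInputSchema>;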
In the case of the Claude Desktop app, I assume the decision about which MCP server's tool to use, based on the end user's query, is made by the Claude LLM using something like a ReAct loop. Are the prompts and LLM-generated tokens involved in the "Protocol Handshake" phase available for review?
I'd love to develop some MCP servers, but I just learned that Claude Desktop doesn't support Linux. Are there any good general-purpose MCP clients that I can test against? Do I have to write my own?
(Closest I can find is zed/cody but those aren't really general purpose)
Is it at least somewhat in sync with plans from Microsoft, OpenAI and Meta? And is it compatible with the current tool use API and computer use API that you've released?
From what I've seen, OpenAI attempted to solve the problem by partnering with an existing company that API-fys everything. This feels like a more viable approach compared to effectively starting from scratch.
It seems extremely verbose. Why does the transport mechanism matter? Would have loved a protocol/standard about how best to organize/populate the context. I think MCP touches on that but has too much of other stuff for me.
This is really cool stuff. I just started to write a server and I have a few questions. Not sure if HN is the right place, so where would you suggest asking them?
Anyway, if there is no place yet, my questions are:
- In the example https://modelcontextprotocol.io/docs/first-server/python , what is the difference between read_resources and call_tool? In both cases they call the fetch_weather function. Would be nice to have that explained better.
I implemented in my own server only the call_tool function and Claude seems to be able to call it.
- Where is inputSchema of Tool specified in the docs? It would be nice if inputSchema were explained a bit better. For instance, how can I make a list-of-strings field that has a default value? (See the sketch after this list.)
- How can I view the output of logger? It would be nice to see an example somewhere of how to check the logs. I log some stuff with logger.info and logger.error but I have no clue where I can actually look at it. My workaround for now is to log to a local file and tail it.
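On the inputSchema question, as far as I can tell it's plain JSON Schema, so a list-of-strings field with a default would look something like this (TypeScript shape as in the filesystem server above; the Python SDK should take the same dict, and whether a given client actually applies the default is another question):

const forecastTool = {
  name: "fetch_weather",
  description: "Fetch the forecast for one or more cities.",
  inputSchema: {
    type: "object",
    properties: {
      cities: {
        type: "array",
        items: { type: "string" },
        default: ["Berlin", "Munich"], // advisory; clients may ignore it
        description: "Cities to fetch the forecast for",
      },
    },
  },
};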
General feedback
- PLEASE add either automatic reload of the server (hard) or a reload button in the app (probably easier). It's really disruptive to the flow when you have to restart the app on any change.
- Claude Haiku never calls the tools. It just tells me it can't do it. Sonnet can do it but is really slow.
- The docs are really, really version 0.1, obviously :-) Please put some focus on them...
Are there any resources for building the LLM side of MCP so we can use the servers with our own integration? Is there a specific schema for exposing MCP information to tool or computer use?
If you have specific questions, please feel free to start a discussion in the respective https://github.com/modelcontextprotocol repository, and we are happy to help you with integrating MCP.
A few common use cases I've been exploring involve connecting a development database in a local Docker container to Claude Desktop or any other MCP client (e.g. an IDE assistant panel). I visualized the database layout in Claude Desktop and then created a Django ORM layer in my editor (which has MCP integration).
The Zed editor has just announced support for MCP in some of their extensions, publishing an article showing some possible use cases/ideas: https://zed.dev/blog/mcp
Superb work and super promising! I had wished for a protocol like this.
Is there a recommended resource for building an MCP client? From what I've seen, it's just mentioned that Claude Desktop & co. are clients. The SDK readme seems to cover it a bit, but some examples would be great.
If you run into issues, feel free to open a discussion in the respective SDK repository and we are happy to help.
(I've been fairly successful in taking the spec documentation in markdown, an SDK and giving both to Claude and asking questions, but of course that requires a Claude account, which I don't want to assume)
I'm looking at integrating MCP with a desktop app. The spec (https://spec.modelcontextprotocol.io/specification/basic/tra...) mentions "Clients SHOULD support stdio whenever possible." The server examples seem to be mostly stdio as well. In the context of a sandboxed desktop app, it's often not practical to launch a server as a subprocess because:
- sandbox restrictions on executing binaries
- needing to bundle a binary leads to a larger installation size
Would it be reasonable to relax this restriction and provide both SSE/stdio for the default server examples?
Having broader support for SSE in the servers repository would be great. Maybe I can encourage you to open a PR or at least an issue.
I can totally see your concern about sandboxed apps, particularly for Flatpak or similar distribution methods. I see you already opened a discussion https://github.com/modelcontextprotocol/specification/discus..., so let's follow up there. I really appreciate the input.
A possible cheap win for servers would be to support the systemd "here's an fd number you get exec'ed with" model - that way server code that's only written to do read/write on a normal fd should be trivial to wire up to unix sockets, TCP sockets, etc.
(and then having a smol node/bun/go/whatever app that can sit in front of any server that handles stdio - or a listening socket for a server that can handle multiple clients - and translates the protocol over to SSE or websockets or [pick thing you want here] lets you support all such servers with a single binary to install)
Not that there aren't advantages to having such things baked into the server proper, but making 'writing a new connector that works at all' as simple as possible while still having access to multiple approaches to talk to it seems like something worthy of consideration.
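To sketch what that front-end app could look like (untested, names and port illustrative): a few lines of Node are enough to put a stdio-only server behind a TCP socket, and the same shape extends to unix sockets or an SSE bridge.

import { spawn } from "node:child_process";
import net from "node:net";

// One server process per connection keeps the stdio framing simple.
const listener = net.createServer((socket) => {
  // Whatever stdio MCP server you'd normally configure goes here.
  const child = spawn("node", ["path/to/your-mcp-server.js"]);
  socket.pipe(child.stdin);          // client -> server
  child.stdout.pipe(socket);         // server -> client
  child.stderr.pipe(process.stderr); // keep server logs visible
  socket.on("close", () => child.kill());
  child.on("exit", () => socket.destroy());
});

listener.listen(8808, () => console.log("stdio MCP server proxied on :8808"));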
[possibly I should've put this into the discussion, but I have to head out in a minute or two; anybody who's reading this and engaging over there should feel free to copy+paste anything I've said they think is relevant]
It's not exactly immutable, but any backwards-incompatible changes would require a version bump.
We don't have a roadmap in one particular place, but we'll be populating GitHub Issues, etc. with all the stuff we want to get to! We want to develop this in the open, with the community.
Ahh thanks! I was gonna say it's broken, but I now see that you're supposed to notice the sidebar changed and select one of the child pages. Would def recommend changing the sidebar link to that path instead of the index -- I would do it myself but couldn't find the sidebar in your doc repos within 5 minutes of looking.
Thanks for your hard work! "LSP for LLMs" is a fucking awesome idea
I can see where you're going with this, and I can understand why you don't want to get into authorization, but if you're going to encourage tool developers to spin up JSON-RPC servers, I hope you have some kind of plan for authorization; otherwise you're encouraging a great way to break security models. Just because it's local doesn't mean it's secure. This protocol is dead the moment it becomes an attack vector.
Did I misunderstand, or does it not seem to have support for user authentication? It seems your operating model is that the MCP server is configured, at installation time, with authentication for the underlying service. This is fine for non-serious use cases such as weather-forecast querying, or for small-scale situations where only a couple of people have access to an LLM that's connected to the MCP server. But in an enterprise setting there are thousands of people whose levels of access to the service behind the MCP server differ. I think the MCP server needs a way to know the identity of the human behind the LLM, so that it can perform appropriate authentication and authorization.
For Rust, could one leverage the type + docs system to create such a server? I didn't delve into the details, but one of the issues with Claude is that it has no knowledge of the methods available to it (vs. an LSP). Would creating such a server enable it to make informed suggestions?
Second, a question. Computer Use and JSON mode are great for creating a quasi-API for legacy software which offers no integration possibilities. Can MCP better help with legacy software interactions, and if so, in what ways?
Probably, yes! You could imagine building an MCP server (integration) for a particular piece of legacy software, and inside that server, you could employ Computer Use to actually use and automate it.
The benefit would be that to the application connecting to your MCP server, it just looks like any other integration, and you can encapsulate a lot of the complexity of Computer Use under the hood.
If you explore this, we'd love to see what you come up with!
I have a case in mind where I would like to connect to multiple databases. Does the integration endpoint specification in claude_desktop_config.json allow us to pass some description so as to differentiate different databases? How?
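E.g. I'm assuming it would be something like separate entries under mcpServers, one per database, distinguished by their key (command/args illustrative; adjust for whichever database server you actually run):

{
  "mcpServers": {
    "orders-db": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/path/to/orders.db"]
    },
    "analytics-db": {
      "command": "uvx",
      "args": ["mcp-server-sqlite", "--db-path", "/path/to/analytics.db"]
    }
  }
}

But is there also a place for a human-readable description, or does the model only see the server key plus each tool's description?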
The results that an MCP server returns are transferred to the MCP host (Claude, IDEs, tools), and there are some privacy issues because the process is automatic after a one-time permission is granted.
For instance, if something goes wrong on the MCP host and it queries all the data from the database and transfers it to the host, all that data is leaked.
It's hard to totally prevent this kind of problem when interacting with local data, but are there any measures in MCP to mitigate these kinds of situations?
Your concerns are very valid. This is partly why right now, in Claude Desktop, it's not possible to grant permission permanently. The most you can do is "Allow for this chat," which applies to one tool from one server at a time.
You guys need a professional documentation person on your team, one that specializes in only writing documentation. I say this because the existing documentation is a confusing mess. This is going to cause all kinds of problems purely because it is weakly explained, and I see incorrect usage of words all over. Even the very beginning definitions of client, host and server are nonstandard.
Any ideas on how the concepts here will mesh with the recently released Microsoft.Extensions.AI library released by MS for .NET, that is also supposed to make it easy to work with different models in a standardized way?
Is there any way to give an MCP server access for good? Trying out the demo, it asked me for permission every single time, which will be annoying for longer usage.
We do want to improve this over time, just trying to find the right balance between usability and security. Although MCP is powerful and we hope it'll really unlock a lot of potential, there are still risks like prompt injection and misconfigured/malicious servers that could cause a lot of damage if left unchecked.
Will this be partially available from the Claude website for connections to other web services? E.g. could the GitHub server be called from https://claude.ai?
Any idea on timelines? I'd love to be able to have generation and tool use contained within a customer's AWS account using Bedrock, i.e. I pass a single CDK that can interface with an exposed internet MCP service and an in-VPC service for sensitive data.