A Developer's Guide to Understanding the Model Context Protocol

In my previous article, I briefly touched on the Model Context Protocol (MCP) and its ambition to become the standard for agentic communication. In this piece, I’ll clarify what MCP actually is, unpack its core concepts, and give an overview on how it all works together.

MCP is still in its very early stages — it will evolve rapidly — but its foundational components and ideas are likely to persist (even if a competing protocol ultimately prevails). That makes learning about it now a worthwhile investment.

What are protocols and why do we need them?

A protocol is a system of rules that defines the correct conduct and procedures to follow in a given situation. Following an established protocol creates efficiency and keeps everything running smoothly. It ensures all parties are on the same page about the steps needed to achieve a common goal and implicitly enables progress.

It needs to be simple enough to be followed efficiently, but at the same time define everything necessary to accomplish the goal.

Protocols are especially important for inter‑system communication. While humans can reason and recover if they deviate slightly from established conventions, machines cannot. For machines to work together reliably, they need very strictly defined rules.

That’s why there are so many communication protocols describing exactly how computers talk over a network to perform specific tasks. Examples include HTTP, SFTP, SMTP, TCP, and numerous others. These standards are so fundamental that average users take them for granted — yet they enable virtually everything we do on the internet.

What is MCP?

MCP is an open protocol that standardizes how applications provide context to LLMs. — Official Anthropic documentation

In other words, the goal of MCP is to specify what an application needs to do in order for its functionalities and resources to be accessible to LLM-based applications that follow the same communication protocol.

It’s important to keep in mind that MCP, like any other protocol, is a specification, and it’s up to developers to provide implementations. There are already SDKs available in popular programming languages that implement the core components of MCP, allowing developers to build MCP-compliant applications in the language of their choice.

Core components of MCP

There are a lot of terms floating around MCP, and it can seem daunting at first to figure out where each piece fits and what its purpose is. Following grouping should help you understand the core concepts:

  • Involved parties: Host, Client, Server
  • Underlying communication: Transport
  • Message types: Request, Response, Notification
  • Capabilities: Tools, Resources, Prompts, Roots, Sampling

There are more concepts than this, but these are the most important to understand initially.

Involved parties

Since the purpose of MCP is to define communication between LLM-based agents and other applications such as web or local machine applications, the terms Host, Client, and Server map to those entities.

Let’s start with the agents.

An LLM by itself isn’t capable of doing anything beyond receiving textual input and producing textual output. To give it actual capabilities — like browsing the web or sending an email — developers wrap it in an application that interprets when the model wants to take action and executes that action in the real world.

This combination of an LLM and a wrapper application is often referred to as an autonomous agent, as it effectively allows the model to decide when to initiate real-world actions.

In MCP, the Host refers to this orchestration layer — the application coordinating between the user, LLM, and external tools or resources.

On the other side, we have traditional applications that expose some form of interface — let’s say a REST API — which allows clients to trigger actions, such as sending an email.

Today, many agent implementations work by having the host application parse LLM responses to determine what the model wants to do and then map those intents to concrete API calls. But REST is just an architectural style — not a strict protocol — and APIs often vary significantly, even when performing similar tasks. Maintaining adapters for each API is tedious and error-prone. Worse, it limits agent capabilities since the agents can only interact with APIs they’ve been explicitly programmed to understand.

MCP introduces Servers to address this.

An MCP Server is essentially an LLM-friendly wrapper around an application’s APIs. It operates independently from the Host and runs in a separate process. It may run on the same machine, but conceptually it’s better to think of it as a remote service exposing well-defined functionality that interested clients can consume. The functionalities exposed by the Server enable interested parties to discover available capabilities, invoke well-defined actions, subscribe to updates, and more.

The way Hosts interact with Servers is through Clients.

In the MCP architecture, a Client serves as a bridge enabling a Host to interact seamlessly with one or more Servers. It handles connection management, structures message exchanges, and abstracts the details of interacting with individual Servers.

The Client provides a simple and standardized interface, allowing the Host to discover the available tools and resources provided by Servers and to invoke these capabilities easily. Clients reside close to the Host within the system architecture and are part of the same application process, however, conceptually they represent a distinct component focused specifically on managing communication and maintaining adherence to the MCP protocol.

With Clients, Hosts gain a reusable and consistent mechanism for interacting with various Servers. This is crucial for building scalable, modular agentic systems, allowing agent developers to focus their efforts on enhancing the agent's capabilities and performance rather than maintaining numerous custom adapters for different APIs.

Underlying communication

For Clients to be able to interact with Servers, there has to be some kind of message transport mechanism between the two.

MCP is designed to be transport-agnostic, and the specification doesn't place strict constraints on the underlying communication between Client and Server — as long as the mechanism:

  • Respects the lifecycle requirements defined by MCP (e.g., connection setup, operation, and termination)
  • Supports bidirectional message exchange
  • Preserves the JSON-RPC message format

That being said, MCP defines two standard transport mechanisms, each serving a distinct purpose.

The first is stdio, which facilitates communication between a Client and a Server running on the same machine. The Client writes its messages to the Server's stdin, and the Server to stdout. A Server running locally may seem counterintuitive at first, but it makes sense in many scenarios — for example, when exposing operating system functionality to an agent or implementing a local Server as an adapter for web APIs that don’t yet support MCP natively.

The second is HTTP+SSE, which supports communication between Clients and independently running Servers over HTTP. Since one of MCP’s requirements is bidirectional communication, standard HTTP alone wouldn’t be enough — so it's combined with Server-Sent Events (SSE) to allow the Server to initiate messages to the Client.

As mentioned earlier, MCP aims to be transport-agnostic, meaning developers are free to implement their own transport mechanisms — whether using WebSockets, message queues, or something else — as long as the above requirements are met.

Message types

In MCP, all communication between Clients and Servers happens through structured messages that follow the JSON-RPC 2.0 specification. This specification defines a well-established pattern for remote procedure calls using JSON, and MCP builds on top of it.

MCP defines three message types, serving two distinct purposes.

First, there are Request and Response, which are tightly coupled—each Request must have a corresponding Response. These messages are linked using an id field, which acts as a correlation identifier and must be present in both.

A Request must include an id and a method name. It is always initiated by the Client with the intention of triggering some action on the Server. This could be anything from initiating a connection to invoking a tool or retrieving a resource. Optionally, a Request may also include parameters relevant to the method being called.

For each Request, the Client expects a Response from the Server with the same id. The Response must either contain a result object if the operation was successful, or an error object with a numerical error code and message if something went wrong.

The third type of message is a Notification, which serves a different purpose. Notifications can originate from either the Client or the Server, and unlike Requests, they do not expect a Response. Because of this, the id field is not required. The only required field for a Notification is the method name, though parameters can also be included optionally.

Notifications are useful in scenarios like when the Server needs to inform a subscribed Client that some resource has changed, or when the Client wants to cancel an already submitted Request by referencing its id.

A good example of all three message types working together is during the connection initialization between a Client and a Server. MCP clearly defines the steps both parties must take to establish a working connection.

To initiate the connection, the Client sends a Request with the initialize method, declaring the protocol version it supports and its capabilities. The Server responds with a Response message containing the protocol version it supports (either the one received or the latest version it supports), along with its own declared capabilities.

Once the Client receives and accepts the Server's Response, it sends a Notification with the method notifications/initialized. At that point, the connection is established and both Client and Server can proceed with further communication until the connection is eventually terminated.

Capabilities

All components introduced so far have focused on the communication layer — defining who the parties are, how they establish connections, how messages are transmitted, and what message types exist.

Capabilities shift the focus from communication infrastructure to actual functionality. They describe what kind of functionality a party — typically the Server — offers, and how those functionalities can be discovered and invoked.

This part of the protocol is what enables Agents to dynamically explore and use what a Server provides, without requiring hardcoded assumptions. In other words, capabilities are what turn a Server into something useful — they expose its features in a standardized way that agentic systems can reason about and act upon.

As mentioned earlier, during connection initialization, parties can declare the capabilities they support. For example, if a Server declares that it supports tools, the Client knows it can retrieve a list of available tools by sending a request with the method name tools/list. The Server then responds with a list of all tools it exposes. Each listed tool includes its name, a human-readable description, and a schema describing the required input parameters. To invoke a tool, the Client sends a request using the tools/call method along with the appropriate parameters.

A similar pattern applies to resources and prompts.

Servers can declare three types of capabilities:

  • Tools: Executable functions that can be invoked by Clients and used by LLMs to perform actions.
  • Resources: Structured data or content that can be accessed by Clients and used as context for LLM interactions.
  • Prompts: Reusable prompt templates or workflows that Clients can pass to LLMs or end users.

Although these capability types are semantically different and serve different purposes, MCP does not strictly enforce how specific functionality must be exposed. These categories are more like guiding assumptions based on common needs in agentic systems. For example, you could expose a resource or a prompt by wrapping it in a tool that returns the data, but it goes against the intended semantics.

Since MCP communication is bidirectional, Clients can also declare their own capabilities — features that Servers can invoke. These include:

  • Roots: Specifies which filesystem roots the Server has access to.
  • Sampling: Enables Servers to request LLM completions through the Client.

Depending on the business logic and the architecture of the application, some capabilities will make more sense than others. At their core, capabilities are a flexible mechanism to support a broad range of functionality likely to be required in agentic systems.

Putting it all together

To make things more concrete, let's walk through a practical example that involves both a local and a remote MCP Server.

Imagine an agent whose task is to scan a specific folder on the local filesystem for image files, and then upload those images to a remote image processing service. It's a simple use case, but it can showcase a lot of concepts discussed in previous sections.

  1. Startup and initialization

    • The Host application initializes a Client.
    • The Client launches a local MCP Server (via stdio) that exposes access to the local file system as tools and resources.
    • It also connects to a remote Server over HTTP+SSE that offers an image processing API (e.g., compression, filtering, or classification).
    • For both connections, the Client sends an initialize Request and receives the Server's capabilities in a Response.
    • Once initialization is complete, the Client sends a notifications/initialized Notification.
  2. Capability discovery

    • From the local Server, the Client learns that it exposes a list_files tool that returns files from a given folder, as well as read_file tool which allows reading of specific file
    • From the remote Server, it learns about upload_image tool that takes binary content and metadata.
  3. Interaction

    • The user tasks the application with a goal like: "Find all .png images in my screenshots folder and upload them for optimization."
    • The Host inserts the user input into prompt template which also explains to LLM which tools it has as its disposal
    • The LLM decides that it first needs to invoke the list_files
    • The Host parses LLM response, invokes list_files tool through the Client
    • The Server responds with a list of file paths and Host feeds the response back to LLM
    • LLM decides that it now needs to read each png file and upload it to the server with available tools
    • For each file, the agent invokes a second local tool to read the file content (e.g. read_file).
    • It then calls the upload_image tool on the remote Server with the image data.
    • Each of these tool invocations is done via a tools/call Request, and the results are returned in a Response.
  4. Bidirectional functionality

    • Suppose the remote Server wants to ask the agent to summarize the image content using the LLM — it can send a sampling/complete Request to the Client.
    • The Client passes the prompt to the LLM and returns the result as a Response.
  5. Termination

    • After all images are uploaded and optional feedback is generated, the Host or Client terminates the session with each Server.

MCP handles only the plumbing — all reasoning, decision‑making, and prompt orchestration live in the Host. The Host defines the LLM’s goal, supplies it with available tools and schemas, interprets its output, and routes any requested actions through the Client.

Extending an agent’s capabilities is as simple as adding a new MCP Server entry to your config: once registered, its tools become immediately available. And with emerging centralized registries of MCP Servers, agents can automatically discover and use brand‑new tools they weren’t originally programmed with — enabling truly self‑evolving workflows.

Conclusion

MCP is a blueprint that aims to standardize how LLM-based applications communicate with the external world. As with any specification, MCP itself doesn’t provide an implementation. Instead, it defines a common structure and communication format that developers can use to build interoperable Clients, Servers, and agentic systems.

Fortunately, you don’t need to start from scratch. SDKs already exist in several popular programming languages, along with frameworks that simplify the development of MCP-compliant components. For example, Spring Boot offers MCP Client and Server starters that allow developers to focus on business logic rather than the protocol's low-level semantics. Java isn’t the only ecosystem with support — other languages offer similar tooling.

If MCP becomes widely adopted, its ecosystem will continue to mature. Best practices will emerge, and development will naturally shift toward creating meaningful agent behavior or building Servers that expose valuable capabilities. This mirrors what we've seen with other successful protocols. Consider HTTP: most developers don’t worry about the intricacies of the protocol — they use well-established client libraries and server frameworks like Nginx, Apache Tomcat, or Express.js to build on top of it.

That said, understanding the core semantics of MCP — its parties, message types, capabilities, and communication flow — remains crucial. It enables developers to build more reliable, scalable, and future-proof systems, and it lays a solid foundation for contributing to or building on top of agentic technologies as they evolve.


References and technical resources for further reading: