How to Use MCP with STDIO Transport (Practical Guide)

Written by Kiruthika

Reviewed by Ajay Patel

Apr 21, 2026

What if you could let an AI model search the internet or solve problems on its own instead of hardcoding every action? I started exploring MCP (Model Context Protocol) while working on tool-enabled AI workflows where simple prompt-based answers weren’t enough. This guide shows how MCP helps connect AI models to real tools using standard input and output, without adding unnecessary complexity.

You’ll learn how to make models like Claude use tools you define, in real time. I’m writing this for developers who want practical control, whether that’s letting AI fetch live data, run calculations, or interact with systems safely. By the end, you’ll understand how MCP enables this kind of capability without turning your setup into a fragile experiment.

Let’s walk through it step by step.

How to Install and Set Up MCP?

1. Install Required Packages

To keep dependencies isolated and predictable, I recommend using a virtual environment before installing the required packages:

python -m venv venv
source venv/bin/activate
pip install "mcp[cli]" anthropic python-dotenv requests

Package Purpose:

mcp[cli] – The MCP Python SDK (client and server) plus its command-line tools
anthropic – API client for Claude models
python-dotenv – Loads environment variables from a .env file
requests – Sends HTTP requests to the Serper API

2. Setting Up the .env File

Create a .env file in your project folder and add your API keys:

SERPER_API_KEY=your_serper_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here

This helps keep sensitive credentials out of your codebase and makes local development safer and cleaner.
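
A missing key otherwise only shows up later as a failed API call, so it can help to check for both keys at startup. The following is a minimal sketch using only the standard library. `require_env` is a hypothetical helper name, not part of python-dotenv or MCP, and it assumes `load_dotenv()` (or your shell) has already populated the environment.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of python-dotenv or MCP.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

In the server you could then write `API_KEY = require_env("SERPER_API_KEY")` right after `load_dotenv()`, so a missing key stops the process with a readable message.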

How to Build an MCP Server?

To make the example concrete, I’ll start by building a simple MCP server with two capabilities:

A web search tool using the Serper API.
A basic addition function.

Server Code Breakdown

1. Import Required Modules

from mcp.server.fastmcp import FastMCP
import requests
import os
from dotenv import load_dotenv

load_dotenv()

mcp = FastMCP()

FastMCP: Initializes the MCP server.
dotenv: Loads API keys from the .env file.

2. Web Search Tool Using Serper API

Configuring Tools in MCP

In MCP, any function decorated with @mcp.tool() becomes available to the model as a callable tool. From working with tool-enabled agents, I’ve found that clear descriptions and input schemas matter more than the function itself; they guide the model in choosing the right action for a given query.

For example, when a user asks something informational like “What is Model Context Protocol?”, the model can infer that a web search tool is the appropriate choice. The tool below gives it that option:

import sys

API_KEY = os.getenv("SERPER_API_KEY")
API_URL = "https://google.serper.dev/search"

@mcp.tool()
def serper_search(query: str) -> dict:
    """Search the web using Serper API for user queries"""
    headers = {"X-API-KEY": API_KEY, "Content-Type": "application/json"}
    data = {"q": query}
    try:
        # A timeout keeps a stalled request from hanging the tool call
        response = requests.post(API_URL, json=data, headers=headers, timeout=10)
        response.raise_for_status()
        result = response.json()
        # Log to stderr: with stdio transport, stdout carries the protocol messages
        print(f"Search result for '{query}': {result}", file=sys.stderr)
        return result
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}", file=sys.stderr)
        return {"error": str(e)}

Takes user queries and fetches search results from the Serper API.
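
To make the point about descriptions and input schemas more concrete, here is a rough sketch of how a function's name, docstring, and type hints can be turned into the kind of tool definition a model sees. This is a deliberately simplified stand-in written for illustration: `describe_tool` is a hypothetical helper, and FastMCP builds the real schema itself in its own way.

```python
import inspect
from typing import Any, Callable

# Simplified mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func: Callable[..., Any]) -> dict:
    """Build a tool definition (name, description, input schema) from a function.

    Hypothetical helper for illustration; not how FastMCP is implemented.
    """
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


def serper_search(query: str) -> dict:
    """Search the web using Serper API for user queries"""
    return {}  # body omitted; only the signature and docstring matter here


tool = describe_tool(serper_search)
print(tool["description"])
print(tool["input_schema"])
```

The docstring becomes the description and the type hints become the schema, which is why a vague docstring or an untyped parameter leaves the model with little to go on.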

3. Basic Arithmetic Tool

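import sys  # harmless if already imported earlier in the file

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    # Log to stderr: with stdio transport, stdout carries the protocol messages
    print(f"Adding {a} and {b}", file=sys.stderr)
    return a + b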

4. Running the MCP Server with stdio transport

The stdio transport communicates through standard input and output streams. I’ve found this especially useful for local development and CLI-based integrations, where simplicity and debuggability matter more than network overhead.

if __name__ == "__main__":
    mcp.run(transport="stdio")
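
Under the stdio transport, client and server exchange JSON-RPC 2.0 messages as single lines of JSON separated by newlines. The sketch below illustrates that framing with two hypothetical helpers, `encode_message` and `decode_messages`. The MCP SDK does this for you, so treat it as a picture of the wire format rather than code you need to write.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so each message stays on one line.
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_messages(stream: bytes) -> list[dict]:
    """Split a byte stream into JSON-RPC messages, one per non-empty line."""
    lines = stream.decode("utf-8").split("\n")
    return [json.loads(line) for line in lines if line.strip()]


request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(request)
decoded = decode_messages(wire + wire)  # two messages back to back
print(decoded)
```

Because the server's stdout is the message channel, anything else written there lands in the same stream. Diagnostic output from a stdio server is therefore better sent to stderr.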

Using MCP Client with Standard Input/Output (stdio) Transport

With STDIO transport, you do not need to start or manage the server process yourself: the client launches the server script as a subprocess, which makes testing and early experimentation much simpler. You can start the client by passing the server script path directly through the command line.

This approach is useful for quick local setups, debugging, and lightweight deployments.

Create a client.py file and add the following code.

Code Walkthrough

1. Importing Required Libraries

import asyncio
from typing import Optional
from contextlib import AsyncExitStack
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from anthropic import Anthropic
from dotenv import load_dotenv

load_dotenv()

AsyncExitStack: Ensures proper resource cleanup.
ClientSession: Manages the connection to the server.
stdio_client: Sets up stdio transport.
Anthropic: Client for the Claude API, which generates the responses and decides when to call a tool.
load_dotenv: Loads environment variables such as ANTHROPIC_API_KEY from the .env file.

2. Defining the MCP Client Class

class MCPClient:
    def __init__(self):
        self.session: Optional[ClientSession] = None
        self.exit_stack = AsyncExitStack()
        self.anthropic = Anthropic()

Purpose: Sets up the client session and prepares the components needed to connect Claude with MCP tools.
AsyncExitStack: Manages async resources, ensuring proper cleanup.
Anthropic: Integrates the LLM (Claude model in this case).

3. Connecting to the MCP Server

async def connect_to_server(self, server_script_path: str):
    """Connect to an MCP server
    
    Args:
        server_script_path: Path to the server script (.py or .js)
    """
    # Determine script type
    is_python = server_script_path.endswith('.py')
    is_js = server_script_path.endswith('.js')
    
    if not (is_python or is_js):
        raise ValueError("Server script must be a .py or .js file")
    
    # Choose command based on file type
    command = "python" if is_python else "node"
    
    # Set up Stdio transport parameters
    server_params = StdioServerParameters(
        command=command,
        args=[server_script_path],
        env=None
    )
    
    # Establish stdio transport
    stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
    self.stdio, self.write = stdio_transport
    self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
    
    await self.session.initialize()
    
    # List available tools
    response = await self.session.list_tools()
    tools = response.tools
    print("\nConnected to server with tools:", [tool.name for tool in tools])

server_script_path: Path to the server script provided via the command line.
StdioServerParameters: Describes how to launch the server process: the command, its arguments, and optional environment variables.
stdio_client(): Creates a standard I/O transport client.
list_tools(): Lists available tools on the server.
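
One practical detail when launching a Python server: a bare `python` command may resolve to a different interpreter than the one in your virtual environment. The sketch below shows one possible way to choose the launch command, using `sys.executable` for Python scripts. `command_for_script` is a hypothetical helper, not part of the MCP SDK.

```python
import sys
from pathlib import Path


def command_for_script(script_path: str) -> str:
    """Pick the executable used to launch a server script.

    Hypothetical helper for illustration; not part of the MCP SDK.
    """
    suffix = Path(script_path).suffix.lower()
    if suffix == ".py":
        # sys.executable is the interpreter running this client, so the
        # server sees the same virtual environment and installed packages.
        return sys.executable
    if suffix == ".js":
        return "node"
    raise ValueError("Server script must be a .py or .js file")
```

The returned value could be passed as the `command` argument of `StdioServerParameters` in place of the hardcoded string.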

4. Processing Queries

async def process_query(self, query: str) -> str:
    """Process a query using Claude and available tools"""
    messages = [{"role": "user", "content": query}]
    
    # Get available tools from the server
    response = await self.session.list_tools()
    available_tools = [
        {
            "name": tool.name,
            "description": tool.description,
            "input_schema": tool.inputSchema
        } 
        for tool in response.tools
    ]
    
    # Generate response using Claude
    response = self.anthropic.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1000,
        messages=messages,
        tools=available_tools
    )
    
    final_text = []
    assistant_content = []

    # Process each content piece in response
    for content in response.content:
        if content.type == 'text':
            final_text.append(content.text)
            assistant_content.append({"type": "text", "text": content.text})
        
        elif content.type == 'tool_use':
            tool_name = content.name
            tool_args = content.input
            
            # Call the tool and get results
            result = await self.session.call_tool(tool_name, tool_args)
            final_text.append(f"[Calling tool {tool_name} with args {tool_args}]")
            
            # Keep only the text parts of the tool output
            result_text = "\n".join(
                part.text for part in result.content if part.type == "text"
            )
            
            # Update conversation history: the assistant turn carries the
            # tool_use block, and the user turn answers it with a tool_result
            assistant_content.append({
                "type": "tool_use",
                "id": content.id,
                "name": tool_name,
                "input": tool_args
            })
            messages.append({"role": "assistant", "content": assistant_content})
            messages.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": content.id,
                    "content": result_text
                }]
            })
            assistant_content = []
            
            # Generate follow-up response
            response = self.anthropic.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=1000,
                messages=messages,
                tools=available_tools
            )
            final_text.extend(
                block.text for block in response.content if block.type == 'text'
            )
    
    return "\n".join(final_text)

Dynamic Tool Selection: In practice, tool choice depends heavily on how clearly the tool is described, something I learned quickly while testing ambiguous schemas.
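Asynchronous Calls: Each tool call is awaited, so the event loop is free to do other work while the server runs the tool.
Conversation History: The tool output is appended to the message list so Claude can use it when writing the follow-up response.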
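
The Anthropic Messages API expects a tool's output to come back as a `tool_result` block that references the `id` of the `tool_use` block it answers. The sketch below builds that pair of turns as plain dictionaries. `tool_result_turns` is a hypothetical helper, and the id used in the test is a made-up placeholder.

```python
from typing import Any


def tool_result_turns(tool_use_id: str, tool_name: str,
                      tool_args: dict[str, Any], result_text: str) -> list[dict]:
    """Build the assistant and user turns that report one tool call back to Claude.

    Hypothetical helper for illustration; not part of the anthropic or mcp packages.
    """
    assistant_turn = {
        "role": "assistant",
        "content": [{
            "type": "tool_use",
            "id": tool_use_id,
            "name": tool_name,
            "input": tool_args,
        }],
    }
    user_turn = {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,  # must match the tool_use block's id
            "content": result_text,
        }],
    }
    return [assistant_turn, user_turn]
```

Extending `messages` with these two turns before the follow-up request keeps the history in the alternating assistant/user shape the API expects.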

5. Interactive Chat Loop

async def chat_loop(self):
    """Run an interactive chat loop"""
    print("\nMCP Client Started!")
    print("Type your queries or 'quit' to exit.")
    
    while True:
        try:
            query = input("\nQuery: ").strip()
            
            if query.lower() == 'quit':
                break
                
            response = await self.process_query(query)
            print("\n" + response)
                
        except Exception as e:
            print(f"\nError: {str(e)}")

Interactive Mode: Keeps the session running so you can test multiple queries without restarting the client.
Error Handling: Catches and prints exceptions so a failed query does not end the session.

6. Cleaning Up Resources

async def cleanup(self):
    """Clean up resources"""
    await self.exit_stack.aclose()

Purpose: Closes the session and the stdio transport when the client exits.
AsyncExitStack: aclose() unwinds every context entered on the stack, in reverse order of entry.
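
The cleanup order is easy to see in isolation. This standard-library sketch enters two stand-in resources on an `AsyncExitStack` and records when each one opens and closes. The `resource` context manager and the names "transport" and "session" are placeholders for illustration, not MCP objects.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager

events: list[str] = []


@asynccontextmanager
async def resource(name: str):
    """Stand-in for a transport or session; records when it opens and closes."""
    events.append(f"open {name}")
    try:
        yield name
    finally:
        events.append(f"close {name}")


async def demo() -> None:
    stack = AsyncExitStack()
    await stack.enter_async_context(resource("transport"))
    await stack.enter_async_context(resource("session"))
    await stack.aclose()  # closes "session" first, then "transport"


asyncio.run(demo())
print(events)
# ['open transport', 'open session', 'close session', 'close transport']
```

Closing in reverse order matters here because the session depends on the transport it was built on.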

7. Main Function

import sys

async def main():
    if len(sys.argv) < 2:
        print("Usage: python client.py <path_to_server_script>")
        sys.exit(1)
        
    client = MCPClient()
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()

if __name__ == "__main__":
    asyncio.run(main())

Command Line Argument: Expects the path to the server script.
asyncio.run(): Runs the async main function.

Why Use Stdio Transport?

No Need to Start Server Separately: The server runs as a subprocess, which removes friction during local development.
Simple & Efficient: Uses standard input/output for communication.
Quick Debugging: This setup makes it easier to inspect inputs and outputs when something doesn’t behave as expected.
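
The subprocess idea can be seen without any MCP code at all. In this standard-library sketch, a parent process launches a tiny child script, sends it one line of JSON on stdin, and reads one line of JSON back from stdout. The child script and the `round_trip` helper are stand-ins invented for illustration. A real MCP exchange involves an initialization handshake and full JSON-RPC messages, which the SDK handles.

```python
import json
import subprocess
import sys

# A tiny stand-in "server": reads one JSON line from stdin, replies on stdout.
CHILD_CODE = (
    "import sys, json\n"
    "request = json.loads(sys.stdin.readline())\n"
    "reply = {'id': request['id'], 'echo': request['method']}\n"
    "sys.stdout.write(json.dumps(reply) + '\\n')\n"
)


def round_trip(request: dict) -> dict:
    """Launch the child process, send one request, and return its reply."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=json.dumps(request) + "\n",
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return json.loads(completed.stdout)


reply = round_trip({"id": 1, "method": "ping"})
print(reply)
```

No port, URL, or separately started server is involved: the child exists only for as long as the parent needs it.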

How to Run the MCP Client

python client.py /path/to/serper_server.py

client.py: The MCP client script.
/path/to/serper_server.py: The MCP server script.

Type your queries and press Enter in the terminal:

Query: What is Model Context Protocol?

To exit, type:

Query: quit

Suggested read: MCP Practical Guide with SSE Transport

Conclusion

I wrote this guide to share a practical way to use MCP with STDIO transport based on what has worked for me during hands-on testing. You’ve seen how MCP helps AI models connect with custom tools through a simple and inspectable communication layer.

By following this setup, you can start building AI-driven applications with faster testing and cleaner integrations. This tutorial covered setting up an MCP server and client, building custom tools, and using Claude for real-time tool execution.

If you need production-ready MCP integrations, scalable AI agents, or custom tool workflows, it can be valuable to hire AI developers with real implementation experience.

Frequently Asked Questions

1. What is MCP with STDIO transport?

MCP with STDIO transport lets AI clients communicate with tools or servers through standard input and output streams, making local integrations simple and lightweight.

2. When should I use STDIO transport in MCP?

Use STDIO transport for local development, CLI tools, testing environments, and quick prototypes where running a network server is unnecessary.

3. What are the benefits of MCP STDIO transport?

It offers simple setup, easier debugging, lower overhead, and fast communication between the MCP client and server.

4. Can I use Claude with MCP STDIO transport?

Yes. Claude can work with MCP clients to call tools, process queries, and use real-time tool execution through STDIO transport.

5. Is STDIO transport suitable for production use?

It works well for some controlled environments, but larger distributed systems often prefer HTTP or SSE-based transports.

6. Do I need coding experience to use MCP?

Basic Python or JavaScript knowledge is helpful, especially when building MCP servers, clients, and custom tools.

Kiruthika

AI/ML Engineer

I'm an AI/ML engineer passionate about developing cutting-edge solutions. I specialize in machine learning techniques to solve complex problems and drive innovation through data-driven insights.
