Building Production AI Agents with MCP and Anthropic Claude
Model Context Protocol (MCP) is quietly becoming one of the most important architectural patterns for production AI systems. It's not a framework — it's a protocol. And that distinction matters enormously when you're building systems that need to run reliably at scale.
This post covers what I learned building the Giovani AI Agent: a production MCP server connected to a live PostgreSQL database, orchestrated through Anthropic Claude, serving real business users with natural-language queries and auto-generated visualizations.
What is MCP?
MCP defines a standard way for AI models to communicate with external tools, databases, APIs, and services. Think of it as a universal adapter that lets your LLM interact with the world in a structured, auditable way.
User Query → Claude → MCP Client → MCP Server → Production DB → Data → Claude → Response
The key architectural insight: the LLM never holds your data. Claude asks the MCP server for data, the server executes against your database, and returns structured results. The LLM only processes what's returned.
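Under the hood, every hop between the MCP client and server is a JSON-RPC 2.0 message. Here is a sketch of the round trip for the tool registered below; the message shapes follow the MCP spec, but the payload values are illustrative:
// What the MCP client sends when Claude decides to call a tool.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'query_sales_data',
    arguments: { question: 'Top 5 products by revenue last month' },
  },
}

// What the server sends back. Only this content ever reaches Claude.
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    content: [{ type: 'text', text: '[{"product": "…", "revenue": "…"}]' }],
  },
}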
The Architecture
import { Server } from '@modelcontextprotocol/sdk/server/index.js'
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
import { ListToolsRequestSchema, CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js'
const server = new Server(
{ name: 'giovani-db-agent', version: '1.0.0' },
{ capabilities: { tools: {} } }
)
// Register a tool: natural language → SQL → results
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: 'query_sales_data',
description: 'Query sales and inventory data in natural language',
inputSchema: {
type: 'object',
properties: {
question: { type: 'string', description: 'Natural language question' },
date_range: { type: 'string', description: 'Optional date range filter' },
},
required: ['question'],
},
},
],
}))
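Listing tools is only half of the handler pair; the call side does the real work. A minimal sketch of it follows, where generateSql and validateReadOnly are assumed helpers of our own (the validator is sketched near the end of this post), not SDK APIs:
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name !== 'query_sales_data') {
    throw new Error(`Unknown tool: ${request.params.name}`)
  }
  const { question, date_range } = request.params.arguments as {
    question: string
    date_range?: string
  }
  const sql = await generateSql(question, date_range) // assumed: NL → SQL via Claude
  validateReadOnly(sql) // assumed: rejects anything that is not a single SELECT
  const result = await pool.query(sql)
  return {
    content: [{ type: 'text', text: JSON.stringify(result.rows) }],
  }
})

// Expose the server over stdio so an MCP client can spawn and talk to it.
await server.connect(new StdioServerTransport())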
Schema Ingestion — the Foundation
Before Claude can query your database meaningfully, it needs to understand the schema. We load the entire ER diagram and table definitions into the system prompt:
import { Pool } from 'pg'

async function loadDatabaseSchema(pool: Pool): Promise<string> {
const tables = await pool.query(`
SELECT
table_name,
column_name,
data_type,
is_nullable,
column_default
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position
`)
// Format for Claude's context window
return formatSchemaForLLM(tables.rows)
}
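formatSchemaForLLM is where most of the prompt engineering lives. Here is one workable formatter, assuming the row shape returned by the query above (our production format differs, but the idea is a compact per-table listing):
interface SchemaRow {
  table_name: string
  column_name: string
  data_type: string
  is_nullable: 'YES' | 'NO'
  column_default: string | null
}

function formatSchemaForLLM(rows: SchemaRow[]): string {
  // Group columns by table, then emit one compact block per table.
  const byTable = new Map<string, SchemaRow[]>()
  for (const row of rows) {
    byTable.set(row.table_name, [...(byTable.get(row.table_name) ?? []), row])
  }
  return [...byTable.entries()]
    .map(([table, cols]) => {
      const colLines = cols.map(
        (c) => `  ${c.column_name} ${c.data_type}${c.is_nullable === 'NO' ? ' NOT NULL' : ''}`
      )
      return [`TABLE ${table}`, ...colLines].join('\n')
    })
    .join('\n\n')
}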
Stateful Memory with ChromaDB
One challenge with LLM agents is context loss across sessions. We solved this by storing schema embeddings and frequently-asked query patterns in ChromaDB:
import chromadb
from chromadb.utils import embedding_functions
client = chromadb.PersistentClient(path="./db_context")
ef = embedding_functions.DefaultEmbeddingFunction()
collection = client.get_or_create_collection(
name="query_patterns",
embedding_function=ef,
)
def remember_successful_query(nl_query: str, sql: str, result_summary: str):
"""Persist successful NL→SQL translations for future context."""
collection.add(
documents=[nl_query],
metadatas=[{"sql": sql, "summary": result_summary}],
        ids=[generate_id(nl_query)],  # assumed helper, e.g. a stable hash of the query text
    )
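Writing memories only pays off with a matching read path: before translating a fresh question, pull the closest past translations and hand them to Claude as few-shot examples. A sketch using chromadb's JavaScript client, to stay in TypeScript with the rest of the post (the Python equivalent is collection.query with the same parameters; recallSimilarQueries is our own name, and this client assumes a running Chroma server rather than the embedded PersistentClient above):
import { ChromaClient } from 'chromadb'

const chroma = new ChromaClient() // defaults to a local Chroma server
const patterns = await chroma.getOrCreateCollection({ name: 'query_patterns' })

// Fetch the three most similar past NL→SQL translations as few-shot examples.
async function recallSimilarQueries(nlQuery: string) {
  const hits = await patterns.query({ queryTexts: [nlQuery], nResults: 3 })
  // Result arrays are nested per input query; we only sent one.
  return hits.documents[0].map((question, i) => ({
    question,
    sql: hits.metadatas[0][i]?.sql as string | undefined,
  }))
}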
What Makes This Production-Grade
A few things separate a demo from a production agent:
- Schema caching — don't re-fetch schema on every query
- SQL validation before execution — Claude generates SQL, you validate before running (sketched after this list)
- Result size limits — cap rows returned to prevent context overflow
- Audit logging — every NL→SQL→result cycle is logged
- Human-in-the-loop for writes — agent can only SELECT; mutations need explicit confirmation
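On the validation bullet: here is a minimal sketch of what "validate before running" can mean, assuming a single-statement, SELECT-only policy. A production version should parse the SQL with a real parser instead of pattern matching; the read-only transaction is the actual backstop:
import { Pool } from 'pg'

const MUTATING = /\b(insert|update|delete|drop|alter|truncate|grant|create|copy)\b/i

function validateReadOnly(sql: string): void {
  const stmt = sql.trim().replace(/;\s*$/, '')
  if (stmt.includes(';')) throw new Error('Multiple statements are not allowed')
  // Allow WITH for CTEs; the keyword check still catches data-modifying CTEs.
  if (!/^(select|with)\b/i.test(stmt)) throw new Error('Only SELECT statements are allowed')
  if (MUTATING.test(stmt)) throw new Error('Mutating keyword detected')
}

// Defense in depth: even if the validator misses something, a read-only
// transaction makes Postgres itself refuse any write.
async function runReadOnly(pool: Pool, sql: string) {
  const client = await pool.connect()
  try {
    await client.query('BEGIN TRANSACTION READ ONLY')
    const result = await client.query(sql)
    return result.rows
  } finally {
    await client.query('ROLLBACK')
    client.release()
  }
}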
Results
After 3 months in production at Giovani:
- Business users issue 50–80 natural-language queries per day
- Average query → visualization time: < 8 seconds
- SQL accuracy rate: ~94% (validated against known queries)
- Zero data incidents — read-only MCP tools enforce this at the protocol level
MCP isn't hype. It's a practical architectural pattern that makes AI agents auditable, composable, and safe to run against real production systems.
If you're building MCP-based agents and want to compare notes, reach out on LinkedIn.