Building Production AI Agents with MCP and Anthropic Claude
Model Context Protocol (MCP) is quietly becoming one of the most important architectural patterns for production AI systems. It's not a framework — it's a protocol. And that distinction matters enormously when you're building systems that need to run reliably at scale.
This post covers what I learned building the Giovani AI Agent: a production MCP server connected to a live PostgreSQL database, orchestrated through Anthropic Claude, serving real business users with natural-language queries and auto-generated visualizations.
What is MCP?
MCP defines a standard way for AI models to communicate with external tools, databases, APIs, and services. Think of it as a universal adapter that lets your LLM interact with the world in a structured, auditable way.
User Query → Claude → MCP Client → MCP Server → Production DB → Data → Claude → Response
The key architectural insight: the LLM never holds your data. Claude asks the MCP server for data, the server executes against your database, and returns structured results. The LLM only processes what's returned.
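Under the hood, every hop between the MCP client and server is a JSON-RPC 2.0 message. Here is a sketch of the round trip for the tool registered below; the message shapes follow the MCP spec, but the payload values are illustrative:
// What the MCP client sends when Claude decides to call a tool.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: {
    name: 'query_sales_data',
    arguments: { question: 'Top 5 products by revenue last month' },
  },
}

// What the server sends back. Only this content ever reaches Claude.
const response = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    content: [{ type: 'text', text: '[{"product": "…", "revenue": "…"}]' }],
  },
}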
The Architecture
import { Server } from '@modelcontextprotocol/sdk/server/index.js'
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js'
import { ListToolsRequestSchema, CallToolRequestSchema } from '@modelcontextprotocol/sdk/types.js'
const server = new Server(
{ name: 'giovani-db-agent', version: '1.0.0' },
{ capabilities: { tools: {} } }
)
// Register a tool: natural language → SQL → results
server.setRequestHandler(ListToolsRequestSchema, async () => ({
tools: [
{
name: 'query_sales_data',
description: 'Query sales and inventory data in natural language',
inputSchema: {
type: 'object',
properties: {
question: { type: 'string', description: 'Natural language question' },
date_range: { type: 'string', description: 'Optional date range filter' },
},
required: ['question'],
},
},
],
}))
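Listing tools is only half of the handler pair; the call side does the real work. A minimal sketch of it follows, where generateSql and validateReadOnly are assumed helpers of our own (the validator is sketched near the end of this post), not SDK APIs:
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  if (request.params.name !== 'query_sales_data') {
    throw new Error(`Unknown tool: ${request.params.name}`)
  }
  const { question, date_range } = request.params.arguments as {
    question: string
    date_range?: string
  }
  const sql = await generateSql(question, date_range) // assumed: NL → SQL via Claude
  validateReadOnly(sql) // assumed: rejects anything that is not a single SELECT
  const result = await pool.query(sql)
  return {
    content: [{ type: 'text', text: JSON.stringify(result.rows) }],
  }
})

// Expose the server over stdio so an MCP client can spawn and talk to it.
await server.connect(new StdioServerTransport())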
Schema Ingestion — the Foundation
Before Claude can query your database meaningfully, it needs to understand the schema. We load the entire ER diagram and table definitions into the system prompt:
import { Pool } from 'pg'

async function loadDatabaseSchema(pool: Pool): Promise<string> {
const tables = await pool.query(`
SELECT
table_name,
column_name,
data_type,
is_nullable,
column_default
FROM information_schema.columns
WHERE table_schema = 'public'
ORDER BY table_name, ordinal_position
`)
// Format for Claude's context window
return formatSchemaForLLM(tables.rows)
}
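formatSchemaForLLM is where most of the prompt engineering lives. Here is one workable formatter, assuming the row shape returned by the query above (our production format differs, but the idea is a compact per-table listing):
interface SchemaRow {
  table_name: string
  column_name: string
  data_type: string
  is_nullable: 'YES' | 'NO'
  column_default: string | null
}

function formatSchemaForLLM(rows: SchemaRow[]): string {
  // Group columns by table, then emit one compact block per table.
  const byTable = new Map<string, SchemaRow[]>()
  for (const row of rows) {
    byTable.set(row.table_name, [...(byTable.get(row.table_name) ?? []), row])
  }
  return [...byTable.entries()]
    .map(([table, cols]) => {
      const colLines = cols.map(
        (c) => `  ${c.column_name} ${c.data_type}${c.is_nullable === 'NO' ? ' NOT NULL' : ''}`
      )
      return [`TABLE ${table}`, ...colLines].join('\n')
    })
    .join('\n\n')
}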
Stateful Memory with ChromaDB
One challenge with LLM agents is context loss across sessions. We solved this by storing schema embeddings and frequently-asked query patterns in ChromaDB:
import chromadb
from chromadb.utils import embedding_functions
client = chromadb.PersistentClient(path="./db_context")
ef = embedding_functions.DefaultEmbeddingFunction()
collection = client.get_or_create_collection(
name="query_patterns",
embedding_function=ef,
)
def remember_successful_query(nl_query: str, sql: str, result_summary: str):
"""Persist successful NL→SQL translations for future context."""
collection.add(
documents=[nl_query],
metadatas=[{"sql": sql, "summary": result_summary}],
        ids=[generate_id(nl_query)],  # assumed helper, e.g. a stable hash of the query text
    )
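Writing memories only pays off with a matching read path: before translating a fresh question, pull the closest past translations and hand them to Claude as few-shot examples. A sketch using chromadb's JavaScript client, to stay in TypeScript with the rest of the post (the Python equivalent is collection.query with the same parameters; recallSimilarQueries is our own name, and this client assumes a running Chroma server rather than the embedded PersistentClient above):
import { ChromaClient } from 'chromadb'

const chroma = new ChromaClient() // defaults to a local Chroma server
const patterns = await chroma.getOrCreateCollection({ name: 'query_patterns' })

// Fetch the three most similar past NL→SQL translations as few-shot examples.
async function recallSimilarQueries(nlQuery: string) {
  const hits = await patterns.query({ queryTexts: [nlQuery], nResults: 3 })
  // Result arrays are nested per input query; we only sent one.
  return hits.documents[0].map((question, i) => ({
    question,
    sql: hits.metadatas[0][i]?.sql as string | undefined,
  }))
}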
What Makes This Production-Grade
A few things separate a demo from a production agent:
- Schema caching — don't re-fetch schema on every query
- SQL validation before execution — Claude generates SQL, you validate before running (sketched after this list)
- Result size limits — cap rows returned to prevent context overflow
- Audit logging — every NL→SQL→result cycle is logged
- Human-in-the-loop for writes — agent can only SELECT; mutations need explicit confirmation
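On the validation bullet: here is a minimal sketch of what "validate before running" can mean, assuming a single-statement, SELECT-only policy. A production version should parse the SQL with a real parser instead of pattern matching; the read-only transaction is the actual backstop:
import { Pool } from 'pg'

const MUTATING = /\b(insert|update|delete|drop|alter|truncate|grant|create|copy)\b/i

function validateReadOnly(sql: string): void {
  const stmt = sql.trim().replace(/;\s*$/, '')
  if (stmt.includes(';')) throw new Error('Multiple statements are not allowed')
  // Allow WITH for CTEs; the keyword check still catches data-modifying CTEs.
  if (!/^(select|with)\b/i.test(stmt)) throw new Error('Only SELECT statements are allowed')
  if (MUTATING.test(stmt)) throw new Error('Mutating keyword detected')
}

// Defense in depth: even if the validator misses something, a read-only
// transaction makes Postgres itself refuse any write.
async function runReadOnly(pool: Pool, sql: string) {
  const client = await pool.connect()
  try {
    await client.query('BEGIN TRANSACTION READ ONLY')
    const result = await client.query(sql)
    return result.rows
  } finally {
    await client.query('ROLLBACK')
    client.release()
  }
}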
Results
After 3 months in production at Giovani:
- Business users issue 50–80 natural-language queries per day
- Average query → visualization time: < 8 seconds
- SQL accuracy rate: ~94% (validated against known queries)
- Zero data incidents — read-only MCP tools enforce this at the protocol level
MCP isn't hype. It's a practical architectural pattern that makes AI agents auditable, composable, and safe to run against real production systems.
If you're building MCP-based agents and want to compare notes, reach out on LinkedIn.