building with the claude sdk — tools, mcp, and what i've learned
most of my recent ai engineering work runs on the anthropic claude sdk. it’s become my default stack for anything that requires reliable tool use, structured output, or long-context reasoning. here’s what i’ve built with it and what i’ve learned.
why claude for production tool use
the deciding factor for me was reliability. claude follows tool schemas more consistently than other models i’ve tested, and it handles the case where a tool call fails more gracefully — it’ll explain the failure and try an alternative rather than hallucinating a result.
the sdk itself is straightforward:
import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=[
        {
            "name": "query_database",
            "description": "run a read-only sql query against the postgres database",
            "input_schema": {
                "type": "object",
                "properties": {
                    "query": {"type": "string", "description": "valid sql select statement"},
                },
                "required": ["query"],
            },
        }
    ],
    messages=[{"role": "user", "content": "how many users signed up last week?"}],
)
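the call above only gets you as far as claude asking for the tool. here's a minimal sketch of the other half, assuming a hypothetical `run_query` callable that executes the sql and returns rows. it collects the `tool_use` blocks from `response.content` and answers each one with a `tool_result` block. `build_tool_results` is a name i made up for illustration, not something the sdk ships.

```python
import json
from typing import Any, Callable


def build_tool_results(content_blocks, run_query: Callable[[str], Any]) -> dict:
    """turn the tool_use blocks of one response into a user message of tool_result blocks."""
    results = []
    for block in content_blocks:
        if block.type != "tool_use":
            continue  # text blocks need no reply
        try:
            rows = run_query(block.input["query"])
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,  # ties the result to the request
                "content": json.dumps({"rows": rows}),
            })
        except Exception as exc:
            # report the failure to the model instead of raising
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps({"error": str(exc)}),
                "is_error": True,
            })
    return {"role": "user", "content": results}
```

with a real response you'd check `response.stop_reason == "tool_use"`, append the assistant turn (`{"role": "assistant", "content": response.content}`) plus this message to `messages`, and call `client.messages.create` again.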
the postgresql streaming query tool
one of the first things i built at spritle was a streaming query tool that lets claude interact with a postgresql database conversationally. the key design decisions:
read-only by default — all queries are run in a transaction that’s immediately rolled back. claude can explore the schema and read data, but can’t mutate anything without a separate explicit write tool.
schema injection — before each session, i inject a compact schema description into the system prompt. claude uses this to write correct queries without needing to call a schema-inspection tool every time.
streaming output — for queries that return large result sets, i stream the response back using the sdk’s streaming api so the user sees results progressively rather than waiting for a full response:
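# stream_answer is an illustrative wrapper: yield needs an enclosing function
def stream_answer(client, messages, schema_description, query_tool, explain_tool):
    with client.messages.stream(
        model="claude-opus-4-5",
        max_tokens=2048,
        system=f"database schema:\n{schema_description}",
        tools=[query_tool, explain_tool],
        messages=messages,
    ) as stream:
        for event in stream:
            # tool-call arguments arrive as input_json_delta, which has no .text
            if event.type == "content_block_delta" and event.delta.type == "text_delta":
                yield event.delta.text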
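going back to the first two design decisions, here's a rough sketch of how rollback-only execution and compact schema injection can look. it uses `sqlite3` from the standard library as a stand-in for postgres so it runs anywhere; that swap is mine, and `format_schema` and `run_read_only` are hypothetical names, not the actual tool's code. the cursor and rollback calls are plain db-api, so the same shape should carry over to a postgres driver.

```python
import sqlite3


def format_schema(tables: dict) -> str:
    """render {table: {column: type}} as one compact line per table."""
    lines = []
    for table, columns in tables.items():
        cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
        lines.append(f"{table}({cols})")
    return "\n".join(lines)


def run_read_only(conn, sql: str) -> list:
    """execute sql and always roll back, so nothing the query did persists."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        conn.rollback()
        cur.close()


# demo: in-memory sqlite database standing in for postgres
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.commit()

run_read_only(conn, "INSERT INTO users VALUES (1, 'a@example.com')")  # rolled back
remaining = run_read_only(conn, "SELECT COUNT(*) FROM users")
schema_description = format_schema({"users": {"id": "integer", "email": "text"}})
```

rollback on its own is a sketch-level guarantee. on postgres you'd probably also want a read-only role or `SET TRANSACTION READ ONLY` so a write fails outright instead of being silently undone.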
building mcp servers that claude drives
the model context protocol (mcp) flips the tool-use model: instead of defining tools inline in the api call, you run a separate server that advertises tools, and the ai client discovers them at runtime.
for the gitlab-mcp project, claude desktop is the client — it connects to my mcp server on startup, fetches the tool list (40+ operations), and can then call any of them during a conversation:
[user]: check if there are failing ci jobs on the main branch and summarise them
[claude]: i'll check the pipeline status for you.
→ calls: list_pipeline_jobs(project_id="...", ref="main")
→ calls: get_job_log(job_id="1234")
→ response: "there are 2 failing jobs: test-unit (exit code 1) and lint (exit code 2). the unit test failure is in auth_test.py line 47..."
the key insight: mcp lets you build the tool once and use it with any compatible client. the same gitlab-mcp server works with claude, cursor, and cline without any changes.
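to make the discovery step concrete, here's a stripped-down sketch of the two json-rpc methods involved: `tools/list`, which the client calls to find out what exists, and `tools/call`, which it uses to invoke a tool. it skips the initialization handshake and the transport entirely, and a real server would normally lean on an mcp sdk rather than hand-rolling this. the `list_pipeline_jobs` schema and the stub handler are assumptions for illustration, not the gitlab-mcp code.

```python
import json

# hypothetical registry: tool name -> metadata plus a handler
TOOLS = {
    "list_pipeline_jobs": {
        "description": "list ci jobs for the latest pipeline on a ref",
        "inputSchema": {
            "type": "object",
            "properties": {
                "project_id": {"type": "string"},
                "ref": {"type": "string"},
            },
            "required": ["project_id", "ref"],
        },
        # stub handler; a real server would call the gitlab api here
        "handler": lambda args: [{"name": "lint", "status": "failed"}],
    },
}


def _error(request_id, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


def handle_request(request: dict) -> dict:
    """answer a single json-rpc request for tools/list or tools/call."""
    method = request["method"]
    params = request.get("params", {})
    if method == "tools/list":
        # advertise metadata only; handlers stay on the server
        result = {"tools": [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]}
    elif method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request["id"], -32602, "unknown tool")
        output = tool["handler"](params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": json.dumps(output)}], "isError": False}
    else:
        return _error(request["id"], -32601, "method not found")
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}
```

because the client only ever sees the `tools/list` output, adding a tool to the registry is enough for every compatible client to pick it up on its next connection.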
whatsapp mcp — personal messaging automation
the whatsapp-mcp project takes a different angle: bridging claude with personal whatsapp accounts. a go binary connects to whatsapp’s web api using the whatsmeow library; a python flask server exposes this as mcp tools.
the tools claude can call:
send_message: text, images, documents, audio
get_messages: fetch message history for a contact or group
search_messages: full-text search across stored messages
list_chats: get all active conversations
practically, this means you can have a conversation with claude like:
“summarise what my team discussed in the group chat this week and draft a reply to any open questions”
claude calls get_messages for the group, reads the history, identifies open threads, and uses send_message to draft a reply for you to review.
authentication is by qr code scan and the session persists for about 20 days; no credentials are stored.
stitch mcp — ai-driven ui generation
stitch-mcp wraps google stitch’s ui generation api as mcp tools, letting claude drive interface creation from natural language:
“create a dashboard with a sidebar nav, a metrics overview card, and a data table — apply a dark design system”
claude chains the tool calls: create_project → generate_screen (with the prompt) → create_design_system → apply_design_system. the result is a complete figma-compatible design exported from a natural language description.
the interesting engineering challenge here: google stitch’s api is stateful — screens reference projects, design systems reference screens. claude needs to track the state across tool calls. the solution: each tool response includes the created resource’s id, and claude passes it forward. no external state management needed.
key lessons
tool descriptions matter more than you think. claude reads the description field to decide which tool to call and how to use it. one sentence of clear, specific description beats a paragraph of vague text.
structured output over free text. any tool that returns unstructured text forces claude to parse it — and parsing errors cascade. return typed objects or json. if you must return text, make the format explicit and consistent.
test the tool-call loop. the most common failure mode: claude calls a tool, gets an unexpected response format, then calls the same tool again with slightly different inputs, loops 3 times, then gives up. catch this in testing before it hits production.
streaming is worth the complexity. for any operation that takes more than 2 seconds, stream the response. users tolerate slow responses if they see progress; they abandon interfaces that appear frozen.
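for the tool-call loop lesson, this is roughly the kind of guard i mean: a driver that stops when the same tool gets called several times in a row. it's a test-harness sketch under assumptions. `next_call` stands in for "ask the model what to do next" and `execute` for "run the tool"; both are hypothetical callables, and the thresholds are arbitrary.

```python
from typing import Callable, Optional


def run_tool_loop(
    next_call: Callable[[list], Optional[dict]],
    execute: Callable[[dict], str],
    max_turns: int = 6,
    max_same_tool: int = 3,
) -> dict:
    """drive a tool loop, bailing out if one tool is called max_same_tool times in a row."""
    history = []
    streak_tool, streak = None, 0
    for _ in range(max_turns):
        call = next_call(history)
        if call is None:  # the model produced a final answer
            return {"status": "done", "history": history}
        # count consecutive calls to the same tool, whatever the inputs
        streak = streak + 1 if call["name"] == streak_tool else 1
        streak_tool = call["name"]
        if streak >= max_same_tool:
            return {"status": "stuck", "tool": streak_tool, "history": history}
        history.append((call, execute(call)))
    return {"status": "max_turns", "history": history}


# a fake model that keeps retrying one tool with slightly different inputs
def flailing_model(history):
    return {"name": "query_database", "input": {"query": f"select {len(history)}"}}


outcome = run_tool_loop(flailing_model, lambda call: "unexpected format")
```

some workflows legitimately call one tool many times in a row (fetching logs for several jobs, say), so the threshold has to be tuned per tool rather than treated as a universal rule.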
all three mcp projects (gitlab, whatsapp, stitch) are open source. the claude sdk powers the ai layer in everything i’ve shipped at spritle since october 2025.