Build with Claude: Practical Guidance & Best Practices
Claude
Dec 9, 2025


To build with Claude, pick the right model, ground it in your data, and return structured outputs. Use tools (and MCP) for actions, prompt caching for speed/cost, and add computer use for desktop tasks. Wrap with evals, guardrails, and a rollout plan so pilots turn into production value.
Why this matters now
Claude’s newest models (Opus/Sonnet 4.5) bring stronger coding, agent workflows, and long‑context reasoning. Add structured outputs, prompt caching, tool use & computer use, plus the Model Context Protocol (MCP), and you can move from neat demos to dependable systems that ship value.
Model selection
Opus 4.5: best for complex coding, agents, research, multi‑step plans. Use when quality trumps cost.
Sonnet 4.5: balanced performance/price for most apps and APIs.
Haiku (if available in your region): fastest/lowest cost for simple classification, extraction, or routing.
Context window: up to 200k tokens on the current 4.5 family. Still, don’t just “paste everything in”—ground the model and fetch only what’s relevant.
Golden rules (that save weeks later)
Return structure, not prose. Ask for structured outputs (JSON Schema) so your code handles results deterministically.
Keep prompts modular. Use a short, stable system prompt + task instructions + examples.
Cache heavy prefixes. Enable prompt caching for policies, examples, and long backgrounds to cut latency/cost.
Connect tools the clean way. Use tool use and standardise connections via MCP.
Design control points. Human‑in‑the‑loop for high‑risk steps; automatic on low‑risk.
Measure and iterate. Track accuracy, latency, cost, and business outcome (time saved, quality, risk, revenue).
Quickstart (10–15 minutes)
1) Choose a model & shape the system prompt
Describe the role, audience, constraints, and failure behaviour (e.g., “say I don’t know when uncertain”). Keep it brief.
2) Define the output schema
Create a JSON Schema that captures exactly what you need (types, enums, required fields). Avoid over‑complex features to start.
3) Add tools (optional now, essential later)
Start with a simple “search” or “getCustomer” tool. Move to MCP when you’re ready to standardise connections.
4) Turn on prompt caching
Cache the static prefix (policies, examples) to reduce latency/cost for every call.
5) Write a tiny eval
10–20 cases that mirror real inputs. Score for exact‑match fields and business rules.
6) Ship to a pilot group
Instrument logs and dashboards. Decide to scale/iterate/stop based on the data.
Example: TypeScript Messages API with structured JSON
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

// Exactly the fields we need: strict types, enums, and required keys.
const schema = {
  type: "object",
  additionalProperties: false,
  properties: {
    intent: { type: "string", enum: ["support", "sales", "other"] },
    priority: { type: "integer", minimum: 1, maximum: 5 },
    summary: { type: "string", maxLength: 240 }
  },
  required: ["intent", "priority", "summary"]
} as const;

const msg = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  system: "You are a helpful triage assistant. If uncertain, say you are uncertain.",
  messages: [{ role: "user", content: "My checkout fails at payment step" }],
  // Structured outputs: the exact parameter shape may vary by SDK version.
  output_format: { type: "json_schema", json_schema: schema }
});

// The result arrives as a text block containing schema-valid JSON.
const first = msg.content[0];
const result = first.type === "text" ? JSON.parse(first.text) : null;
Prompt caching (TypeScript)
const msg = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  // system accepts an array of text blocks; cache_control marks the heavy
  // static prefix so repeated calls reuse the cached tokens.
  system: [
    {
      type: "text",
      text: "Global policies and examples...",
      cache_control: { type: "ephemeral" }
    }
  ],
  messages: [{ role: "user", content: "New request text..." }]
});
Tools, MCP, and computer use (when you’re ready to act)
Tool use lets Claude call functions you define (with JSON‑schema parameters). Great for: database lookups, searches, sending emails, posting to CRMs.
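For example, a minimal tool definition in the Messages API might look like this, reusing the client from the quickstart above (the getCustomer tool and its email field are illustrative placeholders):

const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  // Each tool declares a name, a description, and JSON-schema parameters.
  tools: [
    {
      name: "getCustomer",
      description: "Look up a customer record by email address.",
      input_schema: {
        type: "object",
        properties: {
          email: { type: "string", description: "Customer email address" }
        },
        required: ["email"]
      }
    }
  ],
  messages: [{ role: "user", content: "What plan is jane@example.com on?" }]
});

// When Claude wants the tool, the response contains a tool_use block;
// run the lookup yourself, then send the data back as a tool_result message.
const toolUse = response.content.find((b) => b.type === "tool_use");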
MCP standardises how models connect to tools and data—think USB‑C for AI. Adopt it to avoid bespoke adapters and enable a growing ecosystem of ready‑made servers.
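As a sketch using the official TypeScript MCP SDK (lookupCustomer is a hypothetical stand-in for your own data layer), the same getCustomer capability could be exposed as an MCP server:

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

declare function lookupCustomer(email: string): Promise<unknown>; // your data layer

const server = new McpServer({ name: "crm", version: "1.0.0" });

// Register the tool once; any MCP-capable client can discover and call it.
server.tool("getCustomer", { email: z.string() }, async ({ email }) => ({
  content: [{ type: "text", text: JSON.stringify(await lookupCustomer(email)) }]
}));

await server.connect(new StdioServerTransport());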
Computer use (beta) allows controlled desktop interactions (mouse/keyboard/screen) for tasks that still live in GUI‑only apps.
Pattern
Start with 1–2 tools (e.g., knowledge search, entity fetch).
Add programmatic tool calling and a tool runner that retries with clear errors (see the sketch after this list).
Log every call (inputs/outputs) for traceability.
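A minimal runner, assuming a hypothetical runTool dispatcher that executes the named tool:

declare function runTool(name: string, input: unknown): Promise<unknown>; // your dispatcher

// Retry transient failures, surface clear errors, and log every call.
async function callTool(name: string, input: unknown, maxRetries = 2): Promise<unknown> {
  for (let attempt = 0; ; attempt++) {
    try {
      const output = await runTool(name, input);
      console.log(JSON.stringify({ tool: name, input, output })); // trace success
      return output;
    } catch (err) {
      console.error(JSON.stringify({ tool: name, input, attempt, error: String(err) }));
      if (attempt >= maxRetries) throw err; // give up with a clear error
    }
  }
}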
Prompt patterns that consistently help
Role + constraints: “You are a compliance reviewer. If unsure, ask for the policy page.”
Examples (few‑shot): one good example beats ten vague rules.
Output contract: remind Claude that invalid JSON = invalid task.
Let it say I don’t know: fewer hallucinations, better trust.
Decompose tasks: ask for a plan before action in high‑stakes flows.
Evaluations & quality
Set success criteria first (exact‑match fields, pass@k, precision/recall, latency). Build a small eval harness (a sketch follows below):
10–20 realistic test cases with expected outputs.
Automatic scoring for structural validity + business rules.
A weekly trend chart (accuracy, cost, latency, cache hit rate).
Red flags: silently changed formats, creeping latency, tool‑call failures, over‑confident answers.
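A minimal harness along these lines (triage is a hypothetical wrapper around the structured-output call shown earlier):

declare function triage(input: string): Promise<{ intent: string; priority: number; summary: string }>;

const cases = [
  { input: "My checkout fails at payment step", expected: { intent: "support", priority: 4 } },
  // ...10-20 cases mirroring real inputs
];

let passed = 0;
for (const c of cases) {
  const out = await triage(c.input);
  // Exact-match fields plus a business rule from the schema (summary length).
  const ok =
    out.intent === c.expected.intent &&
    out.priority === c.expected.priority &&
    out.summary.length <= 240;
  if (ok) passed++;
  else console.error("FAIL:", c.input, out);
}
console.log(`pass rate: ${passed}/${cases.length}`);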
Cost & performance checklist
Prompt caching for static prefixes and example blocks.
Streaming responses for perceived speed in UI (see the sketch after this list).
Shorter messages: put long context behind a retrieval step instead of inline.
Right‑size the model: Haiku for extraction/routing; Sonnet/Opus for heavy reasoning.
Batching non‑interactive jobs off‑peak.
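A streaming sketch with the TypeScript SDK:

// Render tokens as they arrive instead of waiting for the full message.
const stream = client.messages.stream({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  messages: [{ role: "user", content: "Summarise this ticket..." }]
});

stream.on("text", (delta) => process.stdout.write(delta));
const final = await stream.finalMessage(); // complete message once the stream ends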
Security, safety & governance
Data scopes: least‑privilege keys for tools and MCP servers; avoid broad access.
PII handling: mask on ingest; unmask only where authorised.
Human control points: approvals for high‑impact actions (payments, customer comms).
Audit & observability: log prompts, tool calls, and outputs with IDs; retain for compliance.
Policy prompts: encode “must/never” rules in a short system layer; keep versioned.
Practical recipes
1) Knowledge answering with citations
Retrieve top 5 passages from your search/KB tool.
Ask Claude to answer with inline citations and a confidence label.
Return JSON: {answer, citations: [...], confidence: enum} (schema sketched below).
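The contract might be expressed as a schema like this (field names and confidence levels are illustrative):

const answerSchema = {
  type: "object",
  additionalProperties: false,
  properties: {
    answer: { type: "string" },
    citations: { type: "array", items: { type: "string" } }, // e.g. passage IDs
    confidence: { type: "string", enum: ["high", "medium", "low"] }
  },
  required: ["answer", "citations", "confidence"]
} as const;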
2) Support triage
Classify intent/priority, summarise, propose next step.
If high‑risk or missing data, route to human.
Record reasoning in a hidden field for audits.
3) Sales research assistant
Tool calls: company lookup → CRM enrichment → email draft.
Output JSON + separate draft text artefact for review.
4) Agent for back‑office ops (computer use)
Read screen → click through legacy GUI → export data → upload to system of record.
Guard with timeouts, whitelists, and a manual approval step.
Common pitfalls (and friendly fixes)
Free‑text outputs → switch to structured outputs.
Massive prompts → ground via retrieval + prompt caching.
Over‑automation → keep humans in the loop for edge cases first.
One giant tool → split into small, composable tools with clear schemas.
No evals → even 20 test cases beat guessing.
FAQ
Is Claude suitable for beginners?
Yes. Start with the Messages API and structured outputs; add tools later.
How do I stop hallucinations?
Ask for uncertainty, ground with retrieved context, and require citations or structured outputs.
What context window should I expect?
Plan around long‑context models (up to ~200k tokens) but fetch only what’s relevant.
Can Claude act in my systems?
Yes, via tool use and MCP; for desktop apps, consider computer use (beta) with guardrails.
How do I measure ROI?
Track accuracy, time saved, error reduction, cost per task, and—if applicable—revenue lift.
Book a Claude Build Workshop — we’ll help you select the right model, wire tools via MCP, enable structured outputs and caching, and stand up an eval harness so you launch with confidence.