Build with Claude: Practical Guidance & Best Practices

Claude

Dec 9, 2025

[Image: three people collaborate in a modern office, reviewing a flowchart on a large computer screen.]

To build with Claude, pick the right model, ground it in your data, and return structured outputs. Use tools (and MCP) for actions, prompt caching for speed/cost, and add computer use for desktop tasks. Wrap with evals, guardrails, and a rollout plan so pilots turn into production value.

Why this matters now

Claude’s newest models (Opus/Sonnet 4.5) bring stronger coding, agent workflows, and long‑context reasoning. Add structured outputs, prompt caching, tool use & computer use, plus the Model Context Protocol (MCP), and you can move from neat demos to dependable systems that ship value.

Model selection

  • Opus 4.5: best for complex coding, agents, research, multi‑step plans. Use when quality trumps cost.

  • Sonnet 4.5: balanced performance/price for most apps and APIs.

  • Haiku (if available in your region): fastest/lowest cost for simple classification, extraction, or routing.

Context window: up to 200k tokens on the current 4.5 family. Still, don’t just “paste everything in”—ground and fetch what’s relevant.
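If you route requests programmatically, the tiering above can be as simple as a lookup. A minimal sketch (the model IDs are assumptions; check the docs for current identifiers):

type Task = "extract" | "classify" | "chat" | "agent" | "deep-research";

function pickModel(task: Task): string {
  switch (task) {
    case "extract":
    case "classify":
      return "claude-haiku-4-5";  // fastest, lowest cost
    case "chat":
      return "claude-sonnet-4-5"; // balanced default
    case "agent":
    case "deep-research":
      return "claude-opus-4-5";   // quality over cost
  }
}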

Golden rules (that save weeks later)

  1. Return structure, not prose. Ask for structured outputs (JSON Schema) so your code handles results deterministically.

  2. Keep prompts modular. Use a short, stable system prompt + task instructions + examples.

  3. Cache heavy prefixes. Enable prompt caching for policies, examples, and long backgrounds to cut latency/cost.

  4. Connect tools the clean way. Use tool use and standardise connections via MCP.

  5. Design control points. Human‑in‑the‑loop for high‑risk steps; automatic on low‑risk.

  6. Measure and iterate. Track accuracy, latency, cost, and business outcome (time saved, quality, risk, revenue).

Quickstart (10–15 minutes)

1) Choose a model & shape the system prompt
Describe the role, audience, constraints, and failure behaviour (e.g., “say I don’t know when uncertain”). Keep it brief.

2) Define the output schema
Create a JSON Schema that captures exactly what you need (types, enums, required fields). Avoid over‑complex features to start.

3) Add tools (optional now, essential later)
Start with a simple “search” or “getCustomer” tool. Move to MCP when you’re ready to standardise connections.

4) Turn on prompt caching
Cache the static prefix (policies, examples) to reduce latency/cost for every call.

5) Write a tiny eval
10–20 cases that mirror real inputs. Score for exact‑match fields and business rules.

6) Ship to a pilot group
Instrument logs and dashboards. Decide to scale/iterate/stop based on the data.

Example: TypeScript Messages API with structured JSON

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

const schema = {
  type: "object",
  additionalProperties: false,
  properties: {
    intent: { type: "string", enum: ["support", "sales", "other"] },
    priority: { type: "integer", minimum: 1, maximum: 5 },
    summary: { type: "string", maxLength: 240 }
  },
  required: ["intent", "priority", "summary"]
} as const;

const msg = await client.messages.create({
  model: "claude-sonnet-4-5", // API model IDs use hyphens, not dots
  max_tokens: 800,
  system: "You are a helpful triage assistant. If uncertain, say you are uncertain.",
  messages: [{ role: "user", content: "My checkout fails at payment step" }],
  // Structured outputs: check the current docs for the exact parameter shape
  // and any beta header your SDK version requires.
  output_format: { type: "json_schema", json_schema: schema }
});

// Content blocks are a union type; narrow to a text block before parsing.
const first = msg.content[0];
if (first.type !== "text") throw new Error("Expected a text block");
const result = JSON.parse(first.text);
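Structured outputs constrain the model, but it is still worth validating the parsed result before your code acts on it. A sketch using the ajv package (an assumption; any JSON Schema validator works):

import Ajv from "ajv";

const ajv = new Ajv();
const validate = ajv.compile(schema);

if (!validate(result)) {
  // Treat invalid structure as a failed task: retry or route to a human.
  throw new Error(`Invalid model output: ${JSON.stringify(validate.errors)}`);
}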

Prompt caching (TypeScript)

const msg = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  // `system` takes an array of content blocks when you attach cache_control
  system: [
    {
      type: "text",
      text: "Global policies and examples...",
      cache_control: { type: "ephemeral" } // cache this heavy prefix
    }
  ],
  messages: [{ role: "user", content: "New request text..." }]
});
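Two practical notes on caching: ephemeral cache entries expire after a short idle period (about five minutes at the time of writing), and caching only applies above a minimum prefix length (on the order of a thousand tokens for Sonnet-class models), so very short system prompts will not benefit. Check the current docs for exact limits.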

Tools, MCP, and computer use (when you’re ready to act)

Tool use lets Claude call functions you define (with JSON‑schema parameters). Great for: database lookups, searches, sending emails, posting to CRMs.
MCP standardises how models connect to tools and data—think USB‑C for AI. Adopt it to avoid bespoke adapters and enable a growing ecosystem of ready‑made servers.
Computer use (beta) allows controlled desktop interactions (mouse/keyboard/screen) for tasks that still live in GUI‑only apps.
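Here is a minimal sketch of the tool-use loop: define a tool with JSON-schema parameters, let Claude decide to call it, run it, and return the result. The get_customer lookup is hypothetical:

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

// Hypothetical lookup; replace with your own data access.
async function getCustomer(input: { email: string }) {
  return { email: input.email, plan: "pro" };
}

const tools = [
  {
    name: "get_customer",
    description: "Fetch a customer record by email address.",
    input_schema: {
      type: "object" as const,
      properties: { email: { type: "string" } },
      required: ["email"]
    }
  }
];

const userMessage = { role: "user" as const, content: "What plan is jane@example.com on?" };

const first = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 500,
  tools,
  messages: [userMessage]
});

// If Claude requested the tool, run it and send the result back.
if (first.stop_reason === "tool_use") {
  for (const block of first.content) {
    if (block.type !== "tool_use") continue;
    const output = await getCustomer(block.input as { email: string });
    const followUp = await client.messages.create({
      model: "claude-sonnet-4-5",
      max_tokens: 500,
      tools,
      messages: [
        userMessage,
        { role: "assistant", content: first.content },
        { role: "user", content: [
          { type: "tool_result", tool_use_id: block.id, content: JSON.stringify(output) }
        ] }
      ]
    });
    console.log(followUp.content); // final answer grounded in the tool result
  }
}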

Pattern

  • Start with 1–2 tools (e.g., knowledge search, entity fetch).

  • Add programmatic tool calling and a tool runner that retries with clear errors.

  • Log every call (inputs/outputs) for traceability.

Prompt patterns that consistently help

  • Role + constraints: “You are a compliance reviewer. If unsure, ask for the policy page.”

  • Examples (few‑shot): one good example beats ten vague rules.

  • Output contract: remind Claude that invalid JSON = invalid task.

  • Let it say I don’t know: fewer hallucinations, better trust.

  • Decompose tasks: ask for a plan before action in high‑stakes flows.
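Put together, a compact system prompt applying these patterns might look like this (the company name and wording are illustrative):

const SYSTEM_PROMPT = `
You are a compliance reviewer for ACME's internal policies.
- Answer only from the provided policy excerpts.
- If the excerpts do not cover the question, say you don't know; never guess.
- Respond with JSON matching the provided schema; invalid JSON is a failed task.
`.trim();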

Evaluations & quality

Set success criteria first (exact‑match fields, pass@k, precision/recall, latency). Build a small eval harness:

  • 10–20 realistic test cases with expected outputs.

  • Automatic scoring for structural validity + business rules.

  • A weekly trend chart (accuracy, cost, latency, cache hit rate).
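A minimal harness can be a couple of dozen lines. A sketch, assuming a triage() function that wraps the Messages call from the quickstart:

type Case = { input: string; expected: { intent: string; priority: number } };

const cases: Case[] = [
  { input: "My checkout fails at payment step", expected: { intent: "support", priority: 4 } },
  { input: "Do you offer volume discounts?", expected: { intent: "sales", priority: 2 } }
  // ...grow this to 10–20 cases that mirror real inputs
];

async function runEvals(triage: (input: string) => Promise<{ intent: string; priority: number }>) {
  let passed = 0;
  for (const c of cases) {
    try {
      const out = await triage(c.input);
      if (out.intent === c.expected.intent && out.priority === c.expected.priority) passed++;
      else console.log(`FAIL: "${c.input}"`, out);
    } catch (err) {
      console.log(`ERROR: "${c.input}"`, err); // invalid JSON counts as a failure
    }
  }
  console.log(`${passed}/${cases.length} passed`);
}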

Red flags: silently changed formats, creeping latency, tool‑call failures, over‑confident answers.

Cost & performance checklist

  • Prompt caching for static prefixes and example blocks.

  • Streaming responses for perceived speed in UI (see the streaming sketch after this list).

  • Shorter messages: put long context behind a retrieval step instead of inline.

  • Right‑size the model: Haiku for extraction/routing; Sonnet/Opus for heavy reasoning.

  • Batching non‑interactive jobs off‑peak.
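For the streaming point above, the TypeScript SDK includes a streaming helper that emits text deltas as they arrive. A sketch, reusing the client from the earlier examples:

const stream = client.messages.stream({
  model: "claude-sonnet-4-5",
  max_tokens: 800,
  messages: [{ role: "user", content: "Summarise our refund policy." }]
});

// Print each text delta as it arrives, then await the complete message.
stream.on("text", (delta) => process.stdout.write(delta));
const final = await stream.finalMessage();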

Security, safety & governance

  • Data scopes: least‑privilege keys for tools and MCP servers; avoid broad access.

  • PII handling: mask on ingest; unmask only where authorised.

  • Human control points: approvals for high‑impact actions (payments, customer comms).

  • Audit & observability: log prompts, tool calls, and outputs with IDs; retain for compliance (see the logging sketch after this list).

  • Policy prompts: encode “must/never” rules in a short system layer; keep versioned.
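For the audit point above, even a thin wrapper that tags each call with an ID goes a long way. A sketch (swap console.log for your real log sink):

import { randomUUID } from "node:crypto";

async function auditedCall<T>(label: string, input: unknown, fn: () => Promise<T>): Promise<T> {
  const id = randomUUID();
  console.log(JSON.stringify({ id, label, input, at: new Date().toISOString() }));
  const output = await fn();
  console.log(JSON.stringify({ id, label, output }));
  return output;
}

Wrap every model and tool call, e.g. auditedCall("triage", text, () => client.messages.create(...)), so each request and response shares one traceable ID.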

Practical recipes

1) Knowledge answering with citations

  • Retrieve top 5 passages from your search/KB tool.

  • Ask Claude to answer with inline citations and a confidence label.

  • Return JSON: {answer, citations:[...], confidence: enum}.
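An illustrative JSON Schema for that contract (field names are assumptions; adapt them to your knowledge base):

const answerSchema = {
  type: "object",
  additionalProperties: false,
  properties: {
    answer: { type: "string" },
    citations: {
      type: "array",
      items: {
        type: "object",
        properties: {
          source: { type: "string" },
          quote: { type: "string" }
        },
        required: ["source"]
      }
    },
    confidence: { type: "string", enum: ["low", "medium", "high"] }
  },
  required: ["answer", "citations", "confidence"]
} as const;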

2) Support triage

  • Classify intent/priority, summarise, propose next step.

  • If high‑risk or missing data, route to human.

  • Record reasoning in a hidden field for audits.

3) Sales research assistant

  • Tool calls: company lookup → CRM enrichment → email draft.

  • Output JSON + separate draft text artefact for review.

4) Agent for back‑office ops (computer use)

  • Read screen → click through legacy GUI → export data → upload to system of record.

  • Guard with timeouts, whitelists, and a manual approval step.

Common pitfalls (and friendly fixes)

  • Free‑text outputs → switch to structured outputs.

  • Massive prompts → ground via retrieval + prompt caching.

  • Over‑automation → keep humans in the loop for edge cases first.

  • One giant tool → split into small, composable tools with clear schemas.

  • No evals → even 20 test cases beat guessing.

FAQ

Is Claude suitable for beginners?
Yes. Start with the Messages API and structured outputs; add tools later.

How do I stop hallucinations?
Ask for uncertainty, ground with retrieved context, and require citations or structured outputs.

What context window should I expect?
Plan around long‑context models (up to ~200k tokens) but fetch only what’s relevant.

Can Claude act in my systems?
Yes, via tool use and MCP; for desktop apps, consider computer use (beta) with guardrails.

How do I measure ROI?
Track accuracy, time saved, error reduction, cost per task, and—if applicable—revenue lift.

Book a Claude Build Workshop — we’ll help you select the right model, wire tools via MCP, enable structured outputs and caching, and stand up an eval harness so you launch with confidence.
