Free Practice Questions for Anthropic Claude Certified Architect – Foundations Certification
Study with 360 exam-style practice questions designed to help you prepare for the Anthropic Claude Certified Architect – Foundations.
Exam Information
Exam Details
Key information about Anthropic Claude Certified Architect – Foundations
- Question format: Multiple choice, scenario-based questions
- Passing score: 720 out of 1000
- Target candidate: Solution architect who designs and implements production applications with Claude, with 6+ months practical experience
Exam Topics & Skills Assessed
Skills measured (from the official study guide)
Domain 1: Agentic Architecture & Orchestration
Subdomain 1.1: Design and implement agentic loops for autonomous task execution
Knowledge of:
- The agentic loop lifecycle: sending requests to Claude, inspecting stop_reason ("tool_use" vs "end_turn"), executing requested tools, and returning results for the next iteration
- How tool results are appended to conversation history so the model can reason about the next action
- The distinction between model-driven decision-making (Claude reasons about which tool to call next based on context) and pre-configured decision trees or tool sequences
Skills in:
- Implementing agentic loop control flow that continues when stop_reason is "tool_use" and terminates when stop_reason is "end_turn"
- Adding tool results to conversation context between iterations so the model can incorporate new information into its reasoning
- Avoiding anti-patterns such as parsing natural language signals to determine loop termination, setting arbitrary iteration caps as the primary stopping mechanism, or checking for assistant text content as a completion indicator
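The loop above can be sketched against the Anthropic Python SDK's Messages interface. This is a minimal sketch, not the official implementation: the `execute_tool` dispatcher and the stub-friendly `client` parameter are assumptions for illustration.

```python
def run_agent_loop(client, model, messages, tools, execute_tool, max_tokens=1024):
    """Minimal agentic loop: keep calling the model while it requests tools."""
    while True:
        response = client.messages.create(
            model=model, max_tokens=max_tokens, messages=messages, tools=tools
        )
        # The assistant turn (including any tool_use blocks) joins the history.
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            # "end_turn": the model decided it is done. No text parsing,
            # no arbitrary iteration cap as the primary stop condition.
            return response
        # Execute every requested tool and return results for the next iteration.
        results = [
            {"type": "tool_result", "tool_use_id": block.id,
             "content": execute_tool(block.name, block.input)}
            for block in response.content if block.type == "tool_use"
        ]
        messages.append({"role": "user", "content": results})
```

Note that termination is driven entirely by `stop_reason`, which is the model-driven pattern the subdomain contrasts with pre-configured decision trees.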
Subdomain 1.2: Orchestrate multi-agent systems with coordinator-subagent patterns
Knowledge of:
- Hub-and-spoke architecture where a coordinator agent manages all inter-subagent communication, error handling, and information routing
- How subagents operate with isolated context—they do not inherit the coordinator's conversation history automatically
- The role of the coordinator in task decomposition, delegation, result aggregation, and deciding which subagents to invoke based on query complexity
- Risks of overly narrow task decomposition by the coordinator, leading to incomplete coverage of broad research topics
Skills in:
- Designing coordinator agents that analyze query requirements and dynamically select which subagents to invoke rather than always routing through the full pipeline
- Partitioning research scope across subagents to minimize duplication (e.g., assigning distinct subtopics or source types to each agent)
- Implementing iterative refinement loops where the coordinator evaluates synthesis output for gaps, re-delegates to search and analysis subagents with targeted queries, and re-invokes synthesis until coverage is sufficient
- Routing all subagent communication through the coordinator for observability, consistent error handling, and controlled information flow
Subdomain 1.3: Configure subagent invocation, context passing, and spawning
Knowledge of:
- The Task tool as the mechanism for spawning subagents, and the requirement that allowedTools must include "Task" for a coordinator to invoke subagents
- That subagent context must be explicitly provided in the prompt—subagents do not automatically inherit parent context or share memory between invocations
- The AgentDefinition configuration including descriptions, system prompts, and tool restrictions for each subagent type
- Fork-based session management for exploring divergent approaches from a shared analysis baseline
Skills in:
- Including complete findings from prior agents directly in the subagent's prompt (e.g., passing web search results and document analysis outputs to the synthesis subagent)
- Using structured data formats to separate content from metadata (source URLs, document names, page numbers) when passing context between agents to preserve attribution
- Spawning parallel subagents by emitting multiple Task tool calls in a single coordinator response rather than across separate turns
- Designing coordinator prompts that specify research goals and quality criteria rather than step-by-step procedural instructions, to enable subagent adaptability
Subdomain 1.4: Implement multi-step workflows with enforcement and handoff patterns
Knowledge of:
- The difference between programmatic enforcement (hooks, prerequisite gates) and prompt-based guidance for workflow ordering
- Why prompt instructions alone have a non-zero failure rate when deterministic compliance is required (e.g., identity verification before financial operations)
- Structured handoff protocols for mid-process escalation that include customer details, root cause analysis, and recommended actions
Skills in:
- Implementing programmatic prerequisites that block downstream tool calls until prerequisite steps have completed (e.g., blocking process_refund until get_customer has returned a verified customer ID)
- Decomposing multi-concern customer requests into distinct items, then investigating each in parallel using shared context before synthesizing a unified resolution
- Compiling structured handoff summaries (customer ID, root cause, refund amount, recommended action) when escalating to human agents who lack access to the conversation transcript
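A prerequisite gate of this kind can be sketched in plain Python. The tool names `get_customer` and `process_refund` come from the scenario above; how the gate hooks into an agent framework is assumed, not prescribed.

```python
class PrerequisiteGate:
    """Deterministically blocks process_refund until get_customer verifies a customer."""

    def __init__(self):
        self.verified_customer_id = None

    def on_tool_call(self, name, args):
        """Return an error dict to block the call, or None to allow it."""
        if name == "process_refund" and self.verified_customer_id is None:
            return {"error": "Blocked: call get_customer to verify the "
                             "customer before processing a refund."}
        return None

    def on_tool_result(self, name, result):
        """Record the prerequisite as satisfied once verification succeeds."""
        if name == "get_customer" and result.get("verified"):
            self.verified_customer_id = result["customer_id"]

gate = PrerequisiteGate()
blocked = gate.on_tool_call("process_refund", {"amount": 40})   # denied: not verified yet
gate.on_tool_result("get_customer", {"verified": True, "customer_id": "C-123"})
allowed = gate.on_tool_call("process_refund", {"amount": 40})   # now permitted
```

Because the check is code rather than prompt text, compliance is guaranteed rather than probabilistic.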
Subdomain 1.5: Apply Agent SDK hooks for tool call interception and data normalization
Knowledge of:
- Hook patterns (e.g., PostToolUse) that intercept tool results for transformation before the model processes them
- Hook patterns that intercept outgoing tool calls to enforce compliance rules (e.g., blocking refunds above a threshold)
- The distinction between using hooks for deterministic guarantees versus relying on prompt instructions for probabilistic compliance
Skills in:
- Implementing PostToolUse hooks to normalize heterogeneous data formats (Unix timestamps, ISO 8601, numeric status codes) from different MCP tools before the agent processes them
- Implementing tool call interception hooks that block policy-violating actions (e.g., refunds exceeding $500) and redirect to alternative workflows (e.g., human escalation)
- Choosing hooks over prompt-based enforcement when business rules require guaranteed compliance
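The normalization step a PostToolUse hook would run can be sketched as a pure function. The status-code mapping is an assumption for the example; only the transformation logic is the point.

```python
from datetime import datetime, timezone

STATUS_CODES = {0: "ok", 1: "error"}  # assumed mapping for this example

def normalize_tool_result(result: dict) -> dict:
    """Rewrite heterogeneous MCP results to one shape before the model sees them."""
    out = dict(result)
    ts = out.get("timestamp")
    if isinstance(ts, (int, float)):
        # Unix epoch seconds -> ISO 8601, so every tool reports the same format
        out["timestamp"] = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    status = out.get("status")
    if isinstance(status, int):
        # numeric status code -> human-readable label
        out["status"] = STATUS_CODES.get(status, f"unknown({status})")
    return out
```

A hook like this gives a deterministic guarantee that downstream reasoning never sees mixed date or status formats.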
Subdomain 1.6: Design task decomposition strategies for complex workflows
Knowledge of:
- When to use fixed sequential pipelines (prompt chaining) versus dynamic adaptive decomposition based on intermediate findings
- Prompt chaining patterns that break reviews into sequential steps (e.g., analyze each file individually, then run a cross-file integration pass)
- The value of adaptive investigation plans that generate subtasks based on what is discovered at each step
Skills in:
- Selecting task decomposition patterns appropriate to the workflow: prompt chaining for predictable multi-aspect reviews, dynamic decomposition for open-ended investigation tasks
- Splitting large code reviews into per-file local analysis passes plus a separate cross-file integration pass to avoid attention dilution
- Decomposing open-ended tasks (e.g., "add comprehensive tests to a legacy codebase") by first mapping structure, identifying high-impact areas, then creating a prioritized plan that adapts as dependencies are discovered
Subdomain 1.7: Manage session state, resumption, and forking
Knowledge of:
- Named session resumption using --resume <session-name> to continue a specific prior conversation
- fork_session for creating independent branches from a shared analysis baseline to explore divergent approaches
- The importance of informing the agent about changes to previously analyzed files when resuming sessions after code modifications
- Why starting a new session with a structured summary is more reliable than resuming with stale tool results
Skills in:
- Using --resume with session names to continue named investigation sessions across work sessions
- Using fork_session to create parallel exploration branches (e.g., comparing two testing strategies or refactoring approaches from a shared codebase analysis)
- Choosing between session resumption (when prior context is mostly valid) and starting fresh with injected summaries (when prior tool results are stale)
- Informing a resumed session about specific file changes for targeted re-analysis rather than requiring full re-exploration
Domain 2: Tool Design & MCP Integration
Subdomain 2.1: Design effective tool interfaces with clear descriptions and boundaries
Knowledge of:
- Tool descriptions as the primary mechanism LLMs use for tool selection; minimal descriptions lead to unreliable selection among similar tools
- The importance of including input formats, example queries, edge cases, and boundary explanations in tool descriptions
- How ambiguous or overlapping tool descriptions cause misrouting (e.g., analyze_content vs analyze_document with near-identical descriptions)
- The impact of system prompt wording on tool selection: keyword-sensitive instructions can create unintended tool associations
Skills in:
- Writing tool descriptions that clearly differentiate each tool's purpose, expected inputs, outputs, and when to use it versus similar alternatives
- Renaming tools and updating descriptions to eliminate functional overlap (e.g., renaming analyze_content to extract_web_results with a web-specific description)
- Splitting generic tools into purpose-specific tools with defined input/output contracts (e.g., splitting a generic analyze_document into extract_data_points, summarize_content, and verify_claim_against_source)
- Reviewing system prompts for keyword-sensitive instructions that might override well-written tool descriptions
Subdomain 2.2: Implement structured error responses for MCP tools
Knowledge of:
- The MCP isError flag pattern for communicating tool failures back to the agent
- The distinction between transient errors (timeouts, service unavailability), validation errors (invalid input), business errors (policy violations), and permission errors
- Why uniform error responses (generic "Operation failed") prevent the agent from making appropriate recovery decisions
- The difference between retryable and non-retryable errors, and how returning structured metadata prevents wasted retry attempts
Skills in:
- Returning structured error metadata including errorCategory (transient/validation/permission), isRetryable boolean, and human-readable descriptions
- Including isRetryable: false flags and customer-friendly explanations for business rule violations so the agent can communicate appropriately
- Implementing local error recovery within subagents for transient failures, propagating to the coordinator only errors that cannot be resolved locally along with partial results and what was attempted
- Distinguishing between access failures (needing retry decisions) and valid empty results (representing successful queries with no matches)
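One possible shape for such a structured error response, sketched in Python: the `errorCategory` and `isRetryable` fields follow the text above, the `isError`/`content` envelope follows MCP tool results, and the remaining field names are illustrative assumptions.

```python
def make_tool_error(category, message, retryable, attempted=None, partial=None):
    """Build a structured MCP-style error the agent can reason about."""
    assert category in {"transient", "validation", "business", "permission"}
    return {
        "isError": True,                                  # MCP failure flag
        "content": [{"type": "text", "text": message}],   # human-readable description
        "errorCategory": category,                        # guides recovery strategy
        "isRetryable": retryable,                         # prevents wasted retries
        "attemptedQuery": attempted,                      # what the tool tried
        "partialResults": partial,                        # anything salvageable
    }

# Transient failure: the agent may retry or route around it.
timeout = make_tool_error("transient", "Order service timed out after 5s",
                          retryable=True, attempted="lookup_order(O-991)")
# Business rule violation: retrying is pointless; escalate instead.
policy = make_tool_error("business", "Refunds over $500 require human approval",
                         retryable=False)
```

Contrast this with a generic "Operation failed" string, which gives the agent no basis for choosing between retry, fallback, and escalation.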
Subdomain 2.3: Distribute tools appropriately across agents and configure tool choice
Knowledge of:
- The principle that giving an agent access to too many tools (e.g., 18 instead of 4-5) degrades tool selection reliability by increasing decision complexity
- Why agents with tools outside their specialization tend to misuse them (e.g., a synthesis agent attempting web searches)
- Scoped tool access: giving agents only the tools needed for their role, with limited cross-role tools for specific high-frequency needs
- tool_choice configuration options: "auto", "any", and forced tool selection ({"type": "tool", "name": "..."})
Skills in:
- Restricting each subagent's tool set to those relevant to its role, preventing cross-specialization misuse
- Replacing generic tools with constrained alternatives (e.g., replacing fetch_url with load_document that validates document URLs)
- Providing scoped cross-role tools for high-frequency needs (e.g., a verify_fact tool for the synthesis agent) while routing complex cases through the coordinator
- Using tool_choice forced selection to ensure a specific tool is called first (e.g., forcing extract_metadata before enrichment tools), then processing subsequent steps in follow-up turns
- Setting tool_choice: "any" to guarantee the model calls a tool rather than returning conversational text
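The three tool_choice modes above, written out as Messages API request parameters. The shapes follow the Anthropic API; the model name and tool name are illustrative.

```python
auto_choice = {"type": "auto"}                              # model may answer in text
any_choice = {"type": "any"}                                # must call some tool
forced_choice = {"type": "tool", "name": "extract_metadata"}  # must call this tool

def build_request(tools, tool_choice, user_text):
    """Assemble a Messages API request body with an explicit tool_choice."""
    return {
        "model": "claude-sonnet-4-5",   # illustrative model name
        "max_tokens": 1024,
        "tools": tools,
        "tool_choice": tool_choice,
        "messages": [{"role": "user", "content": user_text}],
    }

req = build_request([], forced_choice, "Process this document.")
```

Forcing `extract_metadata` on the first request, then switching back to `"auto"` on follow-up turns, is how the "specific tool first, subsequent steps later" pattern above is realized.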
Subdomain 2.4: Integrate MCP servers into Claude Code and agent workflows
Knowledge of:
- MCP server scoping: project-level (.mcp.json) for shared team tooling vs user-level (~/.claude.json) for personal/experimental servers
- Environment variable expansion in .mcp.json (e.g., ${GITHUB_TOKEN}) for credential management without committing secrets
- That tools from all configured MCP servers are discovered at connection time and available simultaneously to the agent
- MCP resources as a mechanism for exposing content catalogs (e.g., issue summaries, documentation hierarchies, database schemas) to reduce exploratory tool calls
Skills in:
- Configuring shared MCP servers in project-scoped .mcp.json with environment variable expansion for authentication tokens
- Configuring personal/experimental MCP servers in user-scoped ~/.claude.json
- Enhancing MCP tool descriptions to explain capabilities and outputs in detail, preventing the agent from preferring built-in tools (like Grep) over more capable MCP tools
- Choosing existing community MCP servers over custom implementations for standard integrations (e.g., Jira), reserving custom servers for team-specific workflows
- Exposing content catalogs as MCP resources to give agents visibility into available data without requiring exploratory tool calls
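A project-scoped `.mcp.json` of the kind described above might look like this (the server name, command, and env variable name are illustrative; `${GITHUB_TOKEN}` is expanded from the environment at load time rather than committed to version control):

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```

Checking this file into the repository gives every teammate the same server; a personal or experimental server would go in user-scoped `~/.claude.json` instead.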
Subdomain 2.5: Select and apply built-in tools (Read, Write, Edit, Bash, Grep, Glob) effectively
Knowledge of:
- Grep for content search (searching file contents for patterns like function names, error messages, or import statements)
- Glob for file path pattern matching (finding files by name or extension patterns)
- Read/Write for full file operations; Edit for targeted modifications using unique text matching
- When Edit fails due to non-unique text matches, using Read + Write as a fallback for reliable file modifications
Skills in:
- Selecting Grep for searching code content across a codebase (e.g., finding all callers of a function, locating error messages)
- Selecting Glob for finding files matching naming patterns (e.g., **/*.test.tsx)
- Using Read to load full file contents followed by Write when Edit cannot find unique anchor text
- Building codebase understanding incrementally: starting with Grep to find entry points, then using Read to follow imports and trace flows, rather than reading all files upfront
- Tracing function usage across wrapper modules by first identifying all exported names, then searching for each name across the codebase
Domain 3: Claude Code Configuration & Workflows
Subdomain 3.1: Configure CLAUDE.md files with appropriate hierarchy, scoping, and modular organization
Knowledge of:
- The CLAUDE.md configuration hierarchy: user-level (~/.claude/CLAUDE.md), project-level (.claude/CLAUDE.md or root CLAUDE.md), and directory-level (subdirectory CLAUDE.md files)
- That user-level settings apply only to that user—instructions in ~/.claude/CLAUDE.md are not shared with teammates via version control
- The @import syntax for referencing external files to keep CLAUDE.md modular (e.g., importing specific standards files relevant to each package)
- .claude/rules/ directory for organizing topic-specific rule files as an alternative to a monolithic CLAUDE.md
Skills in:
- Diagnosing configuration hierarchy issues (e.g., a new team member not receiving instructions because they're in user-level rather than project-level configuration)
- Using @import to selectively include relevant standards files in each package's CLAUDE.md based on maintainer domain knowledge
- Splitting large CLAUDE.md files into focused topic-specific files in .claude/rules/ (e.g., testing.md, api-conventions.md, deployment.md)
- Using the /memory command to verify which memory files are loaded and diagnose inconsistent behavior across sessions
Subdomain 3.2: Create and configure custom slash commands and skills
Knowledge of:
- Project-scoped commands in .claude/commands/ (shared via version control) vs user-scoped commands in ~/.claude/commands/ (personal)
- Skills in .claude/skills/ with SKILL.md files that support frontmatter configuration including context: fork, allowed-tools, and argument-hint
- The context: fork frontmatter option for running skills in an isolated sub-agent context, preventing skill outputs from polluting the main conversation
- Personal skill customization: creating personal variants in ~/.claude/skills/ with different names to avoid affecting teammates
Skills in:
- Creating project-scoped slash commands in .claude/commands/ for team-wide availability via version control
- Using context: fork to isolate skills that produce verbose output (e.g., codebase analysis) or exploratory context (e.g., brainstorming alternatives) from the main session
- Configuring allowed-tools in skill frontmatter to restrict tool access during skill execution (e.g., limiting to file write operations to prevent destructive actions)
- Using argument-hint frontmatter to prompt developers for required parameters when they invoke the skill without arguments
- Choosing between skills (on-demand invocation for task-specific workflows) and CLAUDE.md (always-loaded universal standards)
Subdomain 3.3: Apply path-specific rules for conditional convention loading
Knowledge of:
- .claude/rules/ files with YAML frontmatter paths fields containing glob patterns for conditional rule activation
- How path-scoped rules load only when editing matching files, reducing irrelevant context and token usage
- The advantage of glob-pattern rules over directory-level CLAUDE.md files for conventions that span multiple directories (e.g., test files spread throughout a codebase)
Skills in:
- Creating .claude/rules/ files with YAML frontmatter path scoping (e.g., paths: ["terraform/**/*"]) so rules load only when editing matching files
- Using glob patterns in path-specific rules to apply conventions to files by type regardless of directory location (e.g., **/*.test.tsx for all test files)
- Choosing path-specific rules over subdirectory CLAUDE.md files when conventions must apply to files spread across the codebase
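A path-scoped rule file following the frontmatter pattern above could look like this (the file would live at something like `.claude/rules/testing.md`; the glob and the conventions themselves are illustrative):

```markdown
---
paths: ["**/*.test.tsx"]
---

# Test file conventions
- Query elements by accessible role, not by test ID.
- One behavior per test; name tests after the behavior under test.
```

Because the glob matches by file type, these rules load whenever any test file is edited, wherever it sits in the tree, which a directory-level CLAUDE.md cannot express.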
Subdomain 3.4: Determine when to use plan mode vs direct execution
Knowledge of:
- Plan mode is designed for complex tasks involving large-scale changes, multiple valid approaches, architectural decisions, and multi-file modifications
- Direct execution is appropriate for simple, well-scoped changes (e.g., adding a single validation check to one function)
- Plan mode enables safe codebase exploration and design before committing to changes, preventing costly rework
- The Explore subagent for isolating verbose discovery output and returning summaries to preserve main conversation context
Skills in:
- Selecting plan mode for tasks with architectural implications (e.g., microservice restructuring, library migrations affecting 45+ files, choosing between integration approaches with different infrastructure requirements)
- Selecting direct execution for well-understood changes with clear scope (e.g., a single-file bug fix with a clear stack trace, adding a date validation conditional)
- Using the Explore subagent for verbose discovery phases to prevent context window exhaustion during multi-phase tasks
- Combining plan mode for investigation with direct execution for implementation (e.g., planning a library migration, then executing the planned approach)
Subdomain 3.5: Apply iterative refinement techniques for progressive improvement
Knowledge of:
- Concrete input/output examples as the most effective way to communicate expected transformations when prose descriptions are interpreted inconsistently
- Test-driven iteration: writing test suites first, then iterating by sharing test failures to guide progressive improvement
- The interview pattern: having Claude ask questions to surface considerations the developer may not have anticipated before implementing
- When to provide all issues in a single message (interacting problems) versus fixing them sequentially (independent problems)
Skills in:
- Providing 2-3 concrete input/output examples to clarify transformation requirements when natural language descriptions produce inconsistent results
- Writing test suites covering expected behavior, edge cases, and performance requirements before implementation, then iterating by sharing test failures
- Using the interview pattern to surface design considerations (e.g., cache invalidation strategies, failure modes) before implementing solutions in unfamiliar domains
- Providing specific test cases with example input and expected output to fix edge case handling (e.g., null values in migration scripts)
- Addressing multiple interacting issues in a single detailed message when fixes interact, versus sequential iteration for independent issues
Subdomain 3.6: Integrate Claude Code into CI/CD pipelines
Knowledge of:
- The -p (or --print) flag for running Claude Code in non-interactive mode in automated pipelines
- --output-format json and --json-schema CLI flags for enforcing structured output in CI contexts
- CLAUDE.md as the mechanism for providing project context (testing standards, fixture conventions, review criteria) to CI-invoked Claude Code
- Session context isolation: why the same Claude session that generated code is less effective at reviewing its own changes compared to an independent review instance
Skills in:
- Running Claude Code in CI with the -p flag to prevent interactive input hangs
- Using --output-format json with --json-schema to produce machine-parseable structured findings for automated posting as inline PR comments
- Including prior review findings in context when re-running reviews after new commits, instructing Claude to report only new or still-unaddressed issues to avoid duplicate comments
- Providing existing test files in context so test generation avoids suggesting duplicate scenarios already covered by the test suite
- Documenting testing standards, valuable test criteria, and available fixtures in CLAUDE.md to improve test generation quality and reduce low-value test output
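Assembling the non-interactive CI invocation can be sketched as a small command builder. The `-p`, `--output-format`, and `--json-schema` flags come from the text above; the prompt text and schema path are illustrative assumptions.

```python
def build_ci_review_command(prompt, schema_path):
    """Build the argv for a non-interactive Claude Code review run in CI."""
    return [
        "claude",
        "-p", prompt,                  # print mode: no interactive input, no hang
        "--output-format", "json",     # machine-parseable output for the pipeline
        "--json-schema", schema_path,  # enforce the findings schema
    ]

cmd = build_ci_review_command(
    "Review the diff for bugs and security issues only.",
    "ci/review-schema.json",
)
# A pipeline step would then run this with subprocess.run(cmd, capture_output=True)
# and parse the JSON findings for posting as inline PR comments.
```

Keeping the command construction in one place also makes it easy to thread prior review findings into the prompt on re-runs, per the de-duplication skill above.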
Domain 4: Prompt Engineering & Structured Output
Subdomain 4.1: Design prompts with explicit criteria to improve precision and reduce false positives
Knowledge of:
- The importance of explicit criteria over vague instructions (e.g., "flag comments only when claimed behavior contradicts actual code behavior" vs "check that comments are accurate")
- How general instructions like "be conservative" or "only report high-confidence findings" fail to improve precision compared to specific categorical criteria
- The impact of false positive rates on developer trust: high false positive categories undermine confidence in accurate categories
Skills in:
- Writing specific review criteria that define which issues to report (bugs, security) versus skip (minor style, local patterns) rather than relying on confidence-based filtering
- Temporarily disabling high false-positive categories to restore developer trust while improving prompts for those categories
- Defining explicit severity criteria with concrete code examples for each severity level to achieve consistent classification
Subdomain 4.2: Apply few-shot prompting to improve output consistency and quality
Knowledge of:
- Few-shot examples as the most effective technique for achieving consistently formatted, actionable output when detailed instructions alone produce inconsistent results
- The role of few-shot examples in demonstrating ambiguous-case handling (e.g., tool selection for ambiguous requests, branch-level test coverage gaps)
- How few-shot examples enable the model to generalize judgment to novel patterns rather than matching only pre-specified cases
- The effectiveness of few-shot examples for reducing hallucination in extraction tasks (e.g., handling informal measurements, varied document structures)
Skills in:
- Creating 2-4 targeted few-shot examples for ambiguous scenarios that show reasoning for why one action was chosen over plausible alternatives
- Including few-shot examples that demonstrate specific desired output format (location, issue, severity, suggested fix) to achieve consistency
- Providing few-shot examples distinguishing acceptable code patterns from genuine issues to reduce false positives while enabling generalization
- Using few-shot examples to demonstrate correct handling of varied document structures (inline citations vs bibliographies, methodology sections vs embedded details)
- Adding few-shot examples showing correct extraction from documents with varied formats to address empty/null extraction of required fields
Subdomain 4.3: Enforce structured output using tool use and JSON schemas
Knowledge of:
- Tool use (tool_use) with JSON schemas as the most reliable approach for guaranteed schema-compliant structured output, eliminating JSON syntax errors
- The distinction between tool_choice: "auto" (model may return text instead of calling a tool), "any" (model must call a tool but can choose which), and forced tool selection (model must call a specific named tool)
- That strict JSON schemas via tool use eliminate syntax errors but do not prevent semantic errors (e.g., line items that don't sum to total, values in wrong fields)
- Schema design considerations: required vs optional fields, enum fields with "other" + detail string patterns for extensible categories
Skills in:
- Defining extraction tools with JSON schemas as input parameters and extracting structured data from the tool_use response
- Setting tool_choice: "any" to guarantee structured output when multiple extraction schemas exist and the document type is unknown
- Forcing a specific tool with tool_choice: {"type": "tool", "name": "extract_metadata"} to ensure a particular extraction runs before enrichment steps
- Designing schema fields as optional (nullable) when source documents may not contain the information, preventing the model from fabricating values to satisfy required fields
- Adding enum values like "unclear" for ambiguous cases and "other" + detail fields for extensible categorization
- Including format normalization rules in prompts alongside strict output schemas to handle inconsistent source formatting
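Schema-enforced extraction via tool use can be sketched as follows. The `tools` and `tool_choice` shapes follow the Anthropic Messages API; the tool name `record_invoice` and its fields are illustrative assumptions, chosen to show a nullable field and an extensible enum.

```python
invoice_tool = {
    "name": "record_invoice",
    "description": "Record structured fields extracted from an invoice.",
    "input_schema": {
        "type": "object",
        "properties": {
            "total": {"type": "number"},
            "currency": {"type": "string", "enum": ["USD", "EUR", "other"]},
            # Nullable: the source document may omit a due date, and a required
            # field would pressure the model to fabricate one.
            "due_date": {"type": ["string", "null"]},
        },
        "required": ["total", "currency"],
    },
}

# Force this extraction to run (rather than letting the model answer in text).
forced = {"type": "tool", "name": "record_invoice"}

def extract_invoice(response):
    """Pull the schema-compliant input off the tool_use block of a response."""
    for block in response.content:
        if block.type == "tool_use" and block.name == "record_invoice":
            return block.input
    raise ValueError("model did not call record_invoice")
```

Note that the schema guarantees syntactic validity only; semantic checks (e.g., line items summing to the total) still need a separate validation pass.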
Subdomain 4.4: Implement validation, retry, and feedback loops for extraction quality
Knowledge of:
- Retry-with-error-feedback: appending specific validation errors to the prompt on retry to guide the model toward correction
- The limits of retry: retries are ineffective when the required information is simply absent from the source document (vs format or structural errors)
- Feedback loop design: tracking which code constructs trigger findings (detected_pattern field) to enable systematic analysis of dismissal patterns
- The difference between semantic validation errors (values don't sum, wrong field placement) and schema syntax errors (eliminated by tool use)
Skills in:
- Implementing follow-up requests that include the original document, the failed extraction, and specific validation errors for model self-correction
- Identifying when retries will be ineffective (e.g., information exists only in an external document not provided) versus when they will succeed (format mismatches, structural output errors)
- Adding detected_pattern fields to structured findings to enable analysis of false positive patterns when developers dismiss findings
- Designing self-correction validation flows: extracting "calculated_total" alongside "stated_total" to flag discrepancies, adding "conflict_detected" booleans for inconsistent source data
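The retry-with-error-feedback pattern can be sketched as two pure functions: a semantic validator for a check the schema cannot express, and a retry-prompt builder that feeds the specific errors back. Field names (`line_items`, `stated_total`) are illustrative.

```python
def validate_extraction(data):
    """Semantic check a JSON schema cannot enforce: line items must sum to total."""
    errors = []
    items = data.get("line_items", [])
    stated = data.get("stated_total", 0)
    if abs(sum(items) - stated) > 0.01:
        errors.append(f"line_items sum to {sum(items)} but stated_total is {stated}")
    return errors

def build_retry_prompt(document, failed_extraction, errors):
    """Include the original document, the failed attempt, and the exact errors."""
    return (
        f"Document:\n{document}\n\n"
        f"Your previous extraction:\n{failed_extraction}\n\n"
        "It failed validation:\n- " + "\n- ".join(errors) + "\n"
        "Correct the extraction."
    )
```

Before retrying, check whether the failure is correctable at all: if the required information is absent from the document, no amount of error feedback will produce it.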
Subdomain 4.5: Design efficient batch processing strategies
Knowledge of:
- The Message Batches API: 50% cost savings, up to 24-hour processing window, no guaranteed latency SLA
- Batch processing is appropriate for non-blocking, latency-tolerant workloads (overnight reports, weekly audits, nightly test generation) and inappropriate for blocking workflows (pre-merge checks)
- The batch API does not support multi-turn tool calling within a single request (cannot execute tools mid-request and return results)
- custom_id fields for correlating batch request/response pairs
Skills in:
- Matching API approach to workflow latency requirements: synchronous API for blocking pre-merge checks, batch API for overnight/weekly analysis
- Calculating batch submission frequency based on SLA constraints (e.g., 4-hour windows to guarantee 30-hour SLA with 24-hour batch processing)
- Handling batch failures: resubmitting only failed documents (identified by custom_id) with appropriate modifications (e.g., chunking documents that exceeded context limits)
- Using prompt refinement on a sample set before batch-processing large volumes to maximize first-pass success rates and reduce iterative resubmission costs
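The custom_id correlation and failed-document resubmission pattern can be sketched as follows. The request shape follows the Message Batches API; the document IDs, model name, and prompt are illustrative assumptions.

```python
def build_batch_requests(documents):
    """One batch request per document, keyed by custom_id for later correlation."""
    return [
        {
            "custom_id": doc_id,  # ties this request to its result
            "params": {
                "model": "claude-sonnet-4-5",   # illustrative model name
                "max_tokens": 1024,
                "messages": [{"role": "user", "content": f"Summarize:\n{text}"}],
            },
        }
        for doc_id, text in documents.items()
    ]

def failed_ids(results):
    """Collect custom_ids of non-succeeded results so only those are resubmitted."""
    return [r["custom_id"] for r in results if r["result"]["type"] != "succeeded"]

requests = build_batch_requests({"doc-1": "alpha...", "doc-2": "beta..."})
```

A nightly job would submit `requests`, and on completion resubmit only `failed_ids(results)`, possibly after chunking documents that exceeded context limits.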
Subdomain 4.6: Design multi-instance and multi-pass review architectures
Knowledge of:
- Self-review limitations: a model retains reasoning context from generation, making it less likely to question its own decisions in the same session
- Independent review instances (without prior reasoning context) are more effective at catching subtle issues than self-review instructions or extended thinking
- Multi-pass review: splitting large reviews into per-file local analysis passes plus cross-file integration passes to avoid attention dilution and contradictory findings
Skills in:
- Using a second independent Claude instance to review generated code without the generator's reasoning context
- Splitting large multi-file reviews into focused per-file passes for local issues plus separate integration passes for cross-file data flow analysis
- Running verification passes where the model self-reports confidence alongside each finding to enable calibrated review routing
Domain 5: Context Management & Reliability
Subdomain 5.1: Manage conversation context to preserve critical information across long interactions
Knowledge of:
- Progressive summarization risks: condensing numerical values, percentages, dates, and customer-stated expectations into vague summaries
- The "lost in the middle" effect: models reliably process information at the beginning and end of long inputs but may omit findings from middle sections
- How tool results accumulate in context and consume tokens disproportionately to their relevance (e.g., 40+ fields per order lookup when only 5 are relevant)
- The importance of passing complete conversation history in subsequent API requests to maintain conversational coherence
Skills in:
- Extracting transactional facts (amounts, dates, order numbers, statuses) into a persistent "case facts" block included in each prompt, outside summarized history
- Extracting and persisting structured issue data (order IDs, amounts, statuses) into a separate context layer for multi-issue sessions
- Trimming verbose tool outputs to only relevant fields before they accumulate in context (e.g., keeping only return-relevant fields from order lookups)
- Placing key findings summaries at the beginning of aggregated inputs and organizing detailed results with explicit section headers to mitigate position effects
- Requiring subagents to include metadata (dates, source locations, methodological context) in structured outputs to support accurate downstream synthesis
- Modifying upstream agents to return structured data (key facts, citations, relevance scores) instead of verbose content and reasoning chains when downstream agents have limited context budgets
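The tool-output trimming skill above is a one-line filter in practice. The field names here are illustrative of a 40+-field order lookup where only the return-relevant fields matter.

```python
# Fields a returns workflow actually needs from an order lookup (assumed names).
RETURN_RELEVANT = {"order_id", "status", "total", "purchase_date", "return_window_ends"}

def trim_order_lookup(order):
    """Drop irrelevant fields before the result accumulates in context."""
    return {k: v for k, v in order.items() if k in RETURN_RELEVANT}

full = {
    "order_id": "O-77", "status": "delivered", "total": 59.99,
    "purchase_date": "2025-01-10", "return_window_ends": "2025-02-09",
    # ...plus dozens of fields irrelevant to returns:
    "warehouse_bin": "B-4", "carrier_scan_log": ["..."], "upsell_flags": [],
}
trimmed = trim_order_lookup(full)
```

Applied at the tool boundary (for example, inside a PostToolUse-style hook), this keeps token spend proportional to relevance across a long multi-issue session.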
Subdomain 5.2: Design effective escalation and ambiguity resolution patterns
Knowledge of:
- Appropriate escalation triggers: customer requests for a human, policy exceptions/gaps (not just complex cases), and inability to make meaningful progress
- The distinction between escalating immediately when a customer explicitly demands it versus offering to resolve when the issue is straightforward
- Why sentiment-based escalation and self-reported confidence scores are unreliable proxies for actual case complexity
- How multiple customer matches require clarification (requesting additional identifiers) rather than heuristic selection
Skills in:
- Adding explicit escalation criteria with few-shot examples to the system prompt demonstrating when to escalate versus resolve autonomously
- Honoring explicit customer requests for human agents immediately without first attempting investigation
- Acknowledging frustration while offering resolution when the issue is within the agent's capability, escalating only if the customer reiterates their preference
- Escalating when policy is ambiguous or silent on the customer's specific request (e.g., competitor price matching when policy only addresses own-site adjustments)
- Instructing the agent to ask for additional identifiers when tool results return multiple matches, rather than selecting based on heuristics
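The escalation triggers above can be summarized as a tiny decision sketch. The `signals` dict and its keys are hypothetical; in practice these criteria live in the system prompt with few-shot examples, not in application code, but the control logic they describe is the same.

```python
# Sketch of the appropriate escalation triggers. Note that sentiment and
# self-reported confidence are deliberately absent: both are unreliable
# proxies for actual case complexity.

def should_escalate(signals: dict) -> tuple[bool, str]:
    if signals.get("customer_requested_human"):
        # Honor explicit requests immediately, without first investigating.
        return True, "explicit human request: escalate immediately"
    if signals.get("policy_gap"):
        # Policy is ambiguous or silent on the specific request.
        return True, "policy is silent or ambiguous on this request"
    if signals.get("no_meaningful_progress"):
        return True, "unable to make meaningful progress"
    # Otherwise: acknowledge frustration, offer resolution, and escalate
    # only if the customer reiterates their preference.
    return False, "offer to resolve; escalate only if the customer reiterates"
```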
Subdomain 5.3: Implement error propagation strategies across multi-agent systems
Knowledge of:
- Structured error context (failure type, attempted query, partial results, alternative approaches) as enabling intelligent coordinator recovery decisions
- The distinction between access failures (timeouts needing retry decisions) and valid empty results (successful queries with no matches)
- Why generic error statuses ("search unavailable") hide valuable context from the coordinator
- Why silently suppressing errors (returning empty results as success) or terminating entire workflows on single failures are both anti-patterns
Skills in:
- Returning structured error context including failure type, what was attempted, partial results, and potential alternatives to enable coordinator recovery
- Distinguishing access failures from valid empty results in error reporting so the coordinator can make appropriate decisions
- Having subagents implement local recovery for transient failures and only propagate errors they cannot resolve, including what was attempted and partial results
- Structuring synthesis output with coverage annotations indicating which findings are well-supported versus which topic areas have gaps due to unavailable sources
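A structured error payload like the one described above might look like this. The schema (`status`, `failure_type`, `partial_results`, `alternatives`) and the triage labels are illustrative assumptions, not a defined Anthropic interface.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SubagentResult:
    """Structured result a subagent returns to the coordinator, instead of
    a generic 'search unavailable' status that hides context."""
    status: str                            # "ok", "empty", or "error"
    attempted_query: str                   # what was attempted
    failure_type: Optional[str] = None     # e.g. "timeout"; None for non-errors
    partial_results: list = field(default_factory=list)
    alternatives: list = field(default_factory=list)

def classify(result: SubagentResult) -> str:
    """Coordinator-side triage: an access failure (timeout) may warrant a
    retry decision, while a valid empty result is a successful query with
    no matches and should not be treated as a failure."""
    if result.status == "error" and result.failure_type == "timeout":
        return "retry-or-reroute"
    if result.status == "empty":
        return "accept-no-matches"
    return "use-results"
```

Subagents would attempt local recovery for transient failures first, and construct a `SubagentResult` with `status="error"` only for errors they cannot resolve.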
Subdomain 5.4: Manage context effectively in large codebase exploration
Knowledge of:
- Context degradation in extended sessions: models start giving inconsistent answers and referencing "typical patterns" rather than specific classes discovered earlier
- The role of scratchpad files for persisting key findings across context boundaries
- Subagent delegation for isolating verbose exploration output while the main agent coordinates high-level understanding
- Structured state persistence for crash recovery: each agent exports state to a known location, and the coordinator loads a manifest on resume
Skills in:
- Spawning subagents to investigate specific questions (e.g., "find all test files," "trace refund flow dependencies") while the main agent preserves high-level coordination
- Having agents maintain scratchpad files recording key findings, referencing them for subsequent questions to counteract context degradation
- Summarizing key findings from one exploration phase before spawning subagents for the next phase, injecting summaries into initial context
- Designing crash recovery using structured agent state exports (manifests) that the coordinator loads on resume and injects into agent prompts
- Using /compact to reduce context usage during extended exploration sessions when context fills with verbose discovery output
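The manifest-based crash recovery pattern can be sketched as follows. File names, the manifest schema, and the state contents are all hypothetical; the point is that each agent exports state to a known location and the coordinator can reconstruct everything on resume.

```python
# Sketch of structured state persistence for crash recovery: agents export
# state to a known directory, a manifest lists them, and the coordinator
# reloads all states on resume to inject into agent prompts.
import json
import os
import tempfile

def export_agent_state(state_dir: str, agent_id: str, state: dict) -> str:
    """Each agent writes its key findings to a known location."""
    path = os.path.join(state_dir, f"{agent_id}.json")
    with open(path, "w") as f:
        json.dump(state, f)
    return path

def write_manifest(state_dir: str, agent_ids: list) -> None:
    """Record which agent states exist, so resume knows what to load."""
    with open(os.path.join(state_dir, "manifest.json"), "w") as f:
        json.dump({"agents": agent_ids}, f)

def resume(state_dir: str) -> dict:
    """Coordinator loads the manifest and every exported agent state,
    ready to inject into each agent's prompt after a crash."""
    with open(os.path.join(state_dir, "manifest.json")) as f:
        manifest = json.load(f)
    states = {}
    for agent_id in manifest["agents"]:
        with open(os.path.join(state_dir, f"{agent_id}.json")) as f:
            states[agent_id] = json.load(f)
    return states
```

The same files double as scratchpads: an agent can re-read its own exported findings mid-session to counteract context degradation.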
Subdomain 5.5: Design human review workflows and confidence calibration
Knowledge of:
- The risk that aggregate accuracy metrics (e.g., 97% overall) may mask poor performance on specific document types or fields
- Stratified random sampling for measuring error rates in high-confidence extractions and detecting novel error patterns
- Field-level confidence scores calibrated using labeled validation sets for routing review attention
- The importance of validating accuracy by document type and field segment before automating high-confidence extractions
Skills in:
- Implementing stratified random sampling of high-confidence extractions for ongoing error rate measurement and novel pattern detection
- Analyzing accuracy by document type and field to verify consistent performance across all segments before reducing human review
- Having models output field-level confidence scores, then calibrating review thresholds using labeled validation sets
- Routing extractions with low model confidence or ambiguous/contradictory source documents to human review, prioritizing limited reviewer capacity
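Confidence-based routing plus an ongoing audit sample might look like this sketch. The threshold and audit rate are placeholder numbers (in practice the review threshold would be calibrated on a labeled validation set), and the extraction schema is hypothetical.

```python
# Sketch of review routing: low-confidence or contradictory extractions go
# to human review; a random slice of high-confidence extractions is still
# audited to measure error rates and catch novel error patterns.
import random

REVIEW_THRESHOLD = 0.85   # placeholder; calibrate on a labeled validation set
AUDIT_RATE = 0.05         # fraction of high-confidence items still sampled

def route(extraction: dict, rng: random.Random) -> str:
    # Gate on the weakest field-level confidence, not the average, so one
    # unreliable field cannot hide behind strong ones.
    conf = min(extraction["field_confidences"].values())
    if conf < REVIEW_THRESHOLD or extraction.get("source_contradictory"):
        return "human_review"
    # Stratified random sampling of high-confidence extractions.
    return "audit_sample" if rng.random() < AUDIT_RATE else "auto_accept"
```

Auditing per segment (document type, field) rather than in aggregate is what keeps a 97% overall accuracy from masking a weak segment.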
Subdomain 5.6: Preserve information provenance and handle uncertainty in multi-source synthesis
Knowledge of:
- How source attribution is lost during summarization steps when findings are compressed without preserving claim-source mappings
- The importance of structured claim-source mappings that the synthesis agent must preserve and merge when combining findings
- How to handle conflicting statistics from credible sources: annotating conflicts with source attribution rather than arbitrarily selecting one value
- Temporal data: requiring publication/collection dates in structured outputs to prevent temporal differences from being misinterpreted as contradictions
Skills in:
- Requiring subagents to output structured claim-source mappings (source URLs, document names, relevant excerpts) that downstream agents preserve through synthesis
- Structuring reports with explicit sections distinguishing well-established findings from contested ones, preserving original source characterizations and methodological context
- Completing document analysis with conflicting values included and explicitly annotated, letting the coordinator decide how to reconcile before passing to synthesis
- Requiring subagents to include publication or data collection dates in structured outputs to enable correct temporal interpretation
- Rendering different content types appropriately in synthesis outputs—financial data as tables, news as prose, technical findings as structured lists—rather than converting everything to a uniform format
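The claim-source mapping and conflict-annotation skills above can be sketched as a merge step. The finding schema (`claim`, `value`, `source`, `date`) is an illustrative assumption; the key behavior is that conflicts are surfaced with full attribution and dates, never resolved by arbitrarily picking one value.

```python
# Sketch of merging subagent findings while preserving provenance.

def merge_claims(findings: list) -> dict:
    """Group findings by claim, keeping every value with its source and
    publication/collection date, so attribution survives synthesis."""
    merged = {}
    for f in findings:
        merged.setdefault(f["claim"], []).append(
            {"value": f["value"], "source": f["source"], "date": f["date"]})
    return merged

def conflicts(merged: dict) -> dict:
    """Claims where credible sources disagree. These get annotated with
    source attribution in the report's 'contested findings' section; the
    dates let a reader see when a 'conflict' is really a temporal change."""
    return {claim: entries for claim, entries in merged.items()
            if len({e["value"] for e in entries}) > 1}
```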
Techniques & products