Agents & MCP Servers
The frontier. Context engineering, multi-agent orchestration, RAG that works, MCP authoring, evals, and production guardrails.
Who this is for
Target audience
- Senior engineers building AI-powered features and internal tools
- Platform engineers designing agent infrastructure for their org
- Tech leads evaluating multi-agent architectures for production use
Prerequisites
- Solid programming skills in Python and/or TypeScript
- Completion of Track 1 or equivalent Claude Code fluency
- Familiarity with APIs, deployment pipelines, and production systems
What you'll learn
10 lessons, each built around the same structure: show, tell, do, break it, check. No lesson has more than 15 minutes of passive content before a hands-on moment.
- 1. Context engineering fundamentals
Rules, procedures, checks, isolation, and the PEV loop. How to design context that makes agents reliable, not just impressive.
- 2. Skills, subagents, and hooks
Anatomy of a Skill, writing descriptions the matcher loves, subagent orchestration, pre/post hooks for safety and quality.
- 3. MCP authoring from scratch
Build a custom MCP server: tool definitions, resource handling, error patterns, permissions, testing, and distribution.
- 4. Multi-agent orchestration: LangGraph
Graph-based agent orchestration. Nodes, edges, state management, human-in-the-loop breakpoints, and error recovery.
- 5. Multi-agent orchestration: CrewAI and swarms
Role-based (CrewAI), swarm/shared-conversation (AutoGen, OpenAI Swarm), hierarchical, and heterogeneous model routing.
- 6. RAG that works
Hybrid search, rerankers, HyDE, GraphRAG, memory architectures. The gap between demo RAG and production RAG.
- 7. Production evals and guardrails
Eval frameworks, automated quality gates, prompt injection defense, hallucination detection, cost optimization.
- 8. Observability and cost
Tracing agent executions, token-level cost attribution, latency budgets, and the dashboards your team actually needs.
- 9. Fine-tuning and distillation
When fine-tuning beats prompting, distillation patterns, data preparation, evaluation methodology.
- 10. Capstone: build a production agent
Design, build, evaluate, and deploy a multi-tool agent that solves a real workflow. Graded on reliability, cost, and safety.
What you'll build
Every track includes graded hands-on labs on realistic codebases. No toy examples.
Build a custom MCP server
Author an MCP server from scratch that integrates with an external API, complete with tool definitions, error handling, permissions, and a test suite.
Multi-agent pipeline
Build a three-agent pipeline using LangGraph (a research agent, an analysis agent, and a report-writing agent) with human-in-the-loop approval gates.
Production RAG system
Build a retrieval system with hybrid search, reranking, and evaluation. Measure retrieval quality and compare it against a baseline.
Sample lesson preview
Context engineering: the PEV loop
- What context engineering actually is (and why 'just write a better prompt' is not it)
- The Plan-Execute-Verify loop: how production agents maintain reliability across long tasks
- Rules vs. procedures vs. checks: when each one matters and how they compose
- Hands-on: redesign a brittle agent prompt into a structured context engineering setup
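As a rough illustration of the Plan-Execute-Verify idea, here is a minimal Python sketch. It is one possible reading of the loop, not the course's implementation: `run_pev`, `StepResult`, and the stub `plan`, `execute`, and `verify` callables are all hypothetical names, and in a real agent they would wrap model or tool calls. The loop plans once, verifies each step's output before moving on, retries a failed step a bounded number of times, and stops instead of building on unverified work.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str
    output: str
    ok: bool
    attempts: int


def run_pev(
    task: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    verify: Callable[[str, str], bool],
    max_attempts: int = 2,
) -> List[StepResult]:
    """Plan once, then execute and verify each step, retrying failed steps."""
    results: List[StepResult] = []
    for step in plan(task):
        output, ok, attempts = "", False, 0
        while attempts < max_attempts and not ok:
            attempts += 1
            output = execute(step)      # Execute
            ok = verify(step, output)   # Verify
        results.append(StepResult(step, output, ok, attempts))
        if not ok:
            break  # stop rather than build on an unverified step
    return results


# Hypothetical stubs standing in for model/tool calls.
def plan(task: str) -> List[str]:
    return [f"{task}: draft", f"{task}: review"]


call_count = {"n": 0}


def execute(step: str) -> str:
    call_count["n"] += 1
    # Simulate a flaky first attempt that returns nothing.
    return "" if call_count["n"] == 1 else step.upper()


def verify(step: str, output: str) -> bool:
    return output != ""


results = run_pev("summarize", plan, execute, verify)
for r in results:
    print(r.step, r.ok, r.attempts)
```

In this sketch the verifier is a separate callable from the executor, so a check can be as cheap as a schema validation or as involved as a second model pass without changing the loop itself.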
Certified Advanced AI Engineer
Complete this track to earn your CAAE badge. Certifications are earned through a written exam plus a hands-on practical, not just quiz scores. Badges are exportable as Open Badges 2.0 and verifiable by URL.
Badges are valid for 18 months, renewable with a short refresh assessment.
Start your team's training
Per-seat annual plans start at $300/user. Enterprise pricing is available for teams of more than 200.
Not sure where to start?
Take our free 3-minute AI maturity assessment and get a personalized recommendation for which tracks fit your team.