
The Practice

Where AI knowledge becomes operational capability

Context engineering, agent orchestration, skills fluency, and curated learning paths – the disciplines and resources that create real-world AI value.

The discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

In 30 Seconds

Most AI failures aren't model problems. They're context problems. Context engineering is the discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

This is where strategy becomes capability. Without the right context architecture, even the best AI strategy remains a document on a shelf.

Our core expertise: This is Practice-tier work—designing and implementing the context systems that make AI useful in your specific environment. Not one-off prompts, but persistent capability.

What We've Learned in Practice

Through implementing context systems across sustainability consulting, trading operations, and client delivery, we've validated several patterns:

Compression Compounds

Each context reduction makes the next easier. Start aggressive, refine based on what breaks.

Memory Health Is Maintenance

Context architecture isn't a one-time setup. Weekly maintenance protocols prevent gradual degradation.

Navigation Beats Structure

Folder hierarchies organise files. Topic-based navigation paths tell AI what to load and when. Both are needed.

Session Continuity Is Hard

The handoff between work sessions is the least discussed problem in AI—and often the most impactful to solve.

Results from our implementations: 3× faster session startup, 40% reduction in token costs, near-zero context drift.

The Shift from Prompts to Systems

Prompt engineering asks: “How do I phrase this question?”

Context engineering asks: “What does AI need to know to answer well?”

Anthropic defines it as “designing dynamic systems that provide AI models with the right information at the right time.” It's the evolution from crafting individual queries to architecting information environments.

| | Prompt Engineering | Context Engineering |
|---|---|---|
| Focus | The question | The knowledge |
| Scope | Single query | Entire system |
| Approach | Craft better prompts | Design information flow |
| Result | Better answers | Consistent capability |

The Context Problem

More context doesn't mean better performance.

Research shows input length alone can reduce AI accuracy by 14-85% – even when all information is relevant.

Lost in the Middle

Models favour information at the start and end of context, missing what's in between.

Context Rot

Quality degrades gradually as context accumulates and ages without maintenance.

Signal Dilution

Important information drowns in noise when everything is loaded indiscriminately.

Many teams dump everything into context. That's like answering every question by reading the encyclopaedia aloud.

The issue is not just volume; it's missing decision lineage. Context graphs help by recording approvals and exceptions so AI can retrieve the relevant rationale without loading everything.

THE INDUSTRY'S UNSOLVED PROBLEM

Context Drift

Academic research (2025-2026) identifies context drift—the gradual degradation of context quality across sessions and time—as the central unsolved challenge in AI memory systems. Most tools handle single-session context well. Multi-session, multi-day continuity remains hard. This is where operational rhythm systems (session handoffs, weekly coordination) become critical.

Four Strategies for Managing Context

Based on Anthropic's framework for effective AI systems

1. Write

Persist externally

Store information outside the context window for later retrieval. Files, databases, knowledge bases – anything that persists beyond the session.

2. Select

Load only what's relevant

Retrieve context based on the task at hand, not everything available. Dynamic retrieval, semantic search, just-in-time loading.

3. Compress

Summarise, don't accumulate

Keep context lean through intelligent summarisation. Archive old content, preserve decisions, trim the unnecessary.

4. Isolate

Separate contexts for separate concerns

Don't let different workstreams pollute each other. Multi-agent architectures, session boundaries, role-specific loading.
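As a rough illustration, the four strategies can be sketched as operations on a simple context store. This is a minimal sketch under our own naming, not Anthropic's implementation; a real system would persist to files or a database rather than an in-memory dict.

```python
from dataclasses import dataclass, field

@dataclass
class ContextStore:
    """Minimal sketch of the four context strategies."""
    memory: dict = field(default_factory=dict)   # live, loadable context
    archive: dict = field(default_factory=dict)  # compressed-out detail

    def write(self, key: str, note: str) -> None:
        # 1. Write: persist outside the live context window
        self.memory[key] = note

    def select(self, topic: str) -> dict:
        # 2. Select: load only entries relevant to the task at hand
        return {k: v for k, v in self.memory.items() if topic in k}

    def compress(self, key: str, summary: str) -> None:
        # 3. Compress: move the detail to the archive, keep a lean summary
        self.archive[key] = self.memory.get(key)
        self.memory[key] = summary

    def isolate(self, prefix: str) -> dict:
        # 4. Isolate: a view scoped to a single workstream
        return {k: v for k, v in self.memory.items() if k.startswith(prefix)}
```

In use, `compress` preserves the original in the archive (old content is archived, not deleted), and `isolate` keeps one client's context from leaking into another's.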

The Temporal Dimension

The solution to context drift

Most context engineering focuses on structure—how information is organised. We've found the rhythm matters just as much. This is how you solve context drift: not just better architecture, but better cadence.

Session Handoffs

How does context pass between work sessions? What gets carried forward, what gets compressed, what gets archived? Explicit handoff protocols prevent the “starting from scratch” problem.

Weekly Coordination

How do strategic priorities flow into daily work? How does work roll up into weekly synthesis? Coordination bridges connect strategy to execution without context overload.

Memory Maintenance

Context degrades over time. Scheduled compression, archiving cadences, and health checks keep context fresh. Without maintenance, even good architecture accumulates noise.

Why this matters: Academic research identifies context drift as the central unsolved challenge in AI memory systems. Most tools handle single sessions well. Multi-session, multi-day continuity is where operational rhythm becomes critical.

The architecture is the skeleton. The rhythm is the heartbeat.

Tiered Context Architecture

We use a budget-based approach to context layers. Each tier has a token budget and update frequency—this prevents context bloat while ensuring AI has what it needs.

| Tier | Purpose | Token Budget | Load Frequency |
|---|---|---|---|
| Tier 0 | Compressed state | ~300 tokens | Always |
| Tier 1 | Active context + navigation | ~1,000 tokens | Session start |
| Tier 2 | Domain knowledge | On-demand | When needed |
| Tier 3 | Archive | Rarely | Historical only |

Key insight: Most teams overload Tier 0-1 and underuse Tier 2-3. The result is context bloat, slower reasoning, and higher costs.
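A session loader for this tiering might look like the sketch below. The MEMORY.md and CONTEXT.md file names follow the conventions described on this page; the function name and structure are our own illustration, not a specific tool's API.

```python
from pathlib import Path

# Tier 0-1 are passive (loaded every session); Tier 2 files are
# resolved on demand, typically via navigation paths in Tier 1.
PASSIVE_FILES = ["MEMORY.md", "CONTEXT.md"]  # Tier 0, Tier 1

def build_session_context(root: Path, tier2_files: list[str] = ()) -> str:
    """Concatenate the passive tiers, then any Tier 2 files the
    current task requires. Missing files are skipped silently."""
    names = [*PASSIVE_FILES, *tier2_files]
    parts = [
        (root / name).read_text()
        for name in names
        if (root / name).exists()
    ]
    return "\n\n---\n\n".join(parts)
```

The design choice to skip missing files means a project can start with only a Tier 0 file and grow into the full architecture.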

THE LOAD PATTERN

Tier 0 (always loaded) → Tier 1 (session start) → Tier 2 (on-demand) → Tier 3 (when needed)

Context Graphs: Decision Lineage

Systems of record capture what happened. Context graphs capture why.

When AI needs to make a decision, it shouldn't just know the rule—it should know the precedents, exceptions, and reasoning that shaped it.

Approvals & Exceptions

Why was this approved? What precedent does it set? Context graphs make the reasoning retrievable.

Policy Evolution

How did we get here? What changed and why? Decision traces show the path, not just the destination.

Audit Trails

What informed this decision? Who signed off? Context graphs support governance and compliance.

We design context graphs that turn scattered decisions into searchable precedent—making institutional knowledge available to AI without loading everything.

Context as Competitive Moat

STRATEGIC • NEW — MARCH 2026

Google's internal AI team made a revealing observation in early 2026: the “sum totality of an organisation's documents” creates capabilities that no AI lab can replicate from the outside. Your organisational data, decision history, and institutional knowledge are not a problem to manage — they are the competitive advantage.

Why Labs Can't Compete

Foundation models are general-purpose. Your context — client history, policy decisions, domain-specific reasoning — is what transforms general AI into your AI. No amount of training data replicates what you've accumulated through operations.

Context Engineering = Moat Building

This reframes context engineering from a technical practice to a strategic investment. Every well-structured knowledge base, every maintained decision trace, every curated context layer is an asset your competitors don't have.

The implication: Organisations investing in context architecture today aren't just improving AI performance — they're building durable competitive advantage that deepens with every interaction.

Who Benefits from Context Engineering?

Individuals

  • Consistent AI results across sessions
  • Build on previous work, not from scratch
  • Reduce time re-explaining context

Teams

  • Reduce hallucinations through better knowledge
  • Enable handoffs between human and AI
  • Shared context across team members

Organisations

  • Multi-agent coordination without pollution
  • Governance and compliance controls
  • Scalable knowledge management

Our Approach Is Informed By

Anthropic's context engineering guidance

Karpathy's “RAM management” framing

Vercel's AGENTS.md evaluation research

Mei & Yao survey (1,400+ academic papers)

MemAgents architecture research

Validated through our own implementations

In 30 Seconds

AI doesn't remember. Every conversation starts fresh. Every session begins from zero. The brilliant assistant who helped you yesterday has no idea who you are today.

This isn't a bug – it's how LLMs work. But it's also why most AI implementations deliver inconsistent value. The forgetting problem is solvable.

Memory health is the practice of designing systems that give AI what it needs to know – when it needs to know it – without drowning in irrelevant information.

Why This Matters

For Individuals

Without memory, you re-explain context every session. The same background, the same preferences, the same project details – again and again.

Time saved by AI gets consumed by context-setting. The productivity promise erodes with every fresh start.

For Teams

When AI forgets, knowledge doesn't compound. Insights from one session don't inform the next. Each team member starts from scratch.

The result: inconsistent outputs, duplicated effort, and AI that never gets better at understanding your work.

The compounding cost: Every time AI forgets, you lose the value of everything it learned. Good memory design means knowledge builds over time instead of resetting to zero.

The Forgetting Problem

Understanding why AI forgets is the first step to fixing it

Context Windows Have Limits

Every AI model has a finite “context window” – the amount of text it can consider at once. When the window fills, old information gets pushed out.

The Illusion

Modern models have large context windows (100K+ tokens). This feels like plenty of memory.

The Reality

Long contexts degrade performance. Research shows accuracy drops 14-85% as context length increases – even with relevant information.

No Native Persistence

LLMs have no built-in way to store information between sessions. Unlike databases or file systems, they don't write to permanent storage. Each conversation exists in isolation.

What Users Expect

“You remember that project we discussed last week, right?”

What Actually Happens

The model has no access to previous conversations. Last week doesn't exist.

Lost in the Middle

Even within a single context window, attention isn't uniform. Models favour information at the beginning and end, often missing what's in the middle.

The Pattern

Critical information buried in the middle of long conversations gets less attention from the model.

The Impact

Important context can be effectively “forgotten” even while technically still in the window.

Key insight: The “memory problem” isn't a flaw that model improvements will fix. It's a fundamental architectural property that requires system-level solutions.

Symptoms of Poor Memory Health

Recognise these patterns? They're signs your AI system needs memory architecture.

Repetitive Context-Setting

You explain the same background information every session. “I work at X company, we do Y, the project is about Z...”

Inconsistent Outputs

The same question yields different answers in different sessions. No learning from previous interactions carries forward.

Contradictory Advice

AI suggests approaches that conflict with decisions made in previous sessions. It doesn't know what was already decided.

Context Rot

Long conversations degrade. The AI starts referencing outdated information or losing track of earlier agreements.

Knowledge Silos

Insights from one conversation can't be applied elsewhere. Each session is an island of learning that sinks after use.

The Eternal Beginner

Despite months of use, AI still asks basic questions. It never develops understanding of your domain or preferences.

These aren't AI limitations. They're architecture gaps. Every symptom has a solution – if you design for memory.

The Four Memory Strategies

Based on Anthropic's context engineering framework

1. Write: Persist Externally

Since AI has no native memory, create external storage. Files, databases, knowledge bases – anything that persists beyond the session.

Session Logs

Capture key decisions and outcomes from each conversation

Knowledge Files

Curated information that AI should always know

State Documents

Living files that track current project status

2. Select: Load Only What's Relevant

Don't load everything every time. Retrieve context based on the task at hand. Just-in-time loading beats all-the-time loading.

Dynamic Retrieval

Fetch relevant documents based on the current query

Semantic Search

Find information by meaning, not just keywords

Role-Based Loading

Different tasks load different context packages

3. Compress: Summarise, Don't Accumulate

Keep context lean. Replace long conversation history with concise summaries. Archive old content, preserve decisions, trim the unnecessary.

Conversation Summaries

Replace 50 messages with 5 key takeaways

Decision Logs

Keep what was decided, not how it was discussed

Context Pruning

Regular maintenance to remove outdated information

4. Isolate: Separate Contexts for Separate Concerns

Don't let different workstreams pollute each other. Use boundaries to keep contexts clean and focused.

Session Boundaries

Clear starts and ends for different work types

Project Isolation

Client A's context doesn't leak into Client B

Multi-Agent Design

Different agents with different specialised contexts

Layered Memory Architecture

Organise memory by stability and scope

Effective AI memory isn't a single file – it's a layered architecture. Higher layers are stable and rarely change. Lower layers are ephemeral and session-specific.

L1: SYSTEM INSTRUCTIONS — how the AI should behave, safety rails, core capabilities
L2: AGENT IDENTITY — who is this AI? Expertise, tone, protocols
L3: STRATEGIC MEMORY — key decisions, priorities, what's been tried
L4: KNOWLEDGE ARCHITECTURE — where things are, how to navigate, reference material
L5: ENTITY CONTEXT — client/project specific: history, preferences, current state
L6: SESSION CONTEXT — current conversation, working memory, live state

The Design Principle

Load stable layers automatically. Load ephemeral layers dynamically. Don't burden every session with information that rarely changes.

The Maintenance Principle

Update each layer at appropriate intervals. Strategic memory weekly. Session context every message. Match maintenance rhythm to layer stability.

Memory Patterns in Practice

Common patterns for implementing healthy AI memory

The Handoff Document

A single file that captures: what happened, what was decided, what's next. Updated at session end, loaded at session start.

Best for:

Individuals working across multiple sessions on the same project
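As an illustration, a handoff file might look like the sketch below. The headings and contents are hypothetical, not a prescribed format; the three sections mirror the capture points above (what happened, what was decided, what's next).

```markdown
# Session Handoff — 2026-03-14

## What happened
- Drafted Q2 pricing proposal; client pushed back on tier 3 scope

## What was decided
- Keep three tiers; rename "Enterprise" to "Scale"

## What's next
- Revise proposal section 2
- Confirm renewal date with ops
```

Updated at session end, loaded at session start, this single file carries continuity without accumulating full conversation history.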

The Project Bible

A comprehensive reference document containing all project context. Loaded automatically when working on that project.

Best for:

Complex projects with many decisions and constraints to remember

The Skills Library

Modular knowledge files that can be loaded on demand. Different skills for different tasks, loaded as needed.

Best for:

Teams with diverse tasks requiring different domain expertise

The Weekly Bridge

A rhythm-based summary that synthesises the week's sessions. Carries forward key context without accumulating endless history.

Best for:

Ongoing operations with continuous but evolving context

Memory Requires Maintenance

Without Maintenance

  • Context files grow stale
  • Outdated information contradicts current reality
  • Memory becomes noise rather than signal
  • AI references things that are no longer true
  • The system degrades back to forgetfulness

With Maintenance

  • Context stays current and accurate
  • Old information gets archived, not deleted
  • Each session starts with relevant, fresh context
  • Knowledge compounds reliably over time
  • The system gets smarter, not staler

Memory health is a practice, not a one-time setup.
The rhythm matters as much as the architecture.

How We Help

Design, implement, and maintain AI memory systems

Memory Architecture

Design the right layer structure for your context. What belongs where, what loads when, how it all connects.

Implementation

Build the files, set up the retrieval, establish the workflows. From individual setups to team-scale systems.

Maintenance Protocols

Define the rhythms and processes that keep memory healthy. What gets updated when, how staleness is prevented.

In 30 Seconds

There are two ways to give AI the information it needs: load it upfront (passive context) or fetch it when needed (on-demand retrieval). Most teams assume retrieval is smarter. The research says otherwise.

Vercel's Next.js team ran rigorous evaluations comparing these approaches. The result: passive context achieved 100% accuracy where on-demand retrieval achieved 53%.

The insight: When information is always present, there's no decision point that can fail. No retrieval logic to get wrong. No ordering issues. Just consistent availability.

The Research

Vercel AGENTS.md Evaluation (January 2026)

Vercel's Next.js team tested how AI coding agents perform with different context configurations. They compared baseline performance against various approaches for providing project-specific information.

| Result | Approach | Configuration |
|---|---|---|
| 53% | No documentation | Baseline |
| 53% | On-demand retrieval | Skills system |
| 79% | Retrieval + instructions | Enhanced skills |
| 100% | Passive context | AGENTS.md file |
The striking finding: On-demand retrieval performed no better than having no documentation at all. The retrieval system existed, but it didn't help. Only when information was passively present did performance improve.

Why Passive Context Wins

Three fundamental advantages over on-demand retrieval

1. No Decision Point

On-demand retrieval requires a decision: “What information do I need for this query?” That decision can be wrong. The model might not realise it needs certain context.

With passive context, there's no decision to get wrong. The information is already there. Every time.

2. Consistent Availability

Retrieval systems are probabilistic. They might return relevant documents 80% of the time, or 60%, or 40%. The quality varies by query, by phrasing, by the state of the vector database.

Passive context is deterministic. The same information is present on every turn. No variance. No “sometimes it works” frustration.

3. No Ordering Issues

With retrieval, critical information might arrive too late in the reasoning process. The model starts generating before realising it needs more context.

Passive context is present from the first token. The model reasons with full information from the start.

The Tradeoff: Why Not Load Everything?

If passive context is better, why not just load all available information? Because context windows have effective limits that are smaller than their technical limits.

The Memento Limit

Research suggests effective reasoning capacity is around 100K tokens, even when context windows are technically larger. Beyond this, performance degrades.

A 200K context window doesn't give you 200K of useful reasoning space. It gives you 100K of effective space with increasing noise.

Lost in the Middle

Models pay more attention to the beginning and end of context. Information in the middle gets weighted less, even when it's critical.

More context can mean important information gets buried where the model is less likely to use it effectively.

The goal isn't maximum context. It's the right context. Passive for what matters most. Retrieval for everything else.

Tiered Context Architecture

The pattern that balances passive reliability with retrieval flexibility

| Tier | Type | Token Budget | What Goes Here |
|---|---|---|---|
| Tier 0 | Passive | ~300 tokens | Compressed state: current status, key metrics, active items |
| Tier 1 | Passive | ~1,000 tokens | Active context: navigation, recent decisions, current focus |
| Tier 2 | On-demand | Variable | Domain knowledge: loaded when topic requires it |
| Tier 3 | Retrieval | As needed | Archive: historical, rarely accessed |

Passive Foundation

Tier 0 and Tier 1 are always loaded. This is your passive context. Keep it lean (~1,300 tokens total) but ensure it contains everything AI needs to orient itself and navigate effectively.

Retrieval for Depth

Tier 2 and Tier 3 use selective retrieval. Navigation paths in Tier 1 point to relevant Tier 2 content. This gives you depth without bloat.

Implementation Patterns

Practical approaches for passive context systems

The Project File Pattern

A single file (CLAUDE.md, AGENTS.md, or similar) at project root containing everything AI needs to work effectively in that context.

Typical contents:

  • Project description and purpose
  • Key decisions and constraints
  • Build/test commands
  • Code conventions
  • Current focus areas
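A minimal project file covering those contents might look like this. Every detail below (the project description, commands, conventions) is a hypothetical placeholder, not a recommended configuration:

```markdown
# AGENTS.md

## Project
Internal pricing API — serves quotes to the client portal.

## Key decisions
- Postgres over SQLite (concurrent writers); see docs/adr/003.md

## Commands
- Build: `npm run build`
- Test: `npm test`

## Conventions
- TypeScript strict mode; no default exports

## Current focus
- Migrating the quote cache to Redis
```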

The MEMORY + CONTEXT Pattern

Two complementary files: MEMORY.md for compressed state (~300 tokens), CONTEXT.md for active context and navigation (~1,000 tokens).

The split:

  • MEMORY: “Where are we?” (status, metrics, active items)
  • CONTEXT: “How do I work here?” (navigation, decisions, focus)

The Navigation Hub Pattern

Passive context includes a navigation table: “When you need X, read Y.” This creates predictable paths from topics to relevant files.

Example:

| Topic | Read |
|---|---|
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |

The Token Budget Pattern

Explicit limits on each passive context file. When a file exceeds its budget, compress it. Move detail to Tier 2 and keep pointers in Tier 0-1.

Enforcement:

  • Tier 0: Max 300 tokens (hard limit)
  • Tier 1: Max 1,000 tokens (soft limit)
  • Review weekly, compress as needed

When to Use What

Use Passive Context For

  • Identity and behaviour rules (always needed)
  • Current project state (changes, but always relevant)
  • Navigation pointers (how to find deeper content)
  • Recent decisions (context that's frequently referenced)
  • Session handoff state (what to pick up from last time)

Use Retrieval For

  • Large knowledge bases (too big for passive loading)
  • Historical archives (rarely needed)
  • Domain-specific content (only relevant for certain queries)
  • Reference documentation (detailed specs, APIs)
  • Content that varies by user/session

The combination is powerful: Passive foundation + selective retrieval. Reliability where it matters most. Flexibility where you need depth.

Common Mistakes

Mistake 1: No Passive Context at All

Relying entirely on retrieval. Every query starts with a search. Result: inconsistent baseline, variance in quality, 53% performance.

Fix: Establish a passive foundation, even if it's just 500 tokens.

Mistake 2: Too Much Passive Context

Loading everything passively to avoid retrieval complexity. Result: bloated context, lost-in-the-middle problems, degraded reasoning.

Fix: Enforce token budgets. Compress aggressively. Move detail to Tier 2.

Mistake 3: Stale Passive Context

Setting up passive context once and never updating it. Result: AI references outdated information, makes contradictory decisions.

Fix: Weekly review cadence. Update Tier 0 after every significant change.

Mistake 4: No Navigation to Tier 2

Passive context that doesn't tell AI where to find deeper information. Result: AI either hallucinates or asks repeatedly for guidance.

Fix: Include navigation paths. “When you need X, read Y.”

Getting Started

1. Create a Tier 0 file

   Start with ~300 tokens of compressed state. Current status, key metrics, active items. Name it MEMORY.md or include it at the top of your main context file.

2. Add navigation to Tier 1

   Create a CONTEXT.md with ~1,000 tokens. Include a navigation table: “When topic X comes up, read file Y.” This creates predictable paths to deeper content.

3. Configure automatic loading

   Ensure your AI tool loads Tier 0 and Tier 1 at session start. For Claude Code, this means CLAUDE.md. For other tools, AGENTS.md or equivalent.

4. Establish maintenance rhythm

   Weekly: review passive context for staleness. After significant changes: update Tier 0. Monthly: audit token budgets and compress as needed.

From Knowledge to Capability

Context engineering, agent orchestration, skills architecture, fluency development, and workforce capability – these practices determine whether AI delivers consistent value or inconsistent experiments. If any of these feel uncertain, we can help you get them right.