What is AI advisory and why do organisations need it?

AI advisory helps organisations navigate the complex AI landscape to find real value beyond the hype. Most AI projects fail not because of technology, but because of context problems, coordination gaps, and capability issues. AI advisory addresses these practical blockers to help you build AI capabilities that actually work.

What is context engineering in AI?

Context engineering is the discipline of designing what AI knows, when it knows it, and how that knowledge is structured. It's about giving AI the right information at the right time. Most AI failures are context failures - the model is capable, but it doesn't have the information it needs to help you effectively.

What's the difference between AI tools and AI fluency?

AI tools are software you can access. AI fluency is the ability to think with AI - to know when and how to use it effectively, to structure problems for AI collaboration, and to evaluate AI outputs critically. Many teams have tool access but struggle to get consistent value because they lack genuine fluency.

How do skills-based AI systems differ from traditional AI agents?

Traditional multi-agent systems create separate AI agents for different tasks. Skills-based systems give one AI agent multiple skills - packaged expertise it can invoke as needed. Skills compound: every skill you create is available to everyone, forever. This approach is simpler to maintain and scales organisational intelligence.

Which AI model should my organisation use - Claude, GPT, or Gemini?

The best model depends on your specific use case, data sensitivity requirements, and integration needs. Claude excels at nuanced reasoning and safety. GPT has the largest ecosystem. Gemini offers strong multimodal capabilities and Google integration. We help organisations evaluate options against their actual requirements rather than following hype.

PANDIONSTUDIO

AI CAPABILITY • PRACTICE

The Practice

Where AI knowledge becomes operational capability

Context engineering, agent orchestration, skills fluency, and curated learning paths – the disciplines and resources that create real-world AI value.

On the map: the Practice tier. What it maps here informs the operating model's AI architecture: context engineering, skills, memory, verification, and the orchestration and fluency that drive the model, harness and kernel as one system.

AT A GLANCE

The whole practice in one view

Ten sections plus a curated learning library, grouped by discipline.

Context Engineering

4 sections

Agents & Orchestration

3 sections

Skills & Fluency

3 sections

Learning Resources

curated library

The discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

MAY 2026 PRACTICE LENS

The practical shift is from prompting to staging. Small teams get leverage when they prepare the context, mark what is locked or still open, give agents clear rubrics, and preserve the useful learning after each run. Agent management is becoming a work primitive, not a specialist side activity.

In 30 Seconds

Most AI failures aren't model problems. They're context problems. Context engineering is the discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

This is where strategy becomes capability. Without the right context architecture, even the best AI strategy remains a document on a shelf.

Our core expertise: This is Practice tier work: designing and implementing the context systems that make AI useful in your specific environment. Not one-off prompts, but persistent capability.

What We've Learned in Practice

Through implementing context systems across sustainability consulting, trading operations, and client delivery, we've validated several patterns:

Compression Compounds

Each context reduction makes the next easier. Start aggressive, refine based on what breaks.

Memory Health Is Maintenance

Context architecture isn't a one-time setup. Weekly maintenance protocols prevent gradual degradation.

Navigation Beats Structure

Folder hierarchies organise files. Topic-based navigation paths tell AI what to load and when. Both are needed.

Session Continuity Is Hard

The handoff between work sessions is the least discussed problem in AI—and often the most impactful to solve.

Results from our implementations: 3× faster session startup, 40% reduction in token costs, near-zero context drift.

The Shift from Prompts to Systems

Prompt engineering asks: “How do I phrase this question?”

Context engineering asks: “What does AI need to know to answer well?”

Anthropic defines it as “designing dynamic systems that provide AI models with the right information at the right time.” It's the evolution from crafting individual queries to architecting information environments.

	Prompt Engineering	Context Engineering
Focus	The question	The knowledge
Scope	Single query	Entire system
Approach	Craft better prompts	Design information flow
Result	Better answers	Consistent capability

NEW: MAY 2026

Staging Work

In the agent era, the operator's job is often no longer to produce the output directly. It is to stage the conditions under which the agent can produce the output well: context, constraints, examples, success criteria, handoff state, and review loops.

Locked

Decided. The agent should treat this as a constraint.

Provisional

Leaning this way, but open to pressure-testing.

Open

The agent should explore options and surface trade-offs.

Contested

There is a live disagreement or unresolved tension.

This is why handoff formats matter. Markdown works when the document is mostly for agents and will be edited repeatedly. HTML can work better when humans need to inspect, compare, or interact with the staging artefact. The format follows the audience, lifecycle, and time horizon.
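The four status labels can be made explicit in the staging artefact itself. The following is a minimal sketch, assuming a hypothetical `StagedItem` record and `render_markdown` helper (neither comes from a specific tool), that groups items by status so the agent reads constraints before open questions:

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    LOCKED = "Locked"            # decided; treat as a constraint
    PROVISIONAL = "Provisional"  # leaning this way; open to pressure-testing
    OPEN = "Open"                # explore options and surface trade-offs
    CONTESTED = "Contested"      # live disagreement or unresolved tension


@dataclass
class StagedItem:
    topic: str
    status: Status
    note: str


def render_markdown(items: list[StagedItem]) -> str:
    """Group items by status, in the order the statuses are defined."""
    lines = []
    for status in Status:  # Enum iteration preserves definition order
        group = [i for i in items if i.status is status]
        if not group:
            continue
        lines.append(f"## {status.value}")
        lines.extend(f"- {i.topic}: {i.note}" for i in group)
    return "\n".join(lines)


# Illustrative items only
brief = render_markdown([
    StagedItem("Pricing page copy", Status.OPEN, "explore three tones"),
    StagedItem("Brand colours", Status.LOCKED, "do not change"),
])
print(brief)
```

The same records could just as easily be rendered to HTML when the artefact is mainly for human inspection; only the renderer changes.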

Outcome Rubrics

Agents are producing more work than humans can comfortably review line by line. The next practice layer is to define what good looks like before the work starts: acceptance criteria, examples, failure modes, and a rubric that a separate review agent can apply after the first pass.

Practical pattern: assign the builder agent the task, assign a separate grader agent the rubric, and only bring the human in for judgment calls, exceptions, and final responsibility.
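One way to picture the builder/grader split is a rubric expressed as data. This is a sketch under stated assumptions: `Criterion` and `grade` are hypothetical names, and the checks here are simple string tests standing in for whatever a real grader agent would apply.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]  # stand-in for a grader agent's judgement
    blocking: bool = True         # blocking failures go back to the builder


def grade(output: str, rubric: list[Criterion]) -> dict:
    """Apply every criterion and decide who acts next."""
    failed = [c for c in rubric if not c.check(output)]
    if any(c.blocking for c in failed):
        verdict = "revise"        # builder agent retries
    elif failed:
        verdict = "human_review"  # judgment call for a person
    else:
        verdict = "accept"
    return {"verdict": verdict, "failed": [c.name for c in failed]}


# Illustrative rubric, defined before the work starts
rubric = [
    Criterion("has summary", lambda text: "Summary:" in text),
    Criterion("under 200 words", lambda text: len(text.split()) <= 200),
    Criterion("cites a source", lambda text: "http" in text, blocking=False),
]

result = grade("Summary: revenue grew 4% quarter on quarter.", rubric)
print(result)
```

In this simplified version an "accept" verdict skips the human entirely; in practice the human keeps final responsibility, so a sampling or sign-off step would sit on top.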

Dreaming as Maintenance

Scheduled memory review is becoming a product feature. The useful habit is older: review what happened, extract durable patterns, remove stale detail, and carry forward the learnings that should shape the next session.

Practical pattern: after a meaningful run, ask what should become a skill, what should become memory, what should be forgotten, and what should be checked next time.

Interaction Lowers Context Cost

Voice, browser context, pointing, interruption, and real-time correction are not just nicer interfaces. They reduce the cost of transferring intent from a human into an agent. That matters because the bottleneck in agent work is often not model intelligence; it is the user's ability to supply enough context without turning the setup into a separate job.

Speak

Dump messy context faster than typing, then ask the agent to structure it.

Show

Use browser, screen, document, or visual state as part of the prompt, not an afterthought.

Interrupt

Correct the agent while the work is forming, before the wrong path becomes expensive.

The Context Problem

More context doesn't mean better performance.

Research shows input length alone can reduce AI accuracy by 14-85% – even when all information is relevant.

Lost in the Middle

Models favour information at the start and end of context, missing what's in between.

Context Rot

Quality degrades gradually as context accumulates and ages without maintenance.

Signal Dilution

Important information drowns in noise when everything is loaded indiscriminately.

Many teams dump everything into context. That's like answering every question by reading the encyclopaedia aloud.

The issue is not just volume; it's missing decision lineage. Context graphs help by recording approvals and exceptions so AI can retrieve the relevant rationale without loading everything.

THE INDUSTRY'S UNSOLVED PROBLEM

Context Drift

Academic research (2025-2026) identifies context drift—the gradual degradation of context quality across sessions and time—as the central unsolved challenge in AI memory systems. Most tools handle single-session context well. Multi-session, multi-day continuity remains hard. This is where operational rhythm systems (session handoffs, weekly coordination) become critical.

Four Strategies for Managing Context

Based on Anthropic's framework for effective AI systems

1. Write

Persist externally

Store information outside the context window for later retrieval. Files, databases, knowledge bases – anything that persists beyond the session.

2. Select

Load only what's relevant

Retrieve context based on the task at hand, not everything available. Dynamic retrieval, semantic search, just-in-time loading.

3. Compress

Summarise, don't accumulate

Keep context lean through intelligent summarisation. Archive old content, preserve decisions, trim the unnecessary.

4. Isolate

Separate contexts for separate concerns

Don't let different workstreams pollute each other. Multi-agent architectures, session boundaries, role-specific loading.

The Temporal Dimension

The solution to context drift

Most context engineering focuses on structure—how information is organised. We've found the rhythm matters just as much. This is how you solve context drift: not just better architecture, but better cadence.

Session Handoffs

How does context pass between work sessions? What gets carried forward, what gets compressed, what gets archived? Explicit handoff protocols prevent the “starting from scratch” problem.

Weekly Coordination

How do strategic priorities flow into daily work? How does work roll up into weekly synthesis? Coordination bridges connect strategy to execution without context overload.

Memory Maintenance

Context degrades over time. Scheduled compression, archiving cadences, and health checks keep context fresh. Without maintenance, even good architecture accumulates noise.

Why this matters: Academic research identifies context drift as the central unsolved challenge in AI memory systems. Most tools handle single sessions well. Multi-session, multi-day continuity is where operational rhythm becomes critical.

The architecture is the skeleton. The rhythm is the heartbeat.

Tiered Context Architecture

We use a budget-based approach to context layers. Each tier has a token budget and update frequency—this prevents context bloat while ensuring AI has what it needs.

Tier	Purpose	Token Budget	Load Frequency
Tier 0	Compressed state	~300 tokens	Always
Tier 1	Active context + navigation	~1,000 tokens	Session start
Tier 2	Domain knowledge	Variable	On demand
Tier 3	Archive	As needed	Rarely (historical only)

Key insight: Most teams overload Tier 0-1 and underuse Tier 2-3. The result is context bloat, slower reasoning, and higher costs.
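The load rules in the table can be sketched as a small selection function. This is an illustration only: `ContextFile` and `select_context` are hypothetical names, and topic matching is an exact string comparison standing in for whatever retrieval you actually use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextFile:
    tier: int
    text: str
    topic: Optional[str] = None  # set for Tier 2 and Tier 3 content


def select_context(files: list[ContextFile], session_start: bool,
                   topic: Optional[str] = None,
                   include_archive: bool = False) -> list[ContextFile]:
    """Pick the files to load now, according to each tier's load rule."""
    chosen = []
    for f in files:
        if f.tier == 0:                                   # always loaded
            chosen.append(f)
        elif f.tier == 1 and session_start:               # once, at session start
            chosen.append(f)
        elif f.tier == 2 and topic and f.topic == topic:  # on demand
            chosen.append(f)
        elif f.tier == 3 and include_archive and f.topic == topic:  # historical only
            chosen.append(f)
    return chosen


# Illustrative content
files = [
    ContextFile(0, "Status: on track. Active: Q3 report."),
    ContextFile(1, "Navigation and recent decisions."),
    ContextFile(2, "Pricing rules in detail.", topic="pricing"),
    ContextFile(3, "2023 pricing history.", topic="pricing"),
]

at_start = select_context(files, session_start=True)
mid_session = select_context(files, session_start=False, topic="pricing")
print([f.tier for f in at_start], [f.tier for f in mid_session])
```

Keeping the archive behind an explicit `include_archive` flag is one way to make Tier 3 a deliberate choice rather than a default.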

THE LOAD PATTERN

TIER 0 (always loaded) → TIER 1 (session start) → TIER 2 (on-demand) → TIER 3 (when needed)

Context Graphs: Decision Lineage

Systems of record capture what happened. Context graphs capture why.

When AI needs to make a decision, it shouldn't just know the rule—it should know the precedents, exceptions, and reasoning that shaped it.

Approvals & Exceptions

Why was this approved? What precedent does it set? Context graphs make the reasoning retrievable.

Policy Evolution

How did we get here? What changed and why? Decision traces show the path, not just the destination.

Audit Trails

What informed this decision? Who signed off? Context graphs support governance and compliance.

We design context graphs that turn scattered decisions into searchable precedent, making institutional knowledge available to AI without loading everything.
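At its simplest, a context graph is a set of decision records linked to the precedents they rely on. The sketch below assumes a hypothetical `Decision` record and `lineage` helper, with made-up example decisions; it shows how the rationale behind one approval can be retrieved without loading every record.

```python
from dataclasses import dataclass, field


@dataclass
class Decision:
    id: str
    summary: str
    rationale: str
    approved_by: str
    precedents: list[str] = field(default_factory=list)  # ids of earlier decisions


def lineage(graph: dict[str, Decision], decision_id: str) -> list[Decision]:
    """Walk precedent links depth-first: the decision first, then its ancestors."""
    seen: set[str] = set()
    order: list[Decision] = []

    def visit(current_id: str) -> None:
        if current_id in seen or current_id not in graph:
            return
        seen.add(current_id)
        order.append(graph[current_id])
        for parent in graph[current_id].precedents:
            visit(parent)

    visit(decision_id)
    return order


# Made-up decisions for illustration
graph = {
    "D1": Decision("D1", "Standard discount cap is 10%", "Protect margin", "CFO"),
    "D2": Decision("D2", "15% allowed for multi-year deals", "Retention value", "CFO", ["D1"]),
    "D3": Decision("D3", "Client X granted 15%", "Three-year contract", "Sales lead", ["D2"]),
}

for d in lineage(graph, "D3"):
    print(f"{d.id}: {d.summary} (why: {d.rationale}; approved by {d.approved_by})")
```

Only the three records on the path are handed to the model; the `approved_by` field doubles as the audit trail.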

Context as Competitive Moat

STRATEGIC • NEW: MARCH 2026

Google's internal AI team made a revealing observation in early 2026: the “sum totality of an organisation's documents” creates capabilities that no AI lab can replicate from the outside. Your organisational data, decision history, and institutional knowledge are not a problem to manage — they are the competitive advantage.

Why Labs Can't Compete

Foundation models are general-purpose. Your context — client history, policy decisions, domain-specific reasoning — is what transforms general AI into your AI. No amount of training data replicates what you've accumulated through operations.

Context Engineering = Moat Building

This reframes context engineering from a technical practice to a strategic investment. Every well-structured knowledge base, every maintained decision trace, every curated context layer is an asset your competitors don't have.

The implication: Organisations investing in context architecture today aren't just improving AI performance — they're building durable competitive advantage that deepens with every interaction.

The Four Levers of Efficient AI

NEW — JUNE 2026

Glean's June 2026 enterprise analysis names the four levers that determine whether AI becomes genuinely efficient or a “laziness tax.” They map directly to what context engineering delivers.

1. Context Quality

What information the model receives before it generates. Poor context means more tokens, worse outputs, higher cost. Good context means the model solves the right problem first time.

2. Model Routing

Routing each task to the cheapest model that meets quality threshold. Using a frontier model for every task is not a quality strategy — it's a laziness tax. Token pricing is the rate. Tokens-to-completion is the actual invoice.

3. Continual Learning

Capturing what works so you don't pay the exploratory cost twice. Skills, session templates, decision logs. Arvind Jain (Glean): “When someone does useful work, we document it so we do not recreate it from scratch.”

4. Harness Design

How the AI system is structured: tool selection, orchestration patterns, verification loops. A well-designed harness amplifies context quality and model routing. A poor harness makes even good models underperform.

These four levers are the industry-named reference for what Pandion's context engineering advisory delivers. Organisations investing in context engineering are building the infrastructure of efficient AI: not just better outputs today, but lower cost-per-outcome as usage scales.
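The model routing lever lends itself to a short worked example. The sketch below uses placeholder model names and illustrative numbers (none are real prices or benchmark scores) to show why the rate and the invoice can point at different models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelOption:
    name: str                   # placeholder names, not real products
    price_per_1k_tokens: float  # illustrative figures only
    quality: float              # score from your own evals, 0 to 1
    tokens_to_completion: int   # average tokens needed to finish the task


def expected_cost(option: ModelOption) -> float:
    """Price is the rate; price times tokens-to-completion is the invoice."""
    return option.price_per_1k_tokens * option.tokens_to_completion / 1000


def route(options: list[ModelOption], min_quality: float) -> Optional[ModelOption]:
    """Return the option with the lowest expected cost that clears the threshold."""
    eligible = [o for o in options if o.quality >= min_quality]
    return min(eligible, key=expected_cost, default=None)


options = [
    ModelOption("small", 0.10, 0.82, 12000),    # cheap rate, many retries
    ModelOption("medium", 0.50, 0.85, 2000),
    ModelOption("frontier", 3.00, 0.95, 1500),
]

choice = route(options, min_quality=0.80)
print(choice.name if choice else "no model meets the threshold")
```

With these assumed numbers the small model has the lowest rate but the medium model has the lowest invoice, because it needs far fewer tokens to finish.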

The Vendor-Neutral Floor

NEW — JUNE 2026

Model routing has a resilience dimension, not just a cost one. When access to any single model can be repriced or withdrawn, the practice that keeps you working is staying portable across vendors and, where it matters, able to run locally.

Four levels of independence

A ladder from quickest to most resilient: (1) a routing aggregator across many providers (e.g. OpenRouter); (2) open-source models inside your own cloud; (3) self-hosted on rented GPU; (4) fully local on hardware you control, which keeps running through any outage, price change, or vendor withdrawal.

The local stack, briefly

Running a model yourself is five layers: hardware (the key number is GPU memory), the model (a 7–14B open-weight model covers a lot), a serving layer (Ollama is the simplest), a harness (Open WebUI for chat; OpenClaw or Hermes for agents), and the app on top.

For a solo operator the realistic posture is not a private data centre. It is routing across vendors for everyday work, plus a small or mid-size local model for the few workflows that must keep running or must never leave your machine. Start with one good machine and one useful workflow, prove the quality, then decide whether to go further.
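Staying portable mostly comes down to calling models through one thin interface and trying alternatives in order. A minimal sketch, assuming each provider is wrapped in a plain function (the `hosted` and `local` functions below are stand-ins, not real client calls):

```python
from typing import Callable

# Hypothetical: each provider is wrapped as a function from prompt to response.
Provider = Callable[[str], str]


def complete_with_fallback(prompt: str,
                           providers: list[tuple[str, Provider]]) -> tuple[str, str]:
    """Try providers in order; return (name, response) from the first that works."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: outage, repricing block, withdrawal
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def hosted(prompt: str) -> str:
    raise ConnectionError("simulated outage")


def local(prompt: str) -> str:
    return f"local answer to: {prompt}"


used, answer = complete_with_fallback(
    "Summarise today's notes",
    [("hosted", hosted), ("local", local)],
)
print(used, "->", answer)
```

Putting the local model last in the list matches the posture described above: everyday work goes to a hosted vendor, and the local model keeps the critical workflow running when that fails.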

Who Benefits from Context Engineering?

Individuals

• Consistent AI results across sessions
• Build on previous work, not from scratch
• Reduce time re-explaining context

Teams

• Reduce hallucinations through better knowledge
• Enable handoffs between human and AI
• Shared context across team members

Organisations

• Multi-agent coordination without pollution
• Governance and compliance controls
• Scalable knowledge management

Our Approach Is Informed By

Anthropic's context engineering guidance

Karpathy's “RAM management” framing

Vercel's AGENTS.md evaluation research

Mei & Yao survey (1,400+ academic papers)

MemAgents architecture research

Validated through our own implementations

In 30 Seconds

AI doesn't remember. Every conversation starts fresh. Every session begins from zero. The brilliant assistant who helped you yesterday has no idea who you are today.

This isn't a bug – it's how LLMs work. But it's also why most AI implementations deliver inconsistent value. The forgetting problem is solvable.

Memory health is the practice of designing systems that give AI what it needs to know – when it needs to know it – without drowning in irrelevant information.

Why This Matters

For Individuals

Without memory, you re-explain context every session. The same background, the same preferences, the same project details – again and again.

Time saved by AI gets consumed by context-setting. The productivity promise erodes with every fresh start.

For Teams

When AI forgets, knowledge doesn't compound. Insights from one session don't inform the next. Each team member starts from scratch.

The result: inconsistent outputs, duplicated effort, and AI that never gets better at understanding your work.

The compounding cost: Every time AI forgets, you lose the value of everything it learned. Good memory design means knowledge builds over time instead of resetting to zero.

The Forgetting Problem

Understanding why AI forgets is the first step to fixing it

Context Windows Have Limits

Every AI model has a finite “context window” – the amount of text it can consider at once. When the window fills, old information gets pushed out.

The Illusion

Modern models have large context windows (100K+ tokens). This feels like plenty of memory.

The Reality

Long contexts degrade performance. Research shows accuracy drops 14-85% as context length increases – even with relevant information.

No Native Persistence

LLMs have no built-in way to store information between sessions. Unlike databases or file systems, they don't write to permanent storage. Each conversation exists in isolation.

What Users Expect

“You remember that project we discussed last week, right?”

What Actually Happens

The model has no access to previous conversations. Last week doesn't exist.

Lost in the Middle

Even within a single context window, attention isn't uniform. Models favour information at the beginning and end, often missing what's in the middle.

The Pattern

Critical information buried in the middle of long conversations gets less attention from the model.

The Impact

Important context can be effectively “forgotten” even while technically still in the window.

Key insight: The “memory problem” isn't a flaw to be fixed by model improvements. It's a fundamental property of the architecture that requires system-level solutions.

Symptoms of Poor Memory Health

Recognise these patterns? They're signs your AI system needs memory architecture.

Repetitive Context-Setting

You explain the same background information every session. “I work at X company, we do Y, the project is about Z...”

Inconsistent Outputs

The same question yields different answers in different sessions. No learning from previous interactions carries forward.

Contradictory Advice

AI suggests approaches that conflict with decisions made in previous sessions. It doesn't know what was already decided.

Context Rot

Long conversations degrade. The AI starts referencing outdated information or losing track of earlier agreements.

Knowledge Silos

Insights from one conversation can't be applied elsewhere. Each session is an island of learning that sinks after use.

The Eternal Beginner

Despite months of use, AI still asks basic questions. It never develops understanding of your domain or preferences.

These aren't AI limitations. They're architecture gaps. Every symptom has a solution – if you design for memory.

The Four Memory Strategies

Based on Anthropic's context engineering framework

1. Write: Persist Externally

Since AI has no native memory, create external storage. Files, databases, knowledge bases – anything that persists beyond the session.

Session Logs

Capture key decisions and outcomes from each conversation

Knowledge Files

Curated information that AI should always know

State Documents

Living files that track current project status

2. Select: Load Only What's Relevant

Don't load everything every time. Retrieve context based on the task at hand. Just-in-time loading beats all-the-time loading.

Dynamic Retrieval

Fetch relevant documents based on the current query

Semantic Search

Find information by meaning, not just keywords

Role-Based Loading

Different tasks load different context packages

3. Compress: Summarise, Don't Accumulate

Keep context lean. Replace long conversation history with concise summaries. Archive old content, preserve decisions, trim the unnecessary.

Conversation Summaries

Replace 50 messages with 5 key takeaways

Decision Logs

Keep what was decided, not how it was discussed

Context Pruning

Regular maintenance to remove outdated information

4. Isolate: Separate Contexts for Separate Concerns

Don't let different workstreams pollute each other. Use boundaries to keep contexts clean and focused.

Session Boundaries

Clear starts and ends for different work types

Project Isolation

Client A's context doesn't leak into Client B

Multi-Agent Design

Different agents with different specialised contexts
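Compress and Isolate often work together in practice: scope the history to one project, then keep the decisions and only the most recent discussion. The sketch below assumes messages are already tagged as discussion or decision upstream; `Message` and `compress` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Message:
    project: str
    kind: str  # "discussion" or "decision"; tagging is assumed to happen upstream
    text: str


def compress(history: list[Message], project: str, keep_recent: int = 2) -> list[str]:
    """Keep one project's decisions plus its last few discussion messages."""
    scoped = [m for m in history if m.project == project]  # isolate
    decisions = [f"DECIDED: {m.text}" for m in scoped if m.kind == "decision"]
    tail = scoped[-keep_recent:] if keep_recent > 0 else []
    recent = [m.text for m in tail if m.kind != "decision"]
    return decisions + recent                              # compress


# Illustrative history across two clients
history = [
    Message("A", "discussion", "Should reports be weekly or monthly?"),
    Message("A", "decision", "Reports go out weekly"),
    Message("B", "decision", "Client B prefers PDF"),
    Message("A", "discussion", "Draft the first report outline"),
    Message("A", "discussion", "Outline needs a risks section"),
]

compact = compress(history, "A")
print(compact)
```

The early back-and-forth about report frequency is dropped, the decision it produced is kept, and nothing from Client B reaches Client A's context.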

Layered Memory Architecture

Organise memory by stability and scope

Effective AI memory isn't a single file – it's a layered architecture. Higher layers are stable and rarely change. Lower layers are ephemeral and session-specific.

Layers run from most stable (L1) to most ephemeral (L6):

Layer	Contents	Update Cadence
L1: SYSTEM INSTRUCTIONS	How the AI should behave, safety rails, core capabilities	Set once
L2: AGENT IDENTITY	Who is this AI? Expertise, tone, protocols	Monthly
L3: STRATEGIC MEMORY	Key decisions, priorities, what's been tried	Weekly
L4: KNOWLEDGE ARCHITECTURE	Where things are, how to navigate, reference material	As needed
L5: ENTITY CONTEXT	Client/project specific: history, preferences, current state	Per project
L6: SESSION CONTEXT	Current conversation, working memory, live state	Per message

The Design Principle

Load stable layers automatically. Load ephemeral layers dynamically. Don't burden every session with information that rarely changes.

The Maintenance Principle

Update each layer at appropriate intervals. Strategic memory weekly. Session context every message. Match maintenance rhythm to layer stability.
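The maintenance principle can be turned into a simple due-date check. This sketch covers only the two layers with a fixed calendar cadence, and the day counts are assumptions (30 for monthly, 7 for weekly); `layers_due` is a hypothetical helper.

```python
from datetime import date, timedelta

# Review intervals in days; the numbers are assumptions based on the stated cadences.
REVIEW_INTERVAL_DAYS = {
    "L2 agent identity": 30,   # monthly
    "L3 strategic memory": 7,  # weekly
}


def layers_due(last_reviewed: dict[str, date], today: date) -> list[str]:
    """Return layers never reviewed, or last reviewed at least one interval ago."""
    due = []
    for layer, interval in REVIEW_INTERVAL_DAYS.items():
        reviewed = last_reviewed.get(layer)
        if reviewed is None or today - reviewed >= timedelta(days=interval):
            due.append(layer)
    return due


today = date(2026, 6, 15)
due_now = layers_due(
    {"L2 agent identity": date(2026, 6, 1), "L3 strategic memory": date(2026, 6, 1)},
    today,
)
print(due_now)
```

Both layers were last reviewed 14 days before the check, so only the weekly layer comes back as due.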

Memory Patterns in Practice

Common patterns for implementing healthy AI memory

The Handoff Document

A single file that captures: what happened, what was decided, what's next. Updated at session end, loaded at session start.

Best for:

Individuals working across multiple sessions on the same project

The Project Bible

A comprehensive reference document containing all project context. Loaded automatically when working on that project.

Best for:

Complex projects with many decisions and constraints to remember

The Skills Library

Modular knowledge files that can be loaded on demand. Different skills for different tasks, loaded as needed.

Best for:

Teams with diverse tasks requiring different domain expertise

The Weekly Bridge

A rhythm-based summary that synthesises the week's sessions. Carries forward key context without accumulating endless history.

Best for:

Ongoing operations with continuous but evolving context
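Of the four patterns, the Handoff Document is the easiest to automate. A minimal sketch, assuming a file named `HANDOFF.md` (a hypothetical name) with three fixed sections that is overwritten at session end and read at session start:

```python
from pathlib import Path
import tempfile

SECTIONS = ("What happened", "What was decided", "What's next")


def write_handoff(path: Path, notes: dict[str, list[str]]) -> None:
    """Overwrite the handoff file at session end with the three fixed sections."""
    lines = []
    for section in SECTIONS:
        lines.append(f"## {section}")
        lines.extend(f"- {item}" for item in notes.get(section, []))
        lines.append("")
    path.write_text("\n".join(lines), encoding="utf-8")


def read_handoff(path: Path) -> str:
    """Load the handoff at session start; a missing file means a fresh project."""
    return path.read_text(encoding="utf-8") if path.exists() else ""


# A temporary directory keeps the example self-contained
handoff_path = Path(tempfile.mkdtemp()) / "HANDOFF.md"
write_handoff(handoff_path, {
    "What happened": ["Drafted Q3 report outline"],
    "What was decided": ["Reports go out weekly"],
    "What's next": ["Fill in the risks section"],
})
loaded = read_handoff(handoff_path)
print(loaded)
```

Overwriting rather than appending is a deliberate choice here: the handoff carries the current state forward, while longer history belongs in the weekly bridge or the archive.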

Memory Requires Maintenance

Without Maintenance

• Context files grow stale
• Outdated information contradicts current reality
• Memory becomes noise rather than signal
• AI references things that are no longer true
• The system degrades back to forgetfulness

With Maintenance

• Context stays current and accurate
• Old information gets archived, not deleted
• Each session starts with relevant, fresh context
• Knowledge compounds reliably over time
• The system gets smarter, not staler

Memory health is a practice, not a one-time setup.
The rhythm matters as much as the architecture.

How We Help

Design, implement, and maintain AI memory systems

Memory Architecture

Design the right layer structure for your context. What belongs where, what loads when, how it all connects.

Implementation

Build the files, set up the retrieval, establish the workflows. From individual setups to team-scale systems.

Maintenance Protocols

Define the rhythms and processes that keep memory healthy. What gets updated when, how staleness is prevented.

In 30 Seconds

There are two ways to give AI the information it needs: load it upfront (passive context) or fetch it when needed (on-demand retrieval). Most teams assume retrieval is smarter. The research says otherwise.

Vercel's Next.js team ran rigorous evaluations comparing these approaches. The result: passive context achieved 100% accuracy where on-demand retrieval achieved 53%.

The insight: When information is always present, there's no decision point that can fail. No retrieval logic to get wrong. No ordering issues. Just consistent availability.

The Research

Vercel AGENTS.md Evaluation (January 2026)

Vercel's Next.js team tested how AI coding agents perform with different context configurations. They compared baseline performance against various approaches for providing project-specific information.

Configuration	Approach	Accuracy
No documentation	Baseline	53%
On-demand retrieval	Skills system	53%
Retrieval + instructions	Enhanced skills	79%
Passive context	AGENTS.md file	100%

The striking finding: On-demand retrieval performed no better than having no documentation at all. The retrieval system existed, but it didn't help. Only when information was passively present did performance improve.

Why Passive Context Wins

Three fundamental advantages over on-demand retrieval

1. No Decision Point

On-demand retrieval requires a decision: “What information do I need for this query?” That decision can be wrong. The model might not realise it needs certain context.

With passive context, there's no decision to get wrong. The information is already there. Every time.

2. Consistent Availability

Retrieval systems are probabilistic. They might return relevant documents 80% of the time, or 60%, or 40%. The quality varies by query, by phrasing, by the state of the vector database.

Passive context is deterministic. The same information is present on every turn. No variance. No “sometimes it works” frustration.

3. No Ordering Issues

With retrieval, critical information might arrive too late in the reasoning process. The model starts generating before realising it needs more context.

Passive context is present from the first token. The model reasons with full information from the start.

The Tradeoff: Why Not Load Everything?

If passive context is better, why not just load all available information? Because context windows have effective limits that are smaller than their technical limits.

The Memento Limit

Research suggests effective reasoning capacity is around 100K tokens, even when context windows are technically larger. Beyond this, performance degrades.

A 200K context window doesn't give you 200K of useful reasoning space. It gives you 100K of effective space with increasing noise.

Lost in the Middle

Models pay more attention to the beginning and end of context. Information in the middle gets weighted less, even when it's critical.

More context can mean important information gets buried where the model is less likely to use it effectively.

The goal isn't maximum context. It's the right context. Passive for what matters most. Retrieval for everything else.

Tiered Context Architecture

The pattern that balances passive reliability with retrieval flexibility

Tier	Type	Token Budget	What Goes Here
Tier 0	Passive	~300 tokens	Compressed state: current status, key metrics, active items
Tier 1	Passive	~1,000 tokens	Active context: navigation, recent decisions, current focus
Tier 2	On-demand	Variable	Domain knowledge: loaded when topic requires it
Tier 3	Retrieval	As needed	Archive: historical, rarely accessed

Passive Foundation

Tier 0 and Tier 1 are always loaded. This is your passive context. Keep it lean (~1,300 tokens total) but ensure it contains everything AI needs to orient itself and navigate effectively.

Retrieval for Depth

Tier 2 and Tier 3 use selective retrieval. Navigation paths in Tier 1 point to relevant Tier 2 content. This gives you depth without bloat.

Implementation Patterns

Practical approaches for passive context systems

The Project File Pattern

A single file (CLAUDE.md, AGENTS.md, or similar) at project root containing everything AI needs to work effectively in that context.

Typical contents:

• Project description and purpose
• Key decisions and constraints
• Build/test commands
• Code conventions
• Current focus areas

The MEMORY + CONTEXT Pattern

Two complementary files: MEMORY.md for compressed state (~300 tokens), CONTEXT.md for active context and navigation (~1,000 tokens).

The split:

• MEMORY: “Where are we?” (status, metrics, active items)
• CONTEXT: “How do I work here?” (navigation, decisions, focus)

The Navigation Hub Pattern

Passive context includes a navigation table: “When you need X, read Y.” This creates predictable paths from topics to relevant files.

Example:

| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
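A navigation table like this is easy for tooling to read as well as for the model. The sketch below is illustrative: `parse_navigation` and `files_for` are hypothetical helpers, and the topic match is a naive substring test rather than real retrieval.

```python
def parse_navigation(table: str) -> dict[str, str]:
    """Parse a two-column pipe table into {topic: path}, skipping header and rule rows."""
    routes = {}
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or cells[0].lower() == "topic" or set(cells[0]) <= {"-", ":"}:
            continue
        routes[cells[0].lower()] = cells[1]
    return routes


def files_for(query: str, routes: dict[str, str]) -> list[str]:
    """Naive keyword match: return files whose topic appears in the query."""
    return [path for topic, path in routes.items() if topic in query.lower()]


NAV = """
| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
"""

routes = parse_navigation(NAV)
print(files_for("How does our pricing work for annual plans?", routes))
```

Because the table lives in passive context, the model can follow the same pointers on its own; a script like this is mainly useful for checking that every listed file still exists.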

The Token Budget Pattern

Explicit limits on each passive context file. When a file exceeds its budget, compress it. Move detail to Tier 2 and keep pointers in Tier 0-1.

Enforcement:

• Tier 0: Max 300 tokens (hard limit)
• Tier 1: Max 1,000 tokens (soft limit)
• Review weekly, compress as needed
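The hard and soft limits can be checked mechanically as part of the weekly review. This sketch assumes the MEMORY.md and CONTEXT.md file names from the pattern above and a rough estimate of four characters per token; a real tokenizer would give more accurate counts.

```python
# (limit, kind) per passive context file
BUDGETS = {
    "MEMORY.md": (300, "hard"),    # Tier 0
    "CONTEXT.md": (1000, "soft"),  # Tier 1
}


def estimate_tokens(text: str) -> int:
    """Rough heuristic of about 4 characters per token (an assumption)."""
    return len(text) // 4


def check_budgets(files: dict[str, str]) -> list[str]:
    """Return warnings for soft-limit breaches; raise on hard-limit breaches."""
    warnings = []
    for name, text in files.items():
        if name not in BUDGETS:
            continue
        limit, kind = BUDGETS[name]
        used = estimate_tokens(text)
        if used <= limit:
            continue
        message = f"{name}: ~{used} tokens exceeds {limit} ({kind} limit)"
        if kind == "hard":
            raise ValueError(message)
        warnings.append(message)
    return warnings


# Placeholder content sized to trip only the soft limit
warnings = check_budgets({"MEMORY.md": "x" * 400, "CONTEXT.md": "x" * 4400})
print(warnings)
```

A hard-limit breach raises so it cannot be ignored, while a soft-limit breach only produces a warning to act on at the next review.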

When to Use What

Use Passive Context For

✓ Identity and behaviour rules (always needed)
✓ Current project state (changes, but always relevant)
✓ Navigation pointers (how to find deeper content)
✓ Recent decisions (context that's frequently referenced)
✓ Session handoff state (what to pick up from last time)

Use Retrieval For

✓ Large knowledge bases (too big for passive loading)
✓ Historical archives (rarely needed)
✓ Domain-specific content (only relevant for certain queries)
✓ Reference documentation (detailed specs, APIs)
✓ Content that varies by user/session

The combination is powerful: Passive foundation + selective retrieval. Reliability where it matters most. Flexibility where you need depth.

Common Mistakes

Mistake 1: No Passive Context at All

Relying entirely on retrieval. Every query starts with a search. Result: inconsistent baseline, variance in quality, 53% performance.

Fix: Establish a passive foundation, even if it's just 500 tokens.

Mistake 2: Too Much Passive Context

Loading everything passively to avoid retrieval complexity. Result: bloated context, lost-in-the-middle problems, degraded reasoning.

Fix: Enforce token budgets. Compress aggressively. Move detail to Tier 2.

Mistake 3: Stale Passive Context

Setting up passive context once and never updating it. Result: AI references outdated information, makes contradictory decisions.

Fix: Weekly review cadence. Update Tier 0 after every significant change.

Mistake 4: No Navigation to Tier 2

Passive context that doesn't tell AI where to find deeper information. Result: AI either hallucinates or asks repeatedly for guidance.

Fix: Include navigation paths. “When you need X, read Y.”

Getting Started

Create a Tier 0 file

Start with ~300 tokens of compressed state. Current status, key metrics, active items. Name it MEMORY.md or include it at the top of your main context file.

Add navigation to Tier 1

Create a CONTEXT.md with ~1,000 tokens. Include a navigation table: “When topic X comes up, read file Y.” This creates predictable paths to deeper content.

Configure automatic loading

Ensure your AI tool loads Tier 0 and Tier 1 at session start. For Claude Code, this means CLAUDE.md. For other tools, AGENTS.md or equivalent.

Establish maintenance rhythm

Weekly: review passive context for staleness. After significant changes: update Tier 0. Monthly: audit token budgets and compress as needed.
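The weekly staleness review can be prompted by a small script. A minimal sketch, assuming that a file's last-modified time is a fair proxy for "last reviewed" (it will not catch a file that was touched but not actually updated). stale_files is a hypothetical helper.

```python
import time
from pathlib import Path

WEEK_SECONDS = 7 * 24 * 60 * 60


def stale_files(paths, now=None, max_age_seconds=WEEK_SECONDS):
    """Return the paths whose last modification is older than max_age_seconds."""
    now = time.time() if now is None else now
    return [p for p in paths if now - Path(p).stat().st_mtime > max_age_seconds]
```

For example, stale_files(["MEMORY.md", "CONTEXT.md"]) lists whichever of the two has gone more than a week without an edit.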

In 30 Seconds

AgentOS is the persistent foundation underneath whichever AI tool you use. Plain text files at the root of your workspace describing who you are, what you know, how you work, what you remember, what you can reach, how you verify, and what you automate.

The model is the engine. The harness is the runtime (Claude Code, Cursor, Codex). The AgentOS is yours. Models change every six months. Harnesses converge over twelve to twenty-four. The AgentOS compounds across both.

The terminology landed publicly in April 2026 via AIDB's programme on Personal Context Portfolios. Several pieces of vocabulary — PCP, Monothread, Harness Engineering, Strict-Write, Auto-Dream — now name patterns we've been using or building for years. This section maps them.

The Seven Layers

Each layer is a discipline. You don't build them all at once. You build the foundation (Identity + Context) first, then add the others as your work demands them.

1. Identity

Who you are. What you do. What you stand for. The file every other layer references.

2. Context

Your situation. What’s true now. Your operating environment. (Pandion calls this layer’s discipline Context Engineering.)

3. Skills

Procedural knowledge made portable. Agents and named capabilities that can be loaded and run.

4. Memory

What compounds across sessions. What gets remembered, summarised, archived.

5. Connections

The data sources, services, and tools your AI can reach. Trust-graded.

6. Verification

How outputs are checked, grounded, evaluated. Trust by construction, not by hope.

7. Automations

What runs without you. Scheduled jobs, triggers, agents that act on signal.

The seven layers compound. The layers below feed the layers above. The whole stack survives every harness swap.

Personal Context Portfolio (PCP)

NLW's ten-file markdown recipe for the bottom of an AgentOS. A solo operator can sit down and have version one in an afternoon. Each file lives at the root of your workspace as plain text:

identity.md

rolesAndResponsibilities.md

currentProjects.md

teamAndRelationships.md

toolsAndSystems.md

communicationStyle.md

goalsAndPriorities.md

preferencesAndConstraints.md

domainKnowledge.md

decisionLog.md

PCP is a specific organising recipe for the Identity + Context (and bits of Memory + Connections) layers of AgentOS. It's not the whole AgentOS. It's a tractable starting point for layers 1, 2, 4 and 5.
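To get the ten files in place before filling them in, a short scaffold script is enough. This is a sketch under assumptions: the stub heading written into each file is our own choice rather than part of the recipe, and scaffold_pcp is an illustrative name. Existing files are never overwritten.

```python
from pathlib import Path

PCP_FILES = [
    "identity.md",
    "rolesAndResponsibilities.md",
    "currentProjects.md",
    "teamAndRelationships.md",
    "toolsAndSystems.md",
    "communicationStyle.md",
    "goalsAndPriorities.md",
    "preferencesAndConstraints.md",
    "domainKnowledge.md",
    "decisionLog.md",
]


def scaffold_pcp(workspace: str) -> list[str]:
    """Create any missing PCP files as stubs and return the names that were created."""
    root = Path(workspace)
    root.mkdir(parents=True, exist_ok=True)
    created = []
    for name in PCP_FILES:
        path = root / name
        if not path.exists():
            # Assumption: a one-line heading as placeholder content.
            path.write_text(f"# {path.stem}\n\n", encoding="utf-8")
            created.append(name)
    return created
```

Running it a second time returns an empty list, so it is safe to re-run in a workspace that is already partly populated.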

Where Pandion sits: our Context Engineering methodology (MEMORY.md + CONTEXT.md + neural paths + topic-memory pattern) is a richer architecture than flat PCP for the Context layer specifically. Same job, more sophistication. PCP is a clean public recipe; CE is the upgrade path.

Monothread — one long-lived thread, not fresh chats per task

Named by Nick Bowman in mid-April. The pattern: a thread's value increases over time when context compaction is good. You keep one long-lived orchestration thread, plus specialist sub-threads spawned from it. The thread accumulates. You don't throw away context every Monday morning.

For most people who've learned AI through ChatGPT, the instinct is the opposite: fresh chat per task, one-off prompts, lose the context. Monothread inverts that: brief once, accumulate, compact. Pandion's BATON + MEMORY + MASTER-OVERVIEW filesystem pattern is monothread-as-files; the orchestration thread reads them at every session start.

What this looks like in practice: a single working thread for a project that runs across weeks, with the AgentOS files providing the persistent memory between sessions. Sub-threads spawn for narrow specialist work and report back. The orchestration thread never resets.
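The "brief once, accumulate, compact" cycle can be sketched as a thread that keeps its most recent messages verbatim and folds older ones into a running summary. Everything here is illustrative: Monothread is a hypothetical class, and naive_summarise is a crude stand-in for the model-written summary a real harness would produce.

```python
from dataclasses import dataclass, field


def naive_summarise(summary: str, messages: list[str]) -> str:
    """Stand-in for a model-written summary: keep the first sentence of each message."""
    firsts = [m.split(". ")[0].rstrip(".") + "." for m in messages]
    return " ".join(([summary] if summary else []) + firsts)


@dataclass
class Monothread:
    keep_recent: int = 3
    summary: str = ""
    recent: list[str] = field(default_factory=list)

    def add(self, message: str) -> None:
        """Append a message, compacting the oldest ones once the window overflows."""
        self.recent.append(message)
        overflow = len(self.recent) - self.keep_recent
        if overflow > 0:
            # Compact: fold the oldest messages into the summary instead of dropping them.
            self.summary = naive_summarise(self.summary, self.recent[:overflow])
            self.recent = self.recent[overflow:]

    def context(self) -> str:
        """What the next turn sees: the running summary, then the recent messages."""
        header = [f"SUMMARY: {self.summary}"] if self.summary else []
        return "\n".join(header + self.recent)
```

Writing the summary out to a file at the end of each session is what turns this in-memory pattern into the files-based version described above.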

Harness Engineering — the named industry discipline

The lineage: prompt engineering (2023) became context engineering (2024), which in turn became harness engineering (2026). Each names a different layer of work:

Prompt engineering — how you phrase a single request. Largely absorbed into the model.
Context engineering — what the AI knows, when it knows it. The Context layer of AgentOS.
Harness engineering — how the runtime is configured: tools, memory wiring, file access, agent loops, verification gates. The discipline of choosing and tuning the harness.

A useful three-layer model from Aetna Labs (April 2026): Information (what the model can see), Execution (what tools it can run), Feedback (how outputs are checked). Most disappointing AI output is a configuration problem, not a model problem.

Strict-Write and Auto-Dream — memory disciplines

Two named patterns from the Practical AI post-mortem of the Claude Code source leak (April 2026). Both apply at the Memory layer.

Strict-Write

Only record to memory after environment verification — terminal output, API confirmation, filesystem write.

Hallucination prevention at the memory layer, not the inference layer. What gets remembered must have been observed.

Auto-Dream

Periodic consolidation. Every 24 hours (or weekly), review observations and consolidate into permanent facts.

Prevents memory accumulation noise. Pandion's Friday Review is auto-dream at weekly cadence.
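Both disciplines fit in a few lines. The sketch below is our own illustration of the two patterns as described above, not code from any harness: the Memory class and its method names are hypothetical, and the verified flag stands for whatever environment check applies (an exit code, an API response, a file that now exists).

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    observations: list[str] = field(default_factory=list)  # verified, not yet consolidated
    facts: list[str] = field(default_factory=list)         # consolidated, permanent

    def strict_write(self, claim: str, verified: bool) -> bool:
        """Strict-Write: record a claim only if the environment confirmed it."""
        if verified:
            self.observations.append(claim)
        return verified

    def auto_dream(self) -> int:
        """Auto-Dream: fold observations into facts, dropping duplicates.

        Returns the number of new facts added.
        """
        added = 0
        for obs in self.observations:
            if obs not in self.facts:
                self.facts.append(obs)
                added += 1
        self.observations.clear()
        return added
```

In use, the caller computes verified from an observation, for example a test run's return code being zero, and schedules auto_dream daily or weekly.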

Briefing Opus 4.8 — the judgment shift

Opus 4.8 (28 May 2026) carries forward the literal-instruction pattern from 4.7, so the five-point brief format still works. One thing changes: the model now pushes back more reliably. It is 4x less likely to let a flaw or an unsound plan pass without comment. That is an asset, not friction. Build your briefs expecting interrogation:

Lead with the goal in one sentence.
State the constraints (audience, length, tone, format, what to avoid).
Define what “done” looks like — the shape and standard of the output.
Tell the model what to verify before returning.
Then let it run. If it pushes back, that is useful signal, not a failure.
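If you write briefs often, the first four points can be held as a small template that refuses to render when a part is missing. A minimal sketch: the Brief class, its field names, and the closing line inviting pushback are our own illustration, not a format the model requires.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    goal: str               # one sentence
    constraints: list[str]  # audience, length, tone, format, what to avoid
    done: str               # shape and standard of the output
    verify: list[str]       # checks to run before returning

    def render(self) -> str:
        """Render the brief as plain text, or raise if any part is empty."""
        missing = [name for name, value in vars(self).items() if not value]
        if missing:
            raise ValueError(f"brief is missing: {', '.join(missing)}")
        lines = [f"Goal: {self.goal}", "Constraints:"]
        lines += [f"- {c}" for c in self.constraints]
        lines += [f"Done means: {self.done}", "Verify before returning:"]
        lines += [f"- {v}" for v in self.verify]
        # Assumption: an explicit invitation to challenge the plan.
        lines.append("If the plan looks unsound, say so before proceeding.")
        return "\n".join(lines)
```

The fifth point, letting it run, is deliberately not a field: it is what you do after sending the rendered text.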

The pushback behaviour also changes how you run agentic loops. Where 4.7 would proceed and surface problems at the end, 4.8 is more likely to flag an issue mid-task. Build review checkpoints that can receive a challenge, not just a result.

The half-hour exercise that pays back across every prompt for the next quarter: take your most-used saved instruction or system prompt, the one you wrote against an earlier model and haven't touched in months, and tighten it. Be specific where you were vague. Add verification checks. Define done. Then check whether the brief is robust enough to receive a challenge mid-task.

Fast Mode note (29 June 2026): Opus 4.6 Fast Mode is deprecated from that date. Workflows using /fast will default to Opus 4.8 Fast Mode (2x standard Opus 4.8 pricing: $10/$50 per million tokens). Check any saved configurations or automations that rely on 4.6 Fast Mode before the end of June.
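To see what the change means for a given automation, the arithmetic is simple. This sketch uses only the figures quoted above and assumes the usual reading of "$10/$50" as input price / output price per million tokens; the function names and the example volumes are hypothetical.

```python
FAST_INPUT_PER_MTOK = 10.0   # USD per million input tokens, as quoted above
FAST_OUTPUT_PER_MTOK = 50.0  # USD per million output tokens, as quoted above
FAST_MULTIPLIER = 2          # "2x standard", as quoted above


def fast_mode_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a workload at the quoted Fast Mode prices."""
    total = input_tokens * FAST_INPUT_PER_MTOK + output_tokens * FAST_OUTPUT_PER_MTOK
    return total / 1_000_000


def standard_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of the same workload at standard pricing, derived from the 2x multiplier."""
    return fast_mode_cost(input_tokens, output_tokens) / FAST_MULTIPLIER
```

A hypothetical monthly automation using 2,000,000 input tokens and 400,000 output tokens would come to $40 in Fast Mode against $20 at standard pricing.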

INTEGRATING FRAME • NEW — MAY 2026

Don't Break the Loop

Jason Liu, on the Codex team, published a guide in May 2026 (“Codex Maxing”) listing nine tips for getting more out of Codex. Read in one go, they describe a single integrating shift: the productivity unlock with AI is no longer faster turn-taking. It is parallel work. The operator and the agent stay in motion together.

The tips map cleanly onto the AgentOS Layers we already use. Each is in service of one principle: never put the agent on pause while you think, observe, or change direction.

Vocabulary alignment: the Codex team is now articulating the patterns Pandion already names. Mono-thread, files-not-chat memory, harness as a work system rather than a chat replacement, voice as a way to brief richer context, side panel for parallel review. The AgentOS Layers framework is no longer Pandion-specific vocabulary; the practitioners shipping the harnesses are using the same words.

“A long thread can remember a lot, but that memory is trapped inside the thread unless the useful parts get serialized somewhere durable. Files force the agent to compress experience into a form that can survive the thread.”

— Jason Liu (Codex team), “Codex Maxing”, May 2026

Mono-thread

One long-lived durable thread per workstream. Compaction keeps the larger context alive across sessions. (Layer: Context, Memory.)

Voice

Brief the agent by rambling, not by typing a polished sentence. Messy input is fine; the agent helps you turn it into something clear. (Layer: Identity, Skills.)

Steer

Update the prompt while work is in progress. You don't need the perfect upfront brief; redirect in motion. (Layer: Skills, Verification.)

Files-not-chat memory

Structured memory in plain files (with an agents.md at the root telling the agent what to write down). Memory survives the thread. (Layer: Memory.)

Computer + browser use

Give the agent access to local files, the browser, and external services. It becomes an evidence gatherer, not a chat box. (Layer: Connections.)

Remote control

Steer long-running work from mobile while you do something else. Useful when tasks scale to hours, not minutes. (Layer: Skills, Automations.)

Heartbeats

Scheduled or trigger-based check-ins. The thread wakes itself up, checks email, Slack, a render, an inbox. (Layer: Automations.)

Goals (/goal)

For work with verifiable success criteria, hand the agent the goal and let it push against it. Now in both Codex and Claude Code. (Layer: Verification.)

Side panel

Inspect and annotate artefacts while the agent keeps building. Parallel review, not turn-taking. (Layer: Verification.)
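The layer tags on the nine tips can be inverted to show which AgentOS layers each tip exercises and where the emphasis falls. The mapping below is copied from the tags above; tips_by_layer is an illustrative helper.

```python
from collections import defaultdict

# Tip -> layers, as tagged in the list above.
TIP_LAYERS = {
    "Mono-thread": ["Context", "Memory"],
    "Voice": ["Identity", "Skills"],
    "Steer": ["Skills", "Verification"],
    "Files-not-chat memory": ["Memory"],
    "Computer + browser use": ["Connections"],
    "Remote control": ["Skills", "Automations"],
    "Heartbeats": ["Automations"],
    "Goals (/goal)": ["Verification"],
    "Side panel": ["Verification"],
}


def tips_by_layer(tip_layers: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert the mapping: layer -> tips that exercise it, in listed order."""
    by_layer = defaultdict(list)
    for tip, layers in tip_layers.items():
        for layer in layers:
            by_layer[layer].append(tip)
    return dict(by_layer)
```

Inverted, the tags touch all seven layers, with Skills and Verification each served by three tips.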

What changes when you stop breaking the loop: you stop sitting and waiting for the agent to finish a thing before you can think about the next thing. The agent stops sitting and waiting for you to type the perfect next prompt. Both of you keep moving. For a solo or small-business operator, this is where the day-to-day multiplier with AI actually shows up — not in any single model upgrade, but in the shape of the working relationship.

NEW — MAY 2026 • RESEARCH PREVIEW

Dynamic Workflows and the Fleet Pattern

The “Don't Break the Loop” frame describes one-operator, one-agent motion. Dynamic Workflows (research preview, Max/Team/Enterprise) extends it: Claude coordinates a fleet of parallel subagents within a single session, each working a slice of a larger job simultaneously.

Agent View (research preview) adds the monitoring layer: kick off an agent, send it to the background, see what is waiting, running, or done without staying in the conversation. Jump in only when a decision is needed. The command is claude agents.

Dynamic Workflows

Ask Claude to “Create a workflow” and it coordinates a fleet: each subagent works a different part of the job in parallel, results converge in the session. No external tooling or separate API calls required.

Current availability: Max, Team, and Enterprise plans. Research preview. Watch for general availability.

Agent View

Background dispatch and monitoring. Start a task, step away, return when the agent flags a decision or completes. Replaces the “keep checking the terminal” loop with a status dashboard.

Available now: claude agents in Claude Code. Research preview across plans.
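The shape of the fleet pattern is plain fan-out and fan-in: split the job into slices, work the slices in parallel, and gather the results in one place. The sketch below shows that shape in generic Python with a thread pool. It does not call Claude or any vendor API, and count_words is a trivial hypothetical stand-in for real subagent work.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fleet(slices, worker, max_workers=4):
    """Run worker over each slice in parallel and return results in slice order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, slices))


def count_words(text: str) -> int:
    """Hypothetical subagent: a trivial stand-in for real agent work."""
    return len(text.split())
```

For example, run_fleet(["a b c", "d e", "f"], count_words) returns one result per slice, in order, ready to be combined. The hard part in practice is the slicing and the completion criteria for each slice, not the parallelism itself.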

What to build toward now, even if you're on a Pro plan: use /goal to practise defining completion conditions rather than prompting step by step. That is the same skill Dynamic Workflows requires. The habit of staging a task well — clear criteria, clear constraints, clear “done” — transfers directly when fleet orchestration becomes available on your plan.

NEW — MAY 2026

The Human Sandwich and the Team Agent

Two practitioner patterns from Every’s experiments with AI-native work offer clear mental models for how human-agent collaboration actually operates in harnesses like Claude Code, Codex, and Cowork.

The Human Sandwich

Dan Shipper’s framing: you set the frame and success criteria (what “done” looks like, what counts as good). The agent collapses the task into drafts, searches, code, and comparisons. You judge, extend, and decide what’s next.

This is the practical shape of complex human-agent work: neither turn-by-turn prompting nor fully autonomous delegation, but a repeating cycle where human judgment bookends the agent’s execution. The “don’t break the loop” principle from the Codex team is the same idea from the agent’s side.

The Team Agent

Every’s lesson from their first round of agent experiments: per-person agents (each employee with their own AI replica) create per-person maintenance burdens and lose their value when the person leaves.

The better model: a shared agent built around a recurring workflow — weekly digest, client report drafting, invoice reconciliation — maintained by one person, used by everyone whose work intersects. Better continuity, lower maintenance overhead, and team context that survives staff changes.

The practical question to ask about any agent workflow: is this a personal agent (useful only to one person, loses context when they leave) or could it be a team agent (shared around a workflow, maintained once, useful to everyone whose work touches it)? Most recurring workflows benefit from being the second.

Explore Other Tiers

The Landscape

The full menu of AI tools, models, and platforms available today.

The Foundation

Strategy, data, sustainability, and the adoption landscape.

The Application

Domain-specific AI applications and implementation patterns.

From Knowledge to Capability

Context engineering, agent orchestration, skills architecture, fluency development, and workforce capability – these practices determine whether AI delivers consistent value or inconsistent experiments. If any of these feel uncertain, we can help you get them right.
