AI CAPABILITY • PRACTICE

The Practice

Where AI knowledge becomes operational capability

Context engineering, agent orchestration, skills fluency, and curated learning paths – the disciplines and resources that create real-world AI value.

AT A GLANCE

The whole practice in one view

Ten sections plus a curated learning library, grouped by discipline.

Context Engineering

4 sections

Agents & Orchestration

3 sections

Skills & Fluency

3 sections

Learning Resources

curated library

The discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

MAY 2026 PRACTICE LENS

The practical shift is from prompting to staging. Small teams get leverage when they prepare the context, mark what is locked or still open, give agents clear rubrics, and preserve the useful learning after each run. Agent management is becoming a work primitive, not a specialist side activity.

In 30 Seconds

Most AI failures aren't model problems. They're context problems. Context engineering is the discipline of designing what AI knows, when it knows it, and how that knowledge is structured.

This is where strategy becomes capability. Without the right context architecture, even the best AI strategy remains a document on a shelf.

Our core expertise: This is Practice-tier work – designing and implementing the context systems that make AI useful in your specific environment. Not one-off prompts, but persistent capability.

What We've Learned in Practice

Through implementing context systems across sustainability consulting, trading operations, and client delivery, we've validated several patterns:

Compression Compounds

Each context reduction makes the next easier. Start aggressive, refine based on what breaks.

Memory Health Is Maintenance

Context architecture isn't a one-time setup. Weekly maintenance protocols prevent gradual degradation.

Navigation Beats Structure

Folder hierarchies organise files. Topic-based navigation paths tell AI what to load and when. Both are needed.

Session Continuity Is Hard

The handoff between work sessions is the least discussed problem in AI—and often the most impactful to solve.

Results from our implementations: 3× faster session startup, 40% reduction in token costs, near-zero context drift.

The Shift from Prompts to Systems

Prompt engineering asks: “How do I phrase this question?”

Context engineering asks: “What does AI need to know to answer well?”

Anthropic defines it as “designing dynamic systems that provide AI models with the right information at the right time.” It's the evolution from crafting individual queries to architecting information environments.

| | Prompt Engineering | Context Engineering |
| Focus | The question | The knowledge |
| Scope | Single query | Entire system |
| Approach | Craft better prompts | Design information flow |
| Result | Better answers | Consistent capability |
NEW: MAY 2026

Staging Work

In the agent era, the operator's job is often no longer to produce the output directly. It is to stage the conditions under which the agent can produce the output well: context, constraints, examples, success criteria, handoff state, and review loops.

Locked

Decided. The agent should treat this as a constraint.

Provisional

Leaning this way, but open to pressure-testing.

Open

The agent should explore options and surface trade-offs.

Contested

There is a live disagreement or unresolved tension.

This is why handoff formats matter. Markdown works when the document is mostly for agents and will be edited repeatedly. HTML can work better when humans need to inspect, compare, or interact with the staging artefact. The format follows the audience, lifecycle, and time horizon.
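The four markers above (Locked, Provisional, Open, Contested) can be carried into a staging document mechanically. The following is a minimal sketch, assuming a Markdown handoff; the names `Status`, `StagedItem`, and `render_staging_brief` are hypothetical and not part of any tool mentioned here.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    LOCKED = "Locked"            # decided; the agent treats it as a constraint
    PROVISIONAL = "Provisional"  # leaning this way; open to pressure-testing
    OPEN = "Open"                # agent should explore options and trade-offs
    CONTESTED = "Contested"      # live disagreement or unresolved tension


@dataclass
class StagedItem:
    text: str
    status: Status


def render_staging_brief(items: list[StagedItem]) -> str:
    """Group staged items by status (in enum order) as Markdown sections."""
    sections = []
    for status in Status:
        matching = [item.text for item in items if item.status is status]
        if not matching:
            continue  # skip statuses with nothing staged
        lines = [f"## {status.value}"] + [f"- {text}" for text in matching]
        sections.append("\n".join(lines))
    return "\n\n".join(sections)
```

Grouping by status rather than by topic is a design choice in this sketch: it lets the agent see every hard constraint before it reads anything that is still negotiable.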

Outcome Rubrics

Agents are producing more work than humans can comfortably review line by line. The next practice layer is to define what good looks like before the work starts: acceptance criteria, examples, failure modes, and a rubric that a separate review agent can apply after the first pass.

Practical pattern: assign the builder agent the task, assign a separate grader agent the rubric, and only bring the human in for judgment calls, exceptions, and final responsibility.
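As a rough illustration of the builder/grader split, the sketch below applies a rubric to a finished output and escalates to a human only when a criterion fails. It is an assumption-laden stand-in: in a real setup each check would be a call to a separate grader agent, whereas here `Criterion.check` is a plain Python predicate, and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]  # stand-in for a grader agent's judgement


def grade(output: str, rubric: list[Criterion]) -> dict:
    """Apply every rubric criterion and report which ones failed."""
    failed = [c.name for c in rubric if not c.check(output)]
    return {
        "passed": not failed,
        "failed": failed,
        # Assumption: any failed criterion is routed to a human reviewer.
        "escalate_to_human": bool(failed),
    }


# Hypothetical rubric defined before the builder agent starts work.
rubric = [
    Criterion("has summary heading", lambda text: "Summary" in text),
    Criterion("under 50 words", lambda text: len(text.split()) <= 50),
]
```

Calling `grade("Summary: all checks done", rubric)` reports a pass, while an output without the heading comes back with `"has summary heading"` in the `failed` list and the escalation flag set.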

Dreaming as Maintenance

Scheduled memory review is becoming a product feature. The useful habit is older: review what happened, extract durable patterns, remove stale detail, and carry forward the learnings that should shape the next session.

Practical pattern: after a meaningful run, ask what should become a skill, what should become memory, what should be forgotten, and what should be checked next time.

Interaction Lowers Context Cost

Voice, browser context, pointing, interruption, and real-time correction are not just nicer interfaces. They reduce the cost of transferring intent from a human into an agent. That matters because the bottleneck in agent work is often not model intelligence; it is the user's ability to supply enough context without turning the setup into a separate job.

Speak

Dump messy context faster than typing, then ask the agent to structure it.

Show

Use browser, screen, document, or visual state as part of the prompt, not an afterthought.

Interrupt

Correct the agent while the work is forming, before the wrong path becomes expensive.

The Context Problem

More context doesn't mean better performance.

Research shows input length alone can reduce AI accuracy by 14-85% – even when all information is relevant.

Lost in the Middle

Models favour information at the start and end of context, missing what's in between.

Context Rot

Quality degrades gradually as context accumulates and ages without maintenance.

Signal Dilution

Important information drowns in noise when everything is loaded indiscriminately.

Many teams dump everything into context. That's like answering every question by reading the encyclopaedia aloud.

The issue is not just volume; it's missing decision lineage. Context graphs help by recording approvals and exceptions so AI can retrieve the relevant rationale without loading everything.

THE INDUSTRY'S UNSOLVED PROBLEM

Context Drift

Academic research (2025-2026) identifies context drift—the gradual degradation of context quality across sessions and time—as the central unsolved challenge in AI memory systems. Most tools handle single-session context well. Multi-session, multi-day continuity remains hard. This is where operational rhythm systems (session handoffs, weekly coordination) become critical.

Four Strategies for Managing Context

Based on Anthropic's framework for effective AI systems

1. Write

Persist externally

Store information outside the context window for later retrieval. Files, databases, knowledge bases – anything that persists beyond the session.

2. Select

Load only what's relevant

Retrieve context based on the task at hand, not everything available. Dynamic retrieval, semantic search, just-in-time loading.

3. Compress

Summarise, don't accumulate

Keep context lean through intelligent summarisation. Archive old content, preserve decisions, trim the unnecessary.

4. Isolate

Separate contexts for separate concerns

Don't let different workstreams pollute each other. Multi-agent architectures, session boundaries, role-specific loading.

The Temporal Dimension

The solution to context drift

Most context engineering focuses on structure—how information is organised. We've found the rhythm matters just as much. This is how you solve context drift: not just better architecture, but better cadence.

Session Handoffs

How does context pass between work sessions? What gets carried forward, what gets compressed, what gets archived? Explicit handoff protocols prevent the “starting from scratch” problem.

Weekly Coordination

How do strategic priorities flow into daily work? How does work roll up into weekly synthesis? Coordination bridges connect strategy to execution without context overload.

Memory Maintenance

Context degrades over time. Scheduled compression, archiving cadences, and health checks keep context fresh. Without maintenance, even good architecture accumulates noise.

Why this matters: Academic research identifies context drift as the central unsolved challenge in AI memory systems. Most tools handle single sessions well. Multi-session, multi-day continuity is where operational rhythm becomes critical.

The architecture is the skeleton. The rhythm is the heartbeat.

Tiered Context Architecture

We use a budget-based approach to context layers. Each tier has a token budget and update frequency—this prevents context bloat while ensuring AI has what it needs.

| Tier | Purpose | Token Budget | Load Frequency |
| Tier 0 | Compressed state | ~300 tokens | Always |
| Tier 1 | Active context + navigation | ~1,000 tokens | Session start |
| Tier 2 | Domain knowledge | On-demand | When needed |
| Tier 3 | Archive | Rarely | Historical only |

Key insight: Most teams overload Tier 0-1 and underuse Tier 2-3. The result is context bloat, slower reasoning, and higher costs.

THE LOAD PATTERN

TIER 0: Always loaded
TIER 1: Session start
TIER 2: On-demand
TIER 3: When needed

Context Graphs: Decision Lineage

Systems of record capture what happened. Context graphs capture why.

When AI needs to make a decision, it shouldn't just know the rule—it should know the precedents, exceptions, and reasoning that shaped it.

Approvals & Exceptions

Why was this approved? What precedent does it set? Context graphs make the reasoning retrievable.

Policy Evolution

How did we get here? What changed and why? Decision traces show the path, not just the destination.

Audit Trails

What informed this decision? Who signed off? Context graphs support governance and compliance.

We design context graphs that turn scattered decisions into searchable precedent – making institutional knowledge available to AI without loading everything.

Context as Competitive Moat

STRATEGIC · NEW: MARCH 2026

Google's internal AI team made a revealing observation in early 2026: the “sum totality of an organisation's documents” creates capabilities that no AI lab can replicate from the outside. Your organisational data, decision history, and institutional knowledge are not a problem to manage — they are the competitive advantage.

Why Labs Can't Compete

Foundation models are general-purpose. Your context — client history, policy decisions, domain-specific reasoning — is what transforms general AI into your AI. No amount of training data replicates what you've accumulated through operations.

Context Engineering = Moat Building

This reframes context engineering from a technical practice to a strategic investment. Every well-structured knowledge base, every maintained decision trace, every curated context layer is an asset your competitors don't have.

The implication: Organisations investing in context architecture today aren't just improving AI performance — they're building durable competitive advantage that deepens with every interaction.

Who Benefits from Context Engineering?

Individuals

  • Consistent AI results across sessions
  • Build on previous work, not from scratch
  • Reduce time re-explaining context

Teams

  • Reduce hallucinations through better knowledge
  • Enable handoffs between human and AI
  • Shared context across team members

Organisations

  • Multi-agent coordination without pollution
  • Governance and compliance controls
  • Scalable knowledge management

Our Approach Is Informed By

Anthropic's context engineering guidance

Karpathy's “RAM management” framing

Vercel's AGENTS.md evaluation research

Mei & Yao survey (1,400+ academic papers)

MemAgents architecture research

Validated through our own implementations

In 30 Seconds

AI doesn't remember. Every conversation starts fresh. Every session begins from zero. The brilliant assistant who helped you yesterday has no idea who you are today.

This isn't a bug – it's how LLMs work. But it's also why most AI implementations deliver inconsistent value. The forgetting problem is solvable.

Memory health is the practice of designing systems that give AI what it needs to know – when it needs to know it – without drowning in irrelevant information.

Why This Matters

For Individuals

Without memory, you re-explain context every session. The same background, the same preferences, the same project details – again and again.

Time saved by AI gets consumed by context-setting. The productivity promise erodes with every fresh start.

For Teams

When AI forgets, knowledge doesn't compound. Insights from one session don't inform the next. Each team member starts from scratch.

The result: inconsistent outputs, duplicated effort, and AI that never gets better at understanding your work.

The compounding cost: Every time AI forgets, you lose the value of everything it learned. Good memory design means knowledge builds over time instead of resetting to zero.

The Forgetting Problem

Understanding why AI forgets is the first step to fixing it

Context Windows Have Limits

Every AI model has a finite “context window” – the amount of text it can consider at once. When the window fills, old information gets pushed out.

The Illusion

Modern models have large context windows (100K+ tokens). This feels like plenty of memory.

The Reality

Long contexts degrade performance. Research shows accuracy drops 14-85% as context length increases – even with relevant information.

No Native Persistence

LLMs have no built-in way to store information between sessions. Unlike databases or file systems, they don't write to permanent storage. Each conversation exists in isolation.

What Users Expect

“You remember that project we discussed last week, right?”

What Actually Happens

The model has no access to previous conversations. Last week doesn't exist.

Lost in the Middle

Even within a single context window, attention isn't uniform. Models favour information at the beginning and end, often missing what's in the middle.

The Pattern

Critical information buried in the middle of long conversations gets less attention from the model.

The Impact

Important context can be effectively “forgotten” even while technically still in the window.

Key insight: The “memory problem” isn't a flaw that model improvements will fix. It's a fundamental property of the architecture, and it requires system-level solutions.

Symptoms of Poor Memory Health

Recognise these patterns? They're signs your AI system needs memory architecture.

Repetitive Context-Setting

You explain the same background information every session. “I work at X company, we do Y, the project is about Z...”

Inconsistent Outputs

The same question yields different answers in different sessions. No learning from previous interactions carries forward.

Contradictory Advice

AI suggests approaches that conflict with decisions made in previous sessions. It doesn't know what was already decided.

Context Rot

Long conversations degrade. The AI starts referencing outdated information or losing track of earlier agreements.

Knowledge Silos

Insights from one conversation can't be applied elsewhere. Each session is an island of learning that sinks after use.

The Eternal Beginner

Despite months of use, AI still asks basic questions. It never develops understanding of your domain or preferences.

These aren't AI limitations. They're architecture gaps. Every symptom has a solution – if you design for memory.

The Four Memory Strategies

Based on Anthropic's context engineering framework

1. Write: Persist Externally

Since AI has no native memory, create external storage. Files, databases, knowledge bases – anything that persists beyond the session.

Session Logs

Capture key decisions and outcomes from each conversation

Knowledge Files

Curated information that AI should always know

State Documents

Living files that track current project status

2. Select: Load Only What's Relevant

Don't load everything every time. Retrieve context based on the task at hand. Just-in-time loading beats all-the-time loading.

Dynamic Retrieval

Fetch relevant documents based on the current query

Semantic Search

Find information by meaning, not just keywords

Role-Based Loading

Different tasks load different context packages

3. Compress: Summarise, Don't Accumulate

Keep context lean. Replace long conversation history with concise summaries. Archive old content, preserve decisions, trim the unnecessary.

Conversation Summaries

Replace 50 messages with 5 key takeaways

Decision Logs

Keep what was decided, not how it was discussed

Context Pruning

Regular maintenance to remove outdated information

4. Isolate: Separate Contexts for Separate Concerns

Don't let different workstreams pollute each other. Use boundaries to keep contexts clean and focused.

Session Boundaries

Clear starts and ends for different work types

Project Isolation

Client A's context doesn't leak into Client B

Multi-Agent Design

Different agents with different specialised contexts

Layered Memory Architecture

Organise memory by stability and scope

Effective AI memory isn't a single file – it's a layered architecture. Higher layers are stable and rarely change. Lower layers are ephemeral and session-specific.

L1: SYSTEM INSTRUCTIONS
How the AI should behave, safety rails, core capabilities
L2: AGENT IDENTITY
Who is this AI? Expertise, tone, protocols
L3: STRATEGIC MEMORY
Key decisions, priorities, what's been tried
L4: KNOWLEDGE ARCHITECTURE
Where things are, how to navigate, reference material
L5: ENTITY CONTEXT
Client/project specific: history, preferences, current state
L6: SESSION CONTEXT
Current conversation, working memory, live state

The Design Principle

Load stable layers automatically. Load ephemeral layers dynamically. Don't burden every session with information that rarely changes.

The Maintenance Principle

Update each layer at appropriate intervals. Strategic memory weekly. Session context every message. Match maintenance rhythm to layer stability.

Memory Patterns in Practice

Common patterns for implementing healthy AI memory

The Handoff Document

A single file that captures: what happened, what was decided, what's next. Updated at session end, loaded at session start.

Best for:

Individuals working across multiple sessions on the same project
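A handoff document of this kind can be written and read back with very little machinery. The sketch below assumes a Markdown layout with one heading per question (what happened, what was decided, what's next); the function names and the exact heading text are illustrative choices, and saving to or loading from a file is left to the caller.

```python
SECTIONS = ("What happened", "What was decided", "What's next")


def render_handoff(happened: list[str], decided: list[str], next_steps: list[str]) -> str:
    """Build the handoff text at session end: one section per question."""
    blocks = []
    for title, items in zip(SECTIONS, (happened, decided, next_steps)):
        lines = [f"## {title}"] + [f"- {item}" for item in items]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def parse_handoff(text: str) -> dict[str, list[str]]:
    """Read the handoff back at session start into a section -> items mapping."""
    parsed: dict[str, list[str]] = {title: [] for title in SECTIONS}
    current = None
    for line in text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            parsed.setdefault(current, [])
        elif line.startswith("- ") and current is not None:
            parsed[current].append(line[2:].strip())
    return parsed
```

Keeping the format this plain means the same file stays readable to a person skimming it and to an agent loading it at the start of the next session.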

The Project Bible

A comprehensive reference document containing all project context. Loaded automatically when working on that project.

Best for:

Complex projects with many decisions and constraints to remember

The Skills Library

Modular knowledge files that can be loaded on demand. Different skills for different tasks, loaded as needed.

Best for:

Teams with diverse tasks requiring different domain expertise

The Weekly Bridge

A rhythm-based summary that synthesises the week's sessions. Carries forward key context without accumulating endless history.

Best for:

Ongoing operations with continuous but evolving context

Memory Requires Maintenance

Without Maintenance

  • Context files grow stale
  • Outdated information contradicts current reality
  • Memory becomes noise rather than signal
  • AI references things that are no longer true
  • The system degrades back to forgetfulness

With Maintenance

  • Context stays current and accurate
  • Old information gets archived, not deleted
  • Each session starts with relevant, fresh context
  • Knowledge compounds reliably over time
  • The system gets smarter, not staler

Memory health is a practice, not a one-time setup.
The rhythm matters as much as the architecture.

How We Help

Design, implement, and maintain AI memory systems

Memory Architecture

Design the right layer structure for your context. What belongs where, what loads when, how it all connects.

Implementation

Build the files, set up the retrieval, establish the workflows. From individual setups to team-scale systems.

Maintenance Protocols

Define the rhythms and processes that keep memory healthy. What gets updated when, how staleness is prevented.

In 30 Seconds

There are two ways to give AI the information it needs: load it upfront (passive context) or fetch it when needed (on-demand retrieval). Most teams assume retrieval is smarter. The research says otherwise.

Vercel's Next.js team ran rigorous evaluations comparing these approaches. The result: passive context achieved 100% accuracy where on-demand retrieval achieved 53%.

The insight: When information is always present, there's no decision point that can fail. No retrieval logic to get wrong. No ordering issues. Just consistent availability.

The Research

Vercel AGENTS.md Evaluation (January 2026)

Vercel's Next.js team tested how AI coding agents perform with different context configurations. They compared baseline performance against various approaches for providing project-specific information.

| Accuracy | Configuration | Approach |
| 53% | No documentation | Baseline |
| 53% | On-demand retrieval | Skills system |
| 79% | Retrieval + instructions | Enhanced skills |
| 100% | Passive context | AGENTS.md file |

The striking finding: On-demand retrieval performed no better than having no documentation at all. The retrieval system existed, but it didn't help. Only when information was passively present did performance improve.

Why Passive Context Wins

Three fundamental advantages over on-demand retrieval

1. No Decision Point

On-demand retrieval requires a decision: “What information do I need for this query?” That decision can be wrong. The model might not realise it needs certain context.

With passive context, there's no decision to get wrong. The information is already there. Every time.

2. Consistent Availability

Retrieval systems are probabilistic. They might return relevant documents 80% of the time, or 60%, or 40%. The quality varies by query, by phrasing, by the state of the vector database.

Passive context is deterministic. The same information is present on every turn. No variance. No “sometimes it works” frustration.

3. No Ordering Issues

With retrieval, critical information might arrive too late in the reasoning process. The model starts generating before realising it needs more context.

Passive context is present from the first token. The model reasons with full information from the start.

The Tradeoff: Why Not Load Everything?

If passive context is better, why not just load all available information? Because context windows have effective limits that are smaller than their technical limits.

The Memento Limit

Research suggests effective reasoning capacity is around 100K tokens, even when context windows are technically larger. Beyond this, performance degrades.

A 200K context window doesn't give you 200K of useful reasoning space. It gives you 100K of effective space with increasing noise.

Lost in the Middle

Models pay more attention to the beginning and end of context. Information in the middle gets weighted less, even when it's critical.

More context can mean important information gets buried where the model is less likely to use it effectively.

The goal isn't maximum context. It's the right context. Passive for what matters most. Retrieval for everything else.

Tiered Context Architecture

The pattern that balances passive reliability with retrieval flexibility

| Tier | Type | Token Budget | What Goes Here |
| Tier 0 | Passive | ~300 tokens | Compressed state: current status, key metrics, active items |
| Tier 1 | Passive | ~1,000 tokens | Active context: navigation, recent decisions, current focus |
| Tier 2 | On-demand | Variable | Domain knowledge: loaded when topic requires it |
| Tier 3 | Retrieval | As needed | Archive: historical, rarely accessed |

Passive Foundation

Tier 0 and Tier 1 are always loaded. This is your passive context. Keep it lean (~1,300 tokens total) but ensure it contains everything AI needs to orient itself and navigate effectively.

Retrieval for Depth

Tier 2 and Tier 3 use selective retrieval. Navigation paths in Tier 1 point to relevant Tier 2 content. This gives you depth without bloat.

Implementation Patterns

Practical approaches for passive context systems

The Project File Pattern

A single file (CLAUDE.md, AGENTS.md, or similar) at project root containing everything AI needs to work effectively in that context.

Typical contents:

  • Project description and purpose
  • Key decisions and constraints
  • Build/test commands
  • Code conventions
  • Current focus areas

The MEMORY + CONTEXT Pattern

Two complementary files: MEMORY.md for compressed state (~300 tokens), CONTEXT.md for active context and navigation (~1,000 tokens).

The split:

  • MEMORY: “Where are we?” (status, metrics, active items)
  • CONTEXT: “How do I work here?” (navigation, decisions, focus)

The Navigation Hub Pattern

Passive context includes a navigation table: “When you need X, read Y.” This creates predictable paths from topics to relevant files.

Example:

| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
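To show how such a table could drive loading, here is a minimal sketch that parses the navigation rows into a lookup and returns the files whose topic appears in a query. The function names are hypothetical, and matching by simple substring is an assumption made for brevity; a real setup might leave the routing decision to the agent itself.

```python
def parse_navigation_table(table: str) -> dict[str, str]:
    """Turn '| topic | path |' rows into a {topic: path} mapping."""
    routes: dict[str, str] = {}
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 2 or cells[0].lower() == "topic":
            continue  # skip malformed rows and the header row
        if set(cells[0]) <= set("-: "):
            continue  # skip a Markdown separator row, if one is present
        routes[cells[0].lower()] = cells[1]
    return routes


def files_for(query: str, routes: dict[str, str]) -> list[str]:
    """Return the files to read for a query, by case-insensitive topic match."""
    lowered = query.lower()
    return [path for topic, path in routes.items() if topic in lowered]


NAV_TABLE = """
| Topic | Read |
| pricing | docs/pricing-rules.md |
| deployment | docs/deploy-guide.md |
"""
```

With the example table, a question about pricing for renewals resolves to `docs/pricing-rules.md`, and a query that mentions no listed topic resolves to nothing, so no Tier 2 content is loaded.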

The Token Budget Pattern

Explicit limits on each passive context file. When a file exceeds its budget, compress it. Move detail to Tier 2 and keep pointers in Tier 0-1.

Enforcement:

  • Tier 0: Max 300 tokens (hard limit)
  • Tier 1: Max 1,000 tokens (soft limit)
  • Review weekly, compress as needed
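The budget limits above lend themselves to an automated check. This sketch is illustrative only: it assumes the MEMORY.md and CONTEXT.md file names used in this section, and it estimates tokens with a rough four-characters-per-token heuristic rather than a real tokenizer, so treat the counts as approximate.

```python
# Budgets from the tiers above: (token limit, "hard" or "soft").
BUDGETS = {
    "MEMORY.md": (300, "hard"),    # Tier 0
    "CONTEXT.md": (1000, "soft"),  # Tier 1
}


def estimate_tokens(text: str) -> int:
    # Assumption: ~4 characters per token for English prose (rounded up).
    # Swap in your model's tokenizer for exact counts.
    return (len(text) + 3) // 4


def check_budgets(contents: dict[str, str]) -> list[str]:
    """Return one finding per file that exceeds its token budget."""
    findings = []
    for name, (limit, kind) in BUDGETS.items():
        used = estimate_tokens(contents.get(name, ""))
        if used > limit:
            level = "ERROR" if kind == "hard" else "WARN"
            findings.append(f"{level}: {name} uses ~{used} tokens (budget {limit})")
    return findings
```

Running a check like this as part of the weekly review turns the hard limit into something that fails loudly, while the soft limit only prompts a compression pass.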

When to Use What

Use Passive Context For

  • Identity and behaviour rules (always needed)
  • Current project state (changes, but always relevant)
  • Navigation pointers (how to find deeper content)
  • Recent decisions (context that's frequently referenced)
  • Session handoff state (what to pick up from last time)

Use Retrieval For

  • Large knowledge bases (too big for passive loading)
  • Historical archives (rarely needed)
  • Domain-specific content (only relevant for certain queries)
  • Reference documentation (detailed specs, APIs)
  • Content that varies by user/session

The combination is powerful: Passive foundation + selective retrieval. Reliability where it matters most. Flexibility where you need depth.

Common Mistakes

Mistake 1: No Passive Context at All

Relying entirely on retrieval. Every query starts with a search. Result: inconsistent baseline, variance in quality, and the 53% accuracy seen in the Vercel evaluation.

Fix: Establish a passive foundation, even if it's just 500 tokens.

Mistake 2: Too Much Passive Context

Loading everything passively to avoid retrieval complexity. Result: bloated context, lost-in-the-middle problems, degraded reasoning.

Fix: Enforce token budgets. Compress aggressively. Move detail to Tier 2.

Mistake 3: Stale Passive Context

Setting up passive context once and never updating it. Result: AI references outdated information, makes contradictory decisions.

Fix: Weekly review cadence. Update Tier 0 after every significant change.

Mistake 4: No Navigation to Tier 2

Passive context that doesn't tell AI where to find deeper information. Result: AI either hallucinates or asks repeatedly for guidance.

Fix: Include navigation paths. “When you need X, read Y.”

Getting Started

1. Create a Tier 0 file

Start with ~300 tokens of compressed state. Current status, key metrics, active items. Name it MEMORY.md or include it at the top of your main context file.

2. Add navigation to Tier 1

Create a CONTEXT.md with ~1,000 tokens. Include a navigation table: “When topic X comes up, read file Y.” This creates predictable paths to deeper content.

3. Configure automatic loading

Ensure your AI tool loads Tier 0 and Tier 1 at session start. For Claude Code, this means CLAUDE.md. For other tools, AGENTS.md or equivalent.

4. Establish maintenance rhythm

Weekly: review passive context for staleness. After significant changes: update Tier 0. Monthly: audit token budgets and compress as needed.

In 30 Seconds

AgentOS is the persistent foundation underneath whichever AI tool you use. Plain text files at the root of your workspace describing who you are, what you know, how you work, what you remember, what you can reach, how you verify, and what you automate.

The model is the engine. The harness is the runtime (Claude Code, Cursor, Codex). The AgentOS is yours. Models change every six months. Harnesses converge over twelve to twenty-four months. The AgentOS compounds across both.

The terminology landed publicly in April 2026 via AIDB's programme on Personal Context Portfolios. Several pieces of vocabulary — PCP, Monothread, Harness Engineering, Strict-Write, Auto-Dream — now name patterns we've been using or building for years. This section maps them.

The Seven Layers

Each layer is a discipline. You don't build them all at once. You build the foundation (Identity + Context) first, then add the others as your work demands them.

1. Identity

Who you are. What you do. What you stand for. The file every other layer references.

2. Context

Your situation. What’s true now. Your operating environment. (Pandion calls this layer’s discipline Context Engineering.)

3. Skills

Procedural knowledge made portable. Agents and named capabilities that can be loaded and run.

4. Memory

What compounds across sessions. What gets remembered, summarised, archived.

5. Connections

The data sources, services, and tools your AI can reach. Trust-graded.

6. Verification

How outputs are checked, grounded, evaluated. Trust by construction, not by hope.

7. Automations

What runs without you. Scheduled jobs, triggers, agents that act on signal.

The seven layers compound. The layers below feed the layers above. The whole stack survives every harness swap.

Personal Context Portfolio (PCP)

NLW's ten-file markdown recipe for the bottom of an AgentOS. A solo operator can sit down and have version one in an afternoon. Each file lives at the root of your workspace as plain text:

identity.md
rolesAndResponsibilities.md
currentProjects.md
teamAndRelationships.md
toolsAndSystems.md
communicationStyle.md
goalsAndPriorities.md
preferencesAndConstraints.md
domainKnowledge.md
decisionLog.md

PCP is a specific organising recipe for the Identity + Context (and bits of Memory + Connections) layers of AgentOS. It's not the whole AgentOS. It's a tractable starting point for layers 1, 2, 4 and 5.

Where Pandion sits: our Context Engineering methodology (MEMORY.md + CONTEXT.md + neural paths + topic-memory pattern) is a richer architecture than flat PCP for the Context layer specifically. Same job, more sophistication. PCP is a clean public recipe; CE is the upgrade path.

Monothread — one long-lived thread, not fresh chats per task

Named by Nick Bowman in mid-April. The pattern: a thread's value increases over time when context compaction is good. You keep one long-lived orchestration thread, plus specialist sub-threads spawned from it. The thread accumulates. You don't throw away context every Monday morning.

For most people who've learned AI through ChatGPT, the instinct is the opposite: fresh chat per task, one-off prompts, lose the context. Monothread inverts that: brief once, accumulate, compact. Pandion's BATON + MEMORY + MASTER-OVERVIEW filesystem pattern is monothread-as-files; the orchestration thread reads them at every session start.

What this looks like in practice: a single working thread for a project that runs across weeks, with the AgentOS files providing the persistent memory between sessions. Sub-threads spawn for narrow specialist work and report back. The orchestration thread never resets.

Harness Engineering — the named industry discipline

The lineage: prompt engineering (2023) became context engineering (2024) became harness engineering (2026). Each names a different layer of work:

  • Prompt engineering — how you phrase a single request. Largely absorbed into the model.
  • Context engineering — what the AI knows, when it knows it. The Context layer of AgentOS.
  • Harness engineering — how the runtime is configured: tools, memory wiring, file access, agent loops, verification gates. The discipline of choosing and tuning the harness.

A useful three-layer model from Aetna Labs (April 2026): Information (what the model can see), Execution (what tools it can run), Feedback (how outputs are checked). Most disappointing AI output is a configuration problem, not a model problem.

Strict-Write and Auto-Dream — memory disciplines

Two named patterns from the Practical AI post-mortem of the Claude Code source leak (April 2026). Both apply at the Memory layer.

Strict-Write

Only record to memory after environment verification — terminal output, API confirmation, filesystem write.

Hallucination prevention at the memory layer, not the inference layer. What gets remembered must have been observed.
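A minimal sketch of Strict-Write, assuming memory is a simple in-process list and verification is a callable that inspects the environment. `StrictMemory` and `file_written` are hypothetical names for illustration, not an API from any tool mentioned here.

```python
from pathlib import Path
from typing import Callable


class StrictMemory:
    """Memory store that only accepts claims backed by an environment check."""

    def __init__(self) -> None:
        self.facts: list[str] = []
        self.rejected: list[str] = []

    def record(self, claim: str, verify: Callable[[], bool]) -> bool:
        """Store the claim only if verify() confirms it was observed."""
        try:
            observed = bool(verify())
        except Exception:
            observed = False  # a check that errors counts as "not observed"
        (self.facts if observed else self.rejected).append(claim)
        return observed


def file_written(path: Path) -> Callable[[], bool]:
    """Build a check that passes only if the file exists and is non-empty."""
    return lambda: path.is_file() and path.stat().st_size > 0
```

A call such as `memory.record("report.md was saved", file_written(Path("report.md")))` remembers the claim only when the filesystem confirms it; unverified claims land in `rejected` for review.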

Auto-Dream

Periodic consolidation. Every 24 hours (or weekly), review observations and consolidate into permanent facts.

Prevents memory accumulation noise. Pandion's Friday Review is auto-dream at weekly cadence.
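One way to sketch a consolidation pass. The promotion rule used here (an observation seen at least twice becomes a permanent fact) is an assumption chosen for illustration; the source does not specify how observations are promoted, and `consolidate` is a hypothetical helper.

```python
from collections import Counter


def consolidate(
    observations: list[str],
    facts: set[str],
    min_count: int = 2,
) -> tuple[set[str], list[str]]:
    """Promote repeated observations to facts; carry the rest to the next review.

    Observations are normalised (trimmed, lower-cased) so near-duplicates count together.
    """
    counts = Counter(o.strip().lower() for o in observations if o.strip())
    promoted = {text for text, n in counts.items() if n >= min_count}
    carried = sorted(
        text for text, n in counts.items() if n < min_count and text not in facts
    )
    return facts | promoted, carried


facts, carried = consolidate(
    ["API uses v2", "api uses v2 ", "Build is slow"],
    facts=set(),
)
print(facts)    # {'api uses v2'}
print(carried)  # ['build is slow']
```

Run daily or at a weekly review, a pass like this keeps the permanent store small while one-off observations wait for a second sighting.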

Briefing Opus 4.7 — the literal-instruction shift

Opus 4.7 (April 2026) follows instructions more literally than 4.6. Vague or hedging prompts get punished where 4.6 would guess reasonably. The pattern that works:

  • Lead with the goal in one sentence.
  • State the constraints (audience, length, tone, format, what to avoid).
  • Define what “done” looks like — the shape and standard of the output.
  • Tell the model what to verify before returning.
  • Then let it run. Don't refine across ten messages.
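The checklist above can be captured as a small template so that an incomplete brief is caught before it is sent. This is a sketch under assumptions: the `Brief` class, its field names, and the rendered layout are illustrative, not a format prescribed by any model vendor.

```python
from dataclasses import dataclass, field


@dataclass
class Brief:
    """Hypothetical container for the four parts of a brief."""
    goal: str
    constraints: list[str] = field(default_factory=list)
    done: str = ""
    verify: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Name the parts of the brief that are still empty."""
        checks = {
            "goal": self.goal.strip(),
            "constraints": self.constraints,
            "done": self.done.strip(),
            "verify": self.verify,
        }
        return [name for name, value in checks.items() if not value]

    def render(self) -> str:
        """Render the brief as one message, goal first; refuse if incomplete."""
        gaps = self.missing()
        if gaps:
            raise ValueError(f"brief is incomplete: {', '.join(gaps)}")
        parts = [f"Goal: {self.goal}", "Constraints:"]
        parts += [f"- {c}" for c in self.constraints]
        parts += [f"Done looks like: {self.done}", "Verify before returning:"]
        parts += [f"- {v}" for v in self.verify]
        return "\n".join(parts)
```

Calling `Brief(goal="Summarise Q1 churn").missing()` returns `["constraints", "done", "verify"]`, which makes the gaps visible before the first message rather than across ten follow-ups.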

Most people's instinct, learned over two years of ChatGPT, is to throw a quick prompt in and refine it conversationally. That instinct now costs you quality and (with the cost shift coming in mid-2026) money. The matching shift at runtime: 4.7 is built to be delegated to. Write a proper brief, hand it the work, walk away.

The half-hour exercise that pays back across every prompt for the next quarter: take your most-used saved instruction or system prompt, the one you wrote against an earlier model and haven't touched in months, and tighten it. Be specific where you were vague. Add verification checks. Define done.

INTEGRATING FRAME • NEW, MAY 2026

Don't Break the Loop

Jason Liu, on the Codex team, published a guide in May 2026 (“Codex Maxing”) listing nine tips for getting more out of Codex. Read in one go, they describe a single integrating shift: the productivity unlock with AI is no longer faster turn-taking. It is parallel work. The operator and the agent stay in motion together.

The tips map cleanly onto the AgentOS Layers we already use. Each is in service of one principle: never put the agent on pause while you think, observe, or change direction.

Vocabulary alignment: the Codex team is now articulating the patterns Pandion already names. Mono-thread, files-not-chat memory, harness as a work system rather than a chat replacement, voice as a way to brief richer context, side panel for parallel review. The AgentOS Layers framework is no longer Pandion-specific vocabulary; the practitioners shipping the harnesses are using the same words.

“A long thread can remember a lot, but that memory is trapped inside the thread unless the useful parts get serialized somewhere durable. Files force the agent to compress experience into a form that can survive the thread.”

— Jason Liu (Codex team), “Codex Maxing”, May 2026

Mono-thread

One long-lived durable thread per workstream. Compaction keeps the larger context alive across sessions. (Layer: Context, Memory.)

Voice

Brief the agent by rambling, not by typing a polished sentence. Messy input is fine; the agent helps you turn it into something clear. (Layer: Identity, Skills.)

Steer

Update the prompt while work is in progress. You don't need the perfect upfront brief; redirect in motion. (Layer: Skills, Verification.)

Files-not-chat memory

Structured memory in plain files (with an agents.md at the root telling the agent what to write down). Memory survives the thread. (Layer: Memory.)

Computer + browser use

Give the agent access to local files, the browser, and external services. It becomes an evidence gatherer, not a chat box. (Layer: Connections.)

Remote control

Steer long-running work from mobile while you do something else. Useful when tasks scale to hours, not minutes. (Layer: Skills, Automations.)

Heartbeats

Scheduled or trigger-based check-ins. The thread wakes itself up and checks email, Slack, or a render in progress. (Layer: Automations.)

Goals (/goal)

For work with verifiable success criteria, hand the agent the goal and let it push against it. Now in both Codex and Claude Code. (Layer: Verification.)

Side panel

Inspect and annotate artefacts while the agent keeps building. Parallel review, not turn-taking. (Layer: Verification.)
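The Goals pattern in the list above depends on a success criterion a program can check. A minimal sketch of that loop follows, assuming the agent call and the check are plain callables; `pursue_goal`, `step`, and `is_done` are hypothetical names and do not reflect how Codex or Claude Code implement `/goal`.

```python
from typing import Callable


def pursue_goal(
    step: Callable[[int], str],
    is_done: Callable[[str], bool],
    max_attempts: int = 5,
) -> tuple[bool, list[str]]:
    """Run attempts until the verifiable check passes or the budget runs out.

    step(n) stands in for one agent attempt; is_done(result) is the success check.
    Returns whether the goal was met and the log of every attempt's result.
    """
    attempts: list[str] = []
    for n in range(1, max_attempts + 1):
        result = step(n)
        attempts.append(result)
        if is_done(result):
            return True, attempts
    return False, attempts
```

The attempt budget matters as much as the check: without `max_attempts`, a goal that cannot be met keeps the agent running indefinitely.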

What changes when you stop breaking the loop: you stop sitting and waiting for the agent to finish a thing before you can think about the next thing. The agent stops sitting and waiting for you to type the perfect next prompt. Both of you keep moving. For a solo or small-business operator, this is where the day-to-day multiplier with AI actually shows up — not in any single model upgrade, but in the shape of the working relationship.

From Knowledge to Capability

Context engineering, agent orchestration, skills architecture, fluency development, and workforce capability – these practices determine whether AI delivers consistent value or inconsistent experiments. If any of these feel uncertain, we can help you get them right.