What is AI advisory and why do organisations need it?

AI advisory helps organisations navigate the complex AI landscape to find real value beyond the hype. Most AI projects fail not because of technology, but because of context problems, coordination gaps, and capability issues. AI advisory addresses these practical blockers to help you build AI capabilities that actually work.

What is context engineering in AI?

Context engineering is the discipline of designing what AI knows, when it knows it, and how that knowledge is structured. It's about giving AI the right information at the right time. Most AI failures are context failures - the model is capable, but it doesn't have the information it needs to help you effectively.

What's the difference between AI tools and AI fluency?

AI tools are software you can access. AI fluency is the ability to think with AI - to know when and how to use it effectively, to structure problems for AI collaboration, and to evaluate AI outputs critically. Many teams have tool access but struggle to get consistent value because they lack genuine fluency.

How do skills-based AI systems differ from traditional AI agents?

Traditional multi-agent systems create separate AI agents for different tasks. Skills-based systems give one AI agent multiple skills - packaged expertise it can invoke as needed. Skills compound: every skill you create is available to everyone, forever. This approach is simpler to maintain and scales organisational intelligence.

Which AI model should my organisation use - Claude, GPT, or Gemini?

The best model depends on your specific use case, data sensitivity requirements, and integration needs. Claude excels at nuanced reasoning and safety. GPT has the largest ecosystem. Gemini offers strong multimodal capabilities and Google integration. We help organisations evaluate options against their actual requirements rather than following hype.

What is the difference between Claude, GPT, and Gemini?

Claude (Anthropic) excels at nuanced reasoning, safety, and complex instructions. GPT (OpenAI) has the largest ecosystem and broad capabilities. Gemini (Google) offers native multimodality and Google integration. The best choice depends on your specific use case, data requirements, and existing tools.

What is Cursor and how does it differ from GitHub Copilot?

Cursor is a VS Code fork built entirely around AI assistance - it understands your whole codebase and can make multi-file changes. Copilot integrates AI into existing IDEs, focusing on code completion and suggestions. Cursor is more powerful for complex refactoring; Copilot is easier to adopt.

What are the best no-code AI automation platforms?

n8n offers visual workflow automation with self-hosting options and extensive AI integrations. Make (formerly Integromat) provides powerful visual automation with great value at scale. Zapier is most accessible for simple, linear workflows with the broadest integrations.

What is LangChain and when should I use it?

LangChain is a framework for building LLM applications - it provides building blocks for chains of operations, agents with tools, retrieval-augmented generation (RAG), and conversation memory. Use it when you need to build custom AI applications beyond simple chat.

What AI tools should a business start with?

Start with a capable general model (Claude or GPT) for knowledge work. Add a coding assistant (Cursor or Copilot) for development. Consider automation platforms (n8n or Make) for workflow integration. Match tools to actual needs rather than collecting capabilities.

PANDIONSTUDIO

AI CAPABILITY • LANDSCAPE

AI Landscape Navigator

The landscape is exploding. The gap between movers and waiters is widening.

More capability ships quarterly than used to ship in years. New models, tools, and platforms appear weekly. Competitive advantage is materialising now – not “someday.”

This living reference helps you understand what's out there, what matters, and where we stand. Not exhaustive – navigational.

IN THIS SECTION

Models & Providers

Building & Creating

Working & Integrating

A Living Reference

This page is updated as the landscape evolves. It reflects our current understanding and experience, not comprehensive market research. We include tools we've used, evaluated, or tracked closely. Last updated: May 2026.

MAY 2026 LANDSCAPE SHIFT

The useful question is no longer: which model is best?

For solo operators and small teams, the practical landscape question is now: which combination of tools gives reliable work at a sensible cost, with enough privacy, portability, and interaction quality for the work you actually do?

Capacity shows up as limits

Compute scarcity appears as rate limits, latency, pricing changes, outages, and tool-routing decisions.

Interaction changes adoption

Voice, screen context, interruption, and live correction make AI easier to use in real working situations.

Portability beats loyalty

The best small-business setup is rarely one model forever. It is a simple, portable harness that can route work well.

JUNE 2026 — STRUCTURAL SHIFT

The subsidy era is ending. Scarcity is structural, not temporary.

The AI cost curve is reversing. For three years, frontier model access has been subsidised — providers pricing below cost to drive adoption. June 2026 marks a visible industry-wide turn: Google Ultra moved to compute-based usage limits, OpenAI shifted Codex pricing, Anthropic Fast Mode costs doubled for Opus 4.8. The underlying constraint is hardware. TSMC fab build timelines and leading-edge node constraints mean chip manufacturing capacity will remain a structural ceiling on inference well into the 2030s. This is not a temporary squeeze — it is the era operators are now designing for.

Hardware is the ceiling

TSMC leads at leading-edge nodes. New fab capacity takes years. No policy or investment decision changes the 3–5 year build timeline for meaningful new supply.

Pricing is aligning with cost

Token pricing is the rate. Tokens-to-completion is the actual invoice. Agentic loops — not one-shot prompts — amplify this difference immediately.

Harness design is a cost control

Model routing, context quality, and loop design are the primary levers for keeping operational costs manageable as subsidies disappear.

JUNE 2026 — STRUCTURAL SHIFT

Open weights are now a frontier-class, ownable floor.

For most of the field's history, the best models were closed and rented. That is changing. In June 2026, open-weight models reached frontier-class output on real work at a fraction of the cost (GLM 5.2 matched a top-tier closed model on coding tasks at roughly a tenth of the price), and they stayed useful rather than fading after launch. Set against the same month's demonstration that a closed model can be switched off by its vendor or a government, the picture is clear: the model layer is both swappable and revocable, and an open-weight floor is the part you can own.

Fungible

Open weights plus smart routing now reach frontier-class results. The model is no longer the differentiator or the lock-in.

Sovereign

Weights you can download and run cannot be repriced or revoked. “Frontier open weights mean sovereign AI” (Aaron Levie, Box).

Practical floor

For a small operator the play is routing across vendors, plus a small or mid-size local model for the workflows that must keep running.

Providers worth tracking on this axis: Zhipu (GLM), Alibaba (Qwen), DeepSeek, Moonshot (Kimi), Meta (Llama, open-weight under a custom licence), and NousResearch (Hermes, tuned for agentic use). Check the licence: Apache 2 or MIT are the commercial-friendly defaults.

AT A GLANCE

The whole landscape in one view

Twelve categories, grouped by where they sit in your day-to-day work. Click any card to open the detail below.

Models & Providers

1 category

Building & Creating

2 categories

Working & Integrating

9 categories

The AI models powering everything — who builds them, what they’re good at, and how they compare.

ANTHROPIC

Claude models

Anthropic is the clearest work-AI story of 2026. In late May the picture shifted: capability, harness quality, and compute supply now move together. The SpaceX compute partnership deepened, the Opus 4.8 release (28 May) sharpened the model's agentic judgment and added parallel-subagent workflows, and the Mythos-class model has now landed: Fable 5 for all customers (10 June), with the restricted Mythos 5 for approved cyberdefenders and researchers.

Examples: Claude Fable 5, Claude Opus 4.8, Claude Sonnet 4.6, Claude Haiku 4.5

Strengths:

+Frontier reasoning and agentic judgment (Opus 4.8): 4x less likely to let code flaws pass unremarked; 84% on browser-agent evaluation; multimodal over PDFs and diagrams
+Practitioner-grade tools: Claude Code, Claude Design, Routines, Skills, MCPs
+1M token context window; extended thinking; Task Budgets (beta)
+SpaceX partnership expands inference capacity through 2029
+Dynamic Workflows (research preview, Max/Team/Enterprise): coordinate a fleet of parallel subagents for large tasks in a single session
+Fable 5 (10 June): new top-tier model for long-horizon autonomous work, sets the pace on coding benchmarks; restricted sibling Mythos 5 for approved cyberdefenders (formerly Project Glasswing, ~50 partners)

Considerations:

•April 1 2026 supply-chain incident (Claude Code auto-update shipped a hostile package for 3 hours)
•Premium pricing for frontier models ($5/$25 per million standard; Fast Mode $10/$50)
•Even after the SpaceX deal, token demand continues to outstrip available supply
•Opus 4.6 Fast Mode deprecated 29 June 2026: workflows using /fast will shift to Opus 4.8 at $10/$50 per million tokens (double the standard rate). Review before the cutover.
•Fable 5 (the new top tier) is priced at $10/$50 per million and draws plan usage roughly twice as fast as Opus 4.8; included in plan limits until 22 June 2026, then usage credits

Our view: Our primary platform. We run Claude Code as our business operations hub, orchestrating strategy, research, delivery, and knowledge management daily. The April product wave made Claude a practitioner-grade work platform; the May SpaceX deal turned compute capacity into product strategy; Opus 4.8 (28 May) sharpened the model's agentic judgment and added parallel-subagent workflows. Late May confirmed the year's direction: $47B revenue run rate, $965B Series H valuation (now more valuable than OpenAI), and the Mythos-class model has now landed as Fable 5 (10 June), the new top tier, with Mythos 5 restricted to approved cyberdefenders. June 2026 — RSI confirmed in practice: Anthropic's "When AI Builds Itself" paper shows 80% of Claude's production code is now written by Claude, with engineers achieving 8× output multipliers. Human review — not generation — is already the primary bottleneck. This is the first confirmed instance of AI-assisted recursive capability development at production scale.

OPENAI

GPT models & ChatGPT

OpenAI pioneered the current AI era and continues its strategic pivot to work AI. GPT-5.5 (April 2026) is the current flagship and powers Codex. The comparison against Opus 4.8 is genuinely task-dependent: Opus 4.8 leads on SWE-bench Pro (+10.6 pts: 69.2% vs 58.6%) and long-context retrieval; GPT-5.5 leads Terminal-Bench 2.1 under Codex CLI (83.4% vs 74.6%), its native harness. May saw four Codex CLI releases that made Goal Mode the default, added locked computer use (Mac), and brought Codex to iOS and Android. OpenAI is also formalising enterprise deployment through Guaranteed Capacity, a one-to-three-year compute commit deal.

Examples: GPT-5.5, GPT Realtime 2, Codex, GPT Image 2, ChatGPT Plus

Strengths:

+Largest consumer ecosystem and integrations
+GPT-5.5: leads Terminal-Bench 2.1 under Codex CLI (83.4%); strongest choice for Codex-native and terminal-centric workflows
+GPT Image 2: legible text in images (signage, slide labels, packaging)
+Realtime 2, Realtime Translate, Realtime Whisper; Goal Mode default in Codex; mobile Codex (iOS/Android)

Considerations:

•Anthropic now ahead on ARR ($30B vs OpenAI $25B); positioning is reversed
•GPT-5.5 API output is $30 per million tokens vs Opus 4.8's $25 — not cheaper at scale. Long-context surcharge applies above 272K input tokens.
•Sora shutdown signalled compute scarcity trade-offs
•Microsoft dependency remains a concentration risk

Our view: The ecosystem leader on consumer reach and a serious work-AI competitor. The Opus 4.8 vs GPT-5.5 picture has split by task: GPT-5.5 is the sharper tool for Codex CLI-native and terminal-centric workflows; Opus 4.8 leads on broader agentic and long-context work and is cheaper on output tokens. Realtime and Codex push OpenAI towards persistent, interactive work systems rather than chat alone.

GOOGLE

Gemini models + a sprawling AI product line

Google brings deep AI research heritage, the broadest distribution surface in technology, and (after Google I/O 2026) the most sprawling AI product line of any provider. The Gemini app and surrounding ecosystem scaled significantly through 2025-2026; monthly tokens processed across Google surfaces grew substantially in the same window. The scale advantage is real. The product clarity is not: a small-business operator now has to choose between Gemini, Gemini Advanced (AI Pro), Gemini Business (Workspace), AI Ultra, Spark, Anti Gravity 2.0, AI Studio, Jules, Flow, Veo, Omni, Nano Banana Pro, Google Pics, NotebookLM, and AI Mode in search — many overlapping, several launched without release dates, all evolving fast.

Examples: Gemini 3.5 Flash, Anti Gravity 2.0, Spark, Omni, Nano Banana Pro, AI Studio, Jules, Flow, Veo, Google Pics, NotebookLM, AI Mode (search)

Strengths:

+Largest distribution surface in consumer AI (significant scale across Gemini app and Google surfaces)
+TPU compute moat — now externalised as a business line, not just internal capacity
+Native multimodality and very long context windows
+Omni (announced May 2026): editing-first multimodal model — a "Nano Banana for video"
+Anti Gravity 2.0: agent-first standalone desktop app with multi-agent teams and scheduled tasks (parity with Claude Code / Codex, not yet leadership)
+NotebookLM remains a category-defining product for research, study, and synthesis

Considerations:

•Product sprawl is the dominant problem — the I/O 2026 lineup is genuinely hard to navigate, even for AI-fluent users
•Gemini 3.5 Flash benchmarks well on Terminal Bench 2.0 (76.2%) and is state-of-the-art on OS World, but pricing has shifted upward — speed is no longer paired with low cost
•Spark and several other I/O launches announced without release dates
•Google Ultra plan (May 2026) now uses compute-based usage limits; agentic tools (Anti Gravity, Flow) on usage-limit model — the subsidy era is ending here too
•Strategic uncertainty: an internal split between Hassabis (world-models / robotics / continual learning) and a coding-agent-led RSI direction means priorities may shift again

Our view: Google may win consumer AI by sheer distribution: it already touches consumers everywhere, and Gemini scale numbers are remarkable. For solo and small-business operators, however, the product sprawl is the unmet need. Choosing what to use for what is now harder than using it. This is the single clearest argument all year for an AI-navigator role — a guide who can map the landscape rather than build everything inside it.

THINKING MACHINES

Interaction models

Thinking Machines Lab introduced a distinct model category in May 2026: interaction models trained from scratch for continuous, time-aware exchange rather than turn-based chat. The architecture pairs a foreground interaction model with a background model doing longer reasoning, browsing, and agentic work. The important signal is not raw benchmark performance; it is the shift from "prompt in, answer out" to AI that can notice, interrupt, translate, correct, and keep working while the human keeps talking.

Examples: TML Interaction Small, real-time video + speech, background model pairing

Strengths:

+Real-time audio and visual proactivity
+200ms micro-turns rather than conventional turn-based chat
+Foreground interaction plus background reasoning architecture
+Strong fit for meetings, training, education, coaching, and live collaboration

Considerations:

•Early-stage lab, not yet a general platform choice
•Frontier labs may copy the abstraction quickly
•Commercial deployment path still unclear

Our view: A category signal more than a vendor recommendation today. Interaction is becoming capability, not interface polish. This belongs in the landscape because it changes what practitioners can expect from future harnesses.

XAI

Grok models

xAI has shifted from pure model challenger to infrastructure signal. Grok remains integrated with X, but the May 2026 Anthropic / SpaceX partnership reframed the story: xAI / SpaceX has enormous compute capacity, while Anthropic has stronger model and harness demand. Elon has also indicated xAI will be dissolved as a separate company into SpaceX AI. Treat xAI less as a dependable frontier model platform and more as a window into AI compute infrastructure.

Examples: Grok 4, Colossus 1, Colossus 2, SpaceX AI

Strengths:

+Real-time information from X
+Colossus 1 and Colossus 2 make SpaceX a meaningful compute actor
+Potential path towards orbital and vertically integrated AI compute
+Less guardrails on topics

Considerations:

•Limited enterprise features
•Tied to X ecosystem
•Significant organisational fragility (9/11 co-founders departed)
•Grok has not kept pace with the strongest model + harness combinations

Our view: Do not depend on Grok as a core work platform. Do monitor SpaceX AI as infrastructure: compute supply is now a strategic lever in the AI race, and Elon may be more consequential as a compute operator than as a model builder.

MISTRAL

European AI models

Open weight

European-founded AI company offering competitive models with strong performance-to-cost ratios. Open-weight models available for self-hosting, with API access for convenience. Le Chat Pro is one of the privacy-first tools commonly used on the "private side" of a Public/Private wall for solo regulated practices.

Examples: Mistral Large 3, Mistral Medium 3, Mistral Small 3.1, Le Chat Pro

Strengths:

+European data sovereignty option
+Strong price/performance
+Open-weight models available
+Le Chat Pro: privacy-first option for client-confidential work

Considerations:

•Smaller ecosystem than US providers
•Enterprise features still developing

Our view: Good option for European data sovereignty requirements. Le Chat Pro features prominently on the private side of the Public/Private wall pattern (see /ai/foundation).

APPLE

On-device AI + Private Cloud Compute

Apple announced its CEO succession in April 2026: hardware VP John Ternus replaces Tim Cook (rather than software-side or COO Jeff Williams). The signal is structural: Apple is betting on on-device silicon plus Private Cloud Compute, not the frontier-lab race. Apple Foundation Models run inside the device for most tasks; harder workloads route to Apple's own private cloud with verifiable guarantees that data stays out of training. For solo practitioners handling protected client data, this is one of the most consequential strategic signals of 2026.

Examples: Apple Foundation Models, Apple Intelligence, Private Cloud Compute

Strengths:

+On-device by default for most tasks (privacy by architecture)
+Private Cloud Compute with verifiable hardware-rooted guarantees
+Tight integration across Apple ecosystem (iOS, macOS, iCloud)
+Hardware-first strategic positioning under Ternus

Considerations:

•Apple ecosystem only
•Less raw frontier capability than dedicated lab models
•Still relatively new vs. Anthropic / OpenAI / Google

Our view: Watch closely. The on-device AI path becomes more compelling every quarter, particularly for solo regulated practitioners and personal-life users where privacy and offline capability matter. Apple Intelligence is a natural complement to Lumo, Mistral Le Chat, and Maple on the private side of a Public/Private wall.

DEEPSEEK

Chinese frontier AI at fraction of the cost

Open weight

DeepSeek shook the AI industry by producing frontier-competitive models at a fraction of US lab costs. DeepSeek V4 shipped on 27 April 2026 in Pro and Flash variants, priced at less than one-seventh the cost of Opus 4.6 for roughly one-generation-behind capability. R1 (reasoning) matches earlier o1 performance; V3 rivals GPT-4o. The arithmetic is now unambiguous: for routine tasks where "good enough" is genuinely good enough, DeepSeek changes the cost calculus.

Examples: DeepSeek V4 (Pro and Flash), V3, R1

Strengths:

+V4 ships at <1/7th the cost of Opus 4.6 for one-generation-behind capability
+Open-weight models available (R1, V3)
+Efficiency breakthroughs in training methodology
+Strong coding and mathematical reasoning; natively multimodal

Considerations:

•Chinese company — data sovereignty concerns for some organisations
•API reliability and availability can vary
•Censorship on certain topics (Chinese regulatory compliance)
•Rapidly evolving — model versions shift fast

Our view: The biggest disruption in AI economics since GPT-3. DeepSeek proved that frontier capability doesn't require frontier budgets. Essential for multi-model strategy — particularly for cost-sensitive workloads where R1 or V3 can match more expensive alternatives.

HUGGING FACE

The open-model hub

Open weight

Hugging Face is the registry and community hub for open AI: hundreds of thousands of open-weight models (Llama, Qwen, Mistral, GLM, DeepSeek), datasets, and runnable demos (Spaces). It is where you find, compare (the Open LLM leaderboards), download (quantised GGUF builds for local inference), or host (Inference Endpoints) open models. The default first stop for anyone building a self-hosted or open-weight stack.

Examples: Open-weight downloads (GGUF), datasets, Spaces, Inference Endpoints, Open LLM leaderboards

Strengths:

+The central registry for open-weight models and datasets
+Quantised GGUF builds sized for local hardware (pairs with Ollama / llama.cpp)
+Open LLM leaderboards to compare models on capability and cost
+Spaces (runnable demos) and managed Inference Endpoints for hosting
+Libraries (transformers, datasets) that are the de facto open-source standard

Considerations:

•A hub, not a model maker, so capability depends on the model you choose
•Self-hosting still needs capable hardware (GPU / VRAM)
•Model licences vary (Apache / MIT / custom), check per model

Our view: The source layer of the self-hosted / resilience-floor stack: where Pandion would pull an open-weight model to run locally (via Ollama) as the rug-pull hedge if the frontier became unavailable. Not a daily driver, but where the open-model story starts.

How We Navigate This

With so many options, how do you choose? Here's our approach.

Start with the Problem

Don't start with “what AI should we use?” Start with “what problem are we solving?” The tool follows from the task, not the other way around.

Favour Simplicity

The simplest tool that solves the problem is usually the right choice. Complexity has ongoing costs. Start simple; add sophistication when you hit limits.

Build for Portability

The landscape changes fast. Avoid deep lock-in where you can. Use standards (MCP, OpenAI-compatible APIs) that let you switch if better options emerge.

Test with Real Work

Demos impress; production reveals. Before committing, test tools on your actual tasks. What works in a demo may struggle with your specific context.

What's Not Here

Comprehensive Coverage

This isn't a complete market survey. We focus on tools we've used or seriously evaluated. Many good options aren't listed because we haven't worked with them.

Full Enterprise Stack

We cover M365 Copilot and Graph, but not the full enterprise AI stack (Copilot Studio, Power Platform AI, Salesforce Einstein, ServiceNow, etc.). These require enterprise-specific context.

Infrastructure Deep Dives

We now track AI compute infrastructure because it explains limits, pricing, and reliability. We do not attempt a full survey of GPU providers, cloud infrastructure, power markets, or on-premise deployment. Those choices need infrastructure-specific advice.

Pricing Details

Pricing changes frequently. We mention pricing considerations but don't list specific prices. Check provider websites for current rates.

From Landscape to Practice

Understanding the landscape is step one. Making it work for your organisation is where we help.

Right-Sized Stack

What combination of these tools makes sense for your organisation type?

Adoption Journey

Where are you on the spectrum from locked-out to power user?

Context Engineering

The right information at the right time. How to design systems that give AI what it needs.

Learn more →

Agents & Orchestration

One agent, infinite expertise. Skills-based AI systems that compound value.

Learn more →

AI Skills & Fluency

The bottleneck isn't tools – it's people and culture. Building genuine capability.

Learn more →

Why Timing Matters

The landscape is not just moving faster. Capacity, pricing, interaction, and deployment support now change what small teams can actually do with AI.

Clock Speed Reality

Features ship faster than conferences can announce them. The useful habit is not memorising every launch, but spotting which changes alter real work: better context transfer, cheaper execution, safer privacy, or more reliable delegation.

Leaders Pulling Ahead

The gap between organisations that get AI and those still experimenting is widening. Not because technology is inaccessible, but because execution speed is separating leaders from laggards. The pattern starts with documentation, research, and workflow support before it reaches core professional judgement.

Model Commoditisation

The models themselves are increasingly commoditised. Your advantage is less about choosing one winner and more about building a portable way of working: saved context, reusable instructions, clear routing, and enough fluency to move between tools when cost, limits, or quality shift.

Labs Eating the App Layer

AI labs are moving down-stack into code review, security scanning, meetings, design, and deployment support. For small teams, the question is practical: build on tools that are useful now, but keep your context, documents, and working method portable enough that a bundled feature does not strand you.

Need Help Navigating?

The landscape is overwhelming. We've been navigating it daily. Let's talk about what makes sense for your situation.

Explore AI Services ← Back to AI Capability

AI Landscape Navigator

IN THIS SECTION

A Living Reference

The useful question is no longer: which model is best?

Capacity shows up as limits

Interaction changes adoption

Portability beats loyalty

The subsidy era is ending. Scarcity is structural, not temporary.

Hardware is the ceiling

Pricing is aligning with cost

Harness design is a cost control

Open weights are now a frontier-class, ownable floor.

Fungible

Sovereign

Practical floor

The whole landscape in one view

Model Providers

Development Environments

No-Code/Low-Code Platforms

Productivity & Workspace AI

Automation Platforms

Agent Platforms

Agentic business platforms

Build partners & integrators

Integration Standards

Model Routers & Aggregators

Creative AI Tools

Specialised Tools

Model Providers

ANTHROPIC

OPENAI

GOOGLE

THINKING MACHINES

XAI

META

MISTRAL

APPLE

DEEPSEEK

HUGGING FACE

How We Navigate This

Start with the Problem

Favour Simplicity

Build for Portability

Test with Real Work

What's Not Here

Comprehensive Coverage

Full Enterprise Stack

Infrastructure Deep Dives

Pricing Details

From Landscape to Practice

Right-Sized Stack

Adoption Journey

Context Engineering

Agents & Orchestration

AI Skills & Fluency

Why Timing Matters

Clock Speed Reality

Leaders Pulling Ahead

Model Commoditisation

Labs Eating the App Layer

Need Help Navigating?