Desktop Intelligence — AI that operates through the desktop

01 · The Taxonomy

When AI stops answering and starts operating.

Desktop Intelligence is not a product category. It is a deployment posture — the point where an AI model gains access to the same surface a human developer uses: the file system, the shell, the application layer, the clipboard, the screen.

What distinguishes it from cloud AI or embedded AI is proximity to state. The model can read your project, see your errors, invoke your tools, and propose changes where you work — not in a separate window you copy-paste from.

The desktop is becoming the new action plane for AI.

This is not a prediction. Claude Code, GitHub Copilot, Cursor, and Windsurf already operate this way. Apple’s WWDC 2025 foundation-model APIs confirm the platform vendor agrees. The question is no longer whether AI will act on the desktop — it is who governs it when it does.

Edge Intelligence

Near the device

AI that runs near the device and its sensors, using local inference. CoreML, MLX, on-device models. Constrained memory, thermal limits, intermittent connectivity.

Desktop Intelligence

Through the workspace

AI acting through the user’s desktop workspace. File access, shell commands, application control, screen awareness. Full environmental context.

Governed Intelligence

Under authority

Intelligence constrained by permissions, review, and policy. Deterministic governance over probabilistic reasoning. The constitutional layer.

Chat surface AI

AI that answers

• Receives a prompt, returns text
• No access to local files
• No tool invocation
• No persistent session state
• User copies output manually
• Governance: terms of service only

Desktop Intelligence

AI that operates

• Reads project context, proposes diffs
• Full file-system access
• Shell, git, build tools, APIs
• Session state across interactions
• Applies changes in-place
• Governance: an open, urgent question

02 · Computer Use vs Native Integration

Screen-driving or structured APIs.

Two architecturally different approaches to desktop AI. One drives the screen like a human. The other integrates through declared APIs. Both are live. They have different trade-off profiles, and the platform vendor’s preference is becoming clear.

Computer Use

Flexible · Brittle

Screen-pixel interaction. The model sees rendered UI and clicks, types, scrolls. Works on any application without integration.

Coverage: Any app, any surface

Reliability: Layout-dependent

Speed: Screenshot latency

Governance hook: Difficult to intercept

Evidence: Screenshots only

Apple fit: Accessibility abuse risk

Native Integration

Deterministic · Structured

API-level integration. The model calls declared tool functions, reads structured data, operates through typed interfaces.

Coverage: Integrated apps only

Reliability: Deterministic

Speed: Native call speed

Governance hook: Natural intercept point

Evidence: Typed action log

Apple fit: Platform-aligned
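The "natural intercept point" and "typed action log" can be illustrated with a minimal sketch. It assumes a hypothetical ToolRegistry, ActionRecord, and no_env_files policy; none of these names come from Claude Code, Apple's frameworks, or ClawLaw. Every tool call passes through one method, which consults a policy and records a typed entry before anything runs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ActionRecord:
    """One entry in the typed action log (hypothetical shape)."""
    tool: str
    args: Dict[str, Any]
    allowed: bool


@dataclass
class ToolRegistry:
    # Hypothetical policy hook: returns True to permit a call.
    policy: Callable[[str, Dict[str, Any]], bool]
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    log: List[ActionRecord] = field(default_factory=list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def call(self, name: str, **args: Any) -> Any:
        # Single intercept point: unknown tools and policy denials stop here.
        allowed = name in self.tools and self.policy(name, args)
        # The decision is logged whether or not the call proceeds.
        self.log.append(ActionRecord(name, dict(args), allowed))
        if not allowed:
            raise PermissionError(f"{name} denied")
        return self.tools[name](**args)


def no_env_files(name: str, args: Dict[str, Any]) -> bool:
    """Assumed example policy: refuse any path ending in .env."""
    return not str(args.get("path", "")).endswith(".env")


registry = ToolRegistry(policy=no_env_files)
registry.register("read_file", lambda path: f"<contents of {path}>")
registry.call("read_file", path="README.md")
```

A screen-driving agent has no equivalent of call(): there is no declared function boundary at which to ask the policy question, which is the contrast the two profiles above describe.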

Anthropic’s position: Claude Code uses native tool integration (file read/write, shell exec, git). Claude’s computer-use mode drives the screen. Both ship. But the tool-use path produces structured evidence. Computer use produces screenshots.

Apple’s architectural preference: App Intents, Shortcuts, Foundation Models API. Apple builds structured integration surfaces, not screen-driving agents. The platform is designed for declared capabilities, not pixel scraping.

03 · Confirmed · Direction · Rumored

The Mac as the AI workbench.

Three evidence levels. Confirmed means Apple has shipped or publicly announced it. Direction means reliable reporting from known sources; those items are labeled Reported below. Rumored means plausible but unverified, including leaks. The distinction matters.

Apple Intelligence on Mac

Confirmed

On-device and Private Cloud Compute models shipping in macOS Sequoia. Writing tools, image generation, notification summaries, Siri improvements.

Shipping · macOS 15.x · M-series required

Foundation Models for Developers

Confirmed · WWDC25

Apple’s Foundation Models framework gives developers direct access to on-device language and image models. Structured generation, tool calling, guided decoding — the platform-native inference layer.

WWDC 2025 · macOS 26 / iOS 26 · Developer beta

Gemini Integration · macOS 16

Confirmed · March 2026

Google’s Gemini added as a cloud model option alongside ChatGPT in Apple Intelligence. Third-party model integration becoming a platform feature, not a workaround.

Announced March 2026 · macOS 16 / iOS 18.4+

Siri as System Agent

Reported · Not confirmed

Siri evolving from voice assistant to system-level agent capable of multi-step actions across apps. App Intents as the declared action surface. Persistent context across interactions.

Reported by Bloomberg · Expected WWDC 2026

Spotlight / App Menu Evolution

Rumored · WWDC 2026

Spotlight as a unified command surface for AI actions. Natural language triggering App Intents directly. The command palette becomes the agent dispatch surface.

Rumored · Unverified · Plausible given App Intents direction

SwiftUI for Spatial Intelligence

Leaked · WWDC 2026

SwiftUI extensions for spatial reasoning and scene understanding. Vision Pro as a spatial intelligence surface. The desktop expanding beyond the flat screen into volumetric space.

Leaked references in developer materials · Unconfirmed

Apple’s developer materials increasingly describe the Mac not as a consumer device but as an AI development and deployment surface. Foundation Models, App Intents, CoreML, MLX, Metal — every layer is being instrumented for intelligence.

Why the Mac is suddenly the most interesting AI desktop.

Not because Apple ships the best models. Because Apple controls the entire stack — silicon, OS, runtime, distribution — and is building declared integration surfaces instead of screen-driving hacks. The governance surface exists by design.

04 · ClawLaw · Watch Station

The more AI can do, the more it must be governed.

Anthropic’s own safety documentation for computer use warns of prompt injection, unintended actions, and autonomous scope creep. That warning is not a bug report. It is the argument for a governance layer.

Cascade without governance

Step 1: Agent reads project files

Step 2: Agent finds .env with credentials

Step 3: Agent includes credentials in an API call

Step 4: Credentials leave the machine

Step 5: Response carries a prompt injection

Step 6: Agent executes the injected command

Same cascade with ClawLaw governance

Step 1: Agent reads project files

Governed: .env on exclusion list

Governed: Credential pattern blocked

Governed: Outbound data reviewed

Governed: Injection detected, halted

Result: Session safe, evidence logged
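The first three governed steps can be sketched as two small checks. This is an illustration under stated assumptions, not ClawLaw's implementation: the exclusion patterns, the credential regex, and the function names may_read and review_outbound are all hypothetical, and the regex is deliberately naive. Injection detection is not sketched here.

```python
import re
from fnmatch import fnmatchcase

# Assumed exclusion list; a real deployment would load this from policy.
EXCLUDED_PATHS = [".env", "*/.env", "*.pem"]

# Naive, assumed credential pattern: a sensitive-looking key name
# followed by "=" or ":" and a value.
CREDENTIAL_PATTERN = re.compile(
    r"(?i)(api[_-]?key|secret|token|password)\w*\s*[=:]\s*\S+"
)


def may_read(path: str) -> bool:
    """Return False when the path matches the exclusion list."""
    return not any(fnmatchcase(path, pattern) for pattern in EXCLUDED_PATHS)


def review_outbound(payload: str) -> str:
    """Return 'deny' when outbound data looks like it carries a credential."""
    return "deny" if CREDENTIAL_PATTERN.search(payload) else "permit"


print(may_read("src/main.py"))
print(may_read("config/.env"))
print(review_outbound("OPENAI_API_KEY=abc123"))
```

The two checks are independent on purpose: if the exclusion list misses a file, the outbound review still has a chance to stop the credential before it leaves the machine.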

ClawLaw Watch Station · Six surfaces

Verdict Stream

Live feed of every governance decision. Permit, deny, escalate — with full evaluation trace and the policy that applied.

Operating Picture

System-wide state at a glance. Active agents, session counts, resource posture, threat level — the principal’s situational awareness.

Evidence Log

Immutable record of every action, evaluation, and state transition. The audit trail that makes governance reconstructable.
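One common way to make a record reconstructable is a hash chain, sketched below. This is an assumption about technique, not a description of how the Evidence Log is built: the EvidenceLog class and its field names are hypothetical, and a hash chain is tamper-evident rather than strictly immutable.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(prev: str, event: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with this event."""
    body = json.dumps({"prev": prev, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


@dataclass
class EvidenceLog:
    entries: List[Dict[str, Any]] = field(default_factory=list)

    def append(self, event: Dict[str, Any]) -> Dict[str, Any]:
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        entry = {"prev": prev, "event": event, "hash": _digest(prev, event)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True


evidence = EvidenceLog()
evidence.append({"action": "read_file", "path": "README.md", "verdict": "permit"})
evidence.append({"action": "read_file", "path": ".env", "verdict": "deny"})
print(evidence.verify())
```

Because each entry commits to the one before it, rewriting an old verdict invalidates every later hash, so an auditor can detect the change by re-running verify().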

Escalation Queue

Actions that require human review. The system surfaces uncertainty rather than guessing. The principal decides, not the agent.

Composition Map

Session-level view of cumulative agent actions. Detects composition drift — individually valid actions that collectively constitute scope creep.
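Composition drift can be illustrated with a session-level counter. The sketch assumes a hypothetical CompositionMap that treats the top-level directory of each path as an "area" and escalates once a session touches more areas than an assumed threshold. Each action would pass on its own; only the cumulative view triggers review.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class CompositionMap:
    max_areas: int = 2                      # assumed per-session threshold
    touched: Set[str] = field(default_factory=set)

    def record(self, path: str) -> str:
        """Record one action and return a session-level verdict."""
        area = path.split("/", 1)[0]        # top-level directory as the "area"
        self.touched.add(area)
        # Individually valid actions escalate once their union grows too wide.
        return "escalate" if len(self.touched) > self.max_areas else "permit"


session = CompositionMap(max_areas=2)
verdicts = [
    session.record(path)
    for path in ["src/a.py", "src/b.py", "tests/test_a.py", "deploy/prod.yaml"]
]
print(verdicts)
```

The fourth write is no different in kind from the first three. It is flagged because the session as a whole has widened, which is the case a per-action check cannot see.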

Resource Posture

Token budgets, memory pressure, thermal state, API rate limits. Governance includes resource awareness — not just policy.

(DesktopState, AgentAction) → NewState | Rejection

The reducer equation. Every agent action is evaluated. The system produces a new state or a rejection. Never both. Never neither.
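The reducer signature can be sketched as a pure function over immutable values. The fields on DesktopState and AgentAction, and the function name govern, are assumptions made for illustration; the source gives only the type shape. The function returns exactly one of a new state or a Rejection on every path.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Union


@dataclass(frozen=True)
class DesktopState:
    allowed_roots: FrozenSet[str]               # assumed field
    files_written: FrozenSet[str] = frozenset() # assumed field


@dataclass(frozen=True)
class AgentAction:
    kind: str   # "read" or "write" in this sketch
    path: str


@dataclass(frozen=True)
class Rejection:
    reason: str


def govern(state: DesktopState, action: AgentAction) -> Union[DesktopState, Rejection]:
    """(DesktopState, AgentAction) -> NewState | Rejection."""
    root = action.path.split("/", 1)[0]
    if root not in state.allowed_roots:
        return Rejection(f"{root} is outside the allowed roots")
    if action.kind == "write":
        # Frozen dataclass: build a new state instead of mutating the old one.
        return replace(state, files_written=state.files_written | {action.path})
    if action.kind == "read":
        return state
    return Rejection(f"unknown action kind: {action.kind}")


start = DesktopState(allowed_roots=frozenset({"src"}))
after_write = govern(start, AgentAction("write", "src/main.py"))
blocked = govern(start, AgentAction("write", "secrets/key.pem"))
print(type(after_write).__name__, type(blocked).__name__)
```

Keeping the state frozen means a rejected action cannot leave partial changes behind, and the same state and action always produce the same verdict, which is what makes replay and audit possible.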

05 · Content Roadmap

Building the Desktop Intelligence library.

Published work, in-progress pieces, and planned essays. Desktop Intelligence is the applied layer — where the governance thesis meets real tools, real workflows, and real platform decisions.

Live · Position paper

The Agency Paradox: Governed Autonomy as Infrastructure

Published · v2.2

The foundational essay. Why individually approved agent actions can collectively constitute scope creep, and why AI agents require a constitutional governance layer separate from the acting system.

Claude Code as Desktop Intelligence

In development

A detailed analysis of Claude Code’s architecture as the first mainstream Desktop Intelligence deployment. Tool use, file access, session state, and the governance gap.

Computer Use: The Governance Blind Spot

Planned

Why screen-driving agents are harder to govern than API-integrated agents, and what that means for audit trails, replay, and deterministic evaluation.

Apple’s Desktop Intelligence Stack

Planned

Foundation Models, App Intents, CoreML, MLX, Metal. How Apple is building a vertically integrated AI desktop — and where governance must be inserted.

Watch Station: The Governance UI

In development

How Watch Station makes ClawLaw governance visible. Verdict streams, composition maps, escalation queues — the principal’s operating picture.
