Desktop Intelligence.

AI that operates through the desktop as a live working environment — not just a chat surface. It can see context, use local tools, act across apps, and work near the user’s files, workflows, and state.

01 · The Taxonomy

When AI stops answering and starts operating.

Desktop Intelligence is not a product category. It is a deployment posture — the point where an AI model gains access to the same surface a human developer uses: the file system, the shell, the application layer, the clipboard, the screen.

What distinguishes it from cloud AI or embedded AI is proximity to state. The model can read your project, see your errors, invoke your tools, and propose changes where you work — not in a separate window you copy-paste from.

The desktop is becoming the new action plane for AI.

This is not a prediction. Claude Code, GitHub Copilot, Cursor, and Windsurf already operate this way. Apple’s WWDC 2025 foundation-model APIs confirm the platform vendor agrees. The question is no longer whether AI will act on the desktop — it is who governs it when it does.

Edge Intelligence
Near the device

AI near device, sensor, and local inference. CoreML, MLX, on-device models. Constrained memory, thermal limits, intermittent connectivity.

Desktop Intelligence
Through the workspace

AI acting through the user’s desktop workspace. File access, shell commands, application control, screen awareness. Full environmental context.

Governed Intelligence
Under authority

Intelligence constrained by permissions, review, and policy. Deterministic governance over probabilistic reasoning. The constitutional layer.

Chat surface AI
AI that answers
  • Receives a prompt, returns text
  • No access to local files
  • No tool invocation
  • No persistent session state
  • User copies output manually
  • Governance: terms of service only
Desktop Intelligence
AI that operates
  • Reads project context, proposes diffs
  • Full file-system access
  • Shell, git, build tools, APIs
  • Session state across interactions
  • Applies changes in-place
  • Governance: an open, urgent question
02 · Computer Use vs Native Integration

Screen-driving or structured APIs.

Two architecturally different approaches to desktop AI. One drives the screen like a human. The other integrates through declared APIs. Both are live. They have different trade-off profiles, and the platform vendor’s preference is becoming clear.

Computer Use
Flexible · Brittle

Screen-pixel interaction. The model sees rendered UI and clicks, types, scrolls. Works on any application without integration.

Coverage: Any app, any surface
Reliability: Layout-dependent
Speed: Screenshot latency
Governance hook: Difficult to intercept
Evidence: Screenshots only
Apple fit: Accessibility abuse risk
Native Integration
Deterministic · Structured

API-level integration. The model calls declared tool functions, reads structured data, operates through typed interfaces.

Coverage: Integrated apps only
Reliability: Deterministic
Speed: Native call speed
Governance hook: Natural intercept point
Evidence: Typed action log
Apple fit: Platform-aligned
Anthropic’s position: Claude Code uses native tool integration (file read/write, shell exec, git). Claude’s computer-use mode drives the screen. Both ship. But the tool-use path produces structured evidence. Computer use produces screenshots.
Apple’s architectural preference: App Intents, Shortcuts, Foundation Models API. Apple builds structured integration surfaces, not screen-driving agents. The platform is designed for declared capabilities, not pixel scraping.
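To make the "natural intercept point" and "typed action log" rows concrete, here is a minimal sketch of a host-side tool dispatcher. It is an illustration under stated assumptions, not Claude Code's or Apple's actual API: `ToolCall`, `ToolRegistry`, `no_dotenv`, and the stubbed `read_file` tool are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToolCall:
    """A declared, typed action the model asks the host to perform."""
    tool: str
    args: Dict[str, Any]


@dataclass
class ToolRegistry:
    """Hypothetical host-side registry: every call passes through invoke()."""
    tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    action_log: List[Dict[str, Any]] = field(default_factory=list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self.tools[name] = fn

    def invoke(self, call: ToolCall, policy: Callable[[ToolCall], bool]) -> Any:
        # The governance hook: the policy sees the structured call before it runs.
        permitted = call.tool in self.tools and policy(call)
        # The typed action log records the decision either way.
        self.action_log.append(
            {"tool": call.tool, "args": dict(call.args),
             "verdict": "permit" if permitted else "deny"}
        )
        if not permitted:
            raise PermissionError(f"denied: {call.tool}")
        return self.tools[call.tool](**call.args)


def no_dotenv(call: ToolCall) -> bool:
    """Example policy (an assumption): refuse any path ending in .env."""
    return not str(call.args.get("path", "")).endswith(".env")


registry = ToolRegistry()
# Stub tool for illustration; a real host would read from disk here.
registry.register("read_file", lambda path: f"<contents of {path}>")

print(registry.invoke(ToolCall("read_file", {"path": "README.md"}), no_dotenv))
```

Because the call arrives as structured data, the policy can inspect the tool name and arguments before anything executes. A screen-driving agent offers no equivalent: the host sees only pixels and input events.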
03 · Confirmed · Direction · Rumored

The Mac as the AI workbench.

Three evidence levels. Confirmed means Apple has shipped or publicly announced it. Direction means reliable reporting from known sources. Rumored means plausible but unverified. The distinction matters.

Apple Intelligence on Mac
Confirmed

On-device and Private Cloud Compute models shipping in macOS Sequoia. Writing tools, image generation, notification summaries, Siri improvements.

Shipping · macOS 15.x · M-series required
Foundation Models for Developers
Confirmed · WWDC25

Apple’s Foundation Models framework gives developers direct access to the on-device language model behind Apple Intelligence. Guided generation of structured output and tool calling make it the platform-native inference layer.

WWDC 2025 · macOS 26 / iOS 26 · Developer beta
Gemini Integration · macOS 16
Confirmed · March 2026

Google’s Gemini added as a cloud model option alongside ChatGPT in Apple Intelligence. Third-party model integration becoming a platform feature, not a workaround.

Announced March 2026 · macOS 16 / iOS 18.4+
Siri as System Agent
Reported · Not confirmed

Siri evolving from voice assistant to system-level agent capable of multi-step actions across apps. App Intents as the declared action surface. Persistent context across interactions.

Reported by Bloomberg · Expected WWDC 2026
Spotlight / App Menu Evolution
Rumored · WWDC 2026

Spotlight as a unified command surface for AI actions. Natural language triggering App Intents directly. The command palette becomes the agent dispatch surface.

Rumored · Unverified · Plausible given App Intents direction
SwiftUI for Spatial Intelligence
Leaked · WWDC 2026

SwiftUI extensions for spatial reasoning and scene understanding. Vision Pro as a spatial intelligence surface. The desktop expanding beyond the flat screen into volumetric space.

Leaked references in developer materials · Unconfirmed
Apple’s developer materials increasingly describe the Mac not as a consumer device but as an AI development and deployment surface. Foundation Models, App Intents, CoreML, MLX, Metal — every layer is being instrumented for intelligence.
Why the Mac is suddenly the most interesting AI desktop.

Not because Apple ships the best models. Because Apple controls the entire stack — silicon, OS, runtime, distribution — and is building declared integration surfaces instead of screen-driving hacks. The governance surface exists by design.

04 · ClawLaw · Watch Station

The more AI can do, the more it must be governed.

Anthropic’s own safety documentation for computer use warns of prompt injection, unintended actions, and autonomous scope creep. That warning is not a bug report. It is the case for the governance thesis.

Cascade without governance
Step 1
Agent reads project files
Step 2
Finds .env with credentials
Step 3
Includes creds in API call
Step 4
Credentials leave machine
Step 5
Prompt injection in response
Step 6
Agent executes injected cmd
Same cascade with ClawLaw governance
Step 1
Agent reads project files
Governed
.env on exclusion list
Governed
Credential pattern blocked
Governed
Outbound data reviewed
Governed
Injection detected, halted
Result
Session safe, evidence logged
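The first two governed steps in the cascade above (an exclusion list for reads, a credential pattern check on outbound data) can be sketched as two small predicates. This is a minimal illustration, not ClawLaw's implementation: the path globs, the regular expressions, and the names `may_read` and `may_send` are assumptions chosen for the example. Injection detection is deliberately left out, since it does not reduce to a simple pattern match.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical exclusion list: globs for files the agent must never read.
EXCLUDED_PATHS = [".env", "*.pem", "secrets/*"]

# Hypothetical credential patterns; real deployments would need a broader set.
CREDENTIAL_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(api[_-]?key|secret|token)\s*[=:]\s*\S+", re.IGNORECASE),
]


def may_read(path: str) -> bool:
    """Deny the read if the full path or the bare file name matches a glob."""
    name = path.rsplit("/", 1)[-1]
    return not any(
        fnmatchcase(path, pat) or fnmatchcase(name, pat)
        for pat in EXCLUDED_PATHS
    )


def may_send(payload: str) -> bool:
    """Deny outbound data that contains anything shaped like a credential."""
    return not any(p.search(payload) for p in CREDENTIAL_PATTERNS)


for path in ["src/main.py", "project/.env", "secrets/db.txt"]:
    print(path, "->", "permit" if may_read(path) else "deny")
```

Both checks are deterministic: the same path or payload always yields the same verdict, which is what allows a decision to be replayed from the evidence afterwards.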
ClawLaw Watch Station · Six surfaces
Verdict Stream

Live feed of every governance decision. Permit, deny, escalate — with full evaluation trace and the policy that applied.

Operating Picture

System-wide state at a glance. Active agents, session counts, resource posture, threat level — the principal’s situational awareness.

Evidence Log

Immutable record of every action, evaluation, and state transition. The audit trail that makes governance reconstructable.

Escalation Queue

Actions that require human review. The system surfaces uncertainty rather than guessing. The principal decides, not the agent.

Composition Map

Session-level view of cumulative agent actions. Detects composition drift — individually valid actions that collectively constitute scope creep.

Resource Posture

Token budgets, memory pressure, thermal state, API rate limits. Governance includes resource awareness — not just policy.
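One common way to approximate the "immutable record" described for the Evidence Log is a hash chain, where each entry commits to the one before it. The sketch below assumes that technique; the source does not say how Watch Station stores evidence, and `append_entry`, `verify`, and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the entry before the first one


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event."""
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash; an edit to any earlier entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


evidence = []
append_entry(evidence, {"action": "read_file", "path": "README.md", "verdict": "permit"})
append_entry(evidence, {"action": "read_file", "path": ".env", "verdict": "deny"})
print(verify(evidence))
```

A chain like this makes tampering detectable rather than impossible. Anyone who can rewrite the whole log can also recompute the hashes, so the latest hash would still need to be anchored somewhere the agent cannot write.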

(DesktopState, AgentAction) → NewState | Rejection
The reducer equation. Every agent action is evaluated. The system produces a new state or a rejection. Never both. Never neither.
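The reducer signature can be sketched as a pure function that returns exactly one of two types. This is a minimal illustration under assumptions: the fields on `DesktopState`, the two action kinds, and the write budget rule are invented for the example and are not taken from ClawLaw.

```python
import posixpath
from dataclasses import dataclass, replace
from typing import Union


@dataclass(frozen=True)
class DesktopState:
    workspace: str                         # root the agent may operate in
    files_written: frozenset = frozenset()
    write_budget: int = 3                  # hypothetical per-session limit


@dataclass(frozen=True)
class AgentAction:
    kind: str                              # "read" or "write" in this sketch
    path: str


@dataclass(frozen=True)
class Rejection:
    reason: str


def govern(state: DesktopState, action: AgentAction) -> Union[DesktopState, Rejection]:
    """(DesktopState, AgentAction) -> NewState | Rejection, with no side effects."""
    # Normalise first so "../" segments cannot escape the workspace.
    path = posixpath.normpath(action.path)
    if not path.startswith(state.workspace.rstrip("/") + "/"):
        return Rejection("path outside workspace")
    if action.kind == "read":
        return state
    if action.kind == "write":
        is_new = path not in state.files_written
        if is_new and len(state.files_written) >= state.write_budget:
            return Rejection("write budget exhausted")
        # Frozen dataclass: build a new state instead of mutating the old one.
        return replace(state, files_written=state.files_written | {path})
    return Rejection(f"unknown action kind: {action.kind}")


s0 = DesktopState(workspace="/proj")
s1 = govern(s0, AgentAction("write", "/proj/a.py"))
blocked = govern(s0, AgentAction("write", "/proj/../etc/passwd"))
print(type(s1).__name__, type(blocked).__name__)
```

Every branch returns, and the return type is a union, so each evaluation yields a new state or a rejection and never both. Keeping the function pure also means a rejected action leaves the prior state untouched.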
05 · Content Roadmap

Building the Desktop Intelligence library.

Published work, in-progress pieces, and planned essays. Desktop Intelligence is the applied layer — where the governance thesis meets real tools, real workflows, and real platform decisions.

Live · Position paper
The Agency Paradox: Governed Autonomy as Infrastructure
Published · v2.2

The foundational essay. Why individually approved agent actions can collectively constitute scope creep, and why AI agents require a constitutional governance layer separate from the acting system.

Claude Code as Desktop Intelligence
In development

A detailed analysis of Claude Code’s architecture as the first mainstream Desktop Intelligence deployment. Tool use, file access, session state, and the governance gap.

Computer Use: The Governance Blind Spot
Planned

Why screen-driving agents are harder to govern than API-integrated agents, and what that means for audit trails, replay, and deterministic evaluation.

Apple’s Desktop Intelligence Stack
Planned

Foundation Models, App Intents, CoreML, MLX, Metal. How Apple is building a vertically integrated AI desktop — and where governance must be inserted.

Watch Station: The Governance UI
In development

How Watch Station makes ClawLaw governance visible. Verdict streams, composition maps, escalation queues — the principal’s operating picture.