AI that operates through the desktop as a live working environment — not just a chat surface. It can see context, use local tools, act across apps, and work near the user’s files, workflows, and state.
Desktop Intelligence is not a product category. It is a deployment posture — the point where an AI model gains access to the same surface a human developer uses: the file system, the shell, the application layer, the clipboard, the screen.
What distinguishes it from cloud AI or embedded AI is proximity to state. The model can read your project, see your errors, invoke your tools, and propose changes where you work — not in a separate window you copy-paste from.
The desktop is becoming the new action plane for AI.
This is not a prediction. Claude Code, GitHub Copilot, Cursor, and Windsurf already operate this way. Apple’s WWDC 2025 foundation-model APIs confirm the platform vendor agrees. The question is no longer whether AI will act on the desktop — it is who governs it when it does.
AI near device, sensor, and local inference. CoreML, MLX, on-device models. Constrained memory, thermal limits, intermittent connectivity.
AI acting through the user’s desktop workspace. File access, shell commands, application control, screen awareness. Full environmental context.
Intelligence constrained by permissions, review, and policy. Deterministic governance over probabilistic reasoning. The constitutional layer.
Two architecturally different approaches to desktop AI. One drives the screen like a human. The other integrates through declared APIs. Both are live. They have different trade-off profiles, and the platform vendor’s preference is becoming clear.
Screen-pixel interaction. The model sees rendered UI and clicks, types, scrolls. Works on any application without integration.
API-level integration. The model calls declared tool functions, reads structured data, operates through typed interfaces.
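The difference between the two approaches shows up in what a governance layer can inspect before anything runs. The sketch below is illustrative only: `PixelAction`, `ToolCall`, `DECLARED_TOOLS`, and `validate` are hypothetical names, not part of any real agent framework. A pixel action carries coordinates and keystrokes with no declared intent, while a declared tool call can be checked against its schema.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PixelAction:
    """A screen-driving step: coordinates and keystrokes, no declared intent."""
    kind: str  # "click", "type", or "scroll"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass(frozen=True)
class ToolCall:
    """An API-level step: a named tool plus typed arguments."""
    name: str
    args: dict[str, Any]


# Hypothetical declared tool surface: tool name -> {argument name: expected type}.
DECLARED_TOOLS: dict[str, dict[str, type]] = {
    "read_file": {"path": str},
    "rename_file": {"source": str, "destination": str},
}


def validate(call: ToolCall) -> list[str]:
    """Return a list of problems; an empty list means the call matches its declaration."""
    schema = DECLARED_TOOLS.get(call.name)
    if schema is None:
        return [f"undeclared tool: {call.name}"]
    problems = []
    for arg, expected in schema.items():
        if arg not in call.args:
            problems.append(f"missing argument: {arg}")
        elif not isinstance(call.args[arg], expected):
            problems.append(f"{arg} should be {expected.__name__}")
    for arg in call.args:
        if arg not in schema:
            problems.append(f"unexpected argument: {arg}")
    return problems
```

Nothing comparable can be written for `PixelAction`: a click at (412, 230) means whatever happens to be rendered there, so any check has to interpret the screen rather than read a declaration.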
Three evidence levels. Confirmed means Apple has shipped or publicly announced it. Direction means reliable reporting from known sources. Rumored means plausible but unverified. The distinction matters because each level warrants a different degree of confidence.
On-device and Private Cloud Compute models shipping in macOS Sequoia. Writing tools, image generation, notification summaries, Siri improvements.
Apple’s Foundation Models framework gives developers direct access to the on-device language model. Structured generation, tool calling, guided decoding: the platform-native inference layer.
Google’s Gemini added as a cloud model option alongside ChatGPT in Apple Intelligence. Third-party model integration becoming a platform feature, not a workaround.
Siri evolving from voice assistant to system-level agent capable of multi-step actions across apps. App Intents as the declared action surface. Persistent context across interactions.
Spotlight as a unified command surface for AI actions. Natural language triggering App Intents directly. The command palette becomes the agent dispatch surface.
SwiftUI extensions for spatial reasoning and scene understanding. Vision Pro as a spatial intelligence surface. The desktop expanding beyond the flat screen into volumetric space.
Apple’s developer materials increasingly describe the Mac not as a consumer device but as an AI development and deployment surface. Foundation Models, App Intents, CoreML, MLX, Metal — every layer is being instrumented for intelligence.
Not because Apple ships the best models. Because Apple controls the entire stack — silicon, OS, runtime, distribution — and is building declared integration surfaces instead of screen-driving hacks. The governance surface exists by design.
Anthropic’s own safety documentation for computer use warns of prompt injection, unintended actions, and autonomous scope creep. That warning is not a bug report. It is the case for the governance thesis.
Live feed of every governance decision. Permit, deny, escalate — with full evaluation trace and the policy that applied.
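One way to picture a single entry in such a feed is a deterministic, first-match rule evaluator that records which rules it checked. This is a minimal sketch under stated assumptions: the rule names, the `/workspace/` prefix, and the default-to-escalate behaviour are all hypothetical choices, not a description of how Watch Station or ClawLaw actually work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    PERMIT = "permit"
    DENY = "deny"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class Action:
    tool: str
    target: str


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Action], bool]
    verdict: Verdict


@dataclass(frozen=True)
class Decision:
    action: Action
    verdict: Verdict
    policy: str              # name of the rule that decided
    trace: tuple[str, ...]   # every rule checked, in order


# Hypothetical policy. The prefix test is a simplification; a real check
# would normalise paths before comparing them.
RULES = [
    Rule("deny-outside-workspace",
         lambda a: not a.target.startswith("/workspace/"), Verdict.DENY),
    Rule("escalate-shell", lambda a: a.tool == "shell", Verdict.ESCALATE),
    Rule("permit-read", lambda a: a.tool == "read_file", Verdict.PERMIT),
]


def evaluate(action: Action, rules: list[Rule] = RULES) -> Decision:
    """First matching rule wins; unmatched actions escalate to the principal."""
    trace = []
    for rule in rules:
        hit = rule.matches(action)
        trace.append(f"{rule.name}: {'match' if hit else 'no match'}")
        if hit:
            return Decision(action, rule.verdict, rule.name, tuple(trace))
    trace.append("default: no rule matched")
    return Decision(action, Verdict.ESCALATE, "default-escalate", tuple(trace))
```

The same action and the same rules always produce the same `Decision`, which is what makes the evaluation deterministic even though the agent proposing the action is not.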
System-wide state at a glance. Active agents, session counts, resource posture, threat level — the principal’s situational awareness.
Immutable record of every action, evaluation, and state transition. The audit trail that makes governance reconstructable.
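A common way to approximate an immutable record in software is a hash chain, where each entry commits to the one before it. The sketch below uses that technique as an assumption; the source does not say how the ledger is implemented, and the `Ledger` class and its field names are hypothetical. Strictly speaking this makes the log tamper-evident rather than immutable: edits are still possible, but they are detectable.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class Ledger:
    """Append-only log in which every entry commits to its predecessor."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    def append(self, payload: dict) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        entry = {
            "prev": prev_hash,
            "payload": dict(payload),  # copy so later caller edits do not leak in
            "hash": _digest(prev_hash, payload),
        }
        self._entries.append(entry)
        return entry["hash"]

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True
```

Replaying the chain from the first entry is what makes a past decision reconstructable: if `verify()` passes, the sequence of recorded actions and verdicts is the sequence that was written.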
Actions that require human review. The system surfaces uncertainty rather than guessing. The principal decides, not the agent.
Session-level view of cumulative agent actions. Detects composition drift — individually valid actions that collectively constitute scope creep.
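Composition drift can be illustrated with a session-level counter that sits alongside per-action checks. In this sketch, which is an assumption rather than the actual detection method, every write is individually acceptable, and the session is flagged once the writes spread across more directories than the task was expected to touch. `CompositionMap` and `max_directories` are hypothetical names.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class CompositionMap:
    """Tracks how far a session's permitted actions have spread."""
    max_directories: int  # hypothetical per-session budget
    directories: set[str] = field(default_factory=set)

    def record(self, path: str) -> bool:
        """Record one permitted write; return True once the session has drifted."""
        self.directories.add(str(PurePosixPath(path).parent))
        return len(self.directories) > self.max_directories


session = CompositionMap(max_directories=2)
results = [
    session.record("/workspace/docs/intro.md"),
    session.record("/workspace/docs/usage.md"),
    session.record("/workspace/src/main.py"),
    session.record("/workspace/ci/deploy.yml"),  # third distinct directory
]
```

No single write here would fail a per-action rule. Only the fourth call reports drift, because the signal lives in the accumulated session state rather than in any one action.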
Token budgets, memory pressure, thermal state, API rate limits. Governance includes resource awareness — not just policy.
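Resource awareness can be folded into the same permit, deny, escalate vocabulary. The following is a minimal sketch with made-up thresholds: the 0.8 memory-pressure cutoff and the 90 percent token warning are assumptions chosen for illustration. The thermal state names follow the four cases of Apple's `ProcessInfo.ThermalState` (nominal, fair, serious, critical); everything else, including `ResourcePosture` and `resource_verdict`, is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourcePosture:
    tokens_used: int
    token_budget: int
    memory_pressure: float  # 0.0 (idle) to 1.0 (exhausted); hypothetical scale
    thermal_state: str      # "nominal", "fair", "serious", or "critical"
    requests_left: int      # calls remaining in the current rate-limit window


def resource_verdict(p: ResourcePosture) -> str:
    """Map the current resource posture to a governance verdict."""
    # Hard stops: the budget is spent or the machine is at its thermal limit.
    if p.tokens_used >= p.token_budget or p.requests_left <= 0:
        return "deny"
    if p.thermal_state == "critical":
        return "deny"
    # Soft limits: still possible, but the principal should decide.
    if p.thermal_state == "serious" or p.memory_pressure >= 0.8:
        return "escalate"
    if p.tokens_used >= 0.9 * p.token_budget:
        return "escalate"
    return "permit"
```

Treating resource limits as verdicts means they can flow through the same stream and audit trail as policy decisions instead of surfacing as unexplained failures.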
Published work, in-progress pieces, and planned essays. Desktop Intelligence is the applied layer — where the governance thesis meets real tools, real workflows, and real platform decisions.
The foundational essay. Why individually approved agent actions can collectively constitute scope creep, and why AI agents require a constitutional governance layer separate from the acting system.
A detailed analysis of Claude Code’s architecture as the first mainstream Desktop Intelligence deployment. Tool use, file access, session state, and the governance gap.
Why screen-driving agents are harder to govern than API-integrated agents, and what that means for audit trails, replay, and deterministic evaluation.
Foundation Models, App Intents, CoreML, MLX, Metal. How Apple is building a vertically integrated AI desktop — and where governance must be inserted.
How Watch Station makes ClawLaw governance visible. Verdict streams, composition maps, escalation queues — the principal’s operating picture.