Oh My Opencode Specialised Agents Deep Dive and Model Guide

Meet Sisyphus and its specialist agent crew.


The biggest capability jump in OpenCode comes from specialised agents: deliberate separation of orchestration, planning, execution, and research.

Oh My Opencode packages that idea into a batteries-included harness where Sisyphus coordinates a full “virtual team” of agents with different permissions, prompts, and model preferences.

oh my opencode agents

This is the deep dive into agents and model routing. If you are earlier in the journey:

For the wider AI coding toolchain context, see the AI developer tools overview.

What Is Oh My Opencode and How Does It Extend OpenCode

OpenCode is an open-source AI coding agent built for the terminal. It ships with a TUI, and the CLI starts that TUI by default when you run opencode with no arguments. It is provider-flexible: it supports a large provider catalog including local models, exposes provider configuration through its config file and /connect flow, and handles everything from cloud APIs to Ollama endpoints without patching.

Oh My Opencode (also known as oh-my-openagent, or just “omo”) is a community plugin that transforms OpenCode into a full multi-agent engineering system. It adds:

  • the Sisyphus orchestration system with parallel background execution
  • 11 specialised agents with distinct roles, prompts tuned per model family, and explicit tool permissions
  • LSP + AST-Grep for IDE-quality refactoring inside agents
  • Hashline — a hash-anchored edit tool that eliminates stale-line errors (see below)
  • Built-in MCPs: Exa (web search), Context7 (official docs), Grep.app (GitHub search), all on by default
  • /init-deep — auto-generates hierarchical AGENTS.md files throughout your project for lean context injection

One naming quirk: the upstream repository is now branded as oh-my-openagent, but the plugin package and install commands still use oh-my-opencode. The maintainer suggests calling it “oh-mo” or just “Sisyphus.”

Why Oh My Opencode Assigns Different Models to Different Agents

Oh My Opencode is built around one foundational idea: different models think differently, and each agent’s prompt is written for one mental model. Claude follows mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules mean more compliance. GPT (especially 5.2+) follows principle-driven prompts — concise principles, XML structure, explicit decision criteria. Give GPT a 1,100-line Claude prompt and it contradicts itself. Give Claude a 121-line GPT prompt and it drifts.

This is not a quirk you configure around. It is the system’s design.

The practical consequence: when you change an agent’s model, you change which prompt fires. Agents that support multiple model families (Prometheus, Atlas) auto-detect your model at runtime via isGptModel() and switch prompts automatically. Agents that don’t (Sisyphus, Hephaestus) have prompts written for one family only — and swapping them to the wrong family degrades the output significantly.
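
To make the family switch concrete, here is a minimal sketch of what runtime prompt selection could look like. The function name isGptModel() comes from the plugin; its body here, and the pickPrompt() helper, are illustrative assumptions rather than the plugin's actual code.

```typescript
// Illustrative sketch only: the real isGptModel() in oh-my-opencode
// may use different matching rules.
function isGptModel(modelId: string): boolean {
  const id = modelId.toLowerCase();
  // GPT-family ids include "gpt" or "codex" (e.g. openai/gpt-5.3-codex)
  return id.includes("gpt") || id.includes("codex");
}

// Hypothetical helper: a dual-prompt agent (Prometheus, Atlas) would pick
// the principle-driven GPT prompt or the mechanics-driven Claude prompt.
function pickPrompt(modelId: string): "gpt-principles" | "claude-mechanics" {
  return isGptModel(modelId) ? "gpt-principles" : "claude-mechanics";
}
```

Under this sketch, "anthropic/claude-opus-4-6" selects the mechanics-driven variant, while any gpt-* or *-codex id selects the compact principle-driven one.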

How Oh My Opencode Specialised Agents Collaborate

The four agent personality groups

Agents fall into four groups based on which model family they are optimised for. This matters both for understanding the system and for making self-hosting decisions.

Group 1 — Communicators (Claude / Kimi / GLM): Sisyphus and Metis. Long, mechanics-driven prompts (~1,100 lines for Sisyphus). Need models that reliably follow complex multi-layered instructions across dozens of tool calls. Claude Opus is the reference. Kimi K2.5 and GLM-5 are strong, cost-effective alternatives that behave similarly. Do not override these to older GPT models.

Group 2 — Dual-Prompt (Claude preferred, GPT supported): Prometheus and Atlas. Auto-detect your model family at runtime and switch to the appropriate prompt. Claude gets the full mechanics-driven version. GPT gets a compact, principle-driven version that achieves the same outcome in ~121 lines. Safe to use either; the system handles the switching.

Group 3 — GPT-Native (GPT-5.3-codex / GPT-5.4): Hephaestus, Oracle, Momus. Principle-driven, autonomous execution style. Their prompts assume goal-oriented, independent reasoning — which is what GPT is built for. Hephaestus has no fallback and requires GPT access. Do not override these to Claude; the behaviour degrades.

Group 4 — Utility Runners (speed over intelligence): Explore, Librarian, Multimodal Looker. They handle grep, search, and retrieval, and intentionally use the fastest, cheapest models available. “Upgrading” Explore to Opus is like hiring a senior engineer to file paperwork. These agents are also the best candidates for local model replacement.

Delegation mechanisms

Oh My Opencode uses two complementary tools for delegation:

  • task() for category-based delegation: choose a category like visual-engineering or deep, optionally inject skills, and optionally run in the background
  • call_omo_agent() for direct invocation of a specific agent by name, bypassing category routing

Both support parallel background execution, with concurrency enforced per provider and per model.
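
As a sketch of what per-provider concurrency enforcement means in practice, the limiter below caps simultaneous background tasks per provider. This is an illustrative model of the behaviour described above, not the plugin's actual scheduler; the class and method names are hypothetical.

```typescript
// Hypothetical concurrency gate: caps in-flight background tasks per
// provider, with a default cap for providers not explicitly configured.
class ProviderLimiter {
  private active = new Map<string, number>();

  constructor(
    private limits: Record<string, number>,
    private defaultLimit = 2,
  ) {}

  // Returns true and takes a slot if the provider is under its cap.
  tryAcquire(provider: string): boolean {
    const limit = this.limits[provider] ?? this.defaultLimit;
    const current = this.active.get(provider) ?? 0;
    if (current >= limit) return false;
    this.active.set(provider, current + 1);
    return true;
  }

  // Frees a slot when a background task finishes.
  release(provider: string): void {
    const current = this.active.get(provider) ?? 0;
    this.active.set(provider, Math.max(0, current - 1));
  }
}
```

The same idea extends to per-model limits by keying the map on "provider/model" instead of the provider name alone.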

Categories are model routing presets

When Sisyphus delegates to a subagent, it picks a category, not a model name. The category maps to the right model automatically.

Category | What it is for | Default model
visual-engineering | Frontend, UI/UX, CSS, design | Gemini 3.1 Pro (high)
artistry | Creative, novel approaches | Gemini 3.1 Pro → Claude Opus → GPT-5.4
ultrabrain | Hard logic, architecture decisions | GPT-5.4 (xhigh) → Gemini 3.1 Pro → Claude Opus
deep | Deep coding, complex multi-file logic | GPT-5.3 Codex → Claude Opus → Gemini 3.1 Pro
unspecified-high | General complex work | Claude Opus → GPT-5.4 (high) → GLM-5
unspecified-low | General standard work | Claude Sonnet → GPT-5.3 Codex → Gemini Flash
quick | Single-file changes, simple tasks | Claude Haiku → Gemini Flash → GPT-5-Nano
writing | Text, documentation, prose | Gemini Flash → Claude Sonnet

Categories are the right abstraction for self-hosting too: map a category to a local model and every task routed to that category automatically uses it.

Model resolution order

Agent Request → User Override (if configured) → Fallback Chain → System Default

Provider priority when the same model is available through multiple providers:

Native (anthropic/, openai/, google/) > Kimi for Coding > GitHub Copilot > Venice > OpenCode Go > Z.ai Coding Plan
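
A minimal sketch of that resolution order, assuming availability is known up front (the function and parameter names are illustrative, not the plugin's API):

```typescript
// Hypothetical resolver mirroring the documented order:
// user override → fallback chain → system default.
function resolveModel(opts: {
  userOverride?: string; // from oh-my-opencode.jsonc, if configured
  fallbackChain: string[]; // the agent's built-in chain
  available: Set<string>; // models reachable through your providers
  systemDefault: string; // last resort
}): string {
  if (opts.userOverride && opts.available.has(opts.userOverride)) {
    return opts.userOverride;
  }
  for (const model of opts.fallbackChain) {
    if (opts.available.has(model)) return model;
  }
  return opts.systemDefault;
}
```

Provider priority would apply one level below this, when the same model id is reachable through several providers.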

Oh My Opencode Agents: Full Catalogue with Roles and Model Requirements

Orchestrators

Sisyphus

Purpose: Main orchestrator. Plans, delegates, and drives tasks to completion through aggressive parallel execution.
Group: Communicator (Claude / Kimi / GLM)
Role: The team lead who coordinates across the whole codebase — its ~1,100-line mechanics-driven prompt needs a model that can follow every step across dozens of tool calls without losing track.

⚠️ Never override Sisyphus to older GPT models. GPT-5.4 has a dedicated prompt path but is not the recommended default. Claude Opus is the reference.

Fallback chain: anthropic/claude-opus-4-6 (max) → opencode-go/kimi-k2.5 → k2p5 → gpt-5.4 → glm-5 → big-pickle
Self-hosted: Sisyphus is the hardest agent to run locally. Its prompt complexity makes it dependent on models with strong instruction-following over long tool-call sequences. A local Qwen3-coder or DeepSeek-Coder-V3 may work for simple tasks, but expect degradation on workflows that require multi-agent coordination. If you self-host, test with a single-agent task before enabling parallel execution.


Atlas

Purpose: “Todo-list orchestrator.” Keeps a structured plan moving by enforcing completion and sequencing.
Group: Dual-prompt (Claude preferred, GPT supported)
Role: While Sisyphus handles the big picture, Atlas drives the checklist. Auto-detects your model family at runtime and switches prompts accordingly.

Fallback chain: anthropic/claude-sonnet-4-6 → opencode-go/kimi-k2.5
Self-hosted: A fast, reliable local coder model handles Atlas-style “drive the checklist” work reasonably well because the tasks are more structured than Sisyphus’s orchestration. Qwen3-coder at 32k+ context is a workable starting point.


Planning agents

The planning layer enforces “think before act”: requirements gathering, gap detection, and plan critique all happen before any execution agent sees the task.

Prometheus

Purpose: Strategic planner with an interview-style workflow. Activates when you press Tab or run /start-work.
Group: Dual-prompt (Claude preferred, GPT supported)
Role: Interviews you like a real engineer — identifies scope, surfaces ambiguities, and produces a verified plan before a single line of code is touched. The GPT version achieves the same in ~121 lines; the Claude version uses ~1,100 lines across 7 files.
Collaborates with: Metis (gap detection) and Momus (plan validation) before handing off to execution.

Fallback chain: anthropic/claude-opus-4-6 (max) → openai/gpt-5.4 (high) → opencode-go/glm-5 → google/gemini-3.1-pro
Self-hosted: Workable with a strong instruction-following local model at low temperature. Planning quality degrades when the model cannot hold your constraints and acceptance criteria in-context across a long multi-turn interview. Minimum 64k context window recommended.


Metis

Purpose: Pre-planning consultant and gap analyser. Runs at a higher temperature than most agents to encourage creative gap detection.
Group: Communicator (Claude preferred)
Role: “What did we miss?” reviewer before execution — not a code-writing worker, but part of the plan quality control story.
Collaborates with: Invoked by Prometheus before the plan is finalised.

Fallback chain: anthropic/claude-opus-4-6 (max) → opencode-go/glm-5 → k2p5
Self-hosted: A local reasoning-capable model is fine. Keep temperature non-zero if you want Metis to actually surface edge cases — set it to 0 and it becomes a rubber-stamp.


Momus

Purpose: Ruthless plan reviewer. Enforces clarity and verification standards. Can operate as a strict “OK or reject” gate.
Group: GPT-native
Role: QA-minded critic for plans. Tool restrictions keep it in review mode rather than execution mode.
Collaborates with: Used after plan creation to challenge feasibility before work begins.

Fallback chain: openai/gpt-5.4 (medium) → anthropic/claude-opus-4-6 (max) → google/gemini-3.1-pro (high)
Self-hosted: If you self-host, keep sampling very low. The entire point of Momus is stable, reproducible critique — creativity is the last thing you want here. A strong local reasoning model at temperature 0.1 or lower is the right configuration.


Worker agents

Hephaestus

Purpose: Autonomous deep worker. Give it a goal, not a recipe.
Group: GPT-native — GPT-5.3 Codex only
Role: The specialist who stays in their room coding all day. Explores the codebase, researches patterns, and executes end-to-end without constant supervision. The maintainer calls it “the Legitimate Craftsman” (a deliberate reference to Anthropic’s decision to block OpenCode).

⚠️ No fallback chain — requires GPT access. There is no Claude prompt for this agent. Running it without OpenAI or GitHub Copilot means it cannot execute. “GPT-5.3-codex-spark” exists but is explicitly not recommended — it compacts context so aggressively that Oh My Opencode’s context management breaks.

Fallback chain: openai/gpt-5.3-codex (medium) — no fallback
Self-hosted: There is no viable local replacement for Hephaestus today. Its prompt is built around GPT-Codex’s principle-driven, autonomous exploration style. If you need a deep worker on a fully local stack, use Sisyphus-Junior with the deep category instead (which routes to GPT-5.3 Codex, or falls back to Claude Opus if that is what you have).


Sisyphus-Junior

Purpose: Category-spawned executor used by the delegation system.
Group: Inherits from whichever category launched it
Role: The “specialist contractor” that inherits its model from category config. Created dynamically via task(), often with skills injected, and can be run in the background for parallelism. Think of it as a blank slate worker whose capability is determined entirely by which category you assign.

Fallback chain: anthropic/claude-sonnet-4-6 (default); inherits from the launching category in practice
Self-hosted: Sisyphus-Junior is the most practical place to start self-hosting. Map each category to a local model in oh-my-opencode.jsonc and every category-spawned task automatically uses it. Start with quick (simple tasks), verify it works, then expand to unspecified-low before touching anything that routes to deep or ultrabrain.


Specialist subagents

Oracle

Purpose: Read-only consultation for architecture decisions and complex debugging.
Group: GPT-native
Role: Senior architect and “last resort” debugger. Intentionally restricted from writing and delegating tools so its output stays advisory. Call Oracle after major work, after repeated failures, or before making a high-stakes architectural decision.

Fallback chain: openai/gpt-5.4 (high) → google/gemini-3.1-pro (high) → anthropic/claude-opus-4-6 (max)
Self-hosted: If you self-host Oracle, pick your strongest local reasoning model and keep sampling very low. The output quality difference between a capable local reasoner and GPT-5.4 is significant for complex architecture questions. In a hybrid setup, Oracle is one of the agents worth keeping on a cloud model while moving utility work local.


Librarian

Purpose: External docs and open-source research.
Group: Utility runner
Role: Documentation and evidence collector. Tool restrictions prevent editing, so it stays focused on sourcing and summarising. Designed to run in parallel with Explore for combined “inside the repo + outside the repo” evidence gathering.

Fallback chain: opencode-go/minimax-m2.5 → minimax-m2.5-free → claude-haiku-4-5 → gpt-5-nano
Self-hosted: The best agent to move fully local on day one. Librarian’s job is retrieval and summarisation, not deep reasoning. Any local model with reliable tool calling handles it well. Even a 7B or 13B model is sufficient if it can follow the “search, collect, report” pattern without drifting.


Explore

Purpose: Contextual grep and fast codebase search.
Group: Utility runner
Role: The “find me the relevant files and patterns” agent. Fire 10 of these in parallel for non-trivial questions, each scoped to a different area of the codebase, then let the orchestrator synthesise the results.

Fallback chain: grok-code-fast-1 → opencode-go/minimax-m2.5 → minimax-m2.5-free → claude-haiku-4-5 → gpt-5-nano
Self-hosted: Along with Librarian, Explore is the best starting point for local inference. Its job is pattern matching and structured reporting — the model does not need deep reasoning, just fast, reliable tool calling and decent instruction following. A small local coder model (Qwen2.5-Coder-7B or similar) at high throughput works well.


Multimodal Looker

Purpose: Vision analyst and “diagram reader.” Analyses images and PDFs via a look_at workflow.
Group: Utility runner (vision required)
Role: Heavily tool-restricted (read-only) to prevent side effects and keep it purely interpretive. Used when you need to feed UI screenshots, architecture diagrams, or PDF pages into the workflow.

Kimi K2.5 is specifically called out as excelling at multimodal understanding — that is why it sits high in this fallback chain.

Fallback chain: openai/gpt-5.4 → opencode-go/kimi-k2.5 → zai-coding-plan/glm-4.6v → gpt-5-nano
Self-hosted: Local vision requires a multimodal model with solid tool calling and enough context. If your local stack is not there yet, keep Multimodal Looker on a cloud model — a misconfigured local vision pipeline produces silent garbage, not useful errors.


Oh My Opencode Model Routing: Fallback Chains and Provider Priority

Per-agent defaults and the “no single global model” design

Oh My Opencode ships with per-agent model defaults and fallback chains, not a single global model. The design is deliberately opinionated:

  • Explore and Librarian use the cheapest, fastest models because they do not need deep reasoning
  • Oracle and Momus use the highest-capability models because their outputs gate execution
  • Sisyphus and Prometheus get the best orchestration-class models by default

The OpenCode Go tier ($10/month)

OpenCode Go is a subscription tier that provides reliable access to Chinese frontier models through OpenCode’s infrastructure. It appears in the middle of many fallback chains as a bridge between premium native providers and free-tier alternatives.

Model via OpenCode Go | Used by
opencode-go/kimi-k2.5 | Sisyphus, Atlas, Sisyphus-Junior, Multimodal Looker
opencode-go/glm-5 | Oracle, Prometheus, Metis, Momus
opencode-go/minimax-m2.5 | Librarian, Explore

If you do not have Anthropic or OpenAI subscriptions, OpenCode Go plus GitHub Copilot covers most of the fallback chain at low cost.

Provider mappings for GitHub Copilot

When GitHub Copilot is the best available provider, agent assignments are:

Agent | Model
Sisyphus | github-copilot/claude-opus-4-6
Oracle | github-copilot/gpt-5.4
Explore | github-copilot/grok-code-fast-1
Librarian | github-copilot/gemini-3-flash

Prompt variants track model families

If you switch an agent from Claude to GPT or Gemini, Oh My Opencode does not use the same prompt. Agents that support multiple families (Prometheus, Atlas) auto-detect via isGptModel() and switch. Agents that do not support multiple families (Sisyphus, Hephaestus) have one prompt — switch them to the wrong family and the output degrades.

If your agent outputs feel off after a model change, check whether you crossed a model family boundary and revert.


Running Oh My Opencode with Self-Hosted and Local Models

There are two layers to configure:

  1. OpenCode must know about your local provider and model IDs
  2. Oh My Opencode must be told which agent uses which model (because most agents ignore your UI-selected model by design)

What you can realistically run locally today

Agent | Local viability | Recommended approach
Explore | ✅ Excellent | Any fast local coder model (Qwen2.5-Coder-7B+)
Librarian | ✅ Excellent | Any fast local model with reliable tool calling
Sisyphus-Junior (quick category) | ✅ Good | Small coder model for quick tasks
Atlas | ⚠️ Workable | Mid-size model (13B+), 32k+ context
Prometheus | ⚠️ Workable | Strong instruction-follower, 64k+ context, low temperature
Metis | ⚠️ Workable | Reasoning-capable, keep temperature non-zero
Momus | ⚠️ Workable | Reasoning-capable, very low temperature
Sisyphus | ⚠️ Partial | Only for simple single-agent tasks; multi-agent orchestration needs Claude-class models
Oracle | ❌ Not recommended | Keep on cloud; quality gap is significant for complex queries
Hephaestus | ❌ No local path | Requires GPT-5.3-codex; no Claude or local equivalent

Step 1 — Add a local provider to OpenCode

OpenCode supports local models and custom baseURL values in provider config — Ollama, vLLM, and any OpenAI-compatible endpoint are first-class options. The OpenCode quickstart covers provider authentication in detail.

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b": { "name": "qwen2.5-coder:7b" },
        "qwen2.5-coder:32b": { "name": "qwen2.5-coder:32b" }
      }
    }
  }
}

For vLLM or LM Studio, the same pattern applies — just point baseURL to your server’s /v1 endpoint and list the models you have loaded.

OpenCode requires at least a 64k context window for orchestration agents. Anything smaller and you will see truncation errors mid-workflow.

Step 2 — Override agent models in Oh My Opencode config

Config locations (project takes precedence over user-level):

  • .opencode/oh-my-opencode.jsonc (project-level, highest priority)
  • ~/.config/opencode/oh-my-opencode.jsonc (user-level)

A practical hybrid config — local inference for utility agents, cloud for reasoning:

{
  "$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",

  "agents": {
    // Utility agents: fast local model is more than enough
    "explore":    { "model": "ollama/qwen2.5-coder:7b",  "temperature": 0.1 },
    "librarian":  { "model": "ollama/qwen2.5-coder:7b",  "temperature": 0.1 },

    // Sisyphus-Junior in quick mode: local is fine
    // (controlled via categories below)

    // Keep the reasoning agents on cloud
    "oracle":  { "model": "openai/gpt-5.4",          "variant": "high" },
    "momus":   { "model": "openai/gpt-5.4",          "variant": "xhigh" },
    // Hephaestus: do not touch — it needs GPT-5.3-codex, no fallback
  },

  "categories": {
    // Route simple spawned tasks to local model
    "quick":   { "model": "ollama/qwen2.5-coder:7b" },
    "writing": { "model": "ollama/qwen2.5-coder:7b" },

    // Keep heavy reasoning on cloud
    "deep":         { "model": "openai/gpt-5.3-codex", "variant": "medium" },
    "ultrabrain":   { "model": "openai/gpt-5.4",       "variant": "xhigh" }
  },

  "background_task": {
    "defaultConcurrency": 2,
    "providerConcurrency": {
      "ollama": 4,    // local endpoint can handle more parallelism
      "openai": 2,    // stay inside plan limits
      "anthropic": 2
    },
    "modelConcurrency": {
      "ollama/qwen2.5-coder:7b": 4
    }
  }
}

The cost-conscious alternative to full self-hosting

Before committing to a local GPU setup, consider the OpenCode Go + Kimi for Coding stack. At around $11/month total, it covers:

  • Kimi K2.5 for Sisyphus and Atlas (Claude-class orchestration quality at low cost)
  • GLM-5 for Prometheus, Metis, and Momus (solid reasoning, free tier available)
  • MiniMax M2.5 for Librarian and Explore (fast retrieval)

For most workloads this is cheaper than running a local inference server and does not require GPU hardware.


Oh My Opencode Built-in Tools: Hashline, Init-Deep, Ralph Loop, and MCPs

Hashline — hash-anchored edit tool

One of the most practical improvements in Oh My Opencode is how it handles code edits. Every line the agent reads comes back tagged with a content hash:

1#VK| function hello() {
2#XJ|   return "world";
3#MB| }

When the agent edits by referencing those tags, if the file changed since the last read the hash will not match and the edit is rejected before corruption. This eliminates the entire class of “stale line” errors where agents confidently edit lines that no longer exist. Grok Code Fast’s success rate on edit tasks went from 6.7% to 68.3% just from this change.
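
The mechanism can be sketched in a few lines. The hash function and tag format below are toy stand-ins (Hashline's real format and hashing are not specified here); only the reject-on-mismatch behaviour is the point.

```typescript
// Toy 2-character line hash; Hashline's real hash and tag format differ.
function lineTag(lineNo: number, text: string): string {
  const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
  let h = 0;
  for (const ch of text) h = (h * 31 + ch.charCodeAt(0)) % 1296;
  return `${lineNo}#${alphabet[Math.floor(h / 36)]}${alphabet[h % 36]}`;
}

// Applies an edit only if the referenced line still hashes to the same tag.
function applyEdit(lines: string[], ref: string, replacement: string): string[] {
  const lineNo = Number(ref.split("#")[0]);
  const idx = lineNo - 1;
  if (idx < 0 || idx >= lines.length || lineTag(lineNo, lines[idx]) !== ref) {
    throw new Error(`stale anchor ${ref}: file changed since last read`);
  }
  const out = lines.slice();
  out[idx] = replacement;
  return out;
}
```

If the line is rewritten between the agent's read and its edit, the stored tag no longer matches, so applyEdit throws instead of corrupting the file.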

/init-deep — hierarchical context injection

Run /init-deep and Oh My Opencode generates AGENTS.md files at every relevant level of your project tree:

project/
├── AGENTS.md              ← project-wide context
├── src/
│   ├── AGENTS.md          ← src-specific context
│   └── components/
│       └── AGENTS.md      ← component-specific context

Agents auto-read relevant context at their scope. Instead of loading the entire repo into context at the start of every run, each agent only pulls in what is relevant to where it is working.

Prometheus planning mode — /start-work

For complex tasks, do not just type a prompt and hope. Press Tab to enter Prometheus mode or use /start-work. Prometheus interviews you like a real engineer: identifies scope, surfaces ambiguities, builds a verified plan before any execution agent runs. The “Decision Complete” standard means the plan leaves zero decisions to the implementer.

Ralph Loop — /ulw-loop

A self-referential execution loop that does not stop until the task is 100% complete. Use this for large, multi-step tasks where you want the system to self-verify and continue without your involvement. It is aggressive — make sure your concurrency limits are set before running it on an expensive cloud provider.

Built-in MCPs

Three MCP servers are pre-configured and always on:

  • Exa — web search
  • Context7 — official documentation lookup
  • Grep.app — GitHub code search across public repositories

You do not need to configure these. They are available to all agents by default.


For hands-on results and community benchmarks on how these agents perform in practice, see the Oh My Opencode experience article. To install the plugin from scratch, start with the Oh My Opencode quickstart.