LLM - Rost Glukhov | Personal site and technical blog

Kanban in Hermes Agent for Self-Hosted LLM Workflows

Hermes Agent ships with a Kanban-style task board that can easily saturate your self-hosted LLM gateway if you let every task run at once.

Hermes Agent Skill Authoring — SKILL.md Structure and Best Practices

Hermes Agent treats skills as the default way to teach repeatable workflows. Official documentation describes them as on-demand knowledge documents aligned with the open agentskills.io shape, loaded through progressive disclosure so the model sees a small index first and only pulls full instructions when a task actually needs them.

Hermes Agent CLI cheat sheet — commands, flags, and slash shortcuts

Hermes Agent from Nous Research is a model-agnostic, tool-using assistant you run locally or on a VPS.

NemoClaw practical guide for secure OpenClaw operations in 2026

Most AI agent stacks still treat security as a post-demo fix. NemoClaw starts from the opposite assumption and makes isolation, policy, and routing day-zero defaults.

Agent Memory Providers Compared — Honcho, Mem0, Hindsight, and Five More

Modern assistants still forget everything when you close the tab unless something persists beyond the context window. Agent memory providers are services or libraries that hold facts and summaries across sessions — often wired in as plugins so the framework stays thin while memory scales.

AI Systems Memory — Persistent Knowledge and Agent Memory

This section collects guides on persistent knowledge and memory for AI systems — how assistants keep facts, preferences, and distilled context across sessions without stuffing every token into one prompt. Here, memory means intentional retention (user facts, summaries, plugin-backed stores), not GPU RAM or model weights.

Hermes Agent Memory System: How Persistent AI Memory Actually Works

You know the drill. You open a chat with an AI agent, explain your project, share your preferences, get some work done, and close the tab. Come back the following week and it’s like talking to a stranger — all context gone, every preference forgotten, the project re-explained from scratch.

OpenClaw Rise and Fall — Timeline and Real Reasons Behind the Collapse

OpenClaw did not fail as a product. It lost its fuel.

Llama-Server Router Mode - Dynamic Model Switching Without Restarts

For a long time, llama.cpp had a glaring limitation: you could only serve one model per process, and switching meant a restart.

Claude Skills and SKILL.md for Developers: VS Code, JetBrains, Cursor

Most teams misuse Claude Skills in one of two ways. They either turn SKILL.md into a dumping ground, or they never graduate from giant copy-pasted prompts.

Hermes AI Assistant Skills for Real Production Setups

Hermes AI assistant, officially documented as Hermes Agent, is not positioned as a simple chat wrapper.

OpenClaw Skills Ecosystem and Practical Production Picks

OpenClaw has two extension stories, and they are easy to mix up.

Plugins extend the runtime. Skills extend the agent’s behavior.

OpenClaw Plugins — Ecosystem Guide and Practical Picks

This article is about OpenClaw plugins — native gateway packages that add channels, model providers, tools, speech, memory, media, web search, and other runtime surfaces.

OpenClaw Production Setup Patterns with Plugins and Skills

OpenClaw looks simple in demos. In production, it becomes a system.

Claude, OpenClaw, and the End of Flat Pricing for Agents

The quiet loophole that powered a wave of agent experimentation is now closed.

Vane (Perplexica 2.0) Quickstart With Ollama and llama.cpp

Vane is one of the more pragmatic entries in the “AI search with citations” space: a self-hosted answering engine that mixes live web retrieval with local or cloud LLMs, while keeping the whole stack under your control.