June 12, 2026
The software industry is in the midst of a profound transformation. Large language models (LLMs) have moved from generating simple code snippets to orchestrating entire development workflows. Yet, as any seasoned engineer knows, raw code generation is not the same as building production‑grade, high‑quality software. True quality demands design thinking, planning, structured execution, testing, review, and continuous refinement. This article explores how to combine two powerful open‑source components—a Hermes Agent powered by the Nous Research Hermes model family, and OpenCode, a terminal‑native AI coding agent—to create a development pipeline that produces robust, maintainable, and secure software.
By the end of this guide, you will understand the roles each system plays, how to integrate them into a cohesive workflow, and how to leverage their synergy to enforce quality gates at every stage of the software lifecycle. We’ll walk through concrete examples, configuration details, and design patterns you can adopt today.
Before diving into the tools, let’s define what “high‑quality software” means in a modern context. It is not merely code that compiles or runs without immediate errors. High‑quality software exhibits:
AI‑driven coding assistants often generate plausible‑looking code that fails one or more of these criteria. They may introduce subtle bugs, ignore security best practices, or produce spaghetti code that is impossible to maintain. The solution is not to abandon AI assistance but to couple it with deliberate engineering processes that enforce quality. This is where the combination of a planning‑oriented Hermes Agent and an execution‑oriented OpenCode shines: the former thinks about what to build and why, while the latter systematically writes, tests, and refines the code.
The term “Hermes Agent” refers to an autonomous agent built on the Hermes series of LLMs developed by Nous Research. Hermes models (especially Hermes‑3 and its function‑calling variants) are fine‑tuned for structured reasoning, tool use, and long‑horizon planning. They excel at:
A Hermes Agent wraps such a model in a cognitive architecture. Typically, it uses a framework like LangChain, CrewAI, or a custom loop that provides the model with a set of tools (functions it can call) and a memory system. The agent reasons about the goal, selects appropriate tools, interprets their results, and iterates until the objective is met. In the context of software development, a Hermes Agent can act as the technical lead or architect: it understands requirements, designs system components, defines interfaces, creates test strategies, and reviews deliverables.
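The reason-act-observe cycle described above can be sketched as a small loop. This is a minimal illustration, not the API of any particular framework: the tool names, the reply format (`tool`, `args`, `content`), and the `scripted_model` stand-in for a real Hermes call are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical tool registry: names and behaviors are illustrative only.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file": lambda path: f"<contents of {path}>",
    "run_tests": lambda: "3 passed, 0 failed",
}


def run_agent(model: Callable[[list[dict]], dict], goal: str, max_steps: int = 5) -> str:
    """Ask the model, run the tool it selects, feed the result back, repeat."""
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        reply = model(messages)
        if reply.get("tool") is None:
            return reply["content"]  # the model produced a final answer
        result = TOOLS[reply["tool"]](**reply.get("args", {}))
        # Memory: the tool call and its result are appended to the history.
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        messages.append({"role": "tool", "content": result})
    return "stopped: step budget exhausted"


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for a Hermes call: runs the tests once, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "run_tests", "args": {}}
    return {"tool": None, "content": f"Done: {messages[-1]['content']}"}


print(run_agent(scripted_model, "Verify the test suite"))
```

The step budget matters in practice: without it, a model that keeps requesting tools would loop forever.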
Crucially, a Hermes Agent does not need to write code directly. Instead, it can delegate code‑writing and execution tasks to specialized sub‑agents—in our case, OpenCode. This division of labor mirrors how high‑performing human teams work.
OpenCode (available at github.com/opencode-ai/opencode) is an open‑source, terminal‑based AI coding agent that turns an LLM into a powerful pair programmer. Unlike chat‑oriented tools, OpenCode is designed to work directly with your file system and shell. It can:
OpenCode’s core loop is simple: it receives a user prompt, optionally gathers context from the repository, asks the model for edits/commands, applies them, and collects feedback (e.g., test results, linter output). This tight feedback loop makes it ideal for implementing features, fixing bugs, and refactoring code with an eye on automated quality checks. When combined with a planning agent, OpenCode becomes the “hands” that execute the plan while continuously verifying correctness.
Individually, both tools are impressive. Together, they form a complete, quality‑focused software factory. Here is why the combination is so powerful:
In essence, you are building a miniature AI‑powered development team: a tech lead (Hermes) and a skilled individual contributor (OpenCode) who works tirelessly, never cuts corners, and always runs the test suite.
A typical integration looks like this:
User / Product Requirement Input
A feature request, bug report, or user story is provided to the Hermes Agent.
Planning Phase (Hermes Agent)
The agent analyzes the requirement, consults the existing codebase (possibly via OpenCode’s context‑gathering capabilities), and produces a structured plan. The plan includes:
Delegation to OpenCode
For each task in the plan, Hermes Agent constructs a detailed prompt for OpenCode. The prompt includes:
The agent then invokes OpenCode from the command line (e.g., opencode run --prompt "..." --auto-approve).
Execution and Feedback (OpenCode)
OpenCode interprets the prompt, reads relevant files, proposes edits, applies them, and immediately runs the test suite, linters, and any other specified checks. It returns a structured summary: changed files, test results, lint scores, coverage report, and any errors encountered.
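One way to make that summary easy for the planning agent to consume is to parse it into a typed record. The sketch below assumes the summary arrives as JSON; the field names are hypothetical and chosen for illustration, not a documented OpenCode output format.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ExecutionSummary:
    """Hypothetical shape for the report an execution agent sends back."""
    changed_files: list = field(default_factory=list)
    test_success: bool = False
    lint_score: float = 0.0
    security_issues: Optional[int] = None  # None means "not scanned"
    errors: list = field(default_factory=list)


def parse_summary(raw: str) -> ExecutionSummary:
    """Parse JSON text, ignoring unknown keys and defaulting missing ones."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return ExecutionSummary(errors=[f"unparseable output: {exc}"])
    if not isinstance(data, dict):
        return ExecutionSummary(errors=["summary is not a JSON object"])
    known = {f.name for f in fields(ExecutionSummary)}
    return ExecutionSummary(**{k: v for k, v in data.items() if k in known})


sample = '{"changed_files": ["routes/auth.py"], "test_success": true, "lint_score": 9.4, "security_issues": 0, "extra": 1}'
print(parse_summary(sample))
```

The defaults fail closed: a missing or malformed report never looks like a passing one.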
Evaluation and Gate (Hermes Agent)
The Hermes Agent examines the results. If quality checks pass, it marks the task as complete and moves on. If tests fail or code quality degrades, it may:
Integration and Final Review
After all tasks are done, the Hermes Agent can ask OpenCode to run the full integration test suite, generate documentation, and create a pull request summary. A human developer then performs a final review and merges.
This loop ensures that every line of code is validated against a pre‑defined quality bar, and that architectural coherence is maintained across the entire feature.
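The evaluate-and-retry part of this loop can be sketched independently of either tool. In the example below, the retry budget, the report keys, and the `execute` callable (a stand-in for a real OpenCode invocation) are assumptions for illustration.

```python
from typing import Callable

MAX_ATTEMPTS = 3  # assumed retry budget before escalating to a human


def passes_gate(report: dict) -> bool:
    """All three checks must pass; missing keys count as failures."""
    return (
        report.get("test_success") is True
        and report.get("lint_score", 0) >= 9.0
        and report.get("security_issues") == 0
    )


def run_with_gate(task: str, execute: Callable[[str, str], dict]) -> tuple:
    """Re-run a task with failure feedback until the gate passes or the budget is spent.

    Returns (passed, attempts_used).
    """
    feedback = ""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        report = execute(task, feedback)
        if passes_gate(report):
            return True, attempt
        feedback = f"Previous attempt failed the quality gate: {report}"
    return False, MAX_ATTEMPTS


def flaky_executor() -> Callable[[str, str], dict]:
    """Stand-in executor whose first attempt fails its tests."""
    calls = {"n": 0}

    def execute(task: str, feedback: str) -> dict:
        calls["n"] += 1
        return {"test_success": calls["n"] >= 2, "lint_score": 9.5, "security_issues": 0}

    return execute


print(run_with_gate("add endpoint", flaky_executor()))  # (True, 2)
```

Passing the failed report back as feedback is what turns a blind retry into a repair attempt.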
To replicate the system, you will need:
- A Hermes model, hosted or local (for example, via Ollama: ollama pull hermes3:latest). Ensure the model supports function calling.
- OpenCode. Install it following the project README or build it from source. Configure your LLM provider in ~/.config/opencode/config.yaml (you can point OpenCode to the same Hermes model for a consistent experience, though a more coding-optimized model such as deepseek-coder or gpt-4o may suit OpenCode better).
- Python orchestration code. Use the openai library (for API calls) and the subprocess module to invoke OpenCode. Alternatively, use a framework like LangChain to manage agent state, but a minimal custom loop often gives more control.
# ~/.config/opencode/config.yaml
model: "deepseek-coder-v2" # or gpt-4o, claude-3-opus
provider: "openai" # or anthropic, ollama
api_key: "your-api-key"
# Add custom linting and test commands
default_checks:
- "pytest --cov=src --cov-report=term"
- "ruff check ."
- "bandit -r src/ -ll"
This ensures that every time OpenCode runs, it automatically executes these quality checks before reporting success.
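If you prefer to enforce the same checks from the orchestration side, a small runner is enough. The sketch below is an assumption about how such a runner might look, not part of OpenCode; the demo uses trivial stand-in commands so that it runs anywhere, and you would substitute the pytest, ruff, and bandit commands from the configuration.

```python
import shlex
import subprocess
import sys


def run_checks(commands: list) -> list:
    """Run each check in order and stop at the first failure."""
    results = []
    for command in commands:
        proc = subprocess.run(shlex.split(command), capture_output=True, text=True)
        results.append({
            "command": command,
            "passed": proc.returncode == 0,
            "output": (proc.stdout + proc.stderr).strip(),
        })
        if proc.returncode != 0:
            break  # later checks are skipped once one fails
    return results


# Stand-in commands for the demo; replace with the real quality checks.
python = shlex.quote(sys.executable)
demo_commands = [
    f"{python} -c 'print(\"tests ok\")'",
    f"{python} -c 'import sys; sys.exit(1)'",
    f"{python} -c 'print(\"never reached\")'",
]

results = run_checks(demo_commands)
for entry in results:
    print(entry["passed"], entry["output"])
```

Stopping at the first failure keeps feedback short; running every check regardless is an equally valid choice when you want a complete report per attempt.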
Let’s write a Python class that acts as the Hermes Agent. It will use function calling to decide when to invoke OpenCode and how to interpret the results.
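import json
import subprocess

import openai


class HermesDevAgent:
    """Plans work with a Hermes model and delegates implementation to OpenCode."""

    MAX_REPAIRS = 3

    def __init__(self, api_key, model="hermes-3-70b"):
        # OpenAI-compatible client; use the exact model id your provider lists.
        self.client = openai.OpenAI(api_key=api_key, base_url="https://openrouter.ai/api/v1")
        self.model = model
        self.tasks = []
        self.current_plan = None
        self.last_output = ""  # raw OpenCode output from the most recent run

    def run(self, requirement: str):
        # Step 1: Create a plan
        plan = self.plan(requirement)
        self.current_plan = plan
        # Step 2: Execute each task
        for task in plan["tasks"]:
            success = self.execute_task(task)
            if not success:
                # Attempt repair; raises if the task cannot be fixed
                self.repair_task(task)
        # Step 3: Final integration
        self.finalize()

    def plan(self, requirement: str):
        prompt = f"""You are an expert software architect. Given the following requirement, produce a JSON plan.
        The plan must contain 'tasks', each with 'id', 'description', 'files_to_modify', and 'acceptance_criteria'.
        Include quality requirements: unit tests, linting pass, no security vulnerabilities.
        Requirement: {requirement}"""
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            response_format={"type": "json_object"},
        )
        return json.loads(response.choices[0].message.content)

    def execute_task(self, task: dict, fix_instructions: str = "") -> bool:
        # Construct a detailed prompt for OpenCode
        prompt = f"""Implement the following task in the existing codebase:
        {task['description']}
        Acceptance criteria: {task['acceptance_criteria']}
        After implementation, ensure:
        - All existing tests pass.
        - New unit tests are added for the new functionality.
        - Ruff linting shows no errors.
        - Bandit security scan finds no issues.
        Return a summary of changes and test results."""
        if fix_instructions:
            prompt += f"\nA previous attempt failed. Apply this guidance:\n{fix_instructions}"
        # Call OpenCode CLI (flags as used in this article; verify them against your OpenCode version)
        result = subprocess.run(
            ["opencode", "run", "--prompt", prompt, "--auto-approve", "--json"],
            capture_output=True,
            text=True,
        )
        self.last_output = result.stdout or result.stderr
        if result.returncode != 0:
            return False
        try:
            output = json.loads(result.stdout)
        except json.JSONDecodeError:
            return False  # unparseable output counts as a failed task
        # Evaluate quality gates; a missing key fails the gate
        return bool(
            output.get("test_success")
            and output.get("lint_score", 0) >= 9.0
            and output.get("security_issues") == 0
        )

    def repair_task(self, task: dict, attempt=1):
        if attempt > self.MAX_REPAIRS:
            raise RuntimeError(f"Task failed after {self.MAX_REPAIRS} repair attempts.")
        # Ask Hermes to analyze the failure and suggest a fix
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{
                "role": "user",
                "content": f"Task: {task['description']}\nThe last attempt failed with:\n"
                           f"{self.last_output}\nExplain the likely cause and give concise fix instructions.",
            }],
        )
        fix_instructions = response.choices[0].message.content
        # Then call OpenCode again with the fix instructions
        if not self.execute_task(task, fix_instructions):
            self.repair_task(task, attempt + 1)

    def finalize(self):
        # Placeholder: run the full test suite, generate documentation, and open a pull request.
        ...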
This simplified agent demonstrates the concept. In production, you’d want async execution, better error parsing, and integration with actual project context.
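One piece of that hardening is validating the plan before acting on it, since a model can return well-formed JSON that still lacks required fields. The following is a minimal sketch assuming the plan is a dict with the task keys named in the planning prompt; the `validate_plan` helper itself is hypothetical.

```python
REQUIRED_TASK_KEYS = {"id", "description", "files_to_modify", "acceptance_criteria"}


def validate_plan(plan: dict) -> list:
    """Return a list of problems; an empty list means the plan is usable."""
    tasks = plan.get("tasks")
    if not isinstance(tasks, list) or not tasks:
        return ["plan must contain a non-empty 'tasks' list"]
    problems = []
    seen_ids = set()
    for index, task in enumerate(tasks):
        if not isinstance(task, dict):
            problems.append(f"task {index}: not an object")
            continue
        missing = REQUIRED_TASK_KEYS - task.keys()
        if missing:
            problems.append(f"task {index}: missing {sorted(missing)}")
        if "id" in task:
            if task["id"] in seen_ids:
                problems.append(f"task {index}: duplicate id {task['id']}")
            seen_ids.add(task["id"])
    return problems


good = {"tasks": [{"id": 1, "description": "Add field", "files_to_modify": ["models/user.py"],
                   "acceptance_criteria": ["migration created"]}]}
bad = {"tasks": [{"id": 1, "description": "Add field"}, {"id": 1}]}

print(validate_plan(good))
print(validate_plan(bad))
```

When validation fails, the problem list can be sent back to the model as a request to regenerate the plan.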
The true power of combining Hermes and OpenCode lies in the automated quality gates. Let’s examine how each quality attribute is enforced:
Hermes Agent plans tasks with explicit acceptance criteria. OpenCode is prompted to first write unit tests (if not already present) and then the implementation. The agent requires that tests pass before marking a task complete. This approximates test‑driven development (TDD), with the added benefit that the AI can generate both the tests and the code, while the Hermes Agent validates the logic.
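Passing tests say little if they exercise only part of the code, so a gate can also require a minimum coverage figure. The pytest check configured earlier uses --cov-report=term, which prints a TOTAL row ending in a percentage. A minimal sketch of extracting it follows; the 80% threshold is an assumption, and the sample report text is illustrative.

```python
import re
from typing import Optional

MIN_COVERAGE = 80.0  # assumed project threshold


def total_coverage(report: str) -> Optional[float]:
    """Return the percentage from the TOTAL row of a terminal coverage report."""
    match = re.search(r"^TOTAL\s+.*?(\d+(?:\.\d+)?)%\s*$", report, flags=re.MULTILINE)
    return float(match.group(1)) if match else None


def coverage_ok(report: str, minimum: float = MIN_COVERAGE) -> bool:
    """A report without a TOTAL row fails the gate."""
    coverage = total_coverage(report)
    return coverage is not None and coverage >= minimum


sample_report = """\
Name                 Stmts   Miss  Cover
----------------------------------------
src/auth.py             80      4    95%
src/models.py           40      8    80%
----------------------------------------
TOTAL                  120     12    90%
"""

print(total_coverage(sample_report), coverage_ok(sample_report))
```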
OpenCode’s configuration includes linting tools (like ruff, pylint, or eslint). After each edit, it runs the linter and returns success only if the result meets the threshold: no reported errors for ruff or eslint, or a minimum score for pylint, which rates code out of 10. The Hermes Agent can be instructed to reject code that introduces technical debt, so the codebase stays clean and consistent over time.
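A numeric lint threshold needs a linter that emits a score. Ruff reports individual violations rather than a score, whereas pylint ends its report with a line of the form "Your code has been rated at 9.50/10". The sketch below extracts that figure; the 9.0 threshold mirrors the gate used in this article, and the sample output is illustrative.

```python
import re
from typing import Optional

LINT_THRESHOLD = 9.0  # same threshold as the quality gate in this article


def pylint_score(output: str) -> Optional[float]:
    """Extract the score from pylint's 'rated at X/10' summary line."""
    match = re.search(r"rated at (-?\d+(?:\.\d+)?)/10", output)
    return float(match.group(1)) if match else None


def lint_ok(output: str, threshold: float = LINT_THRESHOLD) -> bool:
    """Output without a score line fails the gate."""
    score = pylint_score(output)
    return score is not None and score >= threshold


sample_output = """\
src/auth.py:12:0: C0116: Missing function or method docstring (missing-function-docstring)

------------------------------------------------------------------
Your code has been rated at 9.50/10 (previous run: 9.20/10, +0.30)
"""

print(pylint_score(sample_output), lint_ok(sample_output))
```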
By integrating bandit (Python), npm audit, or trivy into OpenCode’s default checks, every change is scanned for common vulnerabilities. The Hermes Agent can also request specific security reviews for authentication or data‑handling modules, using its own reasoning to identify high‑risk areas.
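For the Python case, bandit can emit a machine-readable report with `-f json`, where each entry under `results` carries an `issue_severity` of LOW, MEDIUM, or HIGH. The sketch below counts findings at or above a chosen severity; the helper name and the sample report are assumptions for illustration.

```python
import json

SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def count_findings(report_json: str, min_severity: str = "MEDIUM") -> int:
    """Count bandit findings whose severity is at least min_severity."""
    report = json.loads(report_json)
    floor = SEVERITY_RANK[min_severity]
    return sum(
        1
        for finding in report.get("results", [])
        if SEVERITY_RANK.get(finding.get("issue_severity", "").upper(), 0) >= floor
    )


# Illustrative report, trimmed to the fields used here.
sample_report = json.dumps({
    "results": [
        {"test_id": "B105", "issue_severity": "LOW", "filename": "src/settings.py"},
        {"test_id": "B608", "issue_severity": "MEDIUM", "filename": "src/db.py"},
        {"test_id": "B602", "issue_severity": "HIGH", "filename": "src/tasks.py"},
    ]
})

print(count_findings(sample_report))
print(count_findings(sample_report, min_severity="HIGH"))
```

A MEDIUM floor matches the `-ll` flag in the configuration above, which limits bandit's report to medium severity and higher.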
For performance‑critical applications, the Hermes Agent can instruct OpenCode to run benchmarks before and after changes. If a performance regression exceeds a threshold, the task is sent back for optimization.
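The comparison itself is simple arithmetic. A minimal sketch using the standard library's `timeit` follows; the 10% threshold and the helper names are assumptions, and real benchmarks usually need a quieter environment than a single process run.

```python
import statistics
import timeit
from typing import Callable


def median_runtime(func: Callable[[], object], repeats: int = 5, number: int = 1000) -> float:
    """Median wall-clock time (seconds) for `number` calls, over `repeats` runs."""
    return statistics.median(timeit.repeat(func, repeat=repeats, number=number))


def is_regression(baseline: float, candidate: float, threshold: float = 0.10) -> bool:
    """True when the candidate is slower than the baseline by more than `threshold`."""
    return candidate > baseline * (1 + threshold)


def old_impl() -> int:
    return sum(range(200))


def new_impl() -> int:
    return sum(i for i in range(200))


before = median_runtime(old_impl)
after = median_runtime(new_impl)
print(f"before={before:.6f}s after={after:.6f}s regression={is_regression(before, after)}")
```

Using the median rather than the mean reduces the influence of a single slow run caused by unrelated system activity.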
Imagine we need to add a “forgot password” flow to a web application.
1. Requirement Input
User story: “As a user, I want to reset my password via email so that I can regain access to my account.”
2. Planning (Hermes Agent)
The Hermes Agent queries the existing user model and auth module (via OpenCode’s context) and produces:
{
  "tasks": [
    {"id": 1, "description": "Add reset token fields to User model", "files_to_modify": ["models/user.py"], "acceptance_criteria": ["migration created", "fields validated"]},
    {"id": 2, "description": "Create endpoint POST /auth/forgot-password", "files_to_modify": ["routes/auth.py"], "acceptance_criteria": ["accepts email, always returns 200", "sends email with reset link if user exists", "unit tests pass"]},
    {"id": 3, "description": "Create endpoint POST /auth/reset-password", "files_to_modify": ["routes/auth.py"], "acceptance_criteria": ["validates token, updates password", "unit tests, security scan pass"]}
  ]
}
3. Execution (OpenCode)
For task 2, Hermes prompts OpenCode:
Implement POST /auth/forgot-password endpoint.
- Accept JSON {"email": "..."}.
- Always return 200 OK with generic message.
- If user exists, generate a secure token, store it hashed, and send an email.
- Write unit tests covering valid email, non-existent email, and invalid input.
- Ensure ruff reports no errors and bandit reports no issues.
OpenCode creates/edits files, writes tests, runs pytest, ruff, and bandit. It returns a JSON summary indicating all checks passed.
4. Evaluation
The Hermes Agent verifies that test coverage is adequate (for example, by parsing the coverage report) and that no security issues were found. It then proceeds to the next task.
5. Integration and Merge
When all tasks are complete, Hermes runs the full test suite, generates a CHANGELOG entry, and opens a pull request using the GitHub CLI (another tool call). A human reviews and merges.
The result: a fully implemented, tested, and secure password‑reset feature, built with minimal human intervention but maximal quality assurance.
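The token handling requested in task 2 ("generate a secure token, store it hashed") is the part of this feature most worth checking by hand. A minimal sketch with the standard library follows; the one-hour lifetime and the function names are assumptions, and a real implementation would also mark tokens as used after a successful reset.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta, timezone

TOKEN_TTL = timedelta(hours=1)  # assumed token lifetime


def issue_reset_token(now=None):
    """Return (raw token for the email link, SHA-256 digest to store, expiry time)."""
    now = now or datetime.now(timezone.utc)
    raw = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe encoding
    digest = hashlib.sha256(raw.encode()).hexdigest()
    return raw, digest, now + TOKEN_TTL


def verify_reset_token(presented: str, stored_digest: str, expires_at, now=None) -> bool:
    """Check expiry, then compare digests in constant time."""
    now = now or datetime.now(timezone.utc)
    presented_digest = hashlib.sha256(presented.encode()).hexdigest()
    return now < expires_at and hmac.compare_digest(presented_digest, stored_digest)


raw_token, stored, expiry = issue_reset_token()
print(verify_reset_token(raw_token, stored, expiry))
print(verify_reset_token("guessed-token", stored, expiry))
```

Only the digest is stored, so a leaked database does not reveal usable reset links. A fast hash is acceptable here because the token is high-entropy random data, unlike a user-chosen password.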
To push quality even further, consider these enhancements:
- Documentation coverage: enforce docstring coverage with a tool such as interrogate.
While the Hermes‑OpenCode combo is powerful, it’s not magic. Be aware of these challenges:
How do you know the system is actually improving quality? Track these metrics over time:
In pilot projects, teams using a similar agent‑assisted workflow have reported a 40% reduction in bug density and a 25% increase in feature throughput, all while maintaining rigorous code standards.
The integration of planning agents and execution agents is still in its early days. Soon, we will see:
By adopting these patterns today, you position your team at the forefront of AI‑assisted software engineering.
High‑quality software is not an accident; it is the product of disciplined processes, continuous validation, and thoughtful design. AI tools can either erode quality through unchecked code generation or elevate it by enforcing best practices tirelessly and consistently. The combination of a Hermes Agent—with its powerful planning, reasoning, and orchestration capabilities—and OpenCode—with its direct, test‑driven, and lint‑enforced code manipulation—creates a development environment where every change is conceived, implemented, and verified against a strict quality standard.
You now have a blueprint: set up OpenCode with your preferred coding model, build a Hermes Agent that plans and delegates, and wrap everything in a feedback loop that refuses to accept anything less than excellent code. The future of software development is not about replacing engineers but about giving them superpowers. With Hermes and OpenCode, those superpowers are here, ready to be harnessed for the high‑quality software your users deserve.