AI Agent-Based CI/CD Automation — Hermes Agent Crons' state.db Structure and Isolated Execution Mechanics

A single YAML indentation error that breaks an entire build, or half a day lost to permission issues on an external SaaS account: I've been through both more than once, and I suspect most of you have too. Then there are the moments when you look up the syntax of GitHub Actions' on: block yet again and think, "why do I even have to memorize this?" If traditional YAML-based workflows have ever frustrated you, the Crons system in Hermes Agent, NousResearch's self-evolving AI agent framework, is worth a look. A single sentence such as "every night at 12am, push changes to GitHub" defines an entire nightly auto-commit pipeline. I'll be honest: when I first saw it, I was a little taken aback.

Here's the key point: even though the LLM itself decides whether to deploy, a mechanism that guarantees isolated execution without state contamination between jobs is already built in. This is why it's worth considering seriously as a CI/CD alternative, not just a "cool natural language interface."

In this article, we'll dig into the internal structure of Hermes Crons' state.db storage and how isolated execution is guaranteed. We'll walk through CI health monitoring, nightly code review summaries, and multi-step deployment pipelines with real code.

TL;DR

Hermes Agent Crons defines scheduled tasks with a single natural language prompt sentence, and the LLM decides how to execute them.

All execution history is automatically saved to ~/.hermes/state.db (SQLite WAL + FTS5), and a completely new session is created for each job.

Using no_agent mode, which runs scripts without LLM costs, lets you build a cost-efficient hybrid pipeline.

Core Concepts

Why Hermes Agent Crons Is Different from CI/CD

Traditional CI/CD pipelines require you to explicitly specify "what to run" in YAML. Logic for analyzing test failure causes or deciding contextually whether to halt or proceed with a deployment must be implemented manually via shell scripts or complex conditionals. In contrast, Hermes Crons uses the LLM as the job's execution agent, so the agent interprets the intent expressed in a single prompt sentence and decides on its own how to execute.

Before diving into concrete examples, it's worth touching on the Skills concept first. Skills are a set of tools that can be attached to cron jobs. By connecting built-in or custom Skills like git-ops or docker-build to a job, the agent can actually invoke those tools to perform tasks like Git commits or Docker image builds. It's because of these Skills that a prompt like "deploy this" actually works. Writing Skills themselves is beyond the scope of this article and will be covered separately, but for now, think of them as "a set of tools the agent can use."

json

{
  "name": "nightly-deploy",
  "schedule": "0 2 * * *",
  "skills": ["git-ops", "docker-build"],
  "script": "~/.hermes/scripts/pre_check.sh",
  "workdir": "/srv/myapp",
  "delivery": {"channel": "slack", "target": "#deploys"}
}

This might not look very different from a traditional CI job. The difference is that the stdout of the script named in the script field is injected into the prompt, and the agent then reads that output and judges whether it's safe to proceed with deployment right now. The decision logic is handled by the LLM's reasoning, not by YAML conditionals.
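To make that flow concrete, here is a minimal sketch of what "script stdout injected into the prompt" could look like. The function name build_prompt and the prompt layout are my own assumptions for illustration, not Hermes' actual implementation.

```python
import subprocess
import sys


def build_prompt(script_cmd: list[str], prompt: str) -> str:
    """Run a pre-check command and place its stdout ahead of the job prompt."""
    result = subprocess.run(script_cmd, capture_output=True, text=True, timeout=60)
    script_output = result.stdout.strip()
    # Hypothetical layout: script output first, then the instruction.
    return f"[script output]\n{script_output}\n\n[task]\n{prompt}"


if __name__ == "__main__":
    # Stand-in for pre_check.sh so the example runs anywhere.
    cmd = [sys.executable, "-c", "print('disk: 42% used, migrations: none pending')"]
    print(build_prompt(cmd, "Decide whether it is safe to deploy now."))
```

The point is only that the agent sees the script's findings as plain text before the instruction, so the "if" lives in the model's reading of that text rather than in a conditional.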

state.db — Where All of Hermes' Execution History Accumulates

Hermes stores all sessions and messages in a single SQLite file at ~/.hermes/state.db. The database runs in WAL (Write-Ahead Logging) mode, so readers are not blocked while a write is in progress, which keeps a file-based store dependable even when several jobs and the agent touch it at the same time.

Table	Role
`sessions`	Session metadata — platform, model, start/end time, token count, cost, title
`messages`	Full message history — role, content, tool_calls, including reasoning tokens
`messages_fts`	FTS5 virtual table — automatically indexes content, tool_name, and tool_calls
`schema_version`	Single row tracking migration version

messages_fts is an FTS5 (Full-Text Search 5) virtual table that is automatically synchronized via trigger whenever messages are written. Because it uses a "content table" approach, it stores only the index separately, minimizing DB size while allowing months of execution logs to be searched in milliseconds. The agent itself can perform full-text searches across all past conversations using the built-in session_search tool. I'll admit my first reaction was "SQLite can do that?" — but having used it in practice, even tens of thousands of messages are searched instantly.
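The trigger-synchronized "content table" pattern is easier to see in a small standalone example. This is a sketch using Python's sqlite3 module with a deliberately simplified, hypothetical schema (one indexed column, made-up rows); it is not the real state.db schema, and it requires an SQLite build with FTS5 enabled.

```python
import sqlite3

SCHEMA = """
CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT);
-- External-content FTS5 table: only the index is stored, text stays in messages.
CREATE VIRTUAL TABLE messages_fts USING fts5(
    content, content='messages', content_rowid='id'
);
-- Keep the index in sync on insert and delete.
CREATE TRIGGER messages_ai AFTER INSERT ON messages BEGIN
    INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
CREATE TRIGGER messages_ad AFTER DELETE ON messages BEGIN
    INSERT INTO messages_fts(messages_fts, rowid, content)
    VALUES ('delete', old.id, old.content);
END;
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with three sample messages."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO messages (role, content) VALUES (?, ?)",
        [
            ("assistant", "deploy failed: image build timed out"),
            ("assistant", "all tests passed"),
            ("assistant", "deploy succeeded"),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[int]:
    """Return the ids of messages matching an FTS5 query, in id order."""
    rows = conn.execute(
        "SELECT rowid FROM messages_fts WHERE messages_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in rows]
```

Calling search(conn, "deploy") goes through the index instead of scanning every row, and a quoted query such as '"tests passed"' performs a phrase search.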

WAL Mode (Write-Ahead Logging): Write operations are first recorded in a separate WAL file and then applied to the main DB at checkpoint time. Readers and the writer don't block each other, which improves concurrency considerably, although SQLite still allows only one writer at a time.
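If you want to see this behavior for yourself, the following sketch enables WAL on a throwaway database and reads from a second connection while a write transaction is still open. The table name and file path are made up for the demo; it never touches state.db.

```python
import os
import sqlite3
import tempfile


def wal_demo(db_path: str) -> tuple[str, int, int]:
    """Return (journal mode, rows seen during an open write, rows seen after commit)."""
    # isolation_level=None gives manual transaction control (autocommit mode).
    writer = sqlite3.connect(db_path, isolation_level=None)
    mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    writer.execute("CREATE TABLE runs (job TEXT)")
    writer.execute("INSERT INTO runs VALUES ('first')")

    # Open a write transaction and leave it uncommitted.
    writer.execute("BEGIN IMMEDIATE")
    writer.execute("INSERT INTO runs VALUES ('second')")

    # A separate connection can still read; it sees the last committed snapshot.
    reader = sqlite3.connect(db_path)
    during = reader.execute("SELECT COUNT(*) FROM runs").fetchall()[0][0]

    writer.execute("COMMIT")
    after = reader.execute("SELECT COUNT(*) FROM runs").fetchall()[0][0]

    reader.close()
    writer.close()
    return mode, during, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(wal_demo(os.path.join(tmp, "demo.db")))
```

The reader returns immediately instead of waiting for the writer, and it only sees the second row once the write has been committed.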

There is one important caveat. The official documentation explicitly prohibits directly querying state.db from external sources, because the internal schema may change between releases. If you need data for audit purposes, it's safer to access it through the official API or the session_search tool.

Isolation Mechanism — Why There's No State Contamination Between Jobs

When running cron jobs in production, you sometimes find yourself wondering, "did the state from the previous job affect this one?" Hermes blocks this problem structurally.

Every time the scheduler executes a job, it creates a fresh AIAgent session with completely empty conversation history and context. Regardless of what state a previous job was in or what failures it encountered, the next job starts without any of it. It's the same clean-slate guarantee that container-based CI runners provide by spinning up a new image for every execution, applied to the agent's conversation state rather than to the filesystem.
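As a mental model, the idea fits in a few lines. The Session class and run_job function below are hypothetical stand-ins I made up for illustration; they are not the AIAgent API.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for an agent session."""
    job_name: str
    history: list[str] = field(default_factory=list)


def run_job(job_name: str, prompt: str) -> Session:
    """Run one job in a brand-new session."""
    # A new Session is built for every run, so history always starts empty.
    session = Session(job_name=job_name)
    session.history.append(f"user: {prompt}")
    return session
```

Because nothing is shared between two calls to run_job, whatever the first run put into its history cannot show up in the second.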

Duplicate execution prevention is handled via a ~/.hermes/cron/.tick.lock file lock. Because it's a cross-process file lock, even if multiple Hermes instances start on the same machine, the same job is prevented from running twice simultaneously.
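For readers who haven't used file locks before, this is roughly what a non-blocking, cross-process lock looks like on POSIX systems. It's a generic sketch built on fcntl.flock; I'm not claiming this is how Hermes implements .tick.lock, and the helper names try_acquire and release are my own.

```python
import fcntl
import os
import tempfile
from typing import IO, Optional


def try_acquire(lock_path: str) -> Optional[IO[str]]:
    """Return an open handle holding an exclusive lock, or None if it is taken."""
    handle = open(lock_path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle: IO[str]) -> None:
    """Release the lock and close the underlying file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), ".tick.lock")
    first = try_acquire(path)
    print("first holder got the lock:", first is not None)
    print("second attempt got the lock:", try_acquire(path) is not None)
    if first is not None:
        release(first)
```

A scheduler tick that fails to get the lock simply skips its turn, which is what keeps two instances from running the same job at once.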

text

~/.hermes/
├── state.db              # Persistent storage for all sessions and messages
├── cron/
│   ├── .tick.lock        # File lock for duplicate execution prevention
│   ├── jobs/             # Job definition JSON files
│   └── output/
│       └── {job_id}/
│           └── {timestamp}.md   # Per-job execution logs
└── scripts/              # Shell scripts for no_agent mode

Isolation: Because each job starts in an independent session, a failure or side effect in job A cannot contaminate job B's execution environment.

Practical Applications

Example 1: CI Health Monitoring Gate (no_agent Watchdog Mode)

The no_agent option introduced in the v0.13.0 Tenacity release is honestly the feature I was most glad to see. For simple check jobs that don't need LLM reasoning, you can eliminate API costs entirely. Specify it as "no_agent": true in JSON config, or as no_agent=True when creating jobs directly via the Python API.

bash

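#!/bin/bash
# ~/.hermes/scripts/ci_check.sh

# curl -f fails on HTTP errors; jq -r extracts the raw string without quotes
STATUS=$(curl -fsS https://api.github.com/repos/org/repo/actions/runs \
  | jq -r '.workflow_runs[0].conclusion')

# conclusion is null while the latest run is still in progress, so skip that case.
# Using if (instead of [ ... ] && echo) keeps the exit status 0 on success.
if [ "$STATUS" != "success" ] && [ "$STATUS" != "null" ]; then
  echo "CI FAILED: ${STATUS:-unknown}"
fi
# If stdout is empty, stay silent: no notification on success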

json

{
  "name": "ci-health-watchdog",
  "schedule": "*/5 * * * *",
  "script": "~/.hermes/scripts/ci_check.sh",
  "no_agent": true,
  "delivery": {
    "channel": "slack",
    "target": "#alerts",
    "only_if_output": true
  }
}

Component	Role
`"no_agent": true`	Passes script stdout directly without any LLM calls
`only_if_output: true`	Silences empty stdout — "notify only on problems" pattern
`*/5 * * * *`	Runs every 5 minutes (minimum unit is 60 seconds)

The strength of this pattern is zero API cost. For jobs that can be handled by scripts — CI status checks, disk space alerts, service health checks — registering them with "no_agent": true lets you run them at high frequency without worrying about costs.
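The "notify only on problems" behavior boils down to a single check on the script's stdout. Here's a small sketch of that logic; run_watchdog is a hypothetical helper of mine, and the actual delivery step (Slack and so on) is left out.

```python
import subprocess
import sys
from typing import Optional


def run_watchdog(cmd: list[str], only_if_output: bool = True) -> Optional[str]:
    """Run a script with no LLM call; return the message to deliver, or None."""
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
    output = result.stdout.strip()
    if only_if_output and not output:
        return None  # healthy: nothing to report
    return output


if __name__ == "__main__":
    # Stand-ins for ci_check.sh: one silent (healthy) run, one failing run.
    print(run_watchdog([sys.executable, "-c", "pass"]))
    print(run_watchdog([sys.executable, "-c", "print('CI FAILED: failure')"]))
```

A healthy check produces no message at all, so a job that runs every 5 minutes stays quiet until something actually breaks.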

Example 2: Nightly Automated Code Review Summary

This job retrieves the PR list every midnight, summarizes changes, categorizes high-risk items, and delivers them to Slack. Setting workdir to the repository root automatically injects AGENTS.md or CLAUDE.md, helping the agent understand the project context.

json

{
  "name": "nightly-pr-review-summary",
  "schedule": "0 0 * * *",
  "skills": ["github-mcp"],
  "workdir": "/path/to/myrepo",
  "prompt": "Retrieve the list of PRs opened today, summarize the changes in each PR, categorize items with high deployment risk, and send the results to the Slack #dev-review channel",
  "delivery": {
    "channel": "slack",
    "target": "#dev-review"
  }
}

To implement this with traditional CI/CD, you'd need to write the entire pipeline as scripts: GitHub API calls → diff parsing → risk classification logic → Slack message formatting. In Hermes, a single prompt sentence replaces all of that logic. LLM API costs are incurred, but compared to the development time it would take to implement this manually as scripts, that's a reasonable tradeoff.

Example 3: Chaining Dependent Jobs with context_from

Using the context_from field, you can automatically prepend job A's last execution stdout to job B's prompt, composing a sequential pipeline. What gets injected is the full text output of the preceding job, and there is a length limit — so be aware that if the preceding job's output is excessively long, it may be truncated.

json

[
  {
    "name": "run-tests",
    "schedule": "0 1 * * *",
    "skills": ["pytest-runner"],
    "prompt": "Run the entire test suite and summarize any failures and coverage"
  },
  {
    "name": "deploy-decision",
    "schedule": "30 1 * * *",
    "context_from": "run-tests",
    "skills": ["git-ops", "docker-build"],
    "workdir": "/srv/myapp",
    "prompt": "Based on the test results, decide whether to proceed with a production deployment, and execute the deployment if it is safe to do so"
  }
]

Field	Behavior
`context_from: "run-tests"`	Automatically prepends the last execution stdout of the `run-tests` job to the prompt
`"30 1 * * *"`	Runs at 01:30, 30 minutes after the first job starts at 01:00, leaving a buffer for the upstream job to finish
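To picture what context_from does with an overly long upstream output, here's a sketch of prepend-with-truncation. The 4000-character limit, the "[truncated]" marker, and the choice to keep the tail are all assumptions of mine; the article only establishes that a length limit exists.

```python
def inject_context(upstream_output: str, prompt: str, max_chars: int = 4000) -> str:
    """Prepend upstream output to a prompt, truncating it if it is too long."""
    if len(upstream_output) > max_chars:
        # Assumption: keep the tail, since test summaries usually come last.
        upstream_output = "[truncated]\n" + upstream_output[-max_chars:]
    return f"{upstream_output}\n\n{prompt}"
```

Whichever end gets cut in practice, the takeaway is the same: have the upstream job produce a short summary so nothing important falls outside the limit.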

What makes this pattern interesting is that instead of hardcoding deployment decision logic like "deploy if test failures are 0 and coverage is above 80%," the agent reads the context of the test results — which modules failed, whether any tests were temporarily skipped, etc. — and makes the judgment. I was skeptical at first about whether this would actually be accurate, but with well-structured prompts and test reports injected together, it turns out to be more accurate than expected.

Pros and Cons Analysis

Advantages

Item	Details
Configuration simplicity	Define scheduled tasks with a single natural language prompt sentence, no YAML
LLM reasoning	Can handle contextual decisions that rule-based CI/CD cannot — analyzing test failure causes, deciding whether to halt a deployment, etc.
No external dependencies	No serverless or SaaS accounts needed; self-contained in a single `~/.hermes/` directory
Isolation reliability	No state contamination between jobs; file lock guarantees no duplicate execution
Multi-channel delivery	Built-in support for delivering results to Slack, Discord, Email, and other channels
FTS5 audit trail	All execution history is automatically preserved in state.db in a full-text searchable form
no_agent cost optimization	Simple script jobs can run without LLM calls

Disadvantages and Caveats

Item	Details	Mitigation
Unstable state.db schema	Internal schema may change between releases; direct queries are officially prohibited	Use only the official API and `session_search` tool
60-second minimum granularity	Scheduler tick interval is 60 seconds, so sub-minute precision is not possible	Use a systemd timer or a dedicated daemon for jobs that need sub-minute intervals (traditional cron is also minute-granular)
FTS5 dependency	SQLite FTS5 may be missing in some Python 3.11 macOS builds	Use official Docker images to standardize the environment
Single-machine limitation	Default setup is based on local filesystem — difficult for distributed team collaboration	Wait for Pluggable SessionDB RFC (#23717) to complete, or use Docker Compose for a shared environment
LLM costs	All jobs except `no_agent` incur LLM API calls	Aggressively use `"no_agent": true` for high-frequency simple jobs
Debugging visibility	Failure logs exist only in `~/.hermes/cron/output/{job_id}/{timestamp}.md`	Recommend integrating Web Dashboard or a dedicated log aggregation tool

FTS5 (Full-Text Search 5): A full-text search extension module built into SQLite. It is significantly faster than LIKE searches and supports tokenization, ranking, and phrase search. However, it must be enabled at compile time and may be missing from some package builds.
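Before blaming Hermes for a search failure, it's worth checking whether your Python's bundled SQLite has FTS5 at all. A quick probe, using only the standard library (the function name is my own):

```python
import sqlite3


def fts5_available() -> bool:
    """Return True if this Python's SQLite build can create FTS5 tables."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE VIRTUAL TABLE probe USING fts5(x)")
        return True
    except sqlite3.OperationalError:
        # Raised as "no such module: fts5" when the extension is not compiled in.
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite version:", sqlite3.sqlite_version)
    print("FTS5 available:", fts5_available())
```

If this prints False, switching to a Python build or Docker image with FTS5 compiled in is the simplest fix.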

The Most Common Mistakes in Practice

Attempting to extract audit logs by directly querying state.db — The schema can change at any time, and it's also an officially prohibited use case. If you need an audit trail, using the session_search tool or the Web Dashboard's session viewer is the safe approach.
Designing all jobs to run as LLM agents — If you omit the "no_agent" option for simple jobs like disk checks or HTTP pings, API costs accumulate unnecessarily. It's important to develop the habit of classifying jobs that require no reasoning as "no_agent": true from the start.
Timing misses in context_from chains — If the downstream job's schedule triggers while the upstream job is still running, it will receive the output from the previous run. It's best to schedule the downstream job with at least 10–15 minutes of buffer added to the upstream job's expected completion time.
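The third mistake is just arithmetic, so it's easy to sanity-check before you commit a schedule. The helpers below are a hypothetical sketch: the 15-minute default buffer follows the rule of thumb above, and the one-hour staleness threshold is an arbitrary value I picked.

```python
from datetime import datetime, timedelta


def earliest_downstream_start(
    upstream_start: datetime,
    expected_duration: timedelta,
    buffer: timedelta = timedelta(minutes=15),
) -> datetime:
    """Earliest time a downstream job should be scheduled."""
    return upstream_start + expected_duration + buffer


def is_stale(
    upstream_finished: datetime,
    downstream_start: datetime,
    max_age: timedelta = timedelta(hours=1),
) -> bool:
    """True if the upstream output is older than max_age (likely a previous run)."""
    return downstream_start - upstream_finished > max_age
```

With an upstream job starting at 01:00 and expected to take 15 minutes, earliest_downstream_start returns 01:30, which matches the schedule used in Example 3.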

Closing Thoughts

Hermes Agent's Crons system is a tool that lets you actually try, at a production level, a shift from the CI/CD paradigm of "explicitly specifying what to run in YAML" to an agentic execution model where the LLM directly decides whether to deploy. Thanks to two foundations — state.db's FTS5-based audit trail and fully isolated per-job sessions — you can weave the LLM's judgment into your pipeline without sacrificing the reproducibility and traceability that traditional CI/CD provides.

Three steps you can start with right now:

Register a single no_agent watchdog job first. Install with pip install hermes-agent, write a simple health check script in ~/.hermes/scripts/, and place a JSON job file configured with "no_agent": true in the ~/.hermes/cron/jobs/ directory. You can see how the cron system works firsthand, with zero LLM API costs.
Run the Web Dashboard with the hermes dashboard command. The local dashboard integrates a cron manager, live log viewer, and session management, so you can view the log files under ~/.hermes/cron/output/ through a UI.
Port one of the most frequently touched jobs in your existing CI into an LLM agent job. It's easier to debug if you start with a standalone job without context_from, then connect it into a context_from chain once the behavior is confirmed stable.
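If you'd rather generate the job file for the first step than type it by hand, a few lines of Python will do. The fields come from Example 1; the write_job helper and the <name>.json file naming are my own assumptions, so check the official documentation for the naming the scheduler actually expects.

```python
import json
from pathlib import Path


def write_job(jobs_dir: Path, job: dict) -> Path:
    """Write a job definition as <name>.json and return its path."""
    jobs_dir.mkdir(parents=True, exist_ok=True)
    path = jobs_dir / f"{job['name']}.json"
    path.write_text(json.dumps(job, indent=2) + "\n", encoding="utf-8")
    return path


# Same watchdog definition as in Example 1.
WATCHDOG_JOB = {
    "name": "ci-health-watchdog",
    "schedule": "*/5 * * * *",
    "script": "~/.hermes/scripts/ci_check.sh",
    "no_agent": True,
    "delivery": {"channel": "slack", "target": "#alerts", "only_if_output": True},
}
```

Calling write_job(Path.home() / ".hermes" / "cron" / "jobs", WATCHDOG_JOB) would place the file in the jobs directory described above.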


AI Agent-Based CI/CD Automation — Hermes Agent Crons' state.db Structure and Isolated Execution Mechanics | DEV BAK - 기술블로그


