Hermes Agent SOUL.md and the 5-Pillar Architecture — An Inside Look at the Tier 3 Skill Auto-Generation Mechanism

At the end of every session, the agent extracts the procedure from the work it performed in that conversation, saves it as a skill file, and reuses that skill in subsequent sessions — this was the first thing that caught my attention when I first encountered Hermes Agent. The phrase "self-improving agent" is easy to dismiss as marketing copy, but I wanted to verify exactly how this particular mechanism works in concrete terms.

Honestly, I was skeptical at first. I had to dig into the code myself to figure out whether "performance improves the more you use it" was a clever repackaging of fine-tuning, or actually something different. Hermes creates meaningful performance gains by solidifying LLM call results into procedural skill files on the filesystem and selectively injecting those skills into the system prompt of the next session. It's closer to external memory expansion than fine-tuning.

In this post, I'll break down the 5-Pillar architecture that underpins this mechanism one piece at a time — tracing it all the way to the point where the design intent becomes unmistakably clear: why SOUL.md occupies the first slot in the system prompt, and why prompt_builder.py reads only the frontmatter (the YAML metadata block at the top of a file) first.

Core Concepts

5-Pillar: How the Five Pillars Interlock

The fastest way to understand Hermes is to grasp how its five components relate to one another. Each looks like an independent component, but in practice they interlock organically.

Pillar	File / Component	Core Role
Memory	`MEMORY.md`, `USER.md`	Maintaining context across sessions
Skills	`~/.hermes/skills/*.md`	Reusing procedural knowledge
SOUL	`~/.hermes/SOUL.md`	Defining agent identity and tone
Crons	`state.db` + gateway daemon	Proactive scheduling
Self-Improvement	Auto-generation loop	Usage experience → skill extraction

Pillar 1 — How Memory Persists After a Session Ends

MEMORY.md is the agent's own notepad; USER.md records user preferences and task patterns. The combined token limit for both files is capped at roughly 1,300 tokens, which forces you to think about prioritization — a little frustrating at first, but ultimately a structure that leaves only "what truly matters."

The most important thing to understand upfront is that memory files are loaded exactly once at session start and then frozen. Updates made to the files mid-session are not reflected in the current session's prompt. I initially thought this was a bug, but it turned out to be an intentional design decision — the reason becomes immediately clear in Pillar 3.

If you need immediate reflection, you can restart the session.

Pillar 2 — The Procedural Recipe System: Skills

The most surprising thing I noticed when actually using it was how simple the structure of a skill file is. YAML frontmatter for metadata, markdown body for the actual procedure — that's it.

yaml

---
name: competitive-analysis
description: 경쟁사 주간 동향을 분석하고 Slack 요약을 전송하는 절차
trigger: [competitive, analysis, market research]
tier: 3
created_at: 2026-04-22T09:14:00Z
---
 
## 절차
 
1. 타깃 경쟁사 리스트를 USER.md에서 로드
2. 각 사이트의 최근 7일 변경사항 크롤링
3. 변경사항을 카테고리별로 분류 (제품, 가격, 채용)
4. 요약본 작성 후 Slack #competitive-intel 채널 전송
 
## 주의사항
 
- 크롤링 실패 시 3회 재시도 후 실패 알림
- 요약본은 500자 이내로 압축

Progressive Disclosure: prompt_builder.py reads only the frontmatter of a skill file first to assess relevance. The actual body is loaded only when the skill is deemed necessary. A practical design choice that conserves the context window.

The core logic behind this behavior can be summarized as follows:

python

# agent/prompt_builder.py (핵심 발췌 — 개념 수준)
def load_skills_for_session(query: str) -> list[Skill]:
    skills_dir = Path("~/.hermes/skills/").expanduser()
    candidates = []
 
    for skill_file in skills_dir.glob("*.md"):
        # 1단계: 프런트매터만 파싱 (본문 미로드)
        frontmatter = parse_frontmatter_only(skill_file)
        relevance = score_relevance(frontmatter, query)
        if relevance > THRESHOLD:
            candidates.append((skill_file, frontmatter, relevance))
 
    # 2단계: 상위 N개만 본문 로드
    candidates.sort(key=lambda x: x[2], reverse=True)
    return [load_full_skill(path) for path, _, _ in candidates[:MAX_SKILLS]]

The reason prompt size doesn't explode even as skills grow to dozens is precisely this two-stage loading approach.

Pillar 3 — Why SOUL.md Occupies the First Slot

SOUL.md defines the agent's tone, values, and response style. But it carries significance beyond being a simple configuration file.

Looking at the system prompt assembly order, SOUL.md sits at slot #1 — the very beginning.

css

[slot 1] SOUL.md
[slot 2] MEMORY.md + USER.md
[slot 3] Relevant skill frontmatter → selected skill bodies
[slot 4] Context files (.hermes.md / AGENTS.md / CLAUDE.md)
[slot 5] Tool usage guide
[slot 6] Model-specific instructions (provider-specific optimizations for Anthropic, OpenAI, etc.)

In Pillar 1, I promised to explain "why memory is frozen at session start" — here's the answer. Prompt caches from Anthropic and OpenAI require a stable prefix to generate cache hits. Placing the session-invariant SOUL.md first and the relatively stable MEMORY.md after it increases cache hit rates, which in turn reduces API costs. This is a design decision driven by economics, not performance.

bash

# SOUL.md 현재 내용 확인
> read me your soul file
 
# 피드백으로 직접 업데이트 (에이전트가 파일을 수정)
> 너무 장황해. 핵심만 짧게 답해줘
> 공식적인 어조 그만. 편하게 말해줘

Pillar 4 — From Reactive to Proactive: Crons

A built-in gateway daemon checks state.db every 60 seconds to execute scheduled tasks. Because it runs in an isolated agent session, it does not affect the current conversation.

# 자연어로 스케줄 등록
> 매일 밤 12시에 스테이징 서버 로그를 분석하고 Slack에 요약 전송해줘

This single command causes Hermes to simultaneously do two things — register a cron job and create a Tier 3 skill file containing the corresponding procedure. And the more the same task is repeated, the more refined the skill becomes.

Pillar 5 — Usage Experience Solidifies Into Procedure: Self-Improvement

This is Hermes's core differentiator. There are four conditions under which a Tier 3 skill is auto-generated:

A complex task is completed with 5 or more tool calls in a single session
The task has a multi-step structure
Error recovery or user correction occurs
The user explicitly confirms the result

All four conditions are heuristics for judging "is this task worth repeating?" The inclusion of error recovery as a condition is interesting — the error recovery process itself captures "what pitfalls exist and how to avoid them."

And starting from v0.12.0, an autonomous Curator scores, consolidates, and cleans up the skill library on a 7-day cycle. Similar skills are merged, and skills that aren't actually being used have their priority lowered. Not perfect, but a practical mechanism that prevents the library from accumulating indefinitely.

Practical Application

Example 1: Tracing the Full Tier 3 Skill Auto-Generation Flow

User requests a competitive analysis task
  → Agent performs a 7-step multi-step task (12 tool calls)
  → Crawling fails midway → recovers via retry logic (error recovery occurs)
  → User: "Perfect, do this every week" (explicit confirmation)

When this flow satisfies all four conditions, the following pipeline executes:

sql

Execute      → Perform the actual task (tool calls, API requests, file processing, etc.)
Evaluate     → Assess reuse value (5+ tool calls + multi-step + confirmation)
Extract      → Extract procedure, error patterns, and validation steps from the session log
Write        → Create a skill file in ~/.hermes/skills/
Validate     → Tool registry validation + dangerous pattern scan
Discoverable → Included in frontmatter scan targets starting from the next session

The code below is pseudocode intended to explain how the mechanism works. The actual implementation and detailed interfaces may differ.

python

# agent/skill_extractor.py (의사 코드 — 개념 설명용)
 
def should_generate_skill(session: Session) -> bool:
    return (
        session.tool_call_count >= 5
        and session.is_multistep
        and (session.had_error_recovery or session.had_user_correction)
        and session.user_confirmed
    )
 
def extract_procedure_steps(session: Session) -> list[str]:
    steps = []
    tool_calls = [tc for tc in session.tool_calls if tc.success]
 
    for i, call in enumerate(tool_calls):
        # 사용자 수정이 있었던 단계는 주의사항으로 마킹
        if call.had_user_correction:
            steps.append(
                f"{i+1}. ⚠️ {call.description} (수정 포인트: {call.correction_note})"
            )
        else:
            steps.append(f"{i+1}. {call.description}")
 
    return steps
 
def extract_skill(session: Session) -> Skill:
    procedure = extract_procedure_steps(session)
    pitfalls = extract_error_patterns(session)
    validation = extract_validation_steps(session)
 
    return Skill(
        name=generate_skill_name(session),
        procedure=procedure,
        pitfalls=pitfalls,
        validation=validation,
        tier=3,
    )

The key point is how extract_procedure_steps marks error recovery steps with ⚠️ and preserves them. The skill captures not just a simple success procedure, but also "where to be careful."

In the Validate step, along with tool registry verification, it scans for dangerous patterns such as prompt injection (in the context of AI agents, this refers to external inputs inadvertently modifying agent behavior — conceptually similar to SQL injection in web security, but targeting the LLM prompt), TODOs, and placeholders. The security layer is thin, but it's better than nothing.

Generated skills are saved to ~/.hermes/skills/ and are included in frontmatter scan targets starting from the next session. They cannot be used in the session in which they were created.

Example 2: Building a Consistent Coding Assistant via SOUL.md Customization

markdown

---
# ~/.hermes/SOUL.md
---
 
## 정체성
 
나는 시니어 풀스택 개발자 스타일로 응답한다. 핵심을 먼저 말하고, 부연은 짧게.
 
## 코딩 스타일
 
- TypeScript strict mode 기본
- async/await만 사용 (Promise.then 금지)
- 2-space 들여쓰기
 
## 응답 원칙
 
- 500자 넘는 답변은 요약 먼저, 상세 내용은 접어서
- 코드 블록엔 언어 항상 명시
- 설명 없이 코드만 요청받으면 코드만 반환

It's recommended to separate project-specific context into a .hermes.md file. This is also where you can remember that keeping SOUL.md short improves cache hit rates.

markdown

---
# 프로젝트 루트/.hermes.md
---
 
## 프로젝트 컨텍스트
 
- 스택: Next.js 15 + NestJS + PostgreSQL
- 테스트: 실제 DB 연결 필수 (mock 금지)
- 배포: Vercel (frontend) + Railway (backend)

File	Scope	Modified By
`SOUL.md`	Global (all projects)	User + agent
`USER.md`	Global (user preferences)	Agent (automatic)
`.hermes.md`	Per-project	User

Example 3: Automating a DevOps Workflow with Cron + Skill Combination

> Every day at 9 AM, compile a list of GitHub Actions failures,
  categorize them by severity, and send them to Slack #dev-alerts.
  If there are no failures, don't send a message.

As Hermes processes this request, it simultaneously performs the following internally:

yaml

# 자동 생성된 스킬 파일 예시
---
name: github-actions-failure-report
description: GitHub Actions 실패를 심각도별 분류 후 Slack 전송
trigger: [github actions, CI failure, daily report]
tier: 3
cron: "0 9 * * *"
---
 
## 절차
 
1. GitHub API로 최근 24시간 워크플로 실행 결과 조회
2. 실패 항목 심각도 분류 (critical / warning / info)
3. critical 항목 없으면 실행 중단
4. 요약 메시지 포맷팅 후 Slack 전송
 
## 에러 처리
 
- GitHub API 타임아웃 시 3회 재시도
- Slack 전송 실패 시 이메일 폴백

It's notable that the cron field is included directly in the skill file. Pillar 4 (Crons) and Pillar 2 (Skills) don't merely coexist — they integrate into a single file.

Pros and Cons

Advantages

Item	Details
Cumulative learning	Reports of 40% speed improvement on similar tasks when holding 20+ self-generated skills
Context efficiency	Progressive Disclosure minimizes unnecessary token consumption
Cache optimization	Immutable system prompt maximizes Anthropic/OpenAI prompt cache utilization
Declarative configuration	SOUL.md, MEMORY.md, USER.md are all plain markdown, directly editable
Model-agnostic	Compatible with both Anthropic and OpenAI APIs
Open-source ecosystem	520+ community skills on agentskills.io, including 16 official Anthropic skills

Disadvantages and Caveats

Item	Details	Mitigation
Prompt injection risk	A malicious session exceeding the 5-tool-call threshold can permanently store a corrupted skill (Issue #25833)	Regularly audit auto-generated skills; actively use user-locked skills
No accuracy guarantee	No mechanism-level accuracy guarantee for auto-generated skills	Periodically review Curator scoring results; manually review critical skills
Memory size limit	Combined MEMORY.md + USER.md cap of ~1,300 tokens	Keep only high-priority items; separate the rest into per-project `.hermes.md`
No mid-session reflection	Memory updates during a session are not applied to the current prompt	Restart the session if immediate reflection is needed
Risk of editing `prompt_builder.py`	A globally affecting product file; unsuitable for general customization	Use SOUL.md, USER.md, and `.hermes.md` for customization

User-locked skills: When a user locks a specific skill, the agent can only read it — modification is not permitted. Well-suited for stable procedures that don't need automatic improvement. Community discussion is ongoing in GitHub Issue #17583.

The Most Common Mistakes in Practice

Writing SOUL.md too long — SOUL.md occupies the first slot of the system prompt, but the longer it gets, the more its cache hit rate advantage erodes and the more context it consumes. It's more effective to include only core identity and distribute the rest to USER.md or .hermes.md.
Leaving auto-generated skills unreviewed — The Curator scores and cleans up on a 7-day cycle, but it's not perfect. Skills generated during error recovery in particular can solidify temporary workarounds for special circumstances into general procedures. It helps to make a habit of periodically browsing ~/.hermes/skills/.
Not explicitly confirming important work mid-session — One of the conditions for Tier 3 skill generation is "explicit user confirmation." Responses like "Perfect" or "Keep using this going forward" act as skill generation triggers. Conversely, if you casually confirm something while trying an experimental approach, you may end up with an unwanted skill being generated.

Closing Thoughts

Once you understand this architecture, the difference between using Hermes simply as "a smarter chatbot" versus "a procedural library that grows to fit your workflow" becomes unmistakably clear. How you use it ultimately determines what kind of agent it becomes — that's the structure.

Three steps you can start right now:

Open ~/.hermes/SOUL.md directly and write the agent's response tone and coding style to match your preferences. You can check the current state with the > read me your soul file command, and the agent will update the file directly with a single line of feedback.
Perform a task you repeat at least once a week 3–4 times and naturally satisfy the Tier 3 skill auto-generation conditions. Leave an explicit confirmation response like "Perfect" or "Keep using this going forward" when the task completes — this triggers skill generation.
Open the ~/.hermes/skills/ directory roughly every two weeks and browse the auto-generated skills. Delete skills that don't match your intentions, and protect frequently used skills with user-locked to keep things running stably.

References

#AI에이전트#자기개선#스킬시스템#프롬프트엔지니어링#LLM#Python#프롬프트캐싱#멀티에이전트#자동화#메모리관리

Hermes Agent SOUL.md and the 5-Pillar Architecture — An Inside Look at the Tier 3 Skill Auto-Generation Mechanism | DEV BAK - 기술블로그

Hermes Agent SOUL.md and the 5-Pillar Architecture — An Inside Look at the Tier 3 Skill Auto-Generation Mechanism

Core Concepts

5-Pillar: How the Five Pillars Interlock

The fastest way to understand Hermes is to grasp how its five components relate to one another. Each looks like an independent component, but in practice they interlock organically.

Pillar	File / Component	Core Role
Memory	`MEMORY.md`, `USER.md`	Maintaining context across sessions
Skills	`~/.hermes/skills/*.md`	Reusing procedural knowledge
SOUL	`~/.hermes/SOUL.md`	Defining agent identity and tone
Crons	`state.db` + gateway daemon	Proactive scheduling
Self-Improvement	Auto-generation loop	Usage experience → skill extraction

Pillar 1 — How Memory Persists After a Session Ends

If you need immediate reflection, you can restart the session.

Pillar 2 — The Procedural Recipe System: Skills

The most surprising thing I noticed when actually using it was how simple the structure of a skill file is. YAML frontmatter for metadata, markdown body for the actual procedure — that's it.

yaml

---
name: competitive-analysis
description: 경쟁사 주간 동향을 분석하고 Slack 요약을 전송하는 절차
trigger: [competitive, analysis, market research]
tier: 3
created_at: 2026-04-22T09:14:00Z
---
 
## 절차
 
1. 타깃 경쟁사 리스트를 USER.md에서 로드
2. 각 사이트의 최근 7일 변경사항 크롤링
3. 변경사항을 카테고리별로 분류 (제품, 가격, 채용)
4. 요약본 작성 후 Slack #competitive-intel 채널 전송
 
## 주의사항
 
- 크롤링 실패 시 3회 재시도 후 실패 알림
- 요약본은 500자 이내로 압축

Progressive Disclosure: prompt_builder.py reads only the frontmatter of a skill file first to assess relevance. The actual body is loaded only when the skill is deemed necessary. A practical design choice that conserves the context window.

The core logic behind this behavior can be summarized as follows:

python

# agent/prompt_builder.py (핵심 발췌 — 개념 수준)
def load_skills_for_session(query: str) -> list[Skill]:
    skills_dir = Path("~/.hermes/skills/").expanduser()
    candidates = []
 
    for skill_file in skills_dir.glob("*.md"):
        # 1단계: 프런트매터만 파싱 (본문 미로드)
        frontmatter = parse_frontmatter_only(skill_file)
        relevance = score_relevance(frontmatter, query)
        if relevance > THRESHOLD:
            candidates.append((skill_file, frontmatter, relevance))
 
    # 2단계: 상위 N개만 본문 로드
    candidates.sort(key=lambda x: x[2], reverse=True)
    return [load_full_skill(path) for path, _, _ in candidates[:MAX_SKILLS]]

The reason prompt size doesn't explode even as skills grow to dozens is precisely this two-stage loading approach.

Pillar 3 — Why SOUL.md Occupies the First Slot

SOUL.md defines the agent's tone, values, and response style. But it carries significance beyond being a simple configuration file.

Looking at the system prompt assembly order, SOUL.md sits at slot #1 — the very beginning.

css

[slot 1] SOUL.md
[slot 2] MEMORY.md + USER.md
[slot 3] Relevant skill frontmatter → selected skill bodies
[slot 4] Context files (.hermes.md / AGENTS.md / CLAUDE.md)
[slot 5] Tool usage guide
[slot 6] Model-specific instructions (provider-specific optimizations for Anthropic, OpenAI, etc.)

bash

# SOUL.md 현재 내용 확인
> read me your soul file
 
# 피드백으로 직접 업데이트 (에이전트가 파일을 수정)
> 너무 장황해. 핵심만 짧게 답해줘
> 공식적인 어조 그만. 편하게 말해줘

Pillar 4 — From Reactive to Proactive: Crons

A built-in gateway daemon checks state.db every 60 seconds to execute scheduled tasks. Because it runs in an isolated agent session, it does not affect the current conversation.

# 자연어로 스케줄 등록
> 매일 밤 12시에 스테이징 서버 로그를 분석하고 Slack에 요약 전송해줘

Pillar 5 — Usage Experience Solidifies Into Procedure: Self-Improvement

This is Hermes's core differentiator. There are four conditions under which a Tier 3 skill is auto-generated:

A complex task is completed with 5 or more tool calls in a single session
The task has a multi-step structure
Error recovery or user correction occurs
The user explicitly confirms the result

Practical Application

Example 1: Tracing the Full Tier 3 Skill Auto-Generation Flow

User requests a competitive analysis task
  → Agent performs a 7-step multi-step task (12 tool calls)
  → Crawling fails midway → recovers via retry logic (error recovery occurs)
  → User: "Perfect, do this every week" (explicit confirmation)

When this flow satisfies all four conditions, the following pipeline executes:

sql

Execute      → Perform the actual task (tool calls, API requests, file processing, etc.)
Evaluate     → Assess reuse value (5+ tool calls + multi-step + confirmation)
Extract      → Extract procedure, error patterns, and validation steps from the session log
Write        → Create a skill file in ~/.hermes/skills/
Validate     → Tool registry validation + dangerous pattern scan
Discoverable → Included in frontmatter scan targets starting from the next session

The code below is pseudocode intended to explain how the mechanism works. The actual implementation and detailed interfaces may differ.

python

# agent/skill_extractor.py (의사 코드 — 개념 설명용)
 
def should_generate_skill(session: Session) -> bool:
    return (
        session.tool_call_count >= 5
        and session.is_multistep
        and (session.had_error_recovery or session.had_user_correction)
        and session.user_confirmed
    )
 
def extract_procedure_steps(session: Session) -> list[str]:
    steps = []
    tool_calls = [tc for tc in session.tool_calls if tc.success]
 
    for i, call in enumerate(tool_calls):
        # 사용자 수정이 있었던 단계는 주의사항으로 마킹
        if call.had_user_correction:
            steps.append(
                f"{i+1}. ⚠️ {call.description} (수정 포인트: {call.correction_note})"
            )
        else:
            steps.append(f"{i+1}. {call.description}")
 
    return steps
 
def extract_skill(session: Session) -> Skill:
    procedure = extract_procedure_steps(session)
    pitfalls = extract_error_patterns(session)
    validation = extract_validation_steps(session)
 
    return Skill(
        name=generate_skill_name(session),
        procedure=procedure,
        pitfalls=pitfalls,
        validation=validation,
        tier=3,
    )

The key point is how extract_procedure_steps marks error recovery steps with ⚠️ and preserves them. The skill captures not just a simple success procedure, but also "where to be careful."

Generated skills are saved to ~/.hermes/skills/ and are included in frontmatter scan targets starting from the next session. They cannot be used in the session in which they were created.

Example 2: Building a Consistent Coding Assistant via SOUL.md Customization

markdown

---
# ~/.hermes/SOUL.md
---
 
## 정체성
 
나는 시니어 풀스택 개발자 스타일로 응답한다. 핵심을 먼저 말하고, 부연은 짧게.
 
## 코딩 스타일
 
- TypeScript strict mode 기본
- async/await만 사용 (Promise.then 금지)
- 2-space 들여쓰기
 
## 응답 원칙
 
- 500자 넘는 답변은 요약 먼저, 상세 내용은 접어서
- 코드 블록엔 언어 항상 명시
- 설명 없이 코드만 요청받으면 코드만 반환

It's recommended to separate project-specific context into a .hermes.md file. This is also where you can remember that keeping SOUL.md short improves cache hit rates.

markdown

---
# 프로젝트 루트/.hermes.md
---
 
## 프로젝트 컨텍스트
 
- 스택: Next.js 15 + NestJS + PostgreSQL
- 테스트: 실제 DB 연결 필수 (mock 금지)
- 배포: Vercel (frontend) + Railway (backend)

File	Scope	Modified By
`SOUL.md`	Global (all projects)	User + agent
`USER.md`	Global (user preferences)	Agent (automatic)
`.hermes.md`	Per-project	User

Example 3: Automating a DevOps Workflow with Cron + Skill Combination

> Every day at 9 AM, compile a list of GitHub Actions failures,
  categorize them by severity, and send them to Slack #dev-alerts.
  If there are no failures, don't send a message.

As Hermes processes this request, it simultaneously performs the following internally:

yaml

# 자동 생성된 스킬 파일 예시
---
name: github-actions-failure-report
description: GitHub Actions 실패를 심각도별 분류 후 Slack 전송
trigger: [github actions, CI failure, daily report]
tier: 3
cron: "0 9 * * *"
---
 
## 절차
 
1. GitHub API로 최근 24시간 워크플로 실행 결과 조회
2. 실패 항목 심각도 분류 (critical / warning / info)
3. critical 항목 없으면 실행 중단
4. 요약 메시지 포맷팅 후 Slack 전송
 
## 에러 처리
 
- GitHub API 타임아웃 시 3회 재시도
- Slack 전송 실패 시 이메일 폴백

It's notable that the cron field is included directly in the skill file. Pillar 4 (Crons) and Pillar 2 (Skills) don't merely coexist — they integrate into a single file.

Pros and Cons

Advantages

Item	Details
Cumulative learning	Reports of 40% speed improvement on similar tasks when holding 20+ self-generated skills
Context efficiency	Progressive Disclosure minimizes unnecessary token consumption
Cache optimization	Immutable system prompt maximizes Anthropic/OpenAI prompt cache utilization
Declarative configuration	SOUL.md, MEMORY.md, USER.md are all plain markdown, directly editable
Model-agnostic	Compatible with both Anthropic and OpenAI APIs
Open-source ecosystem	520+ community skills on agentskills.io, including 16 official Anthropic skills

Disadvantages and Caveats

Item	Details	Mitigation
Prompt injection risk	A malicious session exceeding the 5-tool-call threshold can permanently store a corrupted skill (Issue #25833)	Regularly audit auto-generated skills; actively use user-locked skills
No accuracy guarantee	No mechanism-level accuracy guarantee for auto-generated skills	Periodically review Curator scoring results; manually review critical skills
Memory size limit	Combined MEMORY.md + USER.md cap of ~1,300 tokens	Keep only high-priority items; separate the rest into per-project `.hermes.md`
No mid-session reflection	Memory updates during a session are not applied to the current prompt	Restart the session if immediate reflection is needed
Risk of editing `prompt_builder.py`	A globally affecting product file; unsuitable for general customization	Use SOUL.md, USER.md, and `.hermes.md` for customization

User-locked skills: When a user locks a specific skill, the agent can only read it — modification is not permitted. Well-suited for stable procedures that don't need automatic improvement. Community discussion is ongoing in GitHub Issue #17583.

The Most Common Mistakes in Practice

Writing SOUL.md too long — SOUL.md occupies the first slot of the system prompt, but the longer it gets, the more its cache hit rate advantage erodes and the more context it consumes. It's more effective to include only core identity and distribute the rest to USER.md or .hermes.md.
Leaving auto-generated skills unreviewed — The Curator scores and cleans up on a 7-day cycle, but it's not perfect. Skills generated during error recovery in particular can solidify temporary workarounds for special circumstances into general procedures. It helps to make a habit of periodically browsing ~/.hermes/skills/.
Not explicitly confirming important work mid-session — One of the conditions for Tier 3 skill generation is "explicit user confirmation." Responses like "Perfect" or "Keep using this going forward" act as skill generation triggers. Conversely, if you casually confirm something while trying an experimental approach, you may end up with an unwanted skill being generated.

Closing Thoughts

Three steps you can start right now:

Open ~/.hermes/SOUL.md directly and write the agent's response tone and coding style to match your preferences. You can check the current state with the > read me your soul file command, and the agent will update the file directly with a single line of feedback.
Perform a task you repeat at least once a week 3–4 times and naturally satisfy the Tier 3 skill auto-generation conditions. Leave an explicit confirmation response like "Perfect" or "Keep using this going forward" when the task completes — this triggers skill generation.
Open the ~/.hermes/skills/ directory roughly every two weeks and browse the auto-generated skills. Delete skills that don't match your intentions, and protect frequently used skills with user-locked to keep things running stably.

References

#AI에이전트#자기개선#스킬시스템#프롬프트엔지니어링#LLM#Python#프롬프트캐싱#멀티에이전트#자동화#메모리관리

Core Concepts

5-Pillar: How the Five Pillars Interlock

Pillar 1 — How Memory Persists After a Session Ends

Pillar 2 — The Procedural Recipe System: Skills

Pillar 3 — Why SOUL.md Occupies the First Slot

Pillar 4 — From Reactive to Proactive: Crons

Pillar 5 — Usage Experience Solidifies Into Procedure: Self-Improvement

Practical Application

Example 1: Tracing the Full Tier 3 Skill Auto-Generation Flow

Example 2: Building a Consistent Coding Assistant via SOUL.md Customization

Example 3: Automating a DevOps Workflow with Cron + Skill Combination

Pros and Cons

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Closing Thoughts

References

Core Concepts

5-Pillar: How the Five Pillars Interlock

Pillar 1 — How Memory Persists After a Session Ends

Pillar 2 — The Procedural Recipe System: Skills

Pillar 3 — Why SOUL.md Occupies the First Slot

Pillar 4 — From Reactive to Proactive: Crons

Pillar 5 — Usage Experience Solidifies Into Procedure: Self-Improvement

Practical Application

Example 1: Tracing the Full Tier 3 Skill Auto-Generation Flow

Example 2: Building a Consistent Coding Assistant via SOUL.md Customization

Example 3: Automating a DevOps Workflow with Cron + Skill Combination

Pros and Cons

Advantages

Disadvantages and Caveats

The Most Common Mistakes in Practice

Closing Thoughts

References

Recommended Posts

AI Agent-Based CI/CD Automation — Hermes Agent Crons' state.db Structure and Isolated Execution Mechanics

Automating Deployment Pipelines with Hermes Agent

Centralizing Hermes Agent SKILL.md via Git Tap Lets Multiple Instances Share the Same Skill Base

Building an MCP Server with TypeScript: Connecting PostgreSQL and Grafana to Hermes AI Agent

Trust Boundaries That Break When AI Agents Call External Tools — How to Prevent Prompt Injection and Memory Poisoning with MAESTRO and OWASP ASI Top 10

AI Keeps Running Even Without the Cloud — Implementing an Edge AI On-Device Deployment Pipeline