Release Notes

All Bernard releases, newest first.

v0.7.0 2026-03-13

File Manipulation Tools new

Two new tools — file_read_lines and file_edit_lines — give Bernard precision file editing without shelling out to sed or awk.

  • Line-based editing with replace, insert, delete, and append operations
  • Atomic writes — all edits in a single call succeed or all fail
  • Binary file detection and pagination for large files (50 MB cap)
  • file_read_lines supports offset/limit for targeted reads across big files

Per-Agent Model Selection new

Override the provider and model for any specialist, sub-agent, or task invocation. Set a default model on specialists at creation time, or pass an override at call time.

  • Specialists store a default provider/model that is used every time they run
  • Sub-agents and tasks accept per-invocation provider/model overrides
  • Cross-provider safety — auto-resolves the default model when switching providers
  • Model tags shown in /specialists listings

Deterministic Single-Step Tasks enhanced

Tasks are now single-step: one LLM call with tool use, then structured output. Saved tasks can be invoked by ID, and the REPL shows tasks and routines separately.

  • Reduced to maxSteps: 2 (one tool-use round + structured result)
  • New taskId parameter — run saved task routines by ID or via /task-{id} in the REPL
  • Separate display for tasks vs routines in the REPL
  • Auto-context injection — working directory, available tools, memory, and RAG are included automatically

v0.6.2 2026-03-12

Critic Visibility — Adaptive Truncation fix

Tool-result summaries shown to the critic now use an adaptive 8000-char budget instead of a fixed 500-char slice, so the critic sees far more context before judging.

  • Adaptive 8000-char budget with a 500-char floor per tool result
  • Response truncation raised from 2000 to 4000 chars, args from 500 to 1000
  • Truncation markers (...) added to tool results
  • Critic prompt updated to not treat truncation as failure evidence

Error Handling & Eventual Consistency fix

Agents no longer retry the exact same failed command, and external API mutations get a short delay before re-verification to account for eventual consistency.

  • No retrying the exact same failed command across all agent types and cron jobs
  • Wait 2–5s before re-verifying external API mutations

Specialist Candidate Filtering fix

Candidates that have already been saved as specialists are automatically filtered out of future suggestions.

  • Auto-marks candidates as accepted when a specialist is created by draft ID or name
  • Defensive filtering at startup and in /candidates output

v0.6.1 2026-03-11

Critic Retry Loop fix

Critic mode now retries up to 2 times when it detects issues, feeding its verdict back to the agent for correction before giving up.

  • Up to 2 automatic retries when critic returns WARN or FAIL
  • Critic feedback is injected into conversation history so the agent can self-correct
  • Improved verdict parsing using regex-based extraction
  • REPL prompt now shows a diamond symbol instead of [CRITIC] label
  • New parseCriticVerdict utility extracted to output module

/create-task Command fix

Added /create-task guided creation command and refactored duplicate creation logic.

  • New /create-task REPL command for guided task routine creation
  • Refactored duplicate guided creation logic into a shared runGuidedCreation helper
  • Help menu now lists /routines, /create-routine, and /create-task commands

Specialist Auto-Dispatch fix

Specialists are now automatically matched to user input using keyword scoring, with high-confidence matches auto-dispatched.

  • Keyword-based scoring matches user input against saved specialists
  • Scores ≥ 0.8 trigger automatic dispatch; 0.4–0.8 prompt for confirmation
  • Match advisory injected into system prompt so the agent knows which specialists fit
  • Stop-word filtering prevents common words from inflating scores

v0.6.0 2026-03-10

Tasks new

Tasks are isolated, focused executions that return structured JSON output. Unlike sub-agents (which return free-form text), tasks always produce a {status, output, details?} response — making them ideal for machine-readable results, routine chaining, and conditional branching.

  • 5-step budget — tasks are meant to be quick and focused
  • Structured JSON output — always returns {status: "success"|"error", output, details?}
  • Completely isolated from the current session — no conversation history
  • Available as both a tool and a command via /task
  • Shared concurrency pool with sub-agents (4 slots max)
bernard
bernard> /task List all TypeScript files in src/
┌─ task — List all TypeScript files in src/
  ▶ shell: find src -name "*.ts" -type f
└─ task success: Found 23 .ts files

// The agent can also call tasks during routines:
bernard> run the deploy routine
  ▶ task: "Check test suite passes"
  ┌─ task — Check test suite passes
    ▶ shell: npm test
  └─ task success: All 994 tests passed
  ▶ shell: git push origin main

Specialists new

Specialists are reusable expert profiles — persistent personas with custom system prompts and behavioral guidelines that shape how a sub-agent approaches work. Unlike routines (which define what steps to follow), specialists define how to work.

  • Each specialist run gets its own generateText loop with a 10-step budget
  • One JSON file per specialist in ~/.local/share/bernard/specialists/
  • Manage via /specialists or invoke directly with /{id}
  • Up to 50 specialists with lowercase kebab-case IDs
bernard
bernard> create a specialist called "code-reviewer" that reviews code for correctness, style, and security
  ▶ specialist: create { id: "code-reviewer", name: "Code Reviewer", … }
   Specialist "Code Reviewer" (code-reviewer) created.

bernard> /code-reviewer review the changes in src/agent.ts
┌─ spec:1 [Code Reviewer] — review the changes in src/agent.ts
  ▶ shell: git diff src/agent.ts
└─ spec:1 done

// List all saved specialists:
bernard> /specialists

Specialist Suggestions new

Bernard automatically detects recurring delegation patterns in your conversations and suggests new specialists. Detection runs in the background when you exit a session or use /clear --save.

  • Review pending suggestions with /candidates
  • Each candidate includes a name, description, confidence score, and reasoning
  • Accept or reject candidates conversationally
  • Auto-dismissed after 30 days if not reviewed; up to 10 pending at a time
bernard
// On startup, Bernard notifies you of pending suggestions:
  2 specialist suggestion(s) pending. Use /candidates to review.

bernard> /candidates

  1. code-reviewer (confidence: 0.85)
    Reviews PRs for correctness, style, and security issues

  2. db-migrator (confidence: 0.72)
    Manages database schema changes and migration scripts

bernard> accept the code-reviewer candidate
   Specialist "Code Reviewer" (code-reviewer) created.

Critic Mode new

Critic mode adds planning, proactive scratch/memory usage, and post-response verification. Recommended for high-stakes work like deployments, git operations, and multi-file edits.

  • Planning — writes a plan to scratch before multi-step tasks
  • Proactive scratch — accumulates findings during complex work
  • Verification — a critic agent reviews the work and prints a verdict (PASS / WARN / FAIL)
bernard
bernard> /critic on
   Critic mode enabled.

bernard> refactor the auth module to use JWT
  ▶ scratch: write plan
  ▶ shell: cat src/auth.ts
  ▶ write: src/auth.ts
  ▶ shell: npm test

  ┌─ critic verdict
  │ PASS — All claimed edits match tool calls.
  │ Tests pass after refactor. No issues found.
  └─

bernard> /critic off
  Critic mode disabled.

Conversation Summaries new

A fourth specialized domain in Bernard's RAG memory system, joining Tool Usage Patterns, User Preferences, and General Knowledge. Conversation summaries capture what was discussed, approaches taken, tools/specialists/routines used, and outcomes.

  • Extracts high-level summaries of what was accomplished each session
  • Helps identify recurring patterns across conversations
  • All four domain extractors run in parallel — no extra latency
  • Search returns up to 5 results per domain (20 total max), preventing any single domain from crowding out others
bernard
// After a session, Bernard extracts memories by domain:
  ▶ Extracting facts…
    tool-usage: 3 facts
    user-preferences: 1 fact
    general: 2 facts
    conversations: 1 fact

// In a future session, recalled context is organized by domain:
bernard> /rag
  RAG Memory: 142 facts across 4 domains
    tool-usage       48 facts
    user-preferences 31 facts
    general          41 facts
    conversations    22 facts