Meet Acolyte

Mar 12, 2026 · 5 min read

Acolyte is an open-source AI coding assistant that runs as a headless daemon in your terminal. Multi-provider, self-hosted, and built around lifecycle control, behavioral guards, and persistent memory. The host provides structure. The model does the work.

In late February, I went to a meetup where one of the Mastra founding engineers demoed building a simple coding agent live, different agents for each step, orchestrated through their framework. That was the last piece falling into place. I had the architecture experience, the daily AI-assisted workflows, and now a concrete starting point. I went home and started building.

Mastra got dropped along the way as the architecture evolved past what it offered, but without that demo the project would still be an idea. Sometimes you just need to see someone else do it to realize you can do it better.

What it is

Acolyte is an open-source AI coding assistant that runs in your terminal. It supports OpenAI, Anthropic, Google, and any OpenAI-compatible endpoint. It runs entirely on your infrastructure and gives you full control over how the agent behaves.

The architecture is daemon-based. A headless server handles all the AI work. The CLI connects over WebSocket RPC, the same protocol an editor plugin or custom client would use. The TUI has no special access. It is just another client. I wanted that separation from the start because it means anyone can build a different frontend without touching the engine.
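As a rough sketch, a request over that WebSocket RPC might look like the following. The envelope fields and the `session.prompt` method name are illustrative assumptions, not Acolyte's actual wire format:

```typescript
// Hypothetical RPC envelope shared by every client (TUI, editor plugin, script).
// Field names and methods are assumptions for illustration.
type RpcRequest = {
  id: number;
  method: string;                    // e.g. "session.prompt"
  params: Record<string, unknown>;
};

type RpcEvent = {
  id: number | null;                 // null for unsolicited lifecycle events
  event: string;                     // e.g. "tool.call", "guard.blocked"
  payload: Record<string, unknown>;
};

// Serialize a request for the wire; the daemon would reply with RpcEvent frames.
function encodeRequest(
  id: number,
  method: string,
  params: Record<string, unknown>,
): string {
  const req: RpcRequest = { id, method, params };
  return JSON.stringify(req);
}

const wire = encodeRequest(1, "session.prompt", { text: "fix the failing test" });
```

Because the TUI speaks the same protocol, an editor plugin only needs to produce these frames to get the full engine.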

What makes it different

Acolyte focuses on reliable agent behavior.

Most coding assistants get close: they make the right edit, stay mostly in scope, and then keep going. They re-read files they already saw, run validation that was not requested, or drift into extra work even after the task is effectively done. If you have used any of these tools seriously, you know exactly what I mean.

The problem is not just prompts. Many systems lack a clear protocol between the model and the host runtime. When the loop stays open, the model keeps second-guessing itself.

Acolyte treats the host and model as separate responsibilities:

  • The model owns task judgment — how to solve the work.
  • The host provides structure — lifecycle control, tools, guards, and recovery.

Every request flows through a five-phase lifecycle:

resolve → prepare → generate → evaluate → finalize

Evaluators run after every generation. They handle formatting and linting automatically based on what the workspace profile detected. If lint errors remain, the evaluator feeds them back to the model. After the work phase, the agent transitions to verify mode for a read-only code review of the changes.
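The generate-then-evaluate loop can be sketched like this. The function shapes and the retry cap are assumptions for illustration, not Acolyte's internals:

```typescript
// Illustrative sketch of the evaluate-and-retry step in the lifecycle.
type Phase = "resolve" | "prepare" | "generate" | "evaluate" | "finalize";

interface EvalResult {
  ok: boolean;
  feedback?: string; // e.g. remaining lint errors fed back to the model
}

// Re-runs generation while the evaluator reports problems, up to a retry cap.
function runGenerateEvaluate(
  generate: (feedback?: string) => string,
  evaluate: (output: string) => EvalResult,
  maxRetries = 2,
): string {
  let feedback: string | undefined;
  let output = "";
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    output = generate(feedback);
    const result = evaluate(output);
    if (result.ok) break;
    feedback = result.feedback;
  }
  return output;
}
```

The point is that the host, not the model, decides when the loop closes.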

Behavioral guards run before every tool call. They detect and block degenerate patterns: step budgets, file churn loops, redundant searches, duplicate calls. These guards exist because I got tired of watching the agent waste cycles on things it should never do.
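One of those guards, blocking exact duplicate calls, could be sketched like this. The `Guard` interface shape here is an assumption, not Acolyte's actual contract:

```typescript
// Illustrative guard contract; interface and field names are assumptions.
interface ToolCall {
  tool: string;
  args: string; // arguments serialized for comparison
}

interface Guard {
  check(call: ToolCall, history: ToolCall[]): { allow: boolean; reason?: string };
}

// Blocks an exact repeat of a call already made in the same task.
const duplicateCallGuard: Guard = {
  check(call, history) {
    const seen = history.some(
      (h) => h.tool === call.tool && h.args === call.args,
    );
    return seen
      ? { allow: false, reason: `duplicate ${call.tool} call` }
      : { allow: true };
  },
};
```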

A per-task result cache sits after guards and returns identical read-only results instantly. Cache hits are silent in the UI, so the user never sees redundant tool calls cluttering the output. The agent re-reads a file it already has and the cache handles it without noise. This also saves roundtrips to the model and reduces token usage.
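A minimal sketch of that per-task cache, keyed by tool name and serialized arguments (the class and key scheme are assumptions for illustration):

```typescript
// Illustrative per-task cache for read-only tool results.
class ToolResultCache {
  private store = new Map<string, string>();

  private key(tool: string, args: string): string {
    return `${tool}:${args}`;
  }

  // Returns a cached result instantly on a hit, undefined on a miss.
  get(tool: string, args: string): string | undefined {
    return this.store.get(this.key(tool, args));
  }

  put(tool: string, args: string, result: string): void {
    this.store.set(this.key(tool, args), result);
  }
}
```

A real implementation would also invalidate entries when a write tool touches the same file, but the idea is the same: the second identical read never leaves the host.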

Memory and context

Memory works through context distillation.

Instead of compressing conversations when the window fills, Acolyte extracts structured facts and persists them across sessions in three tiers:

  • Session
  • Project
  • User
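A distilled fact might carry its tier as a tag, so retrieval can scope to the right level. The record shape below is a hypothetical sketch; only the three tier names come from the design:

```typescript
// Hypothetical shape of a distilled, persisted fact. Tier names are from
// the post; everything else is illustrative.
type Tier = "session" | "project" | "user";

interface Fact {
  tier: Tier;
  text: string;
  createdAt: number; // epoch millis
}

// Scope retrieval to a set of tiers, e.g. project + user for a new session.
function factsForScope(facts: Fact[], tiers: Tier[]): Fact[] {
  return facts.filter((f) => tiers.includes(f.tier));
}
```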

Retrieval is planned to use semantic similarity so the most relevant facts surface first. Token budgeting is proactive. The system prompt is measured and reserved first, and the remaining space fills by priority with age-based caps on older tool outputs.
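The budgeting step can be sketched as: reserve the system prompt, then fill what remains in priority order. Token counts are assumed precomputed, and the item shape is illustrative:

```typescript
// Illustrative proactive token budgeting: system prompt reserved first,
// remaining space filled by priority. Field names are assumptions.
interface ContextItem {
  name: string;
  tokens: number;   // precomputed token count
  priority: number; // higher wins
}

function budgetContext(
  windowSize: number,
  systemPromptTokens: number,
  items: ContextItem[],
): string[] {
  let remaining = windowSize - systemPromptTokens;
  const included: string[] = [];
  for (const item of [...items].sort((a, b) => b.priority - a.priority)) {
    if (item.tokens <= remaining) {
      included.push(item.name);
      remaining -= item.tokens;
    }
  }
  return included;
}
```

Age-based caps on old tool outputs would slot in here as a pre-pass that shrinks the `tokens` of stale items before budgeting.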

Observability

Every tool call, guard decision, and evaluator action is emitted as a structured lifecycle event. When something goes wrong, you can see exactly what happened and why. No guessing, no re-running with extra logging.
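A structured lifecycle event might look like the sketch below. The event kinds and field names are assumptions; only the three event sources come from the description:

```typescript
// Hypothetical structured lifecycle event; kinds and fields are illustrative.
interface LifecycleEvent {
  ts: number;
  kind: "tool.call" | "guard.decision" | "evaluator.action";
  detail: Record<string, unknown>;
}

const eventLog: LifecycleEvent[] = [];

function emit(kind: LifecycleEvent["kind"], detail: Record<string, unknown>): void {
  eventLog.push({ ts: Date.now(), kind, detail });
}

emit("guard.decision", { guard: "duplicate-call", allow: false });
```

Because every decision is an event, "why did it do that?" becomes a query over the log rather than a re-run.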

The numbers

The codebase is around 18,000 lines of production code with a 0.8 test-to-code ratio and 13 runtime dependencies. Compared against eight other open-source AI coding agents, Acolyte has the smallest codebase, fewest dependencies, and one of the highest test densities.

But the more interesting number is the cumulative output: 160,791 lines, of which 96,673 were added, 64,118 removed, netting 32,555 lines of source. That is a 66% rewrite ratio. The code was not written once and left alone. It was continuously refined as real usage revealed what needed to change. The tool helped build itself, and that process taught me more about agent behavior than any amount of planning would have.

Why open source

I have been using Claude Code and Codex daily for months. Both are excellent, but both lock you into one provider and one way of doing things. I wanted an assistant I could observe, extend, and host myself. Acolyte is designed for developers who want the same thing.

Acolyte draws inspiration from across the ecosystem: Claude Code, Codex, open-source agents like Aider, Goose, and OpenHands, and projects like graphql-js that set the bar for clean contracts and minimal abstractions. The goal was not to start from scratch, but to take the best ideas and build something open that anyone can customize and extend.

Guards, memory strategies, lifecycle phases, and tools are all exposed as extensible contracts. When you need to extend, you implement an interface. When you do not, the defaults work.
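Extending via a contract might look like the sketch below: a custom step-budget guard implementing a hypothetical interface. The interface name and method shape are assumptions, not Acolyte's actual API:

```typescript
// Hypothetical extension contract; a real one would carry more context.
interface GuardContract {
  name: string;
  beforeToolCall(stepCount: number): { allow: boolean; reason?: string };
}

// Custom guard: stop the agent once it exceeds a step budget.
class StepBudgetGuard implements GuardContract {
  name = "step-budget";
  constructor(private maxSteps: number) {}

  beforeToolCall(stepCount: number) {
    return stepCount < this.maxSteps
      ? { allow: true }
      : { allow: false, reason: `exceeded ${this.maxSteps} steps` };
  }
}
```

When you do not need a custom policy, you never see this interface; the defaults apply.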

Try it

Acolyte is ready to test today. It is very much a work in progress, but the fundamentals are in place. Clone the repository, run four commands, and you are up.
