How I Build It

Mar 3, 2026 · Updated Mar 14, 2026 · 3 min read

I use AI to generate code and challenge decisions, then verify everything with logs and traces. Zero intentional technical debt, strict sequencing, and evidence-driven validation. The goal is measurable progress, not clever output.

Here is how I build the assistant, starting with the technical foundation and why I chose it.

I went with Bun because it runs TypeScript natively, keeps the runtime loop fast, and can produce standalone binaries for easier distribution. TypeScript helps me move fast while keeping structure and quality under control. Early on I used Mastra for agent orchestration and Ink for the CLI UX, but both were replaced as the architecture evolved. That is the nice thing about starting small: you can swap foundations without rewriting everything. I now use the AI SDK directly to communicate with models, and a custom renderer for the terminal UI. Biome handles formatting and linting with minimal configuration.
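To make the Bun choice concrete, here is a minimal sketch (the file name and `Task` type are invented for illustration; the real project layout is not shown in this post): Bun executes a TypeScript file directly with no transpile step, and the same entry point can be compiled into a standalone binary with `bun build --compile`.

```typescript
// index.ts — run directly with `bun index.ts`; no build step needed.
// Produce a standalone binary for distribution with:
//   bun build --compile ./index.ts --outfile assistant

// Hypothetical task type, just to show TypeScript structure in play.
type Task = { id: string; description: string; done: boolean };

// Count tasks that still need work.
function pendingCount(tasks: Task[]): number {
  return tasks.filter((task) => !task.done).length;
}

const tasks: Task[] = [
  { id: "1", description: "wire up renderer", done: true },
  { id: "2", description: "add trace logging", done: false },
];

console.log(`pending tasks: ${pendingCount(tasks)}`); // → pending tasks: 1
```

The same source runs under `bun index.ts` during development and ships as a single compiled executable, which is what keeps distribution simple.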

I use Codex and Claude as execution and planning partners. They help generate code, pressure-test architectural ideas, and shape next steps, while I keep final decisions. Another thing that changed is architectural research. If I am building something I have not built before, I can now explore faster: compare options, test assumptions, and evaluate tradeoffs with AI support. Combined with my own experience, that makes it easier to make better decisions earlier. That shift alone has been worth the investment.

A core rule is order. We need to do the right things at the right time. If we build abstractions too early, we overengineer. If we delay necessary structure for too long, we create cleanup work. Most of the quality comes from sequencing, not from cleverness. I have learned this the hard way more than once.

Another hard rule is zero intentional technical debt. In AI co-developed projects, code is cheap to produce, so there is no good reason to keep known messes. If naming, structure, or behavior is wrong, we fix it immediately while context is still fresh. I also do not accept extra code that is not absolutely necessary. If it does not solve a real problem today, it stays out. This keeps the system understandable and makes verification faster.

Verification is evidence-driven. I have the assistant carry out real tasks on its own, then review logs, traces, and outcomes to see what actually happened. That gives hard data on whether behavior improved, stayed flat, or regressed. We need tests for the most critical paths, but not a pile of low-value tests that only add noise. If I cannot tell from the output whether something got better, the verification is not working.
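A minimal sketch of what "improved, stayed flat, or regressed" means in practice (the `RunEvent` shape and the noise threshold are invented for illustration; the real log format is not shown here): parse structured run logs, compare success rates between a baseline run and a candidate run, and emit a verdict.

```typescript
// Hypothetical structured log entry emitted once per task run.
type RunEvent = { task: string; ok: boolean; durationMs: number };

type Verdict = "improved" | "flat" | "regressed";

// Fraction of tasks in a run that succeeded.
function successRate(events: RunEvent[]): number {
  if (events.length === 0) return 0;
  return events.filter((e) => e.ok).length / events.length;
}

// Compare a candidate run against a baseline; `epsilon` absorbs noise
// so small fluctuations read as "flat" rather than fake progress.
function compareRuns(
  baseline: RunEvent[],
  candidate: RunEvent[],
  epsilon = 0.02,
): Verdict {
  const delta = successRate(candidate) - successRate(baseline);
  if (delta > epsilon) return "improved";
  if (delta < -epsilon) return "regressed";
  return "flat";
}

const baseline: RunEvent[] = [
  { task: "summarize", ok: true, durationMs: 900 },
  { task: "refactor", ok: false, durationMs: 2100 },
];
const candidate: RunEvent[] = [
  { task: "summarize", ok: true, durationMs: 850 },
  { task: "refactor", ok: true, durationMs: 1700 },
];

console.log(compareRuns(baseline, candidate)); // → improved
```

The point is not this particular metric but the shape of the loop: every claim of progress has to survive a comparison against recorded evidence, not a gut feeling.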

This is true dogfooding. The system is tested by doing real work that matters, not by polished demo scenarios. The long-term goal is self-hosted development, where the assistant builds itself, but for now the loop stays human-guided with strict sequencing and measurable verification.
