Harden Everything · Blog · Christoffer Niska

Anthropic’s Glasswing announcement included a detail worth paying attention to. Their frontier model found a decades-old vulnerability in OpenBSD that had survived extensive security reviews. It found a 16-year-old flaw in FFmpeg that automated testing had missed millions of times. Both discoveries were fully autonomous. No human steering.

Thomas Ptacek’s recent post explains why. Vulnerability hunting is pattern matching and constraint solving, which is exactly what frontier models are good at. Elite security talent used to be scarce. It is not anymore. A hundred instances of Claude running continuously across all software targets is not a thought experiment anymore. It is a Tuesday.

If you build software that touches the filesystem and runs processes, this matters. If you build AI coding agents, it matters more.

Why agents are exposed

A coding agent is not a typical application. It runs shell commands, reads and writes files across your project, manages long-lived daemon processes, and handles API keys for multiple providers. The attack surface is wide and the stakes are high.

Acolyte already had a workspace sandbox, input validation through Zod schemas, and restricted shell execution. But “already had” is not the same as “hardened.” When I looked at the codebase through the lens of what a frontier model could find, the gaps became obvious.

I will be honest: nothing I found was catastrophic. No remote code execution, no credential leaks, no data loss in production. What I found was a collection of error paths that nobody tests because they never fire. Race conditions in daemon startup. Unchecked failures in update verification. Implicit assumptions about filesystem atomicity. Each one may look innocent on its own, but some bugs open the door for real exploits, and models will find them.

The point is not that these bugs are unusual. The point is that they exist in every codebase, and now they are cheap to find.

Daemon lifecycle

The daemon is the longest-running process in the system. It starts when you open the terminal, stays alive across sessions, and only stops when you explicitly kill it. Every startup, shutdown, and crash recovery path is a potential failure mode.

The first issue was a hang on startup. If the daemon process crashed immediately after spawning (for example, the port was already in use), the CLI would poll the health endpoint forever. The spawned process was dead, but nobody was watching proc.exited. The fix was to race the health poll against the process exit promise. If the process dies before becoming healthy, fail immediately instead of waiting for a ten-second timeout.
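The race described above can be sketched roughly as follows. This is a minimal illustration, not Acolyte's actual code: `waitForHealthy`, `checkHealth`, and the polling interval are hypothetical names and values, and `exited` stands in for the exit promise of the spawned process (such as the proc.exited mentioned above).

```typescript
// Hypothetical helper: resolves once the daemon reports healthy, rejects as
// soon as the process exits or the deadline passes, whichever comes first.
async function waitForHealthy(
  checkHealth: () => Promise<boolean>,
  exited: Promise<number>,
  timeoutMs = 10_000,
  pollMs = 100,
): Promise<void> {
  let done = false;

  const poll = async (): Promise<void> => {
    const deadline = Date.now() + timeoutMs;
    while (!done && Date.now() < deadline) {
      // A failed health request just means "not ready yet".
      if (await checkHealth().catch(() => false)) return;
      await new Promise((resolve) => setTimeout(resolve, pollMs));
    }
    if (!done) throw new Error(`daemon not healthy after ${timeoutMs} ms`);
  };

  // Turn "the process exited" into a rejection so it can win the race.
  const died = exited.then((code) => {
    throw new Error(`daemon exited with code ${code} before becoming healthy`);
  });

  try {
    await Promise.race([poll(), died]);
  } finally {
    done = true; // stop the poll loop whichever side won
  }
}
```

With this shape, a daemon that dies on a port conflict surfaces as an immediate error that includes the exit code, while a slow but live daemon still gets the full timeout.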

That fix exposed a second issue. The startup lock, which prevents two CLI instances from spawning competing daemons, was acquired before Bun.spawn. If spawn itself threw (say, the binary path was invalid), the lock leaked. Every subsequent startup attempt would wait for a lock that nobody held. Moving the spawn inside the try/finally block fixed the leak. I only found it because fast-failing spawn errors made the lock leak obvious in tests.
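The shape of that fix, sketched under assumptions: `withStartupLock`, `acquire`, and `start` are hypothetical names, and the real lock is presumably file-based. The point is only that everything that can throw, spawn included, sits inside the guarded region.

```typescript
// Hypothetical wrapper: `acquire` returns a release function, `start` does
// the spawn and waits for the daemon to become healthy.
async function withStartupLock<T>(
  acquire: () => Promise<() => void>,
  start: () => Promise<T>,
): Promise<T> {
  const release = await acquire();
  try {
    // If spawn throws here, the finally block still releases the lock.
    return await start();
  } finally {
    release();
  }
}
```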

The stop path had its own problem. stopLocalServer with a lock file would send SIGTERM to the PID in the lock, but if that PID was dead and a different server was actually running on the port, the stop would “succeed” without actually stopping anything. The fix adds a graceful shutdown request via the admin endpoint when the lock PID is dead but the port is still healthy.

Then there was the server itself. It had no SIGTERM handler. When stopLocalServer sent a signal, the process died without flushing the trace store. Clean shutdown now calls closeDefaultTraceStore() before server.stop().

Finally, the retry logic in ensureLocalServer was unbounded. If another process kept recreating the startup lock, the function would recurse forever. A three-attempt bound with a clear timeout error message fixed the unbounded recursion.

Five fixes, all in the same subsystem, each one uncovered by the previous one. None of these are remotely exploitable on their own, but a process running without health checks or a daemon that cannot be stopped cleanly is the kind of degraded state that turns a local issue into something worse.

Update verification

The auto-updater downloads a binary from GitHub, verifies the checksum, extracts the tar, and replaces the running process. Every step in that pipeline had a gap.

The checksum verification function caught mismatches correctly but passed without error when it could not fetch the checksum file at all. A network error, a 404, a malformed response: all resulted in the binary being installed without any verification. A man-in-the-middle on the download could have served a tampered binary that would install and execute without warning. The fix: throw on any fetch failure instead of returning without warning.
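A fail-closed verifier might look like the sketch below. The function name, the injected `fetchChecksumFile` callback, and the assumed file format (a hex digest, optionally followed by a filename) are illustrative assumptions, not the project's actual code.

```typescript
import { createHash } from "node:crypto";

// Hypothetical verifier: every failure mode throws, so no code path can
// reach "install" without a matching digest.
async function verifyChecksum(
  binary: Uint8Array,
  fetchChecksumFile: () => Promise<string>,
): Promise<void> {
  // Network error, 404, anything else: refuse instead of skipping the check.
  const body = await fetchChecksumFile().catch((err) => {
    throw new Error(`could not fetch checksum, refusing to install: ${String(err)}`);
  });

  // Assumed format: "<hex digest>" or "<hex digest>  <filename>".
  const expected = (body.trim().split(/\s+/)[0] ?? "").toLowerCase();
  if (!/^[0-9a-f]{64}$/.test(expected)) {
    throw new Error("malformed checksum file, refusing to install");
  }

  const actual = createHash("sha256").update(binary).digest("hex");
  if (actual !== expected) {
    throw new Error("checksum mismatch, refusing to install");
  }
}
```

Returning `void` and throwing on every failure keeps the caller honest: there is no boolean to forget to check.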

The checksum URL was derived by string replacement (downloadUrl.replace('.tar.gz', '.sha256')), which meant a release without a checksum asset would always 404. I changed this to resolve the checksum asset from the GitHub release metadata, the same way the binary asset is resolved.

The tar extraction had no path traversal protection. A malicious tarball with ../../ entries could write files outside the extraction directory. The fix validates every archive entry before extraction by listing with tar tzf and rejecting any entry that contains a “..” path segment.
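The entry check itself is small. In this sketch, `assertSafeEntries` is a hypothetical name, the input is assumed to be the listing from tar already split into lines, and the rejection of absolute paths is an extra precaution added here rather than something the fix is stated to do.

```typescript
// Hypothetical validator for a list of archive entry names, for example the
// output of `tar -tzf <archive>` split on newlines.
function assertSafeEntries(entries: string[]): void {
  for (const entry of entries) {
    const segments = entry.split(/[\\/]/);
    // A ".." segment can climb out of the extraction directory.
    // Absolute paths are rejected too (an assumption beyond the stated fix).
    if (entry.startsWith("/") || segments.includes("..")) {
      throw new Error(`unsafe archive entry: ${entry}`);
    }
  }
}
```

Comparing whole segments rather than searching for the substring keeps legitimate names such as `notes..txt` valid.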

After extraction, the code assumed the binary existed at join(outDir, 'acolyte') without checking. If the tarball structure changed, the error would surface later as a confusing copyFile failure. Now it calls access() and lstat() to verify both existence and that the entry is a regular file, not a symlink.

The re-exec after a successful update discarded the child process exit code. If the new binary crashed on first launch, the outer process exited 0. The user would see a clean exit despite the update failing. Forwarding the exit code makes the failure visible.

Session locks

Session locks prevent two CLI instances from modifying the same session concurrently. The original implementation used existsSync followed by writeFileSync, a textbook time-of-check-to-time-of-use race. Between the check and the write, another process could claim the lock.

The fix uses openSync with the wx (exclusive create) flag, which is atomic at the kernel level. If the file already exists, the call fails with EEXIST and the code checks whether the owner is still alive. If the owner is dead, it unlinks the stale lock and retries the exclusive create. Two attempts maximum. If both fail, another process won the race and that is the correct outcome. The race window was small but real: two terminals opened simultaneously could both claim the same session and corrupt each other’s state.
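A sketch of that acquire loop, assuming the lock file holds the owner's PID as text. `acquireLock` and `isAlive` are hypothetical names, and treating an unreadable lock as stale is a simplification of this sketch.

```typescript
import { closeSync, openSync, readFileSync, unlinkSync, writeSync } from "node:fs";

// Hypothetical liveness probe: signal 0 tests for existence without killing.
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return (err as NodeJS.ErrnoException).code === "EPERM";
  }
}

function acquireLock(path: string, pid: number = process.pid): boolean {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const fd = openSync(path, "wx"); // exclusive create: EEXIST if present
      try {
        writeSync(fd, String(pid));
      } finally {
        closeSync(fd);
      }
      return true;
    } catch (err) {
      if ((err as NodeJS.ErrnoException).code !== "EEXIST") throw err;
    }

    let owner = Number.NaN;
    try {
      owner = Number.parseInt(readFileSync(path, "utf8"), 10);
    } catch {
      // The lock vanished between the failed create and the read.
    }
    if (Number.isInteger(owner) && owner > 0 && isAlive(owner)) return false;

    try {
      unlinkSync(path); // stale lock: remove it, then retry the create
    } catch {
      // Another process removed it first; the retry settles who wins.
    }
  }
  return false; // both attempts lost the race, which is the correct outcome
}
```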

Auth comparison

The API key comparison used crypto.timingSafeEqual, which is the right primitive. But it compared the raw strings directly, and because timingSafeEqual rejects inputs of different lengths, the function returned early on a length mismatch. An attacker could determine the expected key length by measuring response times.

The fix hashes both inputs with SHA-256 before comparing. The hashes are always 32 bytes regardless of input length, so the comparison is always constant-time. Bun.CryptoHasher handles the hashing. timingSafeEqual from node:crypto handles the comparison, because Bun does not have a native constant-time compare and a hand-rolled XOR loop in JavaScript could be optimized by the JIT into something non-constant-time.
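The hash-then-compare idea in a few lines. This sketch swaps in createHash from node:crypto for Bun.CryptoHasher so it runs unchanged under Node as well as Bun, and `safeKeyEqual` is a hypothetical name.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical comparison helper: hashing first normalizes both inputs to
// 32 bytes, so the comparison never branches on the key's length.
function safeKeyEqual(provided: string, expected: string): boolean {
  const a = createHash("sha256").update(provided, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}
```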

Config and credentials

The config parser swallowed parse errors. If you had a typo in your config.toml, the entire file was ignored and the system ran on defaults with no warning. You could spend an hour debugging why your model setting was not taking effect. The fix is to crash with a clear error message. If you wrote a config file, you meant it.

Feature flag merging was shallow. If your user config set features.syncAgents = true and your project config set features.cloudSync = true, the project object replaced the user object entirely, dropping syncAgents without warning. A one-line deep merge fixed it, but the bug had been there since feature flags were introduced.
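To make the difference concrete, here is a sketch with a hypothetical `Config` shape. Only the `features` key is merged one level deeper, and project values win on conflict.

```typescript
// Hypothetical config shape, for illustration only.
type Config = { model?: string; features?: Record<string, boolean> };

function mergeConfig(user: Config, project: Config): Config {
  return {
    ...user,
    ...project,
    // Without this line the project's `features` object replaces the user's.
    features: { ...user.features, ...project.features },
  };
}

const merged = mergeConfig(
  { features: { syncAgents: true } },
  { features: { cloudSync: true } },
);
// merged.features now holds both syncAgents and cloudSync.
```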

Credential files were written with mode 600, but writeFile applies the mode only when it creates the file. If the file already existed with looser permissions, they stayed in place. An explicit chmod after every write ensures the permissions are correct regardless of the file’s prior state.
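A minimal sketch of the write-then-chmod pattern on a POSIX system. `writeCredentialFile` and `isOwnerOnly` are hypothetical helpers, not the project's API.

```typescript
import { chmodSync, statSync, writeFileSync } from "node:fs";

function writeCredentialFile(path: string, contents: string): void {
  // `mode` is honored only if this call creates the file...
  writeFileSync(path, contents, { mode: 0o600 });
  // ...so set it explicitly for files that already existed.
  chmodSync(path, 0o600);
}

// Hypothetical check, handy in a test: are the permission bits exactly 0600?
function isOwnerOnly(path: string): boolean {
  return (statSync(path).mode & 0o777) === 0o600;
}
```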

Memory pipeline

The memory store singleton cached its initialization promise but never cleared it on failure. If the first getMemoryStore() call failed (disk full, migration error), every subsequent call returned the same rejected promise. No recovery without restarting the process. Adding a .catch() that resets the promise lets the next call retry.
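The reset-on-failure pattern, sketched with a placeholder `MemoryStore` type and an injected `init` function. Both are assumptions made for illustration; the real getMemoryStore() takes no such argument in the text above.

```typescript
// Placeholder type standing in for the real store.
type MemoryStore = { ready: true };

let storePromise: Promise<MemoryStore> | null = null;

function getMemoryStore(init: () => Promise<MemoryStore>): Promise<MemoryStore> {
  if (!storePromise) {
    const attempt: Promise<MemoryStore> = init().catch((err) => {
      // Forget the failed attempt so the next caller starts a fresh one.
      if (storePromise === attempt) storePromise = null;
      throw err; // callers of this attempt still see the failure
    });
    storePromise = attempt;
  }
  return storePromise;
}
```

Concurrent callers during a single attempt still share one promise, so initialization is not duplicated. Only a settled failure clears the cache.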

The cosineSimilarity function did not check whether the two vectors had the same length. If you switched embedding models between writes and reads, the loop would read past the end of the shorter array, producing garbage similarity scores with no error. A length assertion catches the mismatch immediately.
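A version with the guard in place could look like this. It is a sketch: the error message and the zero-vector handling are choices made here, not details taken from the project.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  // Vectors from different embedding models have different dimensions.
  if (a.length !== b.length) {
    throw new Error(`embedding dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: treat a zero vector as "no similarity" rather than NaN.
  return denom === 0 ? 0 : dot / denom;
}
```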

A subtler issue was in the token clamping function, which truncates distiller output to a budget. It used String.slice() on UTF-16 code units. A surrogate pair (most emoji, and the rarer CJK characters outside the Basic Multilingual Plane) cut at the wrong boundary leaves a lone surrogate: an invalid string that could cause downstream issues with embedding models or SQLite storage. The fix strips any trailing high surrogate after each slice.
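The surrogate check is a two-line addition to a plain slice. `sliceWithoutLoneSurrogate` is a hypothetical name for this sketch.

```typescript
function sliceWithoutLoneSurrogate(text: string, maxUnits: number): string {
  let out = text.slice(0, maxUnits);
  const last = out.charCodeAt(out.length - 1); // NaN for an empty string
  // High surrogates occupy U+D800..U+DBFF. One at the very end has been cut
  // off from its low surrogate, so drop it.
  if (last >= 0xd800 && last <= 0xdbff) out = out.slice(0, -1);
  return out;
}
```

Only the trailing position needs checking, because a slice that starts at index 0 of a well-formed string can only split a pair at its end.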

The math

None of these bugs had been exploited. But Ptacek describes organizations running hundreds of model instances continuously across all software targets, not just browsers and operating systems. Glasswing found bugs that human review missed for decades. The cost of finding these bugs is approaching zero. The cost of leaving them in place is not.

The work itself was not glamorous. Most fixes were under ten lines. The methodology was one question repeated for every error path: what state is the system in after this fails, and is that state safe? The answer should never be “I don’t know.”

If you are building tools that touch the filesystem, run processes, or handle credentials, the question is not whether your code has exploitable edge cases. It does. The question is whether you find them before someone else’s model does.