Making Mistakes Is Human (and AI). You Just Need the Right Safety Net.

A child touches a hot stove and learns not to do it again. A developer runs rm -rf in the wrong directory and learns to double-check paths. A junior engineer pushes to production on Friday and learns why nobody does that.
Mistakes are how we learn. The question was never “how do we stop making mistakes?” - it was always “how do we make mistakes survivable?”
Version control. Undo buttons. Backups. Staging environments. Every meaningful safety net in software exists because we accepted that mistakes are inevitable and built systems to recover from them.
AI agents work the same way.
When Claude Code runs a command and it fails, it reads the error, adjusts, and tries again. When an AI coding assistant installs a wrong dependency, it needs to undo that and try a different one. When an AI agent configures a server incorrectly, something needs to catch that before it becomes a disaster.
The question isn’t “will AI make mistakes?” - it’s “how expensive are those mistakes?”
And that’s where things get interesting. Because there’s a technology that’s been making mistakes cheap for over two decades, long before anyone was thinking about AI agents.
Nix: Where Mistakes Are Free
Nix wasn’t designed for AI. It was designed by people who were tired of broken systems - package conflicts, failed upgrades, the “works on my machine” nightmare. The solution they built has three properties that turn out to be exactly what AI agents need.
Atomicity: All or Nothing
When you install a package with apt-get or brew and it fails halfway through, you’re left with a half-installed mess. Shared libraries might be overwritten. Config files might be partially updated. Your system is in an undefined state, and cleaning up is a manual archaeology project.
Nix doesn’t work that way. Every operation is atomic - it either completes fully or nothing changes. Install a package, it fails? Your system is exactly as it was before you tried. No partial state. No cleanup needed. No residue.
For a human, that’s convenient. For an AI agent, it’s transformative.
An AI agent can try installing a dependency, see it fail, and immediately try an alternative - without first having to figure out what half-installed artifacts the failed attempt left behind. The error surface is just the error message itself, not the error plus a corrupted environment.
# AI agent tries to add a package
$ nix profile install nixpkgs#python311Packages.torch
error: torch-2.1.0 is not available for aarch64-darwin
# System is unchanged. Agent tries an alternative.
$ nix profile install nixpkgs#python311Packages.tensorflow
# Works. No cleanup from the failed attempt needed.
Compare this to pip install torch failing with a C compilation error after already installing three subdependencies. Now the AI agent needs to figure out which packages to uninstall, whether they’re still needed by something else, and whether the environment is even in a usable state. That’s not mistake recovery - that’s a debugging session.
Here’s the difference in state flow. With Nix: try → fail → retry. With a traditional package manager: try → fail → diagnose → clean up → retry.
The traditional path has two extra steps - and neither is guaranteed to work. “Cleanup” after a half-failed install is best-effort. Sometimes apt-get can unwind what it did. Sometimes it can’t. Sometimes the cleanup itself fails and leaves the system in a worse state than the original failure. An AI agent stuck in that loop has no reliable way forward.
With Nix, that entire failure mode doesn’t exist. A failed operation leaves the system exactly where it was. Not “probably where it was.” Exactly.
Rollbacks: Instant Undo
NixOS keeps every previous system configuration as a “generation.” Made a change that broke something? Roll back in one command:
# AI agent updates system configuration
$ sudo nixos-rebuild switch
# Something is wrong - web server won't start
$ sudo nixos-rebuild switch --rollback
# Back to the previous working state. Instantly.
This isn’t “restore from backup.” There’s no downtime, no data loss, no hoping the backup is recent enough. The previous generation is sitting right there, complete and ready. You can even select it from the boot menu if things went really wrong.
Every generation is a complete, bootable snapshot. The AI agent doesn’t need to figure out how to reverse individual changes - it just steps back to a known-good state.
For AI agents managing infrastructure, this changes the risk calculus entirely. An AI agent that can roll back a failed deployment in seconds is fundamentally different from one operating on a system where every change is permanent. The first agent can be bold - try configurations, test changes, experiment. The second has to be extremely conservative because every mistake is expensive.
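That history is directly queryable, so an agent doesn’t have to track state itself. A sketch - on recent NixOS the first command works out of the box; on older setups you query the system profile directly (output and generation numbers are illustrative):

```shell
# List system generations (each entry is a complete, bootable snapshot)
$ nixos-rebuild list-generations
# Older setups: ask the system profile directly
$ sudo nix-env --list-generations --profile /nix/var/nix/profiles/system
```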
The same principle applies at the package level. nix profile keeps a history of every change. Installed the wrong version?
# See what changed
$ nix profile history
# Roll back one step
$ nix profile rollback
No need to remember what was installed before. No need to manually reverse changes. The system remembers, and the AI agent can ask it.
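The agent can even ask exactly what changed between versions before deciding how far to roll back. A sketch using Nix’s closure-diffing commands (the store paths are illustrative):

```shell
# Package-level diff between successive versions of the current profile
$ nix profile diff-closures
# Or compare any two store paths, e.g. two system generations
$ nix store diff-closures /nix/var/nix/profiles/system-41-link \
    /nix/var/nix/profiles/system-42-link
```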
Project Isolation: Go Wild, Stay Safe
Here’s the property that matters most for AI coding assistants.
When an AI agent works inside a nix develop shell, everything it does stays local to that project. It can install packages, change versions, add dependencies - none of it touches the global system. When the shell exits, the system is exactly as it was before.
# Enter project environment
$ nix develop
# AI agent experiments freely
[nix] $ python --version # Python 3.11 for this project
[nix] $ pip install some-experimental-library
[nix] $ # try things, break things, explore
# Exit - global system unchanged
$ exit
$ python --version # Still whatever was here before
This is the difference between giving an AI agent a sandbox and giving it access to your system. In a Nix environment, the agent can be as aggressive as it needs to be - trying packages, testing combinations, exploring solutions - without any risk of polluting your global system or breaking other projects.
Traditional package managers install globally by default. pip install (outside a virtualenv) puts packages in a shared site-packages directory. npm install -g modifies global state. An AI agent using these tools is always one bad command away from affecting everything else on the machine.
With Nix, the blast radius of any mistake is the project shell. Nothing more.
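What defines that shell is a small, declarative file checked into the project. A minimal flake.nix sketch - package names and the target system are illustrative, not a prescription:

```nix
{
  description = "Project dev shell - everything the agent touches stays in here";
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        # Available inside `nix develop`, invisible outside it
        packages = [ pkgs.python311 pkgs.nodejs_20 ];
      };
    };
}
```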
What Cheap Mistakes Enable
When mistakes are cheap, behavior changes.
A human developer who knows they can roll back is more willing to try bold refactors. A chef with a well-stocked pantry is more willing to experiment with recipes. And an AI agent operating in a Nix environment is more willing to explore solutions.
This isn’t theoretical. AI coding assistants already work by trial and error. They generate code, run it, see if it works, and iterate. The faster that feedback loop runs, the better the results. Nix accelerates the loop by removing the cleanup step between iterations.
That cleanup step isn’t just extra work - it’s unreliable. It’s a best-effort operation that sometimes fails, sometimes leaves residue, and sometimes puts the system in a state that’s different from both “before” and “after.” The developer (or AI agent) doesn’t even know if the environment is clean enough to retry. “Wait, what did that failed install leave behind? Is my environment still clean? Do I need to nuke the virtualenv and start over?”
With Nix, the answer is always the same: your environment is fine. Just try again. That’s not best-effort - it’s guaranteed by how Nix works.
At infrastructure scale, the impact is even bigger. Imagine an AI agent managing a fleet of NixOS servers. It rolls out a configuration change. Monitoring detects increased error rates. The agent rolls back the entire fleet to the previous generation - automatically, in seconds, with zero ambiguity about what “previous state” means.
That’s not science fiction. NixOS deployments already support this. The AI agent layer just makes the decision loop faster.
# AI agent deploys new configuration across fleet
$ deploy .#webservers
# Health check fails on 3 out of 10 nodes
# Agent decides to roll back
$ deploy .#webservers --rollback
# Entire fleet back to known-good state
# Total time: seconds, not hours
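The transcript above assumes a tool like deploy-rs on top of flakes. A sketch of one node definition - hostname and system architecture are illustrative; rollback reuses the same generation mechanism as a local nixos-rebuild:

```nix
# flake.nix excerpt (deploy-rs): one fleet node
deploy.nodes.web1 = {
  hostname = "web1.example.com";
  profiles.system = {
    user = "root";
    path = deploy-rs.lib.x86_64-linux.activate.nixos
      self.nixosConfigurations.web1;
  };
};
```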
Compare this to an AI agent managing traditional servers with Ansible or manual configuration. Rolling back means running a different playbook (if one exists), hoping the playbook accounts for all side effects, and praying nothing changed in between. The cost of a mistake is high, so the agent must be conservative.
Conservative AI is slow AI. And slow AI is less useful AI.
The Safety Net Principle
I wrote recently about giving AI stable ground to stand on - reproducible environments where the only variable is the AI itself. This post is the other side of that coin.
Stable ground means the environment doesn’t shift under AI’s feet. A safety net means when AI jumps and falls, it bounces back instead of crashing.
Nix provides both. Reproducibility gives you the stable ground. Atomicity, rollbacks, and isolation give you the safety net.
Honest Caveats
I should be honest: Nix doesn’t make mistakes free. Nothing does.
Nix has a real learning curve. The language is unfamiliar, the error messages can be cryptic, and the documentation has gaps. Getting started is harder than apt-get install. For some teams and some projects, that upfront cost isn’t worth it.
And Nix doesn’t protect against every kind of mistake. It won’t stop an AI agent from writing bad SQL that corrupts your database, or from deploying application logic that’s just wrong. Nix handles the infrastructure layer - packages, system configuration, environment state. The application layer is still on you.
But for the class of problems it does solve - broken environments, half-installed packages, irreversible system changes, global pollution from project-specific dependencies - it gets you closer to cheap, reversible mistakes than any other tool I’ve used. Significantly closer.
And for AI agents specifically, that gap matters. An AI that can experiment without fear of breaking its environment iterates faster. An AI that can roll back a bad change in one command recovers faster. Those faster loops compound into better results.
Not every team needs Nix. But if you’re giving AI agents the keys to your development environments or your infrastructure, it’s worth asking: how expensive is a mistake right now? And could it be cheaper?
I’ve been building with Nix for over a decade - long before AI agents existed. Turns out the properties that make systems reliable for humans make them powerful for AI too. Nix isn’t for everyone, but for those willing to invest in the learning curve, it’s a remarkable safety net. Let’s connect if you’re exploring this intersection.