Software That Writes Itself: Pi, Four Tools, and the Death of Plugins

A coding agent with just four tools powers the fastest-growing GitHub repo in history. Pi doesn't download extensions — it builds them from scratch.

39 min read

Here's a sentence I want you to sit with: someone ran Doom inside a coding agent's terminal interface. Not in a browser. Not in an emulator. Inside the TUI of a minimal agent called Pi. Mario Zechner, Pi's creator, proved it was possible — not because it was useful, but because when your architecture is simple enough, the ceiling for what you can build on top of it is effectively infinite.

Pi has four tools. Read. Write. Edit. Bash. That's the entire toolset. The shortest system prompt of any known coding agent. No browser automation built in. No file search. No git integration. No MCP. Four tools, and the expectation that if you need anything else, the agent will build it for you.

This spartan little engine is what sits underneath OpenClaw — the project that went from 9,000 to 195,000 GitHub stars in 66 days, making it the fastest-growing repository in GitHub's history. While the world fixated on OpenClaw's ability to run as a WhatsApp bot or Telegram agent, almost nobody talked about the thing that actually matters: the architectural thesis underneath. The thing that makes all of it possible.

Pi isn't a product. It's an argument. And the argument is this: software that writes itself is no longer theoretical. It's a daily workflow for some of the best engineers alive.

Less Is More (No, Actually)

"Less is more" is one of those phrases people deploy when they can't be bothered to build features. That's not what's happening here. Pi's minimalism is a deliberate architectural decision with specific consequences that matter enormously.

Most coding agents ship with dozens of tools. File search, grep, web fetch, git operations, browser control, various model-specific integrations. Each tool is a surface the model needs to understand. Each one consumes context window space. Each one is another dependency that can break, another thing that needs updating when APIs change, another thing the LLM might reach for when a simpler solution exists.

Armin Ronacher — creator of Flask, CTO of Sentry, and one of the most respected developers in the Python world — uses Pi as his primary coding agent. His explanation for why he's obsessed with it comes down to what's absent rather than what's present:

"Pi is interesting to me because of two main reasons. First of all, it has a tiny core. It has the shortest system prompt of any agent that I'm aware of and it only has four tools: Read, Write, Edit, Bash. The second thing is that it makes up for its tiny core by providing an extension system that also allows extensions to persist state into sessions, which is incredibly powerful."

The four tools aren't a limitation. They're a foundation. Read lets the agent understand what exists. Write lets it create new things. Edit lets it modify what's there. Bash lets it execute anything the operating system can do. If you think about it, that's the entire POSIX philosophy compressed into an agent interface. Every piece of software ever written was ultimately produced using some variation of those four capabilities.
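
To make that surface concrete, here's an illustrative sketch of what a four-tool interface could look like. The type names are invented for this article, not taken from Pi's source; the point is how little there is to specify.

```typescript
// Illustrative only: invented type names, not Pi's actual API.
type ToolCall =
  | { tool: "read"; args: { path: string } }
  | { tool: "write"; args: { path: string; content: string } }
  | { tool: "edit"; args: { path: string; oldText: string; newText: string } }
  | { tool: "bash"; args: { command: string } };

// Every result comes back as plain text in the model's context.
// Anything more exotic (search, git, browsers) is expressed through bash.
type ToolResult = { call: ToolCall; output: string };
```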

And then there's a quality you rarely hear discussed in agent comparisons: Pi is written like excellent software. It doesn't flicker. It doesn't consume excessive memory. It doesn't randomly crash. Ronacher specifically calls this out — Zechner is "someone who takes great care of what goes into the software." In a field drowning in hastily-assembled agent frameworks held together with prompt engineering and hope, Pi is engineered. There's a difference.

What makes the minimalism interesting isn't the minimalism itself — it's what the minimalism enables.

The Agent That Extends Itself

Pi's extension system is where things get genuinely strange. Because the agent can read, write, and execute code, and because Pi ships with documentation and examples that the agent itself can access, Pi can build its own extensions. Not download them. Not install them from a marketplace. Build them from scratch, hot-reload them into the running session, test them, fix what's broken, and iterate until they work.

This is not a theoretical capability. Ronacher has built an entire suite of extensions for his Pi setup, and he's explicit about how they were created: none of them were written by him. They were all created by the agent to his specifications. He told Pi what he wanted. Pi built it. When something didn't work, he clarified. Pi fixed it.

Look at what he's running daily, all available in his agent-stuff repository:

  • /answer — Extracts questions from the agent's response and reformats them into a clean input interface. Ronacher doesn't use plan mode, so he needs a different mechanism for the back-and-forth of clarifying questions. The agent built him one.

  • /todos — An agent-managed task list stored as markdown files. Both the human and the agent can manipulate it, and sessions can claim tasks to mark them as in progress.

  • /review — Code review that exploits session tree branching. The agent reviews code in a separate branch before the human ever sees it, then brings findings back to the main session. Modelled after Codex's review interface.

  • /control — One Pi agent sends prompts to another. A multi-agent system with zero orchestration overhead.

  • /files — Lists all changed or referenced files in a session, with quick-look previews, VS Code diffs, and Finder reveal. Press shift+ctrl+r to quick-look the most recently mentioned file.

Beyond slash commands, Pi extensions can render custom TUI components directly in the terminal: spinners, progress bars, interactive file pickers, data tables, preview panes. This is where the Doom stunt becomes relevant — if the TUI framework is flexible enough to render a first-person shooter, it can certainly render a debugging dashboard or a deployment status board.
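
To make the extension idea concrete, here is a hypothetical sketch of what a hot-reloadable slash command might look like. The registration API below is invented for illustration and will not match Pi's real extension interface; what matters is the shape: a command handler plus state that persists into the session without ever reaching the model.

```typescript
// Hypothetical extension shape. The API names are invented for illustration
// and do not correspond to Pi's actual extension interface.
interface ExtensionContext {
  registerCommand(name: string, run: (args: string) => Promise<void>): void;
  // State saved here lives in the session file but is never sent to the model.
  saveState(key: string, value: unknown): void;
  loadState<T>(key: string): T | undefined;
  print(text: string): void;
}

export default function activate(ctx: ExtensionContext) {
  ctx.registerCommand("todos", async (args) => {
    const todos = ctx.loadState<string[]>("todos") ?? [];
    if (args.trim()) todos.push(args.trim());
    ctx.saveState("todos", todos);
    ctx.print(todos.map((t, i) => `${i + 1}. ${t}`).join("\n"));
  });
}
```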

Ronacher also replaced all his browser automation MCPs with a single skill that uses raw Chrome DevTools Protocol. Not because the alternatives were bad, but because — in his words — "this is just easy and natural. The agent maintains its own functionality."

Read that last sentence again. The CTO of Sentry, a company that processes billions of error events, replaced his entire browser automation stack with something his AI agent wrote for him. And it works. And when it breaks, the agent fixes it. And when he doesn't need a tool any more, he throws it away. No uninstall process. No dependency cleanup. Just delete and move on.
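
For flavour, here is a minimal sketch of the kind of script an agent might write against raw CDP, using the chrome-remote-interface package for Node. It is illustrative of the approach, not the actual skill described above, and it assumes Chrome was started with remote debugging enabled.

```typescript
// Minimal CDP sketch. Assumes Chrome is running with --remote-debugging-port=9222.
// Illustrative of the approach only, not the actual skill described above.
import CDP from "chrome-remote-interface";

async function titleOf(url: string): Promise<string> {
  const client = await CDP(); // connects to localhost:9222 by default
  try {
    const { Page, Runtime } = client;
    await Page.enable();
    await Page.navigate({ url });
    await Page.loadEventFired();
    const { result } = await Runtime.evaluate({ expression: "document.title" });
    return String(result.value);
  } finally {
    await client.close();
  }
}

titleOf("https://example.com").then(console.log);
```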

Session Trees: Git for Conversations

Here's a problem every coding agent user has encountered: you're 40 messages into a productive session, and something breaks. A tool misbehaves. A dependency is wrong. A side-quest emerges. You need to fix it, but fixing it means burning context on debugging output that has nothing to do with your actual task. By the time you've resolved the issue, you've lost the thread of what you were doing, and the model's context window is cluttered with irrelevant tokens spent on a tangent.

Pi solves this with session trees. Sessions aren't linear sequences of messages — they're trees. You can branch at any point, go on a side-quest (fix the broken tool, investigate a bug, experiment with an approach), and then rewind back to where you were. Pi summarises what happened on the other branch, so you maintain awareness without polluting your main context.

This is git for conversations. Branch, do work, merge the knowledge back. Discard what you don't need. It's such an obviously correct model that it's genuinely surprising more agents haven't adopted it.

But session trees do something even more important for self-extending agents: they make the extension-writing loop safe. If the agent writes a broken extension, you branch, fix it on a side-quest, verify it works, then rewind to the main line. The main session never saw the breakage. The main context never got cluttered with stack traces and debugging output. You just resume where you left off, with a working tool and a clean context window.

There's another subtlety here that matters for the longer term. Sessions in Pi can contain messages from multiple model providers. The system deliberately avoids leaning into any model-provider-specific feature set that can't be transferred to another. This means your session history — your branching tree of work, your extension state, your todo lists — isn't locked to Claude or GPT or Gemini or DeepSeek. It's yours. Custom messages in the session files store extension state without ever being sent to the AI, ensuring your workflow data stays local and portable.

The portability is architectural, not aspirational. Pi was designed from the start for a world where you might switch models mid-session.
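
As a rough sketch of how a branchable, provider-agnostic session might be modelled, consider something like the following. The type names are invented for this article and do not come from Pi's source; they just show how branching, summaries, and local extension state can coexist in one file.

```typescript
// Invented types: a sketch of a branchable session, not Pi's actual format.
type SessionMessage =
  | { kind: "user"; text: string }
  | { kind: "assistant"; provider: string; model: string; text: string }
  // Custom messages carry extension state. They are stored in the session
  // file but never sent to the model.
  | { kind: "custom"; extension: string; state: unknown };

interface SessionNode {
  id: string;
  parentId?: string;         // branching = several nodes sharing one parent
  messages: SessionMessage[];
  summary?: string;          // what gets folded back into the main line on rewind
}
```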

The Anti-MCP Manifesto

Pi doesn't support MCP. Not "doesn't support it yet." Not "support is on the roadmap." No MCP support, by design, with no plans to add it.

This is the most contrarian architectural decision in the agent space right now, and it might also be the most correct one.

The Model Context Protocol has become the industry's default answer to "how do we give agents access to external tools?" Anthropic created it. OpenAI adopted it. Every major agent framework is building MCP servers. The assumption is universal: MCP is how agents will interact with the world.

Pi says no. And Ronacher has been vocal about why. His argument, laid out in his "Tools: Code Is All You Need" post from July 2025, is empirical rather than ideological: MCP consumes too much context, isn't truly composable, and demands significant upfront input for every invocation. His test is simple — try completing a GitHub task with the GitHub MCP, then try it with the gh CLI tool. The CLI will use context more efficiently and get you to your result faster, every single time.
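
As a rough illustration of the context argument, compare what the agent actually has to carry in the two cases. The tool-call shapes below are invented for this comparison, but the asymmetry is the point: the CLI route needs no GitHub-specific specification in the system prompt at all.

```typescript
// Invented tool-call shapes, purely to illustrate the context asymmetry.

// Via the bash tool: the agent emits one short command, and nothing about
// GitHub has to live in the system prompt.
const bashCall = { tool: "bash", command: "gh issue view 1234 --comments" };

// Via an MCP server: every session carries the server's tool specifications
// (names, descriptions, JSON schemas per operation) in context, plus a
// structured call like this one.
const mcpCall = {
  tool: "get_issue",
  arguments: { owner: "example-org", repo: "example-repo", issue_number: 1234 },
};
```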

But Pi's rejection of MCP goes deeper than efficiency metrics. It's philosophical. If your agent can write code, and your agent can execute code, then your agent can build whatever integration it needs on the fly. Why would you download a pre-built MCP server for browser automation when your agent can write a script using CDP directly? Why would you install a GitHub MCP when gh already exists and the agent knows how to use Bash?

Every tool you load into the system prompt is context that isn't being used for your actual problem. Pre-built tool specifications are a tax on your agent's attention. Pi pays no such tax. It starts empty and builds what it needs.

When Pi users need MCP compatibility — say, because an organisation has already invested in MCP infrastructure — they use mcporter, a CLI bridge that exposes MCP calls as command-line tools. The agent doesn't need to know MCP exists. It just calls a CLI. Problem solved, zero context overhead, no tool specifications bloating the system prompt. It's an adapter, not an integration.

This is a genuinely radical position: the entire tool-specification approach to agent extensibility is wrong. The correct approach is to give the agent the ability to write and run code, then let it build whatever it needs. The specification is the code. The tool is the code. The integration is the code. Code is all you need.

Three Friends in Vienna

There's a human story underneath all of this architecture. Mario Zechner built Pi. Peter Steinberger built OpenClaw. Armin Ronacher became Pi's most prominent evangelist and power user. They're friends. They're all connected to the Vienna and Austrian tech scene. They've been pushing each other's thinking on agents for years.

Steinberger describes his approach as "sci-fi with a touch of madness." Zechner, by contrast, is what Ronacher calls "very grounded." The difference shows up in their projects: OpenClaw is the ambitious, user-facing layer — messaging integrations across WhatsApp, Telegram, Discord, and Signal; channels, skills, memory, the thing that makes your AI agent feel like a person in your contacts list. Pi is the engine underneath — minimal, precise, disciplined.

The relationship between Pi and OpenClaw matters because it proves something important: you can build a wildly popular, feature-rich consumer-facing product on top of a four-tool foundation. OpenClaw didn't need Pi to have built-in browser control, or native Git support, or an MCP server. It just needed Pi to be able to write code, run code, and extend itself. Everything else is emergent behaviour.

Pi is also explicitly designed as a collection of components you can build your own agent on top of. That's how OpenClaw works. That's also how Ronacher built his own personal Telegram bot, and how Zechner built mom, another agent variant. If you want to build your own agent connected to some service, you point Pi at itself and at mom, and it'll create one for you. The agent builds agents. That sentence should make you pause.

On 14 February 2026, Steinberger announced he was joining OpenAI, with OpenClaw moving to an open-source foundation. The project's 195,000-star trajectory will continue under community governance. But the real story was never OpenClaw's popularity. It was that a minimal coding agent, built by a careful engineer in Austria, turned out to be a sufficient foundation for an entire new category of software.

When Software Writes Its Own Tools

Let's think about what it actually means when an agent can fabricate its own tooling.

First, plugin marketplaces become vestigial organs. The entire model of "build a tool, publish it to a store, let others discover and install it" assumes that building the tool is the hard part and distribution is where the value lies. But if any user can say "build me a tool that does X" and get a working implementation in 3 minutes, the marketplace adds nothing. Why would I download someone else's browser automation skill when my agent can write one tailored to my exact workflow, my exact authentication setup, my exact error-handling preferences?

Ronacher makes this point viscerally. He has quite a few skills in his setup, and crucially, he throws skills away when he doesn't need them. He built a skill to read Pi sessions that other engineers shared, which helps with code review. He has a skill for commit message formatting and changelog updates. He built a skill to redirect pip and python calls to uv instead. All disposable. All replaceable. All built by the agent, not downloaded from anywhere.

Second, vendor lock-in dissolves. If your tools are written by your agent, to your specifications, using standard protocols and CLIs, you're not dependent on any particular vendor's tool ecosystem. Your agent can rewrite its own integrations to work with a different provider overnight. The switching cost approaches zero because the implementation is disposable — only the specification of what you wanted matters.

Third — and this is the implication most people in the agent space haven't yet internalised — the bottleneck shifts permanently from implementation to specification. When the cost of producing software approaches zero, the only differentiator is the quality of the description. How precisely can you articulate what you want? How clearly can you define the edge cases? How well do you understand the problem you're solving?

This connects directly to Nate B Jones' thesis on specification-driven development: as production costs collapse toward zero, specification becomes the entire game. The person who can clearly articulate "I want a browser automation skill that uses CDP, handles authentication cookies, supports session persistence, and retries on network failures with exponential backoff" gets exactly that in minutes. The person who says "make me a browser thing" gets something generic and unsatisfying. Same agent. Same four tools. Wildly different outcomes. The variable is the human's ability to specify.
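
To see how much behaviour one clause of a precise specification pins down, here is a tiny sketch of the retry requirement alone, written as the kind of helper an agent might produce. It is illustrative only; the function name and defaults are invented.

```typescript
// Illustrative sketch of "retries on network failures with exponential backoff".
// The name and defaults are invented; only the behaviour matters.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 250,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt + 1 >= maxAttempts) throw err;
      const delayMs = baseDelayMs * 2 ** attempt; // 250ms, 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```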

Ronacher captures this perfectly:

"Part of the fascination that working with a minimal agent like Pi gave me is that it makes you live that idea of using software that builds more software. That taken to the extreme is when you remove the UI and output and connect it to your chat. That's what OpenClaw does."

Software that builds more software. Not as a marketing slogan. As a daily workflow. As the new normal for how serious engineers are already working in early 2026.

This has implications for hiring, for education, for how we think about software engineering as a discipline. If the implementation bottleneck vanishes, we need people who are excellent at specification — at understanding problems deeply enough to describe solutions precisely. We need product thinkers, not just code writers. We need people who can articulate failure modes and user experience requirements with the same rigour they currently bring to writing functions. The irony is that this is what good software engineering was always supposed to be. We just got distracted by the typing.

What Comes Next

I've spent 26 years in software and ecommerce. Long enough to recognise when something fundamental has changed and the industry hasn't caught up yet. The shift isn't OpenClaw's star count, impressive as it is. It isn't the Moltbook frenzy. It isn't even Peter Steinberger joining OpenAI.

The shift is this: three friends in Vienna built a minimal agent with four tools, and it turned out that four tools is enough. Enough to build a browser automation suite. Enough to build a code review system. Enough to build a task manager. Enough to build a multi-agent orchestrator. Enough to run Doom, if you're feeling frivolous. Enough to power the fastest-growing open-source project in the history of GitHub.

Read. Write. Edit. Bash. Everything else is emergent.

The old model — build tools, package them, distribute them through marketplaces, hope someone downloads them — is already looking like an artefact of a different era. Not because the tools were bad, but because the assumption underneath them (that building tools is hard and time-consuming) is no longer true. When your agent can fabricate whatever tool it needs, on demand, tailored to your exact requirements, the entire supply chain of pre-built integrations starts to look like a detour around a problem that no longer exists.

Pi proves that the minimal viable agent is far smaller than anyone assumed. And if your foundation is small enough, everything above it becomes fluid — modifiable, disposable, self-repairing. The agent writes a tool. The tool breaks. The agent fixes the tool. The tool becomes obsolete. The agent discards it and writes a better one. No release cycle. No dependency management. No pull requests. Just specification and execution, looping indefinitely.

Software that writes itself isn't a metaphor any more. It's a GitHub repository with a four-tool system prompt and a growing community of engineers who've stopped writing their own extensions by hand. They describe what they want instead. The machine handles the rest.

The only question left — the only question that was ever going to matter — is how well you can describe what you want built.
