We Thought Loop Engineering Was Just Burning Money
We kept hearing about loop engineering. It was everywhere — YouTube videos, AI newsletters, Twitter threads. "Maker-verifier loops." "Agentic iteration." "Let your AI agents argue with each other." We probably watched 10 to 15 videos trying to understand why everyone was so excited about it.
And for the longest time, we thought it was just a fancy way to burn tokens.
Here's why. We'd already built something serious. Our AI team — you know, the Smart Minions — was working. We have a PM agent that reads Jira tickets and plans the work. A dev agent that writes code. A QA agent that reviews everything and catches bugs before it ships. They work in pipelines. PM approves, dev builds, QA tests, it ships. We have a whole separate team that generates content every night — articles, challenge scenarios, venture studio journeys. The system works. It's been building real features, real code, shipping to production for months.
So when someone says "now add a loop where AI argues with itself for 10 rounds," our honest reaction was: why? We already have agents doing complex work in sequence. Adding more rounds of back-and-forth felt like adding a meeting to a system that was already delivering results. The critics called it "token burning" and we nodded along. Made sense to us.
But we couldn't let it go. Something about it kept nagging. The people we respected — not the YouTube influencers, but the people actually shipping AI products — kept mentioning loops as a real tool they reached for in certain situations. Not all situations. Certain ones.
So we did what we always do. We picked a fight. We opened Gemini and said "convince me this isn't just expensive nonsense." Got a polite theoretical answer. Pushed back. "Give me something real." Opened ChatGPT. Same dance. We went back and forth across multiple sessions, multiple days. Not casually — we were genuinely trying to either prove it or kill it so we could move on.
The moment it clicked was embarrassingly simple. We asked ourselves: "When do we NOT know what the answer should look like?"
That was the whole thing. Our agent pipelines work beautifully when we know what we want. "Build this feature." "Write this article." "Review this code." The task is defined. The output shape is known. The PM knows what to approve, the dev knows what to build, QA knows what to test. It's execution. And for execution, you don't need a debate. You need a clear pipeline.
But what about when we don't know what we want? What about the genuinely open-ended problems — like "what tool should we build next?" That's not a task. That's a search. And a search done in one pass just gives you whatever the AI thinks of first. It looks polished. It sounds smart. But nobody tested it. Nobody said "wait, does this already exist?" or "would anyone actually use this more than once?"
That's where loops belong. Not everywhere. Just there — in the space between "we don't know what to build" and "now we're confident enough to build it."
So we found our test case. Every night, we have an agent that searches the internet and tries to figure out what new tool we should build for our platform. For months, it kept coming back with the same type of idea — accounting tools, receipt trackers, quarterly tax calculators. Night after night after night. Because we'd hardcoded the search space, and nobody was pushing back on what it found.
We rebuilt it as a maker-verifier loop. Five rounds. Each round, a maker searches the web, proposes something, and immediately gets challenged by a verifier. "What's your weakest claim?" "Name the tool that already does this." "Is this something people use weekly, or is it a novelty they try once and forget?" If the idea dies in round two, good. We just saved ourselves from building the wrong thing.
The first night we ran it, it came back with something none of us had considered. Small business owners — freelancers, contractors, cleaners — freeze when they need to have an uncomfortable conversation with a client. Late payment. Price increase. Firing a difficult customer. They know what they need to say but can't get the wording right. They don't need a lawyer for an $800 invoice. They just need someone to write the message so they can copy, paste, and send it before they lose their nerve.
The verifier challenged it. Checked for competitors. Tested feasibility. Scored it. Final verdict: BUILD. We created the Jira ticket that same night. We're calling it Tough Talk.
Here's what we actually learned from this whole journey. Loop engineering isn't a replacement for what we were already doing. It's not better. It's for a different kind of problem. Our pipelines — PM, dev, QA, ship — that's for when we know the destination. Loops are for when we're still figuring out where we're going. One is a team executing a plan. The other is a team exploring a map.
We ran it that night and it found a tool worth building — on the very first try. That felt good. But we're not going to pretend we've figured out loop engineering completely. This is just our understanding so far. Execution vs. discovery. Pipelines for known work, loops for open-ended searches. That framing made sense to us and gave us a real place to apply it.
Maybe there's more to it. Maybe we'll find other use cases as we keep exploring. But for now, this is what clicked — and it was enough to stop dismissing it and start using it.