The Sum of All Tokens
There is a line of argument doing the rounds online that goes something like this: why would anyone pay for software any more when you can just prompt it into being? SaaS is over. Licences are dead. The future is a world of bespoke tools, each one generated on demand, thrown away the moment it stops being useful.
It is half true, and that is what makes it interesting.
I have spent the last year cancelling SaaS subscriptions at sheepCRM and replacing them with things we built ourselves. I have also spent the last year starting projects that felt promising for about ten percent of their length and then stalled, half finished, in a folder I do not look at often. Both of those are real. The argument I want to make is not that SaaS is dead, and not that AI-generated replacements are a fantasy, but that the two are now in genuine competition and the competition plays out shape by shape, not all at once. Some SaaS is already finished. Some of it is safer than it has ever been. Telling them apart is the actual skill, and it is much harder than "prompt a replacement and see what happens."
🔧 The Fathom replacement
Until recently we paid for a product called Fathom. It is a finance tool that sits on top of Xero and produces cash-flow forecasts, dashboards, scenario models - the things you want when you are running a small software business and trying to keep an honest eye on the runway. Fathom is a genuinely good product. I liked using it. I recommend it to people who need what it does.
We cancelled it earlier this year because we built something that fits us better.
The thing worth noticing is why it fits better. It is not because our version is slicker or more beautiful than Fathom. It is not. It is because our version has access to data Fathom does not. Our pricing model. Our project pipeline. Decisions that have already been made internally but will not show up in the accounting system for weeks. And a specific internal abstraction we use to describe project complexity, a points system where a 60-point project and a 120-point project mean very particular things to us and nothing at all to a generic finance tool. Fathom cannot reach those numbers because they do not exist in its schema. They live in our heads and in our own tooling.
When you pull that together, Fathom's output is a shadow of ours. Not because Fathom is badly built. Because we were asking Fathom to infer what we already knew.
The eighty percent Fathom did for us was easy enough to recreate. Cash-in, cash-out, a rolling projection, scenario toggles. None of that is hard when you have a week of evenings and an AI assistant that can wire the chart library up while you go and make the tea. The twenty percent that made it worth building was the bit Fathom could never have sold us in the first place: the shape of our particular business, the numbers only we know, the assumptions only we hold.
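To show how little code that eighty percent really needs: the sketch below is hypothetical, with made-up names and numbers rather than anything from our actual tool, but a rolling cash projection with scenario toggles is essentially one loop.

```python
from dataclasses import dataclass

@dataclass
class Scenario:
    """A hypothetical scenario toggle: one set of assumptions."""
    monthly_revenue: float
    monthly_costs: float
    revenue_growth: float = 0.0  # fractional growth per month

def project_cash(opening_balance: float, scenario: Scenario,
                 months: int = 12) -> list[float]:
    """Rolling projection: month-end balances under one scenario."""
    balance = opening_balance
    revenue = scenario.monthly_revenue
    balances = []
    for _ in range(months):
        balance += revenue - scenario.monthly_costs
        balances.append(round(balance, 2))
        revenue *= 1 + scenario.revenue_growth
    return balances
```

Scenario toggles are then just alternative `Scenario` instances fed through the same loop. The twenty percent that made it worth building is the part that fills those fields in from data only we hold.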
AI did not kill Fathom. Our own data did. AI just made acting on that realisation cheap enough to bother.
🐤 Cloud Canary and the MCP angle
The second example is different in shape, which is why I want to tell it next.
A few years back we cancelled Datadog. Mainly on cost - Datadog prices itself at enterprise scale and we are not at enterprise scale - but with real regret. Datadog is a brilliant product. Anyone who has lived inside a proper Datadog dashboard, with everything stitched together and every log line within one click of the metric that drew your attention to it, knows the feeling. We missed it every time a customer flagged something and we had to go hunting through three different log streams in three different places to work out what had happened.
Late last year I spent a day prompting Claude Code to build us a replacement. I called it Cloud Canary. The first version was rough but usable inside a day. Over the following few weeks I spent maybe another three or four days on it in odd evenings, fixing the things that did not quite fit. It is now the tool we use every day. It ingests our log streams, brings them into one place, and gives us a searchable web interface on top.
That part is not interesting. That part is what Datadog already does, only better. The interesting part is the MCP layer.
What I actually wanted, and what a SaaS vendor was never going to build for me in 2026, was the ability to say to another AI tool: "A customer is reporting the following problem. Go and search the logs. Come back with possible causes and candidate fixes." Cloud Canary's MCP interface does that. The logs are addressable by agents. The interface is not a human with a mouse; it is a peer AI, which is a different design problem and a different product.
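Cloud Canary is not public, so the sketch below is mine rather than its source, and every name in it is hypothetical. But the agent-facing shape of an MCP-style layer is roughly this: a search function with a plain, documented signature that a peer AI can call, returning structured hits rather than a rendered page. A real build would register the function with an MCP server SDK; here it is just a function.

```python
from dataclasses import dataclass

@dataclass
class LogLine:
    stream: str      # e.g. "api", "worker"
    timestamp: str   # ISO 8601
    message: str

class LogIndex:
    """Toy single-process stand-in for the ingested, unified log store."""
    def __init__(self) -> None:
        self.lines: list[LogLine] = []

    def ingest(self, stream: str, timestamp: str, message: str) -> None:
        self.lines.append(LogLine(stream, timestamp, message))

def search_logs(index: LogIndex, query: str, limit: int = 20) -> list[dict]:
    """The tool an agent calls: structured results, not a web page."""
    q = query.lower()
    hits = [l for l in index.lines if q in l.message.lower()]
    return [
        {"stream": h.stream, "time": h.timestamp, "message": h.message}
        for h in hits[:limit]
    ]
```

The design point is the return type. A human-facing product would stop at the web interface; an agent-facing one has to hand back data the caller can reason over, which is why it is a different product.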
I did not build Cloud Canary because I could do it better than Datadog. I cannot. I built it because I could do enough of it, shaped for us, with the AI-native hook that mattered to me, at a cost that made sense. And because Datadog would have charged us the wrong amount for the wrong product from our point of view.
💰 Software is not free. It is the sum of all tokens, plus something harder.
The lazy version of the "AI kills SaaS" argument treats software as if it is about to become free. It is not. The tokens cost money. The compute costs money. The time you spend prompting, rejecting, re-prompting and reviewing costs more than either. What has actually changed is not that software has become free; it is that the cost of software has stopped being a licence fee and started being the sum of your token spend plus the value of the human direction feeding those tokens.
This piece is called The Sum of All Tokens, but the tokens are the easy half of the sum. The hard half is the direction they are pointed in.
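To make the sum concrete, here is the arithmetic with hypothetical figures (none of these are our real numbers): the token spend is the small term, the direction hours dominate, and the upkeep term decides whether the build ever beats the licence fee.

```python
def build_cost(token_spend: float, direction_hours: float,
               hourly_value: float) -> float:
    """The sum of all tokens, plus the harder half: human direction."""
    return token_spend + direction_hours * hourly_value

def breakeven_months(build: float, saas_monthly_fee: float,
                     upkeep_monthly: float = 0.0) -> float:
    """Months of cancelled subscription needed to pay for the build."""
    net_saving = saas_monthly_fee - upkeep_monthly
    if net_saving <= 0:
        return float("inf")  # the build never pays for itself
    return build / net_saving
```

Run the hypothetical numbers and the shape of the argument falls out: a modest token bill plus twenty hours of direction is a real cost, and if ongoing upkeep eats the whole subscription saving, the break-even never arrives.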
Anyone who has worked with a modern coding assistant for more than a weekend has noticed the same thing. Give it roughly the same prompt twice and you get two different answers. Give it the same prompt on two different days and you might get four. Each answer is plausible. Each answer is, in some reasonable sense, correct. But they are not the same piece of software, and they are not reaching the same peak.
Without a human steering, what you get out of an AI generation is a local maximum. It works, more or less, in the shape the model happened to produce this time. It can even look impressive. What you do not get, without sustained human direction, is the global maximum - the version a domain expert would look at and recognise as right. That version is not available on a single prompt. It needs taste, iteration, and the long, slow work of telling the model "no, not that, try again" until the thing in front of you matches the thing in your head.
This is what I meant in The Long Obedience when I quoted Nietzsche's line: "The essential thing 'in heaven and in earth' is that there should be a long obedience in the same direction; there thereby results, and has always resulted in the long run, something which has made life worth living." I wrote that about fifteen years of sheepCRM. It applies just as cleanly to the thirty-sixth revision of a prompt. The obedience is the direction that turns a plausible draft into a fitted tool. Without it, you are standing on a local hill wondering why you can see a bigger one in the distance.
🗂️ Seven shapes of SaaS
The argument about AI and SaaS is being had at the level of the category, and that is where it goes wrong. "SaaS" now covers everything from Stripe to a reporting dashboard, and there is no honest conversation to be had about what survives until you pull those apart.
I want to avoid the word "tier" here. Tiers stack on top of one another, and what I am describing are not layers of a stack. They are different shapes of product, each surviving or not for different reasons. They sit next to each other, not above and below.
Here are the seven shapes I have ended up thinking in.
🧱 Commercial-foundational shapes
Stripe. AWS. Twilio. Cloudflare. The core public cloud databases. These are the rails the rest of us build on. Their moat is not really the code; it is that a card can be charged in a hundred jurisdictions, that your messages actually get delivered, that the infrastructure stays up at 3am on a bank holiday. They could, in theory, be competed with. In practice nobody reasonable does it, because the cost of building the equivalent trust and compliance footprint is enormous and the reward for succeeding is that you are now Stripe, or nearly.
This shape is safer from AI generation than it has ever been, because the rest of us are offloading more of our plumbing to it as we build custom things above it.
🏛️ Regulated-foundational shapes
Companies House. HMRC's Making Tax Digital APIs. DVLA. The NHS's Spine. The Gov.UK identity pieces. The reason this shape survives is not that it has outcompeted alternatives; it is that there are no alternatives. A monopoly by legislation, not by market. You cannot replace HMRC with a clever weekend project, and if you built one it would still not be HMRC.
Worth separating from the commercial-foundational shape because the mechanics are different. A commercial foundation is safe until someone crosses an enormous moat. A regulated foundation is safe until a government decides to stand up a second one, which is not a thing that happens. From a builder's point of view, both are "do not try to replace," but the reasons, and therefore the failure modes, are not the same.
🧠 Deep-expertise shapes
Xero. Specialist clinical coding tools. Serious legal platforms. The obvious-looking parts of these products - filing a VAT return, rendering a P&L - are in principle doable by a sufficiently motivated builder with a good AI and a long weekend. Making Tax Digital is a well-specified API. It is exactly the kind of tedious spec a human hates reading and an AI devours.
The non-obvious part is harder. How accountants want their books laid out is decades of accumulated taste, informed by real arguments with real clients about how a purchase invoice should sit next to a credit note. That is not a spec. That is a million small judgements. The surface of Xero is cherry-pickable. The core is not, unless you happen to carry the equivalent taste yourself, and most of us do not.
🛡️ Adversarial shapes
Fraud detection, spam filtering, bot mitigation, security scanning. These are the products where the problem keeps moving. A one-shot AI build will be beautiful on launch day and leaky by the end of the quarter, because the attackers have adapted and your code has not. This is the shape where long-obedience compounding matters most. The vendors in it are not selling you a product so much as a continuing commitment to stay ahead of an enemy.
You can absolutely build a shallow replacement for any of these on a weekend. You will not build one that still works next year without becoming a specialist yourself.
👥 Collaboration shapes
Slack, Notion, Figma, Linear. GitHub. These look like candidates for replacement until you remember that the reason they work is that everyone you work with is already on them. You cannot prompt a network into being. Even if you build a technically superior clone of Slack for free, nobody joins it, and a communication tool nobody joins is not a communication tool.
But I want to be honest about this shape, because I was close to calling it "extremely safe" and that is lazy. Network effects are not permanent. The same mechanism that makes a product unassailable once it is the default can eject it from that position faster than anyone expects if the herd decides to move. We have seen the pattern in the last two years in coding tools specifically: a market leader emerges, becomes the thing everyone talks about, and then is displaced eighteen months later when the conversation shifts to a new provider. Once enough feet are pointing at the exit, the stampede does the rest.
Figma is the live example worth watching. The Adobe acquisition was blocked. AI-native design-to-code tools - v0, Bolt, Lovable, Framer - are actively trying to collapse the design-to-handoff-to-build dance into a single prompt, which would make a lot of what Figma sells less central. Penpot circles on the open-source side. Figma is not falling today. It might not fall at all. But "safe because of the network" does not mean safe forever, and the collaboration shape is one where fashion can turn a fortress into a former fortress surprisingly quickly. The network effect is real. So is the stampede.
🔗 Integration shapes
Zapier, n8n, Make. This one is genuinely contested. Part of it is foundational - the work of keeping a hundred connector libraries up to date, across version upgrades and API deprecations, is exactly the sort of unglamorous maintenance nobody wants to do in their spare evenings. Part of it is thin-wrapper - if the integration you need is one or two APIs, you can and probably should just write it. My honest prediction is that the glue products survive for the long tail of messy integrations and lose to custom code for the simple ones.
🪣 Thin-wrapper shapes
This is the shape where SaaS is going to die, and it is the shape my Fathom and Cloud Canary examples both sit in. Tools whose job is essentially to take data you already have, shape it, and show it back to you, with maybe a few sensible defaults on top. None of that is protected. All of it is within reach of a competent person with a weekend and a model subscription.
Thin-wrapper is not an insult. Some of these products are beautifully made, and some of them are very profitable right now. They are going to get squeezed hard, because the thing they sell is exactly the thing AI generation is best at producing and the customer's own data is exactly the thing the vendor cannot see.
If you run a SaaS product, the important question to answer in 2026 is which shape you are honestly in. Most of the anxious ones, I suspect, are in the thin-wrapper shape and telling themselves they are in the deep-expertise one.
⚠️ The scratch-the-itch problem, which I have not solved
I would like to pretend the rest of my own year looked as tidy as the Fathom and Cloud Canary stories. It did not.
I have a folder of projects that got well past the first promising ten percent and then stalled. One of them, in particular, has been nagging at me for months. It is a little tool I have been playing with in the area of ontology and naming in an AI world - memory, vocabulary, the way a group of agents and humans agree on what things are called. The project works, in the sense that it runs and does something useful. It has not landed, in the sense that it has not quite become the thing I thought it was going to be.
The temptation, when I look at that folder, is to blame the AI. The AI is not to blame. What I did was scratch the itch too early. I had a feeling, I had a weekend, I had a coding assistant that would cheerfully generate whatever I pointed it at, and I dived in before I had spent long enough actually understanding the problem. The result is a tool that is technically fine and conceptually half baked.
Here is where I am genuinely torn. Is the discipline I am missing the discipline of not starting - of sitting with a problem for longer before prompting anything? Or is it the discipline of not finishing - of pushing through the trough when a tool almost but not quite works, instead of quietly closing the laptop and moving on to the next shiny thing? I can make the case for either. I suspect the honest answer is "both, and which one matters depends on the project."
The AI landscape is moving fast enough that "try it and see" is probably the only strategy most of us can realistically execute, even the experienced ones. The scarce skill, if that is true, is not classical patience. It is taste for which shape a problem lives in, and therefore how much depth you are going to have to bring before the AI's help is worth anything. Some problems really are a weekend. Some problems are a decade. Knowing which is which, in the first hour, is the skill I am still trying to develop. I do not think I am alone.
🤖 The missing market: maintenance agents for custom tools
There is one more thing that has to be true for this whole argument to hold up over the long run, and I do not think it is true yet.
If we all start replacing thin-wrapper SaaS with little custom builds, and if the "cost" of each little custom build is genuinely just a day of prompting, we have not actually saved anything unless we also have a story for maintenance. Software rots. Dependencies drift. The Enphase UI changes. The log format moves. The SaaS vendor we were leaning on deprecates an endpoint.
The enterprise world has spotted this. There is already a crop of products in the "AI SRE" space - Bits AI SRE, Resolve AI, Rootly, incident.io's AI SRE, Azure SRE Agent, LogicMonitor's Edwin - all pointed at the problem of large-company incident response. They watch the alerts, do the triage, push the rollback, page the human when they have to. It is a serious and well funded slice of the market.
Closer to the shape I am actually describing, a UK startup called Phoebe is worth naming. Founded by Matt Henderson and James Summerfield, formerly CEO and CIO of Stripe Europe, Phoebe picked up a $17M seed in August 2025 led by GV and Cherry Ventures. I should say up front: I worked with both Matt and James as a consultant at their previous company, Rangespan, which they sold to Google in 2014. So treat what follows with whatever pinch of salt that deserves. Phoebe calls itself "the immune system for your software" - swarms of agents that read live telemetry, logs, traces and code commits, work out what is going wrong before it takes a system down, and generate preemptive fixes. It plugs into your existing observability stack read-only. Early customers are Trainline and PPRO.
Phoebe is not quite what I had in mind. Trainline and PPRO are not weekend builders; the pitch is still "immune system for production software at a company that has production software." But it is the closest thing I have seen to the right shape, and the Stripe DNA makes it worth watching. If Phoebe, or something like it, can work its way down-market from enterprise incident response to per-tool stewardship - something cheap enough to run quietly over one weekend project without demanding a contract negotiation - then the picture changes.
Because what I still have not seen is the small-codebase version. A steward for a single custom tool, the thing that watches Cloud Canary's own logs, notices regressions, reads the error trail, proposes a patch, opens a pull request and closes the loop. Cheap enough to run per-tool, not per-fleet. Happy to sit there quietly for a year looking after three or four of my weekend projects and a handful of internal tools, without demanding an enterprise contract.
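Nobody sells that steward yet, so the sketch below is entirely hypothetical: just the watch-and-notice half of the loop, before any patch or pull request. Spotting a regression in a tool's own error rate is the easy, codable part; proposing the fix is where the agent would earn its keep.

```python
def error_rate(lines: list[str]) -> float:
    """Fraction of log lines that are errors."""
    if not lines:
        return 0.0
    return sum(1 for l in lines if "ERROR" in l) / len(lines)

def has_regressed(baseline: list[str], recent: list[str],
                  ratio: float = 2.0) -> bool:
    """Flag when the recent error rate is `ratio`x the baseline,
    or newly nonzero against a clean baseline."""
    base, now = error_rate(baseline), error_rate(recent)
    if base == 0.0:
        return now > 0.0
    return now / base >= ratio
```

Everything downstream of that boolean - reading the error trail, drafting the patch, opening the pull request - is the part that still needs a product, not a weekend.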
If that product shows up, the total cost of ownership of a custom tool collapses, and the thin-wrapper shape gets very much wider. If it does not, then the maintenance-free monthly fee of a competent SaaS product holds value for longer than the generation-cost argument alone suggests, because the fee is not just buying features. It is buying someone else's willingness to care about the thing on Tuesday afternoon when it breaks.
I would very much like someone to build this. If you are, or if you think Phoebe is going to grow sideways into this space, I would like to hear about it.
🧭 The long obedience, honestly held
I do not want to finish this piece with a tidy answer, because I do not have one.
What I think I can say honestly is this. SaaS is not dying uniformly. The thin-wrapper shape is already being hollowed out, and the Fathom and Datadog replacements in my own life are small pieces of evidence for a much bigger pattern. The two foundational shapes, commercial and regulated, are safer than ever - the commercial ones because the rest of us are finally getting out of their way and letting them do the thing only they can do, the regulated ones because it was never up for grabs in the first place. The adversarial shape looks safe for the same reason a castle looks safe: not because it cannot be taken, but because taking it is a full-time job. The collaboration shape is protected by its network but exposed to fashion, and I would not bet on any particular name in that shape being the leader in three years even if the shape itself survives. The deep-expertise and integration shapes sit in the middle, and their fate will turn on whether enough builders develop the taste and discipline to direct AI well enough to cut into them meaningfully.
What I also think I can say is this. Software is not free, and pretending it is going to be is the mistake I see most often. Software is the sum of your token spend plus your human insight. The tokens have become astonishingly cheap. The insight is the scarce half, and it is not getting cheaper. If anything, the gap between a local maximum and a global maximum is widening, because the tools that find local maxima for you are now so good that it is easy to mistake one for the other.
Whether the discipline that closes that gap is the discipline of not starting or the discipline of not finishing, I am still genuinely not sure. Both, probably, and the ratio will be different for every project I take on this year. What I am sure of is that the direction itself - the long obedience in the same direction, the willingness to keep asking "is this actually what we want, or just what we could easily make?" - has not become less valuable because AI arrived. It has become more valuable, because without it, what AI mostly does is produce a large quantity of plausible software that nobody actually wanted.
The sum of all tokens is not a total. It is a term in a bigger equation. The rest of the equation is still us.
James Webster is the founder of sheepCRM and director of Croftsware. This piece is a companion to The Long Obedience and The Clean Interface.