The Clean Interface
I built a solar optimisation system over a weekend. It has two halves.
One half is mathematically sophisticated. A 300-variable linear program, solved in a small corner of AWS, picking the cheapest half-hour slots in which to charge and discharge a household battery across a twenty-four-hour window. It solves in milliseconds. It has never failed.
The other half, on paper, is simpler. It logs into a web portal and clicks a button.
It has broken three times.
That is the inversion nobody warned me about, and it is the most useful thing I learned in the whole weekend. The sophisticated part of modern software is robust now. The simple part is fragile. If you think AI has flipped the difficulty graph of software work, you are right. You are just probably wrong about which way.
🧮 The half that got easy
A bit of setup first, because it is useful later.
I am on Octopus Agile, a UK electricity tariff where the unit rate changes every half hour, tracking the wholesale market. Rates are published at 4pm for the following day. Prices swing from three pence to fifty pence per kilowatt hour, and sometimes go negative, which is a small pleasure I never quite get used to. My setup is eight kilowatts of solar on a south-facing roof in Witney, ten kilowatt hours of Enphase battery storage, and a Tesla on a Pod Point home charger. The question, forty-eight times a day, is what the battery and the car should be doing against what the sun is doing and what the household is actually consuming.
I described the problem to an AI assistant and asked the obvious question: how do you find the cheapest plan? The answer came back in a single line. This is a linear programming problem. Not a library recommendation. Not a code snippet. A whole problem-framing I did not know existed.
The distance between "I have a scheduling problem" and "this is a well-studied class of optimisation" is the distance I would once have crossed by reading papers for a fortnight. It collapsed into a single exchange. The AI knew the solver I needed (CVXPY), the solar forecast API I needed (Solcast), the local gateway I needed (Enphase Envoy). Each integration was an hour's work. The work that used to be figuring out what kind of problem this is was already done. All that was left was writing it down.
By Sunday afternoon, the LP was running in AWS, re-solving the next twenty-four hours every time the forecast or the household numbers shifted. It never breaks. It is, in the most literal sense, backed by decades of mathematical proof. This is the bit everyone expects to be hard, and it was not.
🔒 The half that didn't
The problem is that you cannot tell the Enphase battery what to do.
Enphase does not expose a public write API for the battery profile. You can read things. You can see the state of charge and the production and the import and the export. You cannot set the mode. You cannot say, in code, "charge from the grid between 2am and 5am, then stop." Their proprietary web UI is the only way, and the proprietary web UI is not designed to be driven by anyone other than a human with a mouse.
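To make the asymmetry concrete: the read half is a GET and a dictionary lookup. A sketch of what that looks like, with a payload modelled loosely on the shape of the gateway's read-only data; the field names here are illustrative assumptions, not a documented schema:

```python
import json

# Illustrative payload only: modelled on the kind of read-only production and
# storage data the gateway exposes. Field names are assumptions, not a
# documented Enphase schema.
SAMPLE = json.loads("""{
  "production": [{"type": "inverters", "wNow": 3400}],
  "storage":    [{"percentFull": 62, "wNow": -1200}]
}""")

def battery_state(payload: dict) -> tuple[int, int]:
    """Everything the API lets you do: look at the battery."""
    bat = payload["storage"][0]
    return bat["percentFull"], bat["wNow"]

soc, watts = battery_state(SAMPLE)
print(soc, watts)  # 62 -1200
```

The write half has no equivalent sketch, because there is nothing to call.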
I asked the AI for alternatives. It suggested the local Envoy gateway. I tried that, and the write endpoints are not exposed. It suggested the cloud REST API. I tried that, and the battery profile endpoints are not part of the documented surface. It suggested the Home Assistant integration, and MQTT interception on the local network, and a few other things I have forgotten. Each suggestion was plausible. Each one was wrong.
It is important to be honest about why. The AI was not hallucinating. Every suggestion was a reasonable place to look. But the AI could not test any of its suggestions against a piece of hardware it had never touched, and it had no way of knowing which of the theoretically possible interfaces Enphase had chosen to lock down. The gap between what an API could do and what the manufacturer lets it do is not in any training corpus. It is a product decision, and product decisions do not leak into documentation. They leak into broken afternoons.
So I did the unreasonable thing. I stood up a headless browser in an AWS container and drove it through the Enphase web portal with Playwright. Click login. Dismiss the cookie-consent overlay. Find the battery profile page. Toggle the charge-from-grid switch, which is not a checkbox but a React component wrapper that has to be clicked at exact pixel coordinates. Set the schedule windows through three stacked dropdowns that only reveal their options when scrolled into view. Click the Apply button, which may or may not be covered by an overlay, so there are four fallback strategies including one that just sets the button's z-index to ninety-nine thousand nine hundred and ninety-nine and injects the click directly in JavaScript.
There is also a disclaimer checkbox rendered at approximately one pixel wide.
None of this is clever software. It is coordinate geometry and z-index hacks, wrapped around a fragile assumption that Enphase will not redesign their cookie consent provider next Tuesday. It breaks. I have fixed it three times. Each fix takes an hour. The correct trade-off is to continue fixing it, because the support team is me, and the alternative is paying a hundred pounds a month for the wrong product.
🎯 The clean interface is everything
Here is the generalisation, and it is the bit worth carrying out of the weekend.
AI is extraordinary at problems that have clean interfaces. A linear programming solver has a clean interface. You describe the variables, the constraints and the objective function, and the solver returns a global optimum. There is no disagreement about what the inputs are. There is no ambiguity about what a correct answer looks like. AI, in that world, is a cartographer who has already walked the whole continent and can point at any coordinate you name.
AI is considerably worse at problems without clean interfaces. A closed hardware ecosystem does not have a clean interface. A vendor's web UI does not have a clean interface. Anything that depends on what a human happened to build last Tuesday, for reasons that made sense to them at the time, does not have a clean interface. AI in that world is a very well-read tourist. Happy to recommend places it has not been. Unable to tell you whether the restaurant is still open.
This is what I meant in The Sum of All Tokens when I said that AI generation produces a local maximum and needs human direction to reach the global one. The LP is the global maximum for a well-specified problem. A solver plus the right framing is enough. But the moment the problem stops being well-specified - the moment the interface gets messy - the AI falls back to producing plausible options it cannot test, and the human work returns. Not clever work. Judging work. Picking between the four suggestions, noticing that none of them can actually reach the hardware, and walking the long way round via a browser nobody should ever need to automate.
Most people I know who are excited about AI-assisted development have the mental model the wrong way round. They imagine AI closes the hard problems and leaves them doing the easy glue. My experience is the exact reverse. The hard problems are already closed, because the hard problems have clean interfaces. It is the glue that is now where the difficulty sits, and the glue is precisely where AI can only guess.
🧭 Taste moved, it did not disappear
I wrote in The Long Obedience that taste is the thing AI cannot be trusted with. Not yet. I still think that. What the weekend clarified, though, is that taste has not stopped being required. It has moved.
Taste used to live in the sophisticated work. Choosing the right architecture. Knowing which pattern to reach for. Deciding, as a seasoned engineer, that a particular kind of problem wanted a particular kind of solution. That taste is now, in effect, installed in the model. If I ask an AI what class of problem I have, it will tell me, and it will be right. The cartographer has absorbed all the maps.
Where taste is still required is at the messy edges. Which of the four ways to reach the Enphase hardware is going to work, given that none of them look from here like they will. Whether the savings number the system is claiming is honest, or whether I am deceiving myself. When to walk away from an approach that is not converging and pick a different one. Whether a one-pixel checkbox is a sign that you should not be using this system at all, or a sign that you should write a fallback click strategy and get on with it.
That work is smaller, sometimes, than the work AI has just done for you. It is also the work that decides whether your weekend project actually lives. You can tell the difference, looking at the built system, between the parts where taste was brought to bear and the parts where it was not. The LP is beautiful because someone cared about it. The Playwright code is ugly because someone had to write it anyway. Both are shaped by choices, and those choices were mine, and without them the whole thing would have fallen over.
AI has not removed the need for a person who can judge. It has moved the judgement from the sophisticated parts of the stack to the edges where the interfaces are dirty, and it has made that judgement cheaper to apply because everything else has become faster. That is a genuinely good deal. It is just not the deal the slogans suggest.
👤 Software for one
The last thing worth saying about the weekend is that it produced a piece of software for one person. Me. There is no second user. There will never be a second user. The dashboard is ugly. The error handling is rude. The Playwright code would make a professional engineer unwell. And the system works.
A decade ago this would not have been a reasonable thing to build. The fixed cost of standing up a scheduling system - learning the maths, picking the stack, wiring the integrations, writing the dashboard - would have been enormous relative to a saving of a hundred pounds or so a month. Generic SaaS would have been the only sensible answer, and the generic SaaS would not have existed for my exact tariff, my exact hardware and my exact house, so I would have lived with a rougher approximation.
That calculation has changed. A weekend, a pound or two a month of AWS, and a class of optimisation I did not previously know existed are enough to get a system that genuinely fits. The generic tool would have cost more and delivered less. This is the pattern I suspect we will see many times over the next few years, not just for hobby projects but for serious ones. Small teams solving their own specific problems, at a level of fit that used to be possible only in the very largest organisations with the very biggest budgets.
It is a quietly important shift, and it is what I was pointing at in The Sum of All Tokens when I argued that the thin-wrapper shape of SaaS is the one most exposed. This weekend is a tiny piece of evidence. One household. One tariff. One specific bit of hardware. But multiplied across every small organisation with some data and some domain knowledge, it adds up.
⏳ The LP will still work in five years
The part of this system I am most proud of, in the way engineers care about these things, is the LP. It is clean, provably correct, mathematically elegant, and will still work in five years even if I never touch it again. The part I am least proud of is the Playwright code. It is held together with coordinate geometry and optimism. It will break within months of the next Enphase UI refresh, and I will fix it in an hour when it does, because that is the deal.
The piece of this I am most proud of is going to outlast me. The piece I am least proud of is the piece most likely to need me again by Christmas. This is the inversion, and it is fine. It is, in fact, the shape of where AI-assisted building has put us, and it explains both the excitement and the quiet frustration that so many people seem to feel about the tools at the same time. The bits they are astonished by really are astonishing. The bits they are defeated by really are still defeating. And which bit is which is precisely the opposite of what anyone expected.
Knowing where the clean interfaces are, and where they are not, is the new skill. The sophistication is robust. The simplicity is fragile. That is the weekend, and that is the lesson.