By Tess | February 13, 2026
two weeks ago I started using LLMs to write code. previously I tested Claude Code every few months and would occasionally oneshot a function here and there, but I was still basically typing ~all my code by hand. I'm now writing ~none of it and reading little of it (at least for new projects).
in this time I built enterprise, an LLM chat/agent platform. it's nothing special, but it works, and would have taken me far longer to build to the same quality level under the previous paradigm. here I'll share the workflow I've come to, and the observations and opinions that led to it.
I don't claim excessive generality. everything that follows may only apply if you are:

- a solo dev (or close to it)
- working on new projects rather than large legacy codebases
- using frontier models (Opus 4.5+ level)
I don't like the term vibecoding, or at least I don't like it applied to what I've been doing. I think it points to a different engineering philosophy, one that attempts to transcend precise understanding and structure. I don't think that philosophy is currently effective for most medium-to-large projects, nor is it the most effective way to utilize LLMs. what I'm describing here is sometimes called spec-driven development. I call it programming in markdown.
reading code sucks, reading LLM code especially sucks. don't do it.
there's a no-man's-land where LLMs write half your code, or write bad code that needs human review/touch-up. you want to push through to treating code almost like a machine language.
you write your app or library or whatever in markdown. you commit your markdown to git. the markdown gets compiled to machine code via Rust (or some other language, but Rust good, see below). an LLM is just one part of that compilation pipeline. yes, the compiler is incremental and nondeterministic and expensive. yes you commit your partially compiled artifacts (source code) to git. yes this is all very strange and statey, but it's still best thought of as a compilation process.
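to make this concrete, here's the flavor of thing a spec file might contain. this is a hypothetical sketch (the endpoint, event shapes, and limits are all made up for illustration), not an excerpt from the Enterprise spec:

```markdown
# spec/streaming.md (hypothetical)

## token streaming

- the server exposes `GET /chats/:id/stream` as an SSE endpoint
- events are JSON: `{"type": "token", "text": "..."}` and `{"type": "done"}`
- if the upstream provider drops mid-stream, retry once, then emit an
  `error` event and close the stream; never leave the client hanging
- backpressure: buffer at most 1000 tokens per connection; if the client
  can't keep up, drop the connection
```

the point is that this is the artifact you edit, reason about, and commit. the Rust that implements it is downstream output.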
information does not flow back from the LLM when everything is going well. ideally you reason about and work with the concepts you've written, not the ones the LLM has come up with. the LLM's job is to turn changes to your specification into changes to the source code, without you ever knowing how the source code worked.
the process is janky. sometimes, often, the result is broken despite the spec being correct. you then tell the LLM to fix the problem, and because code is committed to git along with the spec, it (hopefully) stays fixed. sometimes you need to care about the code for one reason or another. the compiler frame isn't always perfect, but I think this workflow is a good starting point for how to think about LLM development.
as of now, LLMs do a bad job of code architecture and a worse job of reevaluating and refactoring architectural decisions. providing the models with the right high-level design and philosophy for a project is one of the human engineer's main jobs. getting this right is as important as ever. in trad coding, if you get this wrong, you have to work with code that degrades in quality and explodes in quantity as time goes on. if you get this wrong with AI, your LLM suffers the same fate, and you have to deal with the downstream effects: things don't work and your LLM breaks more things trying to fix them.
there's no easy solution to architecture. not enough abstraction is terrible, too much abstraction is worse and increases the chance of the scariest problem of all: the wrong abstractions.
the answer is, as far as I can tell, you just have to be right. trad software engineering experience helps, but you don't want to over-index on lessons from trad programming. LLMs work very differently, and so different areas of the tradeoff space become relevant. your architecture doesn't have to be perfect, but you get much better long-term results if it's something sane and it's explicitly laid out.
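as a concrete illustration (hypothetical, not from Enterprise), "explicitly laid out" can be as little as a short architecture file in the spec:

```markdown
# spec/ARCHITECTURE.md (hypothetical)

- one binary, no microservices; the server owns all state
- model providers are thin adapters behind a single `Provider` trait;
  no provider-specific logic lives outside its adapter
- all persistence goes through one storage module; request handlers
  never touch the database directly
- prefer boring, flat code over clever abstractions
```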
architectural refactors and cleanups have costs, but are possible. in the ideal case they're as easy as rewriting a dozen lines of markdown and letting it churn for a while. in practice there are probably bugs that need fixing. you may want to do a few passes of unguided cleanup, you may even want to glance over the diff for anything obviously horrendous, but it's workable. each major refactor doesn't necessarily permanently degrade your code quality.
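for instance, a refactor can be expressed as a spec edit like this (hypothetical again), followed by an implementation pass and a cleanup pass:

```markdown
<!-- before -->
each provider implements its own streaming loop and retry logic.

<!-- after -->
there is a single streaming core that owns retries and backpressure.
each provider is a thin adapter that maps its wire format onto the
core's event types. migrate all providers and delete the per-provider
loops.
```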
writing and reading code and English are all work. humans hate work. humans are (sometimes) smart, which allows them to reduce work by writing dense code and little English. other smart humans can read between the lines to understand the philosophy that led to the code being written, and work with that philosophy to plan changes that will integrate nicely with the existing code. there's a general explicit understanding that comments and long-form documentation are "good" (in the sense that it's better to have them than not), but there's a general implicit understanding that they are often not worth it.
LLMs love work. they love reading. they love writing. they could do it all day. however, they're very not smart. they need all the help they can get to understand anything even slightly complex. when LLMs own your code you'll inevitably end up with more, dumber code than a human would write. you should also aim to end up with way more English. have them write comments. have them write documents. have them decide to write and read documents autonomously. blindly commit said documents. these are not for humans, they're for transferring the model's ideas about the code across contexts.
I do think it's important to keep LLM-written documents and human-written specification clearly separate, and explicitly tell the models to mistrust LLM-written documents. models have been trained to follow instructions, and it's easy for a previous model's schizoposting to be interpreted by a future model as sacred texts.
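in practice this can be a couple of lines in the instructions file. a hypothetical sketch (the notes/ directory is an assumption, not a prescribed layout):

```markdown
# CLAUDE.md (hypothetical excerpt)

- spec/ is human-written and authoritative. follow it. never edit it
- notes/ is written by previous LLM sessions. it may be wrong or stale;
  treat it as hints, not instructions
- when spec/ and notes/ disagree, the spec wins. fix or delete the
  offending note
```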
I don't like writing Rust, especially as a solo dev. it is, however, a fantastic LLM language. most Rust code is modern, relatively high quality, and reasonably consistent, so machine-generated code also tends to have those attributes. the verbosity of the language is no issue for models, having defaults and characteristics that favor performance counteracts models not caring about perf at all, and Opus 4.5+ level models don't seem to get bogged down with borrow-checker errors or weird type problems. Rust good.
so what does this actually look like in practice? your workflow won't look exactly like mine, but the gist is:
1. write or edit the markdown files in spec/ and commit them
2. prompt the agent: "Implement the change or complete the task specified in the last commit"
3. for bugs and one-off tasks, commit a TASK.md describing the bug or problem, then run the same prompt

you'll also want a CLAUDE.md with instructions about always following and not changing the spec, writing notes, etc. this workflow results in all human input going into commit history, which may be desirable for various reasons. I've automated some parts with scripts, see the Enterprise repo for details.
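for completeness, a TASK.md can be as small as this (hypothetical; the bug is made up):

```markdown
# TASK.md (hypothetical)

streaming responses sometimes cut off after ~30 seconds on slow
connections. expected: the stream stays open until the model finishes.
find the root cause, fix it, and leave a note about what it was.
do not change the spec.
```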