A human SWE can use an LLM to refactor and reduce some of the debt just as easily. I think fundamentally, the possible rate of new code and new technical debt introduced by LLMs is much higher than that of a human SWE. Left unchecked, an LLM just keeps going, whereas a human still needs sleep, and you can't add more humans by adding more compute.
There's an interesting aspect to the LLM debt being taken on though in that I'm sure some are taking it on now in the bet/hopes that further advancements in LLMs will make it more easily addressable in the future before it is a real problem.
All code is technical debt though. We can't spend infinite hours finding the absolute minimum of technical debt introduced for a change, so it is just a matter of finding the right balance. That balance is highly dependent on a huge number of factors: how core is the system, what is the system used for, what stage of development is the system in, etc.
I spend about half my day working on LLM-generated code and half my day working on non-LLM-generated code, some written by senior devs, some written by juniors.
The LLM-generated code is by far the worst technical debt. And a fair bit of that time is spent debugging subtle issues where it doesn't quite do what was prompted.
My gut says that is not a property of LLM evangelists, but a property of current internet culture in general. People with strong, divisive, and engaging opinions seem to do well (by some definition of well) online.
It's weird how some people seem to treat using an LLM as part of their personality in a borderline cult-like way. So someone saying they don't use it or don't find it useful triggers an anger response in them.
That is not novel - see language/framework choice, OS (or even distro) preferences, editor wars, indentation. People develop strong opinions about tools, technology, and techniques regardless of domain. LLM maximalists just have the unfortunate capability to generate infinite content about their specific shiny thing.
This. For every absurd LLM cheerleader, there’s a corresponding LLM minimalist who trots out the “stochastic parrot” line at every possible occasion along with the fact that they do CrossFit and don’t own a TV.
I mean, isn't driving the business forward really what matters (outside of academia, open source, and other such endeavors)? We live in a hypercompetitive market. All else being equal, if company A can produce "millions of lines of slop", constantly living on the knife-edge of disaster but not falling over it, they will beat company B that artificially slows itself down. That holds up until the point company A implodes, but that's not necessarily a given if pre-LLM companies are any indication.
Huh? Where did I say that's what I like? I'm just trying to discuss for discussion's sake. Personally, I want a world that rewards the people who put their thought, care, and craftsmanship into something more than those that don't. In order to live in that world, I think we need to discuss the parts (maybe the whole) that don't and why that might be.
"I find LLMs useful as a sort of digital clerk - searching the web for me, finding documentation, looking up algorithms. I even find them useful in a limited coding capacity; with a small context and clear guidelines."
I am curious why the author doesn't think this saves them time (i.e. makes them more productive).
I never had terribly high output as a programmer. I certainly think LLMs have helped increase the amount of code that I can write, net total, in a year. Not to superhuman levels or even super-me levels, just me++.
But I think the total time I spend producing code has gone down to a fraction of what it was, which has given me more time to think about what my code is meant to solve.
I wonder about two things:
1. maybe added productivity isn't going to be found in total code produced, because there is a limit on how much useful code can be produced that is based on external factors
2. do some devs look at the output of an LLM and "get the ick" because they didn't write it and LLM-code is often more verbose and "ugly", even though it may work? (this is a total supposition and not an accusation in any way. I also understand that poorly thought out, overly verbose code comes with problems over time)
The first of those is about taste, and it's real, and engineers with bad taste write unstable buggy systems.
The second of those is about priority. If all you want is functional code, any old thing will do. That's what I do for one-off scripts. But if you plan to support the code at 2am when exposed to production requests on the internet, you need to understand it, which is about legibility and coherence.
I hope you do have taste, and I hope you value more than simple "it works" tests. But it might be worth looking there for why some struggle with LLM output.
For what it's worth, I use coding agents all the time, but almost never accept their output verbatim outside of boilerplate code.
See, for some, "over time" means "the next guy's problem". That was true before LLMs, of course. And that's even the prevailing school of thought in most organizations. And to some extent it is correct to accept some tech debt because otherwise you'll never get anything done.
For those who have been around a while, dealing with the "over time" of yesteryear is a daily occurrence. So naturally they are more averse to it. And LLMs seem to dramatically shorten the duration of "over time".
It seems you treat LoC as a measure of productivity. This would answer your question as to why the author does not find it makes them more productive. If total output increases, but quality decreases (which in terms of code means more bugs) then has productivity increased or has it stayed the same?
To answer my own question, if you can pump out features faster but turn around and spend more time on bugs than you did previously then your productivity is likely net neutral.
There is a reason LoC as a measure of productivity has been shunned by the industry for many, many years.
I didn't mean to imply LoC as a measurement of productivity. What I really mean is more "amount of useful code produced to a level the human-using-the-llm determines to be useful".
To try and give an example, say that you want to make a module that transforms some data and you ask the LLM to do it. It generates a module with tons of single-layer if-else branches with a huge LoC. Maybe one human dev looks at it and says, "great this solves my problem and the LoC and verbosity isn't an issue even though it is ugly". Maybe the second looks at it and says, "there's definitely some abstraction I can find to make this easier to understand and build on top of."
Depending on the scenario and context, either of them could be correct.
LoC is a terrible metric for comparing productivity of different developers, even before you get to Goodhart's Law.
OTOH, for a given developer to implement a given feature in a given system, at the end of the day, some amount of code has to be written.
If a particular developer finds that AI lets him write code comparable to what he would have written, in lieu of the code he would have written, but faster than he can do it alone, then looking at lines written might actually be meaningful, just in that context.
I also feel like it makes me more productive but measuring software engineering productivity is famously difficult. If there was an easy way to measure it, managers at bigco would have employed it with abandon years ago.
We've built a new auth platform with some new identity primitives and capability-style tokens using biscuits.
Right now, I'm trying to figure out ways to apply it and am looking into offering integrations that add extremely fine-grained access control to systems that wouldn't have it otherwise. So adding a fine-grained access layer in front of stuff like backend-for-frontend (BFF) systems, brownfield stuff with poor auth, or even OAuth stuff that just has really coarse scopes.
Are there any integrations out there that people want but the access control is bad for them? I'll build one for you!
I wonder if this makes room in the market for some simpler device for payments. Something like a wearable that you can tap-to-pay and has the signed software attenuation but nothing else so you can't be tracked using GPS.
Heh yeah, my comment does kinda scream credit card. What I really mean is something programmable for narrow use-cases like multiple forms of payments, transit, or other stuff like building access.
Long ago we used to have ‘mini’ credit cards. You could get a two-thirds size magstripe card from some major banks that’d go right on your keychain. Discover had a cute little bean keychain with a flip-out magstripe card (the Discover2Go) as well.
At the same time there was also the Exxon-Mobil Speedpass RFID fob, and I remember there being a huge discussion about “the battle of the keychain” and whose payment instrument would win being on your keys to be used the most alongside your loyalty cards.
This will be the answer as we move away from phones as screens. Smart watches have slowly edged in, but I foresee some 'no screen' device being the answer to payments, access control, etc.
That exists. It's called FeliCa, and it's used all over Japan: train passes, vending machines, convenience stores, many restaurants. It's built into the iPhone and a few Androids.
Note that the payments are tied to a card/chip, but you can (at the moment) buy a new card with no ID or registration required.
Nice. We had this in the 90s in Holland. It was called Chipknip. (Knip is old slang for wallet).
It was really like digital cash, the money was loaded onto a chip. So if you lost it you lost all the money. There was no pin code either, just like a real wallet. Unfortunately it was not really anonymous because the Dutch government are really into surveillance.
It didn't really last very long, it was only popular for parking machines. In those days 2G was expensive so validating transactions online was rare.
You are very on base. In fact, there is a deep conflict that needs to be solved: the non-determinism is the feature of an agent. Something that can "think" for itself and act. If you force agents to be deterministic, don't you just have a slow workflow at that point?
Driving a car that old puts yourself and others on the road at greater risk due to the lack of safety features compared to a modern car. One could argue that being able to afford a new car and not buying one in order to extol other virtues is neglecting your own and the communal good.
If you’re afraid to carefully drive a high-quality, well-maintained older car that was designed from the ground up with safety and quality at the absolute forefront, say an 80s Mercedes or Volvo, you would benefit from relaxing a bit and being willing to take slightly more risk in life.
Besides, I am not wholly convinced that improved safety tech is a replacement for the type of safety-first engineering used in every tiny detail of those old cars, which mitigates certain types of accidents and injuries that won’t be addressed in crash testing.
Back-up cameras are really important for kids who can't be seen in a rearview mirror. Those can be retrofitted into an older car, but after having a kid I can see why these became mandatory.
A well designed car and proper driving technique make a backup camera unnecessary.
Many old cars have excellent rearward visibility without needing any camera. No camera will compare to a first generation Porsche Boxster with the top down, for example, where you can directly see behind you by looking back. Volvo wagons are great like that also.
I also, as a rule, never back anywhere that I haven’t seen directly just a few seconds before. I always back into parking places so I can see them facing forwards and don't have to back up when starting out, and if I do need to back up when starting out I walk behind the car and look around first and then immediately get in and back up.
How is a toddler going to get behind my car before I can get in it that I did not notice was nearby and start visually tracking from standing there? How is a backup camera going to help when I backed into the spot and am now pulling out forwards? That’s just not a realistic concern. Also, backup cameras cannot see much closer to the wheel than those cars I mentioned with good visibility.
Tech really won’t help you here: safe driving requires looking where your vehicle is going with your own eyes. The field of view of a backup camera is insufficient. Even if you have one, it’s usually better to be looking directly behind you and not use it. I see cars with backup cameras and sonar hit each other in parking lots all the time, because the drivers thought the camera was a replacement for looking and situational awareness.
No backup camera will let you observe the nail the wind blew close to your tire, ready to puncture it as soon as you move the car. Perhaps it is best if I just stay home.
I have not yet flattened a kid with my car, but I suppose there's still time. Also backup cameras are very important for today's vehicles which are gigantic monsters compared with cars of the 90s. My car is quite low to the ground.
Also, show me the stats on how many toddlers are pancaked by lack of backup cameras each year per capita. That will inform me about how truly "important" this problem supposedly is.
Safety features like tracking where you are, and requiring a subscription for seat heating?
I just think the everything-connected-to-the-cloud approach sucks, but I'm a communal danger now.
I was driving a 30-year-old car for a while (Miata). I'd say I was pretty low risk, as you tend to go slowly in classic ones so things don't blow up. Also the small size reduces the risk to other road users compared to driving a massive SUV or some such.
It's a full identity and authorization platform targeted for service-to-service use cases. But my focus the last couple months has been to make provisioning identity super easy, and I think I've done that (at least compared to something like SPIRE).
So if anybody has CI/CD pipelines, AI agents, edge-functions, or multi-cloud workloads they want to give auditable identity, I can help!