When AI promises speed but delivers debugging hell (nsavage.substack.com)
200 points by nsavage 2 days ago | 252 comments





Coding is trying to order bytes into doing arbitrary stuff that is useful because of some transient conjunction of factors in the real world.

We have developed programming languages because coding in machine language is horrible, and over the decades we've refined them into tools people can use fluently, thinking directly in code when they have to make a computer system behave in a certain way.

Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.


The hype is around having AI replace your typing, that is, to code for you.

The hype should not be around replacing the typing, but around assisting your thoughts.

When you code, there's the dialogue in your brain that thinks about the code and forms the questions you know you must answer before you can transition to the dialogue with the machine, that is, typing code.

And in this first part LLMs can be extremely useful; it may come to the point where you select a line and explain your intent, and while the AI retrieves documentation and possible solutions, you reason about the problem and then pick and choose from what it has collected for you.

> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

The question is then if assistants are "sitting next to you", like a secretary and a mentor, or if they are sitting between you and the editor, as the thing you need to control.

An assistant can be a really effective refinement in the programming process. Even so far that it ends up motivating you instead of you constantly getting demotivated by hitting the wall of "not another problem that I need to solve before I can really continue" (which happens all too often).


Personally, I haven't found LLMs to be helpful for the internal dialogue. Even with a lot of exposition, code samples, and documentation, they always provide either obvious solutions (store items in a vector!) or pointless modifications (use map and filter instead of reduce!), or they just make up APIs that don't exist.

I think it's really good on the 101-level academic side. Learning the basics of anything in a conversational manner can be massively helpful.

As soon as your situation exceeds textbook level, I've found them to always be a waste of my time, and nothing I've seen as of late makes me think they're trending in a direction to be helpful in this scenario.


> As soon as your situation exceeds textbook level, I’ve found them to always be a waste of my time

Doubly so. A very knowledgeable and helpful tutor would say ”you’re asking for advanced or detailed guidance. I don’t know the specifics, but I can point you to some places that you’d be able to find these answers on your own”.

What the AI does is continue its babbling confidently while being incredibly wrong and nonsensical, like a person suffering a stroke or concussion whose mannerisms are normal but who doesn't remember their name. It seems completely unable to judge its own knowledge (probably because it is).


Yes, it would have to be unable to reason about its own confidence levels, wouldn’t it? It produces content—and, as far as I understand, sort of simulates basic reasoning—by making predictions based on a huge corpus of text. The larger this corpus becomes, and the more sophisticated its method of analyzing it, the better it becomes at “reasoning about” the things described in its corpus.

But the question “How sure are you?” inherently refers to something—the LLM’s “mental state”, if such a thing can be said to exist—that isn’t referred to anywhere in the corpus. No improvement in the quality of the corpus or the power of the predictions made based on the corpus can have any impact on this problem.


I want LLMs to answer simple, but tedious questions that arise when I do the thinking. I want them to help me find relevant sections in multi-thousand page datasheets regardless of whether I happened to use the same synonym that documentation's author has used. I want them to remind me the meaning of a term that was defined 12 chapters ago without me having to context switch and look for it again. I want them to consolidate information spread over 6 PDFs that I need to look at to understand something.

I want them to be an interface between me and reliable resources. I want them to essentially facilitate ad-hoc fact retrieval without requiring me to master field-specific jargon first. I don't need them to do any thinking, that's my job. I don't want them to try to answer my questions, I want them to point me to resources that let me answer them and save me the time on searching in the process. You don't need to know where to look beforehand if you're a machine that can ingest whole libraries in seconds - so let me actually benefit from this power rather than provide me with a sketchy equivalent of a clueless intern trying to make a good impression and not realizing how tremendously bad they are at it.

I believe LLMs (or, more specifically, complex systems utilizing LLMs) can end up being incredible productivity boosters, but right now they're being so hopelessly misapplied that it will take a good while for them to get there. LLMs can already be somewhat helpful if you approach them carefully, but they're still far from life-changing - unless your life can be changed by reducing the amount of boilerplate you need to type to code, in which case I guess you're already happier.


Where I find coding assistants the most useful _is_ in writing code that I already want to write.

A la: I need to write this unit test, it has these checks, it validates these methods.

Or write a log message for me about what error was encountered here. Those are annoying to write out, but often the LLM has enough context that I just start to write and it completes it appropriately.

All of these are things I can easily do myself and are easy to validate for correctness, but writing them would consume my limited mental energy for the day.
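
For concreteness, here's a minimal sketch (in Python, with made-up names like parse_price) of the kind of rote test-and-logging code I mean - the checks are obvious and easy to review, just tedious to type:

    # Hypothetical example of the rote code described above; parse_price
    # and its behavior are invented for illustration only.
    import logging
    import pytest

    logger = logging.getLogger(__name__)


    def parse_price(raw: str) -> float:
        """Hypothetical function under test."""
        try:
            return round(float(raw.strip().lstrip("$")), 2)
        except ValueError:
            # The annoying-to-write log message about which error was hit.
            logger.error("Failed to parse price from input %r", raw)
            raise


    def test_parse_price_strips_currency_symbol():
        assert parse_price("$19.99") == 19.99


    def test_parse_price_rejects_garbage():
        with pytest.raises(ValueError):
            parse_price("not a number")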


Sounds like everyone has different use cases for LLMs, which makes sense; we need help with different things.

Personally, I try to get rid of writing boilerplate (which it sounds like you're talking about?) entirely; no one should write that.

Instead I get help from LLMs in areas where I'm not super great, like math. I know what results I want, but I'm not sure how to get them. So I set up a bunch of tests and the interface I want, then let the LLM figure out the implementation itself.
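
Roughly what that workflow looks like, as a hypothetical Python sketch - the interface and the test are mine, and only the body of fit_line would be left to the LLM:

    # Hypothetical illustration: I write the signature and the test below;
    # the implementation body is what I'd hand off to the LLM.
    from typing import Sequence, Tuple


    def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
        """Return (slope, intercept) of the least-squares line through the points."""
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
            (x - mean_x) ** 2 for x in xs
        )
        return slope, mean_y - slope * mean_x


    def test_fit_line_recovers_known_coefficients():
        xs = [0.0, 1.0, 2.0, 3.0]
        ys = [1.0, 3.0, 5.0, 7.0]  # exactly y = 2x + 1
        slope, intercept = fit_line(xs, ys)
        assert abs(slope - 2.0) < 1e-9
        assert abs(intercept - 1.0) < 1e-9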


I find it is good at writing about 90% of tests, either from requirements or existing code.

It is terrible once you get 3 or 4 tweaks into the main parts of the code base. Just like autocorrect, it degrades over time.


> Where I find coding assistants the most useful _is_ in writing code that I already want to write.

... for a very specific definition of "code that I already want to write". :-)

The code that I want to write often involves very novel (and sometimes quite mathematical) concepts, or code that is "low-level wizardry". Thus the AI, almost by definition, will commonly fail at these tasks.

Also, this kind of code often (though not always) has the property that "if you do something wrong, everything falls apart", i.e. it is the kind of problem with a low tolerance for any "AI hallucinations", meaning the code is either (nearly) perfectly right or not helpful. Debugging any wrong code (even if you have test cases available) will commonly take a very long time.


Not everyone is the same, but I have a different view of programming than what you describe. Most tasks involve thinking about the domain and what implementation techniques to use while trying to reduce the technical debt in the project. For the domain, I talk to people, rely on past experience (mine or others'), or do research. For implementation techniques I look at other people's code, read books, or ask someone more experienced. Both are heavily influenced by the context, aka what already exists in the project and the constraints that I have to deal with. I heavily distrust LLMs because they cannot assimilate the context like an experienced person and provide me direction based on experience. Why experience? Because the problem and the constraints always exist in the real world.

The hype in some areas is around replacing coders, which is a fantasy without orders of magnitude better systems.

Yes, that's exactly it. Many corporations firing coders because "AI can do that" and then discovering that AI can't.

It's offshoring 2.0. $developing_country is cheaper so let's hire them, and why is everything broken now?

This fantasy economy is based entirely on Numbers Must Go Up. Everyone seems to have forgotten the numbers are linked to real people in a physical world, and the real people and the physical world both have rules and consequences of their own which don't care about quarterly returns.


> This fantasy economy is based entirely on Numbers Must Go Up.

Everyone with any form of savings wants numbers to go up. Everyone who wants to retire wants numbers to go up. The financial industry is just doing what their customers want, and anyone with net-positive wealth is a customer.

The problem is they can't forever. The Earth is not infinitely large and there is not infinite demand. Even if we did settle the Moon or Mars, it probably wouldn't make a huge difference to the terrestrial economy because of the distances involved. We'd just have founded another economy out there that would follow its own growth and maturation curve.

I think this is why you see such an outright panic right now about birth rates. If people don't have more and more kids, number can't go up. I'm predicting -- to the extent that it's not already here -- the emergence of something I'm calling authoritarian natalism. The government will try to more or less force people to have kids, probably by taking away women's rights and/or taxing people who are childless. It won't work, but it will be attempted in some places.

Eventually the financial industry as we know it will collapse. It's inevitable. A major political question of the 21st century might be how much pain we inflict on ourselves in an attempt to save it.

I know a lot of people will cheer for that, but it will also likely mean the end of low to zero risk compound interest on savings and the end of retirement among other things. I don't expect retirement as an institution or widespread practice to survive much longer than 20-30 more years.


> It's offshoring 2.0

Doesn’t this analogy break down inasmuch as offshoring eventually worked?


T-shirts are made in Bangladesh while code is still being made by extremely well compensated professionals in developed countries. That's despite code being easier to transport internationally than clothes. It's hard to say that it's worked yet.

> while code is still being made by extremely well compensated professionals in developed countries

T-shirts are also being made in small print shops across America. As is code being written on the cheap in Poland, Brazil and India.


Sure, but I don't think the code being produced in the US comes from "small code shops" producing artisanal code. It's not about whether code is produced in other places; it's whether it's produced at high scale locally. I don't think it's a problem that Poland and India have programmers.

A lot of code is made by people in countries with relatively low incomes, eg eastern Europe. Most of the software in your car is probably written by an offshore team for example.

That probably explains why most people use Android Auto and Apple CarPlay!

Probably not for their engine control unit or their abs controller.

The problem is not with offshoring, per se but the mentality of the person doing the offshoring and why they are doing it.

Did it?

It did, e.g. Poland has a very good bang-for-buck ratio. Something like Bangladesh wouldn't work for IT, but more developed countries do.

The challenge with Poland is that, if there is going to be a WW3 soon, it looks likely to be on the immediate front lines. So it is relatively high risk from a business-disruption perspective.

I think there is a skill issue. Just like in any other pursuit, some people are going to be better at using AI productively. It is a tool. You are still responsible for the quality of the resulting code whatever the mix of human and tool generated.

Hard disagree.

An assistant should not help you think; any AI agent/tool should do what you want with a minimal amount of explanation.

The only way I accept the current hype is if I am able to type in "make a Twitter clone" and it does the implementation; I can run it, write "make it red, silver and yellow color themed", and it does just that. I am the one doing the thinking here - I don't care about technical details. That should be the state of the art.

I can write my own Twitter clone, and if I have to write prompt after prompt it is going to take me more time and more typing, so it is useless.

A person who cannot write their own Twitter clone is not going to prompt their way to a working, deployed Twitter clone.


This reminds me of something Dijkstra wrote almost 50 years ago in On the foolishness of "natural language programming" [1]:

> When all is said and told, the "naturalness" with which we use our native tongues boils down to the ease with which we can use them for making statements the nonsense of which is not obvious.

Although this is obviously not about LLMs, it's astonishing how many parallels can be drawn to today's usage of AI systems.

1: https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...


Wow, thanks for sharing this beautiful essay by a legend; it captures the essence of the LLM debate.

That is, the use of formal languages - although an evil (since they are not natural human languages) - is essential to avoid nonsense (i.e. hallucinations). While intuition/natural language is more imaginative, formalism (i.e. narrow interfaces) is a forcing function to make things work.

According to Dijkstra, relying on natural languages would have regressed civilization by thousands of years (because of the nonsense and imprecision)! So expect thousands of years of LLM hell if we adopt them to replace our formal languages.

Indeed, math and other symbolism/formalism is a crowning achievement of humans.


I use AI (Sonnet) to write relatively complex SQL queries, but they always need a review before implementation. It's sometimes brilliant, and at the same time it can suggest wrapping a single, atomic query in a transaction for no reason. Just asking "why?" will result in profuse apologies and thank-yous, but it never explains what caused the mistake in the first place.
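
For illustration (my own made-up example, using Python's sqlite3 just to make it concrete), this is the kind of needless suggestion I mean - a single UPDATE is already atomic, so the explicit transaction around it adds nothing:

    # Hypothetical example; the accounts table and values are invented.
    import sqlite3

    conn = sqlite3.connect("app.db", isolation_level=None)  # plain autocommit mode
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (id INTEGER PRIMARY KEY, status TEXT)"
    )

    # What the assistant suggests: a transaction around one atomic statement.
    conn.execute("BEGIN")
    conn.execute("UPDATE accounts SET status = 'active' WHERE id = ?", (42,))
    conn.execute("COMMIT")

    # The same thing without the ceremony; the single statement is atomic on its own.
    conn.execute("UPDATE accounts SET status = 'active' WHERE id = ?", (42,))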

Trusting an AI to code an app from start to finish seems crazy to me but hey... if some people can pull it off, good for them I guess.


That's the type of thing AI is great for! I find AI is pretty decent at generating data-related code with Pandas, and since I only rarely use Pandas it saves me a ton of time relearning everything.
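
As a hypothetical example of the "relearn it every time" Pandas code I have in mind (the data and column names are invented):

    # Made-up data; the point is the groupby/period idiom I always forget.
    import pandas as pd

    df = pd.DataFrame(
        {
            "order_date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-03"]),
            "amount": [120.0, 80.0, 45.5],
        }
    )

    # Monthly revenue totals.
    monthly_totals = (
        df.assign(month=df["order_date"].dt.to_period("M"))
          .groupby("month", as_index=False)["amount"]
          .sum()
          .sort_values("month")
    )
    print(monthly_totals)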

Where AI starts breaking down is in how to effectively incorporate a new feature into a complicated existing codebase. That is where we engineers can continue to hold an advantage.


This massively underestimates what current LLMs can do. Yesterday, I was able to create a 600-line script in 20 minutes or so that essentially sets up Cloudflare worker bindings (KV, Queues, Hyperdrive, etc.). The complexity is very low and debuggability is easy. Reading this infra code is fast. However, if I were to do this manually, it would have taken me a full day reading through the docs and trying the implementation back and forth for each binding I am connecting to.

Claude 3.5 did it on the first shot.


And 2 months later I get asked to debug code like yours when it doesn't work for a customer and have to spend days or weeks digging into your code before I notice the LLM took some shortcuts that work most of the time, but are ever so slightly broken in edge cases, followed by me having to rebuild it all from scratch.

I literally just spent a full week on such a project. Respectfully, fuck people who don't read the docs/spec.


Respectfully, I don't understand what your problem is. First, you don't have to debug any code; that's your personal choice. Second, whether LLMs are used or not is irrelevant. The original developer is the party who decides whether to check and double-check their work and whether to extensively verify what the code is doing (and maybe test it tightly). LLMs are not making this worse or better.

First:

> First, you don't have to debug any code. That's your personal choice.

To quote the Simpsons:

> Money can be exchanged for goods and services

Second:

> LLMs are not making this worse or better.

The bottleneck is never "writing code", it's "thinking". LLMs can solve one, but not the other. They allow you to produce more thought (and give you false confidence), but the code has less thought put into it, making it worse.


Maybe it's easier to verify a working program with the docs rather than try to build something from scratch? I agree it's a bad idea to fire off AI code without any sort of understanding though!!

Agree it's good for boilerplate, provided the thing you want to do is extremely basic / just setup. Once you need something slightly more complex it seems to break down rather quickly.

Claude 3.5 is pretty good at giving you the right hints though. If you aren't familiar with a library, it's definitely faster than grepping through docs. If you are an expert in a library, then it's pretty useless.


You still have to become a domain expert to debug it though?

Not when it's code that was only hard to write because you needed to know the right incantations to pipe data between different services.

Now you see the incantations that mostly work and the job of transforming it is easy.

Java's Bouncy Castle crypto library is a good example of this. The thing you're trying to do might be simple, but to do it, you might need to instantiate 8+ Java classes. It doesn't mean it's complex to read or hard to debug.


> The thing you're trying to do might be simple, but to do it, you might need to instantiate 8+ Java classes. It doesn't mean it's complex to read or hard to debug.

I’m skeptical that code that needs to instantiate eight separate classes will remain easy to debug in the general case.


LLMs give you a lot of false confidence, just because something looks right doesn't mean it is.

Especially with cryptography you should NEVER use LLMs. Read the docs, write down some notes, and make sure you properly understand everything before you use it. You need to really think it through before you end up leaking user data or worse.


> Only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive.

AppleScript might convince me of your argument.. ;-) But, seriously, we've been putting abstractions between "us and the bytes" ever since Fortran and COBOL appeared (and indeed, earlier). We can argue about the quality and expressiveness of those abstractions, and there are a lot of arguments against natural languages in this task, but the broad idea of putting things in between developers and machines is sound so it's worth continuing to explore IMHO.


All those layers are stable and deterministic. You can open them up and check, in most cases, how the upper layer calls the lower layer, should that be necessary.

If you have an LLM between you and the code, there is not even such a thing as ”source code”, only a history of prompts. You can't check your prompts into git and regenerate the same code later.

In fact, it’s more like the antithesis of the reproducible builds movement. It’s introducing a proprietary networked high latency chaos agent into the critical path.


I accept your points today, but I'm optimistic this problem will be resolved in the mid-term. I just feel some of the arguments smell similar to concerns levelled at the earliest compiler developers about the perils of abstraction. Now we gleefully stack layer upon layer, with most developers unable to grok more than a layer below their choice of abstraction (if they're good!).

I am extremely LLM-optimistic though and largely in favor of abstractions, so that fuels my viewpoint. I still remember my dad, an embedded developer in the 90s-00s, ranting about how many people were starting to use 'inefficient and unpredictable' C compilers instead of whatever assembly he was using. I reckon he'd be appalled to learn that now even assembly isn't always a reliable model of what's really happening on the CPU, thanks to microcode and optimizations.. ;-)


> I just feel some of the arguments smell similar to concerns levelled at the earliest compiler developers about the perils of abstraction

Like you, I am a sucker for abstraction (often to my detriment). I just don’t see any resemblance between an LLM layer and abstraction, aside from that they both introduce complexity. I guess that abstraction superficially ”hides” its inner workings but this is a misnomer, it’s never actually hidden, it’s only claimed that you don’t need to look behind the curtain (but you can, and then you can see exactly what’s going on).

> I'm optimistic this problem will be resolved in the mid-term

Which one? Stability? Assume you can get open source models running locally and with no floating point non-determinism. Now you have local stability, in some narrow sense, yes. For exactly the same version of the LLM compiler and the same input (all prompt history), you can reproduce the result. But this is not enough. If you roll a new version of the compiler (meaning new training data or algorithm), all previous responses are invalidated. So how would you maintain backwards compatibility?

With abstractions OTOH (like say C++), you don’t check in the generated assembly or binary. It’s enough to check in the C++ code, the rest is derived directly. When C++ releases their new foot-machinegun feature, you simply update your compiler and your old code works.

My point is that LLMs are not engineering products. They are closer to organic in the sense that they act in the moment, and by necessity they are non-stable with respect to small variations in input and training. I heard a good quote from a researcher at Anthropic who claimed we don't program the (neural) networks, we ”grow” them.


The thing is, going from a high-level scripting language to a natural-language interface is a massive jump. The difference between a python script and a hand-crafted executable is essentially nothing in comparison. I don’t think that a single technology is going to bridge that yawning gap overnight.

Plus, I feel like compilers are at least deterministic? AI is not repeatable or provable at all.

> I still remember my dad, an embedded developer in the 90s-00s, ranting about how many people were starting to use 'inefficient and unpredictable' C compilers than whatever assembly he was using

I remember reading that C compilers were buggy and generated lots of instructions. He may have had a point back then.


> It’s introducing a proprietary networked high latency chaos agent into the critical path.

Well (and amusingly) said. The same (or at least a very similar) problem exists at the other end of the pipeline, i.e., whenever a user has to use a natural-language interface to get software to do something they want. Are we really going to tell our AI assistants to take complex actions on our behalf in the real world, and then just sit back? Are we really going to do this when money is involved?


Sure, but these languages we speak to computers in are deliberately unambiguous and lack nuance. Natural language is both ambiguous and nuanced.

"Have a nice day!" can mean many things (an insult, or sincere, for example).


I accept your point and agree in principle! However, I think there's a lot of fertile ground to be researched and experimented with here. I'm mostly pushing back against the parent's assertion that "putting natural language encoding between [a developer] and the bytes" is inherently of no net benefit.

(I'd also argue many programming languages introduce ambiguity, are riddled with undefined behaviors, subject to 'dialects', and distorted by issues similar to those affecting natural languages. C++ perhaps most notoriously. This is something I'd rather debate in good humor in person, though, as I suspect this could be an entirely separate problem.. ;-))


I’m not sure why you need to do any research. Just look at how hard it is to drag software requirements from a business person. Or how hard it is to explain to a “pedantic asshole” how to make a sandwich (https://youtu.be/FN2RM-CHkuI?si=3lanUOhXkj2sP8GA).

Very maddening; some regular phrases can be interpreted in a multitude of ways, often contradictory. Even with full context it's still ambiguous.

Exactly. The formalism of computer code will always create some amount of boilerplate—there's no perfect language—but in my (admittedly limited) experience, an LLM is a middleman which distances you unacceptably from your own code. Whether you review the code or you write the code, the intellectual effort of deciding which approach is best and understanding the solution still needs to be undertaken. All it's saving you are the keystrokes, at which point it's glorified intellisense.

Note that IntelliSense/code completion was never about saving keystrokes, so I assume LLMs shouldn't be either. Code completion has always been about getting a list of things you can do with a particular type, saving you from having to remember everything in your head (and APIs have gotten more numerous as a result). The LLM code assistance I've seen is poorly designed in that it usually gives you the one most likely choice (to save keystrokes) and doesn't allow you to browse through a bunch of likely possibilities.

AI is a tool like any other. Autocomplete on steroids -- markov chains taken to the extreme.

We already put natural language between us and the bytes. Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.


The memory and compute requirements to develop and run these models make no sense if the marginal improvement in autocomplete is the big end result. They only make sense in a world where machine can derive intent from natural language and actually conform to what people mean when they ask for something. This is clearly a fantastical result that LLMs are very short of.

It’s interesting. I would’ve agreed that ‘deriving intent from natural language is something that LLMs have fallen far short of’ maybe a month ago.

Since then, I spent a week trying to get Cursor to work, and after dealing with all the bugs, and restarting the composer each time with a new prompt, I was able to get what I would consider quality output for a moderately complex app (a parimutuel betting market).

The issue isn’t that LLMs are terrible; it’s that software like Cursor is buggy and poorly written.

It should know that I don’t want to use code from an old version of the library I am using, because the new version is already in my project’s dependencies.

It should let me set up preferences for different programming languages. And preferences for all programming languages.

So when I give it a prompt, it looks at the dependencies and language rules I already have set up, adds those to the prompt and produces the quality output I’m seeing now without me having to manually specify all those things.

Short version: LLMs rule; the software is just shitty.


My experience has been the opposite, I found cursor to be an improvement over comparable tools such as aider.

I was able to write a plugin for ComfyUI (a 60k loc python/js codebase) in 2 hours thanks to semantic search. It's not an exercise I'm versed in.

It wasn't that different from the kind of internal monologue I'd have held in my head had I done it on my own, including misguided confidence that gets crushed 5 minutes later as you read other parts of the code that show you had the wrong understanding of how it actually works.

In this context, LLMs can be very useful because a ground truth already exists to compare their replies against.


> My experience has been the opposite, I found cursor to be an improvement over comparable tools such as aider.

That sounds like a similar finding to what I had (comparing to copilot in my own case).

My point, which my post maybe didn’t make so well, was that a huge amount of the prompt should have been written for me in order to get to an acceptable result sooner.


I see this and conclude the opposite: that they were adhering to the principle. Basically, the AI writes the buggy code that is upsetting you.

Similar experience trying to use GenAIScript, btw, and peering inside the box the code and product is pretty well incomprehensible.


My post above does not seem to be well written as it’s been frequently misinterpreted.

Yes, the AI is writing the buggy parts that upset me but my point was creating a good quality prompt would’ve taken a lot less time if Cursor had had some reasonable defaults.


LLMs don't "know" anything; it's just a souped-up Stack Overflow search.

I totally agree. I think almost all coding could be done by today’s best LLMs, IF they had the right context and tooling. Using Cursor is sometimes like magic, but it also feels painfully clear that the LLM is being held back by a lack of information, leaving me to have to interface between the codebase and the LLM, in both directions. Selecting which files to include in context feels so stupid, and like something that will hopefully quickly go away.

>The memory and compute requirements to develop and run these models make no sense

There is the story that von Neumann flew off the handle the first time he saw an assembler.

>>How dare you waste compute cycles on this frivolity? Just use machine code like everyone else.


> There is the story that von Neumann flew off the handle the first time he saw an assembler.

That was in the 1940s when labour was very cheap and compute was insanely expensive. We’re talking hundreds to thousands of programmers’ salaries for the cost of one computer.


How much did the hardware to train gpt4 cost?

I’d expect $50-$100m. 1/40 the value of Rockstar energy drinks.

The hardware costs tens of billions. The electricity cost is around what you guessed.

> AI is a tool like any other. Autocomplete on steroids

No, AI is a shitty tool that has yet to prove its utility. Autocomplete works by analyzing the official API and interface; that's completely different from AI, which hallucinates meaning between words and also regurgitates stuff it was fed before it met you.

> variable names (a hard part of computer science)

Naming is for software engineering, not CS. One more confusion by people who want to sell us AI at all cost.


At some point, you become the luddite. Maybe you have no experience with modern AI dev tools, maybe you work in a language that is underrepresented in models meaning off the shelf tools don't work well, or maybe you're just an old curmudgeon who will die on a hill.

But modern AI tools are far beyond "auto complete". (I actually turn off those in-line completions, I feel they ruin flowstate). The tools now are fully prompted, with multi-file editing, with full codebase context, with web/search and doc integration, and for "on the rails" development are producing high quality code for "easier" tasks.

These modern models and tools can solve nearly every single leet code problem faster than you. They can do every single Advent of Code problem likely 10X-100X faster than you can.

In my professional, high-standards, very legal- and contract-driven web app world, AI tools are still very useful for doing "on the rails" development. Is it architecting entire systems? No of course not (yet). Is it emulating existing patterns and extending them for new functionality 10X faster than a Jr or Mid? Yes it is. Is it writing nearly perfect automated tests based on examples? Yes it is. Is it scaffolding new ideas and putting down a great starting point? Yep. And it's even able to iterate on feature work pretty well, and much faster than a Jr/Mid.

The kind of work I'd give to a Jr/Mid and expect to take 2-3 days before they need serious feedback up and down the change, these AI are doing in about 30 seconds, maybe 90 seconds if you need to iterate a few times on the prompt.

I get that "AI" is a buzzword that is pumping valuations and making business people see $$$.

But coding assistants are not that. For many programmers, they are quickly becoming valuable tools that do in fact speed up development.


> you work in a language that is underrepresented in models meaning off the shelf tools don't work well

That's exactly what happens, and why I think the whole hype is a joke. I have tried all the models and tools though, it's always an annoying mess.

> tools can solve nearly every single leet code problem faster than you

That would be useful if I were paid to "leet code" or solve Christmas games. This is not a good rebuttal, though it made me smile.

> The kind of work I'd give to a Jr/Mid

Good, but I don't want to know what happens in 20 years when there are no more juniors to feed the AI and work on becoming seniors. I will be retired by then and I'll enjoy writing my own open-source stuff.


> These modern models and tools can solve nearly every single leet code problem faster than you.

That's expected, since all the leetcode problems have ready-to-use solutions on the internet.

(In fact, the reason they ask leetcode questions isn't to test your IQ, it's to know if you've read the obvious and available literature.)


Two points:

>That's expected, since all the leetcode problems have ready-to-use solutions on the internet.

1) If the implication is "The model knows the answer and regurgitates it like lyrics to a song" then I would push back. Put a leet code problem into deepseek r1 chain-of-reasoning model and watch it spend 2 minutes spitting out 5000 words thinking through every single facet of the problem and genuinely solving it at a level that is higher than 95% of programmers.

And point 2)

If you do believe it's fundamentally about how much the model has been trained on, then it has seen your CRUD app - it has already seen the feature or system you're about to write 10,000 times - so it should be a foregone conclusion that it can also do all of that development work too. Only the higher-order architecting and proprietary domains should be challenging for it, as there would be far fewer examples to train on (scarcity), or the model doesn't understand a complex solution (architecting systems at scale is something it can't do).

(I also point out how well these models did for Advent of Code 2024, when there were zero examples in the training data for it).


Is it “thinking” or is it regurgitating analysis of the problem it found somewhere on the internet?

Are you "thinking" or are you regurgitating analysis of the problem based on what you read on the internet, too?

This one is funny because for something like leet code, nearly everyone just reads the best answers, learns them and learns how to regurgitate them in an interview environment.


> thinking through

It's not "thinking", it's regurgitating an internet search and padding it out with markov-chain style text autocompletion.

The 5000 words of padding do not actually provide any value, it's verbal white noise to fill space.

> ...then it has seen your CRUD app and has already seen 10,000 times the feature or system you're about to write

Well, yes. Lots of pointless waste in software engineering. Fortunately I don't write CRUD apps and AI does nothing at all for me in a professional context.


> (In fact, the reason they ask leetcode questions isn't to test your IQ, it's to know if you've read the obvious and available literature.)

Rather: it tests whether you are sufficiently docile and devoted to be willing to cram lots of leetcode exercise books that have no relevance for the programming concepts that the job will involve, just for a lottery ticket for a somewhat well-paid position.

I know that there is so much more to programming and related topics that is sooo much deeper (in particular if non-trivial mathematics becomes involved) than these leetcode-style brainteasers. So I strongly prefer to read about such deeply intellectually inspiring topics related to programming instead of jumping through the idiotic hoops that other people want me to.

Indeed, I thus fail the test for docility and devotedness, but I honestly can't take organisations seriously that demand such jumping through hoops.


I feel like 95% of the developers who are touting AI are doing web dev or app development - a field with low stakes, a low barrier to entry, and an incredible amount of reinventing the wheel - all things that an LLM is naturally going to excel at. I can't imagine you'd hold these same beliefs if you were writing control software for a life-saving medical device where a single bug could kill someone, where the board package is proprietary and not understood by an LLM, and the thing is written in a C dialect combining only half the features of C99 with a subset of obscure compiler flags.

I think people are running into a couple of challenges. One is keeping up with the pace of improvement. The tools are much improved over 6-12-24 months ago. A poor first impression can leave people thinking a tool is terrible forever more. Second is that someone must learn to work with the new tools. The hype can lead people to think the tools will magically just do all these things. The reality is, like most tools, it takes some trial and error to learn how to best use it.

ChatGPT in 2023 was a fun toy, but just that.

Claude in 2025, especially with the Projects feature, is far better. It can complete a CRUD project on its own, and all I have to do is fix glaring issues and design the API up front.

Which might not be impressive to some, but it is good at that. And a few years ago, it would not have been possible.


CRUD generation has been a staple of web framework CLI tooling since, like, forever. Once upon a time WordPress was this: the boilerplate application you scripted out and then adapted. In certain programming languages, macros or type systems power up this kind of tactic, and IDEs typically have very good support for these kinds of shortcuts.

Then the project management tooling does a lot more, like automatically reverse engineer existing databases and so on.


> Autocomplete works by analyzing the official API and interface, it's completely different than AI

You can (and should) give the AI access to your existing codebase and any relevant documentation to use as context if you want good results. If you give the AI zero context for the problem it is trying to solve, of course it will struggle. If you give it all the necessary context, it will do much better.

I've found that just uploading the documentation of the API or library you are working with before asking the AI questions about it makes a huge difference in the quality of its output.


>> variable names (a hard part of computer science)

>Naming is for software engineering, not CS.

I figured they were referencing the “two hard problems of computer science”, those two being naming things, cache invalidation, and off-by-one errors.

Everybody knows the hardest problems in software engineering are assembling promo packets and building consensus on number of spaces per indent.


‘Number of spaces per indent’ begs the question: why spaces? The proper indentation character is the corn emoji.

> Hence why most keywords and variable names (a hard part of computer science) are in simple English and it is considered a net positive.

As I'm not a native English speaker, I disagree. I learned programming long before I got decent at English, and even today I just consider the English keywords in programming languages to be some "abstract mathematical concept" that by mere coincidence is named after some real, existing English word. Even today, being somewhat decent at English, I still think this way when I see program code.

I actually would insist that this is a much more useful way to think about good programming, since this way you have no difficulty asking yourself all the time whether it would make sense to replace some "English-named" concept with something more useful that has no analogue in the English language (or any other natural language).


There have been studies (in the 80s or 90s, I never wrote down references, unfortunately, but they probably involved lexical priming) that support that idea. They suggest that English keywords get a meaning of their own for non-native speakers.

49 20 72 61 74 68 65 72 20 74 68 69 6e 6b 20 74 68 61 74 20 74 68 65 20 6e 61 6d 65 73 20 6f 66 20 6b 65 79 77 6f 72 64 73 20 6d 61 74 74 65 72 20 61 20 6c 6f 74 2e

4b 65 6e 74 20 50 69 74 6d 61 6e 20 72 65 6c 61 74 65 64 20 5b 30 5d 20 74 68 65 20 63 6f 6d 6d 65 6e 74 20 6f 66 20 61 20 70 72 6f 67 72 61 6d 6d 65 72 20 69 6e 20 50 61 6e 61 6d 61 2c 20 77 68 6f 20 6c 69 6b 65 6e 65 64 20 45 6e 67 6c 69 73 68 20 6b 65 79 77 6f 72 64 73 20 74 6f 20 6d 75 73 69 63 61 6c 20 6e 6f 74 61 74 69 6f 6e 73 20 62 6f 72 72 6f 77 65 64 20 66 72 6f 6d 20 49 74 61 6c 69 61 6e 2e 0a 0a 5b 30 5d 20 68 74 74 70 73 3a 2f 2f 64 65 76 65 6c 6f 70 65 72 73 2e 73 6c 61 73 68 64 6f 74 2e 6f 72 67 2f 73 74 6f 72 79 2f 30 31 2f 31 31 2f 30 33 2f 31 37 32 36 32 35 31 2f 6b 65 6e 74 2d 6d 2d 70 69 74 6d 61 6e 2d 61 6e 73 77 65 72 73 2d 6f 6e 2d 6c 69 73 70 2d 61 6e 64 2d 6d 75 63 68 2d 6d 6f 72 65

>Hence why most keywords and variable names (a hard part of computer science) are in simple English

"Natural language" is about far more than individual words.


Bad take. Identifiers are just labels.

I don’t think anyone disagrees that identifiers are labels. If you’re claiming that these labels are unimportant, I’d be interested in why you think this.

Identifiers are super important and should be chosen wisely, but a C program with English identifiers is still a C program, while an LLM prompt is in fact NL and a whole new layer of that between your brain and the bytes. Which is why I think that saying "we already put NL between us and the bytes" minimizes that difference and is a bad take.

Exactly.

Not to mention that our existing programming languages have a deterministic output given the same code and the same compiler.

LLMs do not.

Thus, LLM prompts are an entirely different class of tool than a programming language.

This should be obvious to anyone who has written code, but alas.


Ooh. Good point. I guess we need traditional languages as a way to debug the created machine code. But… what if we didn’t? Ie the LLM made bytecode and we had some better way to talk about the concrete implementation.

They mostly aren't important though. When I first learned Pascal, JavaScript and PHP as a child, I had barely any idea what all those English words meant. Later on, when I was learning English in middle school, I was remembering their meanings by recalling what they do in code.

I agree, and it's sad that, despite all those pitfalls, some companies and CEOs will keep pushing the idea that human programmers can be replaced by AI.

But well, I guess there's a bright side here: those LLMs applied to software development might become the new GeneXus, and there are going to be plenty of open positions for humans to rewrite entire systems in a not-so-far future.


That's obviously wrong, as demonstrated by the engineers invested in building tools to enable that.

A lot of engineers invested in building crypto stuff and we didn’t go far in personal banking. Hype-driven development is not guaranteed to succeed.

That was not what was argued, and not what I am arguing. OP claimed that "only someone who has never built anything of significant complexity and utility can think that putting natural language encoding between you and the bytes is a net positive."

Unless OP is also willing to claim that all the people who are working on LLM dev tools are frauds, and act against their better knowledge, OPs claim is obviously false. The entire premise that the people who build these tools operate under is that natural language "between you and bytes" can be a net positive.


Which has been proved again and again to only get you 70% there [1].

[1] https://addyo.substack.com/p/the-70-problem-hard-truths-abou...


Popularity doesn’t demonstrate anything

This has been my experience with a recent attempt to guide the LLM to a complete implementation of a small internal tool. I had in an hour what would have taken me 4 or 5 to write. But after that, it was an endless loop of the LLM adding logging code to find some bug and failing to fix it, only to add more logging code and ineffectual changes, and so on. The problem is that even after it's lost at sea, it's still answering in a completely confident and self-assured tone, so when you decide to take matters into your own hands you might be too far gone from sanity and have an unfixable mess on your hands. I guess I could go back to where it strayed and retake it from there, but by now the experiment seems to be a failure.

Back in early 2023 I tried to write a tool to do my taxes based on my broker's CSV files. Since I wasn't familiar with how the data was structured, I let the LLM lead me while building this in incremental steps. The result was not just buggy; it simply failed to detect the relationships in the data (multiple somewhat implicitly embedded tables that needed to be joined). Even after I pointed this out, it failed to handle it, getting stuck in the same kind of loop you described.

To this day, no LLM that I've tried has passed this task of leading the development while detecting the underlying structure of the data.


At least in my experience, as soon as something goes a little wrong it just gets worse from there. The more of its confusion and contradictory information is in the chat history, the worse it gets. It also has to make changes to the code, so you accumulate these spurious changes and the problem gets more confusing. I've had some luck starting over with a new chat asking what is wrong, but if that doesn't work I just assume I'm on my own.

I've found that quality degrades really quickly after just the first reply, for some reason. They all seem heavily biased towards one-shot correct answers, and as you say, they go down the wrong path really quickly if you even get the first message slightly wrong.

I tend to restart chats from the beginning pretty much all the time, because of this.


I’ve also found this to be the case. Starting a new chat or Cursor composer session puts things back on the right track. Also, prompting is really important. A lot of people seem to think they have some kind of oracle - “fix the bug” - how is anything supposed to work from that?

> But after that, it was an endless loop of the LLM adding logging code to find some bug and failing to fix it, only to add more logging code and ineffectual changes and so on. The problem is that even after it's lost at sea, it's still answering in a completely confident and self assured tone, so when you decide to take matters in your hands you might be too far gone from sanity and have an unfixable mess in your hands.

I wonder how much better or worse things would get, if we took the human factor out of the loop. Give the LLM the ability to run tests and see the results, then iterate on its own output and branch off with different approaches, gradually increase the temperature etc.
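
A rough sketch of such a loop, assuming pytest for the test run; ask_model and apply_patch are hypothetical placeholders for whatever LLM client and patch tool would actually be used:

    # Hypothetical sketch only: run tests, feed failures back, apply the fix, repeat.
    import subprocess


    def ask_model(prompt: str) -> str:
        """Hypothetical call into whatever LLM client is in use."""
        raise NotImplementedError


    def apply_patch(diff: str) -> None:
        """Hypothetical helper that applies a unified diff to the working tree."""
        raise NotImplementedError


    def iterate_until_green(max_attempts: int = 10) -> bool:
        for _ in range(max_attempts):
            result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
            if result.returncode == 0:
                return True  # tests pass; stop iterating
            patch = ask_model(
                "The tests failed with this output; propose a fix as a unified diff:\n"
                + result.stdout + result.stderr
            )
            apply_patch(patch)
        return False  # give up and hand it back to a human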

Maybe it’d turn out that you need 10 LLMs running in parallel for an hour to fix something, or perhaps even a 100 would never stumble upon a solution for a particular type of problem. And even then I wonder, whether it’d get better if you fed it your entire codebase or the codebases of the entire libraries or frameworks that you use (though at that point you’re either training it yourself or are selectively finding and feeding the correct bits not to exceed the context).


But why? What is there to be gained in all of this work around the inherent limitations of this technology?

Get more people into computer science. Knuth said that early in his career he thought he needed to make the computer faster or cheaper, but really it was about getting more users. Anyone can program, or try to, and then learn about computer science.

Where has Knuth said that? Sounds like the opposite of what I've generally heard from him (not caring about popularity etc).

Exploration of what’s possible and what’s not, identifying whether the weaknesses can or cannot be addressed.

A bit like traditional autocomplete can help streamline familiarising oneself with various libraries, a clear step ahead when compared to just needing to dig through documentation as much.

Maybe there’s a class of code problems that LLMs can be decent at solving, given the ability to iterate, verify solutions and what works or doesn’t, perhaps with 10x more compute than is utilized in the typical chat mode of interaction though.


Part of the skill in using these tools is recognizing when it spins off the rails and backtracking immediately. Most of the time something can be gleaned from that wrong approach which can then guide further attempts.

This is my experience with Aider. When I first started using it, I turned off the auto git commits, but I've since turned them back on because they serve as perfect rollback points. My personal style is to only commit once I have a feature fully working, but with Aider it's best to have it commit after each exchange.

I've gone 2-6 steps down a path before realizing this isn't going to work or the LLM is stuck in a loop. I just hard reset back to the first commit in that chain and either approach the task differently or skip it if it wasn't really that important.


You're not graded on getting the LLM to output perfect code; the point is to get the code into git and PR'd. If your LLM tooling doesn't automatically commit to git so you can trivially go back to "where it strayed", you need to find a better tool. (My current favorite is Aider.)

It's a tool not a person. When was the last time you got mad at a hammer for being smug?


To drive the point home, hammers are quite smug.

Mine always thinks it nailed it on the first try, and it's pretty hard-headed when you point out mistakes.

If you can't work around those limitations, you're screwed.


The current state of so-called AI does not provide much meaningful assistance in software development beyond basic tasks such as explaining workflows, breaking down thought processes, and performing simple conversions. I believe that generative AI, in its current form, is not true artificial intelligence. Rather, it is a sophisticated prediction engine that lacks genuine reasoning or understanding.

True AI should be capable of comprehending problems and devising its own solutions, rather than merely generating statistically likely outputs. Until AI reaches that level of cognitive ability, its applications in the real world remain limited, and much of what we see today is largely hype.

Tokenization and embeddings merely help models predict the most probable next token, a process that is executed at scale using vast computational resources. This is not intelligence but large-scale probabilistic prediction. The terminology used in computer science, especially in recent years, can often be misleading.


I think this comment underlines the biggest difference between people that say AI is a transformative tool and people that say it is nowhere close to working as expected.

I never expect some magic "understanding" to ever arrive, but doing remedial pattern matching is already a hugely valuable power that frees up humans to do more interesting work. This is how I use current AI: spitting out 5-line functions I could spend 5 minutes writing that it can do in 3 seconds and that take me 10 seconds to review. Like "check for circular references" or "use the Django ORM to write a query for all categories that have this flag for users that have this permission".
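
As a hypothetical sketch of that second request (Category, is_featured, the owner foreign key, and the permission codename are all invented names, not a real project's):

    # Made-up models/fields for illustration only.
    from django.contrib.auth.models import Permission, User

    from myapp.models import Category  # hypothetical app and model

    perm = Permission.objects.get(codename="can_view_reports")

    # Users holding the permission directly or via a group.
    users_with_perm = (
        User.objects.filter(user_permissions=perm)
        | User.objects.filter(groups__permissions=perm)
    ).distinct()

    flagged_categories = Category.objects.filter(
        is_featured=True,           # "this flag"
        owner__in=users_with_perm,  # assumed Category.owner -> User foreign key
    ).distinct()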

It doesn't "write the app" or solve difficult problems for me (unless it is some configuration issue). I can paste in an error code and save myself a few minutes of manual debugging. If I add a new parameter to a function, it prefills the correct type definition, and things like that. These are all micro-improvements, but they add up to a lot of saved time. Some people have success with editing across files, but I rarely even try that - it excels at solving discrete, repeatable bits of work with tidy solutions, so I use it for that.

Until AI can return "I don't know" or, better, "did you want it this way or that way?" it will be severely limited. Yes, it acts like a junior dev in some ways, but a junior dev that never asks any questions, which is not the junior dev you ever want to give important work.


Do we really want this? As soon as possible, employers will fire software engineers and replace them with AI. I’m positive they will not care about what AI can do, only how many salaries they can eliminate and still achieve the same results. You and I will not be the inheritors of AI.

I think that by the time AI can genuinely replace software engineers, a lot else in society will change.

It's hard to predict what it will look like. I could write both utopian and dystopian narratives and I can pretty much guarantee they'll both be wrong. Not "in the middle" but something unexpected, the way nobody predicted cat videos or doomscrolling.

But you are almost certainly right that we will not be the inheritors.


Yes, because employers will also be replaced by AI. Technology penetration won’t stop at some arbitrary boundary, it will go all the way through to logical conclusion. We have a chance at qualitatively better world, but we’ll need to act and push for new economic systems - when the time comes.

Maybe I read too much science fiction, but my first thought when speaking about "true AI" isn't the worry that a lot of us will get fired, it's the worry that we'll have created an army of digital slaves.

"Army of digital slaves" doesn't really sound that bad when I think about it. As long as it's your army and not your adversary's army... In what ways do you think "an army of digital slaves" is bad?

I guess only in the sense that any form of slavery is bad and morally reprehensible, and ethically inexcusable.

“and still achieve the same results”.

That is the part that won’t actually happen, at least pretty quickly.


I have a non-technical friend who in the last two months has bootstrapped a SaaS startup using nothing but AI. He's got just over a handful of paying customers at this point on a monthly subscription[0].

I asked him to show me his process[1] after trying my hand (20 years, principal) and noticed a big difference in how we used AI: I instruct the AI how to code, he asks the AI to fix problems. In other words, I have a tendency to look at the code and ask the AI to fix it in the more specific and direct ways that I want it fixed. On the other hand, if something doesn't work, my friend will copy/paste the error to the AI directly out of the dev tools console and ask the AI to fix the error. The two approaches are totally different.

My lesson here is that you're not meant to debug AI generated code; hand the error off to the AI and let it fix itself. I think if you're debugging AI generated code, you're doing AI generated code wrong. If you're an experienced dev picking up AI coding, I think you need to shift your mindset entirely. Ideally, someone out there will just create a closed loop where the AI can fix itself when it finds an error (integrate some browser and autonomous test loop into Cursor, for example, and let it fix its own errors).

Conclusion: if you're going to use AI to code, commit to it and use AI to fix the errors as well. Use AI for every aspect of it.

[0] Yes, I'm sure there are security holes and code issues galore, but those can always be fixed later when he's proven the business model.

[1] Yes, I have told him that he should create a YT channel or stream on Twitch, because how well he's been able to use AI makes for super interesting content.


In my experience, AI isn't very good at debugging AI-generated code. If it fails to make the right insight, it loops continuously until it's completely off the rails. I'm surprised your friend hasn't fully gotten stuck with this, as it seems like a huge risk for his startup.

Having had an inside view of a YC startup that went from seed to Series C, I can tell you that code quality means a lot less than one would think when it comes to the early days of a startup.

The biggest risk to a startup is that you get the business model wrong or you don't ship code, even if the code is buggy and messy.


I don't know which specific LLM your friend used, but pasting the error to the LLM usually ends in an endless loop where they tell you to do the same thing over and over again, or the solution doesn't really work or generates another error.

So maybe he was lucky or he is using a very good LLM I'm not aware of.


Claude Sonnet. If your choices are to pay out of pocket for an offshore contractor and wait for weeks or pay $20/mo. for an LLM, it's pretty clear that even if you have to sit there for a few days until you get what you want, using the LLM is the better bet if you're non-technical. In either case, the code would be of questionable quality and a non-technical person would not be able to tell the difference anyways. I see it as a wash.

This probably only works if you glue a bunch of high level, popular APIs together. It might work, but will be fragile and expensive.

> fragile and expensive

Unfortunately, that’s the most common kind of software in the saas industry anyway.


Most SaaS apps today can be done by gluing together popular APIs (e.g. Stripe, Shopify, etc.).

No better or worse than hiring cheap offshore contractors to do the same, IMO.


As an experienced developer, that’s also how I use it. What I’m finding is that it generally rabbit holes as I give it new errors that its previous fix has produced.

However, usually after three or four of those kind of fixes, I can walk it back to the starting point before the initial error, and I now know how to prompt it to produce correct code, because I now have a better mental model of how the thing is supposed to work.

This has been super helpful in my process of learning new things, as well as relearning things I haven’t worked with in a while.


In my experience, fixing security issues after the fact is extremely challenging. A secure system has a very different architecture, with security as its fundamental function and business logic almost an afterthought.

It's not impossible to fix later. But it's often more effective to scrap and rewrite. Hopefully your proven business model has yielded enough money for that, before someone else has pwned it.


One can only imagine how many corners your friend had to cut to get to the product you call finished.

He's got paying customers from organic inbound, word of mouth only; there must be some value there.


I don't see how these two would be connected.

His SaaS solves a problem for a very specific industry that's quite small (his market research yielded about 6000 customers) so it's a small, niche industry where a lot of the small business proprietors know each other through a trade group and more or less have the same problem: the alternative solutions currently in the market are expensive, entrenched, legacy providers that operate through a POS while his solution is web-based and costs less.


I wonder what this codebase will look like after a year or so of doing this.

Bugs will escalate from syntax errors to business logic errors ("one customer was charged twice"). There won't be anything to copy/paste, no AI will be able to fix these errors and no human will touch this codebase with a long pole.

Have you seen the job market at the moment? Humans will do a lot of things to keep a roof over their heads.

I’ve seen some users do that, and get stuck in a loop where the AI says "ah, this error is because...", doesn't fix it properly, or fixes it while introducing a different issue by modifying an unrelated part of the code at the same time. Next, the code is fixed, but only by reintroducing the previously fixed issue.

I saw that with people asking for VBA code to be generated, trying to automate part of their email and Excel work.


It's possible, but his choices are 1) hire someone else, 2) just sit there and prompt again until it's fixed. Since he's bootstrapping this with < $50/mo, the choice is simple.

Also, it may be the case that the corpus of training data with VBA is not as good as it is with React these days.


I have tried that and the AI gets stuck just attempting whatever and writing more code that won't even compile. I have had more success trying to get it to follow steps or examples.

Maybe the language your friend is using has more examples for training, or perhaps the dynamism of some languages gets it to runtime errors that have better details it can work with.


React and JS so I think it has some benefits since 1) it has a large corpus of recent training data, 2) the browser gives pretty good errors.

I also tried it and the biggest issue I ran into is that I'm very specific about what I want. I wanted to use `nanostores` for state and routing. Problem is that the LLM keeps using code from `react-router` instead of `@nanostores/router`. As soon as I point it out, the LLM fixes it, but the first pass code generation is almost always wrong, even using an instruction file (as documented in both Cursor and GH Copilot).

That's when I realized that we are using the AI in two totally different ways: he simply doesn't care about the implementation, prop drilling, any of the technical details. None of that matters to him except that when "this button is clicked, that action happens". So however complex or inefficient or imperfect the code is, he doesn't care whereas I still have a tendency to read the code and try to ask the AI to do it in specific ways.


> integrate some browser and autonomous test loop into Cursor

Doesn't this exist yet? It's such an obvious idea I'd be astonished if no-one has done it.


They exist in separate pieces; I've not seen it integrated into one loop yet.

Code gen -> show the AI an example of how it's supposed to work -> error -> code gen -> AI tries it again by itself -> Code gen


This only works if there is an error message. Do you instruct the AI to fill in the code with asserts and not implemented exceptions?

I was only on a session with him for like 15 minutes and he showed me his prompt history. Basically when he hits an error, he will paste the error and give a simple instruction like "I'm getting this error when I click this button: <ERROR_HERE>" and then repeat until it's fixed. Nothing special; imagine a non-technical PM giving directions to a junior dev except this junior dev codes nearly instantaneously.

Yeah, this works for CRUD apps with conventional methods for accounts, email, and payments, and not at all for anything complex, especially if it isn’t a super commonly used language or framework. Try coding a single game with AI that isn’t something done 10000 times already. It actually is impossible.

The vast majority of code in the world is the former though

The last 20 years of programming tells me that this isn’t the case.

This is only the case for new projects which don’t yet have users. Add users to even the simplest project and it evolves into a special snowflake with never before seen edge cases.

That’s why low code solutions are great for prototyping but eventually always explode into a nightmare of complexity.


This roughly mirrors my experience so far. Mind you I'm an extremely qualified engineer who has worked at FAANG.

Except I'd add that as one gets experience working with the AI I can only assume they'd get much better at making it go smoothly. For example, I wouldn't manually rewrite localhost, I'd tell the AI "Why is localhost everywhere? Will this work if I deploy to a droplet?" and it will fix it for you.

Also I just paste error-messages directly into the AI and it usually knows how to fix them.

Sometimes it's net positive, sometimes it's net-negative due to creating a mess that's really hard to get out of or debug. But I imagine it's only a matter of time until the scopes in which it's cost-effective go up.

I don't like that AI is a threat with huge monopolistic and job-reducing potential, but I don't think downplaying it is a long-term strategy to combat that.


> For example, I wouldn't manually rewrite localhost, I'd tell the AI "Why is localhost everywhere? Will this work if I deploy to a droplet?" and it will fix it for you.

The solution is multi-occur (Emacs), the quickfix list (Vim), or any editor that has whole-project find and replace.


Which will also be much faster because you don't have to worry about sanitizing your code before sending it to an LLM or that the LLM made a mistake somewhere along the way.

> I'm an extremely qualified engineer who has worked at FAANG.

> I just paste error-messages directly into the AI

...


I find it funny that commenters on HN actually think their having past or current experience working at a FAANG is some sort of signal for two reasons.

On HN especially, that’s really nothing novel; many of us have (including me), and the only thing it takes to get into one as a software engineer is memorizing the solutions to coding problems.

When I’m hiring - mostly for green field initiatives - coming from BigTech is usually a negative signal for me.


I'm not sure what your point is here...

Where the author went wrong in this post is that he tried to interpret an error ("I was asking claude to solve the wrong problem"), was wrong, and then wasted a lot of his own time.

I really think it's best practice when describing a problem to anybody that you start with what you observe and then if you want to hint your suspicions you call those out afterward as such. If you're very confident the LLM is going down a wrong path, you can ask it things like "How would I test the theory that environment variables aren't set in my docker container?"


This. Great, AI can produce code. But it produces code without inducing understanding of the code in the person who wrote (or rather supervised the production of) it, which is half the point.

At some point AI will probably be good enough that this won’t matter. But it feels like we’re still a long way off that.


Oh look, a load of future work to fix these.

Why is this just like the last cost-cutting exercise, where the cheapest people in India produced a lot of "interesting" code?


Way back in 2000 (or even before that, can't remember!) I wanted to get into winsock programming. I found a page where someone from India explained that with examples.

The variables, functions and so on had names like:

a aa aaa b bb bbb

It helped me to grasp the basic concept, but was kinda hard to follow, tho. :D


You can update requirements, educate developers, and fix bad code with an LLM many orders of magnitude faster than you can with Wipro.

Because, ignoring a heroic effort from all the women in India, the number of Indian developers does not double every 4 years.

The number of flops a gpu can output on the other hand does.


See also: almost every bespoke internal app written in FoxPro, VB, Excel with VBScript, etc

Can anyone explain why everyone is so hyper-focused on speed? 500 images per second, 100 minutes of video in 30 minutes, a thousand lines of code per hour. Who is going to consume all that?

Most of what generative models produce is shit so they have to produce a lot in hopes _some_ of it is OK-ish.

It's also about responsiveness. LLMs produce junior-level quality of code at a rate of hundreds of lines per minute. I need it to produce enough to spot where it's completely wrong as quickly as possible so I can change the prompt.

It's like an edit-compile-run cycle, which also needs to be fast or you lose attention.

I was tempted to say it's another _step_ in the edit-compile-run but often the code is so bad I don't even bother compiling.


I'm firmly of the belief that most software would benefit immensely from us all slowing the hell down and putting more thought into what we build. But it would appear stability and a focus on core strengths doesn't sell nearly as well as endless new features for the marketing sheet added as quickly as possible.

"Who is going to consume 1000 lines of code per hour?" he types into his mass-manufactured thinking machine running an advanced operating system, before clicking reply, sending it across a global mesh of said devices.

Other machines.

The images are almost good but still in the uncanny valley. The code is almost good but full of bad practices and hidden bugs and undefined behavior. Since most AI grifters are neither coders nor artists, all they can do is produce more more more capitalism-style.

I have had great experience with Claude for coding, but you really need to be a programmer yourself, to be able to divide the problems into manageable chunks.

Same here, I really don't get all the "it's totally useless for programming" posts on here.

It makes me think many people haven't taken the time to actually learn to use the tool.

It just feels like they tried Copilot or ChatGPT for 5 minutes last year and concluded that all LLMs are useless and will be useless forever.

It makes me wonder if those people know that Claude 3.5 Sonnet projects and/or Cursor with Claude exist?

Do they not appreciate some help to document their code? Do they never need to write or quickly understand scripts or code in one of the 100's of languages/stacks they're not too familiar with that they might encounter in the wild? How to get out of yet another git mess? Build a proof of concept in an hour that would've taken you days? A refresher on how to set up x toolchain to get started asap (the nr 1 hardest thing in programming :p) etc etc.


Same here. I see these tools as teaching me patiently and challenging me (unwittingly) in areas where I'm out of my depth. When I'm lucky they will do simpler stuff for me, but for $40/month, I don't feel entitled to a SaaS-unicorn-terraformer.

> Do they not appreciate some help to document their code?

How does an LLM help there? What the code does should be obvious by looking at it, WHY it was written that way is the interesting question. Answering it often requires more context and domain knowledge.

> Do they never need to write or quickly understand scripts or code in one of the 100's of languages/stacks they're not too familiar with that they might encounter in the wild?

I'd rather take the time to do it myself because if I'm not familiar with a language/stack I won't be able to spot mistakes made by the LLM as easily.

> How to get out of yet another git mess?

Learn to solve the git issue and apply the knowledge in the future so you don't rely on yet another tool.

> Build a proof of concept in an hour that would've taken you days?

I question the premise.

> A refresher on how to set up x toolchain to get started asap (the nr 1 hardest thing in programming :p) etc etc.

How often do you do that? I think it's worth spending the time to do it yourself so you get an understanding of what exactly you're doing there. When you're done you can document the process and come back to it next time.


What you're basically saying here is: you should just learn more and know more faster.

And what I'm saying is: that's exactly what LLM's are super useful for.

To answer your last question: about every 6 months or so. I'm a freelancer, I do a new project for a new client every 6 months on average. All of their toolchains, build systems, OS of choice for the dev machine, OS of choice for the SoC, documentation methods, PCB design tools, version management systems, release systems, testing frameworks are completely different per client and change constantly (even within the same company) depending on department and moment in time.


> What you're basically saying here is: you should just learn more and know more faster.

I didn't say anything about speed. I think you should take the time to deeply understand what you are working on.

> And what I'm saying is: that's exactly what LLM's are super useful for.

I disagree, LLMs aren't good teachers. You won't be able to spot subtle issues with their output if you're not already familiar with the topic.

> To answer your last question: about every 6 months or so. [...]

I don't see the big advantage of using an LLM there. It can't set up the environment for you.


> I don't see the big advantage of using an LLM there. It can't set up the environment for you.

I can give it tons of random documentation without having to read through it all to figure out which parts are useful or filter through the irrelevant/mistyped/outdated/badly written stuff. I can even give it undocumented (shitty old) code and ask specific questions about how it works or what the likely intention was.

I promise you it is useful to me. I use it every day. It can't set up the environment for me all by itself, but it sure can help me do it WAY faster and understand it way faster. Especially the boring stuff nobody wants to do.

PS: who has time to deeply understand all aspects of what they are working on? This makes no sense to me in the context of a job. If I would take time to deeply understand every tool, script or even source code I touch, I would NEVER get to doing ANY work.


Programs are communication between 2 loosely coupled audiences -- the humans who have to maintain / modify the code and the computer that gets to run the code.

Human language, used to convey ideas to other humans, is imprecise. It's fine that it's imprecise because the media (humans) have both good error correction and a reasonable set of global defaults.

Computer languages require enormous precision because they are mechanically translated down to machine code.

Perhaps you can train an LLM on lots of code, and it'll find semantic relationships between some clever code it's been trained on and your specific request. Perhaps not, and it'll just give a dumb answer or an incorrect answer, (ideally some code copilot will actually try running the candidate answer code against your specific ask?) -- but once the answer gets complex you run into the "it's much harder to debug code than write it, so don't write code that's almost too complex for you to understand" problem.

At work, I constantly have to remind people "don't use math data structures for identities" "but int is smaller" "Are you ever going to want the 95th percentile customerID?" "no that's silly" "then it isn't a number". Or I get to constantly remind people "a string with lots of curly braces and quotes isn't necessarily json; if you're not using a serialized API and just sending bytes to stdout someone else has to parse it" "but I'm using a logging library" "does anything else ever send stuff to stdout while your logging library is running?" "oh yes, we're going to open a ticket to debug that." So I'm not optimistic that running code written by a machine is long-term viable.

That said -- there are situations where machine generated code works -- I think it's been a long time since anyone manually drew masks for etching dies when making CPUs.


Anyone who has ever worked with VCs or shareholders before knows that, if you tell them the reality and limitations of something, they will either fire you or ignore what you say. They have been desperate to remove the leverage programmers have due to their skill and replace us with AI that they don't have to pay salaries to. All we can do at this point is just take VC money promising them exactly what they want to hear, that they will be able to replace us with a NLP model. Sometimes you just can't save people but you can profit from their voluntary fall from the cliff?

If you don't know why it works when it works, you won't know why it doesn't work when it doesn't work.

The key issue here was staying on top of the AI's help.

Use AI wisely: as an assistant, not as a drunken lead developer.


I played with OpenHands for a few days (using gpt-4o since I already had an OpenAI account). I found it to be decent at writing new code, but then it had a hard time making changes when there was a lot of repetitive code (in a TypeScript / React project that I had it create with vite).

One of the interesting things about OpenHands is that you can see what the AI is doing in the terminal window where you launched it. Since it can't really load the whole codebase into its context window, it does a lot of grepping files, showing 10 lines on either side of the match, and then doing a search and replace based on this. This is pretty similar to what a human might do: attempt to identify the relevant function and change it.
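
For illustration, a rough Python sketch of that grep-with-context-then-replace loop (this is not OpenHands' actual implementation; the file glob, context width, and helper names are assumptions):

    import re
    from pathlib import Path

    def grep_with_context(root: str, pattern: str, glob: str = "*.tsx", context: int = 10):
        """Locate candidate edit sites: search matching files and return each
        hit with ~10 lines of surrounding context, as described above."""
        regex = re.compile(pattern)
        hits = []
        for path in Path(root).rglob(glob):
            lines = path.read_text(errors="ignore").splitlines()
            for i, line in enumerate(lines):
                if regex.search(line):
                    lo, hi = max(0, i - context), i + context + 1
                    hits.append((str(path), i + 1, "\n".join(lines[lo:hi])))
        return hits

    def replace_once(path: str, old: str, new: str) -> None:
        """The 'edit' step: a literal search-and-replace in one chosen file."""
        p = Path(path)
        p.write_text(p.read_text().replace(old, new, 1))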

I think I might have better luck with a simpler project, e.g. a Sinatra or Flask app where each route is relatively self-contained. I might give it or Cursor another try in the future when the tech has progressed a bit.


I appreciate posts that are about practical usage of AI and its strengths/weaknesses and the kind of conversation it generates. Conversations about AI are tough for me to navigate because there are camps of people that seem very invested in AI being either omniscient or completely useless. I regularly see people saying that AI is at the level that it can replace engineers or build whole apps. When I try this with state of the art models, I am seeing results that are nowhere close. That said, I still use AI every day during my development and I have a flow I think makes me way more productive. I want more conversations like this about the mechanics of using AI as it currently exists, honestly evaluating its strengths and weaknesses without getting into hypothetical debates about the future or whether or not the AI "understands".

It seems there is a battle of two opposite viewpoints. One is that LLMs are just dumb autocompletes with no ability to understand anything. Another is that LLMs can already, right now, be substitutes for programmers. I personally think it is neither, but for experts who know what they are doing it is a massive time saver. I.e. in cases where you know what code you want to write, but it's tedious, LLMs can do it for you. Also LLMs are great in cases where you are less familiar with a new API or language, but have a generally good understanding of programming.

Despite my broadly positive view on usefulness of LLMs, I do not think they are good enough (yet) to build a full system from scratch without an expert supervisor. This should not IMO be used as a 'proof' they are dumb autocompleters.


> I.e. in cases where you know what code you want to write, but it's tedious, LLMs can do it for you

I feel like I'm living on another planet when I see this point. I have almost never in my career encountered the situation where actually typing out the code is the time consuming part. The time consuming part is knowing what code you want to write, running it in a variety of circumstances to gain confidence that it's correct, and iterating when it isn't.

Please don't think I'm saying you're wrong by the way—if anything this just shows how diverse programming can be as a career. But I see this point raised a lot and it doesn't match my experience at all.


Experts who know what they are doing have long had alternatives beyond LLMs to make their work faster.

They have open source libraries, stack overflow, tutorials, documentation, simple code generator tools and snippets.

The speed up we’re seeing is from LLMs basically caching all those things into a huge mathematical model and retrieving information in summarized form ready for consumption.

And while speed is always nice, LLMs are expensive, require maintenance themselves to maintain relevant context, are still error prone, and terrible at true innovation.

In a few years we’ll be talking about the big “AI crash” and “what went wrong” when it has been obvious to experts all along. Winter is coming.


I am sorry, but the comparison of 'stack overflow' and tutorials to LLMs is bizarre. The amount of time to get to the answer from LLMs is drastically shorter. And claiming that they only 'cache things' is just wrong. They are certainly capable of correctly answering things that were not directly in their training set.

Do you have any examples of a question you could ask an AI right now that you couldn’t find from a basic search on stack overflow and Google? Didn’t think so.

One thing I think would have helped the author: write a spec first.

Seriously. It seems stupid. But AI works a lot better with a written spec.

The incredible thing is that the AI can actually be an excellent resource for writing the spec. And it will actually produce better code when you feed the spec back into said AI!

The current generation of AI seems to have fooled a lot of people into thinking that somehow you can jump straight to coding. (Well, you can, and it will probably work if you want to make something small or limited in scope.) Not so!

But, on the bright side, it’s just as good at design as code if you ask the right questions!

I say this having used 4 and 4o extensively in this manner. Just started using sonnet3.5 in this way in the last month or so, and it is amazing at this.


The issue with AI is that it generates what it is trained on. Most publicly available coding content/examples are just docs or blogspam (geeksforgeeks/javapoint/whatever) where mostly surface-level code is peddled. Even many small-scale OSS projects don't follow best practices or have a good code base; they have just enough to get whatever is needed done. When you train AI on such data, it'll excel at (statistically) reproducing the same kind of code.

Once the quality of training data improves (somehow getting access to high-quality codebases behind corporate walls by promoting these assistants and ingesting the code), the output improves.

There's a popular saying: garbage in, garbage out.


It delivers debugging hell if you don't know what you're doing, which is usually the case for inexperienced developers. It assists experienced developers very well, since they can sort through which parts of the AI's output are useful and which are not.

Heh, after decades of functional programmers being the "well, actually..." crowd at every conference, turns out they were right all along. Just for the wrong reasons!

The pitch:

- AI generates tons of plausible-looking garbage
- Static types catch garbage at compile time
- OCaml/F#/Haskell fans quietly sipping tea in the corner

The irony? We spent years debating static vs dynamic typing for human developers. But the killer use case may end up being catching AI hallucinations.

Finally, a business case for monads that doesn't require a PhD!

Time to dust off those Haskell books. Who knew safety could be so profitable? Plot twist: Category theory becomes a required interview question by 2025


I dream of a world in which more investment is put into creating better programming languages and runtime environments than trying to use LLMs as a way of coping with the complexities of current systems.

I was recently experimenting with local-only LLM coding assistants in JetBrains products. They did speed things up a bit, but I quickly realized that they were essentially automating the creation of copy-paste errors, resulting in time lost to debugging errors I never would have introduced myself, so I stopped using them.

My social feeds are full of tech bros who keep telling people AI codes everything for them. AI obviously has some impressive coding skills, but for me it never really worked well.

So is this just an illusion they create, or is it really possible to build software with AI, at least at a mediocre level?

I'm looking for open source projects that were built mostly with AI, but so far I couldn't find any bigger projects that were built with AI coding tools.


AI isn't great at creating software, but it is great at writing functions. I often ask AI to "write a function that takes A, looks up B in a SQL database, and returns C, or write a function that implements the FooBar algorithm in C++" and on the whole that works pretty well. Asking it to write documentation for those functions also works really well. Asking it to write unit tests for those functions works pretty well (although you have to be extra careful, because sometimes the tests are wrong).
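
For example, a hypothetical function of exactly that shape, using Python's standard sqlite3 module (the table and column names are invented):

    import sqlite3

    def lookup_customer_email(db_path: str, customer_id: int):
        """Take a customer id (A), look up the customer (B) in a SQL
        database, and return their email (C), or None if not found."""
        conn = sqlite3.connect(db_path)
        try:
            row = conn.execute(
                "SELECT email FROM customers WHERE id = ?", (customer_id,)
            ).fetchone()
            return row[0] if row else None
        finally:
            conn.close()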

What you have to do, and what AI cannot do well, is to decide where in the codebase to put those functions, and decide how to structure the code around those functions. You have to decide how and when and why to call each of those functions.


When I have to be that specific with it, it would be faster for me to just write it directly in my normal IDE with great auto complete

it would be faster for me to just write it directly in my normal IDE

Then you are a much better developer than me (which you may very well be). I'd like to think I'm pretty good, and I've many times spent hours trying to think through complex SQL queries or getting all the details right in some tricky equation or algorithm. Writing the same code with an AI often takes 2-20 minutes.

If it's faster for me, it might not be faster for everybody, but it is probably faster for many people.


The way to get better is to do it a lot. Every time you dig through a problem to solve it, you're not just learning about that problem - you're learning about every problem near it in the problem space, including the meta-problems about how to get information, solve problems, and test solutions.

In a sense you're slowly building the LLM in your head, but it's more valuable there because of the much-better idea evaluation, and lack of network/GPU overhead.


If you have to spend a significant amount of time thinking things through, how do you know the output from AI is correct and covers all details?

how do you know the output from AI is correct and covers all details?

Same way as any other code. You look at it, ask the 'author' to explain any parts you don't understand, reason through what it's doing and then test it.


That’s so much slower than writing it myself though.

The only reason to bother doing that with a junior developer is to teach them.


You have to do all those steps no matter whether you write it yourself or an AI helps you generate the code. Just typing the first thing that comes into your head and not bothering to test it will rarely create good code in all but the most trivial cases.

I find having an AI write out the algorithm and then walking me through the steps and generating test cases for all the corner cases much faster than looking up the algorithm online, trying to understand it, implementing all the details and then writing a test suite for it by hand. But I guess YMMV.
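
A trivial, hypothetical illustration of what "corner-case tests" look like here (the function and cases are invented for the example):

    def clamp(value: float, low: float, high: float) -> float:
        """Clamp value into the closed range [low, high]."""
        return max(low, min(high, value))

    def test_clamp_corner_cases():
        assert clamp(5, 0, 10) == 5      # inside the range
        assert clamp(-1, 0, 10) == 0     # below the lower bound
        assert clamp(11, 0, 10) == 10    # above the upper bound
        assert clamp(0, 0, 10) == 0      # exactly on a bound
        assert clamp(3, 3, 3) == 3       # degenerate range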


Well I was assuming you already understood the domain and the algorithm (or at least a good idea how you’d solve the problem). If you don’t, sure having an LLM cough up something is probably faster than finding and teaching yourself a new algorithm.

I definitely wouldn’t trust an LLM to come up with an optimal algorithm if I didn’t already have an idea of how to solve the problem myself though. There’s too much room for subtle bugs and unknown unknowns.

Tests aren’t a substitute for thoroughly understanding a solution (and thoroughly understanding a solution involves at least having an idea about the tradeoffs of different solutions, which you won’t have if you had no idea how to solve something yourself).

Most functions in line of business software though are going to be something like “loop over each item in this list, transform them from one format to another, then add them all up and save that somewhere.”
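
Something like this hypothetical Python sketch, which is roughly as deep as much of that day-to-day glue code gets (the field names and rates are invented):

    def total_in_usd(line_items, fx_rates):
        """Transform each item into USD, add them up, and return the
        figure that would get saved somewhere."""
        converted = [item["amount"] * fx_rates[item["currency"]]
                     for item in line_items]
        return round(sum(converted), 2)

    # e.g. total_in_usd([{"amount": 10.0, "currency": "EUR"},
    #                    {"amount": 25.0, "currency": "USD"}],
    #                   {"EUR": 1.05, "USD": 1.0})  -> 35.5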

Actually typing out the code to do that is in no way a bottleneck for me (unless I’m working in an unfamiliar language and don’t understand the syntax well).


Actually typing out the code to do that is in no way a bottleneck for me

For me it often is. Understanding and solving the problem in the abstract is often the easy part. I might realise that one way to solve this problem is by representing the data X as a graph with properties Y and then use the fact that those graphs behave like Z under conditions A. Or I might know that it's possible to get the information I need out of this database using some sort of complex series of joins. Correctly writing the relevant C++ or SQL to actually make the computer solve the problem still takes me a lot of time.

I often find that coming up with a good approach to solving a problem is the fast part, and actually typing out the bug-free code to implement that idea is the slow part, and LLMs at least for me speed up that slow part significantly.


Ok, I don't know. When I first read it, it sounded like you have to spoon-feed details until the AI gets it right. Or perhaps your IDE could be better!

"AI isn't great at creating software, but it is great at writing functions."

This 100%. In my experience (ChatGPT - paid account), it often causes more problems than it solves when I ask it to do anything complex, but for writing functions that I describe in simple English, much like your example here, it has been overall pretty amazing. Also, I love asking it to generate tests for the function it writes (or that I write!). That has also been a huge timesaver for me. I find testing to be so boring and yet it's obviously essential, so it's nice to offload (some of) that to an LLM!


It manages simple functions but I've tried to get it to do complex ones (e.g. a parser with a bunch of edge cases) and it totally shit the bed.

For the simpler cases I think prompting still took about as long as just writing the damn thing myself if I was familiar with the language.

The coding I have found it useful for is small, self contained, well defined scripts in bash where the tedious part is reminding myself of all of the command switches and the funky syntax.


Current gen AI can spit out some very, very basic web sites (I won't even elevate to the word "app") with some handholding and poking and prodding to get it to correct its own mistakes.

There is no one out there building real, marketable production apps where AI "codes everything for them". At least not yet, but even in the future it seems infeasible because of context. I think even the most pro-AI people out there are vastly underestimating the amount of context that humans have and need to manage in order to build fully fledged software.

It is pretty great as a ridealong pair programmer though. I've been using Cursor as my IDE and can't imagine going back to a non-AI coding experience.


I think it’s selection bias. Marketers are going to post the proof-of-concept that it works (if only in a small isolated scenario), algorithms are going to emphasize the more amazing “toys” this produces, over the boring rebuttals. In the end, you will see hundreds of examples where it worked and not the thousands where it produced buggy or dangerous code.

That attention does not map well to the important, hard, and more valuable parts of development.

Anecdotally, I still find it to be useful and it’s improving. I do think it’s going to have a huge impact in time.

Hype is part of the industry and it can be distracting to users, developers, and investors BUT it can also be useful (and I don’t know how to replace it) so, we live with it.


https://github.com/williamcotton/webdsl

Made almost entirely with Cursor and Claude 3.5 Sonnet.

11k lines of C and counting.


What I find interesting is that the project claims MIT license, but if it is "almost entirely" AI generated, I am not sure it even is copyrightable. So either the licensing terms deserve some large disclaimers, or it is not "almost entirely" made with AI. Based on the name I assume it is your project, could you shed some light on which of those two options is correct?

I guess copyright laws treat AIs as tools. If you paint a picture with a brush it's also almost entirely "brush created", and still you can claim copyright for it.

There are differing levels of abstraction when considering copyright.

I used Windsurf mostly on a feature to build out user authentication and then another tool to generate the PR documentation entirely.

https://github.com/jsonresume/jsonresume.org/pull/176

Meets my good enough standards for sure


I like the concept, but honestly I don't see myself writing an entire webapp with this.

Here is some feedback:

There are a bunch of libraries that need to expose an http API. There is a niche for providing an embeddable http server that comes batteries included with all the features such as rate limiting, authentication, access control, etc. Things that constantly have to be reimplemented from scratch, but would not warrant adding a large framework by themselves.

That's where I think the idea of a "WebDSL" would shine the most.


Thank you for the feedback.

I also don't see myself writing an entire webapp with this either - perhaps small sites or simple API endpoints?

I was mainly scratching an itch I've had for a couple of years. I also really like tuning C code just for the fun of it!


It's funny that this is MIT licensed, expecting credit for uncopyrightable work.

I work in copyright law. Familiarize yourself with the AFC test concept and I'm willing to have a conversation about what would be copyrightable in this project of mine.

https://en.wikipedia.org/wiki/Abstraction-Filtration-Compari...


You misunderstand, the repo lists an MIT license which requires attribution. You want people to give you credit if they use this LLM-generated code.

LLMs which were trained on the works of thousands of other developers with similar licenses, who are offered no similar credit here.

It also claims copyright of the code as though you have authored it, but you're claiming here to have used LLMs to generate it. Seems like trying to have it both ways.


I want people to give me attribution for the parts of my project that are copyrightable.

From the article,

The second step is to remove from consideration aspects of the program which are not legally protectable by copyright. The analysis is done at each level of abstraction identified in the previous step. The court identifies three factors to consider during this step: elements dictated by efficiency, elements dictated by external factors, and elements taken from the public domain.

This means that the code written for interfacing with an external API, eg, GitHub OAuth, would not be covered by any sort of copyright as the expression is dictated by requirements of the API itself.

The overall structure and organization of the code was not generated by LLMs and is fully covered by copyright.

LLMs are in fact very good at writing code that would probably not be copyrightable in the first place and are pretty bad at writing the overall expressive systems structures that would be covered by copyright.


> I want people to give me attribution for the parts of my project that are copyrightable.

A requirement not extended to the open source developers whose code you are, essentially, using a copyright and licensing laundering engine to get around.


Thanks, that's exactly what I'm looking for.

Same experience. It has become pretty good at writing creative SQL queries though. It's actually rather good at that.

When I am working on something niche, it does not help either. I have tried to make it build modern UI applications for myself using modern Java, but it just can't. It hallucinates libs and functions that do not exist, and I can't really get it to produce what I want. I have had better experiences with languages that are simpler and more predictable (Go), and languages with huge amounts of learning material available (Typescript / React). But I have been trying to build open source UI apps in JavaFX and GTK, and it just cannot help me when I am stuck.


I experimented with Cursor over Christmas, with writing a simple-ish Swift/SwiftUI app on iOS as the challenge. I can code fairly well in Python, moderately in JS, and almost not at all in Swift. I was using Cursor on a Mac, in parallel to XCode.

Basically, it worked, but not without issues:

- The biggest issue was debugging: because the bugs appeared in XCode, not Cursor, it either meant laboriously describing/transcribing errors into Cursor, or manually fixing them.

- The 'parallel' work between Cursor and XCode was clunky, especially when Cursor created new files. It took a while to figure out a halfway-decent workflow.

- At one point something screwed up somewhere deep in the confusing depths of XCode, and the app refused to compile altogether. Neither Cursor nor I could figure it out, but a new project with the files transferred over worked just fine.

But... after a few short hours' chatting, learning, and fixing, I had a functional app. It wasn't free of frustrations, and it's pretty far from the level where a non-coder could do the same, but it impressed me that it's already at the level where it's a decent multiplier of someone's abilities.


aider (an AI assistant that will do coding for/with you, depending on how you use it) has one of the more illuminating pieces of information on this. Here is a graph of aider's percentage contribution to its own development over time:

https://aider.chat/HISTORY.html


I don't think it is an illusion. It can remove a lot of barriers to entry for some people, and this is probably what you're seeing in the anecdata.

For example, my brother. He is what I'd refer to as 'tech-aligned' - he can and has written code before, but does not do it for a living and only ever wrote basic Python scripts every now and then to help with his actual work.

LLMs have enabled him to build out web apps in perhaps 1/5 of the time it would have taken him if he tried to learn and build them out from scratch. I don't think he would have even attempted it without an LLM.

Now it doesn't 'code everything' - he still has to massage the output to get what he wants, and there is still a learning curve to climb. But the spring-board that LLMs can give people, particularly those who don't have much experience in software development, should not be underestimated.


There is a big gap between being able to create a somehow working application and shipping a product to a customer.

Those claims are about being able to create a profitable product with 10x efficiency.


Current-gen AI can write obvious code well, but fails at anything that involves complexity or subtlety in my experience

I think it’s that AI unlocks the ability to code something up and test an idea for people who’re technical enough to get it working, but not really developers themselves. It’s not (yet at least) a substitute for a good dev team that knows what they’re doing.

But this is still huge, and shouldn’t be disregarded.


In my experience, AI is good at building stuff in two scenarios:

- You have zero engineering background and you use an LLM to build an MVP from scratch. As long as the MVP is sufficiently simple there is plenty of training data for LLM to do well. E.g. some kind of React website with a simple REST API backend. This works as long as the app is simple enough, but it'll start breaking down as the app becomes more complex and requires domain-specific business knowledge or more sophisticated engineering techniques. Because you don't understand what the LLM is doing, you can't debug or extend any of it.

- You are an experienced developer and know EXACTLY what you want. You then use an LLM to write all the boilerplate for you. I was surprised at how much of my daily engineering work is actually just boilerplate. Using an LLM has made me significantly more productive. This only works if you know what you're doing, can spot mistakes immediately, and can describe in detail HOW an LLM should be doing the task.

For use cases in middle, LLMs kind of suck.

So I think the comparison to a (very) junior engineer is quite apt. If the task is simple you can just let them do it. If the task is hard or requires a lot of context, you need to give them step by step instructions on how to go about it, and that requires that you know how to do it yourself.


These are exactly my experiences as well. Senior devs on my team are rocking and rolling with the AI; my junior devs have all but given up using it, even after numerous retros etc…

For me, AI has been pretty useful. The difference is I'm not a software engineer, I just write scripts to help me do my job. If I wrote bigger applications I doubt LLMs could help me.

AI is awesome for small coding tasks, with a defined scope. I don't write shell (or powershell) scripts anymore, AI does it now for me.

But once a project has more than 20 source files, most AI tools seem to be unable to grasp the context of the project. In my experience AI is really bad at multi threading code and distributed systems. It seems to be unable to build its "mental model" for those kind of problems.


This. It's good at CMake, and mostly good at dealing with COM boilerplate (although hallucinations are still a problem).

But threading and asynchronous code are implicit - there's a lot going on that you can't see on the page; you need to think about what the system is actually doing rather than simply the words to make it do the thing.


They are useful for small tasks like refactoring a method, however big the whole project is.

It’s not mentioned anywhere in the post, but it would be good to hear what the total time was, including all the problems.

Well, still trying to get into nvrhi, I went on to ask ChatGPT to write me an example program using it.

To make it short, it got better when I made a project, uploaded the headers and docs of it as project files and moved my chat into that project as well.

That said, AI can help you, but it needs a lot of support from you to do things somewhat right.


Content marketing for a new text editor thinly disguised as AI rage-bait.

HN fell for it hard - 156 points, 180 comments (as of this writing).

Well done Nick! :) And congrats on launching Codescribble! Hope to see a "how my post on AI grew my userbase" followup in a few weeks!


> LLMs are useless if you don’t understand the context
> AI can be worse than useless when you don't understand the underlying technologies

I made a saying about this some weeks ago: "A.I. can make the road for you, but you have to know where you are going". In Greek it sounds a little bit better.

Also code is the truth, but it is not the only truth. The underlying computer, the network infrastructure and other things have an effect on the code. So, there could be a saying in addition to the first: "A.I. can make the road for you, but you have to test the road".


I put it: "copilot doesn't save me much thinking, but it saves a ton of typing".

If you drop all pretenses and use a photocopier to steal code directly instead of performing an elaborate laundering step, you will not have these issues.

Maybe AI will shine when working with strongly typed languages. Most errors can be caught at compile time avoiding debugging hell.

There's not enough of a corpus out there for the LLMs to snarf up

Garbage in, garbage out. Code spewed by a random generator that has not the slightest understanding of what it is doing, whacked at by a hammer until it seems to be working.

What is this supposed to produce other than a mass of bugs and vulnerabilities? "A.I." is utter garbage and always will be, it is foolish to think otherwise.


AI allows more people to be more productive and therefore code more and produce more lines of code. That alone means more debugging needs to be done. When more people are doing anything, there is naturally more liability within that realm of action, simply because of larger participation in those actions.

AI tools are just that, tools. I’ve said this since the very beginning of LLMs. I’ve yet to see anything change my mind. Aider/Devin/Copilot/Cursor/etc, all the different flavors of LLM tools are great but if you don’t know what you’re doing they are going to get stuck in a loop/corner/bad-path. Sometimes it takes 2-6+ exchanges before you realize it’s lost the thread which is why I love Aider’s “auto git commit” feature (defaults to on). You can always jump back X steps if you realize the LLM is lost.

You also have to get a good feel for when it’s best if you make a change vs the LLM. Aider doesn’t handle new files and moving around massive chunks super well. It can do it, but if I want to rename something everywhere or break out components/types/etc into different files then I know I should be doing that in my IDE myself. Same for little syntax errors when a diff the LLM makes isn’t quite right.

I spent a few nights last week using LLMs to help build a chrome extension to match my Amazon transactions with my YNAB transactions for the purpose of updating the memo field in YNAB with the item names I bought from Amazon to speed up my categorization and serve as history of what I bought (previously I did this whole process manually). I think it really helped and made the whole process go much faster.

It really excels (for me) in UI. I’d like to think I’m pretty competent at writing code/logic but I’m not great at UI. In many projects I get bogged down when it comes to UI. If I get stuck coming up with a UI or I don’t like how something looks I can lose motivation to continue forward on it. With Aider I can ask for UI and while it might be abhorrent to a designer I think it looks pretty damn good (better than what I could do) and lets me focus on the logic. Aider also lets me try radical changes knowing I can easy reset back a few steps if it doesn’t work out.

I’ve said many times at work that a huge power of LLMs is taking something that would take 30-60min down to <5min, specifically around things like little scripts to investigate a problem or get more details. For example, I might have a log that I can see there is data in that I want to extract. I know I can write a chained/piped command of sed/awk/grep/cut/sort/uniq/etc but it’s going to take some trial and error as well as time. With an LLM I can bang out the full command in 1-3 exchanges.
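
As a made-up example of that kind of throwaway extraction, here is the same idea as a small Python script rather than a shell pipeline (the 'ERROR <code>' log format and regex are assumptions):

    import re
    from collections import Counter

    def top_error_codes(log_path: str, n: int = 10):
        """Stand-in for a grep | cut | sort | uniq -c pipeline: pull a code
        out of each matching line and count how often each one occurs."""
        pattern = re.compile(r"ERROR\s+(\S+)")
        counts = Counter()
        with open(log_path, errors="ignore") as f:
            for line in f:
                m = pattern.search(line)
                if m:
                    counts[m.group(1)] += 1
        return counts.most_common(n)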

Same deal with visualizing some piece of data in the logs (note: yes, we use Prometheus/Grafana, but not everything can go in there, and for new bugs/issues in the field I’m normally dealing with something we haven’t seen before and thus haven’t set up monitoring/alerting on). I’ve had LLMs churn out simple HTML/JS/CSS files that I can feed data into: “graph all instances of this happening if X > Y and time is between A and B”, etc.

Again, I can write this stuff from scratch but often don’t do it in practice because the ROI isn’t guaranteed. In the middle of a production issue do I want to waste 10-30+ min writing the script to see if I can prove a theory? No, it’s not worth it if it doesn’t pan out, but if I’m using an LLM and it takes me less than five minutes then I can throw a lot more stuff at the wall to see if it sticks.


Never believe the snake oil sellers

I’ve built and iterated a bunch of web applications with Claude in the past year—I think the author’s experience here was similar to some of my first tries, where I nearly just decided not to bother any further, but I’ve since come to see it as a massive accelerant as I’ve gotten used to the strengths and weaknesses. Quick thoughts on that:

1. It’s fun to use it to try unfamiliar languages and frameworks, but that exponentially increases the chance you get firmly stuck in a corner like OP’s deployment issue, where the AI can no longer figure it out and you find yourself needing to learn everything on the fly. I use a Django/Vue/Docker template repo that I’ve deployed many production apps from and know like the back of my hand, and I’m deeply familiar with each of the components of the stack.

2. Work in smaller chunks and keep it on a short leash. Agentic editors like Windsurf have a lot of promise but have the potential to make big sweeping messes in one go. I find the manual file context management of Aider to work pretty well. I think through the project structure I want and I ask it to implement it chunk by chunk—one or two moving pieces at a time. I work through it like I would pair programming with someone else at the keyboard: we take it step by step rather than giving a big upfront ask. This is still extremely fast because it’s less prone to big screwups. “Slow is smooth and smooth is fast.”

3. Don’t be afraid to undo everything it just did and re-prompt.

4. Use guidelines—I have had great success getting the AI to follow my desired patterns, e.g. how and where to make XHRs, by stubbing them in somewhere as an example or explicitly detailing them in a file.

5. Suggest the data structures and algorithms you want it to use. Design the software intentionally yourself. Tell it to make a module that does X with three classes that do A, B and C.

6. Let the AI do some gold plating: sometimes you gotta get in there and write the code yourself, but having an LLM assistant can help make it much more robust than I’d bother to in a PoC type project—thorough and friendly error handling, nice UI around data validation, extensive tests I’m less worried about maintaining, etc. There are lots of areas where I find myself able to do more and make better quality-oriented things even when I’m coding the core functionality myself.

7. Use frameworks and libraries the AI “knows” about. If your goal is speed, using something sufficiently mainstream that it has been trained on lots of examples helps a lot. That said, if something you’re using has had a major API change, you might struggle with it writing 1.0-style code even though you’re using 2.0.

8. Mix in other models. I’ve often had Claude back itself into a corner, only to loop in o1 via Aider’s architect mode and have it figure out the issue and tell Claude how to fix it.

9. Get a feel for what it’s good at in your domain—since I’m always ready to quickly roll back changes, I always go for the ambitious ask and see whether it can pull it off—sometimes it’s truly amazing in one shot! Other times it’s a mess and I undo it. Either way over time you get an intuition for when it will screw up. Just last week I was playing around with a project where I had a need to draw polygons over a photograph for debugging purposes. A nice to have on top of that was being able to add, delete, and drag to reshape them, but I never would have bothered coding it myself or pulling in a library just for that. I asked Claude for it, and got it in one shot.


The real revolution will be when an AI tool can just be powered by our laptop to use our own codebase as the input....

Until then it's just nonsense pretending to be something else...


So, an M3 MacBook with 64GiB of RAM running Deepseek R1 Zero in ollama prompted via aider?

Coding assistants do use your code base.


