
I entirely agree about their utility.

HN, and the internet in general, have become just an ocean of reactionary sandbagging and blather about how "useless" LLMs are.

Meanwhile, in the real world, I've found that I haven't written a line of code in weeks. Just paragraphs of text that specify what I want and then guidance through and around pitfalls in a simple iterative loop of useful working code.

It's entirely a learned skill; the models (and, very importantly, the tooling around them) have arrived at the baseline they needed.

Much, much more productive world by just knuckling down and learning how to do the work.

edit: https://aider.chat/ + paid 3.5 sonnet




> Much, much more productive world by just knuckling down and learning how to do the work.

The fact is, everyone who says they've become more productive with LLMs won't say how exactly. I can talk about how Vim has made it more enjoyable to edit code (keybindings and motions), how Emacs is a good environment for tooling around text (a Lisp machine), how I use technical books to further my learning (so many great books out there). But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them. It's all claims that it's great with no further elaboration on the workflows.

> I haven't written a line of code in weeks. Just paragraphs of text that specify what I want and then guidance through and around pitfalls in a simple iterative loop of useful working code.

Code is intent described in terms of machine actions. Those actions can be masked by abstracting them into more understandable units, so we don't have to write opcodes; we can use Python instead. Programming is basically making the intent clear enough that we know which units we can use. Software engineering is mostly selecting the units so that minimal work is needed once the intent changes or the foundational actions do.

Chatting with an LLM looks to me like your intent is either vague or you don't know which units to use. If it's the former, then I guess you're assuming it is the expert and will guide you to the solution you seek, which means you believe it understands the problem more than you do. The latter is stranger, as it looks like playing around with car parts while ignoring the manuals they come with.

What about boilerplate and common scenarios? I agree that LLMs help a great deal with that, but the fact is that there are already perfectly good tools for that, like snippets, templates, and code generators.


Ever seen someone try to search for something on Google and they are just AWFUL at it? They can never find what they're looking for, and then you try and can pull it up in a single search? That's what it is like watching some people try to use LLMs. Learning how to prompt an LLM is as much a learned skill as learning how to phrase internet searches is. And as much as people decried that "searching Google isn't a real skill", tech-savvy people knew better.

Same thing, except now it's also many tech-savvy people joining in with the tech-unsavvy in saying that prompting isn't a real skill... but people who know better know that it is.

On average, people are awfully bad at describing exactly what it is they want. Ever speak with a client? And you have to go back and forth for a few hours to finally figure out what it is they wanted? In that scenario you're the LLM. Except the LLM won't keep asking probing questions and clarifications - it will simply give them what they originally asked for (which isn't what they want). Then they think the LLM is stupid and stop trying to ask it for things.

Utilizing an LLM to its full potential is a lot of iterative work and, at least for the time being, requires having some understanding of how it works underneath the hood (eg. would you get better results by starting a new session or asking it to forget previous, poorly worded instructions?).


I'm not arguing that you can't get results with LLMs, I'm just asking whether it's worth the actual effort, especially when there's a better way to get the result you're seeking (or whether the result is really something that you want).

An LLM is a word (token?) generator which can be amazingly consistent according to its model. But rarely is my end goal to generate text. It's either to do something, to understand something, or to communicate. For the first, there are guides (books, manuals, ...), for the second, there are explanations (again books, manuals,...), and the third is just using language to communicate what's on my mind.

It's the same thing with search engines. I use them to look for something. What I need first is a description of that something, not instructions on how to do the "looking for". Then, once I know what I want to find, it's easier to use the tool to find it.

If your end goal can be achieved with LLMs, be my guest and use them. But I'm wary of people taking them at face value and then pushing the workload onto everyone else (like developers using Electron).


It's hard to quantify how much time learning how to search saves because the difference can range between infinite (finding the result vs not finding it at all) to basically no difference (1st result vs 2nd result). I think many people agree it is worth learning how to "properly search" though. You spend much less time searching and you get the results you're looking for much more often. This applies outside of just Google search: learning how to find and lookup information is a useful skill in and of itself.

ChatGPT has helped me write some scripts for things that otherwise probably would have taken me at least 30+ minutes, and it wrote them in <10 seconds and they worked flawlessly. I've also had times where I worked with it for 45 minutes and only ever got error-ridden code, where I had to fix the obvious errors and rewrite parts of it to get it working. Sometimes during this process it actually taught me a new approach to doing something. If I had started coding it from scratch by myself, it probably would have taken me only ~10 minutes. But if I were better at prompting, what if that 45 minutes was <10 minutes? It would go from a time loss to a time save and be worth using. So improving my ability to prompt is worthwhile as long as doing so trends towards me spending less time prompting.

Which is thankfully pretty easy to track and test. On average, as I get better at prompting, do I need to spend more or less time prompting to get the results I am looking for? The answer to that is largely that I spend less time and get better results. The models constantly changing and improving over time can make this messy - is it the model getting better or is it my prompting? But I don't think models change significantly enough to rule out that I spend less time prompting than I have in the past.


> how much time learning how to search saves

>>> you do need to break down the problem into smaller chunks so GPT can reason in steps

To search well, you need good intuition for how to select the right search terms.

To LLM well, you can ask the LLM to break the problem into smaller chunks, and then have the LLM solve each chunk, and then have the LLM check its work for errors and inconsistencies.

And then you can have the LLM write you a program to orchestrate all of those steps.
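A minimal sketch of what that orchestration might look like, assuming the OpenAI Python client; the model name, prompts, and the `ask` helper here are illustrative, not a prescribed workflow:

    # Minimal sketch of a decompose -> solve -> review loop (OpenAI Python client;
    # model name and prompts are placeholders).
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    problem = "Write a script that deduplicates rows in a large CSV file."

    # 1. Ask the LLM to break the problem into smaller chunks.
    plan = ask(f"Break this problem into small, independent steps:\n{problem}")

    # 2. Ask it to solve each chunk.
    solution = ask(f"Solve each step below, one at a time, with code:\n{plan}")

    # 3. Ask it to check its own work.
    review = ask(f"Check this solution for errors or inconsistencies:\n{solution}")
    print(review)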


Yes, you can. What was the name of the agent that was going to replace all developers? Devin or something? It was shown that it took more time to iterate over a problem and created terrible solutions.

LLMs are in the evolutionary phase, IMHO. I doubt we're going to see revolutionary improvements from GPTs. So I say time and time again: the technology is here, show it doing all the marvelous things today. (btw, this is not directed at your comment in particular and I digressed a bit, sorry).


> asking whether it's worth the actual effort

If prompting ability varies, then this is not some objective question; it depends on each person.

For me I've found more or less every interaction with an LLM to be useful. The only reason I'm not using it continually for 8 hours a day is because my brain is not able to usefully manage that torrent of new information and I need downtime.


It works quite nicely if you consider LLMs as a translator (and that’s actually why Transformers were created).

Enter technical specifications in English as input language, get code as destination language.


English as the input language works in simple scenarios but breaks down very, very quickly. I have to get extremely specific and deliberate. At some point I have to write pseudocode to get the machine to get, say, double-checked locking right. Because I have enough experience of varying the prompting and it not working, I revert to just writing the code when I see the generator struggling.
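For reference, this is the pattern I mean (a minimal Python sketch of double-checked locking; the expensive resource here is hypothetical):

    # Minimal sketch of double-checked locking for lazy, thread-safe initialization.
    # ExpensiveResource is a hypothetical stand-in for whatever is costly to build.
    import threading

    class ExpensiveResource:
        def __init__(self):
            self.ready = True

    _lock = threading.Lock()
    _instance = None

    def get_instance() -> ExpensiveResource:
        global _instance
        # First check without the lock: cheap fast path once initialized.
        if _instance is None:
            with _lock:
                # Second check under the lock: another thread may have initialized
                # it between the first check and acquiring the lock.
                if _instance is None:
                    _instance = ExpensiveResource()
        return _instance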

When I encounter somebody who says they do not write code anymore, I assume that they either:

1. Just don't do anything beyond the simplest tutorial-level stuff

2. or don't consider their post-generation edits as writing code

3. or are just bullshitting

I don't know which it is for each person in question, but I don't trust that their story would work for me. I don't believe they have some secret-sauce prompting that works for scenarios where I've tried to make it work but couldn't. Sure, I may have missed some ways, and my map of what works and what doesn't may be very blurry at the border, but the surprises tend to be on the "doesn't work" side. And no, Claude doesn't change this.


I definitely still write code. But I also prefer to break down problems into chunks which are small enough that an LLM could probably do them natively, if only you can convince it to use the real API instead of inventing a new API each time — concrete example from ChatGPT-3.5, I tried getting it to make and then use a Vector2D class — in one place it had sub(), mul() etc., the other place it had subtract(), multiply() etc.
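Roughly this kind of thing (a reconstructed Python sketch, not the actual ChatGPT output):

    # Reconstruction of the naming drift: the class defines one set of method
    # names, while code generated elsewhere assumes a different set.
    class Vector2D:
        def __init__(self, x: float, y: float):
            self.x, self.y = x, y

        def sub(self, other: "Vector2D") -> "Vector2D":
            return Vector2D(self.x - other.x, self.y - other.y)

        def mul(self, k: float) -> "Vector2D":
            return Vector2D(self.x * k, self.y * k)

    # Elsewhere it "used" the class with names it never defined:
    v = Vector2D(3, 4).subtract(Vector2D(1, 1))  # AttributeError: no 'subtract'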

It can write unit tests, but makes similar mistakes, so I have to rewrite them… but it nevertheless still makes it easier to write those tests.

It writes good first-drafts for documentation, too. I have to change it, delete some stuff that's excess verbiage, but it's better than the default of "nobody has time for documentation".


Exactly! What is this job that you can get where you don't code and just copy-paste from ChatGPT? I want it!

My experience is just as you describe it: if I ask a question whose answer is on Stack Overflow or fucking geeks4geeks, then it produces a good answer. Anything more is an exercise in frustration as it tries to sneak nonsense code past me with the same confident spiel with which it produces correct code.


It's absolutely a translator, but they're similarly good/bad/weird/hallucinatory at natural-language translations, too.

Consider this round-trip in Google Translate:

"དེ་ནི་སྐད་སྒྱུར་པ་ཞིག་ཡིན། འོན་ཀྱང་ཁོང་ཚོ་རང་བྱུང་སྐད་སྒྱུར་གྱི་སྐད་སྒྱུར་ནང་ལ་ཡག་པོ/ངན་པ/ཁྱད་མཚར་པོ/མགོ་སྐོར་གཏོང་བ་འདྲ་པོ་ཡོད།"

"It's a translator. But they seem to be good/bad/weird/delusional in natural translations. I have a"

(Google translate stopped suddenly, there).

I've tried using ChatGPT to translate two Wikipedia pages from German to English, as it can keep citations and formatting correct when it does so; it was fine for the first 2/3rds, then it made up mostly-plausible statements that were not translated from the original for the rest. (Which I spotted and fixed before saving, because I was expecting some failure).

Don't get me wrong, I find them impressive, but I think the problem here is the Peter Principle: the models are often being promoted beyond their competence. People listen to that promotion and expect them to do far more than they actually can, and are therefore naturally disappointed by the reality.

People like me who remember being thrilled to receive a text adventure casette tape for the Commodore 64 as a birthday or christmas gift when we were kids…

…compared to that, even the Davinci model (that really was autocomplete) was borderline miraculous, and ChatGPT-3.5 was basically the TNG-era Star Trek computer.

But anyone who reads me saying that last part without considering my context, will likely imagine I mean more capabilities than I actually mean.


> On average, people are awfully bad at describing exactly what it is they want. Ever speak with a client? And you have to go back and forth for a few hours to finally figure out what it is they wanted?

With one of them, that was the entire duration of my time working for them.

They didn't understand why it was taking so long despite constantly changing what they asked for.


Building the software is usually like 10% of the actual job; we could do a better job of teaching that.

The other 90% is mostly mushy human stuff, fleshing out the problem, setting expectations etc. Helping a group of people reach a solution everyone is happy with has little to do with technology.


Mostly agree. Until ChatGPT, I'd have agreed with all of that.

> Helping a group of people reach a solution everyone is happy with has little to do with technology.

This one specific thing, is actually something that ChatGPT can help with.

It's not as good as the best human, or even a middling human with 5 year's business experience, but rather it's useful because it's good enough at so many different domains that it can be used to clarify thoughts and explain the boundaries of the possible — Google Translate for business jargon, though like Google Translate it is also still often wrong — the ultimate "jack of all trades, master of none".


We're currently in the shiny-toy stage; once the flaws are thoroughly explored and accepted by all as fundamental, I suspect interest will fade rapidly.

There's no substance to be found, no added information; it's just repeating what came before, badly, which is exactly the kind of software that would be better off not written if you ask me.

The plan to rebuild society on top of this crap is right up there with basing our economy on manipulating people into buying shit they don't need and won't last so they have to keep buying more. Because money.


The worry I have is that the net value will become great enough that we’ll simply ignore the flaws, and probabilistic good-enough tools will become the new normal. Consider how many ads the average person wades through to scroll an Insta feed for hours - “we’ve” accepted a degraded experience in order to access some new technology that benefits us in some way. To paraphrase comedian Mark Normand: “Capitalism!”


Scary thought, difficult to unthink.

I'm afraid you might be right.

We've accepted a lot of crap lately just to get what we think we want, convenience is a killer.


Indeed, even if I were to minimise what LLMs can do, they are still achieving what "targeted advertising" very obviously isn't.


They're both short sighted attempts at extracting profit while ignoring all negative consequences.


To an extent I agree; I think that's true of all tech since the plough, fire, and axles.

But I would otherwise say that most (though not all*) AI researchers seem to be deeply concerned about the set of all potential negative consequences, including mutually incompatible outcomes where we don't know which one we're even heading towards yet.

* And not just Yann LeCun — though, given his position, it would still be pretty bad even if it was just him dismissing the possibility of anything going wrong


> That's what it is like watching some people try to use LLMs.

Exactly. I made a game testing prompting skills a few days earlier, to share with some close friends, and it was your comment that inspired me to translate the game into English and submit it to HN. ( https://news.ycombinator.com/item?id=41545541 )

I am really curious about how other people write prompts, so while my submission only got 7 points, I'm happy that I can see hundreds of people's own ways to write prompts thanks to HN.

However, after reading most prompts (I may have missed some), I found exactly 0 prompts containing any kind of common prompting technique, such as "think step by step", explaining the specific steps to solve the problem instead of only asking for the final result, or few-shot prompting (showing example inputs and outputs). Half of the prompts simply ask the AI to do the thing (at least asking correctly). The other half do not make sense; even if we showed the prompt to a real human, they wouldn't know what to reply with.

Well... I expected that SOME complaints about AI online are from people not familiar with prompting / not good at prompting. But now I realize there are a lot more people than I thought who don't know some basic prompting techniques.

Anyway, a fun experience for me! Since it was your comment made me want to do this, I just want to share it with you.


Could you reference any YouTube videos, blog posts, etc. of people you would personally consider to be _really good_ at prompting? Curious what this looks like.

While I can compare good journalists to extremely great and intuitive journalists, I don't really have any references for this in the prompting realm (except for when the Dall-E Cookbook was circulating around).


Sorry for the late response - but I can't. I don't really follow content creators at a level where I can recall names or even what they are working on. If you browse AI-dominated spaces you'll eventually find people who include AI as part of their workflows and have gotten quite proficient at prompting them to get the results they desire very consistently. Most AI stuff enters into my realm of knowledge via AI Twitter, /r/singularity, /r/stablediffusion, and Github's trending tab. I don't go particularly out of my way to find it otherwise.

/r/stablediffusion used to (less so now) have a lot of workflow posts where people would share how they prompt and adjust the knobs/dials of certain settings and models to make what they make. It's not so different from knowing which knobs/dials to adjust in Apophysis to create interesting fractals and renders. They know what the knobs/dials adjust for their AI tools and so are quite proficient at creating amazing things using them.

People who write "jailbreak" prompts are a kind of example. There is some effort put into preventing people from prompting the models and removing the safeguards - and yet there are always people capable of prompting the model into removing its safeguards. It can be surprisingly difficult to do yourself for recent models and the jailbreak prompts themselves are becoming more complex each time.

For art in particular - knowing a wide range of artist names, names of various styles, how certain mediums will look, as well as mix & matching with various weights for the tokens can get you very interesting results. A site like https://generrated.com/ can be good for that as it gives you a quick baseline of how including certain names will change the style of what you generate. If you're trying to hit a certain aesthetic style it can really help. But even that is a tiny drop in a tiny bucket of what is possible. Sometimes it is less about writing an overly detailed prompt but rather knowing the exact keywords to get the style you're aiming for. Being knowledgeable about art history and famous artists throughout the years will help tremendously over someone with little knowledge. If you can't tell a Picasso from a Monet painting you're going to find generating paintings in a specific style much harder than an art buff.


> But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them. It's all claims that it's great with no further elaboration on the workflows.

To give an example, one person (a researcher at DeepMind) recently wrote about specific instances of his uses of LLMs, with anecdotes about alternatives to each example. [1] People on HN had different responses with similar claims with elaborations on how it has changed some of their workflows. [2]

While it would be interesting to see randomized controlled trials on LLM usage, hearing people's anecdotes brings to mind the (often misquoted) phrase: "The plural of anecdote is data". [3] [4]

[1] https://nicholas.carlini.com/writing/2024/how-i-use-ai.html

[2] https://news.ycombinator.com/item?id=41150317

[3] http://blog.danwin.com/don-t-forget-the-plural-of-anecdote-i...

[4] originally misquoted as "Anecdote is the plural of data."


> (often misquoted) phrase

You misquoted it there! It should be: The plural of anecdote is data.


Thank you! Another instance of a variant of Muphry's Law.

https://en.wikipedia.org/wiki/Muphry's_law


It's actually "the plural of 'anecdote' is not 'data'".


Apparently what you've said is the most common misquotation. See [3] above.


Oh interesting, thanks! I much prefer that formulation.


In the CUDA example [1] from Carlini's "How I Use AI", I would guess that o1 would need less handholding to do what he wanted.

[1] https://chatgpt.com/share/1ead532d-3bd5-47c2-897c-2d77a38964...


Or people say "I've been pumping out thousands of lines of perfectly good code by writing paragraphs and paragraphs of text explaining what I want!" its like what are you programming dog? and they will never tell you, and then you look at their github and its like a dead simple starter project.

I recently built a Brainfuck compiler and TUI debugger, and I tested out a few LLMs just to see if I could get some useful output regarding a few niche and complicated issues, and it just gave me garbage that looked mildly correct. Then I'm told it's because I'm not prompting hard enough... I'd rather just learn how to do it at that point. Once I solve that problem, I can solve it again in the future in 0.25x the time.


Here's the thing: 99% of people aren't writing compilers or debuggers, they're writing glorified CRUDs. LLMs can save a lot of time for these people, just like 99% of people only use basic arithmetic operations and MS Excel saves a lot of time for them. It's not about solving new problems, it's about solving old and known problems very fast.


> "99% of people aren't writing compilers or debuggers"

Look, I get the hype - but I think you need to step outside a bit before saying that 99% of the software out there is glorified CRUDs...

Think about the aerospace/defense industries, autonomous vehicles, cloud computing, robotics, sophisticated mobile applications, productivity suites, UX, gaming and entertainment, banking and payment solutions, etc. Those are not small industries - and the software being built there is often highly domain-specific, has various scaling challenges, and takes years to build and qualify for "production".

Even a simple "glorified CRUD", at a certain point, will require optimizations, monitoring, logging, debugging, refactoring, security upgrades, maintenance, etc...

There's much more to tech than your weekend project "Facebook but for dogs" success story, which you built with ChatGPT in 5 minutes...


This is almost entirely written by LLMs:

https://github.com/williamcotton/guish

I was the driver. I told it to parse and operate on the AST, to use a plugin pattern to reduce coupling, etc. The machine did the tippy-taps for me and at a much faster rate than I could ever dream of typing!

It’s all in a Claude Project and can easily and reliably create new modules for bash commands because it has the full scope of the system in context and a ginormous amount of bash commands and TypeScript in the training corpus.


One good use case is unit tests, since they can be trivial while at the same time being cumbersome to write. I could give the LLM code for React components, and it would write the tests and set up all the mocks, which is the most annoying part. Although making "all the tests" will typically involve asking the LLM again to think of more edge cases and be sure to cover everything.


> I recently built a Brainfuck compiler and TUI debugger

Highly representative of what devs make all day indeed


Yea, obviously not, but the smaller problems this bigger project was composed of were things that you could see anywhere. I made heavy use of string manipulation that could be generally applied to basically anything.


Really? Come on. You think trying to make it solve "niche and complicated issues" for a Brainfuck compiler is reasonable? I can't take this seriously. Do you know what most developer jobs entail?

I never need to type paragraphs to get the output I want. I don't even bother with correct grammar or spelling. If I need code for some CRUD web app, who is going to type it faster, me or the LLM? This is really not hard to understand.


For many of us programming is a means to an end. I couldn't care less about compilers.


Specifically within the last week, I have used Claude and Claude via cursor to:

- write some moderately complex powershell to perform a one-off process

- add typescript annotations to a random file in my org's codebase

- land a minor feature quickly in another codebase

- suggest libraries and write sample(ish) code to see what their rough use would look like to help choose between them for a future feature design

- provide text to fill out an extensive sales RFT spreadsheet based on notes and some RAG

- generate some very domain-specific, realistic-sounding test data (just naming)

- scaffold out some PowerPoint slides for a training session

There are likely others (LLMs have helped with research and in my personal life too)

All of these are things that I could do (and probably do better) but I have a young baby at the moment and the situation means that my focus windows are small and I'm time poor. With this workflow I'm achieving more than I was when I had fully uninterrupted time.


> But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them.

I'm an iOS dev, my knowledge of JS and CSS is circa 2004. I've used ChatGPT to convert some of my circa 2009 Java games into browser games.

> Chatting with an LLM looks to me like your intent is either vague or you don't know which units to use

Or that you're moving up the management track.

Managers don't write code either. Some prefer it that way.


I have used ChatGPT to write test systems for our (physical) products. I have a pretty decent understanding of how code/programs work structurally, I just don't know the syntax/language (Python in this case).

So I can translate things like

"Create an array, then query this instrument for xyz measurements, then store those measurements in the array. Then store that array in the .csv file we created before"

It works fantastically and saved us from outsourcing.
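For illustration, a minimal sketch of the kind of Python that prompt produces (the pyvisa address, SCPI command, and file name here are placeholders, not our actual setup):

    # Hypothetical sketch: query an instrument for measurements, store them in a
    # list, and append the row to an existing CSV file.
    import csv
    import pyvisa

    rm = pyvisa.ResourceManager()
    instrument = rm.open_resource("GPIB0::12::INSTR")  # placeholder address

    measurements = []
    for _ in range(10):
        reading = float(instrument.query("MEAS:VOLT:DC?"))  # placeholder command
        measurements.append(reading)

    with open("results.csv", "a", newline="") as f:
        csv.writer(f).writerow(measurements)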


The key difference is that this is a multidisciplinary conversational interface, and a tool in itself for interrelating structured meaning and reshaping it coherently, which makes it valuable both in the specific domain of the dialog and in its potential to take the dialog on any tangent that can be expressed.

Of course it has limitations and you can't be asleep at the wheel, but that's true of any tool or task.


For one, I spend less time on Stackoverflow. LLMs can usually give you the answer to little questions about programming or command-line utilities right away.


I think people who are successfully using it to write code are just chaining APIs together to make the same web apps you see everywhere.


The vast majority of software is "just chaining APIs together". It makes sense that LLMs would excel at code they've been trained on the most, which means they can be useful to a lot of people. This also means that these people will be the first to be made redundant by LLMs, once the quality improves enough.


I would say all software is chaining APIs together.


Well, that depends on how you look at it.

All software calls APIs, but some rely on literally "just chaining" these calls together more than writing custom behavior from scratch. After all, someone needs to write the APIs to begin with. That's not to say that these projects aren't useful or valuable, but there's a clear difference in the skill required for either.

You could argue that it's all APIs down to the hardware level, but that's not a helpful perspective in this discussion.


| You could argue that it's all APIs down to the hardware level, but that's not a helpful perspective in this discussion.

Yes, that's what I'm arguing. Why isn't it useful? I think it's useful, because it demystifies things. You know that in order to do something, you need to know how to use the particular API.


Here's one from simonw

https://gist.github.com/simonw/97e29b86540fcc627da4984daf5b7...

There are more to be found on his blog on the ai-assisted-programming tag. https://simonwillison.net/tags/ai-assisted-programming/


> The fact is, everyone who says they've become more productive with LLMs won't say how exactly. But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them.

A pretty literal response: https://www.youtube.com/@TheRevAlokSingh/streams

Plenty of Lean 4 and Cursor.


> The fact is, everyone who says they've become more productive with LLMs won't say how exactly.

I have Python scripts that do a lot of automation, like downloading PDFs, bookmarking PDFs, processing them, etc. Thanks to LLMs I don't write the Python code myself, I just ask an LLM to write it; I just provide the requirements. I copy the code generated by the AI model and run it. If there are any errors, I just ask the AI to fix them.
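A typical generated script is not much more complicated than this (a simplified sketch; the URLs and output folder are placeholders):

    # Simplified sketch of a generated PDF-downloading script.
    import pathlib
    import requests

    urls = [
        "https://example.com/report-1.pdf",
        "https://example.com/report-2.pdf",
    ]
    out_dir = pathlib.Path("pdfs")
    out_dir.mkdir(exist_ok=True)

    for url in urls:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        target = out_dir / url.rsplit("/", 1)[-1]
        target.write_bytes(response.content)
        print(f"Saved {target}")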


> The fact is, everyone who says they've become more productive with LLMs won't say how exactly.

Anecdotally, I no longer use StackOverflow. I don't have to deal with random downvotes and feeling stupid because some expert with a 10k+ score across 15 SE sites votes to close my question. I'm pretty tech savvy, been doing development for 15 years, but I'm always learning new things.

I can describe a rough idea of what I want to an LLM and get just enough code for me to hit the ground running…or, I can ask a question in forum and twiddle my thumbs and look through 50 tabs to hopefully stumble upon a solution in the meantime.

I’m productive af now. I was paying for ChatGPT but Claude has been my goto for the past few months.


You clearly have made up your mind that it can't be right but to me it's like arguing against breathing. There are no uncertainties or misunderstandings here. The productivity gains are real and the code produced is more robust. Not in theory, but in practice. This is a fact for me and you trying to convince me otherwise is just silly when I have the result right in front of me. It's also not just boilerplate. It's all code.


>There are no uncertainties or misunderstandings here. The productivity gains are real and the code produced is more robust. Not in theory, but in practice.

So, that may be a fact for you but there are mixed results when you go out wide. For example [1] has this little nugget:

>The study identifies a disconnect between the high expectations of managers and the actual experiences of employees using AI.

>Despite 96% of C-suite executives expecting AI to boost productivity, the study reveals that, 77% of employees using AI say it has added to their workload and created challenges in achieving the expected productivity gains. Not only is AI increasing the workloads of full-time employees, it’s hampering productivity and contributing to employee burnout.

So not everyone is feeling the jump in productivity the same way. On this very site, there are people claiming they are blasting out highly-complex applications faster than they ever could, some of them also claiming they don't even have any experience programming. Then others claiming that LLMs and AI copilots just slow them down and cause much more trouble than they are worth.

It seems like just with programming itself, that different people are getting different results.

[1]https://www.forbes.com/sites/bryanrobinson/2024/07/23/employ...


Just be mindful that it is in one's interest to push the "LLMs suck, don't waste your time with them" narrative once they figure out how to harness LLMs.

"Jason is a strong coder, and he despises AI tools!"


In my view these models produce above-average code which is good enough for most jobs. But the Hacker News sample could be biased towards the top tier of coders - so their personal accounts of it not being good enough can also be true. For me the quality isn't anywhere close to good enough for my purposes; all of my easy code is already done, so I'm only left working on gnarly niche stuff which the LLMs are not yet helpful with.

For the effect on the industry, I generally make the point that even if AI only replaces the below-average coder, it will put downward pressure on above-average coders' compensation expectations.

Personally, I think humans appear to be getting dumber at the same time that AI is getting smarter, and while, for now, the crossover point is at a low threshold, that threshold will of course increase over time. I used to try to teach ontologies, stats, and SMT solvers to humans before giving up and switching to AI technologies, where success is not predicated on human understanding. I used to think that the inability of most humans to understand these topics was a matter of motivation, but I have rather recently come to understand that these limitations are generally innate.


It is also a problem of ego.

It is difficult, if you have been told all your life that you are the best, to accept the fact that a computer or even other people might be better than you.

It requires lot of self-reflection.

Real top-tier programmers actually don’t feel threatened by LLMs. For them it is just one more tool in the toolbox, like syntax highlighting or code completion.

They choose to use these tools based on productivity gains or losses, depending on the situation.


Not to diminish your point at all: I think it's also just a fear that the fun or interesting part of the task is being diminished. To say that the point of programming is to solve real world problems ('productivity') is true, but in my experience it's not necessarily true for the person doing the solving. Many people who work as programmers like to program (as in, the process of working with code, typing it, debugging it, building up solutions from scratch), and their job is an avenue to exercise that part of their brain.

Telling that sort of person that they're going to be more productive by skipping all the "time consuming programming stuff" is bound to hurt.


The solution to this is to code your own things for fun.


> Real top-tier programmers actually don’t feel threatened by LLMs.

They should, because LLMs are coming for them also, just maybe 2-3 years later than for programmers that aren't "real top-tier".

The idea that human intellect is something especially difficult to replicate is just delusional. There is no reason to assume so, considering that we have gone from punched-card programming to LLMs competing with humans within a single human lifetime.

I still remember when elite chessplayers were boasting "sure, chess computers may beat amateurs, but they will never beat a human grandmaster". That was just a few short years before the Deep Blue match.

The difference is that nobody will pay programmers to keep programming once LLMs outperform them. Programmers will simply become as obsolete as horse-drawn carriages, essentially overnight.


> They should, because LLMs are coming for them also, just maybe 2-3 years later than for programmers that aren't "real top-tier".

Would you be willing to set a deadline (not fuzzy dates) when my job is going to be taken by an LLM and bet $5k on that?

Because the more I use LLMs and I see their improvement rate, the less worried I am about my job.

The only thing that worries me is salaries going down because management cannot tell how bad they're burying themselves into technical debt and maintenance hell, so they'll underpay a bunch of LLM-powered interns... which I will have to clean up and honestly I don't want to (I've already been cleaning enough shit non-LLM code, LLMs will just generate more and more of that).


> Would you be willing to set a deadline (not fuzzy dates) when my job is going to be taken by an LLM and bet $5k on that?

This is just a political question and of course so long as humans are involved in politics they can just decide to ban or delay new technologies, or limit their deployment.

Also in practice it's not like people stopped traditional pre-industrial production after industrialization occurred. It's just that pre-industrial societies fell further and further behind and ended up very poor compared to societies that chose to adopt the newest means of production.

I mean, even today, you can make a living growing and eating your own crops in large swathes of the world. However you'll be objectively poor, making only the equivalent of a few dollars a day.

In short I'm willing to bet money that you'll always be able to have your current job, somewhere in the world. Whether your job maintains its relative income and whether you'd still find it attractive is a whole different question.


> The difference is that nobody will pay programmers to keep programming once LLMs outperform them. Programmers will simply become as obsolete as horse-drawn carriages, essentially overnight.

I don't buy this. A big part of the programmer's job is to convert vague and poorly described business requirements into something that is actually possible to implement in code and that roughly solves the business need. LLMs don't solve that part at all since it requires back and forth with business stakeholders to clarify what they want and educate them on how software can help. Sure, when the requirements are finally clear enough, LLMs can make a solution. But then the tasks of testing it, building, deploying and maintaining it remain too, which also typically fall to the programmer. LLMs are useful tools in each stage of the process and speed up tasks, but not replacing the human that designs and architects the solution (the programmer).


> > Real top-tier programmers actually don’t feel threatened by LLMs.

> They should, because LLMs are coming for them also, just maybe 2-3 years later than for programmers that aren't "real top-tier".

Not worrying about that because if they've gotten to that point (note: top tier programmers also need domain knowledge) then we're all dead a few years later.


Re: Compensation expectations, I figured out a long time ago that bad programmers create bad code, and bad code creates work for good programmers.

If the amount of bad code is no longer limited by the availability of workers who can be trained up to "just below average" and instead anyone who knows how to work a touchscreen can make AI slop, this opens up a big economic opportunity.


One could hope, but in my view perception precedes reality, and even if that is the reality, the perception is that AI will lower compensation demands, and those doing the layoffs/hiring will act accordingly.

You could also make the same claims about outsourcing, and while it appears that in most cases the outsourcing doesn't pay off, the perception that it would has really damaged CS as a career.


And like with outsourcing it starts with the jobs at the lower end of the skill range in an industry, and so people at the higher end don't worry about it, and later it expands and they learn that they too are not safe.

What happened a couple of decades ago in poetry [1] could happen now with programming:

> No longer is it just advertising jingles and limericks made in Haiti and Indonesia. It's quatrains, sonnets, and free-form verse being "outsourced" to India, the Philippines, Russia, and China.

...

> "Limericks are a small slice of the economy, and when people saw globalization creating instability there, a lot said, 'It's not my problem,'" says Karl Givens, an economist at Washington's Economic Policy Institute. "Now even those who work in iambic pentameter are feeling it."

[1] http://www.watleyreview.com/2003/111103-2.html


Anything that makes fewer people get into programming is good for the field of CS. Only those who truly care go into it


What sort of problems do you solve? I tried to use it. I really did. I've been working on a tree edit distance implementation based on a paper from '95. Not novel stuff. I just can't get it to output anything coherent. The code rarely runs, it's written in absolutely terrible style, and it doesn't follow any good practices for performant code. I've struggled to get it to even implement the algorithm correctly, even though it's in the literature I'm sure it was trained on.

Even test cases have brought me no luck. The code was poorly written, being too complicated and dynamic for test code in the best case and just wrong on average. It constantly generated test cases that would be fine for other definitions of "tree edit distance" but were nonsense for my version of a "tree edit distance".

What are you doing where any of this actually works? I'm not some jaded angry internet person, but I'm honestly so flabbergasted about why I just can't get anything good out of this machine.


This kind of problem is really not where LLMs shine.

Where you save loads of time is when you need to write lots of code using unfamiliar APIs. Especially when it's APIs you won't work with a lot, and spending loads of time learning them would just be a waste. In these cases LLMs can tell you the correct API calls and it's easy to verify. The LLM isn't really solving some difficult technical problem, but it saves lots of work.


This exactly. LLMs can't reason, so we shouldn't expect them to try. They can do translation extremely well, so things like converting descriptions to 90-95% correct code in 10-100x less time, or converting from one language to another, are the killer use cases IMO.

But expecting them to solve difficult unsolved problems is a fundamental misunderstanding of what they are under the hood.


I picked this problem specifically because it's about "converting from one language to another". The problem is already solved in the literature. I understand that doing cutting-edge research is a different problem, and that is explicitly not what I'm doing here, nor what I'm expecting of the tool. I have coauthored an actual published computer science paper, and this exercise is VERY far from the complexity of that.

Could you share some concrete experience of a problem where aider, or a tool like it, helped you? What was your workflow, and how was the experience?


I'm a senior engineer (as in, really senior, not only years of experience). I can get familiar with unfamiliar APIs in a few hours and then I can be sure I'm doing the right thing, instead of silently failing to meet edge cases and introducing bugs because I couldn't identify what was wrong in the LLM output (because, well, I'm unfamiliar with the API in the first place).

In other words: LLMs don't solve any noteworthy problems, at least yet.


I feel sort of the same way but I'm desperate to understand what I'm missing. So many people sing such high praises. Billions are being invested. People are proclaiming the end of software developers. What I'm looking at can't be the product they are talking about.

I'm perfectly happy reading man pages personally. Half the fun of programming to me is mastering the API to get something out of it nobody expected was in there. To study the documentation (or implementation) to identify every little side effect. The details are most of the fun to me.

I don't really intend to use the AI for myself, but I do really wish to see what they see.


Maybe for happy path cases. I've tried to ask ChatGPT how you can do a certain non-obvious thing with Kafka, and it just started inventing things. Turns out, that thing isn't actually possible to do with Kafka (by design).


I think that contemporary models are trained for engagement, not for actual help.

My experience is the same as yours, but I noticed that while LLMs circa two years ago tried to come up with the answer, the current generation of LLMs tries to make me come up with the answer. And that's not helping at all.


Did you tell it that? Are you trying to converse and discuss or are you trying to one shot stuff? If it gets something wrong, tell it. Don't just stop and try another prompt. You have to think of it as another person. You can talk to it, question it, guide it.

Try starting from ground zero and guiding it to the solution rather than trying to one shot your entire solution in one go.

I want you to implement this kind of tree in language x.

Ok good, now I want you to modify it to do Y.

Etc.


I've tried both. One time I actually tried so hard that I ran out of context, and aider just dumped me back to the main prompt. I don't think it's possible to guide it any more than that.

My problem is that the solution is right there in the paper. I just have to understand it. Without first understanding that paper, I can't possibly guide the AI towards a reasonable implementation. The process of finding the implementation is exactly the understanding of the paper, and the AI just doesn't help me with that. In fact, all too often I would ask it to make some minor change, and it would start making random changes all over the file, completely destroying my mental model of how the program worked. Making it change that back completely pulls me out of the problem.

When it's a junior at my job, at least I can feel like I'm developing a person. They retain the conversation and culture I impart as part of the problem solving process. When I struggle against the computer, it's just a waste of my time. It's not learning anything.

I'm still really curious what you're doing with it.


That’s fine until your code makes its way to production, an unconsidered side effect occurs and then you have to face me.

You are still responsible for what you do regardless of the means you used to do it. And a lot of people use this not because it’s more productive but because it requires less effort and less thought because those are the hard bits.

I’m collecting stats at the moment, but the general trend in quality, as in functional defects produced, is downward when an LLM is involved in the process.

So far it’s not a magic bullet but a push for mediocrity in an industry with a rather bad reputation. Never a good story.


Wasn't there a recent post about many research papers getting published with conclusions derived from buggy/incorrect code?

I'd put more hope in improving LLMs/derivatives than improving the level of effort and thought in code across the entire population of "people who code", especially the subset who would rather be doing something else with their time and effort / see it as a distraction from the "real" work that leverages their actual area of expertise.


> You are still responsible for what you do regardless of the means you used to do it. And a lot of people use this not because it’s more productive but because it requires less effort and less thought because those are the hard bits.

Yeah, that's...the whole point of tools. They reduce effort. And they don't shift your responsibility. For many of us, LLMs are overwhelmingly worth the tradeoffs. If your experience differs, then it's unfortunate, and I hate that for you. Don't use 'em!


Ugh, dude, I used to push bad code into production without ChatGPT. It is such a stupid argument. Do you really think people are just blindly pushing code they can't make heads or tails of? That they haven't tested? Do you seriously think people are just one shotting code and blasting it into prod? I am completely baffled by people in this industry that just don't get it. Learn to prompt. Write tests. Wtf.


My problem is that, for a surprising number of applications, it's taken me longer to have the conversation with chatgpt to get the code I want than just doing it myself.

Copilot and the like are legit for boilerplate, some test code, and POSIX/PowerShell scripting. For anything that's very common, it's great.

Anything novel, though, and it suffers. Did AWS just release some new functionality and only like 4 people have touched it so far on GitHub? Are the source docs incomplete or spread out amongst multiple pages with some implicit, between-the-lines spec? Eh, good luck; you're probably better off just reading the docs yourself or guessing and checking.

The same goes for versioning; sometimes it'll fall back to an older version of the system (e.g. Kafka with KRaft vs. ZooKeeper).

Personally, the best general use case of LLMs for me is focus. I know how to break down a task, but sometimes I have an issue staying focused on doing it and having a reasonably competent partner to rubber duck with is super useful. It helps that the chat log then becomes an easy artifact to more or less copy paste, and chatgpt doesn't do a terrible job reformatting either. Like for 90% of the stuff it's easier than using vim commands.


It seems great for like straightforward linear code, elisp functions, python data massage scripts, that sort of thing. I had it take a shot at some new module for a high volume Go server with concurrency/locking concerns and nil pointer receivers. I got more panics from the little bit of code GPT wrote than all my own code, not because it was bad code but because when I use dangerous constructs like locking and pointers that can be nil, I have certain rigid rules for how to use them and the generated code did not follow those rules.


> Do you really think people are just blindly pushing code they can't make heads or tails of? That they haven't tested?

Yes, most definitely. I've recently been introduced to our CTOs little pet project that he's been building with copious help from ChatGPT, and it's genuinely some of the most horrid code I've ever seen in my professional career. He genuinely doesn't know what half of it even does when I quizzed him about some of the more egregious crap that was in there. The real fun part is that now that it's a "proven" PoC some poor soul is going to have to maintain that shit.

We also have a mandate from the same CTO to use more AI in our workflows, so I have 0 doubts in my mind that people are blindly pushing code without thinking about it, and people like myself are left dealing with this garbage. My time & energy is being wasted sifting through AI-generated garbage that doesn't pass the smell test if you spend a singular minute of effort reading through the trash it generates.


Yes that's exactly what they are doing.

I literally had someone with the balls to tell me that it was ChatGPT's fault.

Due diligence and intelligence has shit the fucking bed quite frankly.


Do you think ChatGPT has changed any of those answers from Yes to No? Because it hasn't.

People blindly copied Stack Overflow code, they blindly copied every example off of MSDN, and they blindly copy from ChatGPT. Your holier-than-thou statements are funny, and frankly most LLMs cannot leave a local maximum, so when anyone says they don't write any code anymore, I frankly think they are not capable of spotting the mistakes, both architectural and specific, that they are making.

More and different prompting will not dig you out of the hole.


This. Most people I know that use LLMs to be super productive are like "make me a button, it's red" (hyperbolic statement but you know what I mean). I can do that faster and better myself.

When I'm deeply stuck on something and I think "let's see if an LLM could help here", I try (and actually tried many times) to recruit those prompting gurus around me that swear LLMs solve all their problems... and they consistently fail to help me at all. They cannot solve the problem at all and I'm just sitting there, watching the gurus spend hours prompting in circles until they give up and leave (still thinking LLMs are amazing, of course).

This experience is what makes me extremely suspicious of anyone on the internet claiming they don't write code anymore but refusing to show (don't tell!) -- when actually testing it in real life it has been nothing but disappointment.


> Do you really think people are just blindly pushing code they can't make heads or tails of? That they haven't tested? Do you seriously think people are just one shotting code and blasting it into prod?

Yes, and I see proof of it _literally every day_ in Code Reviews where I ask juniors to describe or justify their choices and they shrug and say "That's what Copilot told me to put".


That sounds more like poor hiring decisions.


That sounds more like moving the goalposts. The claim (via sarcastic comment) was that people do not simply push code that they do not understand - and I provided a counter-example. No-one in that conversation disagrees that that's a bad practice - but until and unless I have full mandate to hire and fire whoever I want to work with, or to change jobs at will, I'm going to have to work with people whose development practices I disagree with.


> I've found that I haven't written a line of code in weeks

Which is great until your next job interview. Really, it's tempting in the short run but I made a conscious decision to do certain tasks manually only so that I don't lose my basic skills.


ChatGPT voice interface plugged into the audio stream, with the prompt:

- I need you to assist me during a programming interview; you will be listening to two people, the interviewer and me. When the interviewer asks a question, I'd like you to feed me lines that seem realistic for an interview where I'm nervous; don't give me a full-blown answer right away. Be very succinct. If I think you misunderstood something, I will mention the key phrase "I'm nervous today and had too much coffee". In this situation, remember I'm the one who will say the phrase, and it might be because you've mistaken me for the interviewer and I want you to "reset". If I want you to dig deeper than what you've provided me with, I'll say the key phrase "Let's dig deeper now". If I think you've hallucinated and want you to try again, I'll say "This might be wrong, let me think for just a minute please". Remember, other than these key phrases, I'll only be talking to the interviewer, not you.

On a second screen of some sort. Other than that, interviewers will just have to accept that nobody will be doing the job without these sorts of assistants from now on anyway. As an interviewer, I already let candidates consult online docs for specific things because they'll have access to Google on the job; this is just an extension of that.


I recently interviewed a number of people about their SQL skills. The format I used was to share two queries with them a couple of days ahead of time in a Google Doc, and tell them I would ask them questions about those queries during the interview.

Out of maybe twenty people I interviewed this way, only three pointed out that one of the queries had an error that would make it fail. It was something any LLM would immediately point out.

Beyond that: the first question I asked was: "What does this query do, what does it return?" I got responses ranging from people who literally read the query back to me word by word, giving the most shallow and direct explanation of what each bit did step-by-step, to people who clearly summarized what the query did in high-level, abstract terms, as you might describe what you want to accomplish before you write the query.

I don't think anyone did something with ChatGPT live, but maybe?


This made me laugh. I can't deny it isn't already happening. But wow people work so hard to avoid working hard.


It's not about avoiding hard work - the audience on HN skews wealthy due to heavy representation of skilled devs in their 30s+, but the average person does not earn anything close to FAANG salaries. Even most devs in general don't earn like that. The interview process being fairly well understood in general, any advantage that can possibly get a person from $60k/year to generationally-life-changing $300k/year will be used eventually.


And I wrote this as a knee-jerk reaction after reading the parent; I imagine people will put in way more effort if it can get them a great job. And to be honest, if they can fool you, they can most likely do the job as well. Most of the industry tests at a higher skill level than what is actually required day to day anyway.


It's almost inspiring, isn't it?


I think the point is to avoid pointless hard work.


Not everyone is doing coding interviews. Some might struggle with a particular language due to lack of muscle memory, but can dictate the logic in pseudocode and can avoid pitfalls inferred from past experience. This sort of workflow is compatible with LLMs, assuming a sufficient background (otherwise one can't recognize when the output diverges from your intent).

I personally treat the LLM as a rubber duck. Often I reject its output. In other cases, I can accept it and refactor it into something even better. The name of the game is augmentation.


I sometimes get the idea from statements like this - and HN's focus on interviewing in general - that people are switching jobs a dozen times a year or something. How often are most people switching jobs? I've had 5 jobs in the last 20 years.


I'm old, and well-paid for my geographic region (but for various mostly stupid reasons utterly broke). No amount of value created (at least, for my skill level) will protect me from ageism and/or budget cuts.


This. I’ve been using elixir for ~6 months (guided by Claude) and probably couldn’t solve fizz buzz at a whiteboard without making a syntax error. Eek.


Who cares? If I'm hiring you to make a product, I care that the higher order logic is correct, that the requirements are all catered for, and that the code does reasonable things in all cases. Things I don't care about are FizzBuzz, programming on whiteboards, and not making syntax errors.


This is how companies fail. 5 years down the line no one is able to change anything in the system because it's so poorly architected (by being a bunch of Claude copypastes cobbled together) that it takes one month to do a one-day task (if it's even possible).


I guess we should change our hiring practices to optimize for FizzBuzz and getting all the syntax right first try.


I can see how you got that impression from my comment (if you ignore how I mentioned architecture), so let me elaborate:

It's the opposite. FizzBuzz and getting the syntax right are what LLMs are good at... but there's so much more nuance to being experienced with a language/framework/library/domain, which senior engineers understand and LLMs don't.

Being able to write Elixir assisted by an LLM does not mean you can produce proper architecture and abstractions even if the high level ideas are right. It's the tacit knowledge and second-order thinking that you should hire for.

But the thing is, if someone cannot write Elixir without syntax errors unless they're using an LLM, well, that's an extremely good proxy that they don't know the ins and outs of the language, ecosystem, best practices... years of tacit knowledge that LLMs fail to use because they're trained on a huge amount of tutorial and entry-level code riddled with the wrong abstractions.

The only code worse than code that doesn't work is code that kinda works until your requirements change ever so slightly. That's a liability, and you will pay it back with interest.

To give a concrete example: I am very experienced with React. Very. A lot. The code that LLMs write for it is horrid, bug-ridden, inflexible, and often misuses its footgun-y APIs like `useEffect` the way a junior fresh out of a boot camp would, directly contradicting the known best practices for maintainable (and often even just "correct") code. But yeah, it superficially solves the problem. Kinda. Good luck when the system needs to evolve, though. If it cannot produce proper code at under 500 lines, how do you expect it to deal with massive systems that need to scale to tens of KLOC across an ever-growing tangle of modules?
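To make that concrete, here is the shape of `useEffect` misuse I keep seeing, written by hand as an illustration rather than pasted from any actual LLM output (component and prop names are made up). The "bad" version stores derived data in state and syncs it with an effect, which means extra renders and extra state that can drift; the React docs' own recommendation is to just derive the value during render:

    import { useEffect, useState } from "react";

    // Generated-code style: redundant state plus an effect to keep it in sync.
    function FilteredListBad({ items, query }: { items: string[]; query: string }) {
      const [filtered, setFiltered] = useState<string[]>([]);
      useEffect(() => {
        setFiltered(items.filter((item) => item.includes(query)));
      }, [items, query]);
      return <ul>{filtered.map((item) => <li key={item}>{item}</li>)}</ul>;
    }

    // What the best practice actually looks like: derive the value during render.
    function FilteredListGood({ items, query }: { items: string[]; query: string }) {
      const filtered = items.filter((item) => item.includes(query));
      return <ul>{filtered.map((item) => <li key={item}>{item}</li>)}</ul>;
    }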

But management will be happy because the feature shipped and time to market was low... until you can no longer ship anything new and you go out of business.


Ah, sorry, I read your comment as disagreeing with me, now I see it's the opposite. Exactly, LLMs (for now) are good at writing low-level code, but we need someone to work on architecture.

I had an idea the other day of an LLM system that would start from a basic architecture of an app, and would zoom down and down on components until it wrote the entire codebase, module by module. I'll try that, it sounds promising.


You need to prep for job interviews anyway. I'd rather spend the majority of my time being productive.


Job interview? You might be surprised at the number of us who don’t code for a job.


I'd bet most people on this forum program professionally.


I would take that bet.


Me too.


Somebody tested people on Hacker News to evaluate programming competency.

This was part of a larger evaluation comparing the Hacker News population to people on Reddit programming subreddits.

Here is a very heated discussion of the result:

https://news.ycombinator.com/item?id=33293522

It appears that Hacker News is perhaps NOT populated by the programming elite. In contrast, there are real wizards on Reddit.

Surprising, I know.


Not surprising given how bad the takes here are and how many of the users here are dumb kids right out of college who are aspiring founders.


Unnecessarily negative. Maybe rethink it.


Not surprised there would be a "heated" discussion as a result of this one link, which measured only those who engaged with it, and how? I opened the link and hit Submit just to see what would happen… so by that metric, the percentage of HN users who are competent programmers is now even lower than before.


I've made the decision to embrace being bad at coding but getting a ton of work done using an LLM. If my future employer doesn't want massive productivity and would prefer someone who can leetcode really well, then I unironically respect that and that's OK.

I’m not doing ground breaking software stuff, it’s just web dev at non massive scales.


Your future employer might expect you to bring some value through your expertise that doesn't come from their LLM. If you want to insist on degrading your own employability like this, I guess it's your choice.


For the most part, businesses don't care how you deliver value, just that you do. If programmer A does a ticket in 3 days with an LLM, and programmer B takes a week to do the same ticket, but doesn't use an LLM, with programmer B choosing not to out of some notion of purity, who's more employable?


Productivity is not the only aspect of our profession that matters, and in fact it's probably not even the most important part. I'm not suggesting we get stuck or handcraft every aspect of our code, and there are multitudes of abstractions and tools that enhance productivity, including everything from frameworks to compilers.

What I'm saying is that what the original commenter is doing, having the LLM write all their code, will make them a less valuable employee in the long term. Participating in the act of programming makes you a better programmer. I'd rather have programmer B if they take the time to understand their code, so that when that code breaks at 4am and they get the call, they can actually fix it rather than be stuck in a hole they dug with LLMs and can't dig out of.


You don't need to call them at 4am; you can keep a git log of the prompts that were used to generate the code, and some professional 4am debugger can sit there and use an LLM to fix it.

Probably not a practical option yet, but if we're looking at the long term that is where we are heading. Or, realistically, the even longer term where the LLM self-heals broken systems.


While a git log of prompts seems like a novel idea to me, I don't believe it would work - not because of temperature and LLMs being non-deterministic and the context window overflowing, but because at a certain level of complexity LLMs simply fail, even though they are excellent at fixing simple bugs.


Lol, yeah the prompt is definitely going to help clarify what the code actually does.


See, if you work in AI, say, as an AI researcher, forbidding the use of AI models in the interview is basically not an option.

Also, often folks in this space are better at cheating than you will be at detecting them. Don't believe me? https://bigvu.tv/captions-video-maker/ai-eye-contact-fix


LLMs are certainly not useless.

But "lines of code written" is a hollow metric to prove utility. Code literacy is more effective than code illiteracy.

Lines of natural language vs. discrete code is a kind of preference. Code is exact, which makes it harder to recall and master, but it provides information density.

> by just knuckling down and learning how to do the work?

This is the key for me. What work? If it's the years of learning and practice toward proficiency to "know it when you see it" then I agree.


we're a post-literacy society now


> I've found that I haven't written a line of code in weeks

How are people doing this? I never end up using any of the code that gpt4o/copilot/sonnet spit out because it never meets my standards. How are other people accepting the shit it spits out?


You're listing plain models, so I'm assuming you're using them directly. Aider and similar agents use those models, but they don't stop at the first answer. You can add test running and a linter to the request, and it will essentially enter a loop like: what are the steps to solve (prompt)?; here's a map of the repository, which files do you need?; what's your proposed change?; here's the final change and the test run, do you think the problem has been solved?; (go back to the beginning if not).
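If it helps, here is a hand-rolled sketch of that loop in TypeScript. This is just the shape of the idea, not aider's or Plandex's actual code; `askModel` and `applyEdits` are placeholders you'd wire up to whatever model and edit format you use, and the real tools add more on top (repo maps, edit formats, auto-commits):

    import { execSync } from "node:child_process";

    // Placeholder: send a prompt to your LLM provider of choice and return its reply.
    async function askModel(prompt: string): Promise<string> {
      throw new Error("wire this up to an actual LLM API");
    }

    // Placeholder: apply the proposed edits to the working tree.
    function applyEdits(proposal: string): void {}

    async function codingLoop(task: string, repoMap: string, maxAttempts = 3) {
      let feedback = "";
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const files = await askModel(
          `Task: ${task}\nRepo map:\n${repoMap}\nWhich files do you need?`);
        const proposal = await askModel(
          `Task: ${task}\nFiles:\n${files}\n${feedback}\nPropose a concrete change.`);
        applyEdits(proposal);

        let testOutput: string;
        try {
          // Run the project's tests and linter; failures feed the next attempt.
          testOutput = execSync("npm test && npm run lint", { encoding: "utf8" });
        } catch (err: any) {
          testOutput = String(err.stdout ?? err);
        }

        const verdict = await askModel(
          `Change:\n${proposal}\nTest run:\n${testOutput}\nIs the problem solved? yes/no`);
        if (/^\s*yes/i.test(verdict)) return;
        feedback = `Previous attempt failed. Test output:\n${testOutput}`;
      }
    }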

See the video at https://plandex.ai/ to get an idea how it works.


That just sounds/looks like more work than just doing it normally? What am I missing?


Depends on the task, but if you're going high level enough, it's not more work. Think about it this way: if you're doing proper development you're going to write code, tests and commit messages anyway. Since you know what you want to achieve, write a really good commit message as the prompt, start writing tests and let the agent run in the meantime. Worst case, it doesn't work and you write the code yourself. Best case, it worked and you saved time.

(Not sure if that was clear but the steps/loop described before happens automatically, you're not babysitting it)


You put it behind an API call and run the loop automatically for every coding query


I'm using Cursor, and so far the "test run" part is manual; Cursor doesn't seem to care about testing or actually checking that the code it wrote works.

Any tips how I could integrate that? Do I need to switch to aider/plandex?


As someone who didn't study a STEM subject or CS in school, I've gone from zero to publishing a production, modern-looking app in a matter of a few weeks (link to it on my profile).

Sure, it's not the best (most maintainable, non-redundant styling) code that's powering the app but it's more than enough to put an MVP out to the world and see if there's value/interest in the product.


> HN, and the internet in general, have become just an ocean of reactionary sandbagging and blather about how "useless" LLMs are.

This is cult like behaviour that reminds me so much of the crypto space.

I don't understand why people are not allowed to be critical of a technology or not find it useful.

And if they are they are somehow ignorant, over-reacting or deficient in some way.


I think it's perfectly ok to be critical of technology as long as one is thoughtful rather than dismissive. There is a lot of hype right now and pushing back against it is the right thing to do.

I'm more reacting against simplistic and categorical pronouncements of straight-up "uselessness," which to me seem un-curious and deeply cynical, especially since they are evidently untrue in many domains (though true for some). I just find this kind of emotional cynicism (not a healthy skepticism, but cynicism) to be contrary to the spirit of innovation and openness, and indeed contrary to evidence. It's also an overgeneralization -- "I don't find it useful, so it's useless" -- rather than "Why don't I find it useful, and why do others? Let me learn more."

As future-looking HNers, I'd expect we would understand the world through a lens of "trajectories" rather than "current state". Just because LLMs hallucinate and make mistakes with a tone of confidence today -- a deep weakness -- doesn't mean they are altogether useless. We've witnessed that despite their weaknesses, we are getting a lot of value from them in many domains today and they are getting better over time.

Take neural networks themselves, for instance. For most of the 90s-2000s, people thought they were a dead end. My own professor had great vitriol for neural networks. Most of the initial promises in the 80s truly didn't pan out. It turns out what was missing was (lots of) data, which the Internet provided. And look where we are today.

Another area of cynicism is self-driving cars (Level 5). Lots of hype and overpromise, and lots of people saying it will never happen because it requires a cognitive model of the world, which is too complicated, and there are too many exceptional cases for there to ever be Level 5 autonomy. Possibly true, but I think "never" is a very strong sentiment that is unworthy of a curious person.


I generally agree, although an important aspect of thinking in terms of "trajectories" is recognizing when a particular trajectory might end up at a dead end. One perspective on the weaknesses of current LLMs is that it's just where the things are today and they can still provide value even while the technology improves. But another perspective is that the persistence of these weaknesses indicates something more fundamentally broken with the whole approach that means it's not really the path towards "real" AI, even if you can finesse it into doing useful things in certain applications.

There's also an important nuance differentiating rejection of a general technological endpoint (e.g. AGI or Level 5 self-driving cars) with a particular technological approach to achieving those goals (e.g. current LLM design or Tesla's autopilot). As you said, "never" is a long time and it takes a lot of unwarranted confidence to say we will never be able to achieve goals like AGI or Level 5 self-driving. But it seems a lot more reasonable to argue Tesla or OpenAI (and everyone else doing essentially the same thing as OpenAI) are fundamentally on the wrong track to achieving those goals without significantly changing their approach.

I agree that none of that really warrants dismissive cynicism of new technology, but being curious and future-looking also requires being willing to say when you think something is a bad approach even if it's not totally useless. Among other reasons, our ability to explore new technology is not limitless, and hype for a flawed technology isn't just annoying but may be sucking all the oxygen out of the room not leaving any for a potentially better alternative. Part of me wants to be optimistic about LLMs, but another part of me thinks about how much energy (human and compute) has gone into this thing that does not seem to be providing a corresponding amount of value.


I appreciate this thoughtful comment.

You are absolutely right that the trajectories, if taken linearly, might hit a dead end. I should clarify that when I mentioned "trajectories" I don't mean unpunctuated ones.

I am myself not convinced that LLMs -- despite their value to me today -- will eventually lead to AGI as a matter of course, nor the type of techniques used in autopilot will lead to L5 autonomy. And you're right that they are consuming a lot of our resources, which could well be better invested in a possibly better alternative.

I subscribe to Thomas Kuhn's [1] idea of scientific progress happening in "paradigms" rather than through a linear accumulation of knowledge. For instance, the path to LLMs itself was not linear, but through a series of new paradigms disrupting older ones. Early natural language processing was more rule-based (paradigm), then it became more statistical (paradigm), and then LLMs supplanted the old paradigms through transformers (paradigm) which made it scale to large swaths of data. I believe there is still significant runway left for LLMs, but I expect another paradigm must supplant it to get closer to AGI. (Yann Lecun said that he doesn't believe LLMs will lead to AGI).

Does that mean the current exuberant high investments in LLMs are misplaced? Possibly, but in Kuhn's philosophy, typically what happens is a paradigm will be milked for as much as it can be, until it reaches a crisis/anomaly when it doesn't work anymore, at which point another paradigm will supplant it.

At present, we are seeing how far we can push LLMs, and LLMs as they are have value even today, so it's not a bad approach per se even though it will hit its limits at some point. Perhaps what is more important are the second-order effects: the investments we are seeing in GPUs (essentially we are betting on linear algebra) might unlock the kind of commodity computational power the next paradigm needs to disrupt the current one. I see parallels between this and investments in NASA resulting in many technologies that we take for granted today, and military spend in California producing the technology base that enabled Silicon Valley today. Of course, these are just speculations and I have no more evidence that this is happening with LLMs than anyone else.

I appreciate your point however and it is always good to step back and ask, non-cynically, whether we are headed down a good path.

[1] https://en.wikipedia.org/wiki/The_Structure_of_Scientific_Re...


This entire comment can be summarised as: everyone who doesn't think like me is wrong.

Not everyone is interested in seeing the world through the hopes and dreams of e/acc types and would prefer to see it as it is today.

LLMs are a technology. Nothing more. It can be as amazing or useless as anyone likes.


And this comment can be summarized as "Nuh uh, I'm right". When summarizing longer bits of text down to a single sentence, nuance and meaning get lost, making the summarization ultimately useless and contributing nothing to the discussion.


Crypto and AI have similarities and differences.

The similarities include intense "true believer" pitches and governments taking them seriously.

The differences include that the most famous cryptocurrency can't function as a direct payment mechanism for even just the lunch purchases in just Berlin (IIRC it isn't enough for all interbank transactions either, so it can't even be a behind-the-scenes system by itself), while GenAI output keeps ending up in places people would rather not find it, like homework and that person on Twitter who's telling you Russia Did Nothing Wrong (and also giving you a nice cheesecake recipe, because they don't do any input sanitization).


Also, I'm deeply skeptical of crypto too due to its present scamminess, but I am keeping an open mind that there is a future in which crypto -- once it gets over this phase of get-rich-quick schemers -- will be seen as just another asset class.

I read somewhere that historically bonds in their early days were also associated with scamminess but today they're just a vanilla asset.


I'm honestly more optimistic about cryptocurrency as a mechanism of exchange rather than an asset. As a mechanism of exchange, cryptocurrency has some actually novel properties like distributed consensus that could be useful in certain cases. But an asset class which has zero backing value seems unworkable except for wild speculation and scams. Unfortunately the incentives around most cryptocurrencies (and maybe fundamental to cryptocurrency as an idea) greatly emphasize the asset aspects, and it's getting to be long enough since it became a thing that I'm starting to become skeptical cryptocurrency will be a real medium of exchange outside of illegal activities and maybe a few other niche cases.


bonds have utility, crypto does not


Just like with crypto and NFTs and the metaverse, they are always focused on what is supposedly coming down the pipe in the future and not what is actually possible today.


I use Sonnet 3.5, and while it's actually usable for codegen (compared to GPT/Copilot), it's still really not that great. It does well at tasks like "here's a stinky collection of tests that accrued over time - clean this up in the style of x", but actually writing code still shows a fundamental lack of understanding of the underlying API and problem (the most banal example being the constantly generated `x || Array.isArray(x)` check).


> I've found that I haven't written a line of code in weeks

Please post a video of your workflow.

It’s incredibly valuable for people to see this in action, otherwise they, quite legitimately, will simply think this is not true.


Who cares what they think? In fact, the fewer who use this, the better for the ones that do. It's not in my self-interest to convert anyone, and I obviously don't need to convince myself when I have the results right in front of me. Whether you believe it or not does not make me any less productive.


The obvious answer is you'll get called a liar and a shill.

I’m not saying you are; I think there are a lot of legitimate AI workflows people use.

…but, there are a lot of people trying to sell AI, and that makes them say things about it which are just flat out false.

/shrug

But you know; freedom of speech; you can say whatever you want if you don’t care what people think of you.

My take on it is showing people things (videos, blogs, repos, workbooks like Terence posted) moves the conversation from “I don’t believe you” to “let’s talk about the actual content”. Wow, what an interesting workflow, maybe I’ll try that…

If you don’t want to talk to people or have a discussion that extends beyond meaningless trivia like “does AI actually have any value” (obviously flame bait opinions only comment threads)… why are you even here?

If you don’t care, then fine. Maybe someone else will and they’ll post an interesting video.

Isn’t that the point of reading HN threads? What do you win by telling people not to post examples of their workflow?

It’s incredibly selfish.


> HN, and the internet in general, have become just an ocean of reactionary sandbagging and blather about how "useless" LLMs are.

Now imagine how profoundly depressing it is to visit a HN post like this one, and be immediately met with blatant tribalism like this at the very top.

Do you genuinely think that going on a performative tirade like this is what's going to spark a more nuanced conversation? Or would you rather just the common sentiment be the same as yours? How many rounds of intellectual dishonesty do we need to figure this out?


> Meanwhile, in the real world, I've found that I haven't written a line of code in weeks. Just paragraphs of text that specify what I want and then guidance through and around pitfalls in a simple iterative loop of useful working code.

could it be that you are mostly engaged in "boilerplate coding", where LLMs are indeed good?


People in general don't like change and naturally defend against it. And the older people get, the greater the percentage who fight against it. Being flexible and adaptable is a very useful and powerful skill. You've positioned yourself among the happy few.


How much do you typically pay for a month of tokens?


> Meanwhile, in the real world, I've found that I haven't written a line of code in weeks. Just paragraphs of text that specify what I want and then guidance through and around pitfalls in a simple iterative loop of useful working code.

Comment on first principles:

Following the dictum that you can't prove the absence of bugs, only their presence, the idea of what constitutes "working code" deserves much more respect.

From an engineering perspective, either you understand the implementation or you don't. There's no meaning to an iterative loop of producing "working code".

Stepwise refinement is a design process under the assumption that each step is understood in a process of exploration of the matching of a solution to a problem. The steps are the refinement of definition of a problem, to which is applied an understanding of how to compute a solution. The meaning of working code is in the appropriateness of the solution to the definition of the problem. Adjust either or both to unify and make sense of the matter.

The discipline of programming is rotting when the definition of "working" is copying code from an oracle and running it to see if it goes wrong.

The measure of "works" must be an engineering claim of understanding the chosen problem domain and solution. Understanding belongs to the engineer.

LLMs do not understand and cannot be relied upon to produce correct code.

If use of an LLM puts the engineer in contact with proven principles, materials and methods which he adapts to the job at hand, while the engineer maintains understanding of correctness, maybe that's a gain.

But if the engineer relies on the LLM transformer as an oracle, how does the engineer locate the needed understanding? He can't get it from the transformer: he's responsible for checking the output of the transformer!

OTOH if the engineer draws on understanding from elsewhere, what is the value of the transformer but as a catalog? As such, who has accountability for the contents of the catalog? It can't be the transformer because it can't understand. It can't be the developer of the transformer because he can't explain why the LLM produces any particular result! It has to be the user of the transformer.

So a system of production is being created whereby the engineer's going-in position is that he lacks the understanding needed to code a solution and he sees his work as integrating the output of an oracle that can't be relied upon.

The oracle is a peculiar kind of calculator with an unknown probability of generating relevant output that works at superhuman speeds, while the engineer is reduced to an operator in the position of verifying that output at human speeds.

This looks like a feedback system for risky results and a slippery slope towards heretofore unknown degrees of incorrectness and margins for error.

At the same time, the only common vernacular for tracking oracle veracity is in arcane version numbers, which are believed, based on rough experimentation, to broadly categorize the hallucinatory tendencies of the oracle.

The broad trend of adoption of this sketchy tech is in the context of industry which brags about seeking disruption and distortion, regards its engineers as cost centers to be exploited as "human resources", and is managed by a specialized class of idiot savants called MBAs.

Get this incredible technology into infrastructure and in control of life sustaining systems immediately!


What sort of code do you write this way?


Probably nothing a junior programmer wouldn't be able to do relatively easily.


Curious, why Aider? Why not Cursor?


writing code is the easy part, designing is hard and not LLMable


Given how hard we thought programming was a year or two ago, I wouldn't bank my future on design being too hard for an LLM. They're already quite good at helping writing design docs.


Lol nope. When I'm trying to get it to make something big/complicated, I start by telling it it's a software project manager and have it help me build a spec sheet for the design. Then I hand that off to an "architect" to flesh out the languages, libraries, files needed, etc. Then from that list you can have it work on individual files and functions.
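For anyone curious what that chaining looks like wired up, here is a minimal sketch using the OpenAI Node SDK as one possible backend. The model name, role prompts and function names are my own placeholders, not my actual setup:

    import OpenAI from "openai";

    const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

    async function ask(system: string, user: string): Promise<string> {
      const res = await client.chat.completions.create({
        model: "gpt-4o",
        messages: [
          { role: "system", content: system },
          { role: "user", content: user },
        ],
      });
      return res.choices[0].message.content ?? "";
    }

    async function planProject(idea: string) {
      // Step 1: the "project manager" turns the idea into a spec sheet.
      const spec = await ask(
        "You are a software project manager.",
        `Turn this idea into a concise spec sheet:\n${idea}`);

      // Step 2: the "architect" fleshes out languages, libraries and files.
      const plan = await ask(
        "You are a software architect.",
        `Given this spec, list the languages, libraries and files needed:\n${spec}`);

      // Step 3: work through the plan file by file (loop over the plan with the same ask()).
      return { spec, plan };
    }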



