
> Much Much more productive world by just knuckling down and learning how to do the work.

The fact is that everyone who says they've become more productive with LLMs won't say exactly how. I can talk about how Vim has made editing code more enjoyable (keybindings and motions), how Emacs is a good environment for text tooling (a Lisp machine), and how I use technical books to further my learning (so many great books out there). But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them. It's all claims that it's great, with no further elaboration on the workflows.

> I haven't written a line of code in weeks. Just paragraphs of text that specify what I want and then guidance through and around pitfalls in a simple iterative loop of useful working code.

Code is intent described in terms of machine actions. Those actions can be masked by abstracting them into more understandable units, so we don't have to write opcodes; we can use Python instead. Programming is basically making the intent clear enough that we know which units we can use. Software engineering is mostly selecting the units so that minimal work is needed once the intent changes or the foundational actions do.

Chatting with an LLM looks to me like your intent is either vague or you don't know the units to use. If it's the former, then I guess you're assuming it is the expert and will guide you to the solution you seek, which means you believe it understands the problem better than you do. The second is stranger, as it looks like playing around with car parts while ignoring the manuals they come with.

What about boilerplate and common scenarios? I agree that LLMs help a great deal with that, but the fact is that there were already perfectly good tools for it, like snippets, templates, and code generators.




Ever seen someone try to search for something on Google and they're just AWFUL at it? They can never find what they're looking for, and then you try and can pull it up in a single search? That's what it's like watching some people try to use LLMs. Learning how to prompt an LLM is as much a learned skill as learning how to phrase internet searches is. And as much as people decried that "searching Google isn't a real skill", tech-savvy people knew better.

Same thing, except now it's also many tech-savvy people joining the tech-unsavvy in saying that prompting isn't a real skill... but people who know better know that it is.

On average, people are awfully bad at describing exactly what it is they want. Ever speak with a client? And you have to go back and forth for a few hours to finally figure out what it is they wanted? In that scenario you're the LLM. Except the LLM won't keep asking probing questions and clarifications; it will simply give them what they originally asked for (which isn't what they want). Then they think the LLM is stupid and stop trying to ask it for things.

Utilizing an LLM to its full potential is a lot of iterative work and, at least for the time being, requires some understanding of how it works under the hood (e.g. would you get better results by starting a new session or by asking it to forget previous, poorly worded instructions?).


I'm not arguing that you can't get results with LLMs; I'm just asking whether it's worth the actual effort, especially when there's a better way to get the result you're seeking (or whether the result is really something you want).

An LLM is a word (token?) generator which can be amazingly consistent according to its model. But rarely is my end goal to generate text. It's either to do something, to understand something, or to communicate. For the first, there are guides (books, manuals, ...); for the second, there are explanations (again, books, manuals, ...); and the third is just using language to communicate what's on my mind.

It's the same thing with search engines. I use them to look for something. What I need first is a description of that something, not how to do the "looking for". Once you know what you want to find, it's easier to use the tool to find it.

If your end goal can be achieved with LLMs, be my guest and use them. But I'm wary of people taking them at face value and then pushing the workload onto everyone else (like developers using Electron).


It's hard to quantify how much time learning how to search saves, because the difference can range from infinite (finding the result vs not finding it at all) to basically nothing (1st result vs 2nd result). I think many people agree it is worth learning how to "properly search" though. You spend much less time searching and you get the results you're looking for much more often. This applies outside of just Google search: learning how to find and look up information is a useful skill in and of itself.

ChatGPT has helped me write some scripts for things that otherwise probably would have taken me at least 30+ minutes; it wrote them in <10 seconds and they worked flawlessly. I've also had times where I worked with it for 45 minutes and only ever got error-ridden code, where I had to fix the obvious errors and rewrite parts of it to get it working. Sometimes during this process it actually taught me a new approach to doing something. If I had started coding it from scratch by myself, it probably would have taken me only ~10 minutes. But if I were better at prompting, what if that 45 minutes was <10 minutes? It would go from a time loss to a time save and be worth using. So improving my ability to prompt is worthwhile as long as doing so trends towards me spending less time prompting.

Which is thankfully pretty easy to track and test. On average, as I get better at prompting, do I need to spend more or less time prompting to get the results I am looking for? The answer to that is largely that I spend less time and get better results. The models constantly changing and improving over time can make this messy - is it the model getting better or is it my prompting? But I don't think models change significantly enough to rule out that I spend less time prompting than I have in the past.


> how much time learning how to search saves

>>> you do need to break down the problem into smaller chunks so GPT can reason in steps

To search well, you need good intuition for how to select the right search terms.

To LLM well, you can ask the LLM to break the problem into smaller chunks, and then have the LLM solve each chunk, and then have the LLM check its work for errors and inconsistencies.

And then you can have the LLM write you a program to orchestrate all of those steps.
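
To make that concrete, here is a minimal sketch of such a loop; it is not anyone's actual workflow, and it assumes the OpenAI Python client (any chat API would do) and a model name you may need to change:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def solve(problem: str) -> list[str]:
        # 1. Have the LLM break the problem into chunks.
        plan = ask(f"Break this problem into small, numbered steps:\n{problem}")
        steps = [s for s in plan.splitlines() if s.strip()]

        solutions = []
        for step in steps:
            # 2. Have the LLM solve each chunk.
            draft = ask(f"Solve this step, showing only the result:\n{step}")
            # 3. Have the LLM check its own work.
            review = ask(f"Check this for errors or inconsistencies:\n{draft}")
            solutions.append(f"{draft}\n\nReview: {review}")
        return solutions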


Yes you can. What was the name of the agent that was going to replace all developers? Devin or something? It was shown that it took more time to iterate over a problem and created terrible solutions.

LLMs are in the evolutionary phase, IMHO. I doubt we're going to see revolutionary improvements from GPTs. So I say time and time again: the technology is here, show it doing all the marvelous things today. (btw, this is not directed at your comment in particular and I digressed a bit, sorry).


> asking whether it's worth the actual effort

If prompting ability varies, then this is not some objective question; it depends on the person.

For me I've found more or less every interaction with an LLM to be useful. The only reason I'm not using it continually for 8 hours a day is because my brain is not able to usefully manage that torrent of new information and I need downtime.


It works quite nicely if you consider LLMs as translators (and that’s actually why Transformers were created).

Enter technical specifications in English as input language, get code as destination language.


English as an input language works in simple scenarios but breaks down very quickly. I have to get extremely specific and deliberate. At some point I have to write pseudocode to get the machine to get, say, double-checked locking right. Because I have had enough experiences where varying the prompt didn't work, I revert to just writing the code when I see the generator struggling.
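
For readers who haven't met it, double-checked locking is exactly the kind of detail where plain English prompts get fuzzy. A minimal Python sketch of the pattern (the comment above doesn't say which language was being targeted, so this is only illustrative):

    import threading

    class Settings:
        _instance = None
        _lock = threading.Lock()

        @classmethod
        def get(cls) -> "Settings":
            # Fast path: skip the lock once the instance exists.
            if cls._instance is None:
                with cls._lock:
                    # Second check inside the lock: another thread may have
                    # created the instance while we were waiting to acquire it.
                    if cls._instance is None:
                        cls._instance = cls()
            return cls._instance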

When I encounter somebody who says they do not write code anymore, I assume that they either:

1. Just don't do anything beyond the simplest tutorial-level stuff

2. or don't consider their post-generation edits as writing code

3. or are just bullshitting

I don't know which it is for each person in question, but I don't trust that their story would work for me. I don't believe they have some secret-sauce prompting that works for scenarios where I've tried to make it work but couldn't. Sure, I may have missed some approaches, and my map of what works and what doesn't may be very blurry at the border, but the surprises tend to be on the "doesn't work" side. And no, Claude doesn't change this.


I definitely still write code. But I also prefer to break down problems into chunks which are small enough that an LLM could probably do them natively, if you can convince it to use the real API instead of inventing a new API each time — concrete example from ChatGPT-3.5: I tried getting it to make and then use a Vector2D class — in one place it had sub(), mul(), etc., in the other place it had subtract(), multiply(), etc.
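
A hypothetical reconstruction of that kind of mismatch (not the actual ChatGPT transcript): the class it defines in one place versus the calls it generates in another.

    class Vector2D:
        def __init__(self, x: float, y: float):
            self.x, self.y = x, y

        def sub(self, other: "Vector2D") -> "Vector2D":
            return Vector2D(self.x - other.x, self.y - other.y)

        def mul(self, k: float) -> "Vector2D":
            return Vector2D(self.x * k, self.y * k)

    pos, target = Vector2D(0, 0), Vector2D(3, 4)
    # Elsewhere the generated code assumes methods that were never defined:
    # direction = target.subtract(pos).multiply(0.5)   # AttributeError
    direction = target.sub(pos).mul(0.5)               # what actually exists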

It can write unit tests, but it makes similar mistakes there, so I have to rewrite them… it nevertheless still makes writing those tests easier.

It writes good first-drafts for documentation, too. I have to change it, delete some stuff that's excess verbiage, but it's better than the default of "nobody has time for documentation".


Exactly! What is this job that you can get where you don't code and just copy-paste from ChatGPT? I want it!

My experience is just as you describe it: if I ask a question whose answer is on Stack Overflow or fucking geeks4geeks, then it produces a good answer. Anything more is an exercise in frustration as it tries to sneak nonsense code past me with the same confident spiel with which it produces correct code.


It's absolutely a translator, but they're similarly good/bad/weird/hallucinatory at natural-language translations, too.

Consider this round-trip in Google Translate:

"དེ་ནི་སྐད་སྒྱུར་པ་ཞིག་ཡིན། འོན་ཀྱང་ཁོང་ཚོ་རང་བྱུང་སྐད་སྒྱུར་གྱི་སྐད་སྒྱུར་ནང་ལ་ཡག་པོ/ངན་པ/ཁྱད་མཚར་པོ/མགོ་སྐོར་གཏོང་བ་འདྲ་པོ་ཡོད།"

"It's a translator. But they seem to be good/bad/weird/delusional in natural translations. I have a"

(Google Translate stopped suddenly there.)

I've tried using ChatGPT to translate two Wikipedia pages from German to English, as it can keep citations and formatting correct when it does so; it was fine for the first 2/3rds, then it made up mostly-plausible statements that were not translated from the original for the rest. (Which I spotted and fixed before saving, because I was expecting some failure).

Don't get me wrong, I find them impressive, but I think the problem here is the Peter Principle: the models are often being promoted beyond their competence. People listen to that promotion and expect them to do far more than they actually can, and are therefore naturally disappointed by the reality.

People like me who remember being thrilled to receive a text adventure cassette tape for the Commodore 64 as a birthday or Christmas gift when we were kids…

…compared to that, even the Davinci model (that really was autocomplete) was borderline miraculous, and ChatGPT-3.5 was basically the TNG-era Star Trek computer.

But anyone who reads me saying that last part without considering my context, will likely imagine I mean more capabilities than I actually mean.


> On average, people are awfully bad at describing exactly what it is they want. Ever speak with a client? And you have to go back and forth for a few hours to finally figure out what it is they wanted?

With one of them, it was like that for the entire duration of my time working for them.

They didn't understand why it was taking so long despite constantly changing what they asked for.


Building the software is usually like 10% of the actual job; we could do a better job of teaching that.

The other 90% is mostly mushy human stuff: fleshing out the problem, setting expectations, etc. Helping a group of people reach a solution everyone is happy with has little to do with technology.


Mostly agree. Until ChatGPT, I'd have agreed with all of that.

> Helping a group of people reach a solution everyone is happy with has little to do with technology.

This one specific thing, is actually something that ChatGPT can help with.

It's not as good as the best human, or even a middling human with 5 years' business experience, but rather it's useful because it's good enough at so many different domains that it can be used to clarify thoughts and explain the boundaries of the possible — Google Translate for business jargon, though like Google Translate it is also still often wrong — the ultimate "jack of all trades, master of none".


We're currently in the shiny-toy stage; once the flaws are thoroughly explored and accepted by all as fundamental, I suspect interest will fade rapidly.

There's no substance to be found, no added information; it's just repeating what came before, badly, which is exactly the kind of software that would be better off not written if you ask me.

The plan to rebuild society on top of this crap is right up there with basing our economy on manipulating people into buying shit they don't need and won't last so they have to keep buying more. Because money.


The worry I have is that the net value will become great enough that we’ll simply ignore the flaws, and probabilistic good-enough tools will become the new normal. Consider how many ads the average person wades through to scroll an Insta feed for hours - “we’ve” accepted a degraded experience in order to access some new technology that benefits us in some way. To paraphrase comedian Mark Normand: “Capitalism!”


Scary thought, difficult to unthink.

I'm afraid you might be right.

We've accepted a lot of crap lately just to get what we think we want, convenience is a killer.


Indeed, even if I were to minimise what LLMs can do, they are still achieving what "targeted advertising" very obviously isn't.


They're both short sighted attempts at extracting profit while ignoring all negative consequences.


To an extent I agree; I think that's true for all tech since the plough, fire, and axles.

But I would otherwise say that most (though not all*) AI researchers seem to be deeply concerned about the set of all potential negative consequences, including mutually incompatible outcomes where we don't know which one we're even heading towards yet.

* And not just Yann LeCun — though, given his position, it would still be pretty bad even if it was just him dismissing the possibility of anything going wrong


> That's what it's like watching some people try to use LLMs.

Exactly. I made a game testing prompting skills a few days ago, to share with some close friends, and it was your comment that inspired me to translate the game into English and submit it to HN. ( https://news.ycombinator.com/item?id=41545541 )

I am really curious about how other people write prompts, so while my submission only got 7 points, I'm happy that I can see hundreds of people's own ways to write prompts thanks to HN.

However, after reading most of the prompts (I may have missed some), I found exactly 0 prompts containing any kind of common prompting technique, such as "think step by step", explaining the specific steps to solve the problem instead of only asking for the final result, or few-shot prompting (showing example inputs and outputs). Half of the prompts simply ask the AI to do the thing (at least asking correctly). The other half do not make sense; even if we showed the prompt to a real human, they wouldn't know what to reply with.

Well... I expected that SOME complaints about AI online are from people not familiar with prompting / not good at prompting. But now I realize there are a lot more people than I thought who don't know some basic prompting techniques.
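
For illustration, a made-up example of the techniques mentioned above (explicit steps, "think step by step", and few-shot input/output pairs); the task is hypothetical and not one of the game's prompts:

    prompt = """You convert dates to ISO 8601.

    Examples:
    Input: March 5th, 2021  -> 2021-03-05
    Input: 12 Jan 1999      -> 1999-01-12

    Think step by step: identify the day, the month, and the year,
    then output only the date in YYYY-MM-DD format.

    Input: 4th of July, 1776 ->"""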

Anyway, a fun experience for me! Since it was your comment made me want to do this, I just want to share it with you.


Could you reference any youtube videos, blog posts, etc of people you would personally consider to be _really good_ at prompting? Curious what this looks like.

While I can compare good journalists to extremely great and intuitive journalists, I don't have really any references for this in the prompting realm (except for when the Dall-e Cookbook was circulating around).


Sorry for the late response - but I can't. I don't really follow content creators at a level where I can recall names or even what they are working on. If you browse AI-dominated spaces you'll eventually find people who include AI as part of their workflows and have gotten quite proficient at prompting them to get the results they desire very consistently. Most AI stuff enters into my realm of knowledge via AI Twitter, /r/singularity, /r/stablediffusion, and Github's trending tab. I don't go particularly out of my way to find it otherwise.

/r/stablediffusion used to (less so now) have a lot of workflow posts where people would share how they prompt and adjust the knobs/dials of certain settings and models to make what they make. It's not so different from knowing which knobs/dials to adjust in Apophysis to create interesting fractals and renders. They know what the knobs/dials adjust for their AI tools and so are quite proficient at creating amazing things using them.

People who write "jailbreak" prompts are a kind of example. There is some effort put into preventing people from prompting the models and removing the safeguards - and yet there are always people capable of prompting the model into removing its safeguards. It can be surprisingly difficult to do yourself for recent models and the jailbreak prompts themselves are becoming more complex each time.

For art in particular - knowing a wide range of artist names, names of various styles, how certain mediums will look, as well as mix & matching with various weights for the tokens can get you very interesting results. A site like https://generrated.com/ can be good for that as it gives you a quick baseline of how including certain names will change the style of what you generate. If you're trying to hit a certain aesthetic style it can really help. But even that is a tiny drop in a tiny bucket of what is possible. Sometimes it is less about writing an overly detailed prompt but rather knowing the exact keywords to get the style you're aiming for. Being knowledgeable about art history and famous artists throughout the years will help tremendously over someone with little knowledge. If you can't tell a Picasso from a Monet painting you're going to find generating paintings in a specific style much harder than an art buff.


> But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them. It's all claims that it's great, with no further elaboration on the workflows.

To give an example, one person (a researcher at DeepMind) recently wrote about specific instances of his uses of LLMs, with anecdotes about alternatives to each example. [1] People on HN had different responses with similar claims with elaborations on how it has changed some of their workflows. [2]

While it would be interesting to see randomized controlled trials on LLM usage, hearing people's anecdotes brings to mind the (often misquoted) phrase: "The plural of anecdote is data". [3] [4]

[1] https://nicholas.carlini.com/writing/2024/how-i-use-ai.html

[2] https://news.ycombinator.com/item?id=41150317

[3] http://blog.danwin.com/don-t-forget-the-plural-of-anecdote-i...

[4] originally misquoted as "Anecdote is the plural of data."


> (often misquoted) phrase

You misquoted it there! It should be: The plural of anecdote is data.


Thank you! Another instance of a variant of Muphry's Law.

https://en.wikipedia.org/wiki/Muphry's_law


It's actually "the plural of 'anecdote' is not 'data'".


Apparently what you've said is the most common misquotation. See [3] above.


Oh interesting, thanks! I much prefer that formulation.


In the CUDA example [1] from Carlini's "How I Use AI", I would guess that o1 would need less handholding to do what he wanted.

[1] https://chatgpt.com/share/1ead532d-3bd5-47c2-897c-2d77a38964...


Or people say "I've been pumping out thousands of lines of perfectly good code by writing paragraphs and paragraphs of text explaining what I want!" It's like, what are you programming, dog? And they will never tell you, and then you look at their GitHub and it's a dead simple starter project.

I recently built a Brainfuck compiler and TUI debugger, and I tested out a few LLMs just to see if I could get some useful output regarding a few niche and complicated issues, and they just gave me garbage that looked mildly correct. Then I'm told it's because I'm not prompting hard enough... I'd rather just learn how to do it at that point. Once I solve that problem, I can solve it again in the future in 0.25x the time.


Here's the thing. 99% of people aren't writing compilers or debuggers, they're writing glorified CRUDs. LLMs can save a lot of time for these people, just like 99% of people only use basic arithmetic operations, and MS Excel saves a lot of time for them. It's not about solving new problems, it's about solving old and known problems very fast.


> "99% of people aren't writing compilers or debuggers"

Look, I get the hype - but I think you need to step outside a bit before saying that 99% of the software out there is glorified CRUDs...

Think about the aerospace/defense industries, autonomous vehicles, cloud computing, robotics, sophisticated mobile applications, productivity suites, UX, gaming and entertainment, banking and payment solutions, etc. Those are not small industries - and the software being built there is often highly domain-specific, has various scaling challenges, and takes years to build and qualify for "production".

Even a simple "glorified CRUD", at a certain point, will require optimizations, monitoring, logging, debugging, refactoring, security upgrades, maintenance, etc...

There's much more to tech than your weekend project "Facebook but for dogs" success story, which you built with ChatGPT in 5 minutes...


This is almost entirely written by LLMs:

https://github.com/williamcotton/guish

I was the driver. I told it to parse and operate on the AST, to use a plugin pattern to reduce coupling, etc. The machine did the tippy-taps for me and at a much faster rate than I could ever dream of typing!

It’s all in a Claude Project and can easily and reliably create new modules for bash commands because it has the full scope of the system in context and a ginormous amount of bash commands and TypeScript in the training corpus.
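
Not the actual guish code (guish is TypeScript), but a minimal Python sketch of the plugin pattern being described: a single registry, with one decoupled handler per bash command.

    from typing import Callable, Dict, List

    PLUGINS: Dict[str, Callable[[List[str]], str]] = {}

    def plugin(command: str):
        """Register a handler for one bash command without coupling modules together."""
        def register(handler: Callable[[List[str]], str]):
            PLUGINS[command] = handler
            return handler
        return register

    @plugin("grep")
    def describe_grep(args: List[str]) -> str:
        return f"keep lines matching {args[-1]!r}" if args else "keep matching lines"

    @plugin("wc")
    def describe_wc(args: List[str]) -> str:
        return "count lines, words and bytes"

    def describe_pipeline(pipeline: List[List[str]]) -> str:
        # e.g. [["grep", "error"], ["wc", "-l"]] -> one description per stage
        return " | ".join(PLUGINS[cmd](args) for cmd, *args in pipeline)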


One good use case is unit tests, since they can be trivial while at the same time being cumbersome to make. I could give the LLM code for React components, and it would make the tests and set up all the mocks, which is the most annoying part. Although making "all the tests" will typically involve asking the LLM again to think of more edge cases and be sure to cover everything.


> I recently built a Brainfuck compiler and TUI debugger

Highly representative of what devs make all day indeed


Yea, obviously not, but the smaller problems this bigger project was composed of were things that you could see anywhere. I made heavy use of string manipulation that could be generally applied to basically anything.


Really? Come on. You think trying to make it solve "niche and complicated issues" for a Brainfuck compiler is reasonable? I can't take this seriously. Do you know what most developer jobs entail?

I never need to type paragraphs to get the output I want. I don't even bother with correct grammar or spelling. If I need code for some CRUD web app, who is going to type it faster, me or the LLM? This is really not hard to understand.


For many of us programming is a means to an end. I couldn't care less about compilers.


Specifically within the last week, I have used Claude and Claude via cursor to:

- write some moderately complex powershell to perform a one-off process

- add typescript annotations to a random file in my org's codebase

- land a minor feature quickly in another codebase

- suggest libraries and write sample(ish) code to see what their rough use would look like to help choose between them for a future feature design

- provide text to fill out an extensive sales RFT spreadsheet based on notes and some RAG

- generate some very domain-specific, realistic-sounding test data (just naming)

- scaffold out some PowerPoint slides for a training session

There are likely others (LLMs have helped with research and in my personal life too)

All of these are things that I could do (and probably do better) but I have a young baby at the moment and the situation means that my focus windows are small and I'm time poor. With this workflow I'm achieving more than I was when I had fully uninterrupted time.


> But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them.

I'm an iOS dev, my knowledge of JS and CSS is circa 2004. I've used ChatGPT to convert some of my circa 2009 Java games into browser games.

> Chatting with an LLM looks to me like your intent is either vague or you don't know the units to use

Or that you're moving up the management track.

Managers don't write code either. Some prefer it that way.


I have used ChatGPT to write test systems for our (physical) products. I have a pretty decent understanding of how code/programs work structurally; I just don't know the syntax/language (Python in this case).

So I can translate things like

"Create an array, then query this instrument for xyz measurements, then store those measurements in the array. Then store that array in the .csv file we created before"

It works fantastic and saved us from outsourcing.
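
A minimal sketch of the kind of Python that sort of request produces; the pyvisa instrument address and the SCPI query below are placeholders, not the actual setup described above:

    import csv
    import pyvisa

    rm = pyvisa.ResourceManager()
    instrument = rm.open_resource("USB0::0x1234::0x5678::INSTR")  # hypothetical address

    measurements = []
    for _ in range(10):  # ten readings, for example
        measurements.append(float(instrument.query("MEAS:VOLT?")))  # placeholder query

    with open("results.csv", "a", newline="") as f:  # the .csv created earlier
        csv.writer(f).writerow(measurements)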


The key difference is that this is a multidisciplinary conversational interface, and a tool in itself for interrelating structured meaning and reshaping it coherently enough so that it can be of great value both in the specific domain of the dialog, and in the potential to take it on any tangent in any way that can be expressed.

Of course it has limitations and you can't be asleep at the wheel, but that's true of any tool or task.


For one, I spend less time on Stackoverflow. LLMs can usually give you the answer to little questions about programming or command-line utilities right away.


I think people who are successfully using it to write code are just chaining APIs together to make the same web apps you see everywhere.


The vast majority of software is "just chaining APIs together". It makes sense that LLMs would excel at code they've been trained on the most, which means they can be useful to a lot of people. This also means that these people will be the first to be made redundant by LLMs, once the quality improves enough.


I would say all software is chaining APIs together.


Well, that depends on how you look at it.

All software calls APIs, but some rely on literally "just chaining" these calls together more than writing custom behavior from scratch. After all, someone needs to write the APIs to begin with. That's not to say that these projects aren't useful or valuable, but there's a clear difference in the skill required for either.

You could argue that it's all APIs down to the hardware level, but that's not a helpful perspective in this discussion.


| You could argue that it's all APIs down to the hardware level, but that's not a helpful perspective in this discussion.

Yes, that's what I'm arguing. Why isn't it useful? I think it's useful, because it demystifies things. You know that in order to do something, you need to know how to use the particular API.


Here's one from simonw

https://gist.github.com/simonw/97e29b86540fcc627da4984daf5b7...

There are more to be found on his blog on the ai-assisted-programming tag. https://simonwillison.net/tags/ai-assisted-programming/


> The fact is that everyone who says they've become more productive with LLMs won't say exactly how. But no one really shows how they're actually solving problems with LLMs and how the alternatives were worse for them.

A pretty literal response: https://www.youtube.com/@TheRevAlokSingh/streams

Plenty of Lean 4 and Cursor.


> The fact is that everyone who says they've become more productive with LLMs won't say exactly how.

I have Python scripts which do a lot of automation, like downloading PDFs, bookmarking PDFs, processing them, etc. Thanks to LLMs I don't write the Python code myself; I just ask an LLM to write it and provide only the requirements. I copy the code generated by the AI model and run it. If there are any errors, I just ask the AI to fix them.
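
A minimal sketch of that kind of script, assuming requests and pypdf; the URL and bookmark title are placeholders, not the poster's actual pipeline:

    import requests
    from pypdf import PdfWriter

    url = "https://example.com/report.pdf"  # placeholder
    with open("report.pdf", "wb") as f:
        f.write(requests.get(url, timeout=30).content)

    writer = PdfWriter(clone_from="report.pdf")          # copy all pages
    writer.add_outline_item("Chapter 1", page_number=0)  # add a bookmark on page 1
    with open("report_bookmarked.pdf", "wb") as f:
        writer.write(f)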


> The fact is that everyone who says they've become more productive with LLMs won't say exactly how.

Anecdotally, I no longer use Stack Overflow. I don't have to deal with random downvotes and feeling stupid because some expert with a 10k+ score on each of 15 SE sites votes for my question to be closed. I'm pretty tech savvy, been doing development for 15 years, but I'm always learning new things.

I can describe a rough idea of what I want to an LLM and get just enough code for me to hit the ground running… or I can ask a question in a forum, twiddle my thumbs, and look through 50 tabs hoping to stumble upon a solution in the meantime.

I'm productive af now. I was paying for ChatGPT, but Claude has been my go-to for the past few months.


You clearly have made up your mind that it can't be right but to me it's like arguing against breathing. There are no uncertainties or misunderstandings here. The productivity gains are real and the code produced is more robust. Not in theory, but in practice. This is a fact for me and you trying to convince me otherwise is just silly when I have the result right in front of me. It's also not just boilerplate. It's all code.


>There are no uncertainties or misunderstandings here. The productivity gains are real and the code produced is more robust. Not in theory, but in practice.

So, that may be a fact for you, but there are mixed results when you look more broadly. For example, [1] has this little nugget:

>The study identifies a disconnect between the high expectations of managers and the actual experiences of employees using AI.

>Despite 96% of C-suite executives expecting AI to boost productivity, the study reveals that, 77% of employees using AI say it has added to their workload and created challenges in achieving the expected productivity gains. Not only is AI increasing the workloads of full-time employees, it’s hampering productivity and contributing to employee burnout.

So not everyone is feeling the jump in productivity the same way. On this very site, there are people claiming they are blasting out highly-complex applications faster than they ever could, some of them also claiming they don't even have any experience programming. Then others claiming that LLMs and AI copilots just slow them down and cause much more trouble than they are worth.

It seems that, just as with programming itself, different people are getting different results.

[1] https://www.forbes.com/sites/bryanrobinson/2024/07/23/employ...


Just be mindful that it is in one's interest to push the "LLMs suck, don't waste your time with them" narrative once they figure out how to harness LLMs.

"Jason is a strong coder, and he despises AI tools!"



