
Claude 3.5 Sonnet is the first model that made me realize that the era of AI-aided programming is here. Its ability to generate and modify large amounts of correct code - across multiple files/modules - in one response beats anything I've tried before. Integrating that with specialized editors (like https://www.cursor.com) is an early vision of the future of software development.



I've really struggled every time I've pulled out any LLM for programming besides using Copilot for generating tests.

Maybe I've been using it for the wrong things—it certainly never helps unblock me when I'm stuck like it sounds like it does for some (I suspect it's because when I get stuck it's deep in undocumented rabbit holes), but it sounds like it might be decent at large-scale rote refactoring? Aside from specialized editors, how do people use it for things like that?


At least from my experience:

You take Claude, you create a new Project, in your Project you explain the context of what you are doing and what you are programming (you have to explain it only once!).

If you have specific technical documentation (e.g. rare programming language, your own framework, etc), you can put it there in the project.

Then you create a conversation, copy-paste the source code of your file, and ask for the refactoring or improvement you want.

If you are lazy, just say: "give me the full code"

and then

"continue the code" a few times in a row

and you're done :)
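If you'd rather script that loop than use the web UI, here's a rough sketch of the same idea with the Anthropic Python SDK. The model ID, the project description, and the helper are placeholders I made up; the Projects feature itself lives in the web app, this just mimics the "explain the context once" part with a reusable system prompt.

    from anthropic import Anthropic

    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # Written once and reused for every request; this plays the role of the Project context.
    PROJECT_CONTEXT = (
        "We are building an internal Python web service. "
        "Use type hints everywhere and keep the existing module layout."
    )

    def ask_for_refactor(source_code: str, request: str) -> str:
        """Paste a file plus a request, get the model's full answer back."""
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",  # placeholder model ID
            max_tokens=4096,
            system=PROJECT_CONTEXT,
            messages=[{
                "role": "user",
                "content": f"{request}\n\nGive me the full code.\n\n{source_code}",
            }],
        )
        # The reply is a list of content blocks; join the text ones.
        return "".join(block.text for block in response.content if block.type == "text")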


> in your Project you explain the context of what you are doing and what you are programming (you have to explain it only once!).

When you say this, you mean typing out some text somewhere? Where do you do this? In a giant comment? In which file?


In "Projects" -> "Create new project" -> "What are you trying to achieve?"


Provide context to the model: the code you're working on, what it's for, where you're stuck, what you've tried, etc. Pretend it's a colleague that should help you out and onboard it to your problem, then have a conversation with it as if you were rubber ducking with that colleague.

Don't ask short one-off questions and expect it to work (it might, depending on what you ask, but probably not if you're deep in some proprietary code base with no traces in the LLM's pretraining data).
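To make that concrete, here's a completely made-up example of what such an opening message might look like (the service and the bug are invented; it's just the shape that matters):

    # A made-up opening message that follows the advice above, instead of a
    # one-off "why doesn't this work?" question.
    opening_message = """\
    Context: an internal Flask service that syncs orders into our warehouse system.
    Goal: move the retry logic out of the view functions into one place.
    Where I'm stuck: retries work, but duplicate orders show up under load.
    What I've tried: an idempotency key per request, but it isn't checked on the worker side yet.

    Here is the relevant code:
    <paste the module here>
    """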


I've definitely tried that and it doesn't work for the problems I've tried. Claude's answers for me always have all the hallmarks of an LLM response: extremely confident, filled with misunderstandings of even widely used APIs, and often requiring active correction on so many details that I'm not convinced it wouldn't have been faster to just search for a solution by hand. It feels like pair programming with a junior engineer, but without the benefit of helping train someone.

I'm trying to figure out if I'm using it wrong or using it on the wrong types of problems. How do people with 10+ years of experience use it effectively?


I'm sure I'm going to offend a bunch of people with this, but my experience has been similar to yours, and it reminds me of something "Uncle" Bob Martin once mentioned: the number of software developers is roughly doubling every two years, which means that at any given time half of the developer population has less than two years' experience.

If you're an experienced dev, having a peer that enthusiastically suggests a bunch of plausible but subtly wrong things probably net-net slows you down and annoys you. If you're more junior, it's more like being shown a world of possibilities that opens your mind and seems much more useful.

Anyway, I think the reason we see so much enthusiasm for LLM coding assistants right now is the overall skew of developers to being more junior. I'm sure these tools will eventually get better, at least I hope they do because there's going to be a whole lot of enthusiastically written but questionable code out there soon that will need to be fixed and there probably won't be enough human capacity to fix it all.


Thanks for saying it explicitly. I definitely have the same sense, but was hoping someone with experience would chime in about use cases they have for it.


I'm a mathematician and the problems I work on tend to be quite novel (leetcode feel but with real-world applications). I find LLMs to be utterly useless at such tasks; "pair programming a junior, but without the benefit" is an excellent summary of my experience as well.


It's good for writing that prototype you're supposed to throw away. It's often easy to see the correct solution after seeing the wrong one.


I think the only way to answer that is if you can share an example of a conversation you had with it, where it broke down as you described.


For what I'm working on, even the wrong approaches are useful. Going through my fail-often, fail-fast feedback loop is a lot more efficient with LLMs. Like A LOT more.

Then, when I have a bunch of wrong answers, I can give those as context as well and make the model avoid those pitfalls. At that point my constraints for the problem are so rigorous that the LLM lands at the correct solution and frankly writes out the code 100x faster than I would. And I'm an advanced vim user who types at 155 wpm.
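A minimal sketch of what that loop might look like with the Anthropic Python SDK; the helper name, the model ID, and the example "ruled out" note are all made up for illustration, not the actual setup described above:

    from anthropic import Anthropic

    client = Anthropic()

    ruled_out: list[str] = []  # plain-English notes on approaches that didn't work

    def next_attempt(problem: str) -> str:
        """Ask again, telling the model which approaches are already rejected."""
        constraints = "\n".join(f"- Do not use: {note}" for note in ruled_out)
        prompt = (
            f"{problem}\n\n"
            f"Approaches already ruled out:\n{constraints or '- none yet'}"
        )
        response = client.messages.create(
            model="claude-3-5-sonnet-20240620",  # placeholder model ID
            max_tokens=4096,
            messages=[{"role": "user", "content": prompt}],
        )
        return "".join(b.text for b in response.content if b.type == "text")

    # After reviewing a failed attempt, note why it failed and ask again:
    # ruled_out.append("recursion blows the stack on inputs over ~10k elements")
    # code = next_attempt("Write a function that ...")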


> And I’m an advanced vim user who types at 155 wpm.

See, it's comments like this that make me suspect that I'm working on a completely different class of problem than the people who find value in interacting with LLMs.

I'm a very fast typer, but I've never bothered to find out how fast because the speed of my typing has never been the bottleneck for my work. The bottleneck is invariably thinking through the problem that I'm facing, trying to understand API docs, and figuring out how best to organize my work to communicate to future developers what's going on.

Copilot is great at saving me large amounts of keystrokes here and there, which is nice for avoiding RSI and occasionally (with very repetitive code like unit tests) actually a legit time saver. But try as I might I can't get useful output out of the chat models that actually speeds up my workflow.


I have always thought of it as a way to figure out what doesn't work and get to a final design, not necessarily code. Personally, I find it easy to verify a solution and spot the use cases that wouldn't work out. I keep iterating until I have either figured out a mental model of the solution, or figured out the main problems with such a hypothetical solution.


Oh yes, totally agree, it's like having a very experienced programmer sitting next to you.

He still needs instructions on what to do next and lacks a bit of "initiative", but in terms of pure coding skill it's amazing (aka, we will get replaced over time, and it's already the case: I don't need the help of contractors, I prefer to ask Claude).


More like an insanely knowledgeable but very inexperienced programmer. It will get basic algorithms wrong (unless it's in the exact shape it has seen before). It's like a system that automatically copy-pastes the top answer from stackoverflow in your code. Sometimes that is what you want, but most of the time it isn't.


This sentiment is so far from the truth that I find it hilarious. How can a technically adept person be so out of touch with what these systems are already capable of?


An LLM can write a polite email but it can't write a good novel. It can create art or music (by mushing together things it has seen before) but not art that excites. It's the same with code. I use LLMs daily and I've seen videos of other people using tools like Cursor, and so far it looks like these LLMs can only help in situations where it is pretty obvious (to the programmer) what the right answer looks like.


For all that, ChatGPT is actually one of the top authors of e-books on Amazon.

But I agree that for some creative tasks, like writing or explaining a joke, or some novel algorithms, it's very bad.


The LLM-generated e-book thing is actually a serious problem. Have you read any of them? Consumers could lose trust unless it's fixed. If you buy a book and then realise that nobody, not even the seller, has ever read it, because it regularly turns into incomprehensible mush, are you more or less likely to buy a book from the same source?


What I find hilarious (or even shocking) is how overhyped some people are about these tools.


I keep hearing this comment everywhere Claude is mentioned, as if there is a coordinated PR boost on social media. My personal experience with Claude 3.5, however, is: meh. I don't see much difference compared to GPT-4, and I use AI to help me code every day.


Yeah, they really like to mention it everywhere. It's good, but imo not as good as some people make it out to be. I have used it recently for libGDX with Kotlin and there are things where it struggles; the code it sometimes gives isn't really "good" Kotlin, but it takes a good programmer to know what is good and what is not.


I think it won't work as well in more esoteric languages. For Python and C++ it is excellent, and surprisingly its Rust is also pretty damn good.

(I am not a paid shill, just in awe of what Sonnet 3.5 + Opus can do)


Kotlin isn't exactly an esoteric language though


User error.


Please consider avoiding more ad hominem attacks or revising the ones you've already plastered onto this discussion.


How are you liking Cursor? I tried it ~a year ago, and it was quite a bit worse than ferrying back and forth between ChatGPT and VSCode.

Is it better than using GitHub Copilot in VSCode?


Definitely better. I ended my Copilot subscription.


Oh interesting, will give it another go, thnx



