
Yeah. I love the Zed AI assistant for the way it manages context (inline edits in the context of the chat) and how "raw" it is. However, I mostly use Goland for coding, so I was getting annoyed at having to switch constantly.

So, in a couple of evenings (with a working PoC already after the first evening) I managed to basically replicate a Zed AI-like integration as a Goland plugin (much less polished and feature-rich, but covering exactly what I need).

I'd never written Kotlin nor a JetBrains plugin before, and since what I wanted was quite complex, it would've easily taken me 1-4 weeks of full-time work otherwise - which is to say, I would've never done it. It also required a ton of grepping around the intellij-community repo to see what's available and what the existing patterns are (all done by the AI, of course).

In this case I vibe coded it (I deem that fine for side projects, not for production work of course) with Claude Code, and it cost me ~$100 all in, while I was mostly chilling / gaming. Later stages definitely required me to read the code and understand it, since, as you say, vibe coding falls apart at a certain point pretty drastically. But with a working skeleton (and code I could modify instead of authoring from scratch), I could easily polish it to sufficient stability.

All in all, a pretty magical experience, and I now use my new plugin all day. Not only is it amazing bang for the buck, it enables me to do side projects I otherwise wouldn't have had the time for, nor wanted to put the effort into.


That's an excellent article - it's great when people share not only their victories, but also their mistakes and what they learned from them.

That said, regarding both rapid gameplay-mechanic iteration and modding - wouldn't that generally be solved via a scripting language on top of the core engine? Or is Rust + Bevy not supposed to be engine-level development only, but actually supposed to solve the gameplay development use case too? This is very much not my area of expertise; I'm just genuinely curious.


It does solve the gameplay development use case too. Bevy encourages using lots of small 'systems' to build out logic. These are functions that can spawn entities, or query for entities in the game world and modify them, and there's also a way to schedule when these systems should run.

I don't think Bevy has a built-in way to integrate with other languages like Godot does, it's probably too early in the project's life for that to be on the roadmap.


> It's just boosting people's intention.

This.

It will, in a sense, just further boost inequality between people who want to do things and folks who just want to coast without putting in the effort. The latter will be able to coast even more, and will learn even less. The former will be able to learn and do things much more effectively and productively.

Since good LLMs with reasoning arrived, I've learned so many things I otherwise wouldn't have bothered with - because I'm able to always get an explanation in exactly the format that I like, at exactly the level of complexity I need, etc. It brings me so much joy.

Not just professional things either (though those too, of course) - random "daily science trivia", like asking how exactly sugar preserves food, with both a high-level intuition and low-level molecular details. Sure, I could've learned that before if I'd wanted to, but this is something I just got interested in for a moment and had like 3 minutes of headspace to dedicate to it - and in those 3 minutes I'm actually able to get an LLM to give me an excellent, tailor-made explanation. This also made me notice that I've been having such short moments of random curiosity constantly, and previously they mostly went unanswered - now each of them can be satisfied.


> Since good LLMs with reasoning are here

I disagree. I get egregious mistakes often from them.

> because I'm able to always get an explanation

Reading an explanation may feel like learning, but I doubt it. It is the effort of going from problem/doubt to constructing a solution that is learning - the explanation is a mere description of the solution. Knowing words to that effect is not exactly learning; it is an emulation of learning, a simulacrum. And that would be bad enough if we could trust LLMs to produce sound explanations every time.

So not only is getting the explanation a surrogate for learning something, you also risk internalizing spurious explanations.


Every now and then I give LLMs a try, because I think it's important to stay up to date with technology. Sometimes there have been specs I find particularly hard to parse, in domains I'm a bit unfamiliar with, where I thought the AI could help. At first the solutions seemed correct, but on further inspection - no, they were far more convoluted than needed, even if they worked.

I can tell when my teammates' code is LLM-written, because it "functionally works", but does so in a way that is so overcomplicated and unhinged that a human isn't likely to have gone out of their way to design something so wildly and specifically weird.

That's why I don't bother with LLMs even for scripts. Scripts are short for a reason: you only have so much time to dedicate to them. And you often pillage from one script to use in another, because every line is doing something useful. But almost everything I've generated with an LLM is both long and full of abstractions.

"Risk of internalizing spurious explanations" is an excellent way of putting it. LLM output is, essentially, a polished-looking, authoritative-sounding summary of what the top few Google results probably say about a topic. Nine times out of ten, the explanation may be spot on. But "the first few google results" are not, in general, a reliable source. And after getting nine correct answers in a row from the LLM, it's unfortunately very tempting to accept the tenth at face value without consulting any primary sources.

I've been finding that ChatGPT is helpful when taking a "first dive" into an unfamiliar topic. But, after studying the topic at greater depth through primary sources, I'll start to see many subtle errors, or over-simplifications, or claims stated as facts which are actually controversial among experts, in the ChatGPT answers. Overall, I'd say ChatGPT can provide a good approximation of truth, which can speed up research by providing instant context. But it should not by any means be the final destination when researching a topic.


I think so too. Otherwise every Google Maps user would be an awesome wayfinder. The opposite is true.

First, as you get used to LLMs, you learn how to get sensible explanations out of them, and how to detect when they're bullshitting, imo. It's just another skill you have to learn, by putting in the effort of extensively using LLMs.

> Reading an explanation may feel like learning, but I doubt it. It is the effort of going from problem/doubt to constructing a solution that is learning - the explanation is a mere description of the solution. Knowing words to that effect is not exactly learning; it is an emulation of learning, a simulacrum. And that would be bad enough if we could trust LLMs to produce sound explanations every time.

Every person learns differently, and different topics often require different approaches. Not everybody learns exactly like you do. What doesn't work for you may work for me, and vice versa.

As an aside, I'm not gonna be doing molecular experiments with sugar preservation at home, especially since, as I said, my time budget is 3 minutes. The alternative here was reading about it on Wikipedia or some other website.


> It's just another skill you have to learn, by putting in the effort of extensively using LLMs.

I'd rather just skip the hassle and keep using known good sources for 'learning about' things.

It's fine to 'learn about' things - that is the extent of most of my knowledge. But from reading books, attending lectures, watching documentaries or science videos on YouTube or, sure, even asking LLMs, you can at best 'learn about' things. And with various misconceptions at that. I am under no illusion: these sources can at best give me a very vague overview of subjects.

When I want to 'learn something' - actually acquire skills - I don't think there is any other way than tackling problems, solving them, being able to build solutions independently, and being able to explain these solutions to people with no shared context. I know very few things. But I am sure to keep in mind that the many things I 'know about' are just vague apprehensions with lots of misconceptions mixed in. And I prefer to keep to published books and peer-reviewed articles when possible. Entertaining myself with 'non-fiction' books, videos, etc. is to me just entertainment. I never mistake that for learning.


Some problems do not deserve your full attention/expertise.

I am not a physicist, and I will most likely never need to do anything related to quantum physics in my daily life. But it's fun to have a quick mental model, to "have an idea" of who Max Planck was.


Why would you need to run inference through a GPU farm for that? The Wikipedia article on him is pretty interesting.

Funny you should mention him, I am very interested in his conceptions about the nature of reality:

'Planck said in 1944, "As a man who has devoted his whole life to the most clear headed science, to the study of matter, I can tell you as a result of my research about atoms this much: There is no matter as such. All matter originates and exists only by virtue of a force which brings the particle of an atom to vibration and holds this most minute solar system of the atom together. We must assume behind this force the existence of a conscious and intelligent spirit [orig. geist]. This spirit is the matrix of all matter."'


Reading an explanation is the first part of learning; ChatGPT almost always follows up with “do you want to try some example problems?”

I used ChatGPT to get comfortable DIYing my pool filter work. I started out clueless ("there is a thing that looks like $X, what is it?") and got to learning that I own a sand filter and how to maintain it.

My biggest barrier to EVERYTHING is not knowing the right word or term to search. LLMs ftw.

A proper LLM would let me search all of my work's artifacts when I ask about some loose detail I half remember. As it is, I know of a topic but simply can't find the _exact word_ to search, so I can't find the right document or Slack conversation.


This is very accurate imo - it really is the skill of proper delegation. Same for asking AI questions in an unbiased way so it doesn’t just try to please you - this has made me better at asking questions to people as well!

It’s like a slightly over-eager junior-mid developer - one which, however, doesn’t mind rewriting 30k lines of tests from one framework to another. This means I can let it handle that dirty work, while focusing on the fun and/or challenging parts myself.

I feel like there’s also a meaningful split of software engineers into those who primarily enjoy the process of crafting code itself, and those who primarily enjoy building stuff, treating the code more as a means to an end (even if they enjoy the process of writing code!). The former will likely not have fun with AI, and will likely be increasingly less happy with how all of this evolves over time. The latter, I expect, are and will mostly be elated.


> It’s like a slightly over-eager junior-mid developer

One with brain damage, maybe. I tried having Claude & Gemini modify a Go program with an absolutely trivial change (changing the units displayed in an output type), and it got one of the four lines of code right (the actual math for the unit conversion); the rest was incorrect.

In the end, I integrated the helper function it output myself.

SOTA models can generate two or three lines of code accurately at a time, and you have to describe them with such specificity that I've usually already done the hard part of the thinking by the time I have a specific enough prompt - at which point it's easier to just type out the code.

At best they save me looking up a unit conversion formula, which makes them about as useful as a search engine


That sounds very unlike my experience. I frequently get it to modify / create large parts of files at a time, successfully.

Coding agents use extreme numbers of tokens - you'd be getting rate-limited effectively immediately.

A typical small-medium PR with Claude Code for me is ~$10-15 of API credits.


I've ended up at $5K+ in a month using sonnet 3.7; I had to dial it back.

I'm much happier with gemini 2.5 pro right now for high performance at a much more reasonable cost (primarily using it with RA.Aid, but I've tried it with Windsurf, Cline, and Roo).


Hoooly hell. I swear the AI coding products are basically slot machines.

Or the people using them are literally clueless.

That's the largest I've heard of. Can you share more detail about what you're working on that consumes so many tokens?

It's really easy to get to $100 in a day using sonnet 3.7 or o3 in a coding agent.

Do that every day for a month and you're already at $3k/month.

It's not hard to get to $5k from there.


Sure, but how? Still wondering more specifically what you're doing. And $3-5k is unfortunately my entire month's salary.

I'm developing an open source coding agent (RA.Aid).

I'm using RA.Aid to develop itself (dogfooding), so I'm constantly running the coding agent.

That cost is my peak cost, not average.

It's easy to scale costs back to 1/10 and still get 90% of the quality. Basically that means using models like gemini 2.5 pro or DeepSeek V3 (even cheaper) rather than expensive models like sonnet 3.7 and o3.


Just try the most superior model, DeepSeek

Exactly. Just like Michelin, the tire company, created the Michelin-star restaurant list to get people to drive more and use more tires.

Too expensive for me to use for fun. Cheap enough to put me out of a job. Great. Love it. So excited. Doesn't make me want to go full Into The Wild at all.

I don’t think this is at the level of putting folks out of a job yet, frankly. It’s fine for straightforward changes, but more complex stuff, like concurrency, I still end up doing by hand.

And even for the straightforward stuff, I generally have a mental model of the changes required and give it a high level list of files/code to change, which it then follows.

Maybe the increase in productivity will reduce pressure to hire? We’ll see.


I didn't know this, thank you for the anecdata! Do you think it'd be more reasonable to generalize my suggestion to "This CLI should be included as part of ChatGPT's pricing"?

Could be reasonable for the $200/month sub maybe?

But then again, $200 upfront is a much tougher sell than $15 per PR.


Trust me bro, you don't need RAG, just stuff your entire codebase into the prompt (also we charge per input token teehee)

Fingers crossed for this to work well! Claude Code is pretty excellent.

I’m actually legitimately surprised at how good it is, since other coding agents I’ve used before have mostly been a letdown. That made me only use Claude for direct change prompting with Zed (“implement xyz here”, “rewrite this function with abc”, etc.), so very hands-on.

So I went into trying out Claude Code rather pessimistically, and now I’m using it all the time! Sure, it ends up costing a bunch, but it’s easy to justify $15 for a prompting session if the end result is a mostly complete PR, done much faster.

All that is to say - competition is good, fingers crossed for codex!


Claude Code has a closed license https://github.com/anthropics/claude-code/blob/main/LICENSE....

There is a fork named Anon Kode (https://github.com/dnakov/anon-kode) which can use more models, including non-Anthropic ones. But its license is unclear.

It's interesting to see Codex under the Apache License. Maybe somebody will extend it to be usable with competing models.


If it's a fork of the proprietary code, the license situation is pretty clear: it's violating copyright.

Now, whether or not Anthropic cares enough to enforce their license is a separate issue, but it seems unwise to make much of an investment in it.


They call it a "fork" but it doesn't share any code. It's from scratch afaik

In terms of terminal-based and open-source, I think aider is the most popular one.

yes! It's great! I like it!

But it has one downside: it's not so good on big, complex, unknown codebases where you don't know how they're structured. I wish they (or somebody else) would add an AI or some automation to pull in files dynamically, in a smart way, when you don't know the codebase structure (at the expense of burning more tokens).

I'm thinking Codex (have not checked it yet), Claude Code, Anon Kode, and all the AI editors/plugins do a better job there (and potentially burn more tokens).

But that's the only downside I can think of about aider.


I’m not positive, but I think if you do

  /context this_will_be_my_prompt

it will do a few requests on its own to decide what you need in context, add those files, and return back to you so you can continue on.


I was under the impression Aider did exactly what you're describing using its repo map feature.

Not really - the repo map only gives LLMs an overview of the codebase; aider doesn't automatically bring files into the context. You have to explicitly add the files you want it to see in their entirety. Claude Code/Codex and most other tools do this automatically, which is why they're much more autonomous.
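
Concretely, that means running something like

  /add internal/api/server.go

in aider for each file you want fully visible - the path here is just a made-up example.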

Aider regularly asks me for authorization to access files that I didn't explicitly add.

(This happens when the LLM mentions them.)

I didn't like not seeing the reasoning of the models

Seconded. I was surprised by how good Claude Code is, even for less mainstream languages (Clojure). I am happy there is competition!

Fingers crossed for what?

I started using Claude Code every day. It’s kinda expensive and hallucinates a ton (though with a custom prompt I’ve mostly tamed it).

Hope more competition can bring price down.


Too expensive. I can't understand why everyone is into Claude Code vs using Claude in Cursor or Windsurf.

I think it depends a lot on how you value your time. I'm personally willing to spend hundreds or thousands per month happily if it saves me enough hours. I'd estimate that if I were to do consulting, I'd likely be charging in the $150-250 per hour range, so by my math, it's pretty easy to justify any tools that save me even a few hours per month.

Or, increasingly, how the company values your time. If Claude Code can make a $100K/year dev 10% more productive, it's worth it to the employer to pay anything under $1600/month for it (assuming fully loaded cost of the employee to the business is twice salary).
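
Spelling that arithmetic out: $100K salary × 2 fully loaded = $200K/year; 10% of that is $20K/year, or roughly $1,667/month - hence the ~$1600/month ceiling.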

Productivity and business value are not linearly related. It could provide 0 business value to make someone 10% more productive.

I was thinking of productivity as generation of business value rather than something less correlated like lines of code produced. But sure, it's probably more accurate to directly say "business value".

OK, but in what way is a terminal a better UI than an IDE? I am trying all of them on a weekly basis, and Windsurf's UX seems miles ahead / more efficient than a terminal. That is also what OAI believes, or else they wouldn't try to buy it.

I like the terminal UX because VS Code (and any forks of it) is not my editor of choice, and swapping around to use an editor just for AI coding is annoying (I was doing that with the Zed Assistant a lot).

With Claude Code I can stay in Goland, and have Claude Code in the terminal.


You could also try JetBrains' Junie and Sourcegraph Cody.

Windsurf also has a plugin for JetBrains - they rebranded the whole company from Codeium to Windsurf, and their JetBrains plugin also supports Cascade.

I was very unimpressed with their original AI assistance implementation, so I’m gonna wait to see some user stories / reviews before I put my time into that, and so far I have seen effectively no mention of Junie anywhere.

Moreover, there’s no way to bring your own key, with the highest subscription tier being $20 per month flat it seems, which is the cost of just 1-3 sessions with Claude Code. Thus, without evidence to the contrary, I’m not holding my breath for now.


One thing that is clearly better in the terminal is secrets management/environment variables.

It's also much easier to control execution in a structured and reliable way in the terminal. Here's an automated debugging use case, for example: https://www.youtube.com/watch?v=g-_76U_nK0Y


Once I have a session going, the Claude Code terminal app has been given permission to do everything I want it to. Then I just let it burn itself out doing whatever. It's a background task. That's the big advantage: I don't babysit it.

Not a better UI at all, but it seems like they're able to focus on what matters in these early stages, and that's quality of output.

Are you still working 40 hours a week? If so, what's the difference?

I don’t - if I can use a tool that saves me 10 hours a week, that’s 10 hours more beach time for me.

Accomplishing more in that 40 hours?

And being paid more? Most salaried employees would not be.

You get the same results for cheaper by using a different tool (Windsurf's better imho).

That may be, but I think tools with a fixed monthly fee are always going to have an incentive to reduce their own costs on the backend and route you toward less capable models, cut down context size, produce less output, stop before the task is truly finished, etc.

Given how much time these models can save me, I'd rather optimize for capability and just accept whatever the price is as a cost of doing business. (Within reason I guess—I probably wouldn't go beyond $2-3k per month at this point, unless there was very clear ROI on that spend.)

Also, it's not only about saving time. More powerful AI tools allow me to build things it would otherwise be impossible to build... that's just as important as the time/cost equation.


It's literally the same model. I can build more complex stuff in Windsurf, as the IDE is better than the Cline/Roo Code integration in VS Code. It's still the same model under the hood: Sonnet 20250219.

I mean, if pouring money down the drain feels like it's helping, have at it :P


It's the same model but not necessarily the same context. Like he said, those tools try to be very 'smart' with context to save costs.

You're not actually getting all the files you add into the context window; you're getting a RAG'd version of them, which is generally much worse if the un-RAG'd code still fits within the effective context limit.


I've spent more than 40 hours/week and close to $1,000 in API credits using these tools. For me the ranking goes as below - but we'll all have different experiences.

1. Claude Code
2. Cursor
3. Cline
4. Windsurf


How you can place Windsurf at number 4 is interesting, especially given it's very similar to Cursor but leaner on the UI, while Cline is a VS Code plugin that's very verbose.

I'll stick with Windsurf, especially given their upcoming announcement.


I care a lot less about UI and more about quality of output. Windsurf has had some of the lowest quality outputs for me.

$1000 over how many 40 hour weeks?

Honestly not sure - quite a few. 6-8 or so?

How do you price this in? If you’re charging by the hour, paying out of pocket to reduce your hours seems self-defeating unless you raise your rates enough to cover both the costs and the lost hours. I can’t imagine too many clients would accept “I’m very expensive per hour because I’m fast, because I get AI to do most of it.”

As the OP said, he can now tackle more complex tasks:

> More powerful AI tools allow me to build things it would otherwise be impossible to build...

https://news.ycombinator.com/item?id=43709775


> if it saves me enough hours

You're being paid to type? I want your job.


Claude Code has been able to produce results equivalent to a junior engineer's. I spent about $300 in API credits in a month, but got value out of it far surpassing that.

If you have AWS credits...

  export CLAUDE_CODE_USE_BEDROCK=1
  export ANTHROPIC_MODEL=us.anthropic.claude-3-7-sonnet-20250219-v1:0
  export ANTHROPIC_API_TYPE=bedrock


Is this for Claude Code?

Yep

And where are these exports used? Aider?

Anecdotally, Claude Code performs much better than Claude within Cursor. Not sure if it’s a system prompt thing or if I’ve just convinced myself of it because the aesthetic is so much better, but either way the end result feels better to me.

One has an incentive to burn through as many tokens as possible, and the other has an incentive to use as few as possible.

Great point.

My conspiracy theory of choice is resource allocation and playing favorites.

I tried switching from Claude Code to both Cursor and Windsurf. Neither of the latter IDEs fully supports MCP (both were missing basic things like tool definitions and other vital features last time I tried), and both have been riddled with their own agentic-flow issues (Cursor going down for a week a bit ago, Windsurf requiring paid upgrades to "get around" bugs, etc.).

This is all ignoring the controversies that pop up around e.g. Cursor seemingly every week. As an IDE, they're both getting there -- but I have objectively better results in Claude Code.


that's what my Ramp card is for.

seriously though, anything that makes me smarter and more productive has a threshold in the thousands-of-dollars range, not hundreds


Why is using cursor with sonnet cheaper than using claude code?

Probably because Cursor is betting on many paying people not using their tool to its full extent - like people paying for gym memberships but not going to the gym.

Or they are burning VC money.


I've read anecdotal evidence that it uses tokens more sparingly than Claude Code - supported by the, likewise anecdotal, evidence that Claude Code is more effective in practice. However, that would be reasonable, as basically 1-3 sessions with Claude Code cost what a whole month of Cursor costs.

I believe most of the people in this context (myself included) are just talking about boring / obvious / easy code. And yeah, I’d say a large percentage of the code changes I make are plumbing for the few interesting ones.

AI handles the boring bits very well, and to me that is the most energetically draining part of coding (the hard parts are fun and invigorating).

Whether it fits only the current codebase’s context or not doesn’t really matter; you just give it important samples from, and info about, the codebase at the start of your prompt. My baseline prompt length is ~30k tokens due to that.

I do review and polish everything that’s generated though, as needed. Vibe coding (not even reading / understanding the generated code) I believe was coined primarily for having fun in side projects. If you’re using it for production code then you’re likely holding it wrong.


Yeah, the contexts that come to mind are basic class methods or simple data-cleaning functions for messing with numpy/pandas data.


I'm frequently constructing context based on up-to-date docs using curl + html2markdown[0] and custom CSS selectors, which is extremely tedious. MCP servers for docs would be very useful for me.
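
For the curious, that whole loop can also be a single small Go program using the library from [0] below - a rough sketch from memory (v1 API, so exact signatures may differ; the URL is a placeholder, and the CSS-selector filtering is left out):

  package main

  import (
      "fmt"
      "io"
      "net/http"

      md "github.com/JohannesKaufmann/html-to-markdown"
  )

  func main() {
      // Fetch the docs page (placeholder URL).
      resp, err := http.Get("https://example.com/docs/page")
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()

      html, err := io.ReadAll(resp.Body)
      if err != nil {
          panic(err)
      }

      // Convert the HTML to markdown - far fewer tokens than raw HTML
      // when pasted into a prompt.
      markdown, err := md.NewConverter("", true, nil).ConvertString(string(html))
      if err != nil {
          panic(err)
      }
      fmt.Println(markdown)
  }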

That said, I don't really expect the AI itself to come up with docs to read (maybe some day). I want it predominantly so I can manually reference it in my prompt (in e.g. the Zed assistant panel) like `/npmdocs packagename packageversion`.

But even for AI "self-driven" use-cases, I primarily see the value in read-only MCP servers that provide more context, just in an "as-needed" way, instead of me putting it there explicitly.

[0]: https://github.com/JohannesKaufmann/html-to-markdown


What exactly is Apidog though?

I feed LLMs documentation in the case of obscure languages, and it more or less works (with Claude).


I have no idea - I was talking about a theoretical generic npm docs MCP server, which I’ve just realized this is not.


I see. Thanks for the html-to-markdown pointer though. I used to just copy-paste either the whole HTML, or select all the text on a website, and feed that into LLMs, but it's such a waste of context and doesn't work as well. Claude has a specific format, and simonw has a project that converts source code (?) into the format Claude "likes". I think it supports more than just Claude (?).


A good recent in-depth article about this: https://andymasley.substack.com/p/individual-ai-use-is-not-b...


That and conventionally standard names, like r for request/reader, w for writer, etc.

Agreed. Functions shouldn’t be full of short, non-descriptively named variables.

The longer the lifetime/scope of a variable, and the more variables there are, the more descriptive the names should be.
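
A quick, made-up Go illustration of that rule of thumb:

  package main

  import (
      "fmt"
      "net/http"
  )

  // Package scope, long lifetime: the variable gets a descriptive name.
  var defaultGreeting = "hello"

  // Tiny scope: the conventional one-letter names (w for the writer,
  // r for the request) stay perfectly readable.
  func greet(w http.ResponseWriter, r *http.Request) {
      fmt.Fprintln(w, defaultGreeting)
  }

  func main() {
      http.HandleFunc("/", greet)
      http.ListenAndServe(":8080", nil)
  }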

