
> I can't find a single open source codebase, actively used in production, and primarily maintained and developed with AI

This popular repo (35.6k stars) documents the fraction of code written by an LLM for each release since about a year ago. The vast majority of releases since version 0.47 (now at 0.85) had the majority of their code written by an LLM (the average fraction written by aider per release since then is about 65%).

https://github.com/Aider-AI/aider

https://github.com/Aider-AI/aider/releases


I think we need to move the goalposts to "unrelated to/not in service of AI tooling" to escape easy mode. Replace some core unix command-line tool with something entirely vibecoded. Nightmare level: do a Linux graphics or networking driver (in either Rust or C).

Yes, I agree. The same way that when you ask "Are there any production codebases written in Language X", you typically mean "excluding the Language X compiler & tooling itself." Because of course everyone writing a tool loves bootstrapping and dogfooding, but it doesn't tell you anything about production-readiness or usefulness / fitness-for-purpose.

I didn't read your draft paper, but your premise as stated here on HN sounds a bit off to me. AGI does not assume the ability to find or learn an optimal solution to every problem (with that assumption it would be trivial to prove it impossible in many different ways). Independent of the exact definition, a system of intelligence that is better than or equal to the best human in any domain would at least be termed AGI. (If there exist a couple of incompressible problems along the way, you can memorize the human solution.) If you proved AGI impossible under such a (weaker?) definition, you would be proving that humans can no longer improve in any domain (since the set of all humans is a general intelligence). Or you would need to assume that there is something special inside humans that no technology can ever build. I disagree with both premises.

> Independent of the exact definition, a system of intelligence that is better than or equal to the best human in any domain would at least be termed AGI.

Exactly. There's this "thing" you see in certain circles, where people (intentionally?) misinterpret the "G" in AGI as meaning "the most general possible intelligence". But that's not the reality. AGI has pretty much always been taken to mean "AI that is approximately human level". Going beyond that gets into the realm of Artificial Super Intelligence, or Universal Artificial Intelligence.


What I remember is that a lot of people used the word 'AI', others (including me) said 'that's not intelligence; it's too specific', and poof, a new word, 'AGI', came along to replace 'AI', meaning an AI that can adapt to new, unforeseen situations.

The LLM that will convince me that AGI is near is one that understands language well enough to work out the linguistic rules of conlangs it wasn't trained on (or more specifically, engineered languages made by linguists using our current knowledge of how languages work), and to produce grammatically correct sentences in them. That's something a trained person can do with great effort, but the difficulty is more about breaking habits and limited brainpower than about real complexity.


I think that OP's conclusion may be true, but in a not very meaningful sense: once a particular non-trivial threshold of competence is defined for every task (and there are infinitely many of them), any policy must be bad at some of them.

I have used a personally adjusted variant of M-x shell for most of my last 20 years of Emacs use (after trying various hacks, including things mentioned in this article). I end up having about 20–100 named shell buffers on each machine, for different projects, logs, etc., and I keep the history of each shell in its own separate history file. It helps that the buffers are practically infinite in size and I can use standard Emacs editing for anything, plus comint, which is a great, simple interface to shells. I have some tools to split the Emacs frame into multiple pieces and reorganize them quickly, so I can see any number of my open shells, limited only by the monitor size. I don't understand how people live without such shell management; I've seen expert tmux users struggle to keep track of all their shells, whereas Emacs has tons of tools for working with buffers that just work here: starting from simple ibuffer, bookmarks, and previews, to multi-occur (which can specialize to shells or subsets of them), to whatever you wish to do with buffers really.

btop/htop/nvtop and other curses-heavy tools get specialized solutions (open vterm or eat and run them there), but all the composable unix tools run in M-x shell.


I think if you were to write a post (or, better yet, a video) demonstrating your workflow, that would be a great resource! I love working in M-x shell, but your workflow sounds next level.

Cool. Would it be possible to eliminate that little vocab format conversion requirement for the vocab I see in the test against tiktoken? It would be nice to have a fully compatible drop-in replacement without having to think about the details. It would also be nice to have examples that work the other way around: initialize tiktoken as you normally would, including any specialized extension of the standard tokenizers, and then use that initialized tokenizer to initialize a new tokendagger and test identity of the results.
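
Something like the following is what I have in mind (a rough Python sketch; the tokendagger import and constructor below are placeholders, not the project's real API):

    # Hypothetical round-trip test: initialize tiktoken the normal way, build the
    # tokendagger tokenizer from it, and check that both produce identical tokens.
    import tiktoken
    import tokendagger  # placeholder import; not necessarily the real package name

    def check_identity(encoding_name: str, samples: list[str]) -> None:
        enc = tiktoken.get_encoding(encoding_name)   # standard tiktoken setup
        dagger = tokendagger.from_tiktoken(enc)      # assumed constructor name
        for text in samples:
            expected = enc.encode(text)
            actual = dagger.encode(text)
            assert actual == expected, f"mismatch on {text!r}"

    check_identity("cl100k_base", ["hello world", "emoji 🙂 and\nnewlines", "  indented code"])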


Alright, 0.1.1 should now be a true drop-in replacement. I'll write up some examples soon.


Ah good catch. Updating this right now.


V3 is number 5 in your list. R1-0528 (free) is number 11 and R1 (free) is number 15. OpenRouter separates the free instances (in the top-20 list you shared) from the paid ones (further down) for V3 and R1, and of course it doesn't count direct connections to the providers, or the various self-hosted deployments (the choice of companies working in sensitive areas, including many of my friends' employers).


Not sure what you are referring to; do you have a pointer to a technical writeup perhaps? In training and inference, MLA uses way fewer FLOPs than MHA, which is the gold standard, and has way better accuracy (model performance) than GQA (see the comparisons in the DeepSeek papers, or try DeepSeek models vs Llama at long context).
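
For intuition on the efficiency gap, here is a back-of-the-envelope Python sketch of the per-token KV cache each attention variant has to keep (cache size rather than raw FLOPs, but it is what dominates long-context inference cost); the dimensions are illustrative placeholders in the rough ballpark of published configs, not exact values:

    # Per-token KV-cache size (in elements) for different attention variants.
    # All dimensions below are illustrative, not exact model configs.

    def mha_kv_per_token(n_heads: int, head_dim: int) -> int:
        # Full multi-head attention caches one key and one value per head.
        return 2 * n_heads * head_dim

    def gqa_kv_per_token(n_kv_heads: int, head_dim: int) -> int:
        # Grouped-query attention caches keys/values only for the KV head groups.
        return 2 * n_kv_heads * head_dim

    def mla_kv_per_token(latent_dim: int, rope_dim: int) -> int:
        # Multi-head latent attention caches a compressed latent plus a small
        # decoupled positional key instead of full per-head keys and values.
        return latent_dim + rope_dim

    print(mha_kv_per_token(128, 128))  # 32768 elements per token
    print(gqa_kv_per_token(8, 128))    # 2048 elements per token (8 KV groups)
    print(mla_kv_per_token(512, 64))   # 576 elements per token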

More generally, with any hardware architecture you use, you can optimize throughput for your main goal (initially training; later inference) by balancing the other parameters of the model architecture. Even if training ends up suboptimal, if you want to make a global impact with a public model, you aim for the next generation of Nvidia inference hardware.


I had related questions and checked out the project a bit more deeply, though I haven't tested it seriously yet. The project did start over a year ago based on relevant papers, before vLLM or SGLang had decent solutions; it might still add performance in some workflows, though I haven't tested that, and some of the published measurements in the project are now stale. Caching the LLM KV cache to disk or to external memory servers can be very helpful at scale. Cache management and cache invalidation are hard anyway, and I am not sure at what level a tight integration with inference servers or specialized inference pipelines helps, vs a loose coupling that could let each component advance separately. It would be nice if there were decent protocols used by all inference engines to help this decoupling.
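
By decent protocols I mean something as small as a shared, model-agnostic cache-store interface that any engine could implement. A purely hypothetical Python sketch (none of these names correspond to an existing API):

    # Hypothetical model-agnostic KV-cache store interface; illustrative only,
    # not a reference to any existing protocol or library.
    from typing import Optional, Protocol
    import hashlib

    class KVCacheStore(Protocol):
        def put(self, key: str, kv_blob: bytes, ttl_s: Optional[int] = None) -> None:
            """Persist a serialized KV-cache segment under an opaque key."""
            ...

        def get(self, key: str) -> Optional[bytes]:
            """Return the serialized segment, or None on a cache miss."""
            ...

    def cache_key(model_id: str, token_prefix: list[int]) -> str:
        # Key on model identity plus the exact token prefix, so the store stays
        # agnostic to how each engine lays out its KV tensors internally.
        digest = hashlib.sha256(repr(token_prefix).encode("utf-8")).hexdigest()
        return f"{model_id}:{digest}"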


Is your aim targeting inference at scale, or specialized/new/simpler inference pipelines? SGLang and vLLM have disaggregated prefill and decode serving (e.g. https://docs.vllm.ai/examples/online_serving/disaggregated_s... or https://github.com/sgl-project/sglang/issues/3554 and https://github.com/sgl-project/sglang/issues/4655). Could your solution enable a model-agnostic cache store/server, or is that orthogonal to what you are trying to achieve?


The best places to work are full of people who are intrinsically motivated. "Good job" is implied. Feedback, including criticism, is expected, because it helps improve things further or recognise when something is good enough. Academics do have egos, but they typically compete against academics in other, remote departments, and many of the best ones behave in ego-free ways within their own groups and collaborations. The same goes for good industry teams, where the key is for management to get out of the way and let the intrinsically motivated contributors work. The competition is the outside world, not grabbing a larger share of the same team's resources. If you like problem solving, stay humble and help others; you will have a lifetime of fun, even if some bad days feel rocky.


The performance figure in the link clearly says that it's a significant improvement over the 3050.


The charts are from The Verge, which is not exactly known for its integrity in regard to anything.

It's also with DLSS on, so you could just as easily have the framerate be 100 FPS, 1000 FPS, or 10000 FPS. The GPU doesn't actually have to render the frame in that case, it just has to have a pixel buffer ready to offload to whatever hardware sends it over the link to the display. Apparently some people actually really like this, but it isn't rendering by any reasonable definition.



This is a better link: https://gamersnexus.net/gpus/nvidia-selling-lies-rtx-5070-fo...

It's about the RTX 5070 but the criticisms still hold for the RTX 5050 since Nvidia is still doing the same shenanigans.


Looking at 20 of their articles, they seem to be biased when it comes to AMD vs Nvidia.


I don't follow everything, but they give AMD flak too, maybe in a different way.


Calling out NVIDIA's bullshit (and oh boy did they deliver a lot of it in the past few years) is not the same as being biased.


Why would anyone trust Nvidia to not stretch the truth, especially with a press release? It's been shown multiple times they inflate their numbers.


This is creative marketing from Nvidia. Notice the "With DLSS 4".

That's AI frame hallucination, which the 5050 has.

Without DLSS, the numbers from independent reviewers have been basically on par with the previous generation (about a 10% increase in performance).


That's Nvidia's marketing slide, and if you read the fine print, the cards are tested at different settings. The RTX 5050 is using 4x frame generation, which the 3050 isn't. TechPowerUp has the RTX 5050 at roughly 20% faster than the 3050, which is certainly not enough to justify upgrading.

