Thanks for your comment! Maybe it's time for picol2.c: the same number of lines of code, but with floats and [expr]. I'm going to use a few hours today to get there. Such small changes would make it more practical for real use cases where one wants to retain understandability.
It depends on the use case. For instance, you open a socket, write any value you have to it without serialization, read it on the other side, and the data transfer is done.
The convenience of not having to marshal data over a network is certainly a use case. But I'll admit that two of the worst programs I ever saw were written in Perl and TCL. Somehow just a big jumble of inscrutable regexes.
When "everything is a string" then you have no choice but to literally treat everything as a string. Painful. Very cool project though.
The objections to tcl I see most often seem to reflect an outdated or incomplete understanding of the language and its capabilities.
As with many interpreted languages, maybe it's just too easy to iterate interactively with code until it "just works", rather than taking the time to design the code before writing it, leading to the perception that it is a "write-only" language.
However, despite its reputation, even Perl can be written in a way that is human-readable. I agree with you that it is more a reflection of the programmer or the work environment than of the language itself.
> When "everything is a string" then you have no choice but to literally treat everything as a string.
As someone who has been developing in tcl almost daily for more than 30 years, writing both object-oriented and imperative code, I have not found it necessary to think this way.
Can you explain what leads you to this conclusion?
It is absolutely possible and not even that hard. If you use Redis Vector Sets you will easily see 20k-50k queries per second (depending on hardware) with tens of millions of entries, and the results don't get much worse if you scale further. Of course, all of that assumes serving data from memory, as Vector Sets do. Note: I'm not talking about the RedisSearch vector store, but the new "vector set" data type I introduced a few months ago. The HNSW implementation of vector sets (AGPL) is quite self-contained and easy to read if you want to check how to achieve similar results.
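For reference, driving a vector set from Python via redis-py looks roughly like the sketch below; the VADD/VSIM argument order is written from memory, so treat it as an assumption and double-check it against the docs:

```python
# Rough sketch: adding vectors to a Redis vector set and querying it.
# Assumes Redis 8+ with vector sets; the exact VADD/VSIM syntax should be
# verified against the official documentation.
import redis

r = redis.Redis(decode_responses=True)

# VADD key VALUES <dim> <components...> <element>
r.execute_command("VADD", "vs", "VALUES", 3, 0.1, 0.2, 0.3, "item:1")
r.execute_command("VADD", "vs", "VALUES", 3, 0.9, 0.1, 0.4, "item:2")

# VSIM key VALUES <dim> <components...> COUNT <n>  ->  most similar elements
print(r.execute_command("VSIM", "vs", "VALUES", 3, 0.1, 0.2, 0.25, "COUNT", 2))
```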
Author here. Thanks for sharing—always great to see different approaches in the space. A quick note on QPS: throughput numbers alone can be misleading without context on recall, dataset size, distribution, hardware, distance metric, and other relevant factors. For example, if we relaxed recall constraints, our QPS would also jump significantly. In the VectorDBBench results we shared, we made sure to maintain (or exceed) the recall of the previous leader while running on comparable hardware—which is why doubling their throughput at 8K QPS is meaningful in that specific setting.
You're absolutely right that a basic HNSW implementation is relatively straightforward. But achieving this level of performance required going beyond the usual techniques.
Yep, you are right, and also: quantization is a big factor here. For instance int8 quantization has a minimal effect on recall, but makes the dot product between vectors much faster, and speeds things up a lot. The number of components in the vectors also makes a huge difference. Another thing I didn't mention is that the Redis implementation (vector sets), for instance, is threaded, so the numbers I reported are not about a single core. Btw I agree with your comment, thank you. What I wanted to say is simply that the results you get, and the results I get, are not "out of this world", and are very credible. Have a nice day :)
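To make the quantization point concrete, here is a minimal numpy sketch (not the actual vector sets code, which is written in C) of symmetric int8 quantization and the approximate dot product computed on the quantized values; in a real implementation the speedup comes from SIMD-friendly int8 math and lower memory traffic:

```python
# Minimal illustration of int8 quantization for dot products (numpy sketch,
# not the Redis vector sets implementation).
import numpy as np

def quantize_int8(v):
    """Map a float32 vector to int8 plus a per-vector scale factor."""
    scale = np.abs(v).max() / 127.0
    return np.round(v / scale).astype(np.int8), scale

rng = np.random.default_rng(42)
a = rng.standard_normal(768).astype(np.float32)
b = rng.standard_normal(768).astype(np.float32)

qa, sa = quantize_int8(a)
qb, sb = quantize_int8(b)

exact = float(np.dot(a, b))
# Accumulate in int32 to avoid int8 overflow, then undo the per-vector scaling.
approx = int(np.dot(qa.astype(np.int32), qb.astype(np.int32))) * sa * sb
print(f"exact={exact:.4f} approx={approx:.4f}")  # the two values stay close
```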
Appreciate the thoughtful breakdown—you're absolutely right that quantization, dimensionality, and threading all play a big role in performance numbers. Thanks for the kind words and for engaging in the discussion. Wishing you a happy Year of the Horse—新春快乐,马年大吉!
Things are a bit more complicated. Actually Redis the company (Redis Labs, and previously Garantia Data) offered to hire me since the start, but I was at VMware, later at Pivotal, and just didn't care; I wanted to stay "super partes" out of idealism. But Pivotal and Redis Labs actually shared the same VC, so it made a lot more sense to move to Redis Labs and work there with the same level of independence, and that's what happened. Once I moved to Redis Labs a lot of good things happened that made Redis mature much faster: we had a core team all working on the core, and I was no longer alone when there were serious bugs, improvements to make, and so forth. During those years many good things happened, including Streams, ACLs, memory reduction work, modules, and in general things that made Redis more solid.

To be maintained at scale, open source software needs money, so in the past we tried hard to avoid moving away from BSD. But in the new hyperscaler situation it was impossible to avoid, I guess. I was no longer with the company at that point; I believe the bad call was going SSPL, a license very similar to AGPL but not accepted by the community. Now we are back to AGPL, and I believe that in the current situation this is a good call. Redis never stopped: 1. providing the source on Github and continuing the development; 2. releasing it under a source-available license (not OSI approved but practically very similar to AGPL); 3. looking for a different way to do it... and indeed Redis returned to AGPL a few months after I was back, maybe because I helped a bit, but inside the company there was since the start a big slice that didn't accept the change. So Redis is still open source software and still maintained. I can't see a parallel here.
The search for speed is vain. Often Claude Code Opus 4.6, on hard enough problems, can give the impression of moving fast without really making progress because it lacks focus on what matters. Then you spin up the much slower GPT 5.3-Codex and it fixes everything in 3 minutes of doing the right thing.
What codex often does for this is write a small Python script and execute it to bulk rename things, for example.
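Something along these lines (the identifier names and the src/ path are made up for illustration):

```python
#!/usr/bin/env python3
# Illustrative bulk-rename script of the kind an agent might generate on the
# fly; the identifiers and the src/ directory are hypothetical examples.
import pathlib
import re

OLD, NEW = "old_name", "new_name"  # hypothetical identifiers to swap
pattern = re.compile(rf"\b{re.escape(OLD)}\b")

for path in pathlib.Path("src").rglob("*.py"):
    text = path.read_text()
    updated, count = pattern.subn(NEW, text)
    if count:
        path.write_text(updated)
        print(f"{path}: {count} replacement(s)")
```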
I agree that there is a use for fast "simpler" models; there are many tasks where the regular codex-5.3 is not necessary, but I think it's rarely worth the extra friction of switching from regular 5.3 to 5.3-spark.
I will always take more speed. My use of LLMs always comes back to doing something manually, from reviewing code to testing it to changing direction. The faster I can get the LLM part of the back-and-forth to complete, the more I can stay focused on my part.
disagree. while intelligence is important, speed is especially important when productionizing AI. it’s difficult to formalize the increase in user experience per increase in TPS but it most definitely exists.
Hi! This model is great, but it is too big for local inference. Whisper medium (the "base" IMHO is not usable for most things, and "large" is too large) is a better deal for many environments, even if the transcription quality is noticeably lower (and even if it does not have a real online mode). But... it's time for me to check the new Qwen 0.6 transcription model. If it works as well as their benchmarks claim, that could be the target for very serious optimizations and a no-deps inference chain designed from the start for CPU execution, not just for MPS, since many times you want to install such transcription systems on servers rented online via Hetzner and other similar vendors. So I'm going to handle it next, and if it delivers, it's really time for big optimizations covering specifically the Intel, AMD, and ARM instruction sets, potentially also thinking about 8-bit quants if the performance remains good.
Same experience here with Whisper: medium is often not good enough. The large-turbo model, however, is pretty decent and, on Apple silicon, fast enough for real-time conversations. The addition of the prompt parameter can also help with transcription quality, especially when using domain-specific vocabulary. In general Whisper.cpp is better at transcribing full phrases than at streaming.
And let's not forget: for many use cases more than just English is needed. Unfortunately, right now most STT/ASR and TTS systems focus on English plus 0-10 other languages. Thus, being able to add more languages or domain-specific vocabulary with reasonable effort would be a huge plus for any STT or TTS.
1. In the real world, for a similar task, there is little reason not to give the agent access to all the papers about optimization, the ISA PDFs, and MIT-licensed compilers of all kinds. It will perform much better, and that is evidence that the "uncompressing GCC" idea is just a claim (but point 2 shows this even more).
2. Of all the tasks, the assembler is the part where memorization would help the most. Instead the LLM can't perform without the ISA documentation that it saw repeated countless times during pre-training. Guess what?
3. Rust is a bad language for the test, as a first target. If you want an LLM-coded C compiler in Rust and you have LLM experience, you would go C compiler first -> Rust port. Rust is hard when there are mutable data structures with tons of references around, and a C compiler is exactly that. Composing complexity from different layers is an LLM anti-pattern that anyone who has worked a lot with automatic programming knows very well.
4. In the real world, you don't do a task like that without steering. And steering will do wonders. That's not to say the experiment was ill conceived. The fact is that the experimenter was trying to make a different point than the one the Internet took away (as usual).
> the experimenter was trying to make a different point than the one the Internet took away (as usual)
All of your points are important, but I think this is the most important one.
Having written compilers myself, I'd say $20k in tokens to get to a foundation for a new compiler with the feature set of this one is a bargain. Now, the $20k excludes the time needed to set up the harness, so the total cost would be significantly higher, but still.
The big point here is that the researchers in question demonstrated that a complex task such as this could be achieved shockingly cheaply, even when the agents were intentionally forced to work under unrealistically harsh conditions, with instructions to include features (e.g. SSA form) that significantly complicated the task but made the problem closer to producing the foundation for a "proper" compiler rather than a toy compiler, even if the outcome isn't a finished, production-ready, multi-arch C compiler.
1. Take long pauses: 1h of work, then stop for 30 minutes or more. The productivity gain should leave you more time to rest. Alternatively, work just 50% of the time, 2h in the morning and 2h in the evening instead of 8 hours, while still trying to deliver more than before.
2. Don't mix N activities. Work in a very focused way on a single project, making meaningful progress.
3. Don't be too open-ended in the changes you make just because you can now do them quickly. Do what really matters.
4. When you are away, put an agent on the right rails to iterate and potentially produce some very good results in terms of code quality, security, speed, testing, ... This increases productivity without stressing you. When you get back, inspect the results, discard everything that is trash, and keep the gems, if any.
5. Be minimalistic even if you no longer write the code. Prompt the agent (and write your AGENT.md file) to be focused, to avoid adding useless dependencies or complexity, to keep the line count low, and to accept an improvement only when the complexity cost/gain ratio is adequate.
6. Turn your flow into specification writing. Stop and write your specifications, even for a long time, without interruptions. This will greatly improve the output of the coding agents. And it is a moment of calm, focused work for you.
(1) is not something the typical employee can do, in my experience. They're expected to work eight hours a day. Though I suppose the breaks could be replaced with low-effort, low-brainpower work to implement a version of that.
Work for a smaller company with more reasonable expectations of a knowledge worker.
You're an engineer, not a manager, or a chef, or anything else. Nothing you do needs to be done Monday-Friday between the hours of 8 and 5 (except for meetings). Sometimes it's better if you don't do that, actually. If your work doesn't understand that, they suck and you should leave.
1) Is this for founders? Because employees surely can't do this. With new AI surveillance tech, companies are looking over our shoulders even more than before.