Hacker News | combinator_y's comments

I am wondering whether a different ordering would 'peak' sooner in terms of perf. Instead of doing:

- Optimization 1: Do not allocate a Vector when tokenizing
- Optimization 2: Zero allocations (parse directly from the input bytes)
- Optimization 3: Do not use Peekable
- Optimization 4: Multithreading and SIMD
- Optimization 5: Memory-mapped I/O

Example:

- Optimization 1: Memory-mapped I/O
- Optimization 2: Do not use Peekable
- Optimization 3: Do not allocate a Vector when tokenizing
- Optimization 4: Zero allocations (parse directly from the input bytes)

Conclusion

- Optimization 5: Multithreading and SIMD

I might be guessing, but in this order you would probably already reach high enough throughput by Optimization 3 that you wouldn't bother with manual SIMD or multithreading. (This is the pragmatic way: in real life you try to minimize risk and reach the goal as fast as possible, and SIMD/multithreading carry a lot of risk for your average dev team.)


> I might be guessing, but in this order you would probably already reach high enough throughput by Optimization 3 that you wouldn't bother with manual SIMD or multithreading.

I agree with you, though in my experience memory mapping is only useful if you need to jump around the file or read it multiple times (as is the case after the author added SIMD and a two-pass step: the first pass to identify whitespace and the second to parse the operations and the operands). If you just need to read the file once, it's better to avoid memory mapping, as it adds a little overhead.
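As a rough sketch of the "read it once" case, here is a single streaming pass through a fixed-size buffer using only the standard library. The function name `count_whitespace` and the 64 KiB buffer size are my own choices for illustration, and counting whitespace is just a stand-in for the real per-byte work; I have not benchmarked this against memory mapping.

```rust
use std::io::{self, Read};

/// Stream `reader` exactly once through a fixed-size buffer and count
/// ASCII whitespace bytes. Hypothetical helper, not from the article.
fn count_whitespace<R: Read>(mut reader: R) -> io::Result<usize> {
    // Assumed buffer size; tune for your workload.
    let mut buf = [0u8; 64 * 1024];
    let mut count = 0;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // A read interrupted by a signal can simply be retried.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        count += buf[..n].iter().filter(|b| b.is_ascii_whitespace()).count();
    }
    Ok(count)
}
```

Anything that implements `Read` works here, so you can pass a `std::fs::File` for real input or a byte slice such as `&b"1 + 2\n"[..]` when testing.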

On the other hand, parsing directly from the input bytes, which avoids the UTF-8 validation needed to get a &str, is easy enough to do and still improves performance quite a bit. Even the Rust csv crate, which does much more, is around 30% faster with this optimization. https://docs.rs/csv/latest/csv/tutorial/index.html#amortizin...
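To make the difference concrete, here is a minimal sketch of both paths for parsing a number. The function names are hypothetical and this is not the article's code, just an illustration of where the validation cost comes from.

```rust
/// Parse a run of ASCII digits at the start of `input` straight from the
/// bytes. Returns the value and how many bytes were consumed, or None if
/// there is no leading digit or the value overflows u64.
fn parse_u64_bytes(input: &[u8]) -> Option<(u64, usize)> {
    let mut value: u64 = 0;
    let mut len = 0;
    for &b in input {
        if !b.is_ascii_digit() {
            break;
        }
        value = value.checked_mul(10)?.checked_add((b - b'0') as u64)?;
        len += 1;
    }
    if len == 0 {
        None
    } else {
        Some((value, len))
    }
}

/// The validating path, for comparison: `std::str::from_utf8` has to scan
/// the whole slice before `parse` even starts looking at digits.
fn parse_u64_str(input: &[u8]) -> Option<u64> {
    let s = std::str::from_utf8(input).ok()?;
    s.trim().parse::<u64>().ok()
}
```

The byte version only has to ask whether each byte is an ASCII digit, so it never needs the input to be valid UTF-8 in the first place.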

That is to say, my list of "easy optimizations, big gains" would be: 1) do not allocate a Vector, 2) do not use Peekable, 3) avoid UTF-8 validation. I'm still guessing, but I think memory mapping can be skipped, and it might be worth it only if you also plan on implementing SIMD.
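Those three points fit together in one small sketch. I'm assuming a toy grammar of unsigned integers joined by + and - (the article's actual grammar may differ), and `Token`, `Tokens` and `eval` are names I made up for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Num(u64),
    Plus,
    Minus,
}

/// Lazy tokenizer over raw bytes: no Vec of tokens, no &str, no Peekable.
struct Tokens<'a> {
    input: &'a [u8],
    pos: usize,
}

impl<'a> Tokens<'a> {
    fn new(input: &'a [u8]) -> Self {
        Tokens { input, pos: 0 }
    }
}

impl<'a> Iterator for Tokens<'a> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        // Skip ASCII whitespace.
        while self.pos < self.input.len() && self.input[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
        let b = *self.input.get(self.pos)?;
        match b {
            b'+' => {
                self.pos += 1;
                Some(Token::Plus)
            }
            b'-' => {
                self.pos += 1;
                Some(Token::Minus)
            }
            b'0'..=b'9' => {
                // Wrapping arithmetic keeps the sketch short; real code
                // should report overflow instead.
                let mut v: u64 = 0;
                while let Some(&d) = self.input.get(self.pos) {
                    if !d.is_ascii_digit() {
                        break;
                    }
                    v = v.wrapping_mul(10).wrapping_add((d - b'0') as u64);
                    self.pos += 1;
                }
                Some(Token::Num(v))
            }
            // Unknown byte: this sketch simply stops tokenizing.
            _ => None,
        }
    }
}

/// Evaluate left to right. Operands and operators strictly alternate, so
/// the loop never needs to peek at the next token.
fn eval(input: &[u8]) -> Option<i64> {
    let mut tokens = Tokens::new(input);
    let mut acc = match tokens.next()? {
        Token::Num(n) => n as i64,
        _ => return None,
    };
    while let Some(op) = tokens.next() {
        let rhs = match tokens.next()? {
            Token::Num(n) => n as i64,
            _ => return None,
        };
        match op {
            Token::Plus => acc += rhs,
            Token::Minus => acc -= rhs,
            Token::Num(_) => return None,
        }
    }
    Some(acc)
}
```

Because the tokenizer keeps its own position into the byte slice, any lookahead can be done by indexing the slice directly, which is why Peekable isn't needed here.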


On top of the shit system in place, there is no corporate control internally (5th screenshot "NGA FS" ...)

No, it's on purpose. If you follow what Elon does, he is A/B testing and 'fixing' things when they go viral.

He is doing the worst thing that could happen, leading us (users of X, the USA, humanity) into the abyss with his obsession and sickness (yes, he is sick and he should go see a therapist).

