mattmaynes's comments

mattmaynes · 2025-07-05T10:39:58 1751711998

This is where the power and expressiveness of kdb+ shine. It has SIMD primitives out of the box and can optimize your code based on data types to take advantage of them. https://kx.com/blog/what-makes-time-series-database-kdb-so-f...

MangoToupe · 2025-07-05T11:06:52 1751713612

Time series is vector processing on easy mode, though. The hard part is applying SIMD to problems that aren't shaped to be easily processed in parallel.

jgalt212 · 2025-07-05T11:23:15 1751714595

Fine. What's the canonical or most basic example of where SIMD should be applied, but isn't because it's too tricky to do so?

In our shop, we never look to vectorize any function or process unless it's called inside a loop many times.

ashvardanian · 2025-07-05T12:31:41 1751718701

Text processing. Loading/branching/storing content 1 byte at a time is the CPU's worst nightmare, but most text processing is quite tricky in SIMD.
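
To make the contrast concrete, here is a minimal sketch (not taken from any of the libraries mentioned in this thread) of checking an 8-byte chunk for a delimiter. The first version branches once per byte; the second treats the chunk as one 64-bit word and tests all eight bytes with a few arithmetic operations, the classic "SIMD within a register" zero-byte trick. The function names are hypothetical, and a Python integer only models the register here, so this shows the shape of the idea rather than its speed.

```python
LOW = 0x0101010101010101    # 0x01 in every byte lane
HIGH = 0x8080808080808080   # 0x80 in every byte lane
MASK64 = 0xFFFFFFFFFFFFFFFF  # keep results within a 64-bit "register"


def has_byte_scalar(chunk: bytes, target: int) -> bool:
    """Byte-at-a-time search: one load and one branch per byte."""
    for b in chunk:
        if b == target:
            return True
    return False


def has_byte_swar(chunk: bytes, target: int) -> bool:
    """Test all 8 bytes at once using plain 64-bit integer arithmetic."""
    if len(chunk) != 8:
        raise ValueError("this sketch handles exactly 8 bytes")
    word = int.from_bytes(chunk, "little")
    # XOR with the target broadcast to every lane: matching bytes become 0x00.
    x = word ^ (LOW * target)
    # Zero-byte test: the result is nonzero iff some byte of x is zero.
    return (((x - LOW) & MASK64) & (~x & MASK64) & HIGH) != 0
```

Even this tiny case only works because the input has a fixed width; real text has variable-length tokens, escapes, and multi-byte encodings, which is where the difficulty comes from.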

MangoToupe · 2025-07-05T13:29:11 1751722151

> where SIMD should be applied

That seems to disqualify your example

AlotOfReading · 2025-07-05T14:11:01 1751724661

There can be huge advantages to text processing in SIMD if you figure it out. For example, simdjson: https://github.com/simdjson/simdjson

ashvardanian · 2025-07-05T21:44:39 1751751879

Maybe even better examples would be <https://github.com/intel/hyperscan>, <https://github.com/simdutf/simdutf>, and my own <https://github.com/ashvardanian/StringZilla> :)

MangoToupe · 2025-07-05T11:36:53 1751715413

> What's the canonical or most basic example of where SIMD should be applied, but isn't because it's too tricky to do so?

There is none. That's a contradiction in terms. SIMD either fits the shape or it doesn't.

krapht · 2025-07-05T12:35:33 1751718933

Variable-length parallelism is hard. You can go to highload.fun (a SIMD competition site) for problems that can only be parallelized with significant effort.

Try problem #1, parsing numbers.
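
As a rough illustration of why number parsing is awkward to parallelize, here is a sketch of a well-known trick for the easy, fixed-width case: converting exactly eight ASCII digits with three multiply-and-shift steps instead of an eight-iteration loop. This is not a solution to the highload.fun problem. The function name is hypothetical, the input is assumed to be exactly eight valid digits, and a Python integer masked to 64 bits stands in for a machine register.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # emulate 64-bit wraparound on multiply


def parse8_swar(chunk: bytes) -> int:
    """Parse exactly 8 ASCII digits (assumed valid) without a per-digit loop."""
    if len(chunk) != 8:
        raise ValueError("this sketch handles exactly 8 digits")
    # Little-endian load: the first (most significant) digit is the lowest byte.
    word = int.from_bytes(chunk, "little")
    # Strip the ASCII '0' bits, then combine adjacent digits: 10*d0 + d1, ...
    word = (((word & 0x0F0F0F0F0F0F0F0F) * (10 * 2**8 + 1)) & MASK64) >> 8
    # Combine adjacent 2-digit values into 4-digit values: 100*p0 + p1, ...
    word = (((word & 0x00FF00FF00FF00FF) * (100 * 2**16 + 1)) & MASK64) >> 16
    # Combine the two 4-digit values into the final 8-digit number.
    word = (((word & 0x0000FFFF0000FFFF) * (10000 * 2**32 + 1)) & MASK64) >> 32
    return word


print(parse8_swar(b"12345678"))  # 12345678
```

The fixed width is what makes this work. Once numbers have varying lengths and are separated by delimiters, the digit positions are no longer known in advance, and finding them is itself a data-dependent step. That is the variable-length problem described above.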