Hacker News | vlejd's comments

This is awesome. I have been trying to track similar things, mainly to identify brands of clothes that actually last long enough to be worth it.

How do you keep the discipline to do this?

I also assume it is very nice to have such a list if something gets broken/torn/lost and you need to buy a replacement right away.

How much testing did it take to arrive at such a stable set?


In 2021 I bought 12 different types of printing blanks (iirc it cost $150 for everything) in various fits, GSMs and fabrics to test my preferences, and then stuck with the t-shirt I preferred the most. I donated/wore out the rest and only purchased the one I liked moving forward. I've been lucky with other purchases, but those were also guided by gear recommendations from niche subreddits and picking whatever was on discount among them. And because I buy them on discount, it's usually not THAT big of a hit to the pocket after reselling them if I don't like them for whatever reason (other than durability, which there's no getting around; seams, stitching, materials etc. are a good tell, however).

There's always some randomness. For example, I once had to rush to H&M for an emergency blazer when I showed up to a university event in a t-shirt and the Queen of Denmark arrived. I got a bit drunk that evening, but I remember walking home without it.


Interestingly enough, we found that cuBLAS is not that well optimized for some less common consumer GPUs, specifically the 3090. We saw that it didn't really achieve its full potential for a lot of different matrix shapes, probably because of poor tuning. Interestingly, our kernel does not have any tunable parameters, and it was still able to outperform cuBLAS even in settings where it has no right to do so.

Regarding patterns, we tested mainly random matrices and ones created by Wanda pruning. 2:4 sparsity (a commonly used structure) will give the same results as a random matrix (probably even better). Interestingly enough, block sparsity could be very close to a worst-case scenario for our format, because it promotes disproportionately long sequences of zeroes.
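
For anyone curious what "long sequences of zeroes" means concretely, here is a small numpy sketch (my own illustration, not our kernel or storage format, with a made-up 1024x1024 matrix and 64x64 tiles) that builds random, 2:4 and block-sparse masks at roughly the same ~50% density and measures the longest zero run per row:

    import numpy as np

    rng = np.random.default_rng(0)
    rows, cols = 1024, 1024

    # Random unstructured mask at ~50% density.
    random_mask = rng.random((rows, cols)) < 0.5

    # 2:4 structured mask: keep the 2 largest-magnitude values in every group of 4.
    vals = np.abs(rng.standard_normal((rows, cols))).reshape(rows, cols // 4, 4)
    keep = np.argsort(vals, axis=-1)[..., 2:]
    mask_24 = np.zeros(vals.shape, dtype=bool)
    np.put_along_axis(mask_24, keep, True, axis=-1)
    mask_24 = mask_24.reshape(rows, cols)

    # Block-sparse mask: keep or drop whole 64x64 tiles, ~50% of tiles kept.
    tile = 64
    tiles = rng.random((rows // tile, cols // tile)) < 0.5
    block_mask = np.kron(tiles, np.ones((tile, tile))).astype(bool)

    def longest_zero_run(row):
        # Longest run of consecutive zeros (False) in one row of the mask.
        idx = np.flatnonzero(np.concatenate(([True], row, [True])))
        return int(np.diff(idx).max()) - 1

    for name, m in [("random", random_mask), ("2:4", mask_24), ("block 64x64", block_mask)]:
        runs = [longest_zero_run(m[i]) for i in range(rows)]
        print(f"{name:12s} density={m.mean():.2f}  longest zero run per row ~ {np.mean(runs):.0f}")

By construction 2:4 can never produce more than 4 consecutive zeros, while dropping whole tiles routinely leaves runs of a hundred-plus zeros in this toy setup, which is exactly the kind of pattern described above as close to worst case.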

Regarding other use cases, we are looking into it, but the most common ones we found are usually at much lower density (<1% non-zeros). If you know about some other use case in the 30-90% sparsity range, let us know.



I think the lack of efficient GPU kernels was the main problem. It is much, much easier to get a real speedup and memory reduction from quantizing fp16 to fp8 than from 50% sparsity. For sparsity you need structure (which makes your model worse) and special hardware support.
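
To put rough numbers on that (purely illustrative assumptions: a 4096x4096 weight matrix and a plain CSR layout with int32 indices, not any particular library's format):

    # Back-of-the-envelope memory comparison: dense fp16, dense fp8,
    # and 50% unstructured sparsity stored as CSR with fp16 values.
    rows = cols = 4096
    nnz = rows * cols // 2                          # 50% of entries are non-zero

    dense_fp16 = rows * cols * 2                    # 2 bytes per value
    dense_fp8 = rows * cols * 1                     # 1 byte per value
    csr_fp16 = nnz * 2 + nnz * 4 + (rows + 1) * 4   # values + int32 column indices + row pointers

    for name, nbytes in [("dense fp16", dense_fp16),
                         ("dense fp8", dense_fp8),
                         ("50% sparse CSR fp16", csr_fp16)]:
        print(f"{name:22s} {nbytes / 2**20:6.1f} MiB")

Once you pay for the indices, naive 50% unstructured sparsity in CSR takes more memory than the dense fp16 matrix (48 vs 32 MiB here), while fp8 halves it for free; that is why sparsity needs structure like 2:4 or a cheaper index encoding before it starts paying off.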


The internet is dead indeed. Amazing idea! I will use it to test my posts.


That looks very nice. I write technical blog posts from time to time, and this will come in handy. How does the code execution work, though? Are you running some service somewhere that takes care of it, or is it in the browser?


It runs in the browser with Pyodide.

