zepearl's comments

Using X (at least in this context?) is weird.


I downloaded Ollama ( https://github.com/ollama/ollama/releases ) and experimented with a few Qwen models ( https://huggingface.co/Qwen/collections ).

My performance when using an RTX 5070 (12GiB VRAM), a Ryzen 7 9700X (8 cores) and 32GiB of DDR5-6000 (2 sticks):

  - "qwen2.5:7b": ~128 tokens/second (this model fits 100% in the VRAM).
  - "qwen2.5:32b": ~4.6 tokens/second.
  - "qwen3:30b-a3b": ~42 tokens/second (this is a MoE model with multiple specialized "brains"; it uses all 12GiB VRAM + 9GiB system RAM, but the GPU usage during tests is only ~25%).
  - "qwen3.5:35b-a3b": ~17 tokens/second, but it's highly unstable and crashes -> currently not usable for me.

So currently my sweet spot is "qwen3:30b-a3b" - even though the model doesn't completely fit on the GPU, it's still fast enough. "qwen3.5" has been disappointing so far, but maybe things will change in the future (maybe Ollama needs some special optimizations for the 3.5-series?).
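
For anyone who wants to reproduce numbers like the ones above, here is a minimal sketch of one way to measure tokens/second through Ollama's local HTTP API. It is not necessarily how the figures above were obtained. It assumes Ollama is listening on its default local port (11434) and relies on the `eval_count` and `eval_duration` (nanoseconds) fields of the final `/api/generate` response; the helper names `tokens_per_second` and `benchmark` and the test prompt are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(response: dict) -> float:
    """Generation speed computed from a final /api/generate response."""
    # eval_count = number of generated tokens,
    # eval_duration = time spent generating them, in nanoseconds.
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def benchmark(model: str, prompt: str) -> float:
    """Send one non-streaming request and return its tokens/second."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode()
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return tokens_per_second(json.load(resp))


if __name__ == "__main__":
    # Hypothetical prompt; any prompt producing a few hundred tokens will do.
    speed = benchmark("qwen3:30b-a3b", "Explain what a MoE model is.")
    print(f"{speed:.1f} tokens/second")
```

A single run is noisy (the first request also includes model loading time, which is reported separately and not counted here), so averaging a few runs per model is probably a good idea.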

I would therefore deduce that the most important thing is the amount of VRAM, and that performance would be similar even when using an older GPU (e.g. an RTX 3060, which also has 12GiB of VRAM)?

Performance without a GPU, tested on a Ryzen 9 5950X (16 cores) with 128GiB of DDR4-3200:

  - "qwen2.5:7b": ~9 tokens/second
  - "qwen3:32b": ~2 tokens/second
  - "qwen3:30b-a3b": ~16 tokens/second
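
As a rough sanity check on CPU-only numbers like these, a common rule of thumb is that token generation is limited by memory bandwidth: each generated token has to read (roughly) all active weights once, so tokens/second cannot exceed bandwidth divided by the bytes read per token. A minimal sketch follows. The per-model sizes in it are my own rough guesses (assuming ~4-bit quantized weights, and only the ~3B active parameters being read for the MoE model), not measured values, and the function names are made up for illustration.

```python
def peak_bandwidth_gb_s(mt_per_s: int, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak RAM bandwidth in GB/s (64-bit bus = 8 bytes per transfer)."""
    return mt_per_s * bus_bytes * channels / 1000


def max_tokens_per_second(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on generation speed if every token reads the active weights once."""
    return bandwidth_gb_s / gb_read_per_token


# ASSUMPTION: rough GB of weights read per token, guessed for ~4-bit quantization.
# For the MoE model only the active experts (~3B parameters) are counted.
ASSUMED_GB_PER_TOKEN = {
    "qwen2.5:7b": 4.7,
    "qwen3:32b": 20.0,
    "qwen3:30b-a3b": 2.0,
}

if __name__ == "__main__":
    # DDR4-3200 in dual-channel mode, as on the Ryzen 9 5950X above.
    ddr4 = peak_bandwidth_gb_s(3200, channels=2)
    for model, gb in ASSUMED_GB_PER_TOKEN.items():
        bound = max_tokens_per_second(ddr4, gb)
        print(f"{model}: at most ~{bound:.1f} tokens/second")
```

If the guessed sizes are in the right ballpark, this would also explain why the MoE model is so much faster than a dense model of similar total size: far fewer bytes have to be read per token.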

What about before December 2022? I cannot imagine that only a handful were imported.


> The main reason for this is lack of competition for DB in Germany

That cannot be it - there is no competition in Switzerland either, but things run pretty smoothly. In the case of Germany I'd rather say: "lack of oversight, controls, 'konsequent zu sein'" (i.e. consistently following through). I think that nobody, at any level of Germany's DB, gives a *hit about its problems.


Things work well in Switzerland because the Swiss spend a lot more money on rail. That's unfortunately the secret.


I interpreted your post like what "krupan" posted in the separate sub-thread ("This is a much tighter circle than any of us should be comfortable with"), but maybe others interpreted it differently (the words of your post are quite generic...). Cheers :o)


To fix stuttering I had to disable compositing in the window manager (Xfce on Linux Mint, proprietary NVIDIA driver, AMD CPU).


And there's also PulseAudio: for some games I had to set the launch options to PULSE_LATENCY_MSEC=90 %command%, other games only run in Lutris, and others can't even be minimized without messing up the screen entirely.


Fully agree - such a great movie: absolutely flowing, entertaining, fantastic characters, nice colors. That, together with "Three Days of the Condor", is what I immediately thought of when I heard the news, but so far only one of the newspapers/sites I read has mentioned both of them... weird :o| Am I just getting too old (or are the articles being written by people who are too junior)?


Same here - fail2ban then adds the IP to my nftables firewall.


> I would presume Google still has all this data. ...

Maybe - I guess that they must have served that "cached" content from DB records that stored it in full (URL X has contents Y, so basically a "mirror" of the pages that they indexed). Not having to store that "mirror" anymore (only the search index) might save quite a lot of storage space, plus the I/O and CPU needed to decompress it, as users won't be requesting it anymore. All in all that might save quite a lot of infrastructure costs $$$.

> Could this be an advantage that Google can use to train their models on but others won't have access?

Maybe (if they decided to just get rid of the I/O related to the user requests), but on the other hand I don't know if previously any "Google-consumer" was ever able to perform mass-downloads of Google's "cached" data - could that be done without being banned by Google's webpage (or API)?


I fully agree with you & upstream - on the other hand there are specific (~local) shops which I use often and I'm 99% sure that I would not need the Credit Card (CC) protection with them, so having a CC-alternative for those cases is nice.

