Hacker News | batch12's comments

Could they have added some swap?


No. I just updated the parent comment: I added -c 4096 to cut down the context size, and now the model loads.

I'm able to get 6-7 tokens/sec generation with 10-11 tokens/sec prompt processing with their model. Seems quite good, actually: much more useful than llama 3.2:3b, which has comparable performance on this Pi.
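
A rough sketch of why cutting the context size can make a model fit in memory: the KV cache grows linearly with the context length. The thread does not name the model, so the layer count, KV head count, and head dimension below are made-up placeholders, and the function assumes an unquantized 16-bit cache.

```python
def kv_cache_bytes(n_ctx, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Estimate KV cache size for a transformer with the given shape."""
    # One key vector and one value vector per layer, per KV head, per token.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * n_ctx

# Hypothetical model shape, NOT the model discussed in the thread.
HYPOTHETICAL = dict(n_layers=32, n_kv_heads=8, head_dim=128)

for n_ctx in (4096, 32768):
    gib = kv_cache_bytes(n_ctx, **HYPOTHETICAL) / 2**30
    print(f"context {n_ctx:>6}: {gib:.1f} GiB of KV cache")
```

With these placeholder dimensions the estimate is 0.5 GiB at a 4096-token context and 4 GiB at 32768, which is the kind of gap that matters on a board with a few gigabytes of RAM.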


> I added -c 4096 to cut down the context size

That’s a pretty big caveat. In my experience, using a small context size is only okay for very short answers and questions. The output looks coherent until you try to use it for anything, then it turns into the classic LLM babble that looks like words are being put into a coherent order but the sum total of the output is just rambling.


Thanks for posting the performance numbers from your own validation. 6-7 tokens/sec is quite remarkable for the hardware.


Some more benchmarking, and with larger outputs (like writing an entire relatively complex TODO list app) it seems to go down to 4-6 tokens/s. Still impressive.


Decided to run an actual llama-bench run and let it go for the hour or two it needs. I'm posting my full results here (https://github.com/geerlingguy/ai-benchmarks/issues/47), but I got 8-10 t/s pp and 7.99 t/s tg128. This is on a Pi 5 with no overclocking; an overclock could probably increase the numbers slightly.

You need a fan/heatsink to get that speed, of course; it's maxing out the CPU the entire time.


For some reason I only get 3-4 tokens/sec. I checked, and the CPU is not throttling or anything.


Sounds like something that could be weaponized. Order a bunch of 'gifts' to be shipped to a target via UPS/FedEx or whichever vendor helpfully pays the tariffs for you. Then your victim has to fight collections or pay up.


It's a cool idea, just beware. Saw some dead kids and some NSFW among the otherwise interesting content.


Really sorry you had to experience that! I added an NSFW flag. I'm just pulling content randomly by date and didn't know the Archive had that kind of graphic content :(


Lighten up. People spend their time doing lots of things they enjoy regardless of the value others place on their efforts. Instead of projecting embarrassment, go save the world if that makes you happy.


Browser user agents have a history of being lies from the earliest days of usage. Official browsers lied about what they were, and still do.


Can you give a single example of a browser with a user agent that lies about its real origin?

The best I can come up with is the Tor Browser, which will reduce the number of bits of information it will return, but I don't consider that to be misleading. It's a custom build of Firefox that discloses it is Firefox, and otherwise behaves exactly as I would expect Firefox to behave.


Lies in user agent strings were for bypassing bugs, poor workarounds, and assumptions that became wrong; they are nothing like what we are talking about.


A server returning HTML for Chrome but not cURL seems like a bug, no?

This is why there are so many libraries to make requests that look like they came from a browser, to work around buggy servers or server operators with wrong assumptions.


> A server returning HTML for Chrome but not cURL seems like a bug, no?

tell me you've never heard of https://wttr.in/ without telling me. :P

It would absolutely be a bug iff this site returned HTML to curl.

> This is why there are so many libraries to make requests that look like they came from browser, to work around buggy servers or server operators with wrong assumptions.

This is a shallow take. The best counterexample is how Googlebot has no problem identifying itself both in and out of the user agent. Do note that user agent packing is distinctly different from a fake user agent selected randomly from the list of the most common ones.

The existence of many libraries with the intent to help conceal the truth about a request doesn't feel like proof that's what everyone should be doing. It feels more like proof that most people only want to serve traffic to browsers and real users. And it's the bots and scripts that are the fuckups.
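
A minimal sketch of the wttr.in-style behavior mentioned above, where a command-line client gets plain text and everything else gets HTML. This is not how wttr.in is actually implemented; the client list and the function name `pick_content_type` are hypothetical.

```python
# Hypothetical list of command-line clients; not taken from any real service.
PLAIN_TEXT_CLIENTS = ("curl", "wget", "httpie")

def pick_content_type(user_agent: str) -> str:
    """Return the Content-Type a UA-sniffing server might choose."""
    # The leading product token is whatever comes before the first "/".
    product = user_agent.split("/", 1)[0].strip().lower()
    if product in PLAIN_TEXT_CLIENTS:
        return "text/plain"
    # Anything unrecognized is treated as a browser.
    return "text/html"

print(pick_content_type("curl/8.4.0"))
print(pick_content_type("Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/121.0"))
```

A script that sends a browser-looking string to this server gets the HTML branch, which is exactly the situation the two sides of this thread are arguing about.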


Googlebot has no problem identifying itself because Google knows that you want it to index your site if you want visitors. It doesn't identify itself to give you the option to block it. It identifies itself so you don't.


I care much less about being indexed by Google than you might think.

Googlebot doesn't get blocked from my server primarily because it's a *very* well behaved bot. It sends a lot of requests, but it's very kind, and has never acted in a way that could overload my server. It respects robots.txt, and identifies itself multiple times.

Googlebot doesn't get blocked because it's a well behaved bot that eagerly follows the rules. I wouldn't underestimate how far that goes towards the reason it doesn't get blocked. Much more than the power gained by being Google Search.
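
For what "respects robots.txt" means in practice, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` name, and the example.com URLs are all made up for illustration; the rules are parsed from a string so nothing touches the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, parsed offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text instead of fetching a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = make_parser(ROBOTS_TXT)
bot = "ExampleBot"  # hypothetical crawler name

# A well behaved bot checks each URL before requesting it...
print(parser.can_fetch(bot, "https://example.com/public/page"))
print(parser.can_fetch(bot, "https://example.com/private/data"))
# ...and waits between requests if the site asks for a delay.
print(parser.crawl_delay(bot))
```

Nothing enforces these rules on the client side, which is why a bot that follows them voluntarily earns the kind of trust described above.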


Yes, the client wanted the server to deliver content it had intended for a different client, regardless of what the service operator wanted, so it lied using its user agent. Exact same thing we are talking about. The difference is that people don't want companies to profit off of their content. That's fair. In this case, they should maybe consider some form of real authentication, or if the bot is abusive, some kind of rate limiting control.
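
One common shape for "some kind of rate limiting control" is a token bucket per client. The sketch below is an assumption about what such a control could look like, not something anyone in the thread described; the class name and parameters are hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a new client can burst
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0  # spend one token for this request
            return True
        return False

# One bucket per client key (for example an IP address or an API token).
buckets: dict[str, TokenBucket] = {}

def allow_request(client: str, now: float) -> bool:
    bucket = buckets.setdefault(client, TokenBucket(rate=1.0, capacity=2.0, now=now))
    return bucket.allow(now)
```

Keying the bucket on something other than the user agent is the point: the limit applies no matter what the client claims to be.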


Add "assumptions that became wrong" to "intended" and the perspective radically changes, to the point that omitting this part from my comment changes everything.

I would even add:

> the client wanted the server to deliver content it had intended for a different client

In most cases, the webmaster intended their work to look good, not really to send different content to different clients. That latter part is a technical means, a workaround. The intent of bringing the ok version to the end user was respected… even better with the user agent lies!

> The difference is that people don't want companies to profit off of their content.

Indeed¹, and also they don't want terrible bots to bring down their servers.

1: well, my open source work explicitly allows people to profit off of it - as long as the license is respected (attribution, copyleft, etc)


> Yes, the client wanted the server to deliver content it had intended for a different client, regardless of what the service operator wanted, so it lied using its user agent.

I would actually argue it's not nearly the same type of misconfiguration. Scripts that have never been a browser omit their real identity in order to evade bot detection. Browsers pack their UA with so much legacy data because of misconfigured servers: the server owner wants to send data to users and their browsers, but through incompetence they've made a mistake. Browsers adapted by including extra strings in the UA to account for the expectations of incorrectly configured servers. Extra strings are the critical part; Googlebot's UA is an example of this being done correctly.
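
To make "extra strings" concrete, the sketch below pulls the `name/version` product tokens out of two user agent strings. The Chrome string is an illustrative example of the usual desktop format, and the Googlebot string follows the form Google documents for its crawler; the regex and function name are my own simplification, not a full parser for the User-Agent grammar.

```python
import re

# Matches tokens such as "Chrome/120.0.0.0"; the version must start with a
# digit, which keeps URLs like "www.google.com/bot.html" from matching.
PRODUCT_TOKEN = re.compile(r"([A-Za-z][A-Za-z0-9_.-]*)/(\d[\w.]*)")

CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def product_tokens(user_agent: str) -> dict[str, str]:
    """Map each product name in the UA to its version, in order of appearance."""
    return dict(PRODUCT_TOKEN.findall(user_agent))

print(product_tokens(CHROME_UA))
print(product_tokens(GOOGLEBOT_UA))
```

Both strings carry the legacy `Mozilla/5.0` token, but each still names its real product (`Chrome`, `Googlebot`) alongside it. A spoofed UA differs in that the real product never appears at all, and the string alone cannot prove either case.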


I think they're more caused by rushed deadlines, poor practices, and/or bad QA. Some folks just don't get it either and training doesn't help.


I don't think that it's human work this would target, but instead work in shared human spaces.


We already have robots that work in shared human spaces, and our experience in that domain has shown that you need to put a lot of thought into how to do this safely, and specifically into how to prevent the robot from accidentally harming the humans. Ask anyone with a robotic CNC machine how they would feel about running the machine without its protective housing, for example. I expect they will start to throw up just a little bit. Flexibility is exactly the opposite of what you need until we have a CV and controller combination that can really master its environment. I could foresee a lot of terrible accidents if you brought a humanoid robot into a domestic environment without a lot of care and preparation, for example.


hence horseless CARriage


Is it morally wrong to fast forward ads on your TV or mute the volume?


This is a simple word search game I tossed together a while ago that some friends and family are still playing. Select words by picking the first and last letter of the word.

You can clear all words without swapping any letters.

