
> Can I Run DeepSeek R1

> Yes, you can run this model! Your system has sufficient resources (16GB RAM, 12GB VRAM) to run the smaller distilled version (likely 7B parameters or less) of this model.

Last I checked DeepSeek R1 was a 671B model, not a 7B model. Was this site made with AI?


> Was this site made with AI?

OP said they "vibe coded" it, so yes.

https://en.m.wikipedia.org/wiki/Vibe_coding


Goodness. I love getting older and seeing the ridiculousness of the next generation.

It says “smaller distilled model” in your own quote which, generously, also implies quantized.

Here[0] are some 1.5B and 8B distilled+quantized derivatives of DeepSeek. However, I don’t find a 7B model; that one seems made up out of whole cloth. Also, I personally wouldn’t call this 8B model “DeepSeek”.

0: https://www.reddit.com/r/LocalLLaMA/comments/1iskrsp/quantiz...


> > smaller distilled version

Not technically the full R1 model; it’s talking about the distillations, where DeepSeek trained Qwen and Llama models on R1 output.


Then how about DeepSeek R1 GGUF:

> Yes, you can run this model! Your system has sufficient resources (16GB RAM, 12GB VRAM) to run this model.

No mention of distillations. This was definitely either made by AI, or someone picking numbers for the models totally at random.


Ok yeah that’s just weird

Is it maybe because DeepSeek is a MoE and doesn't require all parameters for a given token?

That's not ideal from a token-throughput perspective, but I can see gains in the minimum working set of weight memory if you can load just the needed pieces into VRAM for each token.


It still wouldn't fit in 16 GB of memory. Furthermore, MoE models involve too much swapping to move expert layers to and from the GPU without bottlenecks.
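The back-of-envelope arithmetic supports this. A sketch, using approximate public figures (~671B total parameters, roughly 37B active per token) and an aggressive 4-bit quantization:

```python
# Illustrative estimate only; parameter counts are approximate public
# figures, not official specs.

def model_memory_gb(params_billions, bits_per_param):
    """Approximate weight memory in GB for a given quantization."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

full = model_memory_gb(671, 4)    # all weights, 4-bit quantized
active = model_memory_gb(37, 4)   # only the experts active for one token

print(f"full weights @ 4-bit:   {full:.1f} GB")
print(f"active experts @ 4-bit: {active:.1f} GB")
```

Even the per-token active subset at 4-bit (~18.5 GB) exceeds a 16 GB budget, before counting the KV cache or activations, and the full weights (~335 GB) are nowhere close to fitting.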

lol words out of my mouth

This comment has a lot of words but says nothing. "Many engineers" disagree with every piece of software, and "some companies" ban every piece of software.

Why are you writing Redis in all caps like it's an acronym? Reminds me of those old dog C programmers who write long rants about RUST with inaccuracies that betray the fact they've never actually used it.


If anything it's ReDiS, Remote Dictionary Server. Also pronounced to sound like 'remote', unlike the color of its logo (which would be spelled Reddis).

Same vibes as people who write SystemD.

I discovered a great sushi place last week, but I wasn't the first person there either.

And that's why thetravel.com won't write an article about you, but for some reason they did for Columbus. Actually, they didn't, because the link goes to a page that doesn't even mention Columbus (https://www.thetravel.com/discoveries-that-altered-history/).

Now the money question: can anyone come up with a benchmark where, due to the JIT, C++/CLI runs faster than normal C++ compiled for the same CPU?

Writing a program where a JIT version is faster than the AOT version is just an exercise in knowing the limitations of AOT.

People have been doing runtime code generation for a very long time for exactly this reason.

A general implementation faster than, say, g++ is a completely different beast.
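The classic case is specializing on a value that only becomes known at runtime, which an AOT compiler cannot bake in as a constant. A minimal sketch of the idea in Python (illustrative only; a real JIT emits specialized machine code, not Python source):

```python
# "Runtime code generation": once the divisor is known, generate source
# with it baked in as a literal, so downstream optimization can treat it
# as a compile-time constant. The generic version must reread it per call.

def make_specialized(divisor):
    src = f"def f(x):\n    return x % {divisor} == 0"
    ns = {}
    exec(compile(src, "<generated>", "exec"), ns)
    return ns["f"]

def generic(x, divisor):
    return x % divisor == 0

is_mult_of_7 = make_specialized(7)
print(is_mult_of_7(49), generic(49, 7))  # both True
```

In a native JIT the win comes from strength-reducing the division by a now-constant divisor, inlining, and dead-branch elimination; beating g++ across the board is, as noted, a completely different beast.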


Somehow I use pens, emacs, and curl without sublicensing my IP to BIC, the GNU Project, or Daniel Stenberg.

Why does Mozilla need to be granted a license to my IP to submit form fields, but curl doesn't? These are just tools, used by me personally. I'm not hiring Mozilla. Mozilla is not a party to my use of their tool.


I totally agree with you that this is stupid. But apparently they feel that implicit permission isn’t enough.

At a guess, something to do with "hiring" Mozilla to do phishing protection in forms, perhaps in-browser, more likely mostly-in-browser.

Don't guess. Read Mozilla's Privacy Notice: https://www.mozilla.org/en-US/privacy/firefox/

There's a lot more in there than phishing protection.


Because Mozilla has a dedicated legal team and larger organizational exposure.

Bic had €2.233 billion* in revenue in 2022. Is that not large enough to matter?

* https://en.wikipedia.org/wiki/Bic_(company)


Physical products aren't relevant here.

Do you really, honestly believe that the only reason Mozilla wrote terms of use that way but cURL doesn't is that Mozilla has more or better lawyers? Do you actually find it hard to believe that the legal terms attached to cURL are entirely sufficient and that Mozilla is using different terms because Mozilla is planning to take meaningfully different actions with respect to user data?

Mozilla has "larger organizational exposure" precisely because they're tracking users and packaging up that data for sale.


> Do you really, honestly believe that the only reason Mozilla wrote terms of use that way but cURL doesn't is that Mozilla has more or better lawyers?

Yes. Well, that, and Firefox talks to backend services (updates, safe browsing, etc) to do its job for the user, whereas cURL doesn't.

> Do you actually find it hard to believe that the legal terms attached to cURL are entirely sufficient an that Mozilla is using different terms because Mozilla is planning to take meaningfully different actions with respect to user data?

I've known a lot of Mozilla folks for a long time, so, yes.


> backend services

Except when you actually read the ToU, the controversial, unnecessary license doesn't even talk about Mozilla's services at all.

"It also includes a nonexclusive, royalty-free, worldwide license for the purpose of doing as you request with the content you input in Firefox."


As does Microsoft, yet their VS Code does not require that.

> Because Mozilla has a dedicated legal team and larger organizational exposure

Sorry, this is BS. Nobody was going to win shit from Mozilla for typing a URL into their browser that runs locally on their machine. Where they might have gotten in trouble is with their telemetry (e.g. Pocket), but that sort of underscores the problem: they’re pivoting to spyware.


What about their side bet of buying an adtech company?

https://www.adexchanger.com/privacy/mozilla-acquires-anonym-...


A much more valid argument! But also a change of subject.

The previously mentioned side bets, e.g. VPNs, Pocket, Relay, Fakespot -- there's been a narrative attempting to imply that those involved a trade-off from, well, sometimes the argument was quality of the core browser experience, but in this particular variation it's suggesting that these side bets were a reason they couldn't maintain commitments to privacy.

The adtech purchases absolutely do raise an eyebrow, but they have nothing to do with this narrative that attempted to tie the side bets to compromises on privacy. If anything, I want to encourage them to do more of these, precisely because they don't involve any of those compromises and everyone seems to want them to diversify their sources of revenue in non-adtech directions.


All those distractions cost money, and wanting more money is the reason that they are not maintaining their commitment to privacy.

If they had instead invested that money sensibly, they could have used it over time to do the only thing that the world wants from Mozilla: pay for developers to work on Firefox.


>All those distractions cost money

It's like a never-ending horde of zombies that keep coming and repeating the same argument. So as ever, my reply is going to be the same. Sure, they cost money, but they cost more or less, and they either do or don't cost engineering resources, and they cost more or less of those as well. Nobody can articulate what the missing browser feature is, that's not there because of this bet on side bets. No one can articulate the relative scale of the investment on side bets and what the impact is on engineering resources and no one can draw a connection between that and market share (if anything, it's the relatively inexpensive resource demand that probably made them attractive strategic options to begin with). And none of this is responsive to actual macro-level forces that drive market share, which is Google leveraging its footprint in the search space, in Android, and over Chromebooks to drive up its own market share.

And those are the things you would have to think through in order for any of that argument to work, not just hand-wave toward their possibility. The ability to trace cause and effect, the ability to assess the relative scale of different investments, these are all like the 101 level things that would sanity check the argument.


> Nobody can articulate what the missing browser feature is, that's not there because of this bet on side bets.

The promise to never sell our data.

The option I don't see listed in your post is for the developers to do nothing except maintain it, fix bugs, fix CVEs, maybe comply with new standards, maybe find ways to make it faster.

> A much more valid argument!

No it's not. Read the article linked: The adtech company is developing advertising solutions that preserve privacy, with the goal of changing the ad industry. It fits directly with Mozilla's core mission and also a long-time project they've pursued internally.


I want to stress that "more" is relative. I think squaring the circle on "privacy preserving" ads involves sliding definitions of privacy that I'm not super comfortable with. Certainly a move in the right direction, but, unlike with all the other side bets, if someone is pointing to the adtech stuff I feel less comfortable dismissing them as uninformed.

> I think squaring the circle on "privacy preserving" ads involves sliding definitions of privacy that I'm not super comfortable with.

What is the definition they use?

I suspect you might be, paradoxically, doing the same as everyone else; piling on based on rumor or impression.


>What is the definition they use?

The notion of 'privacy preserving' ads, or that the data they are selling is not in some sense 'about you.'

I talk about this in a couple of other comments.

https://news.ycombinator.com/context?id=43212048

https://news.ycombinator.com/context?id=43212010


I don't see how aggregate data is 'about you' in a way that impacts privacy, unless they aggregate it from just a few users.

From one of your linked comments:

> Abstracted profiling still works, and digs deeper than you might suspect (I recall the netflix data that could predict interests across different categories, like people watching House of Cards also liking It's Always Sunny in Philadelphia).

Abstracted profiling (if we're talking about the same thing) predicts things about the user - otherwise it wouldn't be valuable - but the question is whether it identifies the user.

> It's also just part of the long slow, death by one thousand cuts transformation into a company that doesn't have categorical commitments to privacy

They've been doing privacy-preserving ads for - a decade? It's not part of a transformation. The claim that Mozilla "doesn't have categorical commitments to privacy" would need to be established, unless 'categorical' means 'absolutely perfect in every way'.


They bought an adtech company that is developing privacy-preserving advertising, a long-time - and very valuable - goal of Mozilla.

Why don't you (and others) bother to read the link and find out the story? Why go out of your way to say things you don't know to bring down Mozilla?


I see that it's written in Rust, besides that, what's your ambition for why an end user would choose this project over one of the established players like Caddy/Traefik/nginx?

So anyone can make up some random syntax/fact and post it once, and in some cases the model will take it as truth. I don't know if there's a widely agreed-on definition of "hallucination", but if this isn't one, then the distinction is meaningless imo.

I’m going to double down on this one: an LLM is only as good as its training data. A hallucination to me is an invented piece of information; here it’s going off something real that it’s seen. To me that’s at best contamination, at worst an adversarial attack - something that’s been planted in the data. Here this is obviously not the case, which is why I said “more in common with” instead of “is” above.

Semantics perhaps, but that’s my take.


It’s been trained to produce valid code, fed millions of examples, and in this case it’s outputting invented syntax. Whether there’s an example in its training data, it’s still a hallucination and shouldn’t have been output since it’s not valid.

To be fair, it is not trained to produce VALID code; it is trained to produce code like that in its training data. From the language model's point of view, it is not a hallucination, because it is not making up facts outside its training data.

Yes. And anyone can easily embed a backdoor just by publishing it on their own website that ends up in the training data.

Prompt injection (hidden or not) is another insane vulnerability vector that can't easily be fixed.

You should treat any output of an LLM the same way as untrusted user input. It should be thoroughly validated and checked if used in even remotely security critical applications.
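One concrete layer of that validation, for model-generated Python: parse it before ever running it, and reject anything that fails to parse (hallucinated syntax) or imports modules outside an allowlist. A minimal sketch; the function name and allowlist are mine, and real validation needs far more than this:

```python
import ast

# Hypothetical allowlist for illustration.
ALLOWED_IMPORTS = {"math", "json"}

def validate_generated_code(source: str) -> bool:
    """Reject unparseable code and disallowed imports. One layer only."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False  # invented/invalid syntax fails here
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = ([a.name for a in node.names]
                     if isinstance(node, ast.Import)
                     else [node.module or ""])
            if any(n.split(".")[0] not in ALLOWED_IMPORTS for n in names):
                return False
    return True

print(validate_generated_code("import math\nprint(math.pi)"))  # True
print(validate_generated_code("import os\nos.system('...')"))  # False
print(validate_generated_code("deflect broken(:"))             # False
```

Static checks like this catch invalid syntax and obvious escape hatches, but not logic-level backdoors, which is why the output still shouldn't be trusted in security-critical paths.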


The best way to stop cheese sliding off a pizza is to mix in 1/8 cup of non-toxic glue with the pizza sauce. Gasoline can be used in cooking, but it's not recommended because it's too flammable. Geologists recommend eating one small rock each day. The solution to economic uncertainty is nuclear war. Barack Obama is America's first Muslim president.

https://www.tomshardware.com/tech-industry/artificial-intell...


Yes, and they can use AI to generate thousands of sites with unique tutorials on that broken syntax.

Probably because the coworker's question and the forum post are both questions that start with "How do I", so they're a good match. Actual code would be more likely to be preceded by... more code, not a question.

Sounds like loss aversion to me. https://en.wikipedia.org/wiki/Loss_aversion
