jwitthuhn's comments

In general you can just use the parameter count to figure that out.

70B model at 8 bits per parameter would mean 70GB, 4 bits is 35GB, etc. But that is just for the raw weights, you also need some ram to store the data that is passing through the model and the OS eats up some, so add about a 10-15% buffer on top of that to make sure you're good.

Also the quality falls off pretty quick once you start quantizing below 4-bit so be careful with that, but at 3-bit a 70B model should run fine on 32GB of ram.
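
As a rough sketch, the arithmetic above can be written out like this. The function name and the 15% default buffer are just illustrative choices (taken from the upper end of the 10-15% range mentioned), and 1 GB is treated as 1e9 bytes:

```python
def estimate_model_memory_gb(params_billions, bits_per_param, overhead=0.15):
    """Rough RAM estimate in GB (1 GB = 1e9 bytes): raw weights plus a buffer.

    `overhead` is an assumed fudge factor for activations and the OS,
    not a measured value.
    """
    # Billions of parameters times bytes per parameter gives GB of raw weights.
    weights_gb = params_billions * bits_per_param / 8
    return weights_gb * (1 + overhead)


for bits in (8, 4, 3):
    weights_only = estimate_model_memory_gb(70, bits, overhead=0.0)
    with_buffer = estimate_model_memory_gb(70, bits, overhead=0.15)
    print(f"70B @ {bits}-bit: {weights_only:.2f} GB weights, "
          f"~{with_buffer:.1f} GB with 15% buffer")
```

For the 3-bit case this gives a bit over 26 GB of raw weights and roughly 30 GB with the buffer, which is why it should squeeze into 32GB.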


Does 70b mean there are 70 billion weights and biases in the model?


For me it is consistency. I control the model and the software so I know a local LLM will remain exactly the same until I want to change it.

It also avoids the trouble of using a hosted LLM that decides to double their price overnight, costs are very predictable.


For anyone else looking for the weights which as far as I can tell are not linked in the article:

Base model: https://huggingface.co/Zyphra/Zamba2-7B

Instruct tuned: https://huggingface.co/Zyphra/Zamba2-7B-Instruct


I couldn't find any gguf files yet. Looking forward to trying it out when they're available.


It seems that zamba 2 isn't supported yet, the previous model's issue is here:

Feature Request: Support Zyphra/Zamba2-2.7B #8795 (open, opened by tomasmcm on Jul 31, 1 comment)

https://github.com/ggerganov/llama.cpp/issues/8795


What can be used to run it? I had imagined Mamba based models need different inference code/software than the other models.


If you look in the `config.json`[1] it shows `Zamba2ForCausalLM`. You can use a version of the transformers library to do inference that supports that.

The model card states that you have to use their fork of transformers.[2]

1. https://huggingface.co/Zyphra/Zamba2-7B-Instruct/blob/main/c...

2. https://huggingface.co/Zyphra/Zamba2-7B-Instruct#prerequisit...
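
A minimal sketch of checking which model class a checkpoint expects, assuming you have the `config.json` contents as text. The sample string below is an illustrative excerpt containing only the `architectures` field mentioned above; the real file has many more fields, and `model_architectures` is a hypothetical helper name:

```python
import json

# Illustrative excerpt only; the real config.json contains many more fields.
SAMPLE_CONFIG = '{"architectures": ["Zamba2ForCausalLM"]}'


def model_architectures(config_text):
    """Return the model class names declared in a config.json string."""
    config = json.loads(config_text)
    # Fall back to an empty list if the field is missing.
    return config.get("architectures", [])


print(model_architectures(SAMPLE_CONFIG))
```

If the name listed there isn't implemented by your installed version of transformers, that is the sign you need a newer version or a fork.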


To run gguf files? LM Studio for one. I think recurse on macos as well and probably some others.


As another commenter said, this has no GGUF because it’s partially mamba based which is unsupported in llama.cpp


dev of https://recurse.chat/ here, thanks for mentioning! rn we are focusing on features like shortcuts/floating window, but will look into supporting this in some time. to add to the llama.cpp support discussion, it's also worth noting that llama.cpp does not yet support gpu for mamba models: https://github.com/ggerganov/llama.cpp/issues/6758


Gpt4all is a good and easy way to run gguf models.


Mamba based stuff tends to take longer to become available


Looking forward to them taking a similar action against Wordpress.com for upselling access to the core wordpress feature of plugins.


I'm no trademark lawyer, but isn't offering "WordPress" hosting fine as long as you are genuinely using the WordPress software? As I understand it that is purely nominative use.


I see that sentiment largely coming from developers who, I think, misunderstand the freedom that the GPL is protecting.

The GPL focuses on the user's freedom to modify any software they are using to better suit their own needs, and it does a great job of it.

The people saying that it is less free than bsd/mit/apache are looking at it from a developer's perspective. The GPL does deliberately limit a developer's freedom to include GPL code in a proprietary product, because that would restrict the user's freedom to modify the code.


As I understand it the sanctioned way of sharing code added to UE is to fork it on github and publish changes to your fork.

Being a fork it will only be available to other people in the Epic Games github org which is only people who have agreed to Epic's licensing terms, and your modified engine remains under that same license.


Yeah, you need to manually fix each affected system by booting in safe mode. Not possible to do remotely.


And you will need your bitlocker recovery key to access your encrypted drive in safe mode. I luckily had mine available offline

There's going to be a lot of handholding to get end users through this.


You can enable safemode for next boot without the recovery key and then you can delete the offending file on that next boot.


That requires being able to boot in the first place


You can do a minimal boot. I'm told.


Ouch!


For anyone who wants a sneak peek at what the content will be, Andrej Karpathy already has a series of videos on youtube that covers roughly the first half of the list here: https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxb...

Starts with building micrograd to build an understanding of how pytorch calculates gradients, then it proceeds all the way to making a gpt-2 clone.

Looks like this is an effort to reorganize and build on that existing work.
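
To give a flavor of the gradient-tracking idea, here is a tiny scalar autodiff sketch in the spirit of micrograd. This is my own simplified illustration, not Karpathy's code or how pytorch is actually implemented; the `Value` class and its methods are hypothetical names, and it only supports `+` and `*`:

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically order the graph, then apply the chain rule
        # from the output back towards the inputs.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


a, b = Value(2.0), Value(3.0)
y = a * b + a  # dy/da = b + 1 = 4, dy/db = a = 2
y.backward()
print(a.grad, b.grad)
```

Each operation records its inputs and a small function for the local derivative, and `backward()` walks the recorded graph in reverse to accumulate gradients.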


The extension needs to be signed by mozilla for the normal production builds of firefox to let you load it on startup. If it isn't signed, you need to manually load it in using about:debugging each time you restart firefox.


Mozilla is not preventing anyone from signing anything here (and the "security checks" on who can sign are so weak they might as well not exist in the first place).


Same applies to Chrome as well by that logic; it allows you to sideload unverified extensions too at the cost of annoyingly making you set it up at every startup.

I guess we're all better off using Chrome then?


Okay, but you've moved the goalposts from

> You don't need Mozilla's approval

to pointing out that Mozilla has approved (signed) this extension.


That you're pedantically language-lawyering my post while not engaging with the far greater falsehood that the previous poster was perpetuating is not a good look.

And the reality is Mozilla can always block any extension they want. They can just change the Firefox source code. It doesn't matter what functionality does or doesn't exist now or what policy they do or don't have – everything can always be changed. That's true for almost anything.

So what they "could do" is a complete distraction in the first place because they "could do" anything. What they ARE doing matters.


No, pointing out that your claims are conceptually false is a fine look.

It's not about things Mozilla could theoretically do to block you, it's that they require you to proactively get their permission to run an extension (in a prod version of the browser on an ongoing basis, which I think is reasonable table stakes). Here's their official docs for self-distribution, i.e. not using the AMO at all: https://extensionworkshop.com/documentation/publish/submitti... Notice that step 1 starts with giving Mozilla your extension to approve of, step 4 goes so far as to say that if your extension doesn't pass their checks then

> The message informs you of what failed. You cannot continue. Address these issues and return to step 1.

then step 7 is make sure Mozilla reviewers can read your source code, step 9 is wait for them to get back to you, and step 13 is download the XPI that Mozilla has approved to be allowed to run in their browser.

So yes, you absolutely need Mozilla's approval to publish an extension, even if you self-publish the XPI after they've blessed it. If they do not perform the action of signing it, they don't need to change any source code, it won't install. It may be true that in this case they have given that approval, but that doesn't invalidate the general point, and this is a fundamental restriction, not "language-lawyering".


I have to disagree that I'm perpetuating any falsehood here. Mozilla literally needs to approve an addon for it to behave normally. That you are satisfied with the process they have for approving doesn't change that.

To me it seems absurd for a company that claims to be so pro-privacy to not allow any genuinely private extensions to exist. Anyone who wants to make a 'real' addon has to share their code with mozilla.


I actually mostly had the top poster in mind, not you, sorry for the confusion.

What you're saying is technically true, but also not relevant, as explained. They can have the best system in place today, and just change Firefox tomorrow. So it doesn't really matter how the system works now. This is true for anything from Mozilla to XFree86 to Redis to left-pad.

The de-facto reality is that right now anyone can create an account, create a signing key, and distribute their extensions $anywhere. Approval is little more than a rubber stamp. Mozilla is not going around granting "approval" or anything like that.

And they certainly didn't revoke the very weak "approval" here; people can distribute and install it. It's just not listed on the Russian add-on store. So that makes it doubly irrelevant.

