"BLOOM will be the first language model with over 100B parameters ever created. This is the culmination of a year of work involving over 1000 researchers from 70+ countries and 250+ institutions, leading to a final run of 117 days (March 11 - July 6) training the BLOOM model on the Jean Zay supercomputer in the south of Paris, France thanks to a compute grant worth an estimated €3M from French research agencies CNRS and GENCI."
Thanks to the researchers, institutions and the French government for providing the resources to make that happen. I hope more countries on the continent follow this model of open access and funding for AI research.
Though it requires an account, unlike other models on the hub (I guess they're hoping to get new users from the hype of this model).
Bloom looks pretty exciting. It's reportedly as performant as GPT-3 (I haven't tested it enough to confirm, but what little testing I did gave okay results), but the model is completely open-source; if you can afford some cloud compute, you can upload the model to your preferred cloud provider and generate whatever you want from it, good or evil.
I'm expecting the coming 12 months to be pretty interesting for text generation.
> but the model is completely open-source; if you can afford some cloud compute, you can upload the model to your preferred cloud provider and generate whatever you want from it, good or evil
It seems that the developers' position is that it's not fully open-source, and that some potentially evil or unethical uses are actually restricted:
Alright, fine. It's open-source-with-an-asterisk, which is completely different from open-source in every way, and happens to involve having access to the source of the thing.
The asterisk says "I'm not allowed to say it's open source because some website I don't care about says it's not the right definition".
> Though it requires an account, unlike other models on the hub (I guess they're hoping to get new users from the hype of this model).
This model is larger than anything HF has done, so resource usage likely has to be constrained a bit (GPT-3 was restricted at first for similar reasons, in addition to the potential abuse issues)
If you create a Hugging Face account, can you download the model and weights and run inference locally (I have an NVIDIA 3080 Ti on a 32-core machine with 64GB of RAM)? Or do they just host the inference server for you? IIUC the model design is open so I assume the weights are too.
You can actually download the weights without an account:
git clone https://huggingface.co/bigscience/bloom
You will need Git LFS and ~330GB of free space though.
It is possible to run inference locally, even without very much RAM/VRAM (though it will be very slow). Get the most recent versions of "transformers" and "accelerate" from git, and then:
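Something along these lines should work (a minimal sketch; the checkpoint path, offload folder, and prompt are placeholders, so adjust to taste):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloom"  # or the path to your local clone

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",           # let accelerate spread layers across GPU/CPU/disk
    torch_dtype=torch.bfloat16,  # the published weights are bfloat16
    offload_folder="offload",    # layers that fit nowhere else get written here
)

inputs = tokenizer("The meaning of life is", return_tensors="pt")
output = model.generate(inputs["input_ids"], max_new_tokens=20)
print(tokenizer.decode(output[0]))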
Thanks, that's what I was looking for (double thanks for the example inference code). I'll queue up the download and try to get the dependencies working.
Comment 1: git-lfs is now close to saturating my gigabit ethernet connection, around 990Mb/sec
Comment 2: filled up the disk on my first computer and had to find an actual hard drive since none of my SSDs have 330GB. It seems more like 500+GB on disk. Haven't gotten to inference yet.
I haven't used LFS before but it seems to buffer incompletely downloaded files in .git/lfs/incomplete and keep a copy of the "objects" in .git/lfs/objects (I am one of those people who have absolutely no interest in how git's DB works, nor how its blob DB works).
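(Side note: if the duplicate copy under .git/lfs/objects becomes a problem, I believe huggingface_hub's snapshot_download skips the git machinery entirely and just pulls the files into its local cache; I haven't tried it against this particular repo though:)

from huggingface_hub import snapshot_download

# Fetches every file in the repo into the local Hugging Face cache,
# without keeping a second copy of the blobs under .git/lfs/objects.
path = snapshot_download("bigscience/bloom")
print(path)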
Almost there. I've got everything installed and want to run your inference script. Model loading fails with: TypeError: Got unsupported ScalarType BFloat16
File "/home/dek/miniconda3/lib/python3.9/site-packages/accelerate/utils/offload.py", line 25, in offload_weight
array = weight.numpy()
I verified my numpy knows nothing of bfloat16:
>>> import torch
>>> t=torch.arange(0,6).resize(2,3).to(dtype=torch.bfloat16)
/home/dek/miniconda3/lib/python3.9/site-packages/torch/_tensor.py:586: UserWarning: non-inplace resize is deprecated
warnings.warn("non-inplace resize is deprecated")
>>> t
tensor([[0., 1., 2.],
[3., 4., 5.]], dtype=torch.bfloat16)
>>> t.numpy()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Got unsupported ScalarType BFloat16
Seems par for the course that I'd be able to get the world's most advanced NLP model downloaded over gigabit to my home supercomputer only to be stymied by a data type.
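(One way around it, in case anyone else hits this, is to upcast before the numpy conversion, e.g. by patching that line in accelerate's offload_weight; just a hack I'm sketching here, not necessarily the proper fix:)

# numpy has no bfloat16 dtype, so convert through one it does understand.
# float32 round-trips bfloat16 values exactly, but doubles the bytes written to disk.
# (may need an `import torch` at the top of offload.py if it isn't already there)
array = weight.to(torch.float32).numpy()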
Comment 3: after hacking my way around that, the script ran for another few hours doing disk IO (writing the offload), such that my directory is now 1.3TB (330GB of .git/lfs/objects, 330GB of sharded weights, and 607GB of "offload").
Then it failed again:
File "/home/dek/miniconda3/lib/python3.9/site-packages/accelerate/big_modeling.py", line 188, in dispatch_model
main_device = [d for d in device_map.values() if d not in ["cpu", "disk"]][0]
IndexError: list index out of range
Now I'm curious just how long it will take to repro this error (i.e., running again, with the offload files already written). It's also puzzling since I set the device_map to 'auto'.
Again, all par for the course and stuff I expected (having worked in HPC/ML/science for 3 decades, you get used to research codes).
Huh. That one looks like it wasn't able to place any part of the model on the GPU. I know there is currently an issue where if the first layer of the model is too big to fit on the GPU it will put all layers on the CPU/disk (rather than trying to see if a later layer would fit). But the 3080Ti has 12GB so I'd be surprised if that's happening here?
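If it is, one thing worth trying might be to pass an explicit max_memory budget so the automatic map knows exactly how much room each device has instead of relying on detection (a sketch; the sizes are placeholders for a 12GB card and a 64GB RAM box, and I'm not certain it gets around the issue above):

import torch
from transformers import AutoModelForCausalLM

checkpoint = "bigscience/bloom"  # or your local clone

model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    offload_folder="offload",
    # explicit per-device budget: keep GPU 0 a bit under its 12GB, leave RAM headroom
    max_memory={0: "10GiB", "cpu": "56GiB"},
)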
I see the same error as well. Looking at the underlying accelerate code, it looks like if the GPU is not an option then CPU and disk are rejected outright and the above error is thrown.
Just to double-check, are you using the git versions of transformers and accelerate? I believe the BFloat16 support is pretty recent (22 days ago) and may not be in conda/pip:
Yep. The model is too big to fit on the two GPUs I have (even though each one is 48GB). So instead it gets split between the GPUs, CPU RAM, and some parts even have to sit on disk (SSD). This makes it much slower than if everything were on the GPU.
I believe there are performance enhancements planned in Accelerate that will allow layers to be preloaded in parallel as other layers are executing inference, which should make things faster: https://github.com/huggingface/accelerate/issues/512#issueco...
I wasn't going to ask about this until I had it running on my machine but... it's very interesting that you can do out-of-core models with weights spread across various levels of the memory hierarchy. It feels almost like swap/paging. This is exciting!
What does the 'offload_folder' parameter do? Why is it even needed? Does the inference produce intermediate results so large they have to be stored on disk? o-O
I'm calling it now: these models will just grow and grow, and their responses are reflexive probabilistic responses. HN will scream it's not AI since it doesn't understand the world, and then we will discover that human beings are just reflexive probabilistic machines. The dumber reflexivity (which is what the models have, or rather what dumber humans have) isn't logically incorrect; it just isn't complex enough to hold all the information, but it's still very good at spitting out what is there. A deep understanding of human intelligence, I believe, will reveal that this is the case, and humanity will be cracked open.
I am sure humans are just reflexive probabilistic machines, but we have one major advantage: we are embodied in a complex environment, both physical and social. The environment allows simple reflexive inferences to align with real outcomes; it grounds the model.
Maybe we need to build an android baby and raise it as part of human society to get similar benefits for AI. As it stands, AI has no "skin in the game" so to speak, it doesn't optimize rewards and has nothing to lose.
Predicting the next word has its limits: a model can't design and try new experiments like human agents can, and a static training dataset is dead while the world is alive. Even AlphaGo was limited while doing imitation learning, but it surpassed humans when trained from scratch with access to a Go environment (board + opponent). The environment is the real teacher.
LaMDA and these language models are not currently capable of maintaining an internal train of thought that isn't expressed as token context. They have no memory of the world other than what we tell them to have. They can only "think" for as long as it takes to infer a token.
I think these big language models might get closer to something we would all call sentience once they can refer directly to their weights' previous activations for context instead of just tokens.
> LaMDA and these language models are not currently capable of maintaining an internal train of thought that isn't expressed as token context.
Why would that necessarily be insufficient for sentience? Sentience is just about the ability to "feel". Whatever "feeling" is, mechanistically, it will be some process that accepts some information and produces some output for these "feelings". It's not obvious that any kind of "memory" has to be part of the input to that process. For all we know, the transformer model is producing feelings as it computes, i.e. the next generated word is selected because it "feels right", where "feels right" simply means it's the most statistically likely word to follow the current token. Can you prove that this isn't basically what that gut feeling is for people?
We don't actually know what "feelings" are fundamentally, so most of the people dismissing LaMDA's sentience don't really understand the can of worms they just stepped in.
> We don't actually know what "feelings" are fundamentally
I bet feelings are value functions - they estimate the expected future reward from a given state. As we move from self-supervised learning to RL, AIs will get feelings. But they are always task dependent; the system of values is built around goals and past experience, and it's not something that can be separated from the specifics of the case.
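To be concrete, by "value function" I mean the usual RL quantity, the expected discounted sum of future rewards starting from a given state:

V^{\pi}(s) = \mathbb{E}_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \mid s_{0} = s \right]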
Humans come preloaded with reward signals that guide the development of feelings. We got our goals in-built - to survive, to thrive, to self replicate, and then all the sub-goals necessary to achieve the main directives - to move around and manipulate objects, to be social, to obtain food, etc.
A plausible abstract description, but what are the specific properties of a value function that create the feeling? We need that mechanistic understanding to avoid creating feelings when we don't want a sentient system. More than likely this will make the system less effective at certain tasks as well, but the ethics around this are unclear and can only be discussed once that mechanistic understanding becomes clear.
I think feeling is an act of imagination: we imagine future rewards, and this is necessary in order to select our actions. As we imagine scenarios we assign value to them, positive or negative, and that makes them pop out as more than images.
Firstly, it is now pretty much exactly two years since OpenAI announced GPT-3 and it is great to see that access to these models – for scientists in particular – is becoming increasingly commonplace. I should also note that Hugging Face really have taken a “big tent” stance on the development of this model and their business in general. This is refreshing given that no other entity capable of raising the funds and manpower to develop these kinds of models has taken a similar stance (despite much deeper pockets…). Yes, Hugging Face is running on venture capital, but at least for now I see no reason to be cynical about their efforts, which in every way appear genuine to me.
Still, it is not all great. OpenAI, DeepMind, Google, etc. continue to produce models that are completely closed and thus can never be subject to analysis by the rest of the community. Models which all dwarf GPT-3 at this point. My fear is that those of us arguing for access will continue to play catch up – maybe even forever.
However, while BLOOM is more open than say FAIR’s OPT [1] – which had the gall to call itself both “open” and “democratising” despite restricting commercial usage in its “open” license – I believe there is a discussion to be had about how they (and many others in the community) use the word “open”.
While I am not a lawyer, I am somewhat familiar with licenses and have gone through the BigScience RAIL License v1.0 [2], the associated blog post [3], and some background papers and documents to better understand how Hugging Face and its community motivate the licensing.
From the license itself: “[T]his License aims to strike a balance between both [open and responsible AI development] in order to enable responsible open-science…” I am of the opinion that this is impossible, as the definition of “responsible” they use (see Appendix A) is directly at odds with every definition of “open” that I am aware of. Furthermore, I find phrasings such as “Although the BigScience community does not aim to impose its values on potential users of this Model, it is determined to take tangible steps towards protecting the community from inappropriate uses of the work being developed by BigScience.” confusing, as it both claims not to seek to impose its values and to seek to impose its values in the very same sentence!
In essence, I wish entities such as FAIR and Hugging Face would stop using the term “open” when they so clearly disagree with all accepted definitions of it. In the blog post related to their license [3], they explicitly state that they are incompatible with the OSI definition of open; the ethical source movement [4] takes the same position and accordingly avoids labelling itself as “open”.
Okay, so BLOOM is not “open” as in “open source”, then what is it “open” like? Putting limitations on usage makes it run afoul of definitions of “open access”: “free availability and unrestricted use” [5]. Although I have to admit that I am less familiar with how this community would view imposing ethical considerations on readers of science, I am fairly certain that barring access to scientific information based on anything other than very clear ethical considerations (say, easily-deployable, pandemic-level virus mutations) would run afoul of a great majority of the open access movement.
Right, so it is not “open” as in “open source” and nor “open” as in “open access”. How about “open science”? To the best of my knowledge there is no widely accepted definition of open science, but most seem to model themselves around open access, open source, and open data. Thus falling back on OSI’s definition and to quote the Open Knowledge Foundation: “Knowledge is open if anyone is free to access, use, modify, and share it — subject, at most, to measures that preserve provenance and openness” [6]. Alternatively, their short definition makes it even clearer: “Open data and content can be freely used, modified, and shared by anyone for any purpose” [7].
In summary, I do not believe that you can argue that anything that is “ethically licensed” is “open” by any reasonable definition. This however is fine, as anyone is free to dictate how the fruit of their labour is to be used. However, even when I am being charitable, it is difficult for me not to feel that there is a desire to ride on the coattails of the positive semantics attached to the word “open” that has arguably taken more than 30 years for others to build up, and I feel that it is ethically questionable to muddy the waters around a term that others have worked hard to define and build communities upon. You are “ethical”, “responsible”, or some other nice term, but not “open” – own it.
Lastly, why do I as a researcher in this area object? Especially given that I think I can agree to every single ethical point in Appendix A.
Firstly – as I have argued above – it annoys me greatly that multiple entities outside the “open” movements use the terms frivolously and I believe to their own benefit. This feels like appropriation if ever I saw it.
Secondly, I believe that despite their good intentions multiple efforts motivated by ethics are reversing the direction of openness of where science has been heading over the last twenty years. Even if only marginally so.
Thirdly (and lastly), I believe these efforts will prove to have little to no effect towards their ethical goals, and that efforts are more strongly warranted elsewhere. Actors with the capability to cause harm will either be able to ignore the license or are likely to have already obtained these kinds of models independently of publicly released ones (not to mention that cutting-edge models are developed solely by multi-national corporations without ethical quandaries). Thus, to me, taking a legalistic approach is akin to paying for indulgences and staying in our academic ivory towers. Rather, I think successful initiatives to restrict potential harm must be wider in scope, akin to the Campaign to Stop Killer Robots [8], or, even better, must educate the general public about these models and develop countermeasures that support public trust and communication once these models inevitably become commonplace. Such efforts are necessary and receive far too little attention in the community, which instead favours hypothetical risk and legal word play.
Unrelated notes: While browsing the license I noted that it has patent clauses akin to Apache 2.0, which I found interesting. It also contains a somewhat vague requirement to keep your models in sync with the upstream: “You shall undertake reasonable efforts to use the latest version of the Model.” I assume that this is intended to be used if the model is “patched” to restrict harm. But it creates another ongoing relationship between users of the model and Hugging Face.
I wonder if this is more a PR/legal strategy on their part than anything else. If you do decide to download the model and generate fake tweets with it, how are they going to prove you used it? And even if they could, what sort of action can they enforce to stop you from doing so?
This probably deserves its own post for how thorough it is, and I am also very curious to see how others feel. I personally wouldn't call it open. "Open" is generally pretty objective, while "responsible" or "ethical" is quite subjective.