Not to be a conspiracy theorist, but members of Congress considering freely available models to be a threat but “well-regulated” models hidden away behind APIs to be acceptable is just what I’d expect from an effective lobbying campaign by OpenAI/Microsoft.
Honestly, it isn't an unreasonable stance. Given that we (and, even more understandably, they) don't really know the consequences of the tech, but do know it could completely transform society as we know it, it is arguably safer to have a few entities that can control it.
Sometimes I get the feeling that we are doomed to get an AI monopoly/duopoly and sometimes the feeling that open source will prosper.
And while the enthusiast in me clearly wants it to be open source, I don't feel it is a given that society will be better for it.
Really? I can't believe that the right approach is to lock up tech, especially with large corporations and/or government holding the keys. Arguably they're the ones most likely to abuse it. Witness Facebook, which has actually transformed society, raising the lunatic fringe to parity with the mainstream through the magic of social networks and algorithms that promote the most profitable speech regardless of its corrosive effect on society. You don't need fancy AI to destroy the fabric of society.
The more appropriate path is developing countermeasures or counter-technology. This is only possible via free and open development of technology. God knows what monstrosity OpenAI, or any other corporation, is cooking up behind closed doors. And we know their incentives are not aligned with those of a healthy, functioning society.
> You don't need fancy AI to destroy the fabric of society.
You don't, but you can do it a million times worse in a fraction of the time. Considering how trivially Facebook can worsen our lives, just imagine how effective it could be with real AI behind it. That is the risk you are taking: before we even have time to course-correct, we've lost everything we know.
> The more appropriate path is developing countermeasures or counter-technology.
If we get good AI, the countermeasures are not going to be great; they alone could easily be dystopian, since CAPTCHAs and the like won't be good enough. The only way to even try to prevent abuse is to ID every action and tie it to a physical person. You could try to make that anonymous, but yeah, like that is going to happen. And that is the easy part.
There's no need to presume lobbying. Freely released models, by definition, don't have a regulating body. It would be as if replicators suddenly made it possible for anyone to produce automobiles or firearms at nearly zero variable cost. These things have life-changing consequences, so there are expectations of quality, safety, and record keeping.
The simple truth is that our information tools have become so complex that they're capable of significant consequences. Bots are already being used to wage war on information, notably Russia's socio-political war that the U.S. doesn't even realize it is in. That said, it doesn't mean that models must be proprietary. If anything, they should be treated like crypto protocols: open to many eyes and liberal amounts of sunshine, to prevent anyone from encoding a Manchurian candidate.
I suspect that if various stakeholders (the U.S. Senate, corporate interest groups, etc.) had understood the implications of home computers, they would not have hesitated to ban or otherwise regulate them. Up until the early 2000s, home computers were seen as toys for nerds (people forget!), a perspective I'm sure many of them now regret.
Just who I was thinking of, IRL. If you run through the letter's concerns about risk, just about all of them got a lot worse once a "hacker" could get a general-purpose computer and a 2400 baud modem for around the same inflation-adjusted price as a nice AI/ML-optimized setup today.
But I don't think senators at the time were sending ominous letters to the Steves & BillG asking why they were letting the general public have access to these dangerous tools rather than guarding them in machine rooms like IBM.
Not to play the exact opposite role of a charitability theorist, but if one assumes OpenAI's founding principles are sincerely held, then that lobbying campaign may be more public-good-intentioned than profit-intentioned. Under such a framing, they would argue "yes, a freely available model indeed is a threat, while a carefully protected model accountable to internal and governmental oversight is less of a threat" because they believe that to be the case, not because they want a monopoly.
(I don't necessarily actually believe this. I just feel like any good conspiracy theory warrants an equal and opposite angel-devil's advocate.)
> but “well-regulated” models hidden away behind APIs to be acceptable
This isn’t my takeaway from the letter [1]. Expressing caution towards one element of a thing doesn’t imply acceptance of or even preference for the other aspects.
I agree that the letter did try to give some nuance, but they clearly prefer that the models be hidden away behind APIs.
> At least at this stage of technology’s development, centralized AI models can be more effectively updated and controlled to prevent and respond to abuse compared to open source AI models.
> While centralized models can adapt to abuse and vulnerabilities, open source AI models like LLaMA, once released to the public, will always be available to bad actors who are always willing to engage in high-risk tasks, including fraud, obscene material involving children, privacy intrusions, and other crime.
> Meta’s choice to distribute LLaMA in such an unrestrained and permissive manner raises important and complicated questions about when and how it is appropriate to openly release sophisticated AI models.
> In February 2023, Meta released LLaMA, an advanced large language model (LLM) capable of generating compelling text results, similar to products released by Google, Microsoft and OpenAI.
> Unlike others, Meta released LLaMA for download by approved researchers, rather than centralizing and restricting access to the underlying data…
I don’t know any way to read this other than “you would have been fine by us if you kept it behind an API wall”.
It’s saying Facebook did something different; the Committee is asking why. There may be nefarious intent in the subtext. We should be wary of that. But the letter per se isn’t evidence of that intent.
Over 300 million people reside in a nation with easy access to guns, producing both corporate profits and fatalities, while legislators focus on a company's leaked figures.
> a company can be called out by Congress for allowing the public to have access to a cool new thing
Facebook didn’t “allow” the public access to LLaMA. It lost control of its model weights. That difference is material. (I know, practically speaking, there were zero controls on the weights. But by their own communication, it was “leaked,” not released.)
I'm guessing at least (1) is true. Blumenthal is the senator who thought "finsta" was a product Meta offered. Hawley, in hearings, has tried to get Meta to commit to not using the contents of encrypted messages for ad targeting, apparently unaware that those contents were not readable. Years of sitting on a committee responsible for technology have not pushed them to develop an understanding of the domain.
4. The AI panic is at full force, and the government thinks we need to regulate a glorified chatbot because, to the average person and a surprising number of credulous techies, said chatbot is somehow actual AI.
At this rate there will be no AI, because it will be smothered in the cradle by governments and large corporations, who will stop any free-thinking individual who might produce the next breakthrough, lest they use this "dangerous" technology to usurp the former's power or the latter's profits.
(3) is not likely, as LLaMA currently can't remember more than about 2,000 tokens of context. I don't doubt they may think it's the case, but "as dangerous as nukes" would be a wild overstatement.
Politicians tend to like monopolies, as a subset of liking centralized control over things. It means they only need to get ten people in a room to control entire industries.
To those of you who live in Connecticut or Missouri, contact the shit out of these representatives and give them a piece of your mind. This sort of blatant attempt to restrict the power of individuals should not be allowed.
This feels like the slowpoke meme; there are many other equally capable foundation models in the wild now. And even if there weren't, Bernstein v. United States (the DJB case) established source code as speech. I'd strongly prefer not to see the federal government investigating speech.
There are tons of models that either apply LoRA to or finetune LLaMA, but I meant foundation models, i.e., ones completely distinct from LLaMA. That includes Falcon, MPT, StableLM, StarCoder, Replit Code, and CodeGen.
There's also RedPajama and OpenLLaMA, which are intended to give similar performance to LLaMA but be legally distinct.
That sounds like a shot across the bow from OpenAI. Here we go.
It’s pure, unadulterated fun to watch these tech titans foil one another’s ambitions with cheap shots and swipes. Sama comes off as a little arrogant after all, and downright scary sometimes. No doubt he is brilliant.
I don't think Facebook's moves were heavily thought out strategically.
Instead, I think a small team of a few researchers made LLaMA and published a paper about it[1], and then realised to get any citations for their paper they'd need to let other researchers have the weights. Citations are like money in the academic world. Every researcher wants citations. Executives and managers care less about that stuff.
Letters like this make me think that Facebook may have intentionally "leaked" the weights. Certainly they didn't hesitate to open-source the model once the leak occurred.
The model was open to any researcher simply by applying: you had to enter your name and university or group affiliation, write like one sentence about your prior work or publications, agree to the noncommercial, research-only license, and that was it. They don't say how many people got to download the weights this way, but it was probably in the thousands.
And the weight files weren't watermarked or fingerprinted per recipient; the hashes were all identical, so there was no real way to trace who put up the torrents first.
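For the curious, a minimal sketch of why identical files defeat tracing (hypothetical filename, not Meta's actual tooling): every recipient's checksum matches the torrented copy equally, so the digest identifies no one.

    # Minimal sketch: with byte-identical distribution, a leaked copy's
    # SHA-256 matches every recipient's copy, so it points at no one.
    import hashlib

    def sha256_of(path: str) -> str:
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    # Hypothetical filename; per-recipient watermarking would have made
    # these digests differ, allowing a trace back to one downloader.
    # print(sha256_of("llama-7b/consolidated.00.pth"))

Per-recipient fingerprinting (even flipping a few low-order bits in a handful of weights) would have made each download unique without measurably changing the model.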
So this isn't much of a 'leak' as it is... just good old piracy. Like if you found a torrent of a DVD rip of the movie Frozen, the MPAA wouldn't be blaming Disney for 'leaking' it.
LLaMA weights are uploaded everywhere (including Huggingface, which Meta themselves regularly use) and I have not seen a single enforcement action. No one uses the access request forms or the XORs.
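(For anyone unfamiliar, "the XORs" refers to repos that publish the weights XOR'd against the original files, so that only someone who already holds the licensed weights can reconstruct the model. A toy sketch of the idea, not any particular repo's actual tooling:)

    # Toy sketch of XOR-diff distribution: the published diff looks like
    # noise on its own; XOR-ing it with the base recovers the weights.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        assert len(a) == len(b)
        return bytes(x ^ y for x, y in zip(a, b))

    base  = b"original licensed bytes"  # hypothetical: what approved users hold
    model = b"weights to reconstruct."  # hypothetical: what the repo conveys

    diff = xor_bytes(base, model)            # publishable: reveals nothing alone
    assert xor_bytes(base, diff) == model    # base holders recover the weights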
Seeing the comments here, I think HN is unaware of just how nonexistent Meta's policing of LLaMA is. They don't seem to care about their own license.
Hence my theory that Meta is having their cake and eating it. Clearly they don't mind LLaMA being widely used, but the restricted release and license gives them plausible deniability from displeased parties, especially those that have no idea what huggingface-hub even is.
Maybe this wasn't the initial intention, but it sure is an easy way forward.
> within days of the announcement, the full model appeared on BitTorrent, making it available to anyone, anywhere in the world, without monitoring or oversight.
>"A series of tubes" is a phrase used originally as an analogy by then-United States Senator Ted Stevens (R-Alaska) to describe the Internet in the context of opposing network neutrality.
Am I alone in hating this pattern of Congress trying to do an end run around First Amendment freedom of speech by pressuring companies to do it for them, using hearings to harangue them?
It is questionable whether Congress could even pass a law regulating that behavior, or whether such a law would be constitutional if they did.
Yet they harangue companies for fully legal behavior. Congress is elected to pass laws, not to be moral scolds.
It's wild to think of the billions of weights/numbers that are worth $100M. It would be fun to find out where in pi they are located and just share that coordinate.
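Fun idea, but there's a catch (beyond the fact that pi containing every finite sequence assumes normality, which is unproven): in a random-looking digit stream, an n-digit string typically first appears around offset 10^n, so the "coordinate" takes roughly as many digits to write down as the data itself. A toy search, assuming mpmath is installed and a hypothetical 4-digit "payload":

    # Toy search for a digit string in pi. An n-digit target typically
    # first shows up near offset 10**n, so pointing at it costs about
    # as many digits as just sharing the data outright.
    from mpmath import mp

    mp.dps = 100_000                    # compute 100k decimal digits of pi
    digits = str(+mp.pi).replace(".", "")

    target = "2718"                     # hypothetical 4-digit payload
    print(f"'{target}' first appears at offset {digits.find(target)}")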
Senators are realizing the power of AI and want to control it, bending it to their will using back-channels. OpenAI is going to have to "leak" its models as well or will become an extension of the political elites.