For a pittance OpenAI could maintain OpenAI Gym, Spinning Up, and their other early projects that have been so helpful to students and amateurs like myself. They choose not to.
Contributing a patch is a one-time operation.
Maintaining a fork and keeping it up to date means you have to constantly integrate upstream work into your work if you want it to remain relevant.
It's the difference between fixing a single issue and owning a project indefinitely.
A well-meaning forker (forkee?) interested in RL who's looking to make a name for themselves could do the world a huge amount of good.
You have to be competent though, which is too high a bar for the majority of people who want to change things.
It'd be nice if GitHub could show some activity metric alongside all those forks too, I guess. If only they accepted pull requests!
If you're getting software from your friends, generally they'll pick up the fork and you'll end up with it. If you're getting it from The Official Repository on the Centralised Forge, that doesn't happen.
How hard is it to make a public git repo?
Packaging your software falls into the 'being competent' pile that's too hard.
While I'm not privy to OpenAI's internal motivations, it's worth keeping in mind that as a nonprofit they face a pair of hard organizational constraints:
1) Training state-of-the-art ML models costs huge amounts of money. That means OpenAI's capex is forcibly high.
2) Their charter (https://openai.com/charter/) commits them to a "fiduciary duty to humanity", so among other things they can't accept donations which are subject to conditions that may be misaligned with their mission. That means OpenAI's effective funding pool is forcibly low.
The combination of 1) and 2) means that, absent an absolutely massive yet hands-off private donor — a rare thing indeed in the nonprofit world — it's hard to imagine a solution to this dilemma that doesn't involve some kind of for-profit spinoff. And the outcome here really is pretty binary: if you believe OpenAI will build safe AGI, then 100x returns for investors really are immaterial compared to the wealth they can expect to generate for everyone else. While if you don't believe they'll build safe AGI, then you must either believe that A) they'll fail to build AGI (in which case investors get nothing anyway); or that B) they'll fail to build it safely (in which case your concerns are presumably technical rather than organizational).
Incidentally, it's quite likely that in the beginning, OpenAI's own researchers were by no means sure they'd be able to get results like GPT-3 through scaling alone. But that turned out to be the world we live in, so they're now playing the hand they've been dealt.
1) How seriously is AI safety research taken by academia? Are there grants? Academic positions? Sponsored studies etc?
2) How much interaction is there between people who study the long-term dangers of strong AI and people who look at the problems and biases of existing AI systems (say, analyses of racial bias in AI bond-granting systems, etc.)?
Full disclosure, I have something of a love-hate relationship with "AI safety" positions. It seems you have a series of intertwined problems. So now some biased, editorializing points.
* Exactly what General AI is, what it can do, and so forth, is not known or understood. So it's hard to either know or articulate potential dangers. Often the stand-in for this unknown is treating the AI as a god-like being with an unknown agenda. This often results in contradictory and implausible worries that are hard to take seriously: so-and-so's basilisk, the "terrifying" scenario that's Pascal's Wager warmed over.
* A primary focus of AI safety research is "alignment" between AIs and humans. This has the fundamental problem that humans are often not aligned with other humans. Humans don't always engage in genocidal world wars, but we certainly know times when they have. AI researchers like Paul Christiano don't seem particularly concerned with the problem of "aligning" humans with each other, but it seems like a pressing question.
* But let's take "AI dangers" seriously and look at an analogous situation. Suppose there's a new kind of energy with the potential for massive payoff but with some kind of danger (pollution, explosions, whatever). It's weird to say "to solve this danger, we'll create a safe form first." A rational response to a dangerous activity would be regulation that applies to everyone creating it.
However, this is ignoring possible future developments - things may get better and more advanced and that's what AI-safety people are concerned about.
So for both large models (which are split up amongst multiple accelerators but are effectively one "computer") and learning from "your friends", you want to be much more tightly coupled than something like SETI@home or Folding@home. There are plenty of workloads that are great as embarrassingly parallel (and now "pleasingly parallel") jobs, but training a single model isn't one of them.
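To make the coupling concrete, here's a toy, single-process sketch of synchronous data-parallel SGD (all names and numbers are mine, purely illustrative, and nothing like an actual large-scale training stack): every step ends with an all-reduce over workers' gradients, so the whole run advances at the pace of the slowest link, which is exactly what volunteer-style computing can't provide.

```python
import random

# Toy simulation of synchronous data-parallel SGD on y ~ w * x.
# The structural point: every optimizer step contains a blocking
# "all-reduce" over all workers' gradients, so workers are tightly
# coupled and a slow or far-away worker stalls every single step.

def make_shards(n_workers, n_points):
    # Synthetic data y = 3x + noise, split across workers.
    data = [(x, 3 * x + random.gauss(0, 0.1))
            for x in (random.uniform(-1, 1) for _ in range(n_points))]
    return [data[i::n_workers] for i in range(n_workers)]

def local_gradient(w, shard):
    # Gradient of mean squared error for the model y_hat = w * x.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    # Stand-in for the collective communication every worker waits on.
    return sum(values) / len(values)

def train(n_workers=4, steps=200, lr=0.1):
    shards = make_shards(n_workers, 400)
    w = 0.0
    for _ in range(steps):
        grads = [local_gradient(w, s) for s in shards]  # computed in parallel in a real setup
        w -= lr * all_reduce_mean(grads)                # synchronization point
    return w

if __name__ == "__main__":
    print(train())  # converges toward ~3.0
```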
- The computing resources required to train models are not distributed
- Training data will often contain licensed material
- Our digital concept of 'open' revolves around transparency, which is not readily available with conventional AI
That's not to say that we should give up on open efforts in the field, but we're still deep in the experimentation/research phase of artificial intelligence. Copilot has been an excellent demonstration of how poorly suited ML is for prime-time use, even by developers.
Which is what DeepMind has done with the AlphaFold code (Apache licensed https://github.com/deepmind/alphafold) and published model predictions (CC licensed at https://alphafold.ebi.ac.uk/). I guess they could publish the weights but that would probably be useless since nobody else would be running the exact same hardware.
1. A lot of people want these systems to be open, and don't want the power that comes along with them to be locked up in the hands of a few rich people.
2. But some people also think these systems are powerful and don't want them in the hands of bad-faith actors (spammers, scammers, propagandists).
3. A lot of people also want these systems to be weakly safe and not have negative externalities when used in good faith (avoid spitting out racism when prompted with innocent questions). This is already hard.
4. Even better would be for the system to be strongly safe and be really hard to use for bad-faith purposes, but this seems unreasonably hard.
5. It's often easier to develop the "unsafe" version of something first and then figure out the details of safety once it's actually able to do something. This is basically where OpenAI is now.
6. The details around liability for the harms caused by this kind of thing are not clear at all.
So OpenAI is in this position where it has built this thing that is not yet weakly safe. People have very different ideas about how potentially harmful this could be, ranging from very dismissive ("there's tons of racism on the internet already, who cares?") to the very not dismissive ("rich white tech people are exacerbating inequities by subjecting us to their evil racist AI systems!").
What should OpenAI do with this thing? Keep it locked up so that it doesn't hurt anybody? Release it to the world and push accountability onto the end users? Brush aside the ethical questions and use the hype generated by the above tensions to get as rich as possible? So far their answer seems to be somewhere cautiously in the middle.
My personal opinion is that these questions will be very important for real AGI, but this ain't it, so the issues may not be as bad as they seem. On the other hand, maybe this is a useful test case for how to deal with these problems for when we do actually get there? Also from past experience, it's probably not a good idea for them to allow open access to something that spits out unprompted racism. I would like to see OpenAI more open, but I also realize that it's very hard for them to make any decision in this space without making people unhappy and generating a lot of bad press and accusations.
Openness, in the libre/free sense, is also making sure to minimize gatekeeping or putting the creators in a position of making judgements about what's good and what's not.
All the other points you list are ancillary. OpenAI is a prime example of "open-washing". OpenAI got good will from the community by implying they were open (free/libre) and then hid behind all the other points you listed to not commit to openness.
If they wanted to have a discussion about the moral hazard of AI and their business model was to create a walled garden where only approved scientists, engineers and researchers had access to the data and code, that's their prerogative, just don't name it "OpenAI".
This is not a criticism, just an observation of where we're at and how dramatically attitudes have shifted.
Free speech in the US has practically never been principled in support of the marginalized, but has been a tool for the wealthy to maintain power.
What a bizarre writing style / logic. It is exactly what I would call capped: it's capped at 100x.
If OpenAI genuinely produced a strong AI (I'm not saying they will, I'm saying they think they will) the investment returns would be 10,000x, or possibly uncountably more than that.
What on earth would you call it then?
Better yet, I'd call it unnecessary bullshit.
In the context of a company trying to create an AGI, yes, this is capped.
Furthermore, batch size is limited by memory, and training hyperparameters depend to some extent on the hardware.
So training in the real world is done on parallel, identical hardware nodes. Farming it out with current algorithms does not make sense.
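A common workaround, sketched below with made-up numbers (none of these come from OpenAI), is gradient accumulation: if the batch you want doesn't fit in one device's memory, you sum gradients over several micro-batches before each optimizer step, which is also why the effective hyperparameters end up tied to the hardware layout.

```python
# Back-of-the-envelope sketch (illustrative numbers only).
micro_batch_size = 8       # what fits in a single accelerator's memory
accumulation_steps = 16    # micro-batches summed before one optimizer step
n_devices = 64             # identical nodes, as described above

# The batch size the optimizer actually "sees" per update:
effective_batch = micro_batch_size * accumulation_steps * n_devices
print(effective_batch)     # 8192 examples per optimizer step
```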
As far as I can tell, it is a combination of smart algorithms, good engineering, and the hardware to make it happen. And scientists that had the right hunch for which direction to push in.
Well, GPT-3 isn’t a classifier and it isn’t using labeled data.
As an outsider it definitely appears that GPT-3 is an engineering advancement, as opposed to a scientific breakthrough. The difference is important because we need a non-linear breakthrough.
GPT-3 is a bigger GPT-2. As far as we know, there is no more magic. But I think it’s a near certainty that larger models will not get us to AGI alone.
In essence, I feel that the same people introduced two quite separate things: a completely new paradigm for obtaining few-shot learning from a language model in a way that competes with supervised learning on the same tasks; and the GPT-3 large model, which is used as "supplementary material" to illustrate that new paradigm but is also usable (and used) with the old paradigms, and by itself isn't a breakthrough. And IMHO when the public talks about GPT-3, they really do mean GPT-3-the-model and not the particular few-shot learning approach.
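For readers who haven't seen the paradigm in action, here's a toy illustration (the prompt follows the translation format shown in the GPT-3 paper, but it isn't pulled from any real API call): the "training examples" live entirely in the prompt, and the model is simply asked to continue the pattern.

```python
# Few-shot prompting: no fine-tuning, no labeled training run.
# The examples are placed directly in the prompt and the model
# is expected to continue the pattern.
prompt = """Translate English to French.
sea otter => loutre de mer
cheese => fromage
plush giraffe =>"""

# A completion endpoint (not shown here) would be asked to continue
# this text; the hoped-for continuation is along the lines of
# " girafe en peluche".
print(prompt)
```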
Quality is increasing with parameter count. Even now, interfacing with Codex leads to unique and clever solutions to the problems I present to it.
So is the answer here both?
Some scientist at OpenAI had the hypothesis "what if our algorithm is correct, and all we need is to scale it three orders of magnitude larger?" and made it happen. They figured out how to scale all the dimensions: how fast should x scale if I scale y? (That is very tricky, as modern machine learning is basically alchemy.)
And then they actually scaled it. That took a ton of engineering and hardware, for what was essentially still following a hunch.
And then they actually noticed how good it was, and did a ton of tests with the surprising results we now all know.
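As a toy version of the "how fast should x scale if I scale y" question above, here's a sketch of fitting a power law to a few (parameter count, loss) points and extrapolating; all the numbers are made up, not real GPT measurements.

```python
import math

# Fit loss ~ a * N**(-b) in log-log space to synthetic (params, loss) points,
# then extrapolate to a much larger model. Purely illustrative numbers.
points = [(1e8, 4.5), (1e9, 3.9), (1e10, 3.4)]

xs = [math.log(n) for n, _ in points]
ys = [math.log(loss) for _, loss in points]
n = len(points)

# Ordinary least squares slope of log(loss) vs log(N); the slope equals -b.
slope = (n * sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys)) / \
        (n * sum(x * x for x in xs) - sum(xs) ** 2)
b = -slope
log_a = (sum(ys) + b * sum(xs)) / n

def predicted_loss(params):
    return math.exp(log_a) * params ** (-b)

print(predicted_loss(175e9))  # extrapolated loss at GPT-3 scale (toy numbers)
```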
I'm not sure many would agree that the desire to scale was simply a "hunch".
GPT-3 has 175B parameters. The previous largest model was Microsoft's Turing-NLG which had 17B. GPT-2 had 1.5B.
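A quick back-of-the-envelope calculation (my own arithmetic, not a figure from the thread) shows why a model that size can't live on a single accelerator and has to be split across many devices, as discussed further up.

```python
# Storing 175B parameters in fp16, before any optimizer state or activations:
params = 175e9
bytes_per_param = 2            # fp16
weights_gb = params * bytes_per_param / 1e9
print(f"{weights_gb:.0f} GB")  # ~350 GB, far beyond any single accelerator's memory
# Optimizer state (e.g. Adam moments kept in fp32) multiplies this several times over.
```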
But then a few sentences down the author complains:
"The company “in charge” of protecting us from harmful AIs decided to let people use a system capable of engaging in disinformation and dangerous biases so they could pay for their costly maintenance."
So the author doesn't even present a coherent position.
The danger in the latter scenario is that OpenAI's AI is garbage: when used to answer a perfectly innocuous request, it regurgitates disinformation and biased results. OpenAI can't gatekeep against "bad" people when it's actually good people interacting with a dangerous AI. Like when it's asked to solve a simple math problem and gets the math straight-up wrong, but people trust it.
OpenAI wants us to believe its AI is so good it needs to be protected from "bad people" and built its safety protocols around that threat model, when it seems far more likely the latter is going to happen, and they haven't done a thing to protect us from that.
Maybe the author is a bot.
I got just one paragraph. Where's the rest of the article?
“Call-out culture” runs the gamut from obvious to pedantic, and, more often than not, the result is a net positive.
So, my question is: Why isn’t there a concerted effort from [OpenAI’s] like-minded contemporaries to call them out on their incredibly embarrassing name?
To all the OpenAI engineers, investors, and onlookers: there is _nothing_ “open” about your platform. Am I stupid or are your efforts just casually misleading?
The reality is, they have a lot of mouths to feed, and the money was going to have to come from somewhere.
I think it's fair to be cynical, but the name itself is an artifact of history.
Perhaps if they were more truly authentic, they'd have nudged the name as well, but I think the concerns should most be over the materiality of their actual 'openness' in operations, not so much the name.
In entertainment, we don't see all the scripts that were given a pass, or the actors that were not considered because they would cause controversy by being the wrong ethnicity/gender; comedy writing rooms in particular are generally 'anything can be said' unsafe spaces, which is an important part of the culture of good writing that now has a questionable future.
If you go and have a look at films made just between 2000 and 2015, you can see how many of them, even though they don't rise to the level of today's controversies and are probably not inherently controversial, would not get made today.
'The Last Samurai' with Tom Cruise: some could argue it's 'appropriation', though I think many would argue it's not, and it's not really a 'White Saviour' film either. It shouldn't be objectively controversial per se, though maybe fodder for discussion, but it probably would not get made today because it has too much potential for ruin: there is far, far too much risk of an escalating cascade of populist voices calling the film all sorts of things; with or without legitimate grievance, it won't matter, the mob rules.
Tom Cruise has 10 projects on the drawing board, and he's getting pitches daily, so there's going to be 'a different film' without the risk profile to his personal brand.
Studios (and BigCos) are risk averse and so what we see is a 'watering down' effect across the board.
Conan O'Brien had Sean Penn as a guest on his podcast, and he had some thoughtful things to say about it.
So does the analogy hold for capitalist control of OpenAI? Possibly.
MSFT might be able to push for some 'results' which are seemingly more obvious and public, but the systemic effect of 'closed' and 'proprietary' is otherwise stifled innovation and lost opportunity.
It's actually an interesting analogy but I think it's probably good for maybe reasons you might not suspect, and that is, the big part of the iceberg that never gets seen.
I mean, you can make a better argument for Windows being "open". Available to a much broader section of society, as an installable product not API privileges that can be withdrawn, and a lot more of what's under the hood is exposed, documented and moddable.
> OpenAI was an AI research laboratory. But its ambitions were simply out of reach for the resources it had access to.
It is a sign of the rapid transformation of our world that the above statement is simultaneously true and completely missing the context here.
Why? Think about the context of OpenAI's founding. As I remember it, OpenAI wanted to offer an alternative locus of power compared to Google and Facebook around AI. They wanted to have the spotlight to talk about broader topics and share advancements more broadly.
To accomplish that mission, there are many hard compromises that have to be met.
To be clear, I'm not assessing how well OpenAI did. I don't know the right answers.
Rather, I'm pointing out the constraints in their space are substantial. I don't think anyone can dispute the upsides of having a large budget -- to hire in-demand experts and train expensive models. What are the best ways to acquire this funding? I think there are many thought experiments to conduct and cross-comparisons to make.
I’m going to trust them on their reasoning, but given that premise, what are the options here if they lack the required resources?
If they want to avoid what is dubbed a “sell” option to private money, the other option is to get some public funding, or to be honest about the issue and close up shop.