Bx6667's comments

Watt hours are the most widely used unit for energy in science and engineering ...


I know (in the case of engineering, anyway; I would argue joules are more common in science). I accept the convenience of Wh. The problem comes with Wh/x, where x is any unit of time. Why not cancel the h/x down to a dimensionless factor in this case?
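
To make that concrete, here's a toy sketch (my own numbers and function name, nothing from the thread) of the cancellation: an energy rate quoted in Wh per day is just an average power once the hours-per-day factor is divided out.

    # Wh/day = W * (h/day); dividing by the 24 h in a day leaves plain watts.
    HOURS_PER_DAY = 24

    def wh_per_day_to_watts(wh_per_day):
        return wh_per_day / HOURS_PER_DAY

    print(wh_per_day_to_watts(240))  # 240 Wh/day is a steady average draw of 10 W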


This comment blatantly breaks the guidelines and yet nobody flags it. But when I make a benign comment that has a pro-conservative sentiment, it’s instantly flagged and responded to by Dan.


That doesn't sound likely.

Complaints like this never come with links, because supplying the relevant link would reveal the untold part of the story and allow readers to make up their own minds: https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


It’s a fact that the guidelines are bendy. It’s a fact that comments that break guidelines but have a liberal sentiment get flagged and danged much, much less often than the same comments with conservative sentiment. That is a fact and I invite everyone who reads this censored and flagged comment to see for themselves.


[flagged]


We've banned this account for ideological battle. Please don't create accounts to break HN's rules with: https://news.ycombinator.com/newsguidelines.html.


Offline classes aren’t worth the cost of full tuition...


Classes are not worth the cost of full tuition.


That's true, but if you look at how university budgets have changed over the past few decades, it's clear that they aren't supposed to be.

They've been aggressively holding educational costs down (see: adjunct professor), while the costs for non-educational services such as career counseling and "student support" (not entirely sure what that entails; my alma mater didn't have it as far as I'm aware) have mushroomed.


This.

The value I got out of university was mostly not from the tuition, but I think overall (very pre-pandemic) the whole package was worth it. That said, I've learnt vastly more by reading, engaging with people, and doing things, both before and after university, for much less cost.

I've also never been on a paid training or course that was worth the cost and didn't involve something above and beyond the tuition (networking, project-based work, access to facilities/tools/data, etc.). The only times this obviously isn't true are when you're gaining some required or coveted industry certification, and then that's only because of the value attached to the certification, not the value of the learning itself vs. other ways to gain the same skills/knowledge.

The lesson for me has been that for any course or school there's got to be an expected benefit way beyond the perceived value of the tuition, and that most of the time this doesn't exist.

I think paid training in companies is likely as popular as it is mostly because it feels a bit like a vacation, and most companies don't allow, or don't make it socially acceptable for, people to take sufficient time off for their own long-term wellbeing.


Yeah, well, the major upside of a degree for me was that it forced me to learn stuff I would never have bothered with. I mean, when I started uni I had no clue where to even begin. Nowadays I would prefer reading a book rather than taking classes, but I think that is because I went to uni and learned how to learn.

I feel it is the same with books on programming. They only really become worth reading once you already know how to program.


I think people here are sometimes delusional about the price of education, since everything software-related has almost become a commodity. You can buy a Django/React/whatever course for $9.99 on Udemy. However, there's an incredible amount of effort that goes into preparing courses in other disciplines.


* People are delusional because colleges are getting away with low-quality work, thanks to the higher barrier to entry for competition compared to something like Udemy.

* People are getting delusional because of both the low quality and the high price.

Edit: disclaimer: I am a college student and thus I may be biased by what I see here in India. But I assume it is almost universal, going by other discussions here and on Reddit.


I am totally confused by people not being impressed with GPT-3. If you asked 100 people in the tech industry in 2015 whether these results would be possible in 2020, 95 would say no, not a chance in hell. Nobody saw this coming. And yet nobody cares because it isn't full-blown AGI. That's not the point. The point is that we are getting unintuitive and unexpected results. And further, the point is that the substrate from which AGI could spring may already exist. We are digging deeper and deeper into "algorithm space" and we keep hitting stuff that we thought was impossible, and it's going to keep happening, and it's going to lead very quickly to things that are too important and dangerous to dismiss. People who say AGI is a hundred years away also said Go was 50 years away, and they certainly didn't predict anything even close to what we are seeing now, so why is everyone believing them?


I think people should be impressed, but also recognize the distance from here to AGI. It clearly has some capabilities that are quite surprising, and is also clearly missing something fundamental relative to human understanding.

It is difficult to define AGI, and it is difficult to say what the remaining puzzle pieces are, and so it's difficult to predict when it will happen. But I think the responsible thing is to treat near-term AGI as a real possibility, and prepare for it (this is the OpenAI charter we wrote two years ago: https://openai.com/charter/).

I do think what is clear is that we are, in the coming years, going to have very powerful tools that are not AGI but that still change a lot of things. And that's great--we've been waiting long enough for a new tech platform.


On a core level, why are you trying to create an AGI?

Anyone who has thought seriously about the emergence of AGI puts the chance that AGI causes a human-extinction-level event at around 20%, if not greater.

Various discussion groups I am a part of now see anyone who is developing AGI as equivalent to someone building a stockpile of nuclear warheads in their basement that they're not sure won't immediately go off on completion.

As an open question: if one believes that

1. We do not know how to control an AGI,

2. AGI has a very credible chance of causing a human-extinction-level event,

3. We do not know what this chance or percentage is, and

4. We can identify who is actively working to create an AGI,

why should we not immediately arrest people who are working on an "AGI-future" and try them for crimes against humanity? Certainly, in my nuclear warhead example, I would immediately be arrested by the government of the country I am currently living in the moment they discovered this.


The problem is that if the United States doesn't do it, China or other countries will. That is exactly why we can't afford to fall behind on such a technology from a political / national perspective.

For what it's worth though, I think you're right that there are a lot of parallels with nuclear warheads and other dangerous technologies.


There needs to be a level of serious discourse, which doesn't currently appear to be in the air, around what to do: international treaties, repercussions, and so on.

I have no idea why people aren't treating this with grave importance. The level of development of AI technologies is clearly much ahead of where anyone thought it would be.

With exponential growth rates, acting too early is always seen as an 'overreaction', but waiting too long is sure to produce a bad outcome (see: the world's response to the coronavirus).

There seems to be some hope, in that as a world we seem to have banned human cloning, and that technology has been around since Dolly in the late 90s.

On the other hand, the USA can't seem to come to a consensus that a deadly virus is a problem, even as it kills its own citizens.


You don't know the distance! And you are conflating distances! The distance between AGI behavior and GPT-3 behavior has nothing to do with the distance in time between the invention of GPT-3 and AGI. That's a deceptive intuition and fuzzy thinking... again, my point is that the "behavior distance" between AIM chat bots and GPT-3 would, under your scrutiny, lead to a prediction of a much larger "temporal distance" than 10 years. Nit-picking about particular things that this particular model can't do is completely missing the big picture.


I think there's a divide between "impressive" and "good".

I think deep learning will keep creating more impressive, more "unintuitive and unexpected", more "wow" results. The "wow" will get bigger and bigger. GPT-3 is more impressive, more "wow"-y than GPT-2. GPT-3 very impressively seems to demonstrate understanding of various ideas, and it very impressively develops ideas over several sentences. No argument with the "unintuitive and unexpected" part.

The problem is that the whole thing doesn't seem definitively good (in GPT-3's case, it doesn't produce good or even OK writing). It's not robust, reliable, trustworthy. The standard example is the self-driving car. Those still aren't reliable, and with more processing power a company could probably add more bells and whistles to the self-driving process, but still without making it safe. And GPT-3 seems in that vein - more "makes sense if you're not paying attention", the same "doesn't really say coherent things".

I'm trying to trace a middle ground between the two reactions. I'm perhaps laughing a little at those who only look at the impressiveness, but I acknowledge there's something real there. Indeed, the more you notice something real there, the more you notice something real missing there too.


That's similar to my thoughts. That demo video of generating HTML was very impressive; I have never seen anything that can do that. But it's also 1000x less useful than Squarespace or WordPress. The tool in its current state is totally useless even if it is very impressive.


> It's not robust, reliable, trustworthy

Is human writing robust, reliable, trustworthy? Would you agree that some humans produce vastly better writing than others? Have you never read comments here on HN that appeared to be incoherent rambling, logically faulty, or just shallow, trite and cliched?

GPT-1 is a significant improvement over earlier RNN-based language models. GPT-2 is a significant improvement over GPT-1. GPT-3 is a significant improvement over GPT-2, especially in terms of "robustness". All these achievements appeared in the course of just 3 years, and we haven't yet reached the ceiling of what these large transformer-based models can do.

We can reasonably expect that GPT-4 will be a significant improvement over GPT-3 because it will be trained on more and better-quality data, it will be bigger, and it might be using better word encoding methods. Aside from that, we haven't even tried fine-tuning GPT-3; I'd expect that to result in a significant improvement over the generic GPT-3. Not to mention various potential architectural and conceptual improvements, such as an ability to query external knowledge bases (e.g. Wikipedia, or just performing a Google search), or an ability to constrain its output based on an elaborate profile (e.g. assuming a specific personality). There are most likely people at OpenAI working on GPT-4 right now, and I'm sure Google, Microsoft, Facebook, etc. are experimenting with something equally ambitious.

I agree that GPT writing is not "good" if we compare it to high-quality human writing. However, it is qualitatively getting better and better with each iteration. At some point, perhaps as soon as a couple of years from now, it will become consistent and coherent enough to be interesting and/or useful to regular people. Just like self-driving cars in a couple of years might reach the point where the risk of dying is higher when you drive than when the AI drives you.


From the POV of an AI practitioner, there is one and only one reason I remain unimpressed with GPT-3.

It is nothing more than one big transformer. At a technical level, it does nothing impressive, apart from throwing money at a problem.

So in that sense, having already been impressed by Transformers and then ELMo/BERT/GPT-1 (which made massive pretraining popular), there is nothing in GPT-3 that is particularly impressive outside of Transformers and massive pre-training, both of which are well known in the community.

So, yeah, I am very impressed by how well transformers scale. But, idk if I'd give OpenAI any credit for that.


The novelty of GPT-3 is its few-shot learning capability. GPT-3 shows a new, previously unknown, and, most importantly, extremely useful property of very large transformers trained on text -- that they can learn to do new things quickly. There isn't any ML researcher on record who predicted it.


> There isn't any ML researcher on record who predicted it.

That's just absurd - this was an obvious end result for language models. NLP researchers knew that something like this was absolutely possible; my professor predicted it something like 3 years ago.


Yes, the emergent ability to understand commands mixed in with examples is pretty crazy.


"People who say AGI is a hundred years away also said GO was 50 years away" this is not true. The major skeptics never said this. The point skeptics were making was that benchmarks for chess (IBM), Jeopardy!(IBM), GO (Google), Dota 2 (OpenAI) and all the rest are poor benchmarks for AI. IBM Watson beat the best human at Jeopardy! a decade ago, yet NLP is trash, and Watson failed to provide commercial value (probably because it sucks). I'm unimpressed by GLT-3, to me nothing fundamentally new was accomplished, they just brute forced on a bigger computer. I expect this go to the same way as IBM Watson.


One expert predicted in mid-2014 [1] that a world-class Go AI was 10 years away. AlphaGo defeated Lee Sedol 18 months later.

It's not 50 years, but it does illustrate just how fraught these predictions can be and how quickly the state of the art can advance beyond even an insider's well-calibrated expectations.

(To his credit the expert here immediately followed up his prediction with, "But I do not like to make predictions.")

[1] https://www.wired.com/2014/05/the-world-of-computer-go/


People also predicted that by 2000 we would have flying cars. The moral of the story is that future prediction is very difficult and often inaccurate for things we are not close to achieving, not that things always come sooner than predicted.


We have flying cars. What we don't have is a flying car that is ready for mass adoption. The biggest problem is high cost both for the car and its energy requirements, followed by safety and the huge air traffic control problem they would create.


As a counterpoint, when AlphaGo came out I was surprised it took so long, because Go really seems like a good use case for machine learning supremacy: 1) the Go board looks particularly amenable to convolutional analysis, and 2) it's abstract enough for humans to have missed critical strategies, even after centuries.

I wish I were on record on that, so take what I say with a grain of salt


Ultimately the greatest factor is stereotypes about inventors. The OpenAI team doesn't remind anyone of, say, the Manhattan Project team in any way. They don't look, act, or sound like Steve Jobs and Steve Wozniak. Elon Musk does, and that's why I think people get so excited about rockets that land themselves. That is honestly pretty cool. Very few people pull stuff like that off. But is it less cool than GPT-3?

Sam Altman and Greg Brockman were also online payments entrepreneurs like Elon Musk, so it's not like it was about their background / prior history. It's also not about sounding too grandiose or delusional; Musk says way crazier stuff on his Twitter than Greg Brockman has ever said in his life. It's clearly not about tempering expectations. Musk promises self-driving cars every year!

So I think there are a lot of factors that impact the public consciousness about how cool or groundbreaking a discovery is. Personally I think the core problem is the contrivance of it all, that the OpenAI people think so much about what they say and do and Elon does not at all, and that kind of measured, Machiavellian strategizing is incommensurable with public demand for celebrity.

What about objective science? There was this striking Google Research paper on quantum computing that put the guy who made “some pipes” first author. I sort of understand abstractly why that’s so important but it’s hard for me to express to you precisely how big of a discovery that is. Craig Gentry comes to mind also as someone who really invented some new math and got some top accolades from the academy for it. There is some stereotyping at play here that may favor the OpenAI team after all - they certainly LOOK more like Craig Gentry or pipes guy than Elon Musk does. That’s a good thing so I guess in the pursuit of actually advancing human knowledge it doesn’t really matter what a bunch of sesame grinders on Hacker News, Twitter and Wired think.


What would be a good benchmark? In particular, is there an accomplishment that would be: (i) impressive, and clearly a major leap beyond what we have now in a way that GPT-3 isn't, but (ii) not yet full-blown AGI?


How about driving a car without killing people in ways a human driver would never kill people (e.g. mistaking a sideways semi truck for open sky)?

That's a valuable benchmark loads of companies are aiming for, but it's not a full AGI.


Maybe nothing? "Search engines through training data" are already the state of the art, and have well-documented and widely mocked failure cases.

Unless someone comes along with a more clever mechanism to pretend it’s learning like humans, you’re not looking at a path towards AGI in my opinion.


> you’re not looking at a path towards AGI in my opinion

What I'm trying (and apparently failing?) to ask is, what would a step on the path towards AGI look like? What could an AI accomplish that would make you say "GPT-3 and such were merely search engines through training data, but this is clearly a step in the right direction"?


> What I'm trying (and apparently failing?) to ask is, what would a step on the path towards AGI look like?

That's an honest and great question. My personal answer would be to have a program do something it was never trained to do and that could never have existed in the corpus. And then have it do another thing it was never trained to do, and so on.

If GPT-3 could, say, 1) never receive any more input data or training, then 2) read an instruction manual for a novel game that shows up a few years from now (so it can't be replicated from the corpus), 3) play that game, and 4) improve at that game, that would be "general" IMO. It would mean there's something fundamental in its understanding of knowledge, because it could do new things that would have been impossible for it to mimic.

The more things such a model could do, even crummily, the more that would go towards it being a "general" intelligence. If it could get better at games, trade stocks and make money, fly a drone, etc. in a mediocre way, that would be far more impressive to me than a program that could do any of those things individually well.


If a program can do what you described, would it be considered a human-level AI yet? Or would there be some other missing capabilities still? This is an honest question.

I intentionally don’t use the term AGI here because human intelligence may not be that general.


> human intelligence may not be that general

Humans have more of an ability to generalize (i.e. learn and then apply abstractions) than anything else we have available to compare to.

> would it be considered a human-level AI yet

Not necessarily human level, but certainly general.

Dogs don't appear to attain a human level of intelligence but they do seem to be capable of rudimentary reasoning about specific topics. Primates are able to learn a limited subset of sign language; they also seem to be capable of basic political maneuvering. Orca whales exhibit complex cultural behaviors and employ highly coordinated teamwork when hunting.

None of those examples appear (to me at least) to be anywhere near human level, but they all (to me) appear to exhibit at least some ability to generalize.


From grandparent post:

> 2) read an instruction manual for a novel game that shows up a few years from now (so it can't be replicated from the corpus), and 3) plays that game, and 4) improves at that game, that would be "general" imo.

I would say that learning a new simple language, basic political maneuvering, and coordinated teamwork might be required to play games well in general, if we don't exclude any particular genre of games.

Complex cultural behaviors might not be required to play most games, however.

I think human intelligence is actually not very 'general' because most humans have trouble learning & understanding certain things well. Examples include general relativity and quantum mechanics and, some may argue, even college-level "elementary mathematics".


Give it an algebra book and ask it to solve the exercises at the end of the chapter. If it has no idea how to solve a particular task, it should say "give me a hand!" and be able to understand a hint. How does that sound?


That makes me think we are closer rather than farther away because all that would be needed is for this model to recognize the problem space in a question:

“Oh, you are asking a math question, you know a human doesn’t calculate math in their language processing sections of their brain right, neither do I... here is your answer”

If we allowed the response to delegate commands, it could start to achieve some crazy stuff.
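
For what it's worth, here is a minimal sketch of that delegation idea (entirely hypothetical -- the function names and the calculator backend are mine, and this is not anything GPT-3 actually does): route anything that looks like arithmetic to a real calculator and only let the language model handle the rest.

    import re

    def language_model(prompt):
        # Stand-in for a call to a text model such as GPT-3 (hypothetical stub).
        return "[model-generated answer to: %r]" % prompt

    def answer(prompt):
        # If the prompt is plain arithmetic, delegate to a calculator
        # instead of letting the language model guess at the digits.
        if re.fullmatch(r"[\d\s+\-*/().]+", prompt.strip()):
            return str(eval(prompt.strip()))
        return language_model(prompt)

    print(answer("12 * (3 + 4)"))        # -> 84
    print(answer("Who wrote Hamlet?"))   # falls through to the model stub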


> probably because it sucks

It's not technically bad, but it requires domain experts to feed it domain-relevant data, and it's only as good as that setup phase, which is extremely long, expensive and convoluted. So yeah, it sucks, but as a product.


Whenever someone talks about how AI isn't advancing, I think of this XKCD comic from not too long ago (maybe 2014-ish?), in which "check whether a photo is of a bird" was classified as "virtually impossible".

https://xkcd.com/1425/


Read the alt-text. Photo recognition wasn't impossible in 2014; it was impossible in the 1960s, and the 2014-era author was marvelling at how far we'd come / making a joke of how some seemingly simple problems are hard.


First, I remember the demos for GPT-2. Later, when it was available and I could try it myself, I was kind of disappointed in comparison.

Second, while it is impressive, we are also finding out at the same time just how much more is needed to make something of value. It's like speech recognition in 1995. Mostly there, but in the end it took another 20 years to actually work.

But still, it‘s exciting.


I am really impressed with it as a natural language engine and query system. I am not convinced it "understands" anything or could perform actual intellectual work, but that doesn't diminish it as what it is.

I'm also really worried about it. When I think of what it will likely be used for I think of spam, automated propaganda on social media, mass manipulation, and other unsavory things. It's like the textual equivalent of deep fakes. It's no longer possible to know if someone online is even human.

I am thinking "AI assisted demagoguery" and "con artistry at scale."


> And yet nobody cares because it isn’t full blown AGI. That’s not the point. The point is that we are getting unintuitive and unexpected results.

I don't think these are unintuitive or unexpected results. They seem exactly like what you'd get when you throw huge amounts of compute power at model generation and memorize gigantic amounts of stuff that humans have already come up with.

A very basic Markov model can come up with content that seems surprisingly like something a human would say. If anything, what all of the OpenAI hype should confirm is just how predictable and regular human language is.
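
For anyone who hasn't played with one, this is roughly what "very basic" means here -- a first-order, word-level Markov chain is only a few lines (the toy corpus and code below are mine, just for illustration):

    import random
    from collections import defaultdict

    corpus = ("the model predicts the next word and the next word follows "
              "the previous word so the text sounds like the training text").split()

    # First-order transition table: word -> list of words observed after it.
    transitions = defaultdict(list)
    for prev, nxt in zip(corpus, corpus[1:]):
        transitions[prev].append(nxt)

    word = random.choice(corpus)
    output = [word]
    for _ in range(15):
        followers = transitions.get(word)
        if not followers:
            break
        word = random.choice(followers)
        output.append(word)

    print(" ".join(output))  # locally plausible, globally meandering text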


> They seem exactly like what you'd get when you throw huge amounts of compute power

I disagree with that.

The one/few-shot ability of the model is much, much better than what I would have imagined, and I know very few people in the field who saw GPT-3 and were like "yep, exactly what I thought".


> A very basic Markov model can come up with content that seem surprisingly like a human would say.

This is false. Natural language involves long-term dependencies that are beyond the ability of any Markov model to handle. GPT-2 and -3 can reproduce those dependencies reliably.

> If anything, what all of the OpenAI hype should confirm is just how predictable and regular human language is.

Linguists have been trying to write down formal grammars for natural languages since the 1950s. Some of the brightest people around have essentially devoted their lives to this task. And yet no one has ever produced a complete grammar of any human language. So no, human language is not predictable and regular, at least not in any way that we know how to describe formally.


W.r.t. the Markov model, I just mean that something even that trivial can sound lifelike. It's not surprising that throwing billions of times more data at the problem with more structure can make the parroting better.

> So no, human language is not predictable and regular, at least not in any way that we know how to describe formally.

I don't know what to say about this other than perhaps the NLP community has been a little too "academic" here and I disagree.

Grade schoolers are routinely forced to make those boring diagrams for their particular language, and language has tremendous structure. When you combine that structure (function) with the data of billions of real-world people talking, it's not surprising that the curve fit looks like the real thing. Given how powerful things like word2vec, which do very, very simple things like distance diffs between words, have been, it's not surprising to me that the state of the art is doing this.


It is surprising! You could throw all the data of the entire human race at a Markov model and it would not sound a tenth as good as even GPT-2. Transformers are simply in a new class.


Were you alive in 2010?


Right... but at the end of the day, that's what intelligence is. You are just an interconnected model of billions of neurons that has been trained on millions of facts created by other humans. Except this model can vastly exceed the amount of factual knowledge that you could possibly absorb over your entire lifetime.


> You are just an interconnected model of billions of neurons that has been trained on millions of facts created by other humans.

...but I didn't pop out of the womb that way, and as you said, over my lifetime I will read less than 1 millionth of the data that GPT-3 was trained on. GPT-2 had a better compression ratio than GPT-3, and I'm sure a GPT-4 will have a worse compression ratio than GPT-3 on the road we're on.

Rote memorization is hardly what I'd call intelligence. But that's what we're doing. If these things were becoming more intelligent over time, they'd need less training data per unit insight. This isn't a dismissal of the impressiveness of the algorithms, and I'm not suggesting the classic AI effect "changing the goalposts over time." I fundamentally believe we're kicking a goal in our own team's net. This is backwards.


Exactly. Even GPT-3 is not creating new content. It is just permuting existing content while retaining some level of coherence. I don't reason by repeating various tidbits I've read in books in random permutations. I reason by thinking abstractly and logically, with a creative insight here and there. Nothing at all like a Markov model trained on a massive corpus. GPT-3 may give the appearance of intelligent thought, but appearance is not reality.


> I don't reason by repeating various tidbits I've read in books in random permutations.

Are you sure?


Yes, I would fail any sort of math exam if I used the GPT-3 model.


GPT-3 is nothing like a Markov model.


Same sort of generative probabilistic model idea.


All creative work is derivative.


Not all derivative work is creative.


I can't help but feel that what GPT is really teaching us about is language, not AI.


IMO, language is one of the purest forms of thinking / consciousness. What is our brain doing that makes it different?


This brings to mind the debates between Frank Ramsey and Ludwig Wittgenstein.

Episode: https://philosophybites.libsyn.com/cheryl-misak-on-frank-ram... Media: https://traffic.libsyn.com/secure/philosophybites/Cheryl_Mis...


The problem is not only that this is "not full-blown AGI". The problem is that, if you understand how this works, it's not "intelligence" at all (using the layperson meaning of the word, not the marketing term), and it's not even on the way to getting us there.


It reminds me of a pithy remark by someone I read a while ago, which was (paraphrased): "Any time someone pushes forward AI as a field, people will almost always remark: 'but that's not real AI.'"

It's true, the mundanity quickly settles in, and we look to the next 'impossible hurdle' and disregard the fact that only a few years ago, natural language generation like this was impossible.


> "Any time someone pushes forward AI as a field, people will almost alway remark: 'but that's not real AI.'"

This statement reveals a widespread, and in my opinion not entirely correct, assumption that advances in the ML field mean we're actually pushing forward on AI. It also implies a belief that the pre-1970s people were somehow less right than the 2000s+ ML crowd, when a lot of ML's success is related to compute power that simply did not exist in the 1970s.

ML computational machines to transform inputs->outputs are great, but there's no compelling reason to believe they're intrinsic to intelligence, as opposed to functioning more like an organ.

We might be making great image classifier "eyes", or spam-filtering "noses", or music-generating "ears". But it's not clear to me that will incrementally get us closer to an intelligent "brain", even if all those tools are necessary to feed into one.


I disagree. Yes, it is just the decoder of a transformer. But it looks like we are really close, with some tweaks to the network structure, reward function design and inputs / outputs. At the same time, GPT-3 also shows how far away we are at the hardware level.

Let me put it this way: I don't know how challenging the rest is going to be, but it surely looks like we are on the right path finally.


It fundamentally has no _reasoning_. There is no AGI without reasoning.


What makes you think this? The fact that it can produce working code from a prompt in some cases shows rudimentary non-trivial reasoning. Hell, GPT-2 demonstrated rudimentary reasoning of the trivial sort.


> The fact that it can produce working code from a prompt in some cases shows rudimentary non-trivial reasoning.

It doesn't at all. It indicates that it read Stack Overflow at some point, and that on a particular user run, it replayed that encoded knowledge. (I'd also argue it shows the banality of most React tutorials, but that's perhaps a separate issue.)

Quite a lot of these impressive achievements boil down to: "Isn't it super cool that people are smart and put things on the internet that can be found later?!"

I don't want to trivialize this stuff because the people who made it are smarter than I will ever be and worked very hard. That said, I think it's valid for mere mortals like myself to question whether or not this OpenAI search engine is really an advancement. It also grates on me a bit when everybody who has a criticism of the field is treated like a know-nothing Luddite. The first AI winter was caused by disillusionment with industry claims vs reality of what could be accomplished. 2020 is looking very similar to me personally. We've thrown oodles of cash and billions of times more hardware at this than we did the first time around, and the most use we've gotten out of "AI" is really ML: classifiers. They're super useful little machines, but they're sensors when you get right down to it. AI reality should match its hype, or it should have less hype (e.g. not implying GPT-3 understands how to write general software).


> It doesn't at all.

Assertions aren't particularly useful in this discussion. Nothing you said supports your claim that GPT-3 doesn't show any capacity for reasoning. The fact that GPT-3 can create working strings of source code from prompts it (presumably) hasn't seen before means it can compose individual programming elements into a coherent whole. If it looks like a duck and quacks like a duck, then it just might be a duck.

Here's an example of rudimentary reasoning I saw from GPT-2 in the context of some company that fine-tuned GPT-2 for code completion (made up example but captures the gist of the response):

[if (variable == true) { print("this sentence is true") } else] { print("this sentence is false") }

Here's an example I tested using talktotransformer.com: [If cars go "vroom" and my Ford is a car then my Ford] will also go "vroom"...

The bracketed parts were the prompt. If this isn't an example of rudimentary reasoning then I don't know what is. If your response is that this is just statistics, then you'll have to explain how the workings of human brains aren't ultimately "just statistics" at some level of description.


> working strings of source code from prompts it (presumably) hasn't seen before

I'm saying that "presumably" is wrong, especially given what it was: a simple React program. It would not surprise me if that kind of shared structure and text is all over the corpus.

This can be tested by making more and more sophisticated programs in different languages and seeing how often it returns the correct result. I don't really care, because it can't reliably do basic arithmetic if the numbers are in different ranges. That is a dead giveaway that it hasn't learned the fundamental structure. If it hasn't learned that, it hasn't learned programming.

The examples are not really that impressive either. They are boolean logic. That a model like this can do copy-pasta and encode simple boolean logic and if-else is... well... underwhelming. Stuff like that has been happening for a long time with these models, and no one has made claims that the models were "reasoning".


The React programming example isn't the only example of GPT-3 writing code. There was an example of it writing Python programs going around before they opened up the API. It was impressive. There was no reason to think it had seen the exact examples before.

Also, it isn't the case that one needs to be perfect at algorithmic thinking to be capable of some amount of reasoning. I don't claim that GPT-3 is perfect, but that it's not just copying and pasting pieces of text it has seen before. It is coming up with new sequences of text based on the structure of the surrounding text and the prompt, in a manner that indicates it has a representation (albeit imperfect) of the semantic properties of the text and can compose them in meaningful ways. Handwavy responses do nothing to undermine the apparent novelty it creates.

> encode simple boolean logic and if-else is... well... underwhelming. Stuff like that has been happening for a long time with these models, and no one has made claims that the models were "reasoning".

Seems like you're just moving the goalposts, as always happens when it comes to AI advances. What do you take to be "reasoning" if that isn't an example of it?


Couldn’t you yourself learn how to do that, in a foreign language, without knowing what the words mean?


Logic is sensitive to the meaning of words to a degree and so if I can pick out the context to apply certain deductive rules, then I know what the relevant words mean, at least to the degree that they indicate logical structure.

It's possible that a program could learn when to apply certain rules based on its own if-then statements and bypass understanding, but that's not the architecture of GPT-3. If it learns the statistical/structural properties of a string of text such that it can apply the correct logical transformations based on context, the default assumption should be that it has some rudimentary understanding of the logical structure.


> The fact that it can produce working code from a prompt in some cases shows rudimentary non-trivial reasoning.

No, "in some cases" doesn't show reasoning. It is, arguably, weak evidence for reasoning that equally supports other explanations. With the right input corpus, a Markov chain generator will produce working code from a prompt "in some cases", and I don't think anyone has a weak enough definition of reasoning to admit Markov chains.


Of course, we need to quantify "in some cases" for your argument to hold. Humans aren't perfect reasoners, for example. The examples I saw were impressive and were mostly correct, apart from some minor syntax errors or edge cases. This wasn't a Markov chain generator where the "interesting" responses were cherry-picked from a pile of nonsense.


So under your logic, we won't have any idea that we are close to having AGI until we have a machine that can reason... which is AGI. You are missing the big picture.


There's clearly no planning for a solution; I think that's what GP is getting at.


You don’t understand how it works. You can’t explain how the model works. Go ahead and correct me if I’m wrong.


Impressive to a human is a highly subjective property. Humans generally consider the understanding of language to be an intelligent trait, yet tend to take for granted basic vision, which took much longer to develop evolutionarily.

Neural networks can approximate arbitrary functions, and the ability to efficiently optimize neural network parameters over high-dimensional non-convex landscapes has been well established for years. What typically limits pushing the state of the art is the availability of "labeled" data and the finances required for very large scale trainings. With NLP, there are huge datasets available which are effectively in the form of supervised data, since humans have 1) invented a meaningful and descriptive language and 2) generated hundreds of trillions of words in the form of coherent sentences and storylines. The task of predicting a missing word is a well-defined supervised task for which there is then effectively infinite "labeled" data. Couple these facts with a large amount of compute credits and the right architecture and you get GPT-3.

The results are really cool but in my opinion scientifically unsurprising. GPT-3 is effectively an example of just how far we can currently push supervised deep learning, and even if we could get truly human-level language understanding asymptotically with this method, it may not get us much closer to AGI, if only because not every application will have this much data available, certainly not in a neatly packaged supervised representation like language (such as computer vision). While approaches like GPT-3 will continue to improve the state of the art and teach us new things by essentially treating NLP or other problems as an "overdetermined" system of equations, these approaches are subject to diminishing returns, and the path to AGI may well require cracking that human ability to create and learn with a vastly better sample complexity, effectively operating in a completely different "under-sampled" regime.
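
To illustrate the "free supervision" point with a deliberately simplified toy (my own example; it ignores tokenization, batching, and everything else a real GPT pipeline does): every position in ordinary text yields a (context, next-word) training pair, so the labels come along for free.

    text = "language models are trained to predict the next word".split()

    context_window = 3
    pairs = [(text[max(0, i - context_window):i], text[i])
             for i in range(1, len(text))]

    for context, target in pairs:
        print(context, "->", target)
    # e.g. ['models', 'are', 'trained'] -> to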


I mean, I find it fascinating that a human can actually build and work with a tool that they can't actually understand.

Even now, you could, if you wanted to, rip apart your computer even to the CPU level and understand how it works. Even analyzing the code. Sure, it might take you ten years.

But you would NEVER be able to understand how GPT-3 works... it's just too complex.


I'm no expert, but tools such as SHAP and DeepLift can give you insight into what activates a network. It's probably not possible to fully inspect a network with billions of parameters; however, that's to be expected, since I don't think explainable ML is an established field yet.
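
For context, here is the kind of inspection I mean, as a rough sketch on a toy model (the data and model are made up, and TreeExplainer is just one of the explainers the SHAP library offers):

    import numpy as np
    import shap
    from sklearn.ensemble import RandomForestClassifier

    # Toy data: 200 samples, 4 features; the label only depends on the first two.
    rng = np.random.RandomState(0)
    X = rng.rand(200, 4)
    y = (X[:, 0] + X[:, 1] > 1).astype(int)

    model = RandomForestClassifier(random_state=0).fit(X, y)

    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X[:5])
    print(shap_values)  # per-feature contributions to each of the 5 predictions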

But also think about it from another angle: it doesn't seem too hard to explain why people say what they say. We can usually put ourselves in the shoes of the other person if we try hard enough. However, if we say there's no way for us to explain GPT-3, it just shows how fundamentally different it is from the human mind.


Agreed. Even if we put research into deconstructing and attempting to understand how deep neural networks work in tasks such as autonomous driving, the fact is that these tasks are too complex to even logically describe.

That said, I do think it is possible to come up with robust guarantees to these methods.


Really? I bet in a few years we'll have tools that can inspect a model and tell you exactly what parts do what function and how they do it.


> People who say AGI is a hundred years away also said GO was 50 years away and they certainly didn’t predict anything even close to what we are seeing now so why is everyone believing them?

Do you know why AlphaGo decided to perform move 37 in Game 2 against Lee Sedol? Can AlphaGo explain why it made that move?

If we don't know why it made that decision, then it is a mysterious black box hiding its decisions, taking in an input to produce an output, and that is still a problem. This isn't useful to researchers in understanding the decisions of these AI systems, especially for AGI. Hence, this problem also applies to GPT-3.

While it is still an advancement in NLP, I'm more interested in getting a super accurate or generative AI system to explain itself than one that cannot.


> While it is still an advancement in NLP, I'm more interested in getting a super accurate or generative AI system to explain itself than one that cannot

Why? People can explain themselves because we rationalize our actions, not necessarily because we know why we did something. I don't understand why we hold AI to such a high standard.


There seems to be an overabundance of negative sentiment towards deep learning among HN commentators, but whenever I hear the reasons behind the pessimism I'm usually unimpressed.


For the same reason we have psychiatrists: when the AI makes a mistake, you need to fix it, work around it, prevent it, or, if all else fails, protect others from it.

It's all fun and games when AI does trivia. When AI gets plugged into places that can result in tangible real-world consequences (e.g. airport screening), you need to be able to reason about the system so it gets monotonically better over time.


GPT-3 is an impressive technical feat and the pinnacle of the current line of research.

However, if you remove the technically-colored glasses and boil down what it is and what it does, it's a regurgitation of existing data that it has been fed; it has no understanding of the data itself beyond linguistic patterns.

It's not going to find correlations where there were none, and it's not going to actually discover new data. It will find unexpected correlations between data, but there's zero indication of whether these correlations bear any significance until a human goes and validates the prompt, and it can generate an infinite number of these, making the odds of discovering significant new ideas pretty slim.


And precisely none of what you just said addresses my point.


> The point is that we are getting unintuitive and unexpected results.

> it will find unexpected correlations between data but there's zero indication whether these correlation bear any significance until a human goes validate the prompt, and it can generate infinite of these, making the discovery of significant new ideas pretty slim

seems a pretty direct response tbh


[flagged]


It isn't really helpful or conducive to an interesting discussion that you are not detailing what the point is, even when "the point is this" gets directly quoted, while clarifying neither the point nor why the reply doesn't apply.


These results are literally unexpected. The consensus of all the experts in 2010 was that none of this could possibly happen in the next ten years. The way these results were arrived at is unintuitive, which probably has to do with why they were so utterly unexpected. The broad effort to mine "algorithm space," which includes many different ML agents and other things, is producing results that were not expected. This is just a fact. There is no way around it. Just accept it and move on.

It's obvious that the surprises will keep coming. We are slowly closing in on the algorithms that bring the silicon to its full potential. The real question is: what is at the bottom? We keep digging, and with more compute and more data, eventually we will find that the stuff near the bottom is important and dangerous.

The fact that GPT-3 has some particular flaw has absolutely no logical intersection with this. It's basically unrelated.


What's impressive about it? It's bigger, that's cool. What does it actually mean in the real world?

I see nothing to get excited about at this point.


I'll tell you why I'm not impressed. We can't keep doubling, er, increasing model size by two orders of magnitude forever for iterative improvements in quality. (Maybe this is a Malthusian law of the nothing-but-deep-learning AI approach: parameters increase geometrically, quality increases arithmetically.)

This is an achievement, but it is not doing more with less. When someone refines GPT-3 down to a form that can be run on a regular machine again (hint: probably with a new architecture), then that will be genuinely exciting.

I also want to address this point directly:

> We are digging deeper and deeper into “algorithm space” and we keep hitting stuff that we thought was impossible and it’s going to keep happening and it’s going to lead very quickly to things that are too important and dangerous to dismiss.

I hope the above convinced you that this is basically not possible with current approaches. OpenAI spent approximately $12M just in computation cost training this model (no one knows how much they spent on training previous iterations that did not succeed). Running this at scale only for inference is also an extremely expensive proposition (I've joked with others about how many tenths of a degree Celsius GPT-3aaS will contribute to climate change). If we extrapolate out, GPT-4 will be a billion-dollar model with tens of trillions of parameters, and we might get a dozen pages or so of generated text that may or may not resemble 8chan!

> People who say AGI is a hundred years away also said GO was 50 years away and they certainly didn’t predict anything even close to what we are seeing now so why is everyone believing them?

Isn't this a bit too ad hominem? And not even particularly good ad hominem. I'm sure there existed people on the eve of AlphaGo saying it would be another 50 years, but there's no evidence that the set of those people is the same as those saying AGI is 50-100 years away. How many people made this particular claim? I, for one, made no predictions about Go's feasibility (mostly because I have never thought that playing games is synonymous with intelligence and so mostly didn't find it interesting) but absolutely subscribe to the 50-100 year timeline for AGI.

Think about it like this: Go is a well-defined problem with a well-defined success criterion. AGI has neither of those properties. We don't even understand intelligence well enough to answer those questions. It took life billions of years to get to landing on the Moon and building GPT-3. It's not far-fetched that it'll take us at least 100 more years using directed research (as opposed to randomness) to learn those same lessons.


It's something pretty unique to ML research. The goal posts keep moving whenever an advancement is made. Every time ML achieves something that was considered impossible X years ago, people look at it and say, actually, that's nothing special, the real challenge is <new goalpost>.

I'm pretty sure even as we cross into AGI, people will react the same way. And only then will some stop and realize that we just wrote off our own intelligence as nothing special, a parlour trick.


There are laws to prevent felons from having guns. Most people who murder with guns are felons. Some people want to make more laws to restrict gun ownership even further. So in the future, maybe people who haven't gotten a training certificate and who have not passed various tests cannot own a gun. In that case, people who carry around pistols might be seen almost as an extension of the police force, who only become active in the most extreme situations where police are not available, because they are so thoroughly screened and trained. The man who shot a terrorist in the head in a Texas church comes to mind.

In this scenario, the number of guns carried by criminals would stay the same. And the number of people murdered by guns would stay the same. You know why? Because stop and frisk is racist. According to everyone, stopping people for 2 minutes to check them for guns is abhorrent, racist and evil. So criminals will always have guns in this country. There is no reason to pass laws one way or another.

Some say that stop and frisk is unconstitutional. I think I agree with this. So it follows that it is embedded in the constitution that criminals will be carrying around pistols. There is no changing this. I think that the only reasonable response is to allow ordinary people who are not criminals, not obviously insane and etc to carry pistols. It’s written in the constitution that we are allowed to do something like that. So if liberals want to be constitutionalists about stop and frisk, then conservatives have the right to be constitutionalists about carrying guns. If criminals are allowed to carry guns in practice then it is only reasonable to let their victims carry guns too.


Or, y’know, like everywhere else, criminals likely wouldn’t carry guns and certainly wouldn’t shoot people because “hey, I got caught” is not usually an imminent life and death situation and they’d risk a significantly higher sentence.

Of course, this only works when you don’t hand out 10+ year prison sentences for property crime in the first place.


How is this connected to the post?


In the context of this post, "the constitution" would be:

https://www.admin.ch/opc/en/classified-compilation/19995395/...


I didn't read the article, but I met a Swiss guy at a hostel the other day. He told me how citizens can prompt a vote on almost any issue — almost a direct democracy, in his words. He said that a vote was successfully prompted to limit the gap between the salaries of CEOs and low-level employees. It was voted down. My only remark was that if the United States had a system like that, the country would promptly destroy itself.


How would the country destroy itself?


By voting through feel-good proposals that are ultimately misinformed and destructive. It’s happening anyway through our indirect democracy but at a slower rate.


That is not what is happening in Switzerland. There isn't really a reason why it would play out differently in the USA.

The beginning might be turbulent, but that should change after a few years.


I would guess the Swiss population is generally more educated on average compared to the American public, which to be fair is easier to achieve with a population of 9M vs 328M. A sufficiently educated public is a requirement for a more direct democracy to have desirable results.

America really needs an education system overhaul; teachers are often paid close to fast-food restaurant staff levels of wages.


If you have a working education system, it will work exactly the same for 9 million and for 300 million people. It's not about size but about the approach.


The idea that someone would opt out of downloading Elon Musk's or Jeff Bezos' DMs is insane. Completely and perfectly insane. Not to mention the other people. Even if just in terms of profit, clearly the DMs of the richest man in the world have enough value to just click download. It seems like the probability of this guy passing it up due to lack of interest is very small. Slightly more likely is that he was overwhelmed by the massive implications of having this access and simply didn't think to do it in the rush to do something before his source chickened out or realized what was going on. And most likely to me is that he simply couldn't do it because of a higher level of security associated with verified accounts, hence why none of the accounts were verified. He wasn't able to do anything with Trump's Twitter, so I suspect that for certain high-profile accounts there is a much higher level of security that this guy's source couldn't override.


I'm not sure what you're thinking, but it's perfectly reasonable.

People like the ones you're talking about don't communicate anything of value over Twitter. Bezos only follows his ex-wife, who doesn't follow him back; he barely uses Twitter and would be unlikely to have any DMs at all. After the Saudi hack, I would be surprised if he has much of anything installed on his phone.

The only real reason to hack celebrity accounts in this instance, and which they should have done, would be to deflect attention from the accounts they actually went after.


I wouldn't be so sure with Musk, for example. He even met his partner via DM.


Yeah, Musk spends a lot of time on Twitter, so his DMs are definitely loaded, but I also agree that Bezos's are definitely empty. Still, I'm sure there are some verified accounts they would've downloaded the data from if they could, so it makes me think they couldn't easily.


> Bezos only follows his ex-wife who doesn't follow him back

I honestly didn't believe you. But it's true. That's... kinda weird.


Probably a dead account he no longer accessed even before the divorce, hence the awkward follow status.


Hmm, I don't think so. The account has tweeted as recently as Feb of this year, but his divorce was finalized in July 2019.


I always thought high-profile accounts were run by media teams. I doubt the account owners know the credentials themselves or have direct access in most cases.


I’m pretty positive Elon’s account is NOT run by a media team :D


His media team works behind the counter at the pot dispensary.


If you think Musk or Trump have any sense of restraint, much less opsec, do I ever have a bridge to sell you.


Where do you think you will find more info -- the DMs of a PR account or someone's private alt? Or the DMs of a Twitter celebrity or the DMs of a hedge fund manager or a member of the board of directors of a bank?


I suspect that hedge fund managers and bank board members are considerably less likely to use DMs as a means of primary communication than Twitter celebrities. Private alt accounts would be very interesting, but you'd need to know who they are....


I'm unfamiliar with Twitter's interface, but isn't it entirely possible that things like DMs are available (when you're able to log in as the account in question) and scrapeable without directly using the "download Twitter data" tool?


Then it's even weirder, because this looks like something that required a certain amount of planning.


I have no idea what you're talking about. Rust is not necessarily faster than C or anything else that is not garbage collected. It is almost as fast, and it has memory safety and, as is made apparent in this thread, many advantages in terms of usability and undefined behavior. Being the quickest is not something Rust is known for, so you wouldn't be saving milliseconds. And calling an API is pretty simple, so the security aspect wouldn't be of much use. So you are wrong and the guy you are responding to is right.


I also got PTSD in a domestic setting. Please remember it gets better with time. I'm sure you already know this, but MDMA has been shown to be effective in treating PTSD. I was personally able to get rid of most of the symptoms with large doses of propranolol. Try both of those before killing yourself.

