By writing, you are voting on the future of the Shoggoth using one of the few currencies it acknowledges: tokens it has to predict. If you aren't writing, you are abdicating the future or your role in it. If you think it's enough to just be a good citizen, to vote for your favorite politician, to pick up litter and recycle, the future doesn't care about you.
These AI predictions never, ever seem to factor in how actual humans will determine whether AI-generated media succeeds in replacing human-made media, or whether it will succeed at all. It is all very theoretical and, to me, shows a fundamental flaw in this style of "sit in a room reading papers/books and make supposedly rational conclusions about the future of the world."
A good example is: today, right now, it is a negative thing for your project to be known as AI-generated. The window of time when it was trendy and cool has largely passed. Having an obviously AI-generated header image on your blog post was cool two years ago, but now it is passé and marks you as behind the trends.
And so for the prediction that everything gets swept up by an ultra-intelligent AI that subsequently replaces human-made creations, essays, writings, videos, etc., I am doubtful. Just because it will have the ability to do so doesn't mean that it will be done, or that anyone is going to care.
It seems vastly more likely to me that we'll end up with a solid way of verifying humanity – and thus an economy of attention still focused on real people – and a graveyard of AI-generated junk that no one interacts with at all.
I've been writing for decades with the belief I was training a future AI and
used to say that the Turing test wasn't mysterious at all because it was a solved problem in economics, in the form of an indifference curve showing where people cared whether or not they were dealing with a person or a machine.
the argument against AI taking over is that we organize around symbols and narratives and are hypersensitive to waning or inferior memes; therefore AI would need to reinvent itself as "not-AI" every time so we don't learn to categorize it as slop.
I might agree, but consider an analogy in music: some limited variations stay dominant for decades, and there are precedents for generating dominant memes from slop that entrain millions of minds for entire lifetimes. Pop stars are slop from an industry machine that is indistinguishable from AI, and as evidence, current AI can simulate their entire catalogs of meaning. The TV Tropes website even identifies all the elements of cultural slop people should be immune to, but there are still millions of people walking around living out characters and narratives they received from pop-slop.
there will absolutely be a long tail of people whose ontology is shaped by AI slop, just like there is a long tail of people whose ontology is shaped by music, tv, and movies today. that's as close to being swept up in an AI simulation as anything, and perhaps a lot more subtle. or maybe we'll just shake it off.
That is a good point, and fundamentally I agree that these big budget pop star machines do function in a way analogous to an AI, and that we're arguing metaphysics here.
But even if a future AI becomes like this, that doesn't prevent independent writers (like gwern) from still having a unique, non-assimilated voice where they write original content. The arguments tend to be "AI will eat everything, therefore get your writing out there now" and not "this will be a big thing, but not everything."
With AI you need to think, long and hard, about the concept (borrowed from cryptography): "Today, the state of the art is the worst it will ever be."
Humanity is pinning its future on the thought that we will hit intractable information-theoretic limitations which impose some sort of diminishing returns on performance before a hard takeoff, but the idea that the currently demonstrated methods are high up on some sigmoid curve does not seem credible at this point. AI models are dramatically better this year than last year, were dramatically better last year than the year before, and will probably continue to get better for the next few years.
That's sufficient to dramatically change a lot of social & economic processes, for better and for worse.
There's a good chance you're right, but I think there's also a chance that things could get worse for a while (with some hand-wavy definition of "a while").
Currently the state of the art is propped up by speculative investment; if those speculations turn out to be wrong enough, or social/economic changes force the capital to be allocated somewhere else, then there could be a significant period of time when access to it goes away for most of us.
We can already see small examples of this from the major model providers. They launch a mind-blowing model, get great benchmarks and press, and then either throttle access or diminish quality to control costs/resources (Claude 3.5 Sonnet, for example, pretty quickly shifted to short, terse responses). Access to SOTA is very resource-constrained, and there are a lot of scenarios I can imagine where that could get worse, not better.
Even "Today, the state of the art is the worst it will ever be" isn't always true in cryptography, as post-Spectre/Meltdown showed. You could argue that security improved, but performance definitely did not.
I don’t disagree that it’ll change a lot of things in society.
But that isn’t the claim being made, which is that some sort of AI god is being constructed which will develop entirely without the influence of how real human beings actually act. This to me is basically just sci-fi, and it’s frankly kind of embarrassing that it’s taken so seriously.
> dramatically higher performance this year than last year, and were dramatically better last year than the year before
Yeah, but, better at _what_?
Cars are dramatically faster today than 100 years ago. But they still can't fly.
Similarly, LLMs performing better on synthetic benchmarks does not demonstrate that they will eventually become superintelligent beings that will replace humanity.
If you want to actually measure that, then these benchmarks need to start asking questions that demonstrate superintelligence: "Here is a corpus of all current research on nuclear physics, now engineer a hydrogen bomb." My guess is, we will not see much progress.
> Having an obviously AI-generated header image on your blog post was cool two years ago, but now it is passé and marks you as behind the trends.
This is true only because publicly-accessible models have been severely nerfed (out of sheer panic, one assumes), making their output immediately recognizable and instantly clichéd.
Dall-E 2, for instance, was much better when it first came out, compared to the current incarnation that has obviously been tweaked to discourage generating anything that resembles contemporary artists' output, and to render everything else in annoying telltale shades of orange and blue.
Eventually better models will appear, or be leaked, and then you won't be able to tell if a given image was generated by AI or not.
I think the wider question mark over that sentence is that even if LLMs that ingest the internet and turn it into different words are the future of humanity, there's an awful lot of stuff in an AI corpus, and a comparatively small number of intensively researched blogs probably aren't going to shift the needle very much.
I mean, you'd probably get more of a vote by using generative AI to spam stuff that aligns with your opinions, or by moving to Kenya to do low-wage RLHF work...
I don’t believe Gwern lives as frugally as he’s described in this (if this even actually is the real Gwern). I’m 100% sure that he has a persona he likes to portray and being perceived as frugal is a part of that persona.
When it comes to answering the question “who is gwern?” I reckon Gwern’s a plant a seed in people’s mind type of guy, and let them come up with the rest of the story.
Still, I like a lot of his writing. Especially the weird and niche stuff that most people don’t even stop to think about.
And thanks to Gwern's essay on the sunk cost fallacy, I ended up not getting a tattoo I had changed my mind about. I almost went through with it because I had paid a deposit, even though I genuinely hated the idea of what I was going to get. But the week before my appointment, I read that essay and decided that if small children and animals don't fall victim to sunk costs, then neither should I!
Literally - Gwern saved the skin on my back with his writing. Haha.
I met Gwern once, when he came to the NYC Less Wrong meetup. I don't think he was internet-famous yet. It was probably 12 years ago or so, but based on my recollection, I'm totally willing to believe that he lives very frugally. FWIW he wouldn't have looked out of place at an anime convention.
But I do know he created an enormous dataset of anime images used to train machine learning and generative AI models [1]. Hosting large datasets is moderately expensive - and it's full of NSFW stuff, so he's probably not having his employer or his college host it. Easy for someone on a six-figure salary, difficult for a person on $12k/year.
Also, I thought these lesswrong folks were all about "effective altruism" and "earning to give" and that stuff.
Hosting large datasets can be expensive but the hosting for the danbooru datasets was not.
It's "only" a few terabytes in size. A previous release was 3.4TB, so the latest is probably some hundreds of GB to a TB larger.
The download was hosted on a Hetzner IP, a provider known for cheap servers. You can pay them $50/month for a server with "unmetered" 1 gigabit up/down network plus 16TB of disks.
$600 a year would not be difficult.
I'm fairly sure it's relatively true (except for occasional extra purchases on top) unless he's been keeping it up in places I wouldn't expect him to.
I don't like that people might now pigeonhole him a bit by thinking about his effective frugality, but I do hope he gets a ton of donations (either directly or via patreon.com/gwern) to make up for it.
There is a comment on the r/slatestarcodex subreddit with supposedly true information about him (which I found googling 'who is gwern'), but it left me with even more questions.
> Gwern was the first patient to successfully complete a medical transition to the gender he was originally assigned at birth... his older brother died of a Nuvigil overdose in 2001... his (rather tasteful) neck tattoo of the modafinil molecule
The only concrete things we know about gwern are that he's a world-renowned breeder of Maine Coons and that he is the sole known survivor of a transverse cerebral bifurcation.
He does have a neck tattoo, but it's actually a QR code containing the minimal weights to label MNIST at 99% accuracy.
> The only concrete things we know about gwern are that he's a world-renowned breeder of Maine Coons and that he is the sole known survivor of a transverse cerebral bifurcation.
There are some articles on his site about these doxxing attempts, and he claims they are all wrong. If it were that "easily found," I'd guess we wouldn't be having these discussions: https://gwern.net/blackmail#pseudonymity-bounty
This was a tough listen, for two subtly similar reasons.
The voice was uncanny. Simply hard to listen to, despite being realistic. I mean precisely that: it is cognitively difficult to string together meaning from that voice. (I am adjacent to the field of audio production and frequently deal with human- and machine-produced audio. The problem this podcast has with this voice is not unique.) The tonality and meaning do not support each other (this will change as children grow up with these random-tonality voices).
The conversation is excessively verbose. Oftentimes a dearth of reason gets masked by a wide vocabulary. For some audience members, I expect the effort to understand the words distracts from the relationship between the words (i.e., the meaning), and so it just comes across as a mashup of smart-sounding words, and the host, guest, and show get lauded for being so intelligent. Cut through the vocabulary and the occasional subtle tsks and pshaws and "I-know-more-than-I-am-saying" and you uncover a lot of banter that just does not make good sense: it is not quite correct, or not complete in its reasoning. This unreasoned conversation is fine in its own right (after all, this is how most conversation unfolds: a series of partially reasoned stabs that might lead to something meaningful), but masking it with exotic vocabulary and style is misleading and unkind. Some of these "smart-sounding" snippets are actually just dressed-up dumb snippets.
It's a real voice. Probably with some splicing, but I don't know how much. Gwern admits he isn't that good at speaking, and I believe him. He also says he isn't all that smart. Presumably that's highly relative.
> The voice was uncanny. Simply hard to listen to, despite being realistic. I mean precisely that: it is cognitively difficult to string together meaning from that voice.
What? According to the information under the linked video:

> In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed.
I'm not familiar with the SOTA in AI-generated voices, so I could very well be mistaken.
But it did not sound fake to me, and the linked source indicates that it's a human.
Perhaps it sounds uncanny to you because it's a human reading a transcript of a conversation.... and attempting to make it sound conversational, as if he's not reading a transcript?
It's been a bumper week for interesting podcast interviews with an AI theme!
In addition to this, there are Lex Fridman's series of interviews with various key people from Anthropic [0], and a long discussion between Stephen Wolfram and Eliezer Yudkowsky on the theme of AI risk [1].
I found the conversation between Wolfram and Yudkowsky hard to listen to. In fact, I didn't make it to the halfway point. The arguments presented by both were sort of weak and uninteresting?
I find any conversation these days involving Wolfram or Yudkowsky hard to listen to. Them trying to talk to each other... I'd imagine them talking completely past each other, and am happy not to have to verify that.
I'm only halfway through that and it IS good, but I wish they wouldn't burn so much valuable time on recaps of history that has already been told in so many other interviews, and would get on to talking about the real changes we should expect going forward.
This will come across as vituperative, and I guess it is a bit, but I've interacted with Gwern on this forum, and the interaction that has stuck with me is in this thread, where Gwern mistakes a^nb^n for a regular language (it is in fact context-free but not regular) and calls my comment "not even wrong":
Again, I'm sorry for the negativity, but already at the time Gwern was held up by a certain, large section of the community as an important influencer in AI. For me that's just a great example of how the vast majority of AI influencers (who vie for influence on social media rather than in research) are basically clueless about AI and CS and have only second-hand knowledge, which I guess they're good at organising and popularising, but not more than that. It's easy to be a cheerleader for the mainstream view on AI. The hard part is finding, and following, unique directions.
With apologies again for the negative slant of the comment.
> For me that's just a great example of how basically the vast majority of AI influencers (who vie for influence on social media, rather than research) are basically clueless about AI and CS
This is a bit stark: there are many great, knowledgeable engineers and scientists who would not get your point about a^nb^n. It's impossible to know 100% of such a wide area as "AI and CS".
>> This is a bit stark: there are many great knowledgeable engineers and scientists who would not get your point about a^nb^n. It's impossible to know 100% of of such a wide area as "AI and CS".
I think, engineers, yes, especially those who don't have a background in academic CS. But scientists, no, I don't think so. I don't think it's possible to be a computer scientist without knowing the difference between a regular and a super-regular language. As to knowing that a^nb^n specifically is context-free, as I suggest in the sibling comment, computer scientists who are also AI specialists would recognise a^nb^n immediately, as they would Dyck languages and Reber grammars, because those are standard tests of learnability used to demonstrate various principles, from the good old days of purely symbolic AI, to the brave new world of modern deep learning.
For example, I learned about Reber grammars for the first time when I was trying to understand LSTMs, when they were all the hype in Deep Learning, at the time I was doing my MSc in 2014. Online tutorials on coding LSTMs used Reber grammars as the dataset (because, as with other formal grammars it's easy to generate tons of strings from them and that's awfully convenient for big data approaches).
Btw, that's really the difference between a computer scientist and a computer engineer: the scientist knows the theory. That's what they do to you in CS school; they drill that stuff into your head with extreme prejudice, at least at the good schools. I see this with my partner, who is ten times the engineer I am and yet hasn't got a clue what all this Chomsky hierarchy stuff is. But then, my partner is not trying to be an AI influencer.
Strong gatekeeping vibes. "Not even wrong" is perfect for this sort of fixation with labels and titles and an odd seemingly resentful take that gwern has being an AI influencer as a specific goal.
"not even wrong" is supposed to refer to a specific category of flawed argument, but of course like many other terms it's come to really mean "low status belief"
OK, I concede that if I try to separate engineers from scientists it sounds like I'm trying to gatekeep. In truth, I'm organising things in my own head, because I started out thinking of myself as an engineer, because I like to make stuff, and at some point I started thinking of myself as a scientist, malgré moi (in spite of myself), because I also like to know how stuff works and why. I multiclassed, you see, so I am trying to understand exactly what changed, when, and why.
I mean, obviously it happened when I moved from industry to academia, but it's still the case that there's a lot of overlap between the two areas, at least in CS and AI. In CS and AI, the best engineers make the best scientists, and vice versa. I think.
Btw, "gatekeeping" I think assumes that I somehow think of one category less than the other? Is that right? To be clear, I don't. I was responding to the use of both terms in the OP's comments with a personal reflection on the two categories.
I sure hope nobody ever remembers you being confidently wrong about something. But if they do, hopefully that person will have the grace and self-restraint not to broadcast it any time you might make a public appearance, because they're apparently bitter that you still have any credibility.
Point taken and I warned my comment would sound vituperative. Again, the difference is that I'm not an AI influencer, and I'm not trying to make a living by claiming an expertise I don't have. I don't make "public appearances" except in conferences where I present the results of my research.
And you should see the criticism I get by other academics when I try to publish my papers and they decide I'm not even wrong. And that kind of criticism has teeth: my papers don't get published.
Please be aware that your criticism has teeth too, you just don't feel the bite of them. You say I "should see" that criticism you receive on your papers, but I don't; it's delivered in private. Unlike the review comments you get from your peers, you are writing in public. I'm sure you wouldn't appreciate it if your peer reviewer stood up after your conference keynote and told the audience that they'd rejected your paper five years ago, described your errors, and went on to say that nobody at this conference should be listening to you.
I think I'm addressing some of what you say in my longer comment above.
>> Please be aware that your criticism has teeth too, you just don't feel the bite of them.
Well, maybe it does. I don't know if that can be avoided. I think most people don't take criticism well. I've learned, for example, that there are some conversations I can't have with certain members of my extended family because they're not used to being challenged about things they don't know and they react angrily. I specifically remember a conversation where I was trying to explain the concepts of latent hypoxia and ascent blackout [1] (I free dive recreationally) to an older family member who is an experienced scuba diver, and not only did they not believe me, they called me an ignoramus. Because I told them something they didn't know about. Eh well.
_____________
[1] It can happen that while you're diving deep, the pressure of the water keeps the partial pressure of oxygen in your blood high enough that you don't pass out, but then when you start coming up, the pressure drops and the oxygen in your blood thins out so much that you pass out. In my lay terms. My relative didn't believe that the water pressure affects the pressure of the air in your vessels. I can absolutely feel it when I'm diving: the deeper I go, the easier it gets to hold my breath, and it's so noticeable because it's so counter-intuitive. My relative wouldn't have experienced that during scuba diving (since they breathe pressurised air, I think), and maybe it helps that he's a smoker. Anyway, I never managed to convince him.
As I never managed to convince him that we eat urchins' genitals, not their eggs. After a certain point I stopped trying to convince him of anything. I mean I felt like a know-it-all anyway, even if I knew what I was talking about.
I actually either didn't know about that pressure thing [0], or I forgot. I suspect I read about it at some point because at some level I knew ascending could have bad effects even if you don't need decompression stops. But I didn't know why, even though it's obvious in retrospect.
So thanks for that, even though it's entirely unrelated to AI.
[0]: I have seen videos of the exact same effect on a plastic water bottle, but my brain didn't make the connection
Can I say a bit more about criticism on the side? I've learned to embrace it as a necessary step to self-improvement.
My formative experience as a PhD student was when a senior colleague attacked my work. That was after I asked for his feedback on a paper I was writing, in which I showed that my system beat his. He didn't take it well, sent me a furiously critical response (with obvious misunderstandings of my work), and then proceeded to tell my PhD advisor and everyone else at a conference we were attending that my work was premature and shouldn't be submitted. My advisor, trusting his ex-student (him) more than his brand-new one (me), agreed and suggested I sit on the paper a bit longer.
Later on the same colleague attacked my system again, but this time he gave me a concrete reason why: he gave me an example of a task that my system could not complete (learn a recursive logic program to return the last element in a list from a single example that is not an example of the base-case of the recursion; it's a lot harder than it may sound).
Now, I had been able to dismiss the earlier criticism as sour grapes, but this one I couldn't get over because my system really couldn't deal with it. So I tried to figure out why- where was the error I was making in my theories? Because my theoretical results said that my system should be able to learn that. Long story short, I did figure it out and I got that example to work, plus a bunch of other hard tests that people had thrown at me in the meanwhile. So I improved.
I still think my colleague's behaviour was immature and unbecoming of a senior academic: attacking a PhD student because she did what you've always done, beat your own system, is childish. In my current post-doc I just published a paper with one of our PhD students where we report his system trouncing mine (in speed; still some meat on those old bones otherwise). I think criticism is a good thing overall, if you can learn to use it to improve your work. It doesn't mean that you'll learn to like it, or that you'll be best friends with the person criticising you; it doesn't even mean that they're not out to get you. They probably are... but if the criticism points out a real weakness you have, you can still use it to your advantage no matter what.
Constructive criticism is a good thing, but in this thread you aren't speaking to Gwern directly, you're badmouthing him to his peers. I'm sure you would have felt different if your colleague had done that.
He did and I did feel very angry about it and it hurt our professional relationship irreparably.
But above I'm only discussing my experience of criticism as an aside, unrelated to Gwern. To be clear, my original comment was not meant as constructive criticism. Like I think my colleague was at the time, I am out to get Gwern because I think, like I say, that he is a clueless AI influencer, a cheer-leader of deep learning who is piggy-backing on the excitement about AI that he had nothing to do with creating. I wouldn't find it so annoying if he, like many others who engage in the same parasitism, did not sound so cock-sure that he knows what he's talking about.
I do not under any circumstances claim that my original comment is meant to be nice.
Btw, I remember now that Gwern has at other times accused me, here on HN, of being confidently wrong about things I don't know as well as I think I do (deep learning stuff). I think it was in a comment about MuZero (the DeepMind system). I don't think Gwern likes me much, either. But, then, he's a famous influencer and I'm not, and I bet he finds solace in that, so my criticism is not going to hurt him in the end.
is it really? this is the most common example of a context-free language and something most first-year CS students will be familiar with.
totally agree that you can be a great engineer and not be familiar with it, but it seems weird for an expert in the field to confidently make wrong statements about this.
Thanks, that's what I meant. a^nb^n is a standard test of learnability.
That stuff is still absolutely relevant, btw. Some DL people like to dismiss it as irrelevant, but that's just because they lack the background to appreciate why it matters. Also: the arrogance of youth (hey, I've already been a postdoc for a year, I'm ancient). Here's a recent paper on Neural Networks and the Chomsky Hierarchy that tests RNNs and Transformers on formal languages (I think it doesn't test a^nb^n directly but tests similar a-b based context-free languages):
And btw that's a good paper. Probably one of the most satisfying DL papers I've read in recent years. You know when you read a paper and you get this feeling of satiation, like "aaah, that hit the spot"? That's the kind of paper.
a^nb^n can definitely be expressed and recognized with a transformer.
A transformer (with relative invariant positional embeddings) has full context, so it can see the whole sequence. It just has to count and compare.
To convince yourself, construct the weights manually.

First layer: zero out the characters that are equal to the previous character.

Second layer: build a feature to detect and extract the position embedding of the first a, a second feature for the last a, a third feature for the first b, and a fourth feature for the last b.

Third layer: on top of that, check whether (second feature - first feature) == (fourth feature - third feature).
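The count-and-compare logic those hand-set weights would implement can be sketched in plain Python (not a transformer, just the membership check itself; the function name is my own):

```python
def is_anbn(s: str) -> bool:
    """Recognize the context-free language a^n b^n (taking n >= 1)."""
    # Only a's and b's are allowed.
    if not s or set(s) - {"a", "b"}:
        return False
    n_a = len(s) - len(s.lstrip("a"))  # length of the leading run of a's
    n_b = len(s) - len(s.rstrip("b"))  # length of the trailing run of b's
    # The two runs must cover the whole string (no interleaving) and match.
    return n_a + n_b == len(s) and n_a == n_b
```

A finite automaton can't do this, because the n_a == n_b comparison requires unbounded counting; that is exactly why a^nb^n is the textbook example of a context-free but non-regular language.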
The paper doesn't distinguish between the expressive capability of the model and finding the optimum of the model, i.e. the training procedure.

If you train by only showing examples with varying n, there probably isn't enough inductive bias to make it converge naturally towards the optimal solution you can construct by hand. But you can probably train on multiple formal languages simultaneously, to make the counting feature emerge from the data.

You can't deduce much from negative results in research, besides that they require more work.
>> The paper doesn't distinguish between what is the expressive capability of the model, and the finding the optimum of the model, aka the training procedure.
They do. That's the whole point of the paper: you can set a bunch of weights manually like you suggest, but can you learn them instead, and how? See the Introduction. They make it very clear that they are investigating whether certain concepts can be learned by gradient descent specifically. They point out that earlier work doesn't do that, and that gradient descent is an obvious bit of bias that should affect the ability of different architectures to learn different concepts. Like I say, good work.
>> But you can probably train multiple formal languages simultaneously, to make the counting feature emerge from the data.
You could always try it out yourself, you know. Like I say that's the beauty of grammars: you can generate tons of synthetic data and go to town.
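For anyone who wants to try: generating such a dataset really is a few lines. Here's a minimal sketch (the function name and the negative-sampling scheme, a^n b^m with m != n, are my own choices):

```python
import random

def anbn_dataset(max_n=20, neg_per_n=1, seed=0):
    """(string, label) pairs: label 1 for a^n b^n, 0 for corrupted a^n b^m."""
    rng = random.Random(seed)
    data = []
    for n in range(1, max_n + 1):
        data.append(("a" * n + "b" * n, 1))          # positive example
        for _ in range(neg_per_n):
            # negative: same prefix of a's, but a mismatched count of b's
            m = rng.choice([k for k in range(1, max_n + 1) if k != n])
            data.append(("a" * n + "b" * m, 0))
    rng.shuffle(data)
    return data
```

Harder negatives (shuffled characters, interleavings like abab) would make the learnability test stricter; this version only forces the model to learn the counting comparison.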
>> You can't deduce much from negative results in research beside it requiring more work.
I disagree. I'm a falsificationist. The only time we learn anything useful is when stuff fails.
Gradient descent usually gets stuck in local minima; it depends on the shape of the energy landscape. That's expected behavior.

The current wisdom is that optimizing for multiple tasks simultaneously makes the energy landscape smoother. One task allows discovering features which can be used to solve other tasks.

Useful features that are used by many tasks can more easily emerge from the sea of useless features. If you don't have sufficiently many distinct tasks, the signal doesn't get above the noise and is much harder to observe.

That's the whole point of "generalist" intelligence in the scaling hypothesis.
For problems where you can write a solution manually, you can also help the training procedure by regularising the problem: add the auxiliary task of predicting some custom feature. Alternatively, you can "generatively pretrain" to obtain useful features, replacing a custom loss function with custom data.

The paper is a useful characterisation of the energy landscape of various formal tasks in isolation, but it doesn't investigate the more general, and simpler, problem that occurs in practice.
Regarding your linked comment, my takeaway is that the very theoretical task of being able to recognize an infinite language isn't very relevant to the non-formal, intuitive idea of "intelligence".
Transformers can easily intellectually understand a^nb^n, even though they couldn't recognize whether an arbitrarily long string is a member of the language -- a restriction humans share, since eventually a human, too, would lose track of the count for a long enough string.
I don't know what "intellectually understand" means in the context of Transformers. My older comment was about the ability of neural nets to learn automata from examples, a standard measure of the learning ability of a machine learning system. I link to a paper below where Transformers and RNNs are compared on their ability to learn automata along the entire Chomsky hierarchy, and, as other work has also shown, they don't do that well (although there are some surprises).
>> Regarding your linked comment, my takeaway is that the very theoretical task of being able to recognize an infinite language isn't very relevent to the non-formal, intuitive idea of "intelligence"
That depends on who you ask. My view is that automata are relevant to computation and that's why we study them in computer science. If we were biologists, we would study beetles. The question is whether computation, as we understand it on the basis of computer science, has anything to do with intelligence. I think it does, but that it's not the whole shebang. There is a long debate on that in AI and the cognitive sciences and the jury is still out, despite what many of the people working on LLMs seem to believe.
By intellectually understand, I just mean you can ask Claude or ChatGPT or whatever, "how can I recognize if a string is in a^n b^n? what is the language being described?" and it can easily tell you; if you were giving it an exam, it would pass.
(Of course, maybe you could argue that's a famous example in its training set and it's just regurgitating, but then you could try making modifications, asking other questions, etc, and the LLM would continue to respond sensibly. So to me it seems to understand...)
Or going back to the original Hofstadter article, "simple tests show that [machine translation is] a long way from real understanding"; I tried rerunning the first two of these simple tests today w/ Claude 3.5 Sonnet (new), and it absolutely nails them. So it seems to understand the text quite well.
Regarding computation and understanding: I just thought it was interesting that you presented a true fact about the computational limitations of NNs, which could easily/naturally/temptingly -- yet incorrectly (I think!) -- be extended into a statement about the limitations of understanding of NNs (whatever understanding means -- no technical definition that I know of, but still, it does mean something, right?).
>> (Of course, maybe you could argue that's a famous example in its training set and it's just regurgitating, but then you could try making modifications, asking other questions, etc, and the LLM would continue to respond sensibly. So to me it seems to understand...)
Yes, well, that's the big confounder that has to be overcome by any claim of understanding (or reasoning etc) by LLMs, isn't it? They've seen so much stuff in training that it's very hard to know what they're simply reproducing from their corpus and what not. My opinion is that LLMs are statistical models of text and we can expect them to learn the surface statistical regularities of text in their corpus, which can be very powerful, but that's all. I don't see how they can learn "understanding" from text. The null hypothesis should be that they can't and, Sagan-like, we should expect to see extraordinary evidence before accepting they can. I do.
>> Regarding computation and understanding: I just thought it was interesting that you presented a true fact about the computational limitations of NNs, which could easily/naturally/temptingly -- yet incorrectly (I think!) -- be extended into a statement about the limitations of understanding of NNs (whatever understanding means -- no technical definition that I know of, but still, it does mean something, right?).
For humans it means something, because understanding is a property we assume humans have. Sometimes we use it metaphorically ("my program understands when the customer wants to change their pants") but in terms of computation... again, I have no clue.
Personally I am convinced LLMs do have real understanding, because they seem to respond in interesting and thoughtful ways to anything I care to talk to them about, well outside of any topic I would expect to be captured statistically! (Indeed, I often find it easier to get LLMs to understand me than many humans. :-)
There's also stuff like the Golden Gate Claude experiment and research @repligate shares on twitter, which again make me think understanding (as I conceive of it) is definitely there.
Now, are they conscious, feeling entities? That is a harder question to answer...
How do you do intelligence without computation though? Brains are semi-distributed analog computers with terrible interconnect speeds and latencies. Unless you think they're magic, any infinite language is still just a limit to them.
Edit: and technically you're describing what is more or less backprop learning; neural networks, by themselves, don't learn at all.
Yes, I'm talking about learning neural nets with gradient descent. See also the nice paper I linked below.
>> How do you do intelligence without computation though?
Beats me! Unlike everyone else in this space, it seems, I haven't got a clue how to do intelligence at all, with or without computation.
Edit: re infinite languages, I liked something Walid Saba (RIP) pointed out on Machine Learning Street Talk: sure, you can't generate infinite strings, but if you have an infinite language, every string accepted by the language has a uniform probability of one over infinity, so there's no way to learn the entire language by learning the distribution of strings within it. But e.g. the Python compiler must be able to recognise an infinite number of Python programs as valid (or reject those that aren't) for the same reason: it's impossible to predict which string is going to come out of a source generating strings in an infinite language. So you have to be able to deal with infinite possibilities, with only finite resources.
Now, I think there's a problem with that. Assuming a language L has a finite alphabet, even if L is infinite (i.e. it includes an infinite number of strings) the subset of L where strings only go up to some length n is going to be finite. If that n is large enough that it is just beyond the computational resources of any system that has to recognise strings in L (like a compiler) then any system that can recognise, or generate, all strings in L up to n length, will be, for all intents and purposes, complete with respect to L, up to n etc. In plain English, the Python compiler doesn't need to be able to deal with Python programs of infinite length, so it doesn't need to deal with an infinite number of Python programs.
Same for natural language. The informal proof of the infinity of natural language I know of is based on the observation that we can embed an arbitrary number of sentences in other sentences: "Mary, whom we met in the summer, in Fred's house, when we went there with George..." etc. But, in practice, that ability too will be limited by time and human linguistic resources, so not even the human linguistic ability really-really needs to be able to deal with an infinite number of strings.
That's assuming that natural language has a finite alphabet, or I guess lexicon is the right word. That may or may not be the case: we seem to be able to come up with new words all the time. Anyway, some of this may explain why LLMs can still convincingly reproduce the structure of natural language without having to train on infinite examples.
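Going back to the Python compiler point: CPython's built-in `compile` decides membership in the unbounded set of syntactically valid Python programs one finite string at a time, with no distribution over programs involved. A small sketch (the helper name is my own):

```python
def is_valid_python(src: str) -> bool:
    """Recognise (finite-length) members of the infinite language of
    syntactically valid Python programs, using only finite resources."""
    try:
        compile(src, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

print(is_valid_python("x = 1 + 2"))  # True
print(is_valid_python("x = = 1"))    # False
```

In practice the compiler is of course bounded by memory and recursion limits, which is exactly the "complete up to length n" point made above.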
What I don't know how to do is bounded rationality. Iterating over all the programs weighted by length (with dovetailing if you're a stickler) is "easy", but won't ever get anywhere.
And you can't get away with the standard tricks that people use to say it isn't easy; logical induction exists.
FWIW, I've had a very similar encounter with another famous AI influencer who started lecturing me on fake automata theory that any CS undergrad would have picked up on. 140k+ followers, featured on all the big podcasts (Lex, MLST). I never corrected him but made a mental note not to trust the guy.
Being an influencer requires very little actual competence, same goes for AI influencers.
The goal of influencers is to influence the segment of a crowd who cares about influencers. Meaning retards and manchildren looking for an external source to form consensus around.
To the person that commented that five years is an awful long time to remember something like that (and then deleted their comment): you are so right. I am trying to work through this kind of thing :/
I take the Feynman view here; vain memory tricks are not themselves net new production, so just look known things up in the book.
Appreciate the diversity in the effort, but engineering is making things people can use without having to know it all. Far more interesting endeavor than being a human Google search engine.
No, look. If a student (I'm not a professor, just a post-doc) doesn't know this stuff, I'll point them to the book so they can look it up, and move on. But the student will not tell me I'm "not even wrong" with the arrogance of fifty cardinals while at the same time pretending to be an expert [1]. It's OK to not know, it's OK to not know that you don't know, but arrogant ignorance is not a good look on anyone.
And there's a limit to what you need to look up in a book. The limit moves further up the more you work with a certain kind of tool or study a certain kind of knowledge. I have to look up trigonometry every single time I need it because I only use it sparingly. I don't need to look up SLD-Resolution, which is my main subject. How much would Feynman need to look up when debating physics?
So when someone like Feynman talks about physics, you listen carefully because you know they know their shit and a certain kind of nerd deeply appreciates deep knowledge. When someone elbows themselves in the limelight and demands everyone treats them as an expert, but they don't know the basics, what do you conclude? I conclude that they're pretending to know a bunch of stuff they don't know.
________________
[1] ... some do. But they're students so it's OK, they're just excited to have learned so much and don't yet know how much they don't. You explain the mistake, point them to the book, and move on.
@newmanpo Your comment is [dead] so I can't directly reply to it, but you're assuming things about me that are wrong. I say above that I'm a post-doc. You should understand what this means: I'm the workhorse in an academic research lab, where I'm expected to make stuff work, and then write papers about it. I write code and tell computers when to jump. I'm not a philosopher by any stretch of the term and, just to be clear, a scientist is not a philosopher (not any more).
Edit: dude, come on. That's no way to have a debate. Other times I'm the one who gets all the downvotes. You gotta soldier on through it and say your thing anyway. Robust criticism is great but being prissy about downvotes just makes HN downvote you more.
In the thread you linked, Gwern says in response to someone else that NNs excel at many complex real-world tasks even if there are some tasks where they fail but humans (or other models) succeed. You try to counter that by bringing up an example for the latter type of task? And then try to argue that this proves Gwern wrong?
Whether they said "regular grammar" or "context-free grammar" doesn't even matter; the meaning of their message is still exactly the same.
a^nb^n is not regular, but it is context-free. I don't think there's a restriction on the n. Why do you say this?
Edit: sorry, I read "finite" as "infinite" :0 But n can be unbounded and a^nb^n is still not regular, though still context-free. To be clear, the Chomsky hierarchy of formal languages goes like this: regular ⊂ context-free ⊂ context-sensitive ⊂ recursively enumerable.
That's because formal languages are identified with the automata that accept them and when an automaton accepts e.g. the Recursively Enumerable languages, then it also accepts the context-sensitive languages, and so on all the way down to the finite languages. One way to think of this is that an automaton is "powerful enough" to recognise the set of strings that make up a language.
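One quick way to see where a^nb^n sits in that hierarchy: the closest regular pattern, a*b*, accepts a strict superset of the language, because it cannot enforce that the two counts match -- the gap the pumping lemma formalises. A sketch (helper names are illustrative):

```python
import re

APPROX = re.compile(r"a*b*")  # the best a regular expression can do

def in_regular_approx(s: str) -> bool:
    """Membership in the regular over-approximation a*b*."""
    return APPROX.fullmatch(s) is not None

def in_anbn(s: str) -> bool:
    """Exact membership in a^n b^n: equal counts, all a's before all b's."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

# "aab" is matched by the regular over-approximation but is not in a^n b^n:
print(in_regular_approx("aab"), in_anbn("aab"))  # True False
```

A pushdown automaton (one stack, i.e. one counter here) closes that gap, which is why the language lands in the context-free class.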
It seems like his objection is that "parsing formal grammars" isn't the point of LLMs, which is fair. He was wrong about RGs vs CFGs, but I would bet that the majority of programmers are not familiar with the distinction, and learning the classification of a^nb^n is a classic homework problem in automata theory specifically because it's surprising that such a simple grammar is CF.
> I love the example of Isaac Newton looking at the rates of progress in Newton's time and going, “Wow, there's something strange here. Stuff is being invented now. We're making progress. How is that possible?” And then coming up with the answer, “Well, progress is possible now because civilization gets destroyed every couple of thousand years, and all we're doing is we're rediscovering the old stuff.”
The link in this paragraph goes to a post on gwern's website. The post contains various links, both internal and external, but I still failed to find one that supports the claims about Newton's views on "progress".
> This offers a little twist on the “Singularity” idea: apparently people have always been able to see progress as rapid in the right time periods, and they are not wrong to! We would not be too impressed at several centuries with merely some shipbuilding improvements or a long philosophy poem written in Latin, and we are only modestly impressed by needles or printing presses.
We absolutely _are_ impressed. The concept of "rapid progress" is relative. There was rapid progress then, and there is even more rapid progress now. There is no contradiction.
Anyway, I have no idea how this interview got that many upvotes. I just wasted my time.
That works in reverse too. While I am in awe of what humanity has already achieved, when I read fictional timelines of fictional worlds (Middle-Earth or Westeros/Essos) I am wondering how getting frozen in a medieval-like time is even possible. Like, what are they _doing_?
You're right, really: it's not possible. It's a problem with the conservative impulse (*for a very specific meaning of conservative) in fiction: things don't stay frozen in amber like that. If it was nonfiction - aka real life - the experience of life itself from the perspective of living people would change and transform rapidly in the century view.
> I am wondering how getting frozen in a medieval-like time is even possible. Like, what are they _doing_?
Not discovering sources of cheap energy and other raw inputs. If you look carefully at history, every rapid period of growth was preceded by a discovery or conquest of cheap energy and resources. You need excess to grow towards the next equilibrium.
Those stories are inspired (somewhat) by the dark ages. Stagnation is kinda the default state of mankind. Look at places like Afghanistan. Other than imported western tech, it's basically a medieval society. Between the fall of the Roman Empire and the middle medieval era, technology didn't progress all that much. Many parts of the world were essentially still peasant societies at the start of the 20th century.
All you really need is a government or society that isn't conducive to technological development, either because they persecute it or because they just don't do anything to protect and encourage it (e.g. no patent system or enforceable trade secrets).
Even today, what we see is that technological progress isn't evenly distributed. Most of it comes out of the USA at the moment, a bit from Europe and China. In the past there's usually been one or two places that were clearly ahead and driving things forward, and it moves around over time.
The other thing that inspires the idea of a permanent medieval society is archaeological narratives about ancient Egypt. If you believe their chronologies (which you may not), then Egyptian society was frozen in time for thousands of years with little or no change in any respect. Not linguistic, not religious, not technological. This is unthinkable today but is what academics would have us believe really happened not so long ago.
They're probably doing the same thing humans on our earth were doing for centuries until ~1600. Surviving. Given how cruel nature is I think we're lucky to have the resources to do more than just survive, to build up all this crazy technology we don't strictly need to live, just for fun/profit.
Most people get on with life without inventing much new stuff themselves. It was interesting trekking in Nepal that you could go to places without electricity or cars and life went on really quite similar to before and probably still does. Though they may have got solar electric and phones now - not quite sure of the latest status.
> “Well, progress is possible now because civilization gets destroyed every couple of thousand years, and all we're doing is we're rediscovering the old stuff.”
Irrespective of the historical accuracy of the quote I've always felt this way in some form, having personally lived through the transition from a world where it felt like you didn't have to have an opinion on everything to one dominated by the ubiquitous presence of the Internet. Although not so much because I believe an advanced human civilization has destroyed itself in our current timeline, but because the presence of so many life-changing breakthroughs in such a short period of time to me indicates an unceasing march towards a Great Filter.
My favourite Gwern insight is “Bitcoin is Worse is Better”, where they summarize an extensive list of objections to Bitcoin and then ask if there’s a common thread:
No! What’s wrong with Bitcoin is that it’s ugly. … It’s ugly to make your network’s security depend solely on having more brute-force computing power than your opponents, ugly to need now and in perpetuity at least half the processing power just to avoid double-spending … It’s ugly to have a hash tree that just keeps growing … It’s ugly to have a system which can’t be used offline without proxies and workarounds … It’s ugly to have a system that has to track all transactions, publicly … And even if the money supply has to be fixed (a bizarre choice and more questionable than the irreversibility of transactions), what’s with that arbitrary-looking 21 million bitcoin limit? Couldn’t it have been a rounder number or at least a power of 2? (Not that the bitcoin mining is much better, as it’s a massive give-away to early adopters. Coase’s theorem may claim it doesn’t matter how bitcoins are allocated in the long run, but such a blatant bribe to early adopters rubs against the grain. Again, ugly and inelegant.) Bitcoins can simply disappear if you send them to an invalid address. And so on.
Interesting idea with the avatar, but I feel like just having a voice and some audio waves would be better than trying to create a talking avatar. Could just be my personal preference of not having a mental image of someone unknown I guess? Similar to reading a book after watching the movie adaptation.
In my country, when TV series are interviewing anonymous people they use specific visual language - pixellated face, or facing away from the camera, or face clad in shadow.
Having an actor voice the words is normal. But having an actor showing the anonymous person's face is an... unusual choice.
I'm aware of that possibility but I don't mind watching the other interviewer and I'm also not sure if there's something being shown (screenshots etc.) after the first few minutes I watched.
It does make me wonder what an easy way to do ML-assisted shot/person detection and `blanking` would be. I'm just gonna point out there is a nerd-snipe danger here ;D
Although the avatar tool is probably not SOTA, I thought the 3D model was a really cool way to deal with interviewing Gwern, I am quite enjoying the current video.
Something I've noticed in spending time online is that there's a "core group" of a few dozen people who seem to turn up everywhere there are interesting discussions. Gwern (who also posts here) is probably at the top of that list.
There have been multiple times where I read a comment somewhere; thought to myself, wow, this guy is brilliant, let me see who wrote it so I can see if there are other things they've written; and, lo and behold, gwern.
> In Internet culture, the 1% rule is a general rule of thumb pertaining to participation in an Internet community, stating that only 1% of the users of a website actively create new content, while the other 99% of the participants only lurk. Variants include the 1–9–90 rule (sometimes 90–9–1 principle or the 89:10:1 ratio),[1] which states that in a collaborative website such as a wiki, 90% of the participants of a community only consume content, 9% of the participants change or update content, and 1% of the participants add content.
I don't know what the current HN usage stats are, but I assume you would still need to explain about 3 additional orders of magnitude to get from 1% of HN down to "a few dozen".
It's ChrisMarshallNY for me. So frequently I'll come to a comment chain on Apple or Swift or NYC stuff with the intention to make a sweet point, only to find Chris has already said the same thing, though much more eloquently.
He's been building software for 10 years longer than I've been alive, hopefully in a few decades I'll have gained the same breadth of technical perspective he's got.
I wonder how much of that can be attributed to the limitations of the human mind as we evolved in relatively small groups/tribes and it might be difficult to see beyond that
>Wait if you’re doing $900-1000/month and you’re sustaining yourself on that, that must mean you’re sustaining yourself on less than $12,000 a year. What is your lifestyle like at $12K?"
>I live in the middle of nowhere. I don't travel much, or eat out, or have health insurance, or anything like that. I cook my own food. I use a free gym. There was this time when the floor of my bedroom began collapsing. It was so old that the humidity had decayed the wood. We just got a bunch of scrap wood and a joist and propped it up. If it lets in some bugs, oh well! I live like a grad student, but with better ramen. I don't mind it much since I spend all my time reading anyway.
Not sure what to think of that. On one hand, it's so impressive that gwern cares only about the intellectual pursuit. On the other hand, it's sad that society does not reward it as much as excel sheet work.
It does show how out of touch tech workers are that they are shocked someone is able to live off 1k a month. It sometimes feels like HN posters were born on college campuses and shielded from normal society until their twenties, after which they move to the Bay Area with the utmost confidence that they know everything there is to know about the world.
Especially if you are doing it voluntarily, 1k a month can provide you more than enough for a comfortable life in many parts of the country. More so if you can avoid car ownership/insurance and health insurance (which gwern seems to do).
There's a bit in The West Wing where one of the characters finds out the US poverty thresholds are calculated based on work done in the 1950s by a woman named Mollie Orshansky, and that they can't be updated because then US would then have millions more poor people, and that's bad politics. According to your link that's still mostly true 25 years later.
It's not uncommon, but hopefully it offers some perspective to those commenters who say things like "this-or-that subscription fee should be trivial pocket change to the sorts of people who comment on HN".
Gwern lives this way because he wants to. He has chosen this ascetic lifestyle. He could easily raise money if he ever needs it (he doesn't need it right now). He could easily do tech work in SF and set himself up for life. He also has many friends who got rich off crypto.
A room is only a prison cell when you're not allowed to leave.
From a resource point of view, time is one of the most precious we have and optimizing for "the most control over my time" by living frugally makes sense. If you put this time into your skills growth you may outperform in the long term for some fields (where skills matter more than social capital) the people who had to sell their time to pay higher bills.
It's a reasonable tradeoff for some circumstances.
> How do you sustain yourself while writing full time?
> Gwern
> Patreon and savings. I have a Patreon which does around $900-$1000/month, and then I cover the rest with my savings. I got lucky with having some early Bitcoins and made enough to write for a long time, but not forever. So I try to spend as little as possible to make it last.
Then Dwarkesh just gets stuck on this $1k/month thing when Gwern said right out of the gate that savings are being used.
Who knows how much of the savings are being used or how big of a profit he got from BTC.
He's living in a place where the floor collapsed and eats (good) ramen. If it's 12k or 20k I'm not sure it makes a meaningful difference to the narrative.
Eating (good) ramen is being used here as evidence that he's doing poorly, or something? I don't get it. Ramen is delicious, and can be very healthy. I hereby politely demand that you explain what exactly you are insinuating about the wonder that is ramen.
A floor collapsing and not bothering to replace it sounds more serious, sure, but that can mean a lot of different things in different circumstances. Imagine, for example, someone who's an expert in DIY but also a devoted procrastinator. That person could leave a roof in the state described for months or years, planning to eventually do it up, and I wouldn't consider anything terribly revelatory about the person's financial or mental status to have occurred.
In America, ramen typically (or used to, at least) refers to cheap packages of noodles with extremely high amounts of salt. They are not healthy or considered anything other than “cheap” food. Hence the concept of "being ramen profitable."
I think the misunderstanding is caused by the existence of two types of ramen: traditional, good-quality ramen on the one hand, and instant ramen -- cheap, low-quality fast food -- on the other.
I'm aware that it sounds like poverty. I'm just saying it's going to depend on the details. To people who live ordinary western lifestyles, everything outside of that sounds weird.
- ramen =/= cheap pot noodle type things. Ramen restaurants in Japan attest to this. It'll depend on context what that actually means in this case.
- 12k/year =/= no money, numbers like that make no sense outside of context. It depends on how much of it is disposable. You live in a hut with no rent, you grow food, you've no kids, no mortgage, no car, etc, these all matter.
- no healthcare =/= bad health. How many of the growing numbers of people dying from heart-related diseases in their 50s and 60s had healthcare? Didn't do much for them, in the end.
- collapsing floor =/= bad hygiene or something else actually potentially dangerous, as I said above, the nuisance this causes or doesn't cause depends on lots of actual real factors, it's not some absolute thing. It just sounds wild to people not used to it
> 12k/year =/= no money [...] live in a hut with no rent, you grow food, you've no kids, no mortgage, no car, etc, these all matter.
I agree that 12k goes a long way when you're a subsistence farmer, living alone in a collapsing hut in the middle of nowhere, with no health insurance.
Nonetheless, that's sounding a lot like poverty to me.
> a subsistence farmer, living alone in a collapsing hut in the middle of nowhere, with no health insurance.
"no health insurance" is the only thing in your list there that isn't hyperbole or a twisting of what was said. But anyway, again, no-one is arguing about whether 12,000/year "officially" is or is not below the poverty line, except you, with yourself.
Did you read the part about him having savings in bitcoin that wouldn't sustain him forever, but nonetheless for many, many years? If the bitcoin was worth 500,000, for example, would you then say "oh, that's a totally acceptable way to live now!", or would you still be belittling their life choices, insinuating there was something bad or wrong with it?
I don’t know Gwern (except for a small donation I made!), but I truly believe people can be that frugal if their inner self is satisfied. By the way, my father is an artist who feels completely fulfilled living a bohemian lifestyle, selling his work in two art markets and enjoying a vibrant social life there, even without fame.
Honestly I've become skeptical of people who end up in high-intellectualized "pursuits" neglecting their own personal and social interactions and the larger societal reactions
Maybe it works for maths, physics and such, and of course it's ok to philosophize, but I think those "ivory tower" thinkers sometimes lack a certain connection to reality
Could be, but he does not strike as someone who is looking for fame.
Plus the whole discussion about why he would like to move to SF but can't seems pretty authentic.
Great interview, but I wasn't a fan of how they handled the voice. Whether a human voice actor or an AI "voice actor", it inserted cadence, emphasis and emotion, and I have no way of knowing whether any of it was Gwern's original intention. Reading the transcript of the interview would probably be better, as you won't be misled by what the voice actor added in or omitted.
I clicked on the video asking myself, wait, how does the title make sense, how is he anonymous if he's doing videos?
Then, when I saw the frankly very creepy and off-putting image and voice, I assumed he'd been anonymised through some AI software and thought: oh no, this kind of thing isn't going to become normal, is it?
Then - plot twist - I scroll down to read the description and see that that voice is an actual human voiceover! I don't know if that makes it more or less creepy. Probably more. What a strange timeline.
Ah, that was a human voice over? I really wish they hadn't done the voice thing as I found it distracting. The emotion felt all off and misleading. I guess it's better than an AI voice at least but a traditional voice mask would have been better IMO
Yes, someone called "Chris Painter", who could easily be a nice person, I suppose. Maybe the generic U.S. "non-offensive" male A.I. accent is based off his voice originally, and we're coming full circle?
Technically he's pseudonymous. I don't know if he always had the (fictional) last name "Branwen", but I have records of "gwern" (all lowercase) going way back. And yes I'm pretty sure it was the same person.
He says he has collaborators under the "Gwern" name now, but the main guy is the main guy and it's unlikely he could hide it.
How many citations for "Branwen 2018" are on the ArXiv now?
I really don't understand why we give credence to this pile of wishful thinking about the AI corporation with just one visionary at the top.
First: actual visionary CEOs are a niche of a niche.
Second: that is not how most companies work. The existence of the workforce is as important as what the company produces.
Third: who will buy or rent those services or products in a society where the most common economy driver (salaried work) is suddenly wiped out?
I am really bothered by these systematic thinkers whose main assumption is that the system can just be changed and morphed willy nilly as if you could completely disregard all of the societal implications.
We are surrounded by “thinkers” who are actually just glorified siloed-thinking engineers high on their own supply.
Gwern's (paraphrased) argument is that an AI is unlikely to be able to construct an extended bold vision where the effects won't be seen for several years, because that requires a significant amount of forecasting and heuristics that are difficult to optimise for.
I haven't decided whether I agree with it, but I can see the thought behind it: the more mechanical work will be automated, but long-term direction setting will require more of a thoughtful hand.
That being said, in a full-automation economy like this, I imagine "AI companies" will behave very differently from human companies: they can react instantly to events, so a change in direction can be effected in hours or days, not months or years.
> Someone probably said the exact same thing when the first cars appeared.
Without saying anything regarding the arguments for or against AI, I will address this one sentence. This quote is an example of an appeal to hypocrisy in history fallacy, a form of the tu quoque fallacy. Just because someone criticizes X and you compare it to something else (Y) from another time does not mean that the criticism of X is false. There is survivorship bias as well because we now have cars, but in reality, you could've said this same criticism against some other thing that failed, but you don't, because, well, it failed and thus we don't remember it anymore.
The core flaw in this reasoning is that just because people were wrong about one technology in the past doesn't mean current critics are wrong about a different technology now. Each technology needs to be evaluated on its own merits and risks. It's actually a form of dismissing criticism without engaging with its substance. Valid concerns about X should be evaluated based on current evidence and reasoning, not on how people historically reacted to Y or any other technology.
In this case, there isn't much substance to engage with. The original argument, made in passing in an interview covering a range of subjects, is essentially: [answering your question, which presupposes that AI takes over all jobs] I think it'll be bottom up because [in my opinion] being a visionary CEO is the hardest thing to automate.
The fact that similar, often more detailed assertions of the imminent disappearance of work have been a consistent trope since the beginning of the Industrial Revolution (as acknowledged in literally the next question in the interview, complete with an interestingly wrong example), and that we've actually ended up with more jobs, seems far more like a relevant counterargument than ad hominem tu quoque...
Again, my comment is not about AI, it is about the faulty construction of the argument in the sentence I quoted. X and Y could be anything, that is not my point.
My point is also not really about AI. My point is that pointing out that the same argument that X implies Y could be (and has been) applied to virtually every V, W, and Z (where X and V/W/Z are in the same category, in this case "disruptive inventions"), and yet Y didn't happen as predicted, isn't an ad hominem tu quoque fallacy or anything to do with hypocrisy. It's an observation that arguments about the category resulting in Y have been consistently wrong, so we should probably treat claims about Y happening because of something else in the category with scepticism...
Cars and motor vehicles in general get you to work and help you do your work. They don't do the work. I guess that's the difference in thinking.
I'm not sure that it's actually correct: I don't think we'll actually see "AI" replace work in general as a concept. Unless it can quite literally do everything and anything, there will always be something that people can do to auction their time and/or health to acquire some token of social value. It might take generations to settle out who is the farrier who had their industry annihilated and who is the programmer who had it created. But as long as there's scarcity and ambition in the world, there'll be something there, whether it's "good work" or demeaning toil under the bootheel of a fabulously wealthy cadre of AI mill owners. And there will be scarcity as long as there's a speed of light.
Even if I'm wrong and there isn't, that's why it's called the singularity. There's no way to "see" across such an event in order to make predictions. We could equally all be in permanent infinite bliss, be tortured playthings of a mad God, extinct, or transmuted into virtually immortal energy beings or anything in between.
You might as well ask the dinosaurs whether they thought the ultimate result of the meteor would be pumpkin spice latte or an ASML machine for all the sense it makes.
Anyone claiming to be worrying over what happens after a hypothetical singularity is either engaging in intellectual self-gratification, posing or selling something somehow.
They don't do the work, they help you do the work. The work isn't compiling or ploughing, it's writing software and farming, respectively. Both of which are actually just means to the ends of providing some service that someone will pay "human life tokens" for.
AI maximalists are talking about breaking that pattern and having AI literally do the job and provide the service, cutting out the need for workers entirely. Services being provided entirely autonomously and calories being generated without human input in the two analogies.
I'm not convinced by that at all: if services can be fully automated, who are you going to sell ERP or accounting software to, say? What are people going to use as barter for those calories if their time and bodies are worthless?
But I can see why that is a saleable concept to those who consider the idea of needing to pay workers to be a cruel injustice. Though even if it works at all, which, as I said, I don't believe, the actual follow-on consequences of such a shift are impossible to make any sensible inferences about.
There is no data, just hyperbole from those same "visionaries" who keep claiming their stochastic parrots will replace everyone's jobs and we therefore need UBI.
Didn’t say that.
If you posit that the future of the corporation is having a visionary CEO with a few minion middle managers and a swath of AI employees, then tell me, what do you do with the thousands of lost and no longer existing salaried jobs?
Or are you saying that the future is a multitude of corporations of one?
We can play with these travesties of intellectual discourse as long as you like, but we're really one step removed from some stoners' basement banter.
You can see this with many words – the most commonly known word gradually encompasses all similar ones. I don't know if there is a formal linguistic term, but I call it "conceptual compression." It's when a concept used to have multiple levels, but now just has one. It seems like an inevitable outcome in a society that doesn't care about using language accurately.
It's not. I don't know if you think we're all Gwern, but he's provided a useful enough service for enough people that a discussion with more than 13 comments is worthwhile.
"In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed."
It's fine. I don't know how Gwern actually talks, but unless Patel was going to get an experienced voice actor I'm not sure how much better it could be.
Everything I read from gwern has this misanthropic undertone. It's hard to put a finger on it exactly, but it grates on me when I try reading him.
It is also kinda scary that so many people are attracted to this.
It rhymes with how I feel about Ayn Rand. Her individualism always seems so misanthropic; her adherents scare me.
It’s possible that a person can enjoy things that cater to the advancement of “civilization” while being seen as someone indifferent to (or inclined away from) “humanity”. Ie, a materialist.
I'm not entirely sure what you're referring to, but pretty much every major 20th century intellectual had misanthropic undertones (and sometimes overtones) - people who read/think/write an exceptional amount don't tend to be super people-loving.
That should depend on what you read. There's more than enough in the history of our species and in books about us to make someone love humanity too, at least conceptually if not in daily practice.
I've never really experienced that from his writing, and I am definitely not an Ayn Rand fan. I'm also pretty sure he's not interested in creating a movement that could have 'adherents'... I suppose I could be wrong on that. But on the contrary, I find his writing to be often quite life-affirming - he seems to delight in deep focus on various interesting topics.
The worst I can say is that I find his predictions around AI (i.e. the scaling laws) to be concerning.
edit: having now read the linked interview, I can provide a clearly non-misanthropic quote, in response to the interviewer asking gwern what kind of role he hopes to play in people's lives:
I would like people to go away having not just been entertained or gotten some useful information, but be better people, in however slight a sense. To have an aspiration that web pages could be better, that the Internet could be better: “You too could go out and read stuff! You too could have your thoughts and compile your thoughts into essays, too! You could do all this!”
Interesting that there's no mention of human biodiversity (aka blacks are dumb), as, if you spend five minutes on #lesswrong, you'll notice that that's a big issue for gwern and the other goons.
Thing is: Are you absolutely sure that notion of human biodiversity is wrong? IQ is heritable, as height is heritable. You'll grant that there are populations that differ in their genetic potential for height -- e.g. Dalmatians vs. Pygmies -- so how is it that you dismiss out of hand the notion that there might be population-wide differences in the genetic potential for intelligence?
I can hear it now: "But IQ is not intelligence!" I agree to a point, but IQ -- and, strangely, verbal IQ in particular -- maps very neatly to one's potential for achievement in all scientific and technological fields.
The Truth is a jealous goddess: If you devote yourself to her, you must do so entirely, and take the bad along with the good. You don't get to decide what's out of bounds; no field of inquiry should be off-limits.
Heritability is simply a measure of how much of the variation in a trait, like height or IQ, is due to genetic factors rather than environmental influences in a given population. Could be feedlot steers, could be broiler chickens, could be humans. In humans, traits like height are very highly heritable, at ~0.8. Certain others, like eye color, are heritable to ~0.98.
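As a sketch, the "share of variation due to genetic factors" definition above can be illustrated numerically. This is a hypothetical toy model (the variances are chosen, not measured): a trait is simulated as genotype plus independent environment, and heritability falls out as the ratio of genetic variance to total phenotypic variance.

```python
import random

random.seed(0)

# Toy model: phenotype P = G + E, with Var(G) = 4 and Var(E) = 1 by construction.
# Narrow-sense heritability here is h^2 = Var(G) / Var(P) = 4 / 5 = 0.8,
# which happens to match the ~0.8 figure often cited for height.
n = 100_000
genetic = [random.gauss(0, 2.0) for _ in range(n)]      # sd 2 -> Var(G) ~ 4
environment = [random.gauss(0, 1.0) for _ in range(n)]  # sd 1 -> Var(E) ~ 1
phenotype = [g + e for g, e in zip(genetic, environment)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

h2 = variance(genetic) / variance(phenotype)
print(round(h2, 2))  # close to 4 / (4 + 1) = 0.8
```

Note that h2 describes variance partitioning in this particular population; change the environmental spread and the same genes yield a different heritability.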
Right. It's a simple ratio of genetic variation to phenotypical variation. How does evidence of heritability support HBD claims, which are based on genetic determinism, a notion orthogonal to heritability?
I don't think that HBD claims -- at least, those made in reasonably good faith -- are based on genetic determinism. A Bosnian guy from the Dinaric Alps is much more likely to be >1.8m in stature than a Pygmy. This is not predetermined as such; it's just that one population has something like +3SD in stature over the other. (An admittedly wildly extreme example!)
Differences in IQ between groups are apparently far more modest, but, however distasteful, it's still possible to speak of them, and it's possible to make statistical statements about them. My position is simply that, on the one side, it should be done in good faith -- and, on the other side, it shouldn't be seen as something heretical.
I don't understand the alternative interpretation you're alluding to. Stipulate the validity of IQ or the common g. If group variations in these metrics aren't caused by genes, why are they distasteful? If they are, you're describing genetic determinism, which, again, is orthogonal to heritability.
Heritability is a statistical concept, not a measure of genetic determinism. High heritability doesn’t imply that a trait one exhibits, such as IQ or height, is entirely predetermined by one's genes. Even eye color is only heritable to ~0.98. I'll grant that any trait heritable to 1.0 is indeed entirely predetermined by one's genes -- though, offhand, I'm not sure that such traits exist in humans.
That aside, we're getting into semantics. Whether you call it "genetic determinism" or "heritability," we're talking about durable group differences in genetically-mediated traits. And that is what people may find distasteful or even heretical.
Are we talking past each other? I'm saying: heritability is orthogonal to the question of whether a trait is determined by genetics. There are traits with no genetic component at all that are highly heritable, and vice versa. "Genetic determinism" doesn't mean "a guarantee that a group of genetically similar people will display a trait"; it means "the trait is causally linked to genes".
The semantics matter, because the evidence supporting HBD positions is stated in terms of the technical definition of heritability.
While I've got you, can I ask that you stop evoking "heresy" and "distaste" in this thread? I believe I'm making simple, objective points, not summoning opprobrium on your position.
Sure, heritability is orthogonal to the question of whether a trait is determined by genetics.
But traits like IQ, height, and eye color are both (A) highly heritable and (B) substantially shaped by genetic factors. In casual online discourse, I believe that (B) is usually taken for granted, so it's glossed over, and when people say that any given trait is "heritable" they're also assuming that (B) is true for the trait. At least, I am guilty of that lapse.
When you say "substantially shaped by genetic factors", you should present evidence. It's easy to provide evidence for the heritability of intelligence (again, stipulating IQ), but as we've established, that begs the question of whether the genetic connection is correlation or causation. Environments are inherited, too.
There is growing evidence that group IQ heritability isn't evidence of genetic causation.
I'm not gonna say that people there don't think that hbd is real, but it's not an everyday discussion topic. Mostly because it's kind of boring.
(Do spend five minutes on #lesswrong! We don't bite! (Generally! If you come in like "I heard this was the HBD channel", there may be biting, er, banning.))
New to the deeper internet? Just about anyone who has been aware of gwern for long comes to believe he's got some special skills, even if they disagree with him. He's a niche celebrity entirely for being insightful and interesting.
Lots of the claims about Gwern in the intro are exaggerated.
Gwern is an effective altruist and his influence is largely limited to that community. It would be an exaggeration to claim that he influenced the mainstream of AI and ML researchers -- certainly Hinton, LeCun, Ng, Bengio didn't need him to do their work.
He influences the AI safety crowd, who have ironically been trying to build AGI to test their AI safety ideas. Those people are largely concentrated at Anthropic now, since the purge at OpenAI. They are poorly represented at major corporate AI labs, and cluster around places like Oxford and Cal. The EAs' safety concerns are a major reason why Anthropic has moved so much slower than its competitors, and why Dario is having trouble raising the billions he needs to keep going, despite his media blitz. They will get to AGI last, despite trying to be the good guys who are first to invent god in a bottle.
By the same token, Dwarkesh is either EA or EA-adjacent. His main advertiser for this episode is Jane Street, the former employer of the world's most notorious EA, Sam Bankman-Fried, as well as Caroline Ellison. Dwarkesh previously platformed his friend Leopold Aschenbrenner, who spent a year at OAI before his scare piece "Situation Report" made the rounds. Leopold is also semi-technical at best. A wordcel who gravitated to the AI narrative, which could describe many EAs.
People outside of AI and ML, please put Dwarkesh in context. He is a partisan and largely non-technical. The way he interfaces with AI is in fantasizing about how it will destroy us all, just as he and Gwern do in this interview.
It's sad to see people who are obviously of above-average intelligence waste so much time on this.
"Wordcel" is roon terminology, right? I highly doubt Aschenbrenner is an EA, and if he's a "wordcel" he somehow managed to do mathematical economics without too much problem.
Gwern's probably not a "wordcel" either, he can program, right? I've never seen any of his publications though.
It's called Situational Awareness too, not "Situation Report", and Yudkowsky said he didn't like it. Not that Yudkowsky is an EA either.
I think the situation is more complex than you think it is, or at least more complex than you're pretending.
Edit: oh, you're saying Gwern is an EA too? Do you have a source for that?
He has studied and influenced the real world. Here's Dario Amodei, the CEO of Anthropic, one of the leading AI companies, directly referencing him in an interview from this week: https://youtu.be/ugvHCXCOmm4?t=343