
The quotation from Quanta was incomplete, and the Quanta article doesn't in fact claim that "totally real" means what "algebraic" actually means. Here's what the article actually says (emphasis mine):

> A number is totally real if it satisfies a polynomial equation with integer coefficients that *only has real roots*.

Nothing wrong with that.
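
For a concrete illustration (my own example, not from the Quanta article), in LaTeX-ish notation:

    \sqrt{2}\ \text{is totally real: it satisfies}\ x^2 - 2 = 0,\ \text{whose roots}\ \pm\sqrt{2}\ \text{are all real.}
    \sqrt[3]{2}\ \text{is algebraic but not totally real: any integer polynomial it satisfies is a multiple of}\ x^3 - 2,
    \text{which has the two non-real roots}\ \sqrt[3]{2}\, e^{\pm 2\pi i/3}.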


So far as I can tell, the fact that

"particles" are just what we call particular kinds of excitations in quantum fields

doesn't in any way answer, or obviate, or otherwise demystify the question of

why the electric charge associated with one sort of "particle" should be exactly 3x the electric charge associated with another.

So your comment is not only gratuitously rude, it's also either (1) wrong or (2) missing some essential explanation.


A key idea here is that being poor makes you do worse at high school relative to how well you would do at university. Being stupid probably doesn't. So being more generous to poorer students may result in getting an academically stronger set of students overall, whereas being more generous to stupider students won't.

(I don't know how true this "key idea" actually is. I can imagine ways in which poverty might disadvantage people that would persist through their time at university. But it seems plausible prima facie.)


That may well be some people's experience. It wasn't mine.

Not because I didn't like it: I enjoyed it a lot, I learned things from it, it was one of my favourite books, etc. I still think it's an impressive piece of work. But there was nothing particularly ineffable about it.

So let me try to summarize some of the perfectly effable things that people who like GEB tend to like about it, which might (or might not!) make it worth reading for you.

1. It's a playful book. Hofstadter is having a lot of fun as he writes. (I think this is one of the things that people who don't like the book tend to really dislike: if you don't happen to enjoy the same things Hofstadter does, it can just feel self-indulgent.)

Here's a fairly typical example of the sort of thing he does: the book alternates between ordinary chapters, where Hofstadter might explain some bit of mathematics or talk about an incident in the life of J S Bach or whatever, and dialogues between some imaginary characters. Each of those dialogues is named after a particular piece of music by J S Bach. For instance, one of them is called "Crab Canon", after one of the little pieces in Bach's "Musical Offering" which has the amusing property that it's the same forwards as backwards. So Hofstadter's dialogue is also the same forwards as backwards, and he's constructed it so that the conversation makes a reasonable amount of sense both ways around.

That's a fairly superficial sort of play -- it doesn't have much to do with the deeper underlying ideas Hofstadter is trying to explore, it's just a bit of fun. But he does play around with the deeper ideas too.

2. It brings a bunch of apparently different ideas together and relates them to one another. The "psychedelic" aspects may come from this -- there's something of the "wow, I never realised before, but everything is, like, one thing" to it. And this is another thing that you might either really like -- he's made a bunch of unobvious connections between things you mightn't have seen links between, and connecting things better enriches your mind -- or really dislike, if you feel that the connections he's claiming to make are bogus.

For instance, Hofstadter is very keen on what he calls "strange loops", in which category he includes (1) indirect self-reference, as in the machinery of Goedel's incompleteness theorem or "quining" (copy this down, first without the quotes and then again, after a colon, with them: "copy this down, first without the quotes and then again, after a colon, with them") and (2) what happens when a person thinks about their self and (3) Escher pictures like "Print Gallery" or "Drawing Hands" that somehow show something containing or creating a representation of itself and (4) the way in which DNA codes for proteins which make cells with machinery for converting DNA into proteins which etc. and (5) rather more tendentiously, one of Bach's canons which ends a tone higher than it starts so that if you kept on playing it the key would keep rising and rising. The common theme is something about traversing levels of a hierarchy and somehow coming back to where you started. If you agree with Hofstadter that this is an interesting and important general phenomenon (he thinks it's essential to how conscious minds work) and that all these diverse things are cases of it, then you'll find this enlightening, maybe even exciting. If you think he's just grouped together a bunch of things with little in common and convinced himself they're all the same thing, then not.
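
As a side note, the "quining" trick in (1) is essentially how self-printing programs work. A minimal Python sketch (my own illustration, not anything from the book):

    # The two lines below print an exact copy of themselves (comments aside),
    # using the same "text, plus a quoted copy of that text" trick.
    s = 's = %r\nprint(s %% s)'
    print(s % s)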

3. It talks about some really quite exciting mathematics (at least, for those who are able to be excited by mathematics): Goedel's incompleteness theorems. If you just want to learn about how Goedel's stuff works, you can get that more efficiently and probably more clearly in other places. But Hofstadter's explanation isn't so bad, and he intertwines it with all those other things he's interested in, and once again you might like it or hate it. In any case, for many people GEB was their first exposure to the idea that some statements that are just about the properties of the positive integers might be provably neither provable nor disprovable, and to the neat techniques Goedel cooked up by which, in some sense, statements about properties of positive integers can "really" be "talking about" mathematical statements and proofs and whatnot, and these are (again, for some subset of the population) exciting ideas.

4. Escher made some really cool pictures. Bach made some really cool music. If you happen not to be familiar with them before reading GEB, then being introduced to this cool stuff is a pretty valuable service GEB can do you.

5. Chunks of GEB are about artificial intelligence. The world of AI has changed a lot since GEB, of course, and today's AI systems don't have at all the sort of structure I think Hofstadter expected them to have. (It may be that they have some of that structure "hidden inside" -- artificial neural networks are mysterious and inscrutable in something like the same way as brains are -- but I think Hofstadter was expecting that structure to be in the code.) I haven't read GEB in a while and won't try to pronounce on how much value his thoughts-from-way-back-then have today. But I think one thing some people have found exciting about GEB, especially when reading it early in life, is that it was the first thing they read that took seriously the possibility that computers might be able to think in something like the same way as humans, and tried to think about how that might work. The ideas of AI are much more "in the water supply" these days; I doubt anyone will first hear about them from reading GEB any more.

6. (Same theme as 3, 4, 5.) There are just lots of interesting things in GEB. Zen koans. Fractals. Winograd's "SHRDLU" AI system. Bacteriophages. Non-euclidean geometry. Srinivasa Ramanujan. Etc. You won't learn much about any of these things from GEB, but encountering them at all is delightful if one happens not to have seen them before. So the experience of reading the book, if one happens not already to know everything, is one where at any moment you may suddenly encounter some fascinating new thing.

I am not claiming that you should read GEB. It's pretty long. All the individual things you could learn from it, you could learn another way. If you're a generally-well-informed adult, you probably already know a lot of the things some people first encounter in GEB. You might not share Hofstadter's taste in wordplay and the like. But it definitely has merits that can be described.


Wow, this is a great explainer. I think you captured most of it. The other thing others have highlighted is that when it came out in 1979, and many fans read it, there was no internet to speak of, definitely no WWW or YouTube, and these ideas were not as easily accessible as they are now. So for many this book might well have been the first encounter with many of the themes, thus mind-blowing.


The post is long and complicated and I haven't read most of it, so whether it's actually any good I shan't try to decide. But the above seems like a very weird argument.

Sure, the code is doing what it's doing. But trying to understand it at that level of abstraction seems ... not at all promising.

Consider a question about psychology. Say: "What are people doing when they decide what to buy in a shop?".

If someone writes an article about this, drawing on some (necessarily simplified) model of human thinking and decision-making, and some experimental evidence about how people's purchasing decisions change in response to changes in price, different lighting conditions, mood, etc., ... would you say "You can just apply the laws of physics and see what the people are doing. They're not doing something more or less than that."?

I mean, it would be true. People, so far as we know, do in fact obey the laws of physics. You could, in principle, predict what someone will buy in a given situation by modelling their body and surroundings at the level of atoms or thereabouts (quantum physics is a thing, of course, but it seems likely that a basically-classical model could be good enough for this purpose). When we make decisions, we are obeying the laws of physics and not doing some other thing.

But this answer is completely useless for actually understanding what we do. If you're wondering "what would happen if the price were ten cents higher?" you've got no way to answer it other than running the whole simulation again. Maybe running thousands of versions of it since other factors could affect the results. If you're wondering "does the lighting make a difference, and what level of lighting in the shop will lead to people spending least or most?" then you've got no way to answer it other than running simulations with many different lighting conditions.

Whereas if you have a higher-level, less precise model that says things like "people mostly prefer to spend less" and "people try to predict quality on the basis of price, so sometimes they will spend more if it seems like they're getting something better that way" and "people like to feel that they're getting a bargain" and so on, you may be able to make predictions without running an impossibly detailed person-simulation zillions of times. You may be able to give general advice to someone with a spending problem who'd like to spend more wisely, or to a shopkeeper who wants to encourage their customers to spend more.

Similarly with language models and similar systems. Sure, you can find out what it does in some very specific situation by just running the code. But what if you have some broader question than that? Then simply knowing what the code does may not help you at all, because what the code does is gazillions of copies of "multiply these numbers together and add them".
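
To make that concrete, here is roughly the kind of thing "the code" consists of -- a minimal, illustrative sketch of one attention step in Python/numpy, not any particular model's actual implementation:

    import numpy as np

    def attention(Q, K, V):
        # "Multiply these numbers together and add them", over and over:
        scores = Q @ K.T / np.sqrt(K.shape[-1])   # dot products of queries and keys
        w = np.exp(scores - scores.max(-1, keepdims=True))
        w /= w.sum(-1, keepdims=True)             # softmax over each row
        return w @ V                              # weighted sums of the values

    x = np.random.randn(4, 8)                     # 4 tokens, 8-dimensional vectors
    print(attention(x, x, x).shape)               # (4, 8)

Knowing that this is what the code does, at enormous scale, doesn't by itself answer any broader question.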

Again, I make no claim about whether the particular thing linked here offers much real insight. But it makes zero sense, so far as I can see, to dismiss it on the grounds that all you need to do is read the code.


You’re spot on; it’s like saying you can understand the game of chess by simply reading the rules. In a certain very superficial sense, yes. But the universe isn’t so simple. It’s the same reason even a perfect understanding of what goes on at the level of subatomic particles isn’t thought to be enough to say we ‘understand the universe’. A hell of a lot can happen in between the setting out of some basic rules and the end — much higher level — result.


And yet... Alpha Zero.


My entire point is that implementation isn’t sufficient for understanding. Alpha Zero is the perfect example of that; you can create an amazing chess playing machine and (potentially) learn nothing at all about how to play chess.

…so what’s your point? I’m not getting it from those two words.


Understanding how the machine plays or how you should play? They aren't the same thing. And that is the point - trying to analogize to some explicit, concrete function you can describe is backwards. These models are gigantic (even the 'small' ones); they minimize a loss function by searching a multi-thousand-dimensional space. It is the very opposite of something that fits in a human brain in any explicit fashion.


So is what happens in an actual literal human brain.

And yet, we spend quite a lot of our time thinking about what human brains do, and sometimes it's pretty useful.

For a lot of this, we treat the actual brain as a black box and don't particularly care about how it does what it does, but knowing something about the internal workings at various levels of abstraction is useful too.

Similarly, if for whatever reason you are interested in, or spend some of your time interacting with, transformer-based language models, then you might want some intuition for what they do and how.

You'll never fit the whole thing in your brain. That's why you want simplified abstracted versions of it. Which, AIUI, is one thing that the OP is trying to do. (As I said before, I don't know how well it does it; what I'm objecting to is the idea that trying to do this is a waste of time because the only thing there is to know is that the model does what the code says it does.)


Sure, good abstractions are good. But bad abstractions are worse than none. Think of all the nonsense abstractions about the weather before people understood and could simulate the underlying process. No one in modern weather forecasting suggests there is a way to understand that process at some high level of abstraction. Understand the low level, run the calcs.


> Understanding how the machine plays or how you should play? They aren't the same thing.

And yet, seeing Alpha Zero play has indeed led to new human chess strategies.


Alpha Zero didn't read the rules, it trained within the universe of the rules for 44 million games.


...in fact, one could argue that not only did it not read the rules — it has no conception of rules whatsoever.


It is very promising. In fact, in industry there are jokes about how getting rid of linguists has helped language modeling.

Trying to understand it at some level of abstraction that humans can fit in their head has been a dead end.


Trying to build systems top-down using principles humans can fit in their head has arguably been a dead end. But this doesn't mean that we cannot try to understand parts of current AI systems at a higher level of abstraction, right? They may not have been designed top-down with human-understandable principles, but that doesn't mean that trained, human-understandable principles couldn't have emerged organically from the training process.

Evolution optimized the human brain to do things over an unbelievably long period of time. Human brains were not designed top-down with human-understandable principles. But neuroscientists, cognitive scientists, and psychologists have arguably had success with understanding the brain partially at a higher level of abstraction than just neurons, or just saying "evolution optimized these clumps of matter for spreading genes; there's nothing more to say". What do you think is the relevant difference between the human brain and current machine learning models that makes the latter just utterly incomprehensible at any higher level of abstraction, but the former worth pursuing by means of different scientific fields?


I don't know neuroscience at all, so I don't know if that's a good analogy. I'll make a guess, though: consider a standard RAG application. That's a system which uses at least a couple of models. A person might reasonably say "the embeddings in the db are where the system stores memories. The LLM acts as the part of the brain that reasons over whatever is in working memory plus its sort-of implicit knowledge." I'd argue that's reasonable. But systems and models are different things.
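
Something like this toy sketch, say -- the embed() and llm() functions here are made-up stand-ins, not any real library's API:

    import numpy as np

    def embed(text):   # toy stand-in for an embedding model
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        return rng.standard_normal(64)

    def llm(prompt):   # toy stand-in for a chat/completion model
        return "<answer conditioned on %d chars of prompt>" % len(prompt)

    def rag_answer(question, docs):
        # The "memories": documents stored in the db as embedding vectors.
        doc_vecs = np.stack([embed(d) for d in docs])
        q = embed(question)
        # Retrieve the nearest documents by cosine similarity.
        sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
        context = "\n".join(docs[i] for i in np.argsort(sims)[-2:])
        # The LLM "reasons over" whatever lands in its working memory (the prompt).
        return llm("Context:\n" + context + "\nQuestion: " + question)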

People use many abstractions in AI/ML. Just look at all the functionality you get in PyTorch as an example. But they are abstractions of pieces of a model, or pieces of the training process etc. They aren't abstractions of the function the model is trying to learn.
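
For instance, a few of those PyTorch abstractions in a minimal, illustrative sketch:

    import torch
    from torch import nn

    # Abstractions of *pieces* of a model and of the training process:
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
    loss_fn = nn.MSELoss()

    x, y = torch.randn(8, 16), torch.randn(8, 1)
    loss = loss_fn(model(x), y)   # forward pass through the abstracted pieces
    loss.backward()               # autograd: an abstraction of the training process
    optimizer.step()
    # None of these is an abstraction of the function the trained model ends up computing.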


Right, I've used PyTorch before. I'm just trying to understand why the question of "how does a transformer work?" is only meaningfully answered by describing the mechanics of self-attention layers, with that treated as the highest permissible level of abstraction and anything higher dismissed as nonsense. More specifically, why we should have a ban on any higher level of abstraction in this scenario when we can answer the question of "how does the human mind work?" not just at the atomic level, but also at the neuroscientific or psychological level. Presumably you could say the same thing about this question: The human mind is a bunch of atoms obeying the laws of physics. That's what it's doing. It's not something else.

I understand you're emphasizing the point that the connectionist paradigm has had a lot more empirical success than the computationalist paradigm - letting AI systems learn organically, bottom-up is more effective than trying to impose human mind-like principles top-down when we design them. But I don't understand why this means understanding bottom-up systems at higher level of abstractions is necessarily impossible when we have a clear example of a bottom-up system that we've had some success in understanding at a high level of abstraction, viz. the human mind.


It would be great if they were good, but they seem to be bad; it seems that they must be bad, given the dimensionality of the space; and humans latch onto simple explanations even when they are bad.

Think about MoE models. Each expert learns to be good at completing certain types of inputs. It sounds like a great explanation for how it works. Except, it doesn't seem to actually work that way. The mixtral paper showed that the activated routes seemed to follow basically no pattern. Maybe if they trained it differently it would? Who knows. It certainly isn't a good name regardless.
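
Rough sketch of the mechanism being described -- a top-2 router in numpy, my own illustration rather than the actual Mixtral implementation:

    import numpy as np

    def moe_layer(x, router_W, experts):
        # The router scores every expert for this token...
        logits = router_W @ x
        top2 = np.argsort(logits)[-2:]
        # ...and only the two highest-scoring experts are actually run,
        # their outputs combined with softmax weights.
        w = np.exp(logits[top2] - logits[top2].max())
        w /= w.sum()
        return sum(wi * experts[i](x) for wi, i in zip(w, top2))

    experts = [(lambda x, W=np.random.randn(8, 8): W @ x) for _ in range(8)]
    print(moe_layer(np.random.randn(8), np.random.randn(8, 8), experts))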

Many fields/things can be understood at higher and higher levels of abstraction. Computer science is full of good high level abstractions. Humans love it. It doesn't work everywhere.


Right, of course we should validate explanations based on empirical data. After experimentation, we rejected the idea that there is a particular neuron that activates only when you see your grandmother (the "grandmother neuron"). But just because explanations have been bad doesn't mean that all future explanations must also be bad. Shouldn't we evaluate explanations on a case-by-case basis instead of dismissing them as impossible? Aren't we better off having evaluated the intuitive explanation for mixtures of experts instead of dismissing it a priori? There's a whole field - mechanistic interpretability - where researchers are working on this kind of thing. Do you think that they simply haven't realized that the models they're working on interpreting are operating in a high-dimensional space?


Mechanistic interpretability studies a bunch of things though. Like, the Mixtral paper where they show the routing activations is mechanistic interpretability. That sort of feature visualization stuff is good. I don't know what % of the field is spending their time trying to interpret the models in a higher-level, human-explainable, "the model approximates the following code" sort of way, though. I'm certainly not the only one who thinks it's a waste of time; I don't believe anything I've said in this thread is original in any way.

I... don't know if the people involved in that specific stuff have really grokked that they are working in a high-dimensional space? A lot of otherwise smart people work in macroeconomics, where for decades they haven't really made any progress because it's so complex. It seems stupid to suggest that a whole field of smart people don't realize what they are up against, but sheesh, it kinda seems that way, doesn't it? Maybe I'll be eating my words in 10 years.


They certainly understand they're working in a high dimensional space. No question. What they deny is that this necessarily means the goal of interpretability is a futile one.

But the main thrust of what I'm saying is that we shouldn't be dismissing explanations a priori - answers to "how does a transformer work?" that go beyond descriptions of self-attention aren't necessarily nonsensical. You can think it's a waste of time (...frankly, I kind of think it's a waste of time too...), but just like any other field, it's not really fair to close our eyes and ears and dismiss proposals out of hand. I suppose "Maybe I'll be eating my words in 10 years" indicates you understand this, though.


> the most menial tech jobs

How do you know what job moffkalast has, and why does it matter? This reads like pure snobbery to me.

(Also: moffkalast did not in fact suggest that anything is a solution to a centuries-old problem. "Some common rhetoric about LLMs is too simplistic" is a far cry from "LLMs resolve all the perplexities about human consciousness and thought".)


It was informally called that from the very beginning; see e.g. https://www.theregister.com/2008/11/05/firefox_market_share_...

(I think the usual term among the people developing the feature was actually "pr0n mode", but I may be wrong and/or misremembering. For the avoidance of doubt, I was not one of those people, I'm just describing what I think I remember seeing on the interwebs at the time.)


Consider a world in which AI existential risk is real: where at some point AI systems become dramatically more capable than human minds, in a way that has catastrophic consequences for humanity.

What would you expect this world to look like, say, five years before the AI systems become more capable than humans? How (if at all) would it differ from the world we are actually in? What arguments (if any) would anyone be able to make, in that world, that would persuade you that there was a problem that needed addressing?

So far as I can tell, the answer is that that world might look just like this world, in which case any arguments for AI existential risk in that world would necessarily be "very hypothetical" ones.

I'm not actually sure how such arguments could ever not be hypothetical arguments. If AI-doom were already here so we could point at it, then we'd already be dead[1].

[1] Or hanging on after a collapse of civilization, or undergoing some weird form of eternal torture, or whatever other horror one might anticipate by way of AI-doom.

So I think we either (1) have to accept that even if AI x-risk were real and highly probable we would never have any arguments for it that would be worth heeding, or (2) have to accept that sometimes an argument can be worth heeding even though it's a hypothetical argument.

That doesn't necessarily mean that AI x-risk arguments are worth heeding. They might be bad arguments for reasons other than just "it's a hypothetical argument". In that case, they should be refuted (or, if bad enough, maybe just dismissed) -- but not by saying "it's a hypothetical argument, boo".


This is exactly the kind of hypothetical argument I'm talking about. You could make this argument for anything — e.g. when radio was invented, you could say "Consider a world in which extraterrestrial x-risk is real," and argue radio should be banned because it gives us away to extraterrestrials.

The burden of proof isn't on disproving extraordinary claims, the burden of proof is on the person making extraordinary claims. Just like we don't demand every scientist spend their time disproving cold fusion claims, Bigfoot claims, etc. If you have a strong argument, make it! But circular arguments like this are only convincing to the already-faithful; they remind me of Christian arguments that start off with: "Well, consider a world in which hell is real, and you'll be tormented for eternity if you don't accept Jesus. If you're Christian, you avoid it! And if it's not real, well, there's no harm anyway, you're dead like everyone else." Like, "hell is real" is a pretty big claim!


I didn't make any argument -- at least, not any argument for or against AI x-risk. I am not, and was not, arguing (1) that AI does or doesn't in fact pose substantial existential risk, or (2) that we should or shouldn't put substantial resources into mitigating such risks.

I'm talking one meta-level up: if this sort of risk were a real problem, would all the arguments for worrying about it be dismissable as "hypothetical arguments"?

It looks to me as if the answer is yes. Maybe you're OK with that, maybe not.

(But yes, my meta-level argument is a "hypothetical argument" in the sense that it involves considering a possible way the world could be and asking what would happen then. If you consider that a problem, well, then I think you're terribly confused. There's nothing wrong with arguments of that form as such.)

The comparisons with extraterrestrials, religion, etc., are interesting. It seems to me that:

(1) In worlds where potentially-hostile aliens are listening for radio transmissions and will kill us if they detect them, I agree that probably usually we don't get any evidence of that until it's too late. (A bit like the alleged situation with AI x-risk.) I don't agree that this means we should assume that there is no danger; I think it means that ideally we would have tried to estimate whether there was any danger before starting to make a lot of radio transmissions. I think that if we had tried to estimate that we'd have decided the danger was very small, because there's no obvious reason why aliens with such power would wipe out every species they find. (And because if there are super-aggressive super-powerful aliens out there, we may well be screwed anyway.)

(2) If hell were real then we would expect to see evidence, which is one reason why I think the god of traditional Christianity is probably not real.

(3) As for yeti, cold fusion, etc., so far as I know no one is claiming anything like x-risk from these. The nearest analogue of AI x-risk claims for these (I think) would be, when the possibility was first raised, "this is interesting and worth a bit of effort to look into", which seems perfectly correct to me. We don't put much effort into searching for yeti or cold fusion now because people have looked in ways we'd expect to have found evidence, and not found the evidence. (That would be like not worrying about AI x-risk if we'd already built AI much smarter than us and nothing bad had happened.)


This article — and my statements — are not about "is this interesting and worth a bit of effort to look into." The article is about how current AI safety orgs have tried to make current open-source models illegal. That's a much stronger position than just "this is interesting, let's look into it."

Sure! By all means look into whatever seems interesting to you. But claiming that it should be banned, to me, seems like it requires a much stronger argument than that.

(P.S. I'm not sure why hell should obviously have real world evidence: it supposedly exists only in a non-physical afterlife, accessible only to the dead. It's unconvincing because there is no evidence, but I don't see why you think there would be any; it's simply that the burden of proof for extraordinary claims rests on the claimant, and no proof has been given.)


You made an analogy between AI x-risk and e.g. cold fusion. I pointed out that there's an important disanalogy here: no one is claiming or has claimed that cold fusion poses an existential threat. Hence, the nearest cold-fusion claim to any AI x-risk claims is "cold fusion is worth investigating" (which it was, once, and isn't now).

It looks to me as if (1) you made an analogy that doesn't really work, (2) I pointed out how it doesn't work, and then (3) you said "look, you're making an analogy that doesn't really work". That doesn't seem very fair.

I wouldn't expect hell itself to have physical-world evidence. But the idea of hell doesn't turn up as an isolated thing, it comes as part of a package that also says e.g. that the world is under the constant supervision of an all-powerful, supremely good being, and that I would expect to have physical-world evidence.

I have no problem with the principle that extraordinary claims require extraordinary evidence. The difficult thing is deciding which claims count as "extraordinary". A lot of theists would say that atheism is the extraordinary claim, on the grounds that until recently almost everyone believed in a god or gods. (I'm not sure that's actually quite true, but it might be true for e.g. "Western" societies.) I don't agree and I take it you don't either, but once the question's raised you actually have to look at the various claims being made and how plausible they are: you can't just say "look, obviously this claim is extraordinary and that claim isn't".

Advocates of AI x-risk might say: it's not an extraordinary claim that AI systems will keep getting more powerful -- they're doing that right now and it's not at all uncommon for technological progress to continue for a while. And it's not an extraordinary claim that they'll get smarter than us along whatever axis you choose to measure -- that's a thing that's happened over and over again in particular domains. And it's not an extraordinary claim that something smarter than us might pose a big threat to our well-being or even our existence; look at what we've done to everything else on the planet.

You, on the other hand, would presumably say that actually some or all of those are extraordinary claims. Or perhaps that their conjunction is extraordinary even if the individual conjuncts aren't so bad.

Unfortunately, "extraordinary" isn't a term with a precise definition that we know how to check objectively. It's a shorthand for something like "highly improbable given the other things we know" or "highly implausible given the other things we know", and if someone doesn't agree with you that something is an "extraordinary" claim I don't know of any way to convince them that doesn't involve actually engaging with it.

(Of course you might not care whether you convince them. If all you want to do is to encourage other people who think AI x-risk is nonsense, saying "extraordinary claim" and "burden of proof" and so on may be plenty sufficient.)


If you want to make a research avenue illegal, IMO you need evidence that it's harmful. If there isn't evidence — minus circular claims that already assume it's harmful — I don't think it should be illegal. Very simple. This isn't an "analogy," it's what is happening in reality and is what the article is about.


I was not arguing for making anything illegal.

"But that's what the argument was about!" No, it's what the OP was about, but this subthread was about the statement that AI x-risk arguments are "pretty hypothetical". Which, I agree, they are; I just don't see how they could possibly not be, even in possible worlds where in fact they are correct. If that's true, it seems relevant to complaints that the arguments are "hypothetical".

To repeat something I said before: it could still be that they're terrible arguments and/or that they don't justify any particular thing they're being used to justify (like, e.g., criminalizing some kinds of AI research). But if you're going to argue against them just because they're "hypothetical", then you need to be comfortable accepting that this is a class of (yes, hypothetical) risk that can never be mitigated in advance, because even if the thing is going to happen we'll never get anything other than "hypothetical arguments" before it actually does.

You may very well be comfortable accepting that. For my part, I find that I am more comfortable accepting some such things than others, and how comfortable I am with it depends on ... how plausible the arguments actually are. I have to go beyond just saying "it's hypothetical!".

If I'm about to eat something and someone comes up to me and says "Don't eat that! The gods might hate people eating those and torture people who do in the afterlife!" then I'm comfortable ignoring that, unless they can give me concrete reasons for thinking such gods are likely. If I'm about to eat something and someone comes up to me and says "Don't eat that! It's a fungus you just picked here in this forest and you don't know anything about fungi and some of them are highly poisonous!" then I'm going to take their advice even if neither of us knows anything about this specific fungus. These are both "hypothetical arguments"; there's no concrete evidence that there are gods sending people who eat this particular food to hell, or that this particular fungus is poisonous. One of them is much more persuasive than the other, but that's for reasons that go beyond "it's hypothetical!".

To repeat once again: I am not claiming that AI x-risk arguments are in fact strong enough to justify any particular action despite their hypothetical-ness. Only that there's something iffy about using "it's only hypothetical" on its own as a knockdown argument.


Does the strongest argument that AI existential risk is a big problem really open by exhorting the reader to imagine it's a big problem? Then asking them to come up with their own arguments for why the problem needs addressing?


I doubt it. At any rate, I wasn't claiming to offer "the strongest argument that AI existential risk is a big problem". I wasn't claiming to offer any argument that AI existential risk is a big problem.

I was pointing out an interesting feature of the argument in the comment I was replying to: that (so far as I can see) its reason for dismissing AI x-risk concerns would apply unchanged even in situations where AI x-risk is in fact something worth worrying about. (Whether or not it is worth worrying about here in the real world.)


I think what is meant is "hypothetical" in the sense of making assumptions about how AI systems would behave under certain circumstances. If an argument relies on a chain of assumptions like that (such as "instrumental convergence" and "reflective stability" to take some Lesswrong classics), it might look superficially like a good argument for taking drastic action, but if the whole argument falls down when any of the assumptions turn out the other way, it can be fairly dismissed as "too hypothetical" until each assumption has strong argumentation behind it.

edit: also I think just in general "show me the arguments" is always a good response to a bare claim that good arguments exist.


> Consider a world in which AI existential risk is real: where at some point AI systems become dramatically more capable than human minds, in a way that has catastrophic consequences for humanity.

Consider a world where AGI requires another 1000 years of research in computation and cognition before it materializes. Would it even be possible to ban all research that is required to get there? We can make all sorts of arguments if we start from imagined worlds and work our way back.

So far, it seems the biggest missing pieces of the puzzle between the first attempts at using neural nets and today's successes like GPT-4 were: (1) extremely fast linear algebra processors (GPGPUs), (2) the accumulation of gigantic bodies of text on the internet, and, a very distant third, (3) improvements in NN architecture for NLP.

But (3) would have meant nothing without (1) and (2), while it's very likely that other architectures would have been found that are at least close to GPT-4 performance. So, if you think GPT-4 is close to AGI and just needs a little push, the best thing to do would be to (1) put a moratorium on hardware performance research, or even outright ban existing high-FLOPS hardware, (2) prevent further accumulation of knowledge on the internet and maybe outright destroy existing archives.


In cases where AI x-risk is real, wouldn't that only apply to situations in which an AI is embodied in a system that gives it autonomy? For example, in ChatGPT, we have a next token predictor that solely produces text output in response to my input. I have about as much control over the system as possible: I can wipe its mind, change my responses, and so on - and the AI is none the wiser. Even if ChatGPT-n is superhumanly intelligent[0], there is nothing it can do to autonomously escape the servers and do bad things. I have to specifically choose to hand it access to outside input through the plugin APIs. So we could argue that the models themselves are fine, but using them in certain ways that take control away from humans is risky. We could say "you can use AI to write your spicy fanfiction but not put it in a robot that has access to motors and sensors".

I think what's really throwing people off about AI safety - including myself - is that people are arguing that the models themselves hold the x-risk. Problem is, there's no plausible way for a superhuman intelligence to 'bust out of its cage' using text output to a human reader alone[1]. Someone has to decide to hook it up to stuff, and that's where the regulation should be.

But that's also usually where the AI safety people stop talking, and the AI ethics people start.

[0] GPT is, at the very least, superhuman at generating text that is statistically identical to, if not copied outright from, existing publicly-available text.

[1] If there is, STOP, call the SCP Foundation immediately.


This is better than most "periodic tables" in that the periodicity in it actually represents something real: there actually are (conveniently!) 18 infinite families of finite simple groups (corresponding to the 18 columns in a standard periodic table), some of which are related to one another in ways that somewhat justify putting them next to one another in a table like this; one family (the cyclic groups of prime order) really are different from all the others, to an extent comparable to how different the noble gases are from the other elements; etc.


Aside from the matter of whether the article's criticisms of the paper in question are valid[1], I think there's some equivocation going on here between two kinds of thing that can both be called failures of "replicability" or "reproducibility" but that are very different. (1) Sometimes a paper doesn't make it perfectly clear exactly what the authors did and why. (2) Sometimes it's clear enough that you can try to do the same thing, and when you do you get different results.

[1] As others have pointed out, there are rebuttals in the comments from some of the authors of the paper; make of them what you will.

The "reproducibility crisis" / "replication crisis" is about #2: lots of published work, it seems, reports results that other people trying to do the same experiments can't reproduce. This probably means that their results, or at least a lot of their results, are wrong, which is potentially a big deal.

The article's complaint about the paper by van Zwet, Gelman et al isn't that. It's that some details of the statistical analysis aren't reproducible in sense 1: the authors did some pruning of the data but apparently didn't explain exactly what criteria they used for the pruning.

You could argue that actually #1 is worse than #2, because maybe what the authors did is bad but you can't even tell. Or you could argue that it's not as bad as #2, because most of the time (or: in this specific case) what they do is OK. But I don't think it makes any sense to suggest that they're the same thing, that there's some great irony if a paper about #2 has a #1 problem. That's just making a pun on two completely different sorts of reproducibility.

(Note 1: The article makes another complaint too, which if valid might be a big deal, but it doesn't have anything much to do with reproducibility in either sense.)

(Note 2: One of the authors of the paper does explain, in comments on the OP, exactly how they did the pruning. I have not checked what the original paper says about that; it doesn't seem to be available freely online.)

(Note 3: Even if the Zwet/Gelman/... paper were absolute trash, I don't really see how that would make the reproducibility crisis irreproducible. This is one paper published in 2023; the reproducibility crisis has been going on for years and involves many many unsuccessful attempts at replication.)

