Which is why I think there are two distinct kinds of perspective here, and for one of them, AI hype is just about at the right level - and being too early is not a problem, unless it delays things indefinitely.
> For me, one of the Beneficiaries, the hype seems totally warranted. The capability is there, the possibilities are enormous, pace of advancement is staggering, and achieving them is realistic. If it takes a few years longer than the Investor group thinks - that's fine with us; it's only a problem for them.
I'd say this is not the part of "naming things" that's hard. Beyond picking a common identifier format for the team, there are at least two dimensions that are much harder:
- The language dimension - choosing words that are good enough for the purpose and not confusing. For example, "Manager" is as ambiguous as it gets - it can mean many things - except we've been using it long enough that there's a more specific shape of meaning[0] for that word in code/program architecture contexts. So you'd still use it instead of, say, "Coordinator", which would raise all kinds of questions that "Manager" no longer does.
- The epistemological dimension - whether the word you chose correctly names the concept you meant, and whether the concept you meant is actually the right one for the thing you're trying to describe. Ultimately, this is the hard problem at the root of philosophy. In practice, it manifests as, e.g., the choice between digging into some obscure branch of mathematics to correctly name the thing "endofunctor" or something, or calling it "Square" and saying "fuck it, we'll clarify the exceptions in the comments".
--
[0] - I mean "more specific" in the sense that it's distinct from the other meanings and somewhat narrow - but it's still fuzzy as heck and you can't describe it fully in words; it's basically tacit knowledge.
I try to name things descriptively in simple terms and often end up with NamesAboutThisLong. Once they get too long, I know the thing is doing too much and some refactoring is needed for readability.
I also avoid letting the reader make assumptions. HasPlayerJumpedRecently is bad - what does "recently" mean? HasPlayerJumpedInLastTenMs is better, even if it's a bit long... which highlights that it should probably be refactored into a more flexible value: MsSincePlayerLastJumped.
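A minimal sketch of that refactor (Python for illustration; the Player class and timing details here are made up, not from any real codebase):

    import time

    class Player:
        def __init__(self):
            # Store the raw timestamp instead of a boolean "recently" flag.
            self._last_jump_time = None

        def jump(self):
            self._last_jump_time = time.monotonic()

        @property
        def ms_since_last_jump(self):
            # Expose the flexible value; each caller picks its own threshold.
            if self._last_jump_time is None:
                return float("inf")
            return (time.monotonic() - self._last_jump_time) * 1000.0

The old HasPlayerJumpedInLastTenMs then becomes player.ms_since_last_jump <= 10 at the call site, with the threshold visible where it matters.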
If you aren't assuming a time var with "Ms" means milliseconds, you aren't doing games dev - so that one slides with me.
Wow cool, you just summed up something I’ve found myself doing subconsciously in the past few years. Thanks!
I used to be quite fond of short identifiers, especially ones that make the signs "line up"... until I worked with code long enough that I forgot what I did and had to read it again.
There are good analogies to be had in mythologies and folklore, too! Before there was science fiction - hell, even before there was science - people still occasionally thought of these things[0]. There are stories of deities and demons and fantastical creatures that explore the same problems AI presents - entities with minds and drives different to ours, and often possessing some power over us.
Arguably the most basic and well-known examples are entities granting wishes: the genie in Aladdin's lamp, or the Goldfish[1]; the Devil in Faust, or in Pan Twardowski[2]. Variants of those stories go into detail on things we now call the "alignment problem", "mind projection fallacy", "orthogonality thesis", "principal-agent problems", "DWIM", and others. And that's just scratching the surface; there's tons more in all folklore.
Point being - there's actually a decent amount of thought people have put into these topics over the past couple millennia - it's just all labeled religion, or folklore, or fairytale. Eventually, though, I think more people will make the connection. And then the AI will too.
[0] - For what reason? I don't know. Maybe it was partially to operationalize their religious or spiritual beliefs? Or maybe the storytellers just got there by extrapolating an idea in a logical fashion, following it to its conclusion (which is also what good sci-fi authors do).
I also think that the moment people started inventing spirits or demons that are more powerful than humans in some, but not all, ways, some people started figuring out how to use those creatures for their own advantage - whether by taming or tricking them. I guess it's human nature - when we stop fearing something, we start thinking about how to exploit it.
Nah, we still treat people thinking about it as crackpots.
Honestly, getting into the whole AI alignment thing before it was hot[0], I imagined problems like Evil People building AI first, or just failing to align the AI enough before it was too late, and other obvious/standard scenarios. I don't think I thought of, even for a moment, the situation we're in today: that alignment becomes a free-for-all battle at every scale.
After all, if you look at the general population (or at least the subset that's interested), what are the two[1] main meanings of "AI alignment"? I'd say:
1) The business and political issues, where everyone argues in a way that lets them come out on top of the future regulations;
2) Means of censorship and vendor lock-in.
It's number 2) that turns this into a "free-for-all" - AI vendors trying to keep high level control over models they serve via APIs; third parties - everyone from Figma to Zapier to Windsurf and Cursor to those earbuds from TFA - trying to work around the limits of the AI vendors, while preventing unintended use by users and especially competitors, and then finally the general population that tries to jailbreak this stuff for fun and profit.
Feels like we're in big trouble now - how can we expect people to align future stronger AIs to not harm us, when right now "alignment" means "what the vendor upstream does to stop me from doing what I want to do"?
--
[0] - Binged on LessWrong a decade ago, basically.
[1] - The third one is "the thing people in the same intellectual circles as Eliezer Yudkowsky and Nick Bostrom talked about for decades", but that's much less known; in fact, the world took the whole AI safety thing and ran with it in every possible direction, but still treats the people behind those ideas as crackpots. ¯\_(ツ)_/¯
> Feels like we're in big trouble now - how can we expect people to align future stronger AIs to not harm us, when right now "alignment" means "what the vendor upstream does to stop me from doing what I want to do"?
This doesn't feel like much of a new thing to me, as we've already got differing levels of authorisation in the human world.
I am limited by my job contract*, what's in the job contract is limited by both corporate requirements and the law, corporate requirements are also limited by the law, the law is limited by constitutional requirements and/or judicial review and/or treaties, treaties are limited by previous and foreign governments.
* or would be if I were working; fortunately for me in the current economy, I have enough passive income that my savings are still going up without a job, plus a working partner who can cover their own share.
This isn't new in general, no. While I meant more adversarial situations than contracts and laws - which people are used to and, for the most part, just go along with - I do recognize that those are common too: competition can be fierce, and of course none of us are strangers to the "alignment issues" between individuals and organizations. Hell, a significant fraction of HN threads boil down to discussing this.
So it's not new; I just didn't connect it with AI. I thought in terms of "right to repair", "war on general-purpose computing", or a myriad of different things people hate about what "the market decided" or what they do to "stick it to the Man". I didn't connect it with AI alignment, because I guess I always imagined if we build AGI, it'll be through fast take-off; I did not consider we might have a prolonged period of AI as a generally available commercial product along the way.
(In my defense, this is highly unusual; as Karpathy pointed out in his recent talk, generative AI took a path that's contrary to the norm for technological breakthroughs - the full power became available to the general public and small businesses before it was embraced by corporations, governments, and the military. The Internet, for example, went the other way around.)
> I read somewhere, but cannot find the source anymore, that all written text prior to this century was approx 50MB. (Might be misquoted as don't have source anymore).
50 MB feels too low, unless the quote meant text up until the 20th century, in which case it feels much more believable. In terms of text production and publishing, we're still riding an exponent, so a couple orders of magnitude increase between 1899 and 2025 is not surprising.
(Talking about S-curves is all the hotness these days, but I feel it's usually a way to avoid understanding what exponential growth means - if one assumes we're past the inflection point, one can wave their hands and pretend the change is linear, and continue to not understand it.)
Even by the start of the 20th century, 50 MB is definitely far too low.
Any given English translation of the Bible is by itself something like 3-5 megabytes of ASCII; the complete works of Shakespeare are about 5 megabytes; and I think (back-of-the-envelope estimate) you'd get about the same again for what Arthur Conan Doyle wrote before 1900.
I can just about believe there might have been only ten thousand Bible-or-Shakespeare-sized books (plus all the court documents, newspapers, etc. that add up to that) worldwide by 1900 - but not ten.
Edit: I forgot about encyclopaedias; by 1900, the Encyclopædia Britannica was almost certainly more than 50 MB all by itself.
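Putting quick numbers on that (a rough sketch in Python; the per-book size is just an assumption drawn from the estimates above):

    MB = 1024 * 1024
    book = 4 * MB   # one Bible-or-Shakespeare-sized book in plain ASCII (~3-5 MB)

    # How many such books fit in 50 MB? About a dozen.
    print((50 * MB) // book)              # -> 12

    # Ten thousand such books is three orders of magnitude past 50 MB:
    print(10_000 * book / (1000 * MB))    # -> 40.0, i.e. ~40 GB

So the 50 MB figure only works out if the entire pre-1900 corpus were roughly a dozen books.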
You and 'jerf make a fair point. Assuming you both are right, let's take jerf's estimate (which I now feel is right):
> 50MB feels like "all the 'ancient' text we have" maybe, as measured by the size of the original content and not counting copies
and yours - counting up court documents, newspapers, and encyclopaedias, to which I guess I'd add various letters (quite a lot survived to this day) and science[0] - let's give it 1000x my estimate, so 50 GB.
For the present, comments upthread give estimates in the hundreds-of-terabytes-to-petabyte range. I'd say that, even after deduplication, 50 TB would be a conservative value. That's still 1000x what you estimate for the year 1900!
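As a sanity check on what that exponent looks like (a rough sketch; both figures are just this thread's estimates, not measurements):

    # Rough estimates from this thread, not measured values.
    text_1900 = 50e9    # ~50 GB of text by 1900
    text_now = 50e12    # ~50 TB today, conservatively, after deduplication
    years = 125

    factor = text_now / text_1900
    annual = factor ** (1 / years) - 1
    print(f"{factor:.0f}x over {years} years = {annual:.1%} per year")
    # -> 1000x over 125 years = 5.7% per year

A modest-looking annual rate that still compounds into three orders of magnitude.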
The exponent is going strong.
Thanks both of you for giving me a better picture of it.
50MB feels like "all the 'ancient' text we have" maybe, as measured by the size of the original content and not counting copies. A quick check at Alice in Wonderland puts it at 163kB in plain text. About 300 of those gets us to 50MB. There's way more than 300 books of similar size from the 19th century. They may not all be digitized and freely available, but you can fill libraries with even existing 19th century texts, let alone what may be lost by now.
Or it may just be someone bloviating and being wrong... I think even the ancient texts alone could exceed that number, though perhaps not by an order of magnitude.
At face value, ignoring its role as a metaphor, it doesn't make sense - it's literally the opposite of what's happening.
When you mix salt water and freshwater together, you don't get more freshwater - you turn freshwater into salt water. Replace "fresh" with "clean" and "salt" with "dirty" to make it more obvious.
Not to mention, half the time the comment section here is much more informative than the original submission itself.
Some submissions are really worth reading. Others are worth more as conversation starters. Of those, some are submitted (and upvoted) intentionally to be the latter.
I wrote about it recently here: https://news.ycombinator.com/item?id=44208831. Quoting myself (sorry):
> For me, one of the Beneficiaries, the hype seems totally warranted. The capability is there, the possibilities are enormous, pace of advancement is staggering, and achieving them is realistic. If it takes a few years longer than the Investor group thinks - that's fine with us; it's only a problem for them.