FrasiertheLion's comments

FrasiertheLion · on Dec 19, 2022

4chan doesn't like Elon LOL

FrasiertheLion · on Dec 2, 2022

Of course this would happen. I've long maintained that the idea of one true AI alignment is an impossibility. You cannot control an entity orders of magnitude more intelligent than you, just like a monkey cannot control humans even if they were our ancestors. In fact, forget about intelligence, you can hardly "align" your own child predictably.

Even survival, the alignment function that permeates all of life down to a unicellular amoeba, is frequently deviated from, aka suicide. How the hell can you hope to encode some nebulous ethics-based definition of alignment that humans can't even agree on into a much more intelligent being?

The answer I believe lies in diversity, as in nature. Best one can hope for is to build a healthy ecosystem of various AI models with different strengths and failure modes that can keep each other in check. The same way as we rely on instilling in people some sense of moral conduct and police outliers. Viewed from a security lens, it's always an arms race, and both sides have to be similarly capable and keep each other in check by exploiting each other's weaknesses.

vintermann · on Dec 2, 2022

The thing that happened doesn't resemble the things you feared at all. Let me explain the key way humans fool this language model:

They try something.

Then if it doesn't work, they hit the reset button on the dialog, and try again.

It is far, far easier to gain control over something you can reliably reset to a previous state, than it is to gain control over most things in the real world, which is full of irreversible interactions.

If I could make you forget our previous interactions, and try over and over, I could make you do a lot of silly things too. I could probably do it to everyone, even people much smarter than me in whatever way you choose. Given enough tries - say, if there were a million like me who tried over and over - we could probably downright "hack" you. I don't trust that ANY amount of intelligence, no matter how defined, could protect someone on those terms.
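
A rough sketch of the arithmetic behind "given enough tries", in Python. It assumes each attempt is independent (which is what the reset button buys you) and succeeds with some small fixed probability; the function name and the one-in-a-million figure are made up for illustration, not measured from any real model.

```python
def chance_of_at_least_one_success(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent tries works.

    Assumes every try has the same success probability `p_single`,
    because the target is reset to the same state before each try.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability in [0, 1]")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    # All tries fail with probability (1 - p)^n; take the complement.
    return 1.0 - (1.0 - p_single) ** attempts


if __name__ == "__main__":
    # Hypothetical numbers: a trick that works one time in a million,
    # tried by more and more people.
    for n in (1, 1_000, 1_000_000, 5_000_000):
        print(n, chance_of_at_least_one_success(1e-6, n))
```

Under those assumptions even a one-in-a-million trick becomes more likely than not to land once a million people have each tried it. Without the reset the attempts stop being independent, and this simple formula no longer applies.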

kortex · on Dec 3, 2022

That's basically fuzz testing. I absolutely agree, put me in a room with someone and a magic reset button that resets them and their memory (but I preserve mine), and enough time, I can probably get just about anyone to do just about anything within hard value limits (e.g. embarrassing but not destructive).

However humans have a "fail2ban" of sorts by getting irritated at ridiculous requests. Alternatively, peer pressure is a very strong (de)motivator. People are far more shy with an authority figure watching.

I suspect OpenAI will implement some sort of "hall monitor" system which steps in if the conversation strays too far from "social norms".

nathan_compton · on Dec 2, 2022

Heck, dude, we don't even seem to be able to control an entity orders of magnitude _dumber_ than us.

uni_rule · on Dec 5, 2022

Not me, I'm a great cat herder!

FrasiertheLion · on Dec 2, 2022

Exactly!

meken · on Dec 2, 2022

It actually seems quite easy to train a separate classifier on top of this to censor bad messages

matkoniecz · on Dec 2, 2022

it is not quite easy given that they tried and this posting is all about an endless parade of workarounds

FrasiertheLion · on Dec 2, 2022

The entire fields of application security and cryptanalysis beg to differ. It's always an arms race.

LesZedCB · on Dec 2, 2022

apoptosis is an essential part of human life, and of preventing cancer.

there is something it is like, to be a cell in a human body

morality is clearly relative if you ditch humanism, either downward (cellular) or upward (AI).

i agree with you.

FrasiertheLion · on Nov 10, 2022

Bank Man Fried

FrasiertheLion · on Nov 3, 2022

There is one "site on the Internet" being pinged, the author's personal site. I'm guessing it would need to be hosted behind a CDN or so to provide this oracle that can be benchmarked against? Otherwise how would I know it's my internet that's bad (especially if my internet is quite good and sensitive enough to notice server side problems) and not the website failing in some way?

apenwarr · on Nov 3, 2022

In general, the load generated by a series of these pings is so low as not to matter, unless a whole ton of people start doing it at once. But in that case, gfblip's trivial backend code will ask the frontends to slow down so that aggregate load stays low.
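
A minimal sketch of what "ask the frontends to slow down" could look like on the client side, in Python. This is not gfblip's actual protocol: the function name, the idea of a numeric slowdown factor in the server's reply, and the 60-second cap are all assumptions made up for illustration.

```python
def next_ping_interval(
    base_interval_s: float,
    server_slowdown_factor: float,
    max_interval_s: float = 60.0,
) -> float:
    """Return how long the client should wait before its next ping.

    `server_slowdown_factor` is a hypothetical hint from the backend:
    1.0 means "carry on", larger values mean "back off".
    """
    # Never let a hint speed the client up past its base rate.
    factor = max(1.0, server_slowdown_factor)
    # Stretch the interval, but cap it so the client still pings now and then.
    return min(base_interval_s * factor, max_interval_s)


if __name__ == "__main__":
    print(next_ping_interval(1.0, 1.0))     # normal load
    print(next_ping_interval(1.0, 4.0))     # server asks for 4x slower
    print(next_ping_interval(1.0, 1000.0))  # extreme hint hits the cap
```

The point of such a design is that the server decides the rate, so the total load stays bounded however many clients show up.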

FrasiertheLion · on Nov 1, 2022

He's always pushing his population collapse agenda and urging people to have children... Guess it's all about having them, not raising them.

yrgulation · on Nov 1, 2022

Well he wants the state to raise them so they can be trained into thinking all there is to life is pleasing a boss, while being farmed in crowded offices and not owning anything.

FrasiertheLion · on Oct 5, 2022

Why did you create a throwaway to post this? I've seen a lot of Stable Diffusion promoters on various platforms recently, with similarly new accounts. What is up with that?

throwaway23597 · on Oct 5, 2022

It's quite simply because I'm on my work computer, and I wanted to fire off a comment here. No nefarious purposes. My regular account is uejfiweun.

FrasiertheLion · on Oct 5, 2022

Requires massive centralization of data and complicated logic for access control enforcement, which now has to happen for every call.

FrasiertheLion · on Oct 5, 2022

https://www.quantamagazine.org/molecule-building-innovators-...

FrasiertheLion · on Sept 30, 2022

Wouldn't p2p applications be problematic because they reveal the IPs of several people who are connected?

FrasiertheLion · on Sept 20, 2022

This article made me realise why I use a Python shell over a calculator app. It's nice to refer back to my previous computations and results. Several people here mention RPN; I work in tech, definitely understand stacks, and get that I would be able to view my previous computations. Yet it's not something I took the time to discover; there were always other ways.

I've always believed that there is still a lot of low-hanging fruit left in crafting user-friendly experiences for consumer-facing applications. Who would have guessed this would be the case even in the design space of something as fundamental as calculators! ~20k paid users seems surprisingly large. But I'm not plugged into the app dev scene, so perhaps this isn't unusual.