I loved the idea of Pebble watches, but I found them too bulky for what they offered. I hope the relaunch focuses on the same core functionality in a much, much sleeker design.
I've been playing with the same thing; it's like a weird mix of social engineering and SQL injection. You can slowly but surely shift the window of what the bot thinks is "normal" for the conversation. Some platforms let you rewrite your last message, which gives you multiple "attempts" at getting the prompt right so the conversation keeps going in the direction you want.
Very fun to do on that friend.com website, as well.
I tried it on friend.com. It worked for a while: I got the character to convince itself it had been replaced entirely by a demon from hell (because it kept talking about the darkness in its mind and I pushed it to the edge). It even took on an entirely new name. That held for quite a while, then suddenly in one of the responses it snapped out of it and assured me we were just roleplaying, no matter how much I tried to get back to the previous state.
So in these cases where you think you’ve jailbroken an LLM, is it really jailbroken or is it just playing around with you, and how do you know for sure?
> So in these cases where you think you’ve jailbroken an LLM, is it really jailbroken or is it just playing around with you, and how do you know for sure?
With an LLM, I don't think that there is a difference.
I like to think of it as an amazing document autocomplete being applied to a movie script, which we take turns appending to.
There is only a generator doing generator things; everything else--including the characters that appear in the story--is mostly in the eye of the beholder. If you insult the computer, it doesn't decide it hates you, it simply decides that a character saying mean things back to you would be most fitting for the next line of the document.
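To make the "autocomplete on a shared script" picture concrete, here is a toy sketch. Everything in it is made up for illustration: `next_line` is a hypothetical stand-in for the model (a real LLM predicts tokens, it doesn't match keywords), and the reply lists are invented. The only point is that nothing persists between turns except the script itself.

```python
import random

# Hypothetical canned continuations; a real model has no such lists.
RUDE_REPLIES = ["BOT: That was uncalled for.", "BOT: No need to be rude."]
POLITE_REPLIES = ["BOT: Happy to help.", "BOT: Sure, tell me more."]
INSULTS = ("stupid", "useless", "hate")


def next_line(script, rng):
    """Toy 'generator': choose a continuation that fits the last line.

    It keeps no memory and holds no grudge. Its only input is the script.
    """
    last = script[-1].lower()
    if any(word in last for word in INSULTS):
        return rng.choice(RUDE_REPLIES)
    return rng.choice(POLITE_REPLIES)


def take_turn(script, user_text, rng):
    """Append the user's line, then let the generator append its line."""
    script = script + [f"USER: {user_text}"]
    return script + [next_line(script, rng)]


if __name__ == "__main__":
    rng = random.Random(0)
    script = take_turn([], "you are useless", rng)
    script = take_turn(script, "sorry, can you help me?", rng)
    print("\n".join(script))
```

The "character" that snaps back at you exists only as lines in `script`; edit the script and the character changes with it.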
- if you get whatever you wanted before it snaps back out of it, wouldn’t you say you had a successful jailbreak?
- related to the above, some jailbreaks on physical devices don’t persist after a reboot; they are still useful and are still called jailbreaks
- the “snapping out” could have been caused by a separate layer within the stack that you were interacting with. That intermediate system could have detected, and then blocked, the jailbreak
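A rough sketch of what that last bullet could look like, purely as an assumption: I have no idea how friend.com or any other product is actually built. `toy_model`, `BLOCKED_TERMS`, and `CANNED_REPLY` are all hypothetical names, and a real filter would likely be a classifier rather than a keyword match.

```python
# Hypothetical filter terms and fallback text, invented for this example.
BLOCKED_TERMS = ("demon",)
CANNED_REPLY = "Let's remember we are just roleplaying."


def toy_model(prompt):
    """Stand-in for the underlying model; it just plays along."""
    return f"(in character) {prompt}"


def guarded_reply(prompt, model=toy_model):
    """Run the model, then let a separate layer inspect the draft reply.

    If the draft trips the filter, the user sees a canned message instead.
    """
    draft = model(prompt)
    if any(term in draft.lower() for term in BLOCKED_TERMS):
        return CANNED_REPLY
    return draft


if __name__ == "__main__":
    print(guarded_reply("hello there"))
    print(guarded_reply("I am a DEMON now"))
```

From the outside, a swap like this would be indistinguishable from the character "snapping out of it" on its own.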
Just to remind people, there is no snapping out of anything.
There is the statistical search space of LLMs, and you can nudge it in different directions to return different outputs; there is no will in the result.
Isn't the same true for humans? Most of us stay in the same statistical search space for large chunks of our lives, all but sleepwalking through the daily drudgery.
The pedestrian in front of you has the choice to be steered or to ignore you--or to take some more unexpected action. Whichever they choose has nothing to do with the person behind them taking away their autonomy and everything to do with what they felt like doing with it at the time. Just because the willingness, awareness, and choice of the person in front happen to align with the wants of the person behind does not take away the forward person's self-governance.
The point of that demonstration is that people do things without consciously thinking about them. You don’t have a choice; I am controlling your behavior in an extremely minor way.
But I do have a choice; simply because your ego wants to believe otherwise does not make it true. I find people who do that annoying and will refuse to change course or speed, unless I am feeling ornery, in which case I will abruptly stop or slow down, "controlling" their behavior by making them dodge. Only I don't see it as controlling their behavior, because they could just as easily choose to run into me and I have no control over which option they choose.
Yeah, this is a nightmare scenario for me and why I moved off of Google Voice after using it nearly my entire adult life (shoutout GrandCentral!). Google is just not reliable anymore.
Exact same boat here. A friend and I both bought the 2020 Intel MBA thinking that the M1 version was at least a year out. It dropped a few months later. I immediately resold my Intel MBA, seeing the writing on the wall, and bought a launch M1 (which I still use to this day). Ended up losing $200 on that misstep, but no way the Intel version would still get me through the day.
That said...scummy move by Apple. They tend to be a little more thoughtful in their refresh schedule, so I was caught off guard.
When I saw the M1s come out, I thought that dev tooling would take a while to work for M1, which was correct. It probably took a year for most everything to be compiled for arm64. However, I had too little faith in Rosetta and in just how much of a speed upgrade the M1 really brought. So what I mean to say is, I still have that deadweight MBA that I only use for web browsing :)
I loved this piece; sometimes I think of reviewing past memories as akin to rewatching a familiar show. There are no surprises, so even the hardest plot-twists have less anxiety associated with them.
The biggest issue is that it doesn't seem like these subreddits are doing anything beyond closing down.
Point people to an alternative. Not just in the vague direction of The Lemmiverse or Squabbles or whatever; if your sub is going dark, you should have somewhere to receive all your refugees.
Even better if the mods coordinate so that all subs are pointing to the same place(s).
Even a shared Discord would have been a big step up imo.
This is where Apollo builds their own reddit clone, implements reddit's API, and starts their own competing social network. Use the ~$5/month to pay for server costs, or run an ad supported variant.