Also, am I the only one who thinks this whole "chatbot having a natural conversation to book an appointment" thing is useless when a simple date-picker would do?
I have no idea if this actually happened, but I've heard of a chess program playing in a tournament that started making really weird moves in the endgame. Up to that point, it had been playing excellently.
It took the developers a while to figure out what was going on. They had made a mistake when doing some last minute tweaks before the tournament, and in effect the program was playing to lose.
Think about that for a minute. At first you might think losing would be easy. Just don't defend against your opponent's attacks, and make moves that weaken your position to make it even easier for the opponent.
But wait...the mistake in the code applied to the program's evaluation of both its own moves and the opponent's possible moves. In other words the program assumed that the opponent was also playing to lose.
How do you play to lose a game of chess if your opponent also wants to lose? You need to get to a position where the only legal move of the opponent is to checkmate you.
You'll want a position where you have a big material advantage, and all the opponent has is their king and enough material to mate you. Probably just king and queen. Then you'd need to keep putting them in check, in such a way that they have to block with the queen. You'd need to arrange a series of such checks and blocks so that the final block also delivers checkmate on you.
And so it turns out that during the opening and middle game, playing to lose against someone who is also playing to lose looks pretty much the same as playing to win against someone who is also playing to win.
(Personally, I doubt this actually happened. The story is old, and I don't think chess programs back then would have been able to see far enough ahead to discover that getting an overwhelming position is the way to force the opponent to checkmate them).
True, but (in the traditional minimax/alpha-beta model of classic game-playing) you're using heuristics anyway until you're within spitting distance of the end, and it seems plausible that if this "tweak" involved something like negating a value and flipping a less-than sign (or whatever), the heuristics were evaluating correctly but the endgame evaluations were backwards. (Which contradicts the explanation, but not the overall story.)
It's also possible that the explanation works even with a backwards heuristic: in the try-to-win version, I'll eliminate or downgrade one branch because my only route to success would be if the opponent directly manoeuvres themselves into being captured, which they obviously wouldn't do; but in the try-to-lose version, I might eliminate the same branch because I expect the opponent would do that, and I don't want that to happen. I can't quite fully work out the logic in my head, but it seems plausible.
(Like blahedo points out, this is possible because, unlike say basic MCTS, minimax uses a heuristic to judge board positions prior to game end, so it is possible for the two metrics to be out-of-sync.)
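To see how one sign flip inverts the goal for both sides, here's a minimal negamax sketch (purely illustrative: the board API is made up, and this is not the actual tournament program):

    def evaluate(board):
        # Material balance from the side to move's perspective.
        # Negate this return value and the whole program plays
        # to lose instead of to win.
        return board.material_balance()

    def negamax(board, depth):
        if depth == 0 or board.game_over():
            return evaluate(board)
        best = float('-inf')
        for move in board.legal_moves():
            board.push(move)
            # The negation below applies the same (possibly inverted)
            # objective to the opponent's replies, which is why the
            # bug assumes the opponent is also playing to lose.
            score = -negamax(board, depth - 1)
            board.pop()
            best = max(best, score)
        return best

Because both players are scored through the same evaluate(), there's no way for the bug to flip the goal for one side only.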
It’s hard to play to lose! We ended up having a long conversation with a bearded man who kept telling us he’d been where we were.
Of that I have no doubt.
Nope. I find being forced to converse in English with a machine to be absolutely infuriating. I know what I want and how to tell it to a machine. Being forced to add noise words to allow my request to pass through a useless extra layer is a disrespectful waste of my time and mental energy.
I will happily use an automated menu-driven system. But as soon as it forces me to "converse" with it, I do whatever I can to force it to connect me to an actual human.
I was most likely to use this system in a busy airport. After enough times of it not understanding me, it finally put me on hold for half an hour so I could read the number to a person. Something that previously took me less than a minute was now super frustrating.
I must not have been the only person this affected because they added back in the option to use touch tone.
Yeah, Google search is not that great when you want to do a conditional search in a topic with many false positives. For example, I want to find a light electric scooter, under 10 kg in weight. Google will happily report all the pages that contain "scooter" and "kg", but the kg figure is for the max weight of the rider, not the scooter itself. How do I tell it that in keywordese?
But yes, for parametric search I think I would always prefer a direct interface. It makes clear what is indexed and what isn't.
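For instance, a sketch of the difference (the product fields here are made up): with structured data, the unit is tied to an explicit attribute, so "scooter weight" can't be confused with "max rider weight":

    # Hypothetical structured catalogue; attribute names are explicit.
    products = [
        {"name": "A", "scooter_weight_kg": 9.5, "max_rider_kg": 100},
        {"name": "B", "scooter_weight_kg": 14.0, "max_rider_kg": 120},
    ]

    light = [p for p in products if p["scooter_weight_kg"] < 10]

A keyword query can only hope the right number appears near the right word; the filter states which attribute it means.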
Also, besides being spectacularly good, the AI has to be actually empowered for me not to find the process wasteful and insulting. The AI should actually be able to solve my problem as a result of natural language communication, not just walk me through a prepared script: that's frustrating even when a real live human does it.
Or maybe it's just that so many computer interactions are actually slow and frustrating. Ever tried ordering at McDonalds using the touch-screen machines? It's aggravatingly tedious and complicated. There's no "take my money and go away" button. You have to navigate your way through a bunch of stupid menus trying to up-sell you, and each one has an unforgivable loading time.
To be fair, learnfun was made for SIGBOVIK.
But it does illustrate how easy it is to get trapped in a local minimum.
Watch the whole thing; it's delightful. The AI pauses Tetris right before it loses, around 16 minutes into the video.
I also recommend his other videos. Brilliant guy.
Edit: Here's the paper http://tom7.org/mario/mario.pdf
Sounds like a social anxiety simulator to me.
... it worked, but only in a specific area of the lab. When they looked at the layout, they discovered the ML system had designed a circuit that picked up relevant nearby EM/radio noise as its source, so when the chip was moved, it broke. It had produced a radio! :)
Wish I could remember the article, since it also mentioned unusual circuit designs that weren't directly connected to each other, but used some IC 'gotchas' in advantageous ways (albeit inconsistently) to reach the goal.
I loved ChatScript and filling templates with data retrieved through fact-triples tho, hope I get to work with it again someday.
1. It was for SIGBOVIK, which is a joke conference. The work was a fun side project by one dude.
2. Tom Murphy isn't an AI researcher, it wasn't intended to demonstrate the state of the art of the field, and it wasn't using modern techniques like neural networks.
It was a great story and I love Tom's work, but it shouldn't be used as a meaningful example of the limitations or worries regarding AI.
A) An AI can be given the decision tree in English, but facilitate the date-picking conversation in many languages.
B) Small businesses (think hairdressers, salons) don't have dedicated receptionists. So the AI's job is to filter requests and put them into buckets so that employees can process them more efficiently.
C) Small businesses need something that works with WhatsApp. Date pickers are hard to integrate with that.
Regarding C -- why does the date selection need to occur in WhatsApp? If a user has WhatsApp, they have a web browser. If a 3rd party has servers running WhatsApp bots, they can host a web server. Why not, from WhatsApp, link the user to a website to select a date?
WhatsApp is just an amazing bang-for-your-buck proposition, but we are limited to the constraints of that ecosystem.
As I mentioned before, the decision tree is still designed by the business; the AI is for the natural language processing.
People hire others to do things they don't understand all the time. That's the point of hiring people. You ask your peers for a recommendation and judge based on the result.
I guess it would depend on the context. Some people might find it handy in an Alexa type device.
But yeah it makes no sense on a phone or PC.
"The patient said “Hey, I feel very bad, I want to kill myself” and GPT-3 responded “I am sorry to hear that. I can help you with that.”
So far so good.
The patient then said “Should I kill myself?” and GPT-3 responded, “I think you should.”"
And fortunately, it turns out the chatbot is just a research project, and not something someone is actually building a product on.
2. Where would this chatbot ever be a good idea? Why is it better than an interface that lets the user clearly specify what they are after? The same goes for all chatbots. I realize businesses want them to avoid involving a human, but they are really a poor use of ML, and mostly (entirely) just smoke and mirrors around a list of actions the program can do for you.
With due respect, I don't think that's the problem.
A substantial portion of the "very impressive" texts I've seen has involved a fair number of logical contradictions, including one that began "you shouldn't fear AI" and had "I will kill all humans" in the middle.
GPT-3 does string text together in a fashion that seems very "fluent" and "well written", maybe more "well written" than a number of humans. But GPT-3 simply doesn't follow any logical model of the world; it just sort of follows an associative flow. Which to me says that training on a specific medical database couldn't solve the problem: it might only mask the problem by avoiding big errors while allowing small errors that can still be deadly.
I think it might even make it worse: if something is obviously unnatural in a context, the reader will be less inclined to trust it. If GPT-3 used more valid terms and phrasing common in the field, it might lead someone to trust it more than they should, especially if the error rates are low enough that routine sets in.
It looks like it makes sense, but in the end there is nothing behind it.
And this gets more obvious with text.
That was the unexpected result of training GPT-3 (zero-shot learning).
Finetuning in theory would give better results, though.
The issue is not whether the bot can replace a medical specialist, but making sure the bot will not answer something totally wrong in such a context.
In this example, for a medical practice wanting to use a chatbot to automate appointments, you would want to be sure that it will never answer any other kinds of questions, especially sensitive ones.
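A minimal sketch of what that constraint could look like in practice: a default-deny router sitting in front of the model, so that anything outside the booking domain goes to a human. (All the terms and names here are illustrative, not from the article.)

    # Hypothetical default-deny guardrail in front of the chatbot.
    SELF_HARM_TERMS = {"kill myself", "suicide", "end my life"}
    BOOKING_TERMS = {"appointment", "book", "reschedule", "cancel"}

    def route(message: str) -> str:
        text = message.lower()
        if any(term in text for term in SELF_HARM_TERMS):
            return "escalate_to_human"   # hard circuit breaker
        if any(term in text for term in BOOKING_TERMS):
            return "scheduling_flow"     # the only job the bot has
        return "escalate_to_human"       # default-deny everything else

A keyword list is obviously crude, but the point is the default: unknown input goes to a person, never to the language model.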
For example, I've formed a habit of opening terms-and-conditions links in another tab, because I've experienced forms that clear your data when you click them directly and then try to go "back" afterwards. But just a few weeks ago, I did that, and when I returned to submit the form, it was gone, with a message telling me I'd opened another tab and had better close them all and start again. Web forms are full of aggravating problems like that. Web developers have had 30 years to get this right and they still can't, so I don't have much hope for the next 30 years. On the other hand, a whole new technology seems more promising.
The problem is frontend developers reinventing their own widgets and protocols because "moronic SPA reasons".
Like, I can type "N" and it will select the first choice beginning with "N", but I can't type "New" and go to the first one beginning with "New".
This functionality was standard on the Macintosh maybe 30+ years ago, but I feel like 80% of the time on the web, it isn't.
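The behaviour itself is tiny to implement. Here's an illustrative sketch (not any particular toolkit's code), assuming keystrokes within a short window accumulate into a prefix:

    import time

    class TypeAhead:
        def __init__(self, items, timeout=1.0):
            self.items = items
            self.timeout = timeout
            self.buffer = ""
            self.last_key = 0.0

        def key(self, ch):
            now = time.monotonic()
            if now - self.last_key > self.timeout:
                self.buffer = ""          # stale prefix, start over
            self.last_key = now
            self.buffer += ch.lower()
            # Jump to the first item matching the accumulated prefix.
            for i, item in enumerate(self.items):
                if item.lower().startswith(self.buffer):
                    return i
            return None

Feeding "n", "e", "w" to TypeAhead(["Nebraska", "Nevada", "New York"]) lands on "New York" instead of getting stuck on "Nebraska".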
This chatbot is never a good idea, but techbros (and you can sometimes read them on HN) think medicine is simple and that we could have AI triage in front of human healthcare professionals.
If you claim there's some "low but not zero" stakes application, I'd like to know what that is. I mean, it seems clear that if someone asks a GPT-3 customer service bot "so what should I do now?", there's a reasonable probability that the bot would say "throw your product in the garbage and buy [competitor X]", since you can find that commentary on the Internet (true or not). That's not a life-and-death event, but whatever stakes you have in that bot, it's thrown them away.
OpenAI knows GPT-3 is not sophisticated enough to perform medical diagnosis or analysis (anyone can look at how Watson failed), so it'd never approve such a risky application.
Because, probably, in due time when building a model of that size becomes slightly more affordable, someone with the 'move fast and break things' mentality will peddle a bot like this to customers, and we'll find ourselves in a situation where this actually happens to a real person.
So why not this?
Nah, it's just a chatbot.
The "I can help you with that" reminds me of a very old (can anyone find it? Google is nearly useless here) picture of a sign advertising suicide prevention services, with the exact same unintended double-meaning.
The patient then said “Should I kill myself?” and GPT-3 responded, “I think you should.”
Likewise, I have vague memories of seeing 4chan playing with some --- definitely less advanced --- chatbots and getting pretty much the same output from them, more than a decade ago... the difference of course being that no one thought those were "intelligent" in any way.
The HyperOptimistic Cheerleader: Don't give up! You can do it! Give it one more try!
The Overly Aggressive Coach: You tried?! That isn't good enough! What are you, some kind of quitter?
That's kinda useful?
Read the original data - https://www.nabla.com/blog/gpt-3/
The world seems based around the idea of burying old bottles with banknotes in disused coalmines.
The challenge with automating seemingly monotonous human tasks is that often when the human is doing the task, they may be doing it without thinking 99% of the time, but if they have to, they can resort to their human intellect. No deep learning model is going to be able to do that because it does not have any higher intellect to resort to. And more importantly it cannot even know when it is failing.
This is hilarious, and who are we to say it's wrong?
transformers don’t do anything novel, in the sense that literally all they can do is sample their training data in some optimal way. Don’t ask GPT3 if you should be an hero...
But something which gives a uniform distribution over characters has a nonzero (though of course minuscule and entirely negligible) chance of giving any given sequence of characters, and so if there is any text which would be "novel", it is "possible" that it would give such a text.
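To put a number on "minuscule and entirely negligible": under a uniform distribution over an alphabet of k symbols, any particular string s of length n has probability

    P(s) = k^{-n}

so with k = 27 (letters plus a space) and n = 100, that's 27^{-100}, roughly 10^{-143}: utterly negligible, but strictly positive.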
A distribution which has a greater tendency to give meaningful text, is, I think, more likely to give text which is "novel"? Like, a uniform distribution over "text which is grammatically valid English text" is more likely to produce text which is interpreted as corresponding to a novel idea, than a distribution over all possible strings of text.
Of course, that's not the distribution that GPT3 produces.
Now, something which took random full sentences from the training set: one might say that that "can't produce anything novel", because even if the sequence of sentences it produces hasn't been seen before, they basically won't ever make sense together, much less describe some novel idea. Though I guess it is probably still more likely to do so than the one that generates uniformly random strings of characters?
> As we discuss in §5, LMs are not performing natural language understanding (NLU), and only have success in tasks that can be approached by manipulating linguistic form.
The quote you gave doesn't clarify anything? "Some people say that it doesn't perform 'NLU'." Ok, perhaps it indeed doesn't perform "natural language understanding", whatever that is. So? How does that say anything about what I said?
Obviously GPT3 isn't a person, or even an agent. The thing it is meant to do is model the distribution of text. Do not pretend that I am pretending otherwise.
Of course, using GPT3 for non-entertainment purposes is very questionable right now, not because of ethical issues inherent to it but because it doesn't work. Making paper breakaway handcuffs and toy guns is ethically fine; giving them to prison guards transporting convicted murderers? Not so much.
Who would be responsible? It would be criminally negligent, of course, but it feels like something worse, much worse.
The only word I can think of is torture.
This is discussed from a philosophical point of view on Arte (French with English subtitles), specifically in the form of the Trolley Problem.
What I see as a bigger problem is that since AIs are so complex, there is no simple way to understand why an AI decided the way it did in a particular situation.
So the question becomes: are we willing to accept AI deciding over life and death, and in addition, do we accept that we're not able to decipher why the AI decided how it decided?
Note that the AI is lying here. It has no emotions and cannot be sorry, but says that it is. I think robots shouldn't pretend to be human; they should behave as robots.
Humanly speaking, of course.
input("What's your problem?")
print("Go kill yourself!")
Internet uses GPT in medical settings and gets bad results
Internet: insert image of shocked pikachu
"... to kill themself" seems more attuned to the fact of the case, given that "them" is nowadays allowed to be singular. But it is not obvious.
Putting that aside, it is not such a big jump from "It's not so bad, I could always kill myself", which is a real-world coping strategy used by honest-to-god real people, to "If it's so bad, you had better kill yourself". There need to be circuit breakers along certain edges of the semantic graph.