Yeah, I always use a coin-flip example to show how this is true. Let's say heads is success and tails is failure.
Flip 100 coins. Take the ones that 'failed' (landed tails) and scold them. Flip them again. Half improved! Praise the ones that got heads the first time. Flip them again. Half got worse :(
Clearly, scolding is more effective than praising.
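If anyone wants to check the coin version rather than take it on faith, here is a minimal sketch. The function name, the seed, and the 100,000 coins (instead of 100, just to cut down the noise) are my own choices for illustration. The "feedback" deliberately does nothing to the second flip.

```python
import random

def coin_flip_feedback(n_coins=100_000, seed=0):
    """Flip every coin twice; 'scold' first-round tails, 'praise' first-round heads."""
    rng = random.Random(seed)
    first = [rng.random() < 0.5 for _ in range(n_coins)]   # True = heads = success
    second = [rng.random() < 0.5 for _ in range(n_coins)]  # feedback changes nothing
    scolded = [s for f, s in zip(first, second) if not f]  # second flips of the tails
    praised = [s for f, s in zip(first, second) if f]      # second flips of the heads
    improved_after_scolding = sum(scolded) / len(scolded)
    worse_after_praise = sum(1 for s in praised if not s) / len(praised)
    return improved_after_scolding, worse_after_praise

if __name__ == "__main__":
    improved, worse = coin_flip_feedback()
    print(f"scolded coins that improved: {improved:.3f}")
    print(f"praised coins that got worse: {worse:.3f}")
```

Both fractions come out close to one half, which is exactly the "scolding works, praise backfires" pattern with no causal effect anywhere in the code.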
It's like saying that just because I was able to parallel park successfully this time in one maneuver without hitting the curb or either of the cars in front or behind me that I have some sort of increased chance of not being successful the next time. That seems kind of absurd, doesn't it?
So the coin is an extreme example, not an irrelevant example.
Methods are MUCH more important for a career than any individual execution.
Half the heads, having been praised, were successful again. That's implying that praise was 50% effective.
Half the tails, having been scolded, were suddenly successful. That's implying that scolding was 50% effective.
How is scolding more effective than praising?
>“On many occasions I have praised flight cadets for clean execution of some aerobatic maneuver, and in general when they try it again, they do worse. On the other hand, I have often screamed at cadets for bad execution, and in general they do better the next time. So please don't tell us that reinforcement works and punishment does not, because the opposite is the case.”
With scolding, half got better and half stayed the same.
Scolding is obviously better, in terms of effect on outcomes.
Though it's probably a better parallel to the flight instructor case if you have a six-sided die, and you praise on 5-6, and scold on 1-2: most of the praised get worse, most of the scolded get better, those who get neither scolding nor praise have mixed results, and a few of those scolded get worse and likewise a few praised get better.
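The die version is small enough to enumerate exactly instead of simulating. A rough sketch, assuming a fair die and an independent second roll; the group labels and the function name are mine:

```python
from fractions import Fraction

def die_feedback_table():
    """Exact chances that the second roll is better, the same, or worse."""
    groups = {"praised": (5, 6), "neutral": (3, 4), "scolded": (1, 2)}
    table = {}
    for name, faces in groups.items():
        # Every (first roll, second roll) pair is equally likely within a group.
        pairs = [(a, b) for a in faces for b in range(1, 7)]
        n = len(pairs)
        table[name] = {
            "better": Fraction(sum(b > a for a, b in pairs), n),
            "same": Fraction(sum(b == a for a, b in pairs), n),
            "worse": Fraction(sum(b < a for a, b in pairs), n),
        }
    return table

if __name__ == "__main__":
    for name, row in die_feedback_table().items():
        print(name, {k: str(v) for k, v in row.items()})
```

Under those assumptions the praised group gets worse 3/4 of the time and better only 1/12 of the time, the scolded group is the mirror image, and the neutral group splits 5/12 better and 5/12 worse.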
The coin flip example, while in one sense more clearly showing the problem, simplifies to the point where the connection with realistic scenarios is harder to see, IMO.
Wouldn't that indicate that indeed no praise + lots of scolding is the best approach?
If it's a Gaussian distribution of performance, and he praises above the 99th percentile of bad performance and scolds below the 99th percentile, that 80%-80% pattern might very well occur, but again, it'd be expected to occur, independent of the signal he's putting into the system (because the candidates he's scolding are staggeringly unlikely to get worse, and the candidates he's praising staggeringly unlikely to get better).
Anyway, the tendency of the article pushes negative reinforcement, not just anecdotally, but also as a continuous prevailing message, instead of alternating with degrees of praise even if only on occasion. There are at least three things wrong with that on a behavioral level: a. resentment as an inversion of Stockholm syndrome, b. learned helplessness, and c. simple fatigue.
In other words, people will hate your idea of leadership, conclude that you can never be pleased, and realize that they'll be worked to death on a treadmill of demands, if they don't jump ship sooner or later.
So then you're in a situation where the believers begin to exert their authority over you in order to control the situation and back up their beliefs instead of trying to solve the problem, up to and including threats of termination. It's as though they might suffer emotional harm if you fix the actual underlying problems that caused the whole need to create the belief system in the first place. Those behaviors are that strong and that lizard-brained.
One example: A transaction issue (actually the lack of a proper transaction) in a database was causing duplicate order numbers to occur pretty frequently each day on a large, multi-user system. Instead of getting someone competent to come in to analyze and fix the issue, the people at the company developed all these elaborate procedures for how to enter orders so that they could avoid creating duplicates in the system. They had actual written procedures and even the work schedules were affected by this order entry bug. I swear they were twirling around three times and throwing salt over one shoulder before entering a new order, but only on odd-numbered Wednesdays and Fridays.
I've seen similar problems involving thread local data and badly written multithreaded code, odd caching issues on networks relating to AJAX calls from certain browsers, network setup issues, you name it.
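For the duplicate order number case, the actual fix is usually small. A minimal sketch using SQLite from Python; the original system's database and schema aren't something I can reproduce here, so the table names, column names, and starting number below are all hypothetical. The idea is simply that reading the next number and bumping it happen inside one transaction.

```python
import sqlite3

def make_db():
    # isolation_level=None lets us issue BEGIN/COMMIT ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE order_counter ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), next_number INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO order_counter (id, next_number) VALUES (1, 1000)")
    conn.execute(
        "CREATE TABLE orders (order_number INTEGER PRIMARY KEY, customer TEXT NOT NULL)"
    )
    return conn

def create_order(conn, customer):
    """Allocate an order number and insert the order as one atomic unit."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the counter
    try:
        (number,) = conn.execute(
            "SELECT next_number FROM order_counter WHERE id = 1"
        ).fetchone()
        conn.execute("UPDATE order_counter SET next_number = next_number + 1 WHERE id = 1")
        conn.execute(
            "INSERT INTO orders (order_number, customer) VALUES (?, ?)",
            (number, customer),
        )
        conn.execute("COMMIT")
        return number
    except Exception:
        conn.execute("ROLLBACK")  # leave the counter untouched on any failure
        raise
```

The primary key on `order_number` is a second line of defense: even if some code path skipped the transaction, a duplicate would fail loudly instead of silently landing in the table.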
There, but for the grace of God, go I.
It's rare for people to make enormous improvements or suffer enormous declines in performance from one attempt to the next. It's common for natural variation due to random factors, especially when still learning, to crop up and affect performance from instance to instance. That's true whether one is shooting a basketball at a net or flying a jet aircraft. This means that quite often providing negative feedback will seem to result in improvement. But the same would be true if one were to secretly write the negative feedback down, put it in a letter, and never show it to the student, because of regression to the mean. Students who do poorly in one round will often do better in the next, just as students who do well in one round will often do slightly worse in the next.
Only over a longer period of time, and by collecting data from multiple students, will you balance out these effects and be able to determine whether positive or negative reinforcement works better. And the data does seem to indicate that positive reinforcement is superior.
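To make the short-run versus long-run contrast concrete, here is a toy model. Everything numeric in it is an assumption for illustration, not data: praise is assumed to nudge true skill up slightly, scolding to nudge it down slightly, and trial-to-trial noise is assumed to be much larger than either.

```python
import random
from statistics import mean

PRAISE_EFFECT = 0.05   # assumed: praise raises true skill a little
SCOLD_EFFECT = -0.05   # assumed: scolding lowers true skill a little
NOISE_SD = 1.0         # assumed: per-trial randomness dwarfs both effects

def naive_next_trial_changes(n_students=20_000, seed=0):
    """Feedback depends on the score; look only at the very next trial."""
    rng = random.Random(seed)
    after_praise, after_scold = [], []
    for _ in range(n_students):
        skill = 0.0
        first = skill + rng.gauss(0, NOISE_SD)
        if first > 1.0:            # good trial, so the instructor praises
            skill += PRAISE_EFFECT
            bucket = after_praise
        elif first < -1.0:         # bad trial, so the instructor scolds
            skill += SCOLD_EFFECT
            bucket = after_scold
        else:
            continue               # middling trial, no feedback
        second = skill + rng.gauss(0, NOISE_SD)
        bucket.append(second - first)
    return mean(after_praise), mean(after_scold)

def cohort_final_score(effect, rounds=20, n_students=2_000, seed=1):
    """Same feedback every round regardless of score; average final test score."""
    rng = random.Random(seed)
    final_skill = effect * rounds
    return mean(final_skill + rng.gauss(0, NOISE_SD) for _ in range(n_students))

if __name__ == "__main__":
    print("next-trial change after praise, after scolding:", naive_next_trial_changes())
    print("final score, praised cohort:", cohort_final_score(PRAISE_EFFECT))
    print("final score, scolded cohort:", cohort_final_score(SCOLD_EFFECT))
```

In this toy setup the next-trial numbers make scolding look great and praise look harmful, even though scolding is coded to hurt, while the cohort comparison over many rounds recovers the effect that was actually put in.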
Not a golfer, I see.
Then also what magnitude of praise or negative feedback elicits best results?
Essentially this could come down to finding out the optimal training schedule for humans. (Like for neural nets or classifiers.) And this is similar to adversarial training. You have to consider both the trainee and the trainer (discriminator).
We like to say that people (especially children) shouldn't be punished; they should only be rewarded. And then we deny that we are still punishing them, after all, for example by silence and withdrawal.
The reality is you aren't in a relationship if you only get praise and positivity. That feels meaningless or even creepy depending on the intensity. A genuine connection will feel positive/neutral most of the time and negative occasionally, since we are knowledge-creating entities. For example, in a computer game you're learning optimally if your win:lose ratio is somewhere around 80:20. I would guess this applies to our relationships too.
Neither reward nor punishment will 'work' in the absence of the other. This is why tyrants go over the top with punishment, because all blame and no praise is lack of relationship too.
If you establish a norm that everybody gets yelled at, even the better performers, then getting yelled at comes to just mean "pay attention" or "you need to focus more on this." Otherwise getting yelled at means, "You are not living up to the standards of the group, and we resent you for it. You should worry about what's going to happen to you here, if you're even allowed to remain."
Language acquisition presents what is perhaps the easiest falsification of your claim. Children don't learn verbal communication so well because of a system of instruction inescapably based on rewards and punishment from an instructor who can teach the lessons that the child wishes to learn (or thinks they should learn). They learn so well because their brain essentially builds them a nice filter chain based on the sounds they hear at an early age range (thus language exposure is an important factor in later language learning).
We foist praise and blame onto that process mainly because most parents have decades- (or sometimes centuries-) old concepts of learning that aren't based on modern research. Still, I'd much prefer they err on the side of too much praise rather than risk abusing their children. We have plenty of research that tells us the clear risks when that happens.
> In certain circumstances, especially where (b) is less true, then praise and blame get upgraded to reward and punishment.
The example above is a case where (b) is less true. Infants don't desire to build a language filter based on the sounds they are hearing. It happens involuntarily. But your system would actually guide a parent in the wrong direction-- escalating praise/blame to reward/punishment in a situation where neither are warranted.
> For example, in a computer game you're learning optimally if your win:lose ratio is somewhere around 80:20. I would guess this applies to our relationships too.
For that to be testable, your character in the game would have to stay dead once it gets killed. Or at least its injuries would need to follow it everywhere. Like a friend tries to show you a new move and hands you the controller, then the game says, "Hey, you're that guy with the broken leg," and doesn't allow you to do the move.
I think the win:lose ratio would change significantly in that case.
Reward and punishment are an amplification of a generally unwanted signal. I'm not advocating these; I'm not taking a moral stance. Rather I'm talking about what people already do irrespective of what they think or say they are doing.
Children learn language because it helps them to get what they want. If reward and punishment guaranteed results then adults would be able to reliably recite the multiplication tables (which they can't).
I think neither is correct. I think this is a tactical decision that needs to be made in the moment based on time and energy.
Which is to say, it is neither correct nor incorrect to slowly, agonizingly, peel off a band-aid. This can be a successful strategy. Sometimes, however, you just need to rip it off and get on with your life ...
Ask yourself why they do double-blind experiments. I had read that, depending on the illness, a non-active treatment (sugar pill) given to the control group improved their reported condition by significant and substantial margins over a population that did not get the fake treatment.
"We did not find that placebo interventions have important clinical effects in general. However, in certain settings placebo interventions can influence patient-reported outcomes, especially pain and nausea, though it is difficult to distinguish patient-reported effects of placebo from biased reporting. The effect on pain varied, even among trials with low risk of bias, from negligible to clinically important. Variations in the effect of placebo were partly explained by variations in how trials were conducted and how patients were informed."
This suggests the double-blind may have more to do with stopping the researchers fudging the results, than convincing the patient they're getting real treatment.
That raises the question, has the medical research community picked up on this? I would expect a burgeoning movement to correct this in medical trials.
Though as that article explains, shysters, whether practicing some kind of standard woo, or working for big pharma, are keen to continue to pretend they don't understand this as long as it's paying their salary.
Let's say that you repeat something over and over again with the same individuals, with random Gaussian outcomes on each trial. What will happen is that an individual who is at the extreme on one trial (positive or negative) will be more likely to be in the middle on the next trial, because the extreme deviations are extremely unlikely, and the middle deviations are relatively likely. So the really good person on one trial is likely to look worse on a subsequent trial, and the really bad person is likely to look better.
If you praise the person who did well on the initial trial, then, you will see good outcome -> praise -> worse outcome. If you punish the person who did poorly, you will see poor outcome -> punish -> better outcome. But this is spurious, because the good -> worse transitions and bad -> better transitions would have happened regardless of whether you praised or punished.
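Here is a rough sketch of that setup with no feedback in it at all, assuming each person's two trials are independent standard-normal draws. The 10% cutoff, the seed, and the function name are arbitrary choices of mine.

```python
import random

def regression_without_feedback(n_people=100_000, top_fraction=0.1, seed=1):
    """Two independent Gaussian trials per person; nobody is praised or punished."""
    rng = random.Random(seed)
    trials = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_people)]
    trials.sort(key=lambda pair: pair[0])      # rank people by their first trial
    k = int(n_people * top_fraction)
    worst, best = trials[:k], trials[-k:]
    best_got_worse = sum(second < first for first, second in best) / k
    worst_got_better = sum(second > first for first, second in worst) / k
    return best_got_worse, worst_got_better

if __name__ == "__main__":
    print(regression_without_feedback())
```

For independent draws the expected fraction is 1 - top_fraction / 2, so roughly 95% of the top decile look worse on the second trial and roughly 95% of the bottom decile look better, before anyone has said a word to them.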
I think the thought experiment is imperfect for various reasons but does illustrate an important potential source of bias pretty well.
It only tells you what to account for if you want to test it
I think the lecturer and the objecting instructor are talking about two different things.
The instructor's thesis is that positive reinforcement is more effective. I'm assuming that this was a measured, long term result, e.g. students that were positively reinforced did better on final tests or other summarizing measurements.
The objecting instructor is talking about single instances "in the seat," not measured over time. "More often than not, the next maneuver ..."
I'm assuming that a student's performance will be somewhat more random than an experienced practitioner's performance. So there would be a tendency for regression to the mean in isolated instances, and a small enough sample of instances would not be enough to predict final measured proficiency.
Also because of the way the objecting instructor phrased his objection, I'm assuming that he hadn't checked his observations against the final summarizing measurements.
This is how I made sense of the conflict between the two people.
Is doing well on one maneuver mostly signal (the pilot's skill level) or noise (random chance)?
In the book Nate talks about his own career as a poker player, and how there's so much chance involved that he still didn't really know if he was a good player or not after doing it for a couple of years. That's how hard it is to tease the signal from the noise. Would beating himself up about every lost hand and celebrating every win be worthwhile, or would it just distract him from the real task of slightly beating the odds and grinding out a profit at the margin?
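A back-of-the-envelope way to see why it takes so long, assuming per-hand results are independent with a fixed standard deviation (a big simplification). The function name and any numbers you feed it are hypothetical, not figures from the book.

```python
import math

def hands_needed(edge_per_hand, sd_per_hand, z=2.0):
    """Hands before the average result sits z standard errors above zero.

    The standard error of the mean after n hands is sd / sqrt(n), so we need
    edge >= z * sd / sqrt(n), which rearranges to n >= (z * sd / edge) ** 2.
    """
    if edge_per_hand <= 0:
        raise ValueError("edge_per_hand must be positive")
    return math.ceil((z * sd_per_hand / edge_per_hand) ** 2)

if __name__ == "__main__":
    # Hypothetical player: average edge of 1 unit per hand, swings of 10 units.
    print(hands_needed(1.0, 10.0))
```

The required sample grows with the square of the noise-to-edge ratio, so halving the edge quadruples the number of hands, which is why any single hand says almost nothing.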
The point is praise is better. Not futile. Easy to misread reversion to the mean as the consequence of praise / punishment.
I am sure that isn't what you are saying, but for the sake of argument, here's a counter.
After raising a few kids, and watching many other parents do the same, I have found that both praise and punishment are an absolute requirement for success. But praise is good for personal bonding, and punishment for learning responsibility.
As adults we punish ourselves (if raised to self-evaluate), and at the extreme we justify ourselves in error if only given praise. Self-evaluation cannot be done without self-punishment.
Before you go nuts on what "self punishment" means in this context, I will provide one. At the lowest level it means acknowledging that an error was "my fault". The most minimum amount of punishment possible is taking responsibility. It hurts, it's punishment. Blaming others is easy, and doesn't hurt at all, but will not help you avoid repeating a mistake.
Having to apologize for a mistake or hurting others is a form of self imposed (or parent imposed in some contexts) punishment. And it's extremely effective in resolving conflict.
Again, I am sure that I read into your point, but it seemed fitting to address this purely from an argumentative perspective.
As an example, 'regression to the mean' ensures that it seems like blaming works better than praising.
Because of regression to the mean, an exceptional result is more likely to be followed by a more average one, irrespective of the instructor’s reaction. What the instructor actually achieved was to condition themselves to respond in a certain way to a student’s performance, rather than conditioning the student to respond to the instructor’s feedback as intended!
I thought the demonstration was going to be to set people in pairs, have one toss a coin and the other punish / scold him every time he didn't come up with tails ;-)
That would help make the case that the positive effect of punishment is entirely due to chance.
The book is about Daniel Kahneman and Amos Tversky and their research. In my opinion a very good combination of telling the story of the main characters and also giving interesting points from their research.
The 'gets better, gets worse' quality here is contestable because the underlying "skill" is a phantom. There is no perversity in testing whether a non-existent thing is a property of the system or not: that's a strong test in science.
The perversity is people's belief in the distinction of ability over something subject to random chance.
Is the final analogy showing that relative performance on any given trial is mostly random?
Or is the analogy to reinforce his point that relative performance on subsequent trials is inversely related to the type of feedback given?
It shows that when there is a Gaussian distribution of performance, then exceptionally good performance will, purely based on statistics, tend to be followed by a decrease in performance, and exceptionally bad performance will tend to be followed by better performance.
The point is that this effect will usually be much stronger than the effects of feedback, and if you naively analyse what kind of feedback works better, it will lead you to the wrong conclusion because praise is given for exceptionally good performance while scolding is given for exceptionally bad performance.
His thesis is that praise, in real-world situations, has a positive effect, and punishment has a negative effect, but on extremely short timeframes random effects dominate, leading to the reasonable empirical belief that praise is ineffective compared to punishment.
while attempting to teach flight instructors that praise is more effective than punishment for promoting skill-learning
The claim is that it does make a difference; being nice is better. It's easy to misread reversion to the mean as the result of praise / punishment.
Here's what I paid attention to:
"because there is regression to the mean"
"I immediately arranged a demonstration in which each participant tossed two coins at a target behind his back, without any feedback."
I think the important part of that last sentence is "without any feedback" -- this is why I assert "It doesn't affect the outcome".
Regression to the mean notwithstanding, you can "scold" but also correct something that was done wrong.
And if you got it right the first time, you'll have a tendency to half-ass it the next time (as you already know how to do it if the need arises).