
Should Surgeons Keep Score? - jsomers
https://medium.com/backchannel/should-surgeons-keep-score-8b3f890a7d4c
======
shanusmagnus
I'm astounded at the tenor of comments on this article, which I would expect
to find at any site other than this one. Yes, measuring surgical outcome data
and surgeon performance could lead to a variety of complications, perverse
incentives, regulatory capture, etc. etc. All true.

But you know what's worse than that? Having no fucking information about
anything. I'm in the process of trying to find a surgeon to help with an
orthopedic procedure, and it is so incredibly frustrating that I have a
thousand times as much information on which refrigerator I should buy, or
which phone, than about something that will affect my life more intimately
than any consumer product ever could.

And yes, of course I get that rating a commodity with a fixed function that is
used by zillions of people is way more straightforward. But look, all of these
apocalyptic scenarios about brain surgeons who only take trivial cases so they
can have a better score on the leaderboard? I'm just not worried about it. At
all. Because people are generally proud and want to do better. Because of the
social and professional stigma that would come from such behavior. Because if
you collect a rich dataset, you can account for most kinds of gaming, just as
you can do for teacher outcomes, or every other damn thing.

But probably most of all because the current state of affairs is so abjectly
wretched, and literally any effort at measurement and accountability would be
better under every reasonable scenario.

~~~
hermanhermitage
In case you find this useful:

Just lying in a hospital bed here, about 36 hours after the 2nd revision on a
THR first done in 1987. In the absence of clear public metrics I used the
following approach to choose my most recent surgeon:

1. Age/experience. I was looking for someone about two-thirds of the way
through their career, assuming experience and survival bias were positive
indicators.

2. I got a series of second opinions, asking surgeons the direct question of
whom they would get to operate on themselves if they needed the procedure
done.

3. I consulted a series of family doctors on their recommendations.

4. I went with someone with a strong track record designing prostheses but
also a large history of performing the particular procedure I was having.

There are always trade-offs. My surgeon works long shifts, and so I got them 12
hours into a shift on a Friday, which had me a little nervous (fatigue-wise).

So my main advice is: don't be shy in getting a lot of opinions and advice. I
will probably need another 2-3 revisions in my lifetime on this hip alone, so
I thought it worth my while to take a thorough approach.

Still at the end of the day bad luck can always happen. Expect the best but be
mentally prepared for the worst.

~~~
tomcam
I believe #2 is key and have thought for years this could be the basis of a
super-effective site. But not a lucrative one: a key problem is that any kind
of monetization that would benefit the doctors doing the rating could pervert
the results.

~~~
imaginenore
It could easily turn into "I scratch your back, you scratch mine" type
reviews. Even voting rings of doctors.

~~~
hermanhermitage
Definitely detected that in my travels - diabolical surgeon golfers ring :). I
learnt to ask them what handicap they play off.

------
conorh
My wife is a highly specialized surgeon, she does _one_ operation, and she
does it around 600 times a year (Parathyroidectomy). An average endocrine
surgeon might do 20 of these a year. She went through training as an endocrine
surgeon and she tells me that the difference between the operation that they
do at her center and what an average endocrine surgeon will do is like night
and day. It is just not possible for surgeons with normal volumes to be able
to achieve that level of skill. What helped them recently was the release of
the Medicare volume data [1]. This data is probably right now the only way to
get an idea of how many operations of a particular type your surgeon does (not
for all operations unfortunately, not unless you know a lot about billing
codes and practices!).

[1] http://blog.parathyroid.com/parathyroid-surgery-medicare/

------
bokonist
Recently I had a major shoulder operation. I was shocked at how little
information I had about who was a good surgeon. If I didn't have a family
member who worked in the complaints department of the local big hospital, I
would have had no way of knowing who was considered good or bad.

My surgeon told me that I had a 90% chance of success, and a 1% chance of
nasty complications like nerve damage. The literature on the procedure said
that typical success rates were 75% and the complication rate more like 5%.
Was my surgeon particularly good? Did I have a better shot because I was young
and healthy? Or was my surgeon suffering from the Lake Wobegon effect and
overconfidence? There was no way for me to know as a patient.

That said, naive score keeping could go very awry, for very obvious reasons
that others in the thread have mentioned.

Here is the system I would like to see. Tell me how this could get gamed:

Surgeons should be required to give official, written, probabilities to all
potential patients. So for instance, a surgeon might say that there is a 90%
chance that I can play football again, a 1% chance that my arm will end up
worse than before the surgery, and a .01% chance of death.

Then surgeons should simply be measured against their own predictions. When I
go to a surgeon, I should have access to that data. The surgeon has no
incentive to be overly conservative with the probabilities, because then I
will go to a surgeon who is more skilled, can predict better outcomes, and has
the track record to prove it. Nor does the surgeon have an incentive to be
overly optimistic, because then they will get dinged for not scoring according
to their own predictions. Nor does the surgeon have an incentive to turn away
high-risk patients; they just need to state the risks accurately.

The patient wins because the patient can finally have the most accurate
possible information about the risks and benefits of a surgery, and can get
multiple opinions, compare them, and have good data about which surgeons are
reliable in their predictions.
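For concreteness, here's a rough sketch of how such predictions could be scored (all numbers invented; the Brier score, the mean squared error between stated probability and observed outcome, is one standard way to score this sort of track record):

```python
# Brier score: mean squared error between predicted probability and
# observed outcome. 0 is perfect; an overconfident surgeon scores worse
# than one whose stated odds match their actual results.
def brier_score(predictions, outcomes):
    """predictions: stated probabilities of success; outcomes: 1/0."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)

# Surgeon A quotes 90% success and delivers 9 out of 10.
a_preds, a_outcomes = [0.9] * 10, [1] * 9 + [0]
# Surgeon B also quotes 90% but delivers only 7 out of 10.
b_preds, b_outcomes = [0.9] * 10, [1] * 7 + [0] * 3

print(brier_score(a_preds, a_outcomes))  # ~0.09 (well calibrated)
print(brier_score(b_preds, b_outcomes))  # ~0.25 (overconfident)
```

A patient comparing the two scores learns whose quoted odds can actually be trusted, which is the whole point of the proposal.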

~~~
imaginenore
Why ask for the doctor's opinion about the data, and not for the data itself?

And what's the punishment if the doctor's estimate is wrong? Or he/she is
lying?

Doctors have enough shit on their plate; we simply need access to the
hospital data with the doctor names attached.

~~~
bokonist
I ask for the doctor's official opinion/prediction about the probabilities for
my own surgery. This must be the doctor's opinion because every person and
every surgery is different. A doctor's job is to analyze each person's
situation, and then use their experience, knowledge, and professional
judgement to give the patient the doctor's best estimate of the benefits and
risks of a given procedure. This is not "adding sh*t to their plate", this is
formalizing a core job function of every doctor.

The patient would get the doctor's prediction track record directly from the
hospital or a third party monitoring agency. That way the patient knows if the
doctor is generally accurate in their predictions, or if they are consistently
overconfident in their own abilities.

If a doctor was wrong once, they are wrong. If they are consistently wrong,
then that shows up in their stats. Patients will no longer trust their
predictions, and will seek other doctors. The doctor will have to really
improve their prediction ability (a good thing) or else go out of business for
lack of patients.

~~~
pedrosorio
I don't see how this solves the problem of some doctors tackling harder/easier
cases.

How do you distinguish two doctors with high prediction ability and low
success rates (compared to the average for that procedure) if one is bad (and
she knows it) and the other is tackling harder cases (and is actually one of
the best in the field for cases with high probability of complications)?

Without input from other doctors (or simply using a lot of data where you can
correlate hard procedures with other factors in the patient data) you'll never
be able to distinguish the two doctors mentioned above.
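For what it's worth, the risk-adjustment idea in that parenthetical can be sketched as an observed/expected comparison, where some risk model supplies a complication probability per patient (all numbers below are invented):

```python
# Toy case-mix adjustment: compare a surgeon's observed complication
# count to the count a risk model expects for their particular patients.
# O/E near 1.0 means performing about as the model predicts, regardless
# of how hard the case mix is; well above 1.0 means underperforming.
def oe_ratio(patient_risks, outcomes):
    """patient_risks: model-estimated complication probabilities per patient;
    outcomes: 1 = complication occurred, 0 = none."""
    return sum(outcomes) / sum(patient_risks)

# Surgeon taking hard cases: high predicted risks, outcomes to match.
hard = oe_ratio([0.4, 0.5, 0.3, 0.6], [0, 1, 0, 1])    # ~1.11
# Surgeon taking easy cases but doing far worse than the model expects.
easy = oe_ratio([0.05, 0.05, 0.1, 0.1], [1, 0, 1, 0])  # ~6.67
print(hard, easy)
```

Both surgeons have a 50% raw complication rate, yet the ratio separates the one tackling hard cases from the one botching easy ones, which raw success rates never could.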

~~~
bokonist
" _How do you distinguish two doctors with high prediction ability and low
success rates (compared to the average for that procedure)_ "

The doctor is never compared against the "average for that procedure" for
exactly the reasons you give. The doctor is only compared against that
doctor's own predictions.

So as a patient in need of a surgery, you would get opinions from 3-4
different surgeons, each one would offer their personal outcome probabilities.
The patient gets access to that doctor's stats that score their actual track
record against their own predictions. The patient should then choose the
surgeon who gives the best odds but also has a track record of hitting their
predictions.

A doctor who tackles hard cases should still have a good success rate against
their own predictions. Such a doctor will just lower their predictions
according to the riskiness of the case. If the doctor is good, such a doctor
will still get business, because the skilled doctor will still offer better
odds (odds that the patient can actually trust) than can be reliably offered
by a less skilled doctor.

The one weakness of my system is that it does not give any sort of global
score. There is still the problem of having to find 3 to 4 good surgeons to
ask for an opinion in the first place. But at least once you have gotten to
that point, you can have trustworthy predictions upon which to base your
decisions.

~~~
sokoloff
Doctors in such a system may still have an incentive to predict the tough
cases overly negatively, and specifically by an amount greater than their
peers.

The hope would be "send this tough or impossible case to someone else" such
that the doctor's success rate stats will remain high and the outcome
prediction stats would be unaffected (as the "trial" would go to another
doctor).

I'm all for having more information available, and when my extended family
faces a serious medical concern, we seek out friends and family in medicine,
asking "if you faced this situation, what doctor would you trust?" I don't
know of a way to globally institutionalize that process.

~~~
MarkMc
But the track record for such a doctor will clearly show that they aren't
tackling the difficult cases. Would a doctor want a high success rate if it
comes with a reputation for only taking easy cases?

Even if the answer to the above question is Yes, the reduced number of doctors
willing to tackle difficult cases will be able to charge higher fees. So at
some point you would reach a market equilibrium where the desire to tackle
easy cases is balanced by the desire to earn a higher income. That is, when
compared to the current system the easy cases will become cheaper and the more
difficult cases will become more expensive - but maybe that is an acceptable
outcome if it means the system as a whole is more efficient?

~~~
sokoloff
It depends on the goals of the doctor. Some doctors aspire to a massive
volume of fixed-rate, "easy" procedures (look at cataract surgeons benefiting
from the advances in that field while insurance and Medicare reimbursement
rates remained constant [and high]). I'm not knocking those docs; they
provided real and tangible benefits for millions of patients with cataracts
and I don't begrudge them their money. It's just super amusing to me to walk
down the multi-million dollar warbird parking area and have half the owners be
eye doctors.

As for aspiring to have a reputation for efficacy in difficult cases and
assuming that you'll be able to charge more due to market forces? I don't see
that playing out in any Western medicine economy. IMO, you can't build a
functional ecosystem around the very few patients who are self-paying and
willing to pay large sums for better care.

Very wealthy individuals and pro sports teams are the only customers I can see
for that. The overwhelming majority of people (far in excess of 99%) are going
to have two hurdles to procure your expensive services. First, they have to
find and select you. Second, they have to convince their insurance provider to
pay your rate, instead of the "going rate". That seems uphill, probably
steeply so.

~~~
MarkMc
(Sorry for delay - I just read your reply)

As someone who had a serious illness last year and who is not in the top 1%, I
can promise you I would have crawled over broken glass if I thought it would
have improved my chance of survival. I certainly would have paid a large chunk
of my net worth to switch to a doctor who had a significantly better track
record.

I don't understand why the medical insurance market wouldn't work. If the
insurance company won't let me switch to a doctor who has a better track
record, won't that insurance company get a bad reputation and lose business?
U.S. doctors have a better reputation than Mexican doctors, and in this case
most people pay for insurance which covers treatment by the more reputable
doctor. I imagine the same effect would apply between doctors within the US
if there were an objective track record which identified the better-performing
doctors.

------
forrestthewoods
Why don't we? Because when it comes to health care we aren't rational.
Emotions run and win the day. But not only do they win the day, they win the
day in court for a lawsuit.

Ob-Gyns have some of the highest insurance rates among doctors. Possibly the
highest. Why? Because when they screw up, it's a baby that dies. God help you
if you think there's an ounce of rationality in a room with a dead baby.

Here's a similar article from earlier this year. The short version is that a
man lost his wife to a mistake and... well, that was it. The NHS (UK) doesn't
do investigations into what happened or why. That just means it will happen
again. When a plane crashes there is a gigantic investigation and the results
are shared. There are some famous cases where a series of basic mistakes
needlessly led to a crash and everyone died. All pilots know about this so it
doesn't happen anymore. Hospitals don't do that, because it ends in pointing
the blame finger. And then people lose their job, lose their license, and get
the ever-living shit sued out of them.

http://www.newstatesman.com/2014/05/how-mistakes-can-save-lives

~~~
rokhayakebe
The difference with the airline industry is that when there is a crash
everyone dies. If my father is dead, and his doctor who made the mistake is
alive, you can imagine she is not going to volunteer the information.

~~~
kijiki
This is only true if the cause is pilot error. If the cause is improper
maintenance, the responsible party is still alive.

In the case of the NTSB, there is a strong culture of not penalizing errors
unless they were criminal or egregious, which works well in promoting
cooperation with the investigation. This would likely be very difficult to
achieve with doctors, especially in the US.

------
ytturbed
Perhaps would-be surgeons ought to have their manual coordination assessed
_before_ they commence years of expensive training. The irony is that for my
father's generation, in England, prowess on the rugby field was considered
important in getting into medical school. (And that might actually have been a
good thing. I suspect top athletes, musicians and surgeons all possess the
same talent which would cease to be a 'talent' if only we could explain it.)

~~~
IndianAstronaut
Right now the main criterion for medical school admission is simply how much
information you can cram into your head and regurgitate. Critical thinking,
thinking outside the box, dexterity, compassion... irrelevant.

------
busyant
I heard this first hand from Judah Folkman
(http://en.wikipedia.org/wiki/Judah_Folkman)
at a seminar in 1999. He was discussing news reports that other labs were
unable to reproduce the anti-cancer properties of certain proteins discovered
by an MD in his lab.

He said (paraphrasing): Scientists become upset when other researchers cannot
reproduce their results. Surgeons become upset when other surgeons _can_.

------
alextgordon
I've been watching TCEC this week - the chess world championship for
computers.

The joint #1 engine, Stockfish, has had a series of embarrassing losses to
Gull, a much inferior engine. Everybody was convinced that Stockfish was
surely broken because its level of play was so poor.

Then suddenly, in the last couple of days, it has turned itself around. After
a series of wins it's only half a point off the top spot (if it draws the
current game).

Here's the thing: _nothing changed_. It's the same code, the same algorithms,
the same build. That bad run it had was just bad luck, there was never
anything wrong with it.

An excess of information can absolutely be a force for ignorance. People will
see patterns even when none exist. Often the only way to stop people from
misinterpreting data is to not have any data at all.

If people can be so easily misled by the fortunes of an unchangingly
consistent algorithm, just think how destructive data on surgeons could be.
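To illustrate how much luck is in play, here's a toy simulation (the win probability is made up): a player of completely fixed strength still produces wildly different short-run scores.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

# A player of fixed strength: wins each game with probability 0.55.
# Nothing about the player ever changes between events.
def run_score(games=10, p_win=0.55):
    return sum(random.random() < p_win for _ in range(games))

# Twenty short "events" for the same unchanging player.
results = [run_score() for _ in range(20)]
print(min(results), max(results))  # short-run scores swing widely on luck alone
```

The gap between the best and worst event is pure sampling noise, yet a spectator watching one bad event would swear the player was broken.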

~~~
lucio
Question: bad luck in chess? I do not get it. Maybe I'm simplifying, but the
program which searches deeper into the move tree should always win.

~~~
shadowfox
> but the program which searches deeper into the move tree should always win

It doesn't always work like that. Since examining the whole move tree is
usually far too expensive, heuristics kick in at various points, providing
assessments for positions (and thus helping to prune the tree). It is quite
possible to land in a position where your set of heuristics is not quite
right.
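A toy illustration of that idea (the game tree and heuristic below are invented): depth-limited minimax falls back on heuristic estimates at the search horizon, and a flawed heuristic can steer it into the worse branch even though the algorithm itself is correct.

```python
# Depth-limited minimax: beyond `depth` plies, positions are scored
# by a heuristic instead of searched further. A bad heuristic value
# at the horizon propagates up and picks the wrong move.
def minimax(node, depth, maximizing, heuristic, children):
    kids = children(node)
    if depth == 0 or not kids:
        return heuristic(node)  # estimate, not ground truth
    scores = [minimax(k, depth - 1, not maximizing, heuristic, children)
              for k in kids]
    return max(scores) if maximizing else min(scores)

# A tiny hand-built game tree (node -> successors) with known leaf values.
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
leaf_value = {"a1": 1, "a2": -5, "b1": 2, "b2": 3}
children = lambda n: tree.get(n, [])

exact = lambda n: leaf_value.get(n, 0)
# A flawed heuristic that badly overrates the interior node "a".
flawed = lambda n: 10 if n == "a" else leaf_value.get(n, 0)

print(minimax("root", 2, True, exact, children))   # -> 2, full-depth truth
print(minimax("root", 1, True, flawed, children))  # -> 10, fooled by the heuristic
```

The shallow search prefers branch "a" (heuristic value 10) whose true minimax value is -5, which is exactly the "heuristics not quite right" failure mode.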

------
robert_tweed
The ideas presented here seem pretty sensible. However, there is a risk that
if such data is collected, it may later be used to compare surgeons to each
other. One of the reasons that is risky is that good surgeons tend to
deliberately take on more complex and riskier operations than those less
capable, and therefore have statistically worse outcomes. There can also be
geographical biases if statistics are summarised, such as certain hospitals
having poor outcomes because they happen to get a lot of gunshot wounds, or
stabbings, or the local population is older than average, or childbirth rates
are higher, etc. It would be difficult to control for all such possible
variables, and even more difficult to explain the methodology to the public.

It's probably a universally good idea as long as there are safeguards to
ensure that the data is not shared generally, which might lead to misinformed
reactions and subsequently, target-chasing (see: Goodhart's law). Using the
data purely for personal improvement seems like it should be effective. I
think it's reasonable to assume that most surgeons would want to self-improve
if they can. Also surgeons tend to be especially competitive, so creating a
competition against themselves could be a good motivator in itself.

The data could perhaps be used to weed out poorly performing surgeons too. I
believe the right way to do that would be to share the data anonymously with
peers who are able to interpret it properly. These peers could then flag any
worrying anomalies in the data that can't be explained away, which should
trigger a follow-up investigation of the individual concerned. This could
perhaps stop the next Harold Shipman, as well as weeding out incompetence.

Of course, it may not be possible to usefully anonymise case data, since some
of the more complex cases may be sufficiently unique that the surgeon could be
personally identified from the description. This would only work if cases can
be classified broadly enough to avoid personal identification and narrowly
enough to provide enough information for peer review.

~~~
arjie
As the case of Dr. Christopher Duntsch revealed, the limiting factor in these
things isn't that other surgeons notice an incredible lack of skill (because
they did), but that the nature of the problem requires a great deal of time
before the board involved can collect sufficient evidence of incompetence to
the degree of negligence. It looks like this sort of thing would still help,
but you'd still need to determine that the surgeries themselves were done
ineptly.

------
johnorourke
Developers could learn from this. I've a few grey hairs, many caused by late
nights coding, and I now run a dev team. I wonder if we could really learn
from this. For example, take the third video in the YouTube playlist from the
article and the criteria the surgeons judge each other on, and let's see how
transferable they are:

- minimal movement: using clear, concise code, or few commands, to solve a
problem

- lack of repeated actions: exactly that. Why did I just look at that same
log file 5 times?

- confident use of tools: do I have a set of tools (IDEs, editors, commands)
I know intimately?

- awkwardness of actions: can I think several steps ahead in the problem and
bring things into line to form the solution?

And so on. This is a raw, unrefined thought and I hope it gets thoroughly
pulled apart in any replies.

~~~
codingdave
I think this would lose some practicality when put into practice. Clumsy,
awkward code may still offer great business value. Brilliant, precise code,
when misapplied to its purpose, can still fail. Of course, everyone wants good
code... but I do not think the direct correlation of quality to outcomes from
surgery would exist in the coding world.

~~~
c0rtex
Maybe it's sufficient to ask: what would be useful for developers to keep
score of in order to improve professionally?

Surgeons keep score by measuring patient outcomes in order to draw conclusions
about their own performance, the effect of which is to "shrink the outliers"
- people see where they can improve and go ask their colleagues, "hey,
how'd you do that?". So what would the equivalent of that be for a developer?

The best tool I can think of for this isn't an automated system for
scorekeeping - it's soliciting feedback [1]. You ask a more seasoned dev who
is familiar with your work where you can improve.

How do you "shrink the outliers" on a team of developers? Get people to work
together. Take each other's code apart.

[1] Particularly negative feedback, according to Elon Musk at the end of this
vid:
http://www.ted.com/talks/elon_musk_the_mind_behind_tesla_spacex_solarcity

------
lumberjack
This doesn't solve the information asymmetry (and I don't really think you can
solve the information asymmetry short of getting an MD yourself). What it does
is abstract it behind tables of scores. But at the end of the day, the patient
still needs to trust that the system is logical and that the compiled data is
accurate, and then somehow bridge the generalized case of the compiled data to
his own case, which is probably the trickiest part for somebody without
medical training.

Maybe you should just trust your GP on this one. Or if you think you cannot,
maybe find a healthcare system that aligns the incentives of your GP with your
well-being.

------
xorcist
A big problem with scoring systems is that they incentivise risk taking. If
you enter one of the stock picking competitions, for example, your rational
choice is to take extreme risks.

You probably won't pick a winning stock, but if you do you're likely to have a
good chance at winning the competition. This is the opposite of what an
investor wants to do with his own money, which is to manage the risk taking.

This is a common problem in designing scoring systems for measuring
performance, even beyond the more obvious problems with natural variation.
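A quick simulation of that incentive (all numbers invented): in a winner-take-all contest against many steady players, a strategy with a worse expected return but a fat tail wins far more often than its fair share.

```python
import random

random.seed(0)  # fixed seed for reproducibility

def risky_win_rate(trials=20000, safe_players=99):
    """One high-risk entrant vs. many low-risk entrants in a
    winner-take-all stock-picking contest."""
    wins = 0
    for _ in range(trials):
        # Safe strategy: steady ~5% return with small variance.
        best_safe = max(random.gauss(1.05, 0.05) for _ in range(safe_players))
        # Risky strategy: usually halves the money, occasionally triples it.
        # Expected return is 0.1*3.0 + 0.9*0.5 = 0.75, far worse than 1.05.
        risky = 3.0 if random.random() < 0.10 else 0.5
        if risky > best_safe:
            wins += 1
    return wins / trials

rate = risky_win_rate()
print(rate)  # ~0.10: ten times the fair 1-in-100 share, despite the worse expected value
```

Nobody investing their own money would take that bet, but the contest's scoring rule makes it the rational entry, which is the mismatch between "maximize the score" and "be a good investor".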

~~~
learnstats2
That's true of stock picking competitions/tournament-style gambling, yes, but
it wouldn't apply here.

If I'm a surgeon with a good score and I make bad judgements that start
coming off poorly, I won't have a good score any more. Am I motivated to take
a bad risk? Not unless I have some other reason to.

~~~
xorcist
It's not directly applicable because it is a different problem domain, but the
general reasoning holds. The problem is that the optimum strategy for
maximizing your score is not to be as good a surgeon as possible.

(Which I stupidly say without having the slightest idea how this works beyond
what is described in the article. But I have yet to see a scoring system
where the optimum strategy aligns perfectly with the desired outcome, and
this holds for everything from grades in school to karma systems.)

~~~
learnstats2
No: the general reasoning doesn't hold.

The reasoning for tournaments is that you have to be first at all costs, so
you should consider taking the largest risk available to avoid coming second.
That reasoning doesn't hold for surgeons: there's room for more than one
surgeon.

There are problems with scoring systems, yes, but a complete lack of
information for consumers is considered terrible in any other industry.

------
k2enemy
Here's a study of hospital report cards that shows a decrease in patient
welfare from the increased transparency. As others in this thread have noted,
the report cards led to doctors not wanting to take on risky patients.

http://www.kellogg.northwestern.edu/faculty/satterthwaite/research/2003-0520%20dranove%20et%20al%20re%20report%20cards%20%28jpe%29.pdf

~~~
lostlogin
This article mentions this problem and addresses it towards the end. The key
part of addressing the problem is confidentiality. Interestingly, I know of
another piece of software being developed with the aim of reducing
inappropriate dosages during medical care. It works in a similar way and uses
anonymous comparison as well.

------
patcheudor
Like anything in life the devil is in the details. What measures and metrics
could possibly be used to determine the score? If you go by mortality rates
alone you risk creating an environment where no one wants to operate on the
most at risk patients. You'll quickly find that the best surgeons aren't
necessarily the most surgically skilled, but instead those who do the best job
of pre-screening surgical candidates. If this continues, soon you'll have
patients whom doctors flat out refuse to cut open. If instead of mortality
rates, you score the doctors on skill of movement in their procedures, again,
you'll find doctors gaming the system by picking healthier patients with lower
body fat percentages.

Fundamentally, if a system was in place to score surgeons a lot of checks and
balances would need to be enacted to avoid lowering the quality of care by
ensuring doctors, hospital staff, and administrators couldn't simply pick and
choose what surgeon gets what patient. I really see this as a neat data
science project after the fact, but if implemented could have a significant
downside impact on patient outcomes for some.

~~~
Kliment
Well, the solution proposed in the article is to keep the scores hidden from
everyone but the particular surgeon. That is, have the score be a motivator
and guide for personal improvement rather than an external quality indicator.
They report on trials where similar scores were published in NY and the end
result was that the highest risk patients got shipped off to Ohio, exactly as
you describe. So it's critical that the data is not shown to anyone but the
surgeon in question.

~~~
patcheudor
The problem is that by keeping it hidden from everyone but the surgeon, and by
still allowing the surgeon to have a say in who they operate on, they can
still game the system. There are no external controls at that point. Now, if
it's only for their benefit, what possible reason would a surgeon have to game
the score, you might ask?

You can provide all the assurances in the world that the score is a personal
motivator, but we all know that could change at the drop of a hat. I've seen
this far too often in the security field. Someone will come up with a great
measure, will reassure everyone that all unintended consequences have been
considered, and then boom: a year down the line, an executive unaware of
those ramifications will want to use the metric for performance evaluations.
I think any surgeon who believes the score is only there to help them, and
that they shouldn't plan for a future where someone pushes to make it more
public, is being a bit naive in the ways of the world.

This isn't to say I don't think there could be a solution. The scoring system
needs to be built from the ground up with an offset risk equation which
provides significant incentive to operate on at-risk patients. Maybe a surgeon
gets three points for operating on the patient, and if the patient dies they
only lose one. However, if they have a healthy patient, maybe they start off
with one point, with just one to lose. Obviously all the unintended
consequences of this model would need to be explored at length.
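As a rough sketch of that offset idea (using the hypothetical three-point / one-point values, and made-up success rates):

```python
# Risk-offset scoring: riskier patients are worth more points, and the
# penalty for a bad outcome is capped, so taking hard cases still pays.
def case_score(high_risk, success):
    if high_risk:
        return 3 if success else -1  # earn 3, lose only 1 on failure
    return 1 if success else -1      # easy case: earn 1, lose 1

# Expected score per case for a surgeon who takes high-risk cases at a
# 60% success rate vs. one who cherry-picks easy cases at 95%.
def expected_score(high_risk, success_rate):
    return (success_rate * case_score(high_risk, True)
            + (1 - success_rate) * case_score(high_risk, False))

print(expected_score(True, 0.60))   # ~1.4 per high-risk case
print(expected_score(False, 0.95))  # ~0.9 per easy case
```

With these particular weights, the surgeon taking hard cases expects more points per case than the cherry-picker, which is the incentive the offset is meant to create. Tuning the weights (and guarding against new gaming strategies) is exactly the "explored at length" part.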

~~~
jerf
"There are no external controls at that point."

Well... there appear to be no external controls at _this_ point, either...

~~~
baddox
Right. Even in the current situation, surely there is _some_ incentive for
surgeons to turn down high-risk patients, because of the inevitable stress and
potential legal trouble from a failed operation.

------
bawana
Measuring the quality of surgeons is like measuring the quality of policemen
or firemen or lawyers or politicians. It just isn't done.

And even if it could be done - how are you going to enforce quality once you
measure it? By suspending the bottom 20%? There is already a shortage of
surgeons. And surgeons are paid poorly for their work and their investment in
education. You cannot pay the better ones more - it is not a free market;
prices are fixed by government decree. And physician reimbursement is ONLY
tied to work volume (Relative Value Units billed). Frankly, I would be scared
to be a patient in the US now: cannon fodder in the war of the medical
industrial complex versus the third-party payers. Insurers would rather pay
multimillion dollar salaries to executives and hospital CEOs than fund a
quality assurance program. Incentives are aligned to maximize profit, which is
constantly and carefully measured. Shareholder return is the single most
important metric of a publicly held company like Blue Cross, United Health
Care, etc. Quality is a pass/fail grade based on outdated measures designed to
be politically correct.

------
healthenclave
I would like to give my 2 cents to the discussion, as a medical doctor who
was training to become an orthopedic surgeon.

One of the problems with the surgical branches is that beyond a certain point
(i.e., beyond knowing how to do a procedure), the act of performing a surgery
is essentially an art form, a skill. I learned this from one of the leading
orthopedic surgeons in India. And that is one of the most crucial factors
that differentiates a good surgeon from a bad one.

Although skill itself cannot be quantified, in the case of surgery we can
certainly quantify the results of that skill in terms of complications of
surgery, recovery, and patient satisfaction.

One solution to the problem would be:

(A) To have a feedback mechanism for doctors, where they receive a score on
their performance and can compare it to other surgeons performing similar
procedures.

The surgeon would upload a video of all the types of procedures they do every
3 months. And just like NEJM, a committee of people would provide input and a
rating on the skills of the surgeon. This score, in combination with the
complication rate and patient feedback, would go into the overall score of
the doctor. The doctor would be able to see where they stand compared to
their colleagues from across the country (possibly the world). And for newer
(or bad) surgeons this system would provide a way to learn from the best in
the field and improve their skills.

(B) If you try to make such a score public initially, it will receive a huge
backlash from doctors and the industry. But having an internal score-keeping
mechanism is much better than having no score or rating system at all.

(C) Some hospitals actually do have internal metrics where they track
surgeons' performance, in terms of complication rate and other measures -- but
this data is RARELY available to the public.

(D) Unless some kind of law is passed at a Federal level in the US, I am not
very optimistic about the situation improving.
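The composite in (A) -- a peer video rating, the complication rate, and
patient feedback -- could be combined as a weighted score. The function below
is a rough sketch only; the weights, 0-10 scales, and cohort benchmark are my
own illustrative assumptions, not anything proposed in this thread:

```python
def composite_score(video_rating, complication_rate, patient_feedback,
                    cohort_complication_rate=0.05,
                    weights=(0.5, 0.3, 0.2)):
    """Combine three signals into a single 0-10 score (illustrative).

    video_rating: peer-panel rating of technique, 0-10
    complication_rate: surgeon's observed rate, 0-1
    patient_feedback: average patient score, 0-10
    cohort_complication_rate: benchmark rate for similar procedures
    weights: relative weight of (video, complications, feedback)
    """
    # Map the complication rate onto 0-10: a surgeon at the cohort
    # benchmark scores 5, at half the benchmark ~7.5, at double it 0.
    ratio = complication_rate / cohort_complication_rate
    complication_score = max(0.0, min(10.0, 10.0 - 5.0 * ratio))

    w_video, w_comp, w_fb = weights
    return (w_video * video_rating
            + w_comp * complication_score
            + w_fb * patient_feedback)
```

For example, a surgeon rated 8/10 on video, with exactly the cohort's
complication rate and 9/10 patient feedback, would score 0.5*8 + 0.3*5 +
0.2*9 = 7.3. Risk adjustment (case mix, patient age) is the hard part this
sketch deliberately omits.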

------
Zigurd
Measuring surgeon performance directly is both very difficult and unlikely to
do what you want: increase your chances of living through a surgery.

BUT, surgeons do not perform in a vacuum. There are a number of things you can
measure and get usable information.

You can compare the number of surgeries of the kind you are getting performed
by your surgeon versus by practices specializing in such procedures. You will
almost always find that the more often a surgeon does a procedure, the better
the outcomes.

You can measure re-admissions after surgery. You can measure infection rates.
Etc. These will tell you the quality of the hospital where the surgery will be
performed.

------
ankit84
After watching the videos, my answer is YES.

If a bad programmer is 10x slower, a bad surgeon makes you 10x more likely to
die, have complications, undergo reoperation, and be readmitted after hospital
discharge.

~~~
TazeTSchnitzel
Honestly, rating surgeons is probably much more difficult than rating
programmers.

~~~
gd1
Not so sure about that; these surgeons are performing the same procedure again
and again, and being compared to surgeons who also perform the same procedure.

We don't tend to write the exact same code repeatedly. Or the exact same code
as other coders.

~~~
robert_tweed
I suppose it is just like writing exactly the same app over and over again.
But in a different Lisp dialect every time. And several of the system
libraries are missing, but you never know which ones until you try running the
code.

------
tokenadult
I like this paragraph of the article best (but there are a lot of other good
paragraphs, building to an overall good whole, so I encourage you to read the
whole article): "In Better, Atul Gawande argues that when we think of
improving medicine, we always imagine making new advances, discovering the
gene responsible for a disease, and so on — and forget that we can simply take
what we already know how to do, and figure out how to do it better. In a word,
iterate." That's exactly it. Medicine improves most dramatically simply by
spreading the word about how to prevent and how to treat illnesses better to
everyone who hasn't mastered that yet. That's the biggest single factor in
steadily reducing death rates at all ages all over the developing world.[1]

The one time I had an immediate family member who needed treatment for a
puzzling disease, my mom was still working as a surgical nurse in our state's
main research university's teaching hospital. She knew who the best surgeon
was, who the best surgical resident was, who the best anesthesiologist was,
and who the best surgical nurses were. My relative was able to recover fully
very soon after surgery that THAT surgeon thought had "nil" risk--he was
confident of his abilities, with justification. Now that my mom is not in
active practice of nursing anymore, I would like a better consumer-facing
channel for information about which surgeons are the best in town. I would
definitely ask the nurses I know who work at teaching hospitals if the
question came up again for my family.

[1] An article in a series on Slate, "Why Are You Not Dead Yet? Life
expectancy doubled in the past 150 years. Here's why," provides some of the
background:

[http://www.slate.com/articles/health_and_science/science_of_...](http://www.slate.com/articles/health_and_science/science_of_longevity/2013/09/life_expectancy_history_public_health_and_medical_advances_that_lead_to.html)

Life expectancy at age 40, at age 60, and at even higher ages is still rising
throughout the developed countries of the world.[2]

[2] [http://www.nature.com/scientificamerican/journal/v307/n3/box...](http://www.nature.com/scientificamerican/journal/v307/n3/box/scientificamerican0912-54_BX1.html)

------
tallTrees
This is worth reading, to develop your software engineering skills.

[http://www.amazon.com/Introduction-Personal-Software-
Process...](http://www.amazon.com/Introduction-Personal-Software-Process-
Humphrey/dp/0201548097/ref=sr_1_5?s=books&ie=UTF8&qid=1418563485&sr=1-5&keywords=engineering+watts+humphrey)

------
fiatjaf
Here's an interesting story of medical open data and highly improving medical
procedures: [http://www.newyorker.com/magazine/2004/12/06/the-bell-
curve](http://www.newyorker.com/magazine/2004/12/06/the-bell-curve)

------
known
[http://blogs.law.harvard.edu/abinazir/2005/05/23/why-you-
sho...](http://blogs.law.harvard.edu/abinazir/2005/05/23/why-you-should-not-
go-to-medical-school-a-gleefully-biased-rant/)

------
baldfat
“You can think of surgery as not really that different than golf.” ... The
difference is that golfers keep score.

No. Golfers don't think they are God.

------
MarkMc
This article reminds me of Bill Gates's emphasis on measuring outcomes -
here's a quote from the Gates Foundation 2013 letter [1]:

\-------------- Begin Gates Quote ---------------

 _Over the holidays I read The Most Powerful Idea in the World, a brilliant
chronicle by William Rosen of the many innovations it took to harness steam
power. Among the most important were a new way to measure the energy output of
engines and a micrometer dubbed the "Lord Chancellor," able to gauge tiny
distances.

Such measuring tools, Rosen writes, allowed inventors to see if their
incremental design changes led to the improvements (higher-quality parts,
better performance, and less coal consumption) needed to build better engines.
Innovations in steam power demonstrate a larger lesson: Without feedback from
precise measurement, Rosen writes, invention is "doomed to be rare and
erratic." With it, invention becomes "commonplace."

Of course, the work of our foundation is a world away from the making of steam
engines. But in the past year I have been struck again and again by how
important measurement is to improving the human condition. You can achieve
amazing progress if you set a clear goal and find a measure that will drive
progress toward that goal, in a feedback loop similar to the one Rosen
describes. This may seem pretty basic, but it is amazing to me how often it is
not done and how hard it is to get right._

\-------------- End Gates Quote ---------------

Nobody questions the need for measurement in engineering, but when Mr Gates
tried to apply the same logic to measuring teacher effectiveness [2] he
received a lot of pushback from people who say his method is flawed [3,4] or
simply that teaching effectiveness cannot be reliably measured.

This is a controversial topic. Here is an interesting take on this subject
from a book called Teaching as Leadership [5]:

\-------------- Begin Teaching as Leadership Quote -----------

 _As we see modeled by these teachers, the less tangible nature of such longer
term dispositions, mindsets, and skills does not mean they cannot be tracked
and, in some sense, measured. In fact, if these ideas are going to be infused
into a big goal, you must have a way to know that you are making progress
toward them.

Mekia Love, a nationally recognized reading teacher in Washington, D.C., sets
individualized, quantifiable literacy goals for each of her students but also
frames them in her broader vision of "creating lifelong readers." This is a
trait she believes is a key to her students' opportunities and fulfillment in
life. In order for both Ms. Love and her students to track their progress
toward creating lifelong readers, Ms. Love developed a system of specific and
objective indicators (like students' self-driven requests for books, students'
own explanations of their interest in reading, the time students are engaged
with a book). By setting specific quantifiable targets for and monitoring each
of those indicators, she was able to demonstrate progress and success on what
would otherwise be a subjective notion.

Strong teachers -- because they know that transparency and tracking progress
add focus and urgency to their own and their students' efforts -- find a way
to make aims like self-esteem, writing skills, "love of reading," or "access
to high-performing high schools" specific and objective. These teachers --
like Ms. Love, Mr. Delhagen, and Ms. Jones -- ask themselves what concrete
indicators of resilience or independence or "love of learning" they want to
see in their students by the end of the year and work them into their big
goals.

In our experience, less effective teachers may sometimes assume that because a
measurement system may be imperfect or difficult, it must be wrong or
impossible. As Jim Collins reminds us in his studies of effective for-profit
and nonprofit organizations:

"To throw our hands up and say, But we cannot measure performance in the
social sectors the way you can in a business is simply lack of discipline. All
indicators are flawed, whether qualitative or quantitative. Test scores are
flawed, mammograms are flawed, crime data are flawed, customer service data
are flawed, patient outcome data are flawed. What matters is not finding the
perfect indicator, but settling upon a consistent and intelligent method of
assessing your output results, and then tracking your trajectory with rigor."_

\-------------- End Teaching as Leadership Quote -----------

Lastly, on a personal note I have found that I simply cannot lose weight
unless I keep track of the number of calories I eat. There is something about
seeing that number that has a strong influence over my behaviour.

\-------------- References --------------

[1] [http://www.gatesfoundation.org/Who-We-Are/Resources-and-
Medi...](http://www.gatesfoundation.org/Who-We-Are/Resources-and-Media/Annual-
Letters-List/Annual-Letter-2013)

[2] [http://www.metproject.org/](http://www.metproject.org/)

[3] [http://jaypgreene.com/2013/01/09/understanding-the-gates-
fou...](http://jaypgreene.com/2013/01/09/understanding-the-gates-foundations-
measuring-effective-teachers-project/)

[4]
[http://garyrubinstein.teachforus.org/2013/01/09/the-50-milli...](http://garyrubinstein.teachforus.org/2013/01/09/the-50-million-
dollar-lie/)

[5] [http://www.amazon.com/Teaching-As-Leadership-Effective-
Achie...](http://www.amazon.com/Teaching-As-Leadership-Effective-
Achievement/dp/0470432861)

------
bayesianhorse
Surgeons, at least in first-world countries (and not only there), already
operate at a skill level that is hard to measure at all.

For procedures with a high probability of success, many surgeons would need to
collect data points for years just to reliably tell whether they are better or
worse than any particular other doctor.

Very challenging, statistically...
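To put rough numbers on that claim, here is a back-of-the-envelope sample-size
calculation using the standard two-proportion normal approximation (stdlib
only). The 2% vs. 4% complication rates are made-up illustrative figures, not
data from the article:

```python
from math import ceil, sqrt
from statistics import NormalDist

def cases_needed(p1, p2, alpha=0.05, power=0.8):
    """Cases per surgeon needed to detect a difference between
    complication rates p1 and p2 (two-sided two-proportion z-test,
    normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_b = NormalDist().inv_cdf(power)          # critical value for power
    p_bar = (p1 + p2) / 2                      # pooled rate
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)

print(cases_needed(0.02, 0.04))  # over a thousand cases per surgeon
```

Distinguishing a 2% surgeon from a 4% surgeon at 80% power takes on the order
of a thousand cases each -- for many procedures, years of a surgeon's caseload
-- which is why single-surgeon outcome comparisons are so statistically fragile.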

~~~
kens
The whole point of the article is that surgeon ability can be reliably
determined from looking at short videos, and this ability is correlated with
how well the patient does.

For another interesting article on this, see
[http://well.blogs.nytimes.com/2013/10/31/a-vital-measure-
you...](http://well.blogs.nytimes.com/2013/10/31/a-vital-measure-your-
surgeons-skill/?_r=0)

~~~
apetresc
And not only that, by ten-year-old daughters, too!

------
bronbron
Hm, this article reads like kind of a puff piece for Amplio.

> similar efforts to “grade” American schoolteachers, for instance, have
> perhaps generated more controversy than results.

Yes, for good reasons, namely...

> It’s all about trust.

No, it's not. At all. The author even notes the problem with scoring systems
that happened in _this exact field_: when you start scoring people, they start
gaming the system to increase that score. It's the same problem as with
"grading teachers". You give surgeons huge incentives to start "fudging the
truth" about their patients' surgical risk.

"Oh blah blah blah it's private". Great. Hopefully everyone involved can see
the obvious future problems (which 7 comments in, other HN posters have zeroed
in on), but they haven't given any assurances that these fears will never come
to fruition. Or any prevention plans.

> It’s like Vickers said to me one night in early November, as we were
> discussing Amplio, “Having been in health research for twenty years, there’s
> always that great quote of Martin Luther King: The arc of history is long,
> but it bends towards justice.”

I actually laughed when I read this. How pretentious.

