leftpad's comments

leftpad · on Dec 15, 2016

Clover collects tons of data about its patients, probably more than most health plans. They may only have 19,000 patients, but they also like to talk about how their data is very wide. Most health plans I've worked with do a terrible job of even collecting the simplest types of data and outsource a vast majority of their data collection processes. These health plans have aging technology and a reluctance to use new and open source tools. A great example of this is how many health plans manage to somehow overpay claims to the tune of 10-100s of millions of dollars per year, and have no idea why. There's an entire cottage industry devoted to solving this problem for insurers.

Most of the tests you mention above are wasteful for your typical Medicare Advantage enrollee. There's a ton of low hanging fruit for a start-up like Clover to make a meaningful impact. Trying to change human behavior through diet or exercise is incredibly difficult, especially for those from disadvantaged communities or of lower socioeconomic status. Kudos to Clover to trying to make a marginal impact on that front; most insurance plans wouldn't do anything.

yourapostasy · on Dec 15, 2016

> Most of the tests you mention above are wasteful for your typical Medicare Advantage enrollee.

A good point to discuss. Perhaps I'm overly optimistic about people in general, but how do we know, a priori to a data-centric healthcare model applied to a patient, about their participation? These tests are only wasteful in the context of the current care model, which stipulates that once you have Type 2, there is a lockstep progression of increasingly invasive and expensive medication and interventions, graduation to insulin injection, and culminating in early death from complications? In the face of that kind of prognosis, it is not at all surprising that any additional tests are considered futile by both medical practitioners and patients alike. But if the patient was offered via data-centric care a clearer window to their condition, amplified proactive participation in care management with negative feedback loops tamping down undesirable fluctuations, and in the case of Type 2 or metabolic syndrome the clear goal of reversion (though not a cure) and drastically improved eventual outcomes, why do we assume that even elderly patients on Medicare Advantage would not statistically respond well as a whole population? I dunno, this isn't my domain expertise, so I'm honestly asking these questions; if there is the equivalent of behavioral economics studying patient behavior, then perhaps that field has the answers I'm seeking.

> There's an entire cottage industry devoted to solving this problem for insurers.

I'd like nothing more than to see these efforts succeed, look forward to following the progress.

leftpad · on Oct 15, 2016

It depends on what your goals are. If you'd like to become an ML Engineer or Data Scientist, Tensorflow should be last thing you learn. First, develop a solid foundation in linear algebra and statistics. Then, familiarize yourself with a nice ML toolkit like Scikit-Learn and The Elements of Statistical Learning (which is free online). The rest is a distraction.

In addition to the linear algebra and statistics MOOCS mentioned, I'll also add:

* No bullshit guide to Linear Algebra: https://gumroad.com/l/noBSLA

* Statistical Models: Theory and Practice: https://www.amazon.com/Statistical-Models-Practice-David-Fre...

leftpad · on Sept 24, 2016

This is absolutely one of the reasons. See Google's computer vision system classifying black folks as gorillas: http://blogs.wsj.com/digits/2015/07/01/google-mistakenly-tag...

leftpad · on Sept 24, 2016

> If the output examples we are training on are true, the ML algorithm won't adopt any incorrect biases.

This is rarely the case when working wtih real data, and thus inspecting whether our models are biased against protected classes is probably one of the most important things an ML practitioner should do.

wyager · on Sept 25, 2016

We have a good understanding of the error profiles of most ML algorithms. We can put tight bounds on the difference between predictions and validation data. If ML algorithms make mistakes, it's usually due to noise or low-quality data, not the algorithm itself.

There is no reason an ML algorithm would be biased against a "protected class". It doesn't know what those are. It's possible that the algorithm will uncover a truth that you don't like, like that there as differences in risk profiles across race or gender, but that doesn't mean the algorithm is incorrectly biased. It just means that reality is at odds with how you might want it to be.

leftpad · on Sept 25, 2016

Not talking about the algorithm. It's the data that are biased. The choice of algorithm just determines how interpretable those biases actually are.

wyager · on Sept 25, 2016

> It's the data that are biased.

Which data are you referring to? In most cases, the training data isn't human-generated, and if it is, we usually want to match human behavior as close as possible.

leftpad · on Sept 25, 2016

Virtually all data used to predict crimes or recidivism is fraught with human bias, for example. Not sure that we want to reproduce the bias of criminal justice system in any prediction problem involving this type of data.

Read anything written by Solon Barocas: http://solon.barocas.org/

wyager · on Sept 25, 2016

How is recidivism data biased? I'm sure that the information gleaned from parole officers and cops might be biased, but as long as the ML system is trained on whether or not someone actually reverted to committing crimes, it should be able to detect bias on the part of P.O.s and other functionaries and give a more accurate determination as to someone's chances of recidivism.