
Apply HN: TemporalHealth Search and Aggreg. engine on millions of health records - aub3bhat
(To quote Linus Torvalds: “Talk is cheap. Show me the code.”)
A working demo on 6 million hospital visits: http://www.computationalhealthcare.com

Government agencies collect medical records on millions of patients for purposes such as reimbursement, medical research, and healthcare quality/delivery assessment. Even though these datasets are available to researchers and analysts, they are severely underutilized. Further, due to security & privacy concerns, modern analytics tools cannot be used on them directly. We have developed a privacy-preserving analytics platform. The platform uses aggregate statistics, pre-computed in an exhaustive manner, to enable exploratory analysis. By allowing researchers and physicians to search these aggregate statistics using medical concepts (diagnoses, procedures & drugs), we can significantly reduce security & privacy concerns while greatly enhancing the benefits of these datasets. Given the enormous privacy concerns due to the amount of information present in such datasets, we are deeply committed to transparency. We will open source core parts of the platform for researchers. While we would prefer the system to be strong and useful enough for public use, in the short term we aim to provide access to all practicing physicians and medical students in the United States.

Consider a physician who discharged a patient with sarcoidosis; within a week the patient went to an ED complaining of severe headaches. The physician would be curious whether any other patients have suffered headaches following sarcoidosis and whether she should admit him. Because we have 140 million visits and 40 million patients available, our system can provide useful guidance to the physician in spite of sarcoidosis being a rare disease.

We have the data. We have the infrastructure. There is no real reason why it should not exist. Hence we have built it. We have a fully functional platform for research use. We have been working on this for 4 years and it's my PhD thesis. Also applied to YC S16.
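
To make the "pre-computed aggregates" idea concrete, here is a minimal sketch in Python. It is not the platform's actual code: the visit data is synthetic, the code strings are made up, and the names `precompute_aggregates` and `search` are hypothetical. It only illustrates counting every combination of up to two concept codes once, up front, so that later searches read counts and never touch individual visits.

```python
from collections import Counter
from itertools import combinations

# Synthetic visits: each is the set of concept codes recorded on one visit.
# The code strings are invented for illustration only.
visits = [
    {"D_sarcoidosis", "D_headache"},
    {"D_sarcoidosis", "P_chest_xray"},
    {"D_headache"},
    {"D_sarcoidosis", "D_headache", "P_chest_xray"},
]


def precompute_aggregates(visits, max_size=2):
    """Count visits for every combination of up to max_size codes."""
    counts = Counter()
    for codes in visits:
        ordered = sorted(codes)  # canonical order so keys are stable
        for size in range(1, max_size + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    return counts


def search(aggregates, *codes):
    """Look up a pre-computed count; raw visits are not consulted here."""
    return aggregates.get(tuple(sorted(codes)), 0)


aggregates = precompute_aggregates(visits)
print(search(aggregates, "D_sarcoidosis"))
print(search(aggregates, "D_sarcoidosis", "D_headache"))
```

The number of combinations grows quickly with `max_size`, so a real system would presumably have to bound it or prune rare combinations; that trade-off is an assumption here, not something stated in the post.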
======
aub3bhat
Some interesting examples: Chemotherapy followed by Stem cell transplant

[http://www.computationalhealthcare.com/N1/TX/Entry/_P9925_P4...](http://www.computationalhealthcare.com/N1/TX/Entry/_P9925_P4105)

Admission via ED followed by Cardiac Cath.

[http://www.computationalhealthcare.com/N1/TX/Entry/_N1_e_P37...](http://www.computationalhealthcare.com/N1/TX/Entry/_N1_e_P3722)

------
rpedela
Variations of this have been tried and didn't go well, including one serious
attempt by me. I agree this should exist, but it is very much an uphill battle.

Email rpedela @ datalanche .com if you would like to discuss further.

Good luck!

~~~
aub3bhat
Thanks. I have actually observed several startups attempt this over the years
in one form or another.

My hope is that, in case we fail, I still leave behind a good theory/framework,
academic papers and hopefully some quality open source code from my research
that allows the next guy to pick it up. There have been some good developments
over the last year as far as access to data is concerned. We also have some
good medical discoveries using this platform that we expect to publish over
the next year. I will follow up on email tomorrow.

------
radnam
This is great. We and a few others, including rpedela who commented below,
did something similar, working with 50M+ patient events from our partners'
clinical data, but found it incredibly hard to sell. Hope you find some
success in this area. It is always interesting to see another healthcare
entrepreneur. Feel free to ping me at rahurkar@gmail.com if you want to talk.

~~~
aub3bhat
Thanks, I will email you. One particular feature of our platform is that the
bulk of the data collection & normalization work is already done by government
agencies such as AHRQ HCUP and various state agencies. These databases have
several users already. We are literally standing on the shoulders of giants.
Further, while these agencies have been very good at developing databases,
they suffer from a lack of good tools to query them efficiently.

E.g. Here is the tool used by Texas,
[http://healthdata.dshs.texas.gov/](http://healthdata.dshs.texas.gov/)

This is the one used by AHRQ HCUP
[http://hcupnet.ahrq.gov](http://hcupnet.ahrq.gov)

New York state simply strips a large number of fields and makes all individual
visit information available via
[https://health.data.ny.gov/Health/Hospital-Inpatient-Dischar...](https://health.data.ny.gov/Health/Hospital-Inpatient-Discharges-SPARCS-De-Identified/rmwa-zns4)

There is a strong need to create a single easy to use platform that allows
patients/physicians/researchers to query them while optimizing for usability,
privacy and security. We think that Computational Healthcare does that and is
a step in the right direction.

~~~
rpedela
Have any of the users of these databases and/or agencies committed to
using/buying your tool? Have any signed a letter of intent?

------
shimon
Cool idea. I looked at your demo, but all I could get from it was a report of
visit counts by diagnosis/symptom. Can you explain how the example question
you offered (Sarcoidosis, headache, readmit) could be answered?

Is your goal to build a business or a non-profit?

Who pays for it? There are lots of potential users of this data, but some
already have access to this data; do you open up the data to users who
wouldn't have been able to access it before?

~~~
aub3bhat
Hi, the Texas data, which is publicly available to download, only has
individual visits. The HCUP data (California state datasets), by contrast,
has information for ED, Ambulatory Surgery and Inpatient visits linked to a
unique patient identifier spanning multiple years. Using this data, you can
set up complex query types such as readmission following a hospitalization
with a given DRG. We do not offer these results online, but you can see an
example (fake data) of readmissions following sub-endocardial infarction (a
heart attack) in which the patient underwent intubation in the next visit.
[http://www.computationalhealthcare.com/N2/HCUP/Entry/D41071_...](http://www.computationalhealthcare.com/N2/HCUP/Entry/D41071_P9604)
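
As a rough illustration of why the patient-linked data matters, here is a small Python sketch of a "visit with code A followed by a visit with code B" count. It is a sketch under assumptions, not the platform's query engine: the records are fake, the function name `count_followed_by` and the 30-day window are hypothetical, and the code strings simply reuse the `D41071` / `P9604` labels from the URL above. It also matches any later visit inside the window rather than strictly the next one.

```python
from datetime import date

# Fake, synthetic visits: (patient_id, visit_date, codes on that visit).
visits = [
    ("p1", date(2015, 1, 1), {"D41071"}),
    ("p1", date(2015, 1, 10), {"P9604"}),
    ("p2", date(2015, 2, 1), {"D41071"}),
    ("p2", date(2015, 6, 1), {"P9604"}),
    ("p3", date(2015, 3, 1), {"D41071"}),
]


def count_followed_by(visits, index_code, next_code, max_days=30):
    """Count patients whose index_code visit is followed, within max_days,
    by a later visit carrying next_code."""
    by_patient = {}
    for patient_id, visit_date, codes in visits:
        by_patient.setdefault(patient_id, []).append((visit_date, codes))

    matched = 0
    for history in by_patient.values():
        history.sort(key=lambda visit: visit[0])  # chronological order
        found = any(
            index_code in first_codes
            and next_code in later_codes
            and 0 < (later_date - first_date).days <= max_days
            for i, (first_date, first_codes) in enumerate(history)
            for later_date, later_codes in history[i + 1:]
        )
        matched += found  # True counts as 1
    return matched


print(count_followed_by(visits, "D41071", "P9604"))
```

Without a patient identifier tying visits together (as in the Texas data), only per-visit counts are possible and a query like this cannot be expressed at all.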

>> Is your goal to build a business or a non-profit?

We plan on building a business. We plan to operate a publicly available
service for researchers and physicians that also acts as a marketing vehicle.
In addition to the central platform, we will allow individual enterprise users
to run it on their premises with their own datasets.

>> Who pays for it?

Hospitals, insurers and government agencies already have their own data in the
same format as we do. We plan to create a simple, hosted, drop-in solution. By
creating a single platform for all three types of customers plus a public
platform, we can divide the development costs. Further, we plan to offer
add-ons specific to the requirements of each customer type.

We do not "own" the raw data. The "raw" data is merely licensed by the
government for specific use cases. We plan on using the data provided by the
government to generate insights in the form of aggregate statistics. These
insights are then in turn provided to physicians and researchers.

------
aub3bhat
clickable link:
[http://www.computationalhealthcare.com](http://www.computationalhealthcare.com)

------
kumarski
I spent a reasonable amount of time pitching this space. I think we need more
brilliant people working on hard problems like this.

That being said, I think you might be a bit off the mark and I'm going to
share my story and the evidence before me. (Feel free to reach out to me, I'm
happy to help anyone trying to tackle problems in this space).

A blog post I wrote several years ago about this kind of thing:
www.engineersf.com/2015/07/04/do-we-need-a-human-data-projecthdp/

Our Stab At It

My cofounder and I went around pitching over 100 MDs and disease researchers,
telling them we had the full medical records of tens of thousands of patients
(hint: We didn't).

What They Told Us

What they told us is that the data isn't all that useful for research purposes
because it's inaccurate to some extent, and also because comparing and
juxtaposing patient cases is just really tough. Ron Shigeta (@rshigeta on
Twitter) quickly told us the queries they would have to run would be several
pages long, even for research purposes.

How The Data Could Be Valuable to MDs and Researchers

The REAL pervasive utility in a large number of records is to find the Gene
Regulation Pathways, and as humanity we do that through clinical trials. The
MDs and Researchers wanted to use it to recruit patients for trials. Fast
forward to today and we've focused all our efforts on patient recruitment and
screening.

There's another company that's also a YC company that went down a different
route and focuses on diagnoses. It might be worth checking them out:
humandx.org. (Stands for Human Diagnosis Project)

As well cancer is often highly mutated....ummm...
[http://blogs.sciencemag.org/pipeline/archives/2008/09/08/the...](http://blogs.sciencemag.org/pipeline/archives/2008/09/08/the_complicated_causes_of_cancer)

The example you've outlined might be a bad one for the simple reality that
cancer is so complicated. Cancer is a cell to cell battle.

Evidence 1:
[http://blogs.sciencemag.org/pipeline/archives/2011/04/07/mor...](http://blogs.sciencemag.org/pipeline/archives/2011/04/07/more_zeroing_in_on_breast_cancer_cells)
Evidence 2:
[http://blogs.sciencemag.org/pipeline/archives/2011/04/05/so_...](http://blogs.sciencemag.org/pipeline/archives/2011/04/05/so_you_thought_breast_cancer_was_complicated)

The Quote for me that really catches my attention in the pipeline blog:
"Recent work from Bert Vogelstein’s group at Johns Hopkins (with a host of
collaborators) and from the CGA itself now show that there are an average of
63 mutations in pancreatic cancer cells, and 47 in glioblastomas, two of the
nastiest tumors around. The first impulse might be to think “Great! Plenty of
drug targets to go around!” But hold on. For one thing, even though these
mutations are surely not all equal, the fact that there are so many makes you
wonder about whether attacking any one of them alone can make much of a
difference. And different patients can have varying suites of those mutations,
so it’s difficult to imagine that going after just one or two of those targets
will be enough to treat a majority of cases. This work follows up on earlier
studies in other tumor lines, all of which seem to point in the same
direction: patients who are currently classed as having the same type of
cancer really don’t. This won’t come as a surprise to most oncologists, who
have seen for themselves the widely varying responses to current therapies.
The challenge is to figure out what these various changes mean, and how to
classify patients to give them the best therapy. It’s not going to be easy.
Just doing the math on the possible interactions of several dozen mutations
with a list of possible treatment regimes is enough to make you pause. The
hope is that most patients will fall into broad categories, which will line
up, more or less, with broad categories of treatment. But it’s not going to be
a good fit, most likely, and even getting those approximations to work is
taking a lot of time and effort. (Just think back about how long you’ve been
hearing about the wonderful new age of personalized medicine. . .)"

I would highly recommend talking to more doctors. Happy to help in any way
possible.

Onwards & Upwards.

~~~
aub3bhat
Thank you for your comment. The HCUP data we use has been collected for more
than 15 years. There have been at least 2000 peer-reviewed papers published
using the HCUP data in journals such as New England Journal of Medicine, JAMA,
Annals, etc. My co-founder has deep experience with medicine (as an
anesthesiologist), outcomes research and the healthcare system. We have
already published several papers using Computational Healthcare.

The example quoted is not too far off from a real scientific study that we are
preparing to publish soon. Also, sarcoidosis, quoted in the example, is an
autoimmune disease and not a cancer.

The chemo / stem cell example is from a published article which investigated
the timing of chemotherapy prior to transplant.

