If you want to do your own analytics on these data, here's one approach, which I'm using. Start your project as a vanilla Git repo, then add the JH repo as a submodule.
It can be a bit of a gotcha when you share your work with others, though. They'll have to clone with `git clone --recursive` (or run `git submodule update --init` after a plain clone), or the submodules will come up as empty folders for them.
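To catch the empty-folder gotcha early, here's a rough sketch of a check you could run at the top of an analysis script. It reads `.gitmodules` and reports any submodule folder that is missing or empty. The function name `uninitialized_submodules` is made up for illustration, and it assumes a plain `.gitmodules` file with one `path` entry per submodule.

```python
import configparser
from pathlib import Path


def uninitialized_submodules(repo_root):
    """Return submodule paths from .gitmodules whose folders are missing or empty.

    Hypothetical helper: it only inspects the working tree and never calls git.
    """
    root = Path(repo_root)
    gitmodules = root / ".gitmodules"
    if not gitmodules.exists():
        return []  # no submodules declared in this repo

    # .gitmodules uses an INI-like syntax, so configparser can read it.
    # Interpolation is disabled so a '%' in a URL cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(gitmodules, encoding="utf-8")

    missing = []
    for section in parser.sections():
        path = parser.get(section, "path", fallback=None)
        if path is None:
            continue
        folder = root / path
        # A submodule that was never initialized is typically an empty directory.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in uninitialized_submodules("."):
        print(f"Submodule '{path}' looks empty; try: git submodule update --init")
```

If this prints anything, the person who cloned the repo most likely skipped `--recursive`.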
Interesting. I haven't seen any news reporting that cases in China are actually decreasing, and all the charts show total cases, which are flatlining (in China) but not going down.
The rest of the world seems to be going exponential, otoh.
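A cumulative total can only flatten, never fall, so "flatlining" totals and "decreasing" cases can describe the same data. A minimal sketch of the two derived views, using made-up numbers rather than anything from the actual dataset (the function names are mine, and "active" here assumes the simple definition confirmed minus deaths minus recovered):

```python
def daily_new_cases(cumulative):
    """Turn a running total into per-day increments."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


def active_cases(confirmed, deaths, recovered):
    """Active cases per day, assuming active = confirmed - deaths - recovered."""
    return [c - d - r for c, d, r in zip(confirmed, deaths, recovered)]


# Made-up illustrative series, NOT real data.
flattening = [100, 400, 700, 850, 900, 920, 925]
growing = [10, 20, 40, 80, 160, 320, 640]

print(daily_new_cases(flattening))  # [300, 300, 150, 50, 20, 5]
print(daily_new_cases(growing))     # [10, 20, 40, 80, 160, 320]

# The total keeps rising, but active cases fall once recoveries outpace new cases.
print(active_cases([900, 920, 925], [20, 22, 23], [300, 450, 600]))  # [580, 448, 302]
```

In the first series the total never goes down, but new cases per day drop from 300 to 5, which is what a flattening curve means.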
It's, hopefully, not mentioned in the news because the CCP is fudging the numbers. There's no way of knowing the exact situation on the ground in Wuhan, but it's nowhere near the reported cases.
Why do people keep saying this? I've seen interviews with disease specialists who are plugged in to the community, and they say that the China numbers are not only as accurate as they can be[1], but that it would be really hard to fudge them given the transparency China has already engaged in - my sense from that was that the numbers can be cross-checked in various ways, and that it would be mathematically obvious if there was major fraud going on.
[1] This is a different issue from numbers being limited by testing shortages, or by people with minor symptoms who recover at home, or by asymptomatic people not being included in the count - problems that a lot of countries share. We could also argue that the US has been "fudging" the numbers, since it has tested far fewer people than Canada.
Google also seems to heavily favour WHO's website over alternative sources of information. Probably all in order to combat "fake news". Some dystopian world we live in.
This is pretty great! Being able to see the progress of the virus over time could be interesting. Could maybe reveal some patterns about human movement + interaction??
Is it possible to view the number of tests run? I keep seeing in the news that the current numbers for the US are probably way off, as very few tests are being performed.
Interestingly, the CDC website no longer has these numbers. A week ago, their main COVID-19 landing page had a table of positive tests, negative tests, and pending tests (total tests was in the neighborhood of 450). That seems to have disappeared, with a subset of those statistics here: https://www.cdc.gov/coronavirus/2019-ncov/cases-in-us.html.
I've heard this argument, but I imagine it skews in both directions. You'll have mild cases that recover without ever being tested, which makes the disease look more harmful than it really is. But I suspect there are also cases, especially among people with preexisting conditions, where the patient died without ever being tested. Those cases skew the data the opposite way and make the disease look less harmful than it really is. You essentially have a huge mess of amalgamated, non-systematically collected data. Even in controlled settings you get a lot of censored data to deal with.
RT-PCR tests are reasonably priced these days, but they aren't cheap by any means, and I suspect many early cases weren't tested or there was no capability to test them. The question is: which way is the data skewed more?
I know absolutely nothing about the Chinese healthcare system. Even if you do, there's a significant amount of guesswork. Only now, as it spreads, are we starting to get more controlled tests that give us more accurate data.
For now, I tend to trust the raw data until I see definitive evidence that it's skewed one way or the other.
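To make the two opposing skews concrete, here's a back-of-the-envelope sketch. Every number in it is invented purely for illustration, and `adjusted_rate` is a made-up helper, not a standard epidemiological formula.

```python
def adjusted_rate(reported_cases, reported_deaths, missed_mild=0, missed_deaths=0):
    """Deaths per case after adding untested mild cases and untested deaths.

    Every untested death is also an untested case, so it is added to both
    the numerator and the denominator.
    """
    deaths = reported_deaths + missed_deaths
    cases = reported_cases + missed_mild + missed_deaths
    return deaths / cases


# Hypothetical numbers, NOT real data.
reported_cases, reported_deaths = 1000, 20

naive = adjusted_rate(reported_cases, reported_deaths)
mild_only = adjusted_rate(reported_cases, reported_deaths, missed_mild=1000)
deaths_only = adjusted_rate(reported_cases, reported_deaths, missed_deaths=30)
both = adjusted_rate(reported_cases, reported_deaths, missed_mild=1000, missed_deaths=30)

print(f"reported:              {naive:.2%}")
print(f"+ missed mild cases:   {mild_only:.2%}")    # lower than reported
print(f"+ missed deaths:       {deaths_only:.2%}")  # higher than reported
print(f"+ both:                {both:.2%}")
```

Which way the combined figure lands depends entirely on the two unknowns, which is why the raw numbers alone can't say which skew dominates.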