
Think Like a Statistician – Without the Math - ric3rcar
http://flowingdata.com/2010/03/04/think-like-a-statistician-without-the-math/
======
stdbrouw
I wonder if the people upvoting the article are actually reading it. It
essentially just says "be careful" without any kind of practical advice for
how to think like a statistician.

Two much better articles:

* [https://source.opennews.org/en-US/learning/distrust-your-dat...](https://source.opennews.org/en-US/learning/distrust-your-data/)

* [https://github.com/Quartz/bad-data-guide](https://github.com/Quartz/bad-data-guide)

~~~
ericdykstra
Just telling people to "think like a statistician" is enough to make them
consider base rate information, so I wouldn't say that the article is useless,
even if the content is a little lacking in the specifics.
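
To make the base-rate point concrete, here is a minimal sketch of Bayes' rule in Python. The numbers (1% prevalence, 90% sensitivity, 5% false positive rate) and the function name `posterior` are made up for illustration and do not come from the article.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive result) using Bayes' rule."""
    true_pos = prior * sensitivity                  # has the condition and tests positive
    false_pos = (1 - prior) * false_positive_rate   # lacks the condition but tests positive
    return true_pos / (true_pos + false_pos)

# Hypothetical numbers: a rare condition and a fairly accurate test.
p = posterior(prior=0.01, sensitivity=0.90, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

With a low base rate, most positives are false positives even though the test looks accurate, which is the kind of thing "think like a statistician" should prompt you to check.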

~~~
stdbrouw
Well, if you really want to learn how to think probabilistically in everyday
life, I'd recommend Douglas Hubbard's "How to Measure Anything" which contains
detailed advice for how to calibrate your estimates (so you don't continually
over- or underestimate the probability of various events) and how to base risk
management and strategic decision making on this knowledge. Probably useful
for the startup folk here.

[http://smile.amazon.com/How-Measure-Anything-Intangibles-Bus...](http://smile.amazon.com/How-Measure-Anything-Intangibles-Business-ebook/dp/B00INUYS2U/)
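
A rough way to check calibration, sketched in Python: write down 90% confidence intervals for a handful of quantities, then count how often the true value lands inside. The intervals and true values below are invented for illustration, and `interval_hit_rate` is a hypothetical helper, not something taken from the book.

```python
def interval_hit_rate(intervals, truths):
    """Fraction of true values that fall inside their stated (low, high) interval."""
    hits = sum(low <= truth <= high for (low, high), truth in zip(intervals, truths))
    return hits / len(truths)

# Hypothetical 90% intervals someone gave, and the values that turned out to be true.
intervals = [(10, 20), (100, 300), (1, 5), (50, 60), (0, 2)]
truths = [25, 150, 3, 70, 1]

rate = interval_hit_rate(intervals, truths)
print(f"Stated confidence: 90%, observed hit rate: {rate:.0%}")
```

A hit rate well below the stated confidence suggests the intervals are too narrow, i.e. overconfidence. Five questions is far too few to conclude much; the sketch only shows the bookkeeping.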

------
lordnacho
Stats is maybe the area of math where non-math (formulas, calculations,
derivations) reasoning is the most important:

\- Have I unwittingly introduced a bias? Is there a selection taking place
that I didn't think about when collecting the data? (Eg calling people on
landlines to get polling data.)

\- Is there a reason why I should expect the dynamics in the dataset to persist
in the future, or a reason to expect them to be gone? (Eg does the financial
market work like it did before the year 2000?)

\- If such-and-such hypothesis is true, what else should I be able to find in
the data? (Eg suppose crime is caused by non-aborted kids, what other effects
should there be? More truancy in schools?)

~~~
Practicality
Does non-aborted kids mean something I am not aware of? I mean, every child
who is born would be non-aborted, right?

~~~
stdbrouw
It refers to research, heavily publicized by Freakonomics, which claims that
the reduction in crime rates in the US is not because of police interventions
etc. but because more people are having abortions and abortions are legal in
more states and so fewer people are born in dysfunctional environments that
tend to produce more criminal behavior.

------
danso
> _This should go without saying, but approach data as objectively as
> possible. I'm not saying you shouldn't have a hunch about what you're
> looking for, but don't let your preconceived ideas influence the results.
> Because if you go to length looking for some specific pattern, you're
> probably going to find it. It'll just be at the sacrifice of accurate
> results._

I couldn't say it better myself. I admit to loathing when people think that
they can finally prove whatever thing they've been long advocating for, now
that they have the data that proves it. Besides the huge issue of thinking
that data -- by nature of being data, or something -- inherently contains more
truth than just someone literally rambling into a spreadsheet...if the dataset
is indeed worthwhile, and by that, I mean _deep_...then whatever foundational
beliefs you _think_ were true need to be re-evaluated in light of examining
the data before moving on to prove something.

This is something that I'm reminded of when sampling the public Twitter
stream...I use Twitter probably more than I do email, but my perception of
what the Twitter community is like -- i.e. who participates and from where,
the kind of things they tweet about, etc. -- is inextricably
narrowed by how I've self-selected users to follow. So when just looking at a
random sample of everything that is currently being tweeted, it practically
feels like I've stepped into an alternate reality.

------
erikb
The idea behind all skills is implementing them in your daily life. If you are
a statistician who never really uses math in daily life, then it's not the
math at fault, but you simply haven't found a way to make it a habit yet.
This is also why the points in the article are rather broad and not really
statistics related. For instance, how do you decide when the details are
important and when the big picture is important (two points that directly
follow one another in the article)? You'll probably say gut feeling. But if you
don't use the science to consistently train your gut, then your gut feeling
isn't that much better than mine (I'm no statistician).

Yes, there is a point where you get so much into the principles that you don't
need to calculate everything anymore. However, that point is not after
graduation, but after doing more math than the people around you for about 3-5
years, maybe 10 for some. And a huge part of your intuition would probably be
around not needing a calculator to put the numbers together, and about finding
many different solutions to the same problem, understanding that each has
its pros and cons (e.g., different fittings that all tell you something
about your data cloud).

If you answer a broad question with "it depends", it's a good sign you only
think you are well trained (happens to all of us all the time). If you think
"it depends, if A then B but Z, if C then D but Y, if E then G but X, ..."
then you are probably well trained.

------
rdlecler1
Consistent deviations from the null hypothesis indicate the presence of a
confounding variable and pave the way to new potential discoveries. A great
example (albeit not statistical) is how planets deviate from their elliptical
orbits and that this provides evidence for the gravitational pull of other
hidden planets and moons. Having a simple model that gives you an expectation
(usually the expected value E) can be extremely powerful.
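
A minimal sketch of that idea in Python, under made-up assumptions: the data follow a straight line plus a hidden curved term, we fit only the straight line, and then look at whether the residuals (observed minus expected) keep the same sign for long stretches. The data, the hidden term, and the helper names are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def longest_same_sign_run(values):
    """Length of the longest stretch of consecutive values sharing one sign."""
    best = current = 0
    previous_sign = None
    for v in values:
        sign = v > 0
        current = current + 1 if sign == previous_sign else 1
        previous_sign = sign
        best = max(best, current)
    return best

xs = list(range(20))
# Hypothetical data: a linear trend plus a hidden term the simple model ignores.
ys = [2 * x + 1 + 0.05 * (x - 9.5) ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
longest_run = longest_same_sign_run(residuals)
print(f"longest run of same-sign residuals: {longest_run} of {len(xs)} points")
```

If the simple model captured everything, residual signs would flip more or less at random. A long same-sign run is a hint that something systematic is missing, though it does not say what.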

------
brudgers
After thinking like a statistician about thinking like a statistician, I
thought about
[https://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statist...](https://en.wikipedia.org/wiki/Lies,_damned_lies,_and_statistics)

------
swagaholic
Yup... uh... this is great advice for life in general. Really, what this guy is
saying boils down to: pay as much attention, if not more, to model selection /
validity as you do to model analysis.

------
xyzzy4
One major mistake I see is when people look at statistics on a linear scale
when they often should be looked at on a logarithmic scale.
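
A small illustration in Python, with made-up numbers: two quantities that both double look very different as absolute (linear) changes but identical as changes on a log scale.

```python
import math

# Hypothetical before/after values; both pairs represent a doubling.
pairs = [(100, 200), (1000, 2000)]

changes = []
for before, after in pairs:
    linear_change = after - before                         # absolute difference
    log_change = math.log10(after) - math.log10(before)    # difference in log10 units
    changes.append((linear_change, log_change))
    print(f"{before} -> {after}: linear change {linear_change}, log10 change {log_change:.3f}")
```

On the linear scale the second change looks ten times larger; on the log scale both are the same step, which is usually the fairer comparison when the underlying process is multiplicative (growth rates, prices, populations).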

