Be good-argument-driven, not data-driven (twitchard.github.io)
462 points by historynops on Sept 1, 2022 | 161 comments



While I agree completely with the premise of this article, on the other hand I'm weighing the relatively robust findings by Meehl et al. They find, time and time again, in all sorts of fields, that extremely parsimonious models like equal-weighted linear regression of one or two predictors outperform expert judgment[1].
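
For a concrete sense of what "extremely parsimonious" means there, here is a minimal sketch (synthetic data, purely illustrative, not Meehl's studies) comparing a unit-weighted two-predictor model with an "optimally" fitted one:

    # Illustrative sketch only: synthetic data, not Meehl's actual studies.
    # A "unit-weighted" model standardizes two predictors and just adds them.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 500
    x1 = rng.normal(size=n)   # predictor 1 (e.g. a test score)
    x2 = rng.normal(size=n)   # predictor 2 (e.g. past performance)
    outcome = 0.6 * x1 + 0.4 * x2 + rng.normal(size=n)

    def standardize(x):
        return (x - x.mean()) / x.std()

    # Unit weights: ignore the "optimal" coefficients entirely.
    unit_weighted = standardize(x1) + standardize(x2)

    # "Optimally" weighted least-squares fit on the same data.
    X = np.column_stack([x1, x2, np.ones(n)])
    beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    fitted = X @ beta

    print("correlation, unit weights :", np.corrcoef(unit_weighted, outcome)[0, 1])
    print("correlation, fitted model :", np.corrcoef(fitted, outcome)[0, 1])
    # The two correlations come out very close, which is the flavor of result
    # that literature keeps reporting.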

One would think this is cognitively dissonant enough, but it gets worse:

This article, with the thesis that good arguments are more important than data, is based on, well, a good argument – not much data. On the other hand, the work by Meehl et al., claiming pretty much the opposite, is based on, well, a lot of data, and maybe not much intuitive reasoning. (There's some, yes, but the main thrust of why I believe it is that variants of the experiment have been replicated reliably.)

I don't know what to believe. Fortunately, as I've grown older, I've become more comfortable with holding completely dissonant opinions in my head at the same time.

----

Edit a few minutes later: This actually prompted me to refresh on the subject. It might be the case that Meehl is actually making the same argument as this article, only it gets distorted when repeated. Some things are reliably measurable; for those things be data-driven. Other things not so much, then use your expertise.

----

[1]: Here's just one relatively early example: http://apsychoserver.psych.arizona.edu/JJBAReprints/PSYC621/...


Implicit in all of this is the is-ought problem.[0] The data are collected and interpreted under some procedure, often with normative biases built in about how the world ought to be (especially when involving human subjects), but are interpreted as saying what the world is. Thus data collection is fertile ground for charlatans.

When the psychiatric profession or Google or whoever else use experimentation to decide on what criteria they should follow, with sound controls, valid statistical analysis and loads of replication, they either arrive at evaluation procedures without much bias or, more likely, they realize the phenomenon they're trying to measure is almost all noise with no or excessively weak signals.

A better approach would be to acknowledge as much normative bias as possible up front, then conduct tests using sound experimental design. But the problem with this approach is that the data shows performing a bunch of well-crafted experiments is expensive, and management doesn't buy in if the vast majority are unlikely to reject the null. That leaves us with a class of "data driven" managers who are in fact indulging their biases to a sometimes extreme degree, using "the data" as a shield.

[0] https://plato.stanford.edu/entries/hume-moral/#io


I find it strange that these are presented in tension, when they’re complementary.

You can create situations where you have a lot of data but can’t reach conclusions, because you lack a narrative and explanatory model which “makes sense” of that data; inversely, you can convincingly argue complete nonsense that’s obviously contrary to facts.

Deep understanding requires a model/narrative which fits the collection of data we have, and which allows us to reason about and predict the outcome of new situations.

As Jeff Bezos put it:

> Good inventors and designers deeply understand their customer. They spend tremendous energy developing that intuition. They study and understand many anecdotes rather than only the averages you’ll find on surveys. They live with the design.

> I’m not against beta testing or surveys. But you, the product or service owner, must understand the customer, have a vision, and love the offering. Then, beta testing and research can help you find your blind spots. A remarkable customer experience starts with heart, intuition, curiosity, play, guts, taste. You won’t find any of it in a survey.

https://www.aboutamazon.com/news/company-news/2016-letter-to...


I was about to write that in the case of Bezos and Amazon, the customer was simpler and the answer was just to pour money into it until you substituted the market, but I realise now that it is not that simple. It only seems simple because we have hindsight.

My main idea though is that it is very hard to foresee what the customer will want after you deliver the product. Not what the customers want now, because sometimes they don't understand it until they experience it, and that makes me think that there is a LOT of luck at play here and a good deal of contingency in prototype product design. Experience alone could be overrated. Think Kodak: I don't think they lacked experience in product design, or that they didn't understand their customers. I think they just didn't risk their luck and didn't think about what their customers would want in the future. And that is always a gamble.

- Things are more nuanced and complex than I am putting them here, but the bottom line is that I am trying to get at survivorship bias.


Sure — business is a gamble, made harder by our own foibles. My main point was that even somewhere very data-driven like Amazon, that data should be used within a narrative as a grounding-not-guiding force.

(Disclaimer: I used to work on a customer sentiment analysis team at Amazon, doing a lot of surveys.)

Amusingly, the two paragraphs after what I cited agree on that danger:

> The outside world can push you into Day 2 if you won’t or can’t embrace powerful trends quickly. If you fight them, you’re probably fighting the future. Embrace them and you have a tailwind.

> These big trends are not that hard to spot (they get talked and written about a lot), but they can be strangely hard for large organizations to embrace. We’re in the middle of an obvious one right now: machine learning and artificial intelligence.

I don’t think the digital revolution was lost on Kodak — I think that for organizational reasons they couldn’t pivot.

> The first actual digital still camera was developed by Eastman Kodak engineer Steven Sasson in 1975. He built a prototype (US patent 4,131,919) from a movie camera lens, a handful of Motorola parts, 16 batteries and some newly invented Fairchild CCD electronic sensors.

https://www.cnet.com/google-amp/news/history-of-digital-came...


Seems far-fetched to assume that this thesis applies to product development just the same?

The impact a data-driven mindset can have on the organization cannot be overstated ('RIP intrinsic motivation' section). I've seen it first-hand: data being used as a cop-out for bad leadership, meaningless 'successes' used as trading cards for promotions, and design experts with a decade of experience overridden by shaky statistical analysis, or worse, non-inferiority tests.

Meanwhile, the shortcomings in the product that everyone knows are rarely addressed because they are 'difficult to test'.


> They find, time and time again, in all sorts of fields, that extremely parsimonious models like equal-weighted linear regression of one or two predictors outperform expert judgment.

I came across this in Thinking Fast and Slow. Kahneman was a big fan of Meehl and restates the point:

The important conclusion from this research is that an algorithm that is constructed on the back of an envelope is often good enough to compete with an optimally weighted formula, and certainly good enough to outdo expert judgment.

https://www.goodreads.com/quotes/9574537-the-important-concl...

I too agree with the premise of this article. On this topic of expert judgment vs data, however, I found the counterpoint in this HN comment thought-provoking enough to bookmark and refer back to now and again:

I started at MS during Vista and I've been involved (sometimes tangentially) with Windows ever since. This is all my opinion, but it's been very interesting seeing the decision-making process change over time.

If I had to summarize the change, I'd say that it's evolved from an expertise-based system to a data based system. The reason why eight people were present at every planning meeting is because their expert opinion was the primary tool used in decision making. In addition to poor decisions, this had two very negative outcomes:

1) reputation was fiercely fought for. Individuals feared that if they were ever incorrect, the damage to their reputation would limit their ability to impact future decisions and eventually lead to career death. Whether this actually happened or not is irrelevant; the fear itself caused overt caution and consensus seeking.

2) In the absence of data, an eloquent negotiator is often able to obtain their desired outcome, no matter how sub-optimal that outcome might be.

https://news.ycombinator.com/item?id=15174737#15176957

Even more provocative, it ends up being a (qualified, as I read it) defense of telemetry.


It seems to imply that expertise-driven design gave us Vista and Win7 while the data-driven one gave us Win8, Win10, and Win11. It's notable that, from this list, Win7 seems to be the only one that people genuinely liked.


Yup, it seems a side effect of the data-driven approach is that Windows no longer cares about its own reputation.


Expertise-driven design did not give us any Windows operating system. I don't believe that MS Windows is the kind of system OS experts would design.

But - perhaps you're referring to the user interface? Or just the kernel? Or the driver mechanism?


Define "people". Tech people, people/customers in general, some other group such as shareholders? Both your point and the point your responding to could be true at the same time both anecdotally and/or in the data. Anecdotes are probably just another form of "expert opinions"


It's entirely an anecdote, but from my experience, Win7 was broadly accepted as a good iteration among techies and non-techies both.

As a software engineer, I actually find a lot more to be excited about in Win10+ thanks to WSL and other such things. But I don't hear my acquaintances who are non-techies being positive about anything from Win8 on.


> Edit a few minutes later: This actually prompted me to refresh on the subject. It might be the case that Meehl is actually making the same argument as this article, only it gets distorted when repeated. Some things are reliably measurable; for those things be data-driven. Other things not so much, then use your expertise.

Highlighting your edit at the bottom, as I think it’s important and not everyone will read that far.


I've come to heavily discount these types of studies. What makes an expert? What was the sample size of experts? What was the non-expert tool? Etc.

There is such a thing as having common sense based on thoughtful life experience. Checklists and regressions help, but human beings are very capable of deep expertise and to pretend otherwise is silly. I expect a musician to be able to identify a violin from a viola.


>Some things are reliably measurable; for those things be data-driven. Other things not so much, then use your expertise.

Maybe too much of a nit-pick, but how does one build expertise without data? I'll grant that it may be informally or subconsciously collected but it's still data.

It makes me think of Malcolm Gladwell's book Blink. There are lots of experts who can subconsciously chunk data to make intuitive and reliable decisions. But they often got to that point by gathering lots of data in the form of experience.


> This article, with the thesis that good arguments are more important than data, is based on, well, a good argument – not much data.

I'm not sure what you're claiming. All intellectual demonstration is a matter of rational argument. That's what proofs are: arguments. Data is not self-explanatory or demonstration. "Data" can only support arguments by first being collected, something motivated by argument, and then interpreted so that it can enter into argument as a body of propositions.

> On the other hand, the work by Meehl et al. claiming pretty much the opposite, is based on, well, a lot of data, and maybe not much intuitive reasoning.

I don't understand. Argument is logical demonstration. The strongest form is the deductive argument. If you don't have a logical argument, then you haven't got a demonstration.

> I don't know what to believe. Fortunately, as I've grown older, I've become more comfortable with holding completely dissonant opinions in my head at the same time.

Depending on what you mean, this could be good or bad. Inconsistency is not a virtue, and if there is an inconsistency between two of your beliefs, then it means you've got work to do (or at least you'll need to admit you don't know what the truth is). This requires humility, the frank acknowledgment that you're faced with an aporia that you don't know (at least not yet) how to address. It also requires patience if you are to tolerate your ignorance instead of jumping to some ersatz explanation.


I feel like the author is leaning into comfort, intuitiveness. You bring up a fantastic point. Often we find data reveals things very unintuitive to human experience. We should always try to make Good Arguments - but without data they aren't always honest beyond feelings.


> Are you prepared to do some very very fancy statistics?

I'd extend this with "... while understanding what you're doing?"

I've seen it so many times already: someone does some A/B test and then presents a very fancy-looking slide deck with all kinds of crazy-looking math. But if you start to ask questions, it's very obvious that they didn't really understand what they were doing and that very often it doesn't really matter to them in the first place; it's all about reaching a decision using some pseudo-sciencey method that nobody dares to question because 'data' and 'science', without having to take responsibility.


I think "Be brutally honest about you many assumptions and caveats" at least implies that.

I mean, in an informal setting there's room for an honest person to say "well I did some math and I don't really get it but I think it says...," but I think this article is addressed to software engineers and scientists. Someone representing themself as an engineer or scientist has a professional ethical responsibility to some sort of... I dunno, epistemic honesty, the knowledge of what their expertise covers, and communicating their limitations to laymen.

The person with the A/B test in your example is either a liar because they are misrepresenting what their tool says, or they are a liar because they are misrepresenting their ability to tell you what it says, but either way they are a liar.


> Are you prepared to do some very very fancy statistics?

If you need 'fancy' statistics, then it is not going to be a good data-driven argument at all.


I have experienced this first hand, so this article resonates a lot with me.

I worked with a manager who prioritized work which was easily measurable, so he could report the good numbers to leadership and get career points out of it. Unfortunately, the project we took on was a demanding and technically challenging problem, and in almost a year of work by a team of engineers we made barely any real progress or any actual difference, but the numbers were great and people were satisfied during presentations. I ended up feeling completely disconnected from my job and losing all motivation to work there.


> I originally claimed that data-driven culture leads bad arguments involving data to be favored over good arguments that don’t

This is symptomatic of the deeper problem of thinking in terms of bumper stickers and slogans, instead of thinking from first principles. When it afflicts educated people, usually you hear slogans like "an anecdote is not data", or "that's the slippery slope fallacy". Instead of grappling with noisy reality, they have sharp cognitive categories with firm boundaries between concepts, then they try to squeeze things into these categories in order to make cognition easier because the relations between the categories are already understood. This gives them the illusion of rigorous and clear thought.


This entire discussion makes a good case for why the general populace would benefit from being taught the basics of philosophy.

In this case the topic of value is the often fraught relationship between empiricism and rationalism, and the impacts each have on the scientific process, research, education, and how we go about understanding the world.

To operate with one in the complete absence of the other is to expose yourself to huge, often fundamental gaps in your thinking, your arguments, and your plans. This is what the author is ultimately getting at from the direction of the empirical: data, in the form of a large collection of discrete observations, can be used to justify a sea of mutually exclusive claims that may or may not be in accordance with reality, and that's to say nothing about the quality of the data itself.


Most of the argumentation we do on questions that are worth debating isn't based purely on deductive reasoning, but more on informal reasoning and heuristics with limited evidence.

Toulmin identifies the three essential parts of any argument as

    - the claim
    - the data (also called grounds or evidence), which support the claim
    - the warrant.
The warrant is the assumption on which the claim and the evidence depend. Another way of saying this would be that the warrant explains why the data support the claim.

Toulmin says that the weakest part of any argument is its weakest warrant. Remember that the warrant is the link between the data and the claim. If the warrant isn’t valid, the argument collapses.

Example:

    Claim: You should buy our toothwhitening product.
    Data or Grounds: Studies show that teeth are 50% whiter after using the product for a specified time.
    Warrant: People want whiter teeth.
Notice that those commercials don’t usually bother trying to convince you that you want whiter teeth; instead, they assume that you have accepted the value our culture places on whiter teeth.

https://www.blinn.edu/writing-centers/pdfs/Toulmin-Argument....


I would start by simply putting everyone through a course in deductive reasoning at the earliest age possible: https://en.wikipedia.org/wiki/Deductive_reasoning

From there you can go into the whole spectrum of critical thinking approaches, and then on to what's basically the liberal arts e.g. philosophy, social sciences etc. as you desire. But the value you get from all of those things depends heavily on the framework you have for thinking about them going in.

Claiming random things are "fake news" would be a lot harder if people could work out what is and isn't fake by themselves!


> I would start by simply putting everyone through a course in deductive reasoning at the earliest age possible

I was taught the explicit premise of deductive vs. inductive reasoning as part of our unit on the scientific method in, I think, fourth or fifth grade. I always assumed this was a standard curriculum module.


>I would start by simply putting everyone through a course in deductive reasoning at the earliest age possible

Indeed. This would help ensure people's brains' transition function is stable enough to perform faultless computation. We forget that our brains aren't wired for exact computation. They're wired to perform approximations of computation that are good enough for survival.

As a result, you end up with myriad students who go through the school system via memorization and emergent fuzzy computation.

They reach adulthood without possessing the cognitive tool-set to grasp the subtleties and nuances of the world they live in. The fact that such people are also preyed on by charlatans, ad companies and politicians (the intersection of charlatans and ad companies) obviously doesn't help.


I agree so strongly with this.

The point I would add is that hardly anyone uses the empirical process directly. It is all 'this article claims this' or 'this study says that'. It's very 'meta' with little to no personal verification or testing of the claims - ie, theories based on theories or models based on models, or maps based on maps.

Very few check the terrain itself to confirm that the map applies. We trust education, experts, peer review etc. We're drowning in models, especially as these are easily represented on computers, but have no ability to check the models against reality.

PS this disassociation from reality will not improve as we move forward technologically. No doubt, in the metaverse we will be able to create ever more elaborate models, or is it that we will be ever more disassociated from our own anecdotal experiences? (Where 'anecdotal' is something to apologise about).


In the metaverse, the map is the territory. Think about how the word "map" is used in gaming.


And often we have no idea who made a particular model or what its limitations are.


> This entire discussion makes a good case for why the general populace would benefit from being taught the basics of philosophy.

But our entire education pipeline is optimized for loading people into the “system”. Philosophy etc. has little market value (unless it aligns with the system).


While this is indeed an issue that falls within the domain of philosophy, philosophy is also home to fields in which the absence of empirical evidence is regarded as an irrelevance, or at best a mere detail that can be deferred to an indefinite future. Take, for example, the resurgence of enthusiasm for panpsychism, and the enduring appeal of armchair metaphysics.

I am doubtful that academic philosophy has much enthusiasm for pursuing and inculcating the practical aspects of reason (any more than does theoretical physics or mathematics), though there are exceptions.


Exploring ideas like panpsychism doesn't mean you're committing to them being true. We can't know everything, and we can't always link new ideas deductively to things we are certain about, but we can notice the inadequacy of current explanations, say "suppose this explanation is true" and proceed from there. Every good philosopher knows that they're doing that. And the fact that people defend their position and attack opposing views is just part of the adversarial process for testing ideas. Yeah, of course ego and pride and hubris happen to many philosophers, and the academic profession is frankly in a bad state, but that doesn't mean the fundamental approach is bad.


That is a fair point in general, but in the specific case of panpsychism, at least one of its most active proponents (Goff) combines an insistence that it is the most plausible explanation of the mind with an apparent lack of interest in saying anything empirically verifiable about what it actually means.

Whether in physics or metaphysics, one can only go so far without facts. Even the mundane world of that which actually is has repeatedly turned out to be stranger than was imagined possible.


Idk if studying philosophy helps. Most philosophers were/are themselves committed to one school or theory, with gaps galore.

In any case, I think empirical science's defeat of rationalism (e.g. Galileo vs. the Church) has all sorts of ramifications. Social sciences like economics and psychology have a lot of trouble bridging the gaps.


Epistemology is a subfield of philosophy. Seems like a healthy understanding of that would be good for society right now.

> Most philosophers were/are themselves committed to one school or theory, with gaps galore.

Most scientists specialize in one thing, but students of science don't. One can learn about many schools of philosophy, as well.


The problem with this is that philosophy isn't a magical panacea that illuminates the way towards a more ideal state. It can be used to justify a sea of mutually exclusive claims that may not be in accordance with reality, and that's to say nothing about the quality of the arguments themselves.


I often experience the inverse: people come up with hypotheses and theories that should show up in observable data - but no-one bothers to look and instead everyone argues around logical constructs etc.


This reminds me a lot of the discussion of the scientific method by Karl Popper, and David Deutsch who was very influenced by Popper. "Being data-driven" sounds very empirical. Just look at the data, and see what you find in it.

But you can't just let the data "speak for itself" without an explanation or a theory that interprets the data. Popper in Conjectures and Refutations:

> Observation is always selective. It needs a chosen object, a definite task, an interest, a point of view, a problem. And its description presupposes a descriptive language ... which in its turn presupposes interests, points of view, and problems.

Deutsch, in The Beginning of Infinity, emphasizes the importance of conjecture, and the role of observation as refuting or criticising those conjectures:

> Where does [knowledge] come from? Empiricism said that we derive it from sensory experience. This is false. The real source of our theories is conjecture, and the real source of our knowledge is conjecture alternating with criticism. We create theories by rearranging, combining, altering and adding to existing ideas with the intention of improving upon them. The role of experiment and observation is to choose between existing theories, not to be the source of new ones. We interpret experiences through explanatory theories, but true explanations are not obvious.

To bring this back to the subject of the article, I might suggest that it's possible to be "data driven" without a sound explanation or theory that the data is either interpreted through, or used to criticise. Or maybe such theories do exist, but are left implicit.


Doesn't the scientific method specifically say you can't start with the data, you have to start with a hypothesis, otherwise you are subject to all sorts of selection/hindsight biases? I mean you can start with data, but then you have to develop a hypothesis and use it to create an experiment that generates new data in order to reach a conclusion. It seems like that is the compromise the author is looking for: start with a good idea, then see if you can verify it with data.


The scientific method as taught in K–12 schools is largely pablum. Often, the real process (beyond iterating off prior research) begins with collecting data, then noticing patterns to make a hypothesis to be tested with targeted data collection.


A lot of this perspective depends on what point in time you choose as the start of the process. You can start with the hypothesis, or you can start with what gave rise to the hypothesis: exploration.

But, it's a layman's mistake to confuse the two and use it as a critique of the formalized scientific method.

Science bodies (like the NIH) explicitly forbid reuse or reinterpretation of data. An individual may use exploration as inspiration for a hypothesis...but for it to grow from curiosity into science requires generating new data within a carefully considered framework for the hypothesis.


> Science bodies (like the NIH) explicitly forbid reuse or reinterpretation of data.

I think your main point is that collecting new data is necessary to test existing ideas. But reuse and reinterpretation of data is routine, e.g., in meta-analyses. It's not forbidden. You do have to disclose where the data came from.


> But you can't just let the data "speak for itself" without an explanation or a theory that interprets the data.

If you look at the heart attack data and ignore smoking, you end up inventing the mythical Type A personality — but it was data-driven.

https://en.m.wikipedia.org/wiki/Type_A_and_Type_B_personalit...


A single metric is just one very thin dimension from the temporal development of a complex process involving many factors. You need to watch a multitude of metrics to devise an explanatory theory, and even then, that theory can be rendered flawed when new and unexpected factors come into play.


I think the point is the theory doesn’t come from the data (it can’t). It comes from the process of creative conjecture in a person’s mind.


The fact that empiricism is false was a revelation to me as a young adult, after reading so much about the triumphs of science and reason and getting very excited (mistakenly) that you can get away without bothering with pesky things like epistemology. Of course, Quine and others pointed out that empirical observations are useless without explanations both of the phenomenon being measured and the measurement device itself (including, for example, the human vision system). And I believe it was Deutsch who pointed out that empiricism is itself an epistemology which had to be invented. It turns out that it tended to be a significant improvement upon previous widespread epistemologies, but that doesn't mean it's not false. :)


I agree, pure empiricism can lead to superstition. If you only learn from experience, and do not have any theory that ensures the consistency of the model, it's easy to infer wrong causal connections.


> Observation is always selective. It needs a chosen object, a definite task, an interest, a point of view, a problem. And its description presupposes a descriptive language ... which in its turn presupposes interests, points of view, and problems.

Thanks, I'd never heard this quote before. He's pretty much describing pragmatism à la William James. I had no idea.


The pragmatists went a little bit too far in my opinion, though it has been a long time since I read any of them. Popper is describing observations, not reality.

I highly recommend Conjectures if you can find a copy. It's a short read and interesting.


What do you mean that they went too far? James and Peirce were not describing "reality" (in this discussion anyway [1][2]) but rather were instrumentalists and thus saw every theory as having a purpose. That's the whole point of the squirrel argument. It's not just "depends on what you mean" (as per analytic and some medieval philosophy) but also depends on what you're trying to do (which in turn depends on what you want/like.) In any case, the similarity I was pointing out is just that theories have purposes and ignoring this is a blatant blunder.

1. James even endorsed religion and other make-believe if it was useful to your purposes.

2. Peirce: "Consider the practical effects of the objects of your conception. Then, your conception of those effects is the whole of your conception of the object."


I think I'm out of my depth at this point actually, and maybe shouldn't have opined as readily as I did on the pragmatists. I did a little recap of where I'd encountered the pragmatists before and realised I only read Royce, who was a friend and interlocutor of James. But I don't think he could be called a member of the pragmatist school, so I shouldn't take the impressions I got from him to be representative!

The impression I had of pragmatism was that it made claims about absolute truth or reality. That's where I felt things were taken a little too far. But the impression I have may be a caricature or misunderstanding on my part.

Deutsch has a fair bit to say about instrumentalism in Beginning of Infinity which I will leave to the interested reader to discover.


Oh okay. I'm not an expert either but I think many might even consider them anti-realists but I'm not sure... I have a friend who is literally an expert, so I should probably ask and figure that out :)

> Deutsch has a fair bit to say about instrumentalism in Beginning of Infinity which I will leave to the interested reader to discover.

Okay, thanks, I'll check it out!


I was about to post that.

Here is a good talk https://www.youtube.com/watch?v=EVwjofV5TgU


I won’t belabor the point because others have already made it: this article assumes there is some way to sort through good and bad arguments in the absence of data - a pretty big leap. The reality is all of our arguments are appealing to some sort of data (eg previous experience), it’s just that it doesn’t always fit in a neat definition of data.

Obligatory: https://en.m.wikipedia.org/wiki/All_models_are_wrong


"Previous experience" is not what is meant by 'data' in this industry. If company's decision-making was including both data and experience/wisdom/intuition, it wouldn't be so frustratingly wrong all the time.


I agree that's not what is meant by 'data' in the industry and I'm challenging that a little bit. However, even if we use the industry definition, what you're saying is hyperbole. Every company uses both data and experience to varying degrees. People get hung up when they think the balance isn't appropriate - not surprisingly, that happens when one or the other doesn't support their opinion. I'd rather be in a position of defending my opinion with data. It's already been quoted but... "If we have data, let's look at data. If all we have are opinions, let's go with mine."


There's lies, damn lies, and statistics. Models are further along, beyond statistics.


Models are just applied statistics?


The related problem that I see actually more often is the "you don't have big data" problem.

You know, in data science, you see people spending hours writing pandas scripts that replicate a few clicks in Excel for a one-off analysis. You see datasets of a few gigabytes being processed with Spark when SQL would be fine. You see ML techniques being thrown at questions that could be answered simply and reliably with basic statistical tests.
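
To make the "basic statistical tests" point concrete, a minimal sketch (the numbers are made up): comparing conversion between two variants usually needs nothing fancier than a chi-squared test on a 2x2 table.

    # Sketch with illustrative numbers, not real data.
    from scipy.stats import chi2_contingency

    # [converted, did not convert] for variants A and B
    table = [[120, 4880],   # variant A: 120 / 5000 = 2.4%
             [155, 4845]]   # variant B: 155 / 5000 = 3.1%

    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
    # A small p says the difference is unlikely to be pure noise; whether a
    # 2.4% -> 3.1% lift matters is still a business judgment, not a statistic.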

Especially in the B2C space a lot of companies, departments, products don't actually have a lot of customers and certainly not many decision makers. The N number is always going to be low. You can just talk to people. Let's say you are doing pretty well and running a SaaS with 1000 corporate customers paying a million each - that's a billion dollars in revenue - you can just talk to them. Certainly you can just talk to every single person who signs the cheque and those are the only people that matter.

And which is easier - putting together a thorough suite of A/B tests or getting some real customers to use your app on video and talking to them about what they are finding annoying, useful, missing? I see fewer people do that than you'd think.


frankly, there's only a tiny handful of these mythical saas "1000 users each paying 1 million dollars" companies. the vast, vast majority of saas startups are serving millions of "users" - i put that in quotes because these aren't real users or customers. they are real people checking out your product - but they aren't users or customers.

if you set up a gas station near the off ramp of some major interstate, say I-65 North, you will see cars pulling in to fill up on gas. maybe buying a coffee. now, these aren’t your customers in the traditional sense of a Target or Walmart customer. Because you will never see them again. They were driving from town A to town B via the interstate- they started running out of gas and needed to refuel, so they are in your gas station now. Once they gas up, off they go. They aren’t going to come back to you and establish a customer relationship or something. We’ve all been to tons of gas stations on the interstate and we’ll probably never go back to the same one twice - unless we are plying the same route everyday like a truck driver. So the task is to find and convert these truck drivers, who are the true repeat customers.

I was working on an android app which had like millions of unique cookies. When they hired me they said we have millions of users. No you don't. If you put out an android app in some popular domain, say news, entertainment, tax accounting etc - people will download and "use" your app. they are checking it out. they aren't users, in the sense they aren't using it every day or wanting to have a relationship with you, pay a subscription etc. conversion stats are minuscule, like 0.01%. So maybe 1 out of 10000 users is the truck driver. The vast majority will never ever use your app again. To do data science with these millions of rows of user interactions and find some nuggets just because you know your way around pandas or sklearn is a fool's pursuit. To ask foolish questions of your data, like why are all these people churning, is silly - they aren't your users, they haven't converted, they are just checking it out. In that sense, it's a waste of time and resources to do so much data crunching. Look at actual conversions, which are probably a few thousand people, not millions. Reach out to those thousands and maybe a few tens will give feedback and then continue to iterate on the product based on that.


This is a really good analogy (gas station customers) that I haven’t heard before. I’ve often tried to describe this ‘low intent’ group but never had a good way to make it relatable.


it gets the point across, but it raises other interesting questions.

sure, most customers at an interstate gas station will only visit once or twice, but that doesn't necessarily mean they are less important to the business than the truck drivers that fill up every day. maybe the bulk of revenue actually comes from one-time customers. this could be a case where attracting new customers is more important than retaining the current ones.


>the vast, vast majority of saas startups are serving millions of “users”

There are tons of B2B saas, including regional ones, that only serve a small number of customers way under millions.


Even more problematically, if you have a free service that could attract any kind of automation (e.g. an API SaaS with a free trial) then you're also going to get a lot of "users" who seem to be the "truck drivers" given a black-box usage profile, but who will also never actually convert. They use some free part of your service a lot, but they're not and never will be interested in any paid part of your service.

Maybe a close analogy would be: truck drivers who stop at your rest stop every time they come by... just to use the washroom. But who never go into the store itself.


Unfortunately, your reality-driven approach has ~zero emotional appeal for most managers, exec's, and alpha-data-scientist wanna-be's.


Data has CYA appeal.


Needing to CYA also has pretty low emotional appeal for managers, exec's, and alpha-data-scientist wanna-be's. (Until it's just about too late, obviously.)

And recall Mark Twain's old quip about lies & statistics. The more & bigger data that the folks who control the data & analysis have, the easier it is to make sure that those meet their own emotional & political needs.


Wasn't that Will Rogers?



Yep.


Why? Inadequately “technical”?


I think there are lots of reasons why.

One possible reason: no one whose job it is to write Python scripts was ever promoted for making an Excel spreadsheet when that is the simpler and more practical approach. And no manager of people who write Python scripts is going to be able to use that Excel spreadsheet to sell "I need more responsibility and head count." People tend to follow incentives, rather than focusing on making wise decisions.


> People tend to follow incentives, rather than focusing on making wise decisions.

This is the key issue. Solving it isn't easy -- it requires people who are wise, and wisdom is a scarce commodity.


Even wise people likely follow the incentives. What is wise about doing something that your employer doesn’t reward in exchange for doing something that they will reward?


It's wise to do what's morally right, regardless of the consequences.


Sure, but I guess my point is that it's not clear that going against the wishes of your employer is what's morally right.


Excel has a history of forced format updates, breaking compatibility. I know people who banned it because they got tired of marching to MS's upgrade beat.

Python 2 to 3 upgrade aside, can’t really say the same about the language.

There are a number of good arguments out there that might violate an engineer's perception, which one might call a cognitive data model built through training and experience.

There is no theory that makes any given engineering path “wiser” than others. Just engineers chasing incentives to be engineers.


Libraries introduce breaking changes, too. I’ve been bit by silent default changes in Pandas, for example. To me that’s kind of striking because I also wouldn’t consider myself a major user of the library.


- "You just talked to them and concluded this? What certainty you can have on this conclusion, and how can we trust you just didn't want it to be true from the start?"

A few slides showing the data, a boring 10 minutes about methodology, and finally the conclusion bring an air of reliability that you can't replicate when you show up with knowledge instead of data.


Our field is filled with people who want to use the most technical approach possible to solve a non-issue; their paychecks probably depend on it.


Talking to people is not going to help you either. You end up getting a lot of noise and making sense of what you hear is difficult. When you keep probing you will get to hear stuff that's not really critical and just often made up because you ask too many questions. Classic trap of market research.


ycombinator startup school disagrees and says it's one of the two CRITICAL things founders must have a hand in.

Of course you need to interpret it but it's incredibly important and I do not think you really know what you are talking about.

https://www.ycombinator.com/library/6g-how-to-talk-to-users

Almost all the major fails I have seen in my career have been some derivative of not understanding your users.


At the very least, I feel talking to users will give you decent hypotheses to test.

The creation of hypotheses is often glossed over as a trivial first step in scientific or data-driven decision making, but in fact, that's where the magic lives.


> I do not think you really know what you are talking about.

Nice strawman. I have never said to never talk to your users, but to pretend that using data is meaningless and you should follow some bullshit and vague "good argument" instead is just sheer foolishness.


That depends on how big the differences you're looking for are.

When you've got an early product, there are probably things you can do that 2x as many people will like as dislike. Even a small set of customers will be good for discovering this. When you've got a mature product, you should be optimizing around the edges and need a large sample size to find those 1% wins.

Likewise if you don't have scale, there are a lot of well-known best practices that probably improve your site by 5-10%. You probably don't have sufficient volume to test those ideas yourself, so following general best practices is a good idea. But if you have scale, you can and should A/B test the heck out of everything. And then do it again in a couple of years in case the answer changes.
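
To put a rough number on "need a large sample size to find those 1% wins", here is a back-of-the-envelope power calculation (a sketch using statsmodels; the 5% baseline and 1% relative lift are assumptions):

    # Sketch: sample size needed per arm to detect a 1% relative lift
    # on a 5% baseline conversion rate (illustrative numbers only).
    from statsmodels.stats.proportion import proportion_effectsize
    from statsmodels.stats.power import NormalIndPower

    baseline = 0.05                       # assumed 5% conversion rate
    lift = 0.01                           # 1% relative win -> 5.0% vs 5.05%
    effect = proportion_effectsize(baseline, baseline * (1 + lift))

    n_per_arm = NormalIndPower().solve_power(effect_size=effect, alpha=0.05, power=0.8)
    print(f"~{n_per_arm:,.0f} users per arm")   # comes out on the order of millions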


It's this data-led uncritical thinking that destroyed Facebook.


Talking to customers might uncover some things you haven't even thought about.


The same thing is true with pure data analysis. Unless you have never analyzed data in depth in your life, that should be pretty obvious.


You have to do both. You can't just look at data & you can't just talk to users / customers without looking back at data.


of course, but then "following the good argument" in the original article just amounts to ignoring data, which is nonsense.


> You end up getting a lot of noise and making sense of what you hear is difficult. When you keep probing you will get to hear stuff that's not really critical and just often made up because you ask too many questions.

Yeah, but this just means qualitative data is challenging, not that it's useless. You have to be careful when asking questions that you're asking useful questions and not leading people into telling you what they think you want to hear (or going off on useless rabbit trails like what they think the product should be instead of what the problem they want the product to solve is).


While I agree with your suggested outcome for some or many, a product designer or manager who is skilled at asking questions, going deeper, removing distractions, asking why continuously, and empathizing while not seeming judgey can garner really good insights.

I am guessing it's like you see of a psychologist with a patient on TV..... the customer must feel comfortable enough to open up, then flood gates can open.


Go talk with your customer service. Oops, there's so much turnover that nobody cares and everyone is gaming KPIs.


You both make good arguments, there must be a middle here. I doubt you can uncover what your customer wants very well without just talking to them, but maybe they wind up misleading you sometimes. A/B testing to discover a customer wants a whole different paradigm isn't possible.


This is true; what customers SAY they want doesn't necessarily correlate with what they will actually use or pay for.

I mean I worked on an app where in one part, the end user could upload CSV files to be used. What they SAID they wanted was basically a full data management system and RESTful API to enforce constraints, data validation, record retrieval and updating, etc. What they probably wanted was an excel sheet. I dislike how my employer was like "yeah sure if you pay for it" to them.


> what customers SAY they want doesn't necessarily correlate with what they will actually use

A key cause of this in many cases is that the stake-holders you talk to do not work closely with the end users of the system. Talking to the right people can help a lot, though unfortunately as a 3rd party this is not usually anywhere near your realm of control.

The other issue is them knowing what they have and wish to store, but not knowing what outputs are going to be needed down the line. That is harder to fix, but having some good industry knowledge within your company can be a great help on such matters – you can then sometimes preempt client needs if the people holding that knowledge are keeping an active eye on changes (for instance new/planned regulations that might be coming into force in X weeks/months/years).


Very dated thinking. Suggest you read up on Lean Customer Development (for example).


been doing product design most of my life in top corporations, so I'll pass on your opinion.


> You know, in data science, you see people spending hours writing pandas scripts that replicate a few clicks in Excel for a one-off analysis

I mean, having an Excel doc at all usually implies hour(s) of work formatting the data in a structured manner. Sometimes collective decades of work depending on how much heavy lifting your 15GB .xlsx is doing.


This is why I've adopted R and Python for the data work I do. I have a bunch of exported data (CSV files) that I use. Manipulating the structure and format is 90% of the work. I wrote the scripts once, now I can reuse that for everything instead of playing games getting those CSV files (dates in particular) to play nicely.

Even a one off analysis is actually FASTER in Pandas because I've done the work of farting around with the formatting. Now I can just write the necessary analysis code, rather than deal with the formatting.

That said, my data analytics work is seriously small potatoes compared to many. But I can write a quick pivot table using dplyr faster than I can do it in Excel.
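
For what it's worth, the "write it once, reuse it" part looks roughly like this for me (a sketch; the file name and column names are made up):

    # Sketch: the one-time formatting work (dates, messy text columns) lives in
    # one function, after which each ad-hoc analysis is a couple of lines.
    import pandas as pd

    def load_export(path: str) -> pd.DataFrame:
        df = pd.read_csv(path, parse_dates=["order_date"])    # parse the dates once, correctly
        df["region"] = df["region"].str.strip().str.title()   # normalize the messy column once
        return df

    df = load_export("exports/orders.csv")

    # The Excel-style pivot table, now reproducible:
    df["month"] = df["order_date"].dt.to_period("M")
    pivot = df.pivot_table(index="region", columns="month", values="revenue", aggfunc="sum")
    print(pivot)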


Often that work exists regardless of whether a table of processed data that engineering formatted and schema-fied is dumped out to Excel or queried over SQL into Pandas...

I've seen this myself: the person who "naively" downloads that table and plays around in excel finds interesting things that the person who was using Pandas hadn't, because the code to manipulate columns and do certain types of calcs is actually more time consuming to write and modify than making a bunch of new columns in Excel with a bunch of formulas!

A good data scientist will have a more rigorous approach to their notebooks and practice reuse and so on... but that's not necessarily easy.


> the person who "naively" downloads that table and plays around in excel finds interesting things that the person who was using Pandas hadn't,...

I think they call that serendipity. Never underestimate its power.

https://didgets.substack.com/p/data-science-and-serendipity


I say this about every other day at work (we even have only internal users so it's part of their job to talk to us). So far impact: zero....


Would you say the big data threshold moves every year?

That would explain why people think a <1TB is big data.


>Would you say the big data threshold moves every year?

It moves with Moore's law. Big data is anything that cannot reasonably fit into memory for a single server, so yes that number is well over 1TB now.


I know this isn't the correct definition but I think of "big data" as the set of data which takes me more than 15 minutes to query on average with a moderately complex Postgres SQL join on well indexed information. I use JSONB in Postgres regularly and have indices on that too. So far I have gotten really far with increasing Postgres work_mem to a gig or more, a fast SSD, and strategically placed materialized views. These kinds of operations in Pandas make my computer billow smoke by comparison.


I don’t think many give much thought to what it really means. They just use the term because it sounds cool, either to themselves or to their superiors. Same as with Machine learning.


What used to be 'big data' is now just 'normal data'.

https://didgets.substack.com/p/big-data


Why not both…


Maybe what I wrote comes off a bit one sided - I'm really urging people to do what actually makes sense in their specific context - which can be both!


To use Clayton Christensen's theory of innovation here: for sustaining innovation, businesses tend to be purely data-driven. They continue to grow and make more money based on choices made with pure data.

For disruptive innovation however, there needs to be an “argument” or opinion to help drive that data based on the industry trends. Companies then take a risk of delivering something new and good enough to the market. Also known as disruptive innovation.

This has shifted the idea of being data-driven to being one of “data-inspired”.

Anyone can make the same dataset fall in their favor. That's the problem with being purely data-driven. Another way to think of it, in the US especially, is that our two-party system draws wildly different conclusions from the same data. What's preventing businesses from doing the same?


To the author... I'd suggest a rewrite of what you're trying to communicate because your usage of "good-argument-driven" is a textbook example of Begging The Question: https://en.wikipedia.org/wiki/Begging_the_question

For discussion's sake, let's go along with excluding data/metrics/science in pushing for arguments. In this framework, what exactly is a "good" argument based on? Gut feel? Opinion?

There was a famous quote by Jim Barksdale, the former CEO of Netscape: "If we have data, let’s look at the data. If all we have are opinions, let’s go with mine."

(So the tie-breaker in competing arguments in that case was "hierarchy-of-arguer-driven".)

So Jane and Bob disagree on the next action to take. Jane thinks her argument is a "good argument" but has no data. But Bob thinks he has a "good argument" but no data.

How does this thread's blog post help resolve the above scenario? (Blog's answer: you're driven by the one that has the good argument.) ... which is circular.


This is a textbook example of The Strawman Argument.

I'm pretty sure the author is talking about "data" in the context of "databases", i.e. repositories of digital information that can be queried, transformed and displayed (dashboarded).

In other words the author is assuming the value of humans' more natural data processing: common sense, personal experience and conversing with others (empathy).

If a process/feature/etc doesn't make sense within how you understand your product, then you can make an argument based on that. The argument will involve data (i.e. the current architecture) but not data in any database.


>I'm pretty sure the author is talking about "data" in the context of "databases",

Yes I agree and the "data metrics" was the interpretation I was commenting on. Instead of straw-man, I actually steel-manned what the author was trying to communicate in my other comment. (One has to read this thread's blog post combined with his previous blog entry to understand what the author means by "good argument".)


A simple answer to this is that good explanations are hard to vary.

More here: https://www.lesswrong.com/posts/jcTsbaQ8hNc7qxwaQ/explanatio...


>A simple answer to this is that good explanations are hard to vary.

But the "hard to vary" explanations were built up from observing data of smashing particles. E.g. from your link:

- Frank Wilczek describes hard-to-vary-ness as follows "A theory begins to be perfect if any change makes it worse." He explains further using the Standard Model as an example of a hard-to-vary explanation: Too many gluons! But each of the eight colour gluons is there for a purpose. Together, they fulfil complete symmetry among the color charges. [...] No fudge factors or tweaks are available.

This author's blog post about "data" also links to his previous post[1] about "science" leading one astray from "good arguments", which is the opposite of "hard to vary" explanations.

Here's the reason for the disconnect: The author is using the adjective "good" in his idiosyncratic way to describe the type of arguments that depend more on "storytelling" and "intrinsic motivation" -- rather than empirical science/data. Excerpt:

- >And here is a secret: in the natural sciences themselves, storytelling and bare conjecture are far more important modes of persuasion than data-based empirical argument, anyway. [...]

- >A good example of the sort of argument I think is helpful is A Philosophy of Software Design. Ousterhout defines his terms clearly, accompanies his definitions and claims with illustrative examples, and tells an occasional story. You, the reader, are free to evaluate each claim based on whether it plausibly seems to capture the essence of what you have encountered in your experiences writing software. For my part, I didn’t find most of Ousterhout’s ideas to be persuasive, as some of my colleagues did, but that doesn’t mean they aren’t good arguments,

Those types of subjective arguments the author is espousing are actually "easy to vary" -- because they don't require constructing a cohesive theory that reconciles data that looks contradictory (e.g. the Standard Model, or the Theory of General Relativity reconciling the speed-of-light observations).

[1] http://twitchard.github.io/posts/2019-10-13-software-develop...


Duh, science is the interplay between observations and explanations. But that doesn’t change that what makes an explanation good or bad is whether it’s hard to vary.

I'll use a David Deutsch example: let's say a theory claims that eating 1kg of grass cures the common cold.

You could do an experiment and find it does not. But you could easily vary the theory and say actually it’s 1.1kg and so on.

But you wouldn’t actually need to do the experiment because there is no good, invariable explanation as to _why_ eating the grass cures a cold.

In that case, you wouldn’t need any data or observations at all. You could simply ask why eating exactly 1kg of grass cures the cold. What is the mechanism of action?

In this way you can see that empiricism is not sufficient in any case to give evidence to a theory. We need only a good explanation to judge whether a theory is worth considering to be true. From there, we can do further experiments/observations to rule it out. But never to prove it true.


This is not what the data shows

https://www.google.com/search?q=data+driven+companies+more+p...

Any good-argument-driven argument you attempt to make is almost always based on political motivating factors rather than on what is good for the business.

Intuition-driven decisions work when the market is behaving normally; however, they are generally too slow in a fast-changing market like the one we have been in since the start of COVID.


> Any good-argument-driven argument you attempt to make is almost always based on political motivating factors

If this is true in the case of a specific theory, then that is not a good theory.


I was mostly referring to business decisions. For that type of decision there are always political factors at play (building empires, career growth, dislike for another person/team) that do not necessarily align with business success. Lehman Brothers is one of those examples.


You’re actually proving my point. Those so-called successful business decisions never turn out to be true enough to work long term. Like all knowledge, they are eventually shown to be false.


Data-driven companies, like Amazon, tend to do better, on average, than other companies. However, this doesn't mean they are immune to mistakes. At some point Amazon will fall into decline and disappear, but that doesn't imply that being data-driven was a bad decision.


I think the problem is that people chronically underestimate how hard good science is.

Professors get this wrong all the time, despite being some of the smartest people we have around, despite decades of experience and education, despite a career and reputation on the line, and despite a system of peer review to catch mistakes before they get published.

Designing experiments is really difficult.

Interpreting experiments is difficult and unintuitive.

Statistics is difficult. You can't just look at whether the number went up. You need a deep understanding of significance, power and effect size, and you should probably be doing an ANOVA or some such.
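
To make that concrete, here's a minimal sketch (in Python, with entirely invented numbers) of a metric that "goes up" without the difference being significant or large; the ANOVA and effect-size calculations are standard, the scenario is hypothetical:

    # Hypothetical A/B comparison: the mean "went up", but is it signal?
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    a = rng.normal(loc=100.0, scale=15.0, size=50)   # control group metric
    b = rng.normal(loc=103.0, scale=15.0, size=50)   # variant: slightly higher mean, same noise

    print("mean A:", a.mean(), "mean B:", b.mean())  # B usually looks better

    # One-way ANOVA (equivalent to a t-test with two groups)
    f_stat, p_value = stats.f_oneway(a, b)
    print("p-value:", p_value)                       # often well above 0.05 at this sample size

    # Cohen's d: effect size, independent of sample size
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    print("Cohen's d:", (b.mean() - a.mean()) / pooled_sd)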


> A weak argument founded on poorly-interpreted data is not better than a well-reasoned argument founded on observation and theory.

So a good argument is founded on...good data and good understanding of data?

The article more seriously makes the mistake of begging the question: it presupposes a known classifier of good and bad arguments and then goes on to say that bad arguments with data are worse than good arguments. But how do you know good arguments from bad arguments in the first place? What makes a good argument, if not empirical data?


> It presupposes a known classifier of good and bad arguments and then goes on to say that bad arguments with data are worse than good arguments.

It does indeed assume that there's a way to tell bad arguments from good; and so the focus should be on learning what makes an argument good or bad.

> ...What makes a good argument if not empirical data?

Consider the following conversation:

A: We've done some numbers, and we've determined that there's a correlation between the number of firemen at a fire and the total damage done by the fire; with the fires handled by a single crew of three firemen doing the least damage. So we should limit all fire responses to a single crew to minimize damage.

B: That doesn't make any sense -- of course we send more firemen to bigger fires, and bigger fires cause more destruction! If we take your advice, those big fires will cause even more damage!

A: Hey, my argument is backed by empirical data; yours is just theoretical!

Like, sure, it might be even better if B had empirical data to back him up; but even without that data, B should be winning the argument here. And the argument of the article is that many people espousing "data-driven" approaches end up being like A: Not scrutinizing the logic that they're using to analyze the data, and not acknowledging the limitations of what the data collected can say.
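
For what it's worth, here's a rough simulation (all numbers invented) of how A's correlation can appear even when extra crews genuinely reduce damage, simply because fire size drives both the dispatch decision and the damage:

    # Confounding sketch: fire size determines both the crews sent and the damage done.
    import numpy as np

    rng = np.random.default_rng(0)
    size = rng.exponential(scale=10.0, size=5000)                        # big fires are rarer
    crews = np.round(1 + size / 4 + rng.normal(0, 0.5, 5000)).clip(1)    # dispatch scales with size
    damage = 5 * size - 2 * crews + rng.normal(0, 5, 5000)               # crews genuinely help

    # Raw correlation: more crews "cause" more damage (speaker A's reading)
    print(np.corrcoef(crews, damage)[0, 1])                              # strongly positive

    # Regressing on both size and crews recovers the true, negative crew effect
    X = np.column_stack([np.ones_like(size), size, crews])
    coef, *_ = np.linalg.lstsq(X, damage, rcond=None)
    print(coef)                                                          # crews coefficient is about -2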


You left out hypothesis C: if you send too many firefighters they get in each other's way and become difficult to coordinate, making them worse at putting out the blaze.

And hypothesis D: fires with the same number of firefighters cause different levels of destruction because some departments are organized to let their 10X firefighters work more efficiently.

And hypothesis E: Many arsonists become firefighters, so having more firefighters increases the risk that an arsonist will be on the team.

And hypothesis F: The same as hypothesis A but since some tools require more than one person there's actually a minimum threshold below which destruction skyrockets

And hypothesis G: Wealthy areas that can hire more firefighters also suffer more expensive destruction for a given blaze.

And hypothesis H: If we invest the resources we're spending on firefighters into fire prevention we can reduce total fire damage

And infinitely more hypotheses.

There will always be another argument that makes some logical sense. And unfortunately reality is under no obligation to make sense, so it's entirely possible something that sounds stupid and counterintuitive could just happen to be correct anyways.

But with data, we can test hypotheses. Vary the number of firefighters and see what happens.


Good arguments (explanations) are hard to vary.

More here: https://www.lesswrong.com/posts/jcTsbaQ8hNc7qxwaQ/explanatio...


In Range, David Epstein talks about NASA and some of their disasters, like the explosion of Challenger. NASA is entirely encased in specialized knowledge, and has a completely data-driven mindset, with no room for logic: if you can't prove it with data, they won't even consider it. He explains, “Reason without numbers was not accepted. In the face of an unfamiliar challenge, NASA managers failed to drop their familiar tools... The Challenger managers made mistakes of conformity. They stuck to the usual tools in the face of an unusual challenge.” Even though the mistake that led to the Challenger disaster could have been caught, it was the uniformity of thinking that led to an organizational blind spot, and that uniformity was being too focused on data-driven arguments.

There is a famous call, the night before the launch, on which engineers raised their concerns -- but those concerns were based on intuition and a few cherry-picked samples, not a full set of data. Because of the lack of data, they went ahead with it, and we all know the tragedy that ensued. Moreover, other engineers who agreed that there was an issue didn't speak up, because they too lacked the data, and knew that management wouldn't care.


One of the big reasons data-driven approaches are so seductive is that it's very difficult, in the moment, to distinguish between a good argument and a well-crafted rationalization.


The issue is that it doesn't fundamentally solve the problem. It's true that a good argument logically supported by data is better than a good argument that hasn't been checked against data. But the existence of data in the argument doesn't help you determine whether it's a good argument logically supported by data, or a well-crafted rationalization speciously supported by data.


1. If there are no good arguments in the collective, there's no retrospective, and it's primarily a management and psychological issue. No one is able to fully self-reflect, and that breaks the existing delegation / escalation chains.

2. If there are no viable data sources -- ones where it can be proven that there's a correlation with actual business processes -- it's a management problem. People can't establish viable metrics, once again, mostly due to 1.

This is something any company of any size and any budget can struggle with, due to lack of XP and the usual deficiency in collective XP-accumulation / knowledge sharing. You can't self-reflect on something you haven't learned about yet. And due to 1 this is a closed loop, because the lack of XP can't be escalated accordingly; most of the time it's also a Workplace Deviance factor.

3. Practically, it ends up in a bouquet of Workplace Deviance because no one in the end will be willing to take the blame and actual responsibility to fix anything.

Any Problem-vs-Solution type of culture will worsen things a lot, i.e. "all the blame and no compassion". Companies are usually forced to adopt some Teal stuff in the end, maybe for no other good reason than just to keep on growing.

The idea of hiring HR that can "work by the book" and actually build up a personal profile of how anyone could fit into all this mess is impossible by definition -- due to Employee Silence and broken retro, no one will be willing to expose all the shit that is happening in the first place... So, most of the time I see Kitchen Sink companies with volatile outcomes, where there is really no one who could even listen to any arguments in the first place.

Google's internal ML-driven productivity metrics became a meme already for all the reasons described above. You can't reason with Toxic and Inadequate people.

Also, Asana's claim that Social Loafing is a myth and everything else is a retro deficiency is really wrong -- retro can prevent and display certain glorious occasions, but it's not a root cause of any psychological effect, by definition.


Good article. When your only tool is a hammer, every problem looks like a thumb.

While we're at it: I've actually been in scrums where the "burndown rate" was analyzed as if it was actually A Thing. It is not A Thing.


Key idea is the "data maturity" of the topic under discussion.

Where there is data, you should use it and be smart about it.

For a lot of big decisions, especially in companies doing something new, there is no good data at first. You have to reason about it based on experience and analogy.

Then, once you commit to a path, you can start gathering data to see if your hypothesis was correct. The further you go, the more you can rely on data, assuming you know how to think about it.

Discussions about being data-driven that don't take into account the "data maturity" of the situation are nonsensical.

Being "data driven" when you're considering something radically new is either delusional or a cop out.

Ignoring data when it could correct your biases is either lazy or wrong or both.

And finally, lots of people who claim to be "data driven" are not smart about data. To paraphrase Wilde, "data is rarely pure and never simple." It doesn't just reveal truths you can treat as dogma. It's ambiguous and takes a lot of work to interpret. A lot of "data driven" teams aren't doing that work.


I'm surprised there's no mention of Goodhart's Law [0].

Even if the metric is "well understood and free from human/social factors", once you start using it as a target that will no longer be the case.

[0]: https://en.wikipedia.org/wiki/Goodhart%27s_law


I couldn't agree with this more. I feel like the author took some of the arguments straight from my brain—I'm exhausted by pseudoscientific "data-driven" arguments.

From my experience, most of these try to distill an incredibly complex problem space down to a one-dimensional black and white decision. But the real world doesn't work like that–it's full of grey area, and things we can't effectively measure. If you're trying to slice and dice data down to a happy one-dimensional decision point, you're often missing or ignoring important detail.

At work, I'm far happier with postmortems that have general, open "good/bad" lists of after-the-fact feedback, which we use to inform how we prioritize and design what comes next.


Being data-driven for the sake of being data-driven is indeed becoming an issue. The resources spent measuring and analysing data are overwhelmingly larger than they should be in most cases. Cohorts of "data scientists" and "managers" dive head-on into data without much (if any!) first-principles thinking. People tend to replicate metrics without much thought about their relevance to the specific situation. Thinking properly is a very hard skill to acquire (the hardest?), and most do everything they can to avoid it.

"What you measure affects what you do. If you don't measure the right thing, you don't do the right thing." -- Joseph Stiglitz


Great article, but I think it somewhat misunderstands the impetus for the concept. "Data has its place" sounds obvious precisely because "data-driven" has been such a successful concept. The alternative perspective, which used to be very common in our industry and still pops up from time to time, is that metrics are something you write for debugging and business decisions are made by gut feeling or abstract philosophical analysis. (Most software companies had to make decisions this way in the pre-cloud era, because it wasn't usually feasible to collect usage metrics.)


The hidden assumption here is that things go well if and only if (you think) you understand all the factors that influence your metrics, can do experiments and are prepared to use fancy statistics.

Which I reckon is a bit iffy. Special relativity was thought out well before any experiments to test it were feasible, and if understanding everything that influences your metric is a prerequisite then you can blame all failures on insufficient understanding without having any way of knowing when you have enough understanding.


I’ll probably be buried in all these comments, but my position is that data is only as good as how it is collected. Sloppy data collection gives rise to sloppy conclusions through unknown biases.

The key is to understand the ‘data generation process’ so you can identify biases. My experience suggests that doing so side-steps some common pitfalls.

I recommend reaching for ‘The Book of Why’ by Judea Pearl. He includes many real-life examples that are surprisingly applicable to modern data science.


David Deutsch (Father of Quantum Computation and one of the most brilliant human beings alive) has a really great way of thinking about these kinds of discussions.

He calls it good explanations.

A good explanation is something that is hard to vary while still solving the problem it purports to solve.

He is against most use of Bayesianism when used for predictions.

Great presentation here

https://www.youtube.com/watch?v=EVwjofV5TgU


A major exception to this reasoning is performance. Argument-driven performance suggestions are wrong more than 80% of the time, and likely wrong by several orders of magnitude. You can’t know just how wrong you are without appropriate data.

This makes for a good litmus test of whether people are lying to you about software or, more likely, have absolutely no idea what they are doing.


Performance falls into the article's category of "things you can reliably measure."

Thus, the author would agree that in performance optimization, you should collect and analyze data.
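
As a rough sketch of what that looks like in practice (the functions here are hypothetical, made up for illustration, and which one wins can depend on the interpreter version -- which is exactly why you measure instead of arguing):

    # Hypothetical micro-benchmark: settle a performance argument with timeit
    import timeit

    def concat_loop(n=10_000):
        s = ""
        for i in range(n):
            s += str(i)
        return s

    def concat_join(n=10_000):
        return "".join(str(i) for i in range(n))

    for fn in (concat_loop, concat_join):
        # repeat() returns several samples; take the best to reduce scheduler noise
        best = min(timeit.repeat(fn, number=100, repeat=5))
        print(fn.__name__, f"{best:.3f}s for 100 runs")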


The problem isn’t what the article author believes, but rather what developers commonly (perhaps almost universally) believe.

Most developers will fall back to intuition for any performance-oriented decision, even when they otherwise prefer data-oriented decisions and even when the task at hand is critical to the health of their product/business. This is because performance measures require:

1. Additional effort

2. (most importantly) A willingness to abandon familiar concepts of approach

Sometimes such decisions vested in intuition are lies by omission, a form of lying, because the resulting self-comfort is worth more than the numeric benefits.


A whole book was written on this very topic: "The Tyranny of Metrics" by Jerry Z. Muller https://press.princeton.edu/books/hardcover/9780691174952/th...


Sure, but the thing with "good arguments" is that when two hypotheses oppose each other, supporters on each side are sure they are the ones behind the "good argument", so ...

Data doesn't lie; it could be nuanced, yes, but if it's truthful then you cannot really argue against it.


This reminds me of the Principal Chalmers meme. In this case, first pondering whether he is wrong, only to conclude that it's the data that's wrong.

I know that's not what the article says per se, but it's only one slightly abstracted reinterpretation removed, as OP's title demonstrates.


Minor nit:

Principal Skinner; Chalmers was the superintendent.

https://www.knowyourmeme.com/memes/am-i-so-out-of-touch


Good point. Still, I would've presumed Chalmers was superintendent at some point in his career. Additionally, Chalmers has on occasion [1] been referred to as "Super Nintendo Chalmers".

[1] https://www.youtube.com/watch?v=av4lbel9aIo


Doh!


Be data-driven, and question the provenance of your data all the time. Otherwise you will end up like economics, a field with prettier models and more mathematics than almost every engineering field, yet one that gets every major prediction wrong.


> Be good-argument-driven, not data-driven

FWIW the proper term is "data-informed."


I've seen a lot of good arguments put to rest with a good test.

The key is collecting and looking at the data correctly.

Data without a keen understanding of why you need it and what you're looking to solve with it is not much use.


Good argument is just another name for confirmation bias, most of the time.


Yes, data is useless without a qualitative explanation. There are simply too many possible confounding factors that you cannot eliminate without understanding what they may be.


Be politically driven (company politics).

Good arguments should take into account people's ambitions and political aspirations, especially at big Fortune 500 companies.

Startups can be more honest.


Being argument driven gives control to the organization's 'lawyers'. People can be very persuasive independent of the reality of the situation.


See also the book How to Lie with Statistics and similar (I think a follow-up book was called How to Lie with Charts and Graphs).


"According to the data on business failures, you should have never started this business"


I don't know. There are a ton of great arguments that will lead to dead ends and stalled projects. :(


“Resist! Be skeptical! Have no tolerance for poor arguments made with data. Keep intrinsic motivation alive.” The last sentence was the TL;DR.



