
Isn't this a symptom of the exact testing procedure being a white box that rarely changes? Many students would score 100% on the ACT if the entire question and answer set were widely published.

Also, that seems like a silly test. Couldn't you just measure the differential after mixing? As TFA mentions, noise is a factor too. For the water-heater example, the limited number of probes introduces a known amount of potential variance.

Rather than a straight % efficiency, a more useful number to report would include the confidence interval, e.g. "99% ± 3%". Consumers can't do stats, but at least it's more honest. TFA touches on this but doesn't draw any conclusions other than "we need to look at the source data and do our own analysis".
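
For what it's worth, here is a rough sketch of how a "value ± interval" figure could be derived from a handful of probe readings. The function name and the readings are made up for illustration, and it uses a normal approximation; with only a few probes a Student-t critical value would give a somewhat wider (more honest) interval.

```python
import math
import statistics


def efficiency_interval(readings, confidence=0.95):
    """Return (mean, half_width) for efficiency readings given in percent.

    Hypothetical helper for illustration; not taken from TFA.
    """
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate spread")
    mean = statistics.fmean(readings)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(readings) / math.sqrt(len(readings))
    # Normal approximation (assumption): two-sided critical value.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean, z * sem


# Made-up readings from five probes, in percent.
readings = [96.0, 98.0, 100.0, 102.0, 99.0]
mean, half_width = efficiency_interval(readings)
print(f"{mean:.1f}% ± {half_width:.1f}%")
```

Fewer probes or noisier readings widen the interval, which is exactly the information a bare "99%" hides.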




This phenomenon also appears in the testing of crash helmets. The DOT (US government) helmet standard is easy to game since it's very strict on how and where the helmets are tested. The Snell Foundation standard is a bit better since the humans testing the helmets are allowed to look for weak points to target for anvil drops. The new FIM standard adds more variability to the tests to better simulate the varying angles of crashes.

https://youtu.be/uLj9WfoWPSQ?t=413


Crash test dummies have basically this problem also. They're designed for realism in certain very narrow ways, and then the very small number of approved dummies are used for testing car safety.

The industry has made a bit of progress, surprisingly unprompted by regulations - female and child dummies came into circulation before they were required in tests. But overall, testing is still run against a tiny handful of body types which move 'realistically' in only a few regulation-guided respects.


I think some of this falls into the simulation paradox: the more accurate the simulation, the closer the simulation is to the thing being modelled. But the cost of that accuracy grows quadratically in most cases, so at some point meaningful increases in simulation accuracy cease to be economically viable.


Yeah, but in the words of RyanF9, "The US government can afford a BB gun", so there's no reason that DOT can't test helmet visors.

The main reason the DOT standard is so bad is that it's mired in bureaucracy and managed by a severely underfunded organization.



How did I know this linked to some kind of general principle with a cool short name even before reading or opening the link?


Well, there are several such:

https://en.wikipedia.org/wiki/List_of_eponymous_laws

Though I'm not aware of a law stating that for any given principle there is an existing eponymous law.


Haven’t you heard of DredMorbius’s Law of Eponymity?


But bananamerica really discovered it.

... oh, wait: https://en.wikipedia.org/wiki/Stigler%27s_law_of_eponymy


The patent system in a nutshell, ladies and gentlemen.


Something tells me there’s an XKCD for that, or at least there should be.


It means the test procedure has vulnerabilities which need to be patched.



