
Cryptographically Certified Hypothesis Testing [pdf] - ogisan
http://sachaservanschreiber.com/thesis.pdf
======
hannob
This is basically "Do preregistration and put it in a blockchain-like
structure".

It's not a bad idea. It adds cryptographic integrity to good scientific
practice.

But hardly anyone outside medicine preregisters hypotheses yet. In computer
science the practice is basically unknown. All the A/B testing you hear about
in user studies is almost certainly p-hacked. So the first step would be to
convince people that they actually have to do something about p-hacking.

~~~
ianbooker
Most fields are not as accustomed to confirmatory statistical analysis as
medicine, psychology or finance are. In most fields, the poor understanding of
p-hacking is merely a consequence of this.

A good habit and first step for any field would be to say more about the
validity of an analysis per se, or to give studies a "reality check" at all.

------
mitchtbaum
Complex systems researcher Didier Sornette did something similar in 2009 for
[The Financial Bubble Experiment: advanced diagnostics and forecasts of
bubble terminations](0):

• ...We do not make this document public. Instead, we make its digital
fingerprint public. We generate three digital fingerprints for each document,
with the publicly available (1) MD5 hash algorithm [1] and (2) 256 and 512 bit
versions of the SHA-2 hash algorithm [2] [3]. This creates three strings of
letters and numbers that are unique to this file. Any change at all in the
contents of this file will result in different MD5 and SHA-2 signatures.

• We create the first version of our main document, containing the first two
sections of this document, a brief description of our theory and methods, the
MD5 and SHA-2 hashes of our first forecast and the date (1 May 2010) on which
we will make the first original .pdf document public.

• We upload this main ‘meta’ document to [http://arxiv.org](http://arxiv.org).
This makes public our experiment and the MD5 and SHA-2 hashes of our first
forecast. In addition, it generates an independent timestamp documenting the
date on which we made (or at least uploaded) our forecast. arxiv.org
automatically places the date of when the document was first placed on its
server as ‘v1’ (version 1). It is important for the integrity of the
experiment that this date is documented by a trusted third party.

• We continue our research until we find our next confident forecast. We again
put the forecast results in a .pdf document and generate the MD5 and SHA-2
hashes. We now update our master document with the date and digital
fingerprint of this new forecast and upload this latest version of the master
document to arxiv.org. The server will call this ‘v2’ (version 2) of the same
document while keeping ‘v1’ publicly available as a way to ensure integrity of
the experiment (i.e., to ensure that we do not modify the MD5 and SHA-2 hashes
in the original document). Again, ‘v2’ has a timestamp created by arxiv.org.

• Notice that each new version contains the previous MD5 and SHA-2 signatures,
so that in the end there will be a list of dates of publication and associated
MD5 and SHA-2 signatures.

• We continue this protocol until the future date (1 May 2010) at which time
we upload our final version of the master document. For this final version, we
include the URL of a web site where the .pdf documents of all of our past
forecasts can be downloaded and independently checked for consistent MD

0: [https://arxiv.org/abs/0911.0454](https://arxiv.org/abs/0911.0454)
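
The fingerprint-and-commit steps quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' actual tooling: the function names `fingerprints`, `commit` and `verify`, the list-of-dicts layout of the master document, and the dates in the usage lines are all hypothetical.

```python
import hashlib


def fingerprints(data: bytes) -> dict:
    """Return the three digests named in the quote: MD5, SHA-256, SHA-512."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
    }


def commit(master: list, date: str, forecast: bytes) -> list:
    """Return a new master-document version with one more entry appended.

    Earlier entries are carried over unchanged, so each version contains
    all previously published fingerprints (hypothetical layout).
    """
    return master + [{"date": date, "hashes": fingerprints(forecast)}]


def verify(forecast: bytes, entry: dict) -> bool:
    """Check a revealed forecast against a previously published entry."""
    return fingerprints(forecast) == entry["hashes"]


# Usage: publish only the hashes now, reveal the documents later.
v1 = commit([], "2009-11-02", b"forecast 1: contents of the first PDF")
v2 = commit(v1, "2009-12-01", b"forecast 2: contents of the second PDF")

print(verify(b"forecast 1: contents of the first PDF", v2[0]))
print(verify(b"forecast 1: edited after the fact", v2[0]))
```

In a real run the bytes would come from reading the PDF file in binary mode. MD5 is no longer collision resistant, so in a scheme like this the SHA-2 digests are the ones carrying the integrity guarantee.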

~~~
devereaux
Wonderful approach!

I would just change it to work with notebooks instead of PDF (to include the
data and the algorithm and make replication easier) - though the notebooks
could certainly generate PDFs.

That would bring the problem of reproducible builds into publishing: the PDF
would not be much help if its MD5 didn't match the published signature!
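
One way around the build problem would be to fingerprint the notebook source rather than the rendered PDF. A rough sketch, assuming the notebook is a Jupyter-style JSON file whose code cells carry `outputs` and `execution_count` fields; the function name `notebook_fingerprint` and the choice to strip those two fields are my own assumptions, not an established standard:

```python
import hashlib
import json


def notebook_fingerprint(nb_json: str) -> str:
    """SHA-256 of a canonical form of the notebook (hypothetical scheme).

    Outputs and execution counters change on every run, so they are
    blanked before hashing; keys are sorted so that key order in the
    file does not affect the digest.
    """
    nb = json.loads(nb_json)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    canonical = json.dumps(nb, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two saves of the same analysis: one never run, one run with output.
fresh = json.dumps({"cells": [{"cell_type": "code", "source": "x = 1",
                               "outputs": [], "execution_count": None}]})
rerun = json.dumps({"cells": [{"execution_count": 7, "source": "x = 1",
                               "outputs": [{"text": "1"}],
                               "cell_type": "code"}]})

print(notebook_fingerprint(fresh) == notebook_fingerprint(rerun))
```

This only pins down the code and text, not the results. Checking the results would still need the data and a pinned environment.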

The field seems ripe for disruption, with a new model for scientific research
and publication.

