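Under a uniform prior on [0, 1] the posterior mode is the empirical mean (the posterior mean is (#heads+1)/(total+2), which is close to it for large samples). How you sample is of no consequence. The likelihood/posterior f(p | #heads, #tails) is proportional to p^(#heads) (1-p)^(#tails) regardless of how you sample. Differentiate with respect to p, set the derivative to zero, and you get p* = #heads/total.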
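A quick numerical check of that claim, as a minimal sketch rather than anything taken from the thread. It compares two hypothetical sampling plans that can both end with 7 heads and 3 tails: a fixed number of flips, and flipping until the 3rd tail. The function names, the example counts and the grid search are assumptions made for illustration.

```python
from math import comb

def binomial_likelihood(p, heads, tails):
    # Plan A (assumed): the total number of flips is fixed in advance.
    n = heads + tails
    return comb(n, heads) * p**heads * (1 - p)**tails

def negative_binomial_likelihood(p, heads, tails):
    # Plan B (assumed): flip until the `tails`-th tail appears,
    # so the last flip is always a tail. Requires tails >= 1.
    n = heads + tails
    return comb(n - 1, heads) * p**heads * (1 - p)**tails

def argmax_on_grid(likelihood, heads, tails, steps=1000):
    # Crude maximiser: evaluate the likelihood on an evenly spaced grid in (0, 1).
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda p: likelihood(p, heads, tails))

heads, tails = 7, 3
p_binom = argmax_on_grid(binomial_likelihood, heads, tails)
p_negbin = argmax_on_grid(negative_binomial_likelihood, heads, tails)

# The ratio of the two likelihoods at two different values of p.
ratio_a = binomial_likelihood(0.2, heads, tails) / negative_binomial_likelihood(0.2, heads, tails)
ratio_b = binomial_likelihood(0.9, heads, tails) / negative_binomial_likelihood(0.9, heads, tails)

print(p_binom, p_negbin, ratio_a, ratio_b)
```

Both plans peak at the same p, namely heads/total, and the ratio of the two likelihoods does not change with p. That constant factor is the only trace the sampling plan leaves in the likelihood.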
It is rather amusing that most statistics professors are happy to have taught their students that sampling procedures matter while at the same time crushing the natural intuition that your decisions should be based on the data you observe, not on what might have happened in a world that doesn't exist.
If someone then takes your dataset and assumes it's a random sample -- e.g. just the same as the N=100 doctor trial -- he's wrong. It's not, it's something else, and that something else is less useful.
You say "how you sample is of no consequence". But suppose your sampling method selectively throws out some data that it doesn't like. That is of consequence, right? So sampling methods do matter. Now consider a method which implicitly throws out data because some sample collections are never completed. That matters too.
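To make the "never completed" case concrete, here is a small simulation sketch. The rule (keep flipping until heads lead tails), the coin bias of 0.4, the cap of 1000 flips standing in for "never halts", and all the names are assumptions chosen for illustration, not something from the thread or the book.

```python
import random

def run_until_heads_lead(p, max_flips, rng):
    """Flip a coin with P(heads) = p until heads outnumber tails.

    Returns (heads, tails) if the rule halts within max_flips,
    or None to represent a collection that is never completed.
    """
    heads = tails = 0
    for _ in range(max_flips):
        if rng.random() < p:
            heads += 1
        else:
            tails += 1
        if heads > tails:
            return heads, tails
    return None

rng = random.Random(0)   # fixed seed so the run is repeatable
true_p = 0.4             # assumed bias: tails are more likely than heads

results = [run_until_heads_lead(true_p, 1000, rng) for _ in range(2000)]
reported = [r for r in results if r is not None]
unreported = len(results) - len(reported)

# Heads fraction of each data set that actually gets reported.
reported_fractions = [h / (h + t) for h, t in reported]

print("reported:", len(reported), "never completed:", unreported)
print("smallest reported heads fraction:", min(reported_fractions))
```

Every collection that gets reported ends with heads exactly one ahead of tails, so its heads fraction is above 0.5 even though the coin favours tails. The runs that would have said otherwise are the ones that never finish.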
See my other comment, up a few times then down the other branch, the one with the pastebin code.
However the example with the two doctors was not the halting type.
Can you agree to that? Or do you have a defense of non-halting stopping rules, even though they are incapable of reporting some data sets?
I think I've figured this out, but I'd be interested in criticism on this point if I haven't. Is there some way of dealing with non-halting that makes it OK?
The book says if there's a stopping rule then inferences must depend only on the resulting sample but that assumes there is a resulting sample -- that the procedure halts.