However, because the way they measured it pushed the limits of engineering, if GP-B had NOT agreed with GR there is a good chance the results would have been dismissed as most likely due to equipment flaws.
While it is in general a good idea to confirm measurements, especially using different techniques, in a case like this where the confirmation will be much less precise than the other experiments and will likely be rejected if it fails to confirm, you have to wonder why this was funded over other projects.
The answer to that turns out to be simple: politics. When space scientists ranked all the proposed missions under consideration, GP-B came in dead last. However, its proponents went to Congress, and got Congress to override the normal process for prioritizing missions, forcing NASA to move it to the front, ahead of more scientifically worthy missions.
There are a lot of very worthwhile scientific missions that we can't fly due to budget limitations. It's a shame to see $750 million of the limited budget go to a mission so far down on the importance list.
I'd also be interested in sources, and I'm genuinely curious which missions would be more important.
Here is the stack rank of operating missions as of 2008 (when NASA needed to shut down some missions to save money):
1. Swift, 2004
2. Chandra, 1999
3. Galex (Galaxy Evolution Explorer, ultraviolet), 2003
4. Suzaku (X-ray), 2005
5. Spitzer, 2003
6. WMAP (Wilkinson Microwave Anisotropy Probe), 2001
7. XMM-Newton (X-ray Multi-Mirror Mission), 1999
8. Integral (INTErnational Gamma-Ray Astrophysics Laboratory), 2002
9. Rossi X-ray Timing Explorer, 1995
10. Gravity Probe B, 2004
I also found this read (http://www.skyandtelescope.com/news/121390204.html) very interesting.
So, sure - the proof may not be epic -- but the prediction certainly is, all the more so because he turned out to be correct (again).
He developed mathematical models for the observations available at the time. Those models in turn suggested other phenomena that could not be measured back then. As time went on, some of these have been measured and found to hold true, while other observations about the universe suggest they don't hold true in all cases (e.g. "dark" stuff).
If not, how do they know that the deviation was in fact caused by twisting space time?
The GP-B experiment measured this effect and found agreement with the GR prediction. This does not prove that GR is correct; rather, it is a piece of evidence that implies GR is more likely to be a correct description of gravity than what we previously believed. Because the prediction was quantitative, it is unlikely that the result is caused by something else, which makes the evidence in favor of GR that much stronger.
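To make the "quantitative prediction" point concrete, here is a rough sketch that asks how many standard errors each theory's prediction sits from the measured value. The drift rates are the ones I recall from the abstract of the GP-B final results paper (arXiv:1105.3456), so check them against the paper before relying on them. Treating the Newtonian prediction as zero drift for both effects and treating the quoted errors as simple Gaussian one-sigma errors are my own simplifying assumptions, and the function names are made up for illustration.

```python
# Drift rates in milliarcseconds per year: (measured value, one-sigma error).
# Figures as recalled from the abstract of arXiv:1105.3456; verify before use.
MEASURED = {
    "geodetic": (-6601.8, 18.3),
    "frame_dragging": (-37.2, 7.2),
}
GR_PREDICTION = {"geodetic": -6606.1, "frame_dragging": -39.2}
# Assumption: Newtonian gravity predicts no gyroscope drift for either effect.
NEWTON_PREDICTION = {"geodetic": 0.0, "frame_dragging": 0.0}


def sigmas_off(prediction, measured):
    """Distance between prediction and measurement, in units of the error."""
    value, error = measured
    return abs(value - prediction) / error


def compare(predictions):
    """Map each effect name to how many sigma the prediction is off by."""
    return {name: sigmas_off(predictions[name], MEASURED[name]) for name in MEASURED}


gr = compare(GR_PREDICTION)
newton = compare(NEWTON_PREDICTION)
for name in MEASURED:
    print(f"{name}: GR off by {gr[name]:.2f} sigma, "
          f"zero-drift off by {newton[name]:.1f} sigma")
```

With these inputs the GR predictions land well inside one sigma of the measurements, while the zero-drift prediction is more than five sigma away even for the smaller frame-dragging effect. A coincidental match to a specific number is much harder to explain away than a yes/no result.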
Now, control groups are often used in life sciences fields. For example, when you test a drug, you have a control group that takes a placebo. It's not my field, but as far as I understand, this is done for two reasons. First, there is no quantitative prediction regarding how effective the drug should be, because drugs are not understood so precisely. So the prediction you're testing is much weaker; it's just a boolean. Second, there is a known effect -- the placebo effect -- that can affect results. In other words your null hypothesis is that there may be some effect. These things mean that, without a control group, the evidence in favor of a drug's effectiveness is not very strong.
That is not to say that we believed GR was wrong, but we can never be 100% sure, and every piece of positive evidence strengthens the case.
If you accidentally did an experiment where GR and Newton predict the same thing, the control would kick in and tell you that you hadn't proved anything.
> Second, there is a known effect -- the placebo effect --
> that can affect results. In other words your null
> hypothesis is that there may be some effect. These
> things mean that, without a control group, the evidence
> in favor of a drug's effectiveness is not very strong.
In order to keep confounds in check, scientists attempt to keep everything either equivalent (by matching samples as precisely as scientifically feasible) or randomly distributed (by using, for instance, Latin squares and other randomization techniques).
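For anyone who hasn't met Latin squares: the idea is that each treatment appears exactly once in every row and every column, so two nuisance factors (say, subject and time slot) are balanced across treatments. Below is a minimal sketch of one common way to build a randomized one, by shuffling the rows, columns, and labels of a cyclic square. The function names are hypothetical, and this does not sample uniformly from all possible Latin squares; it only guarantees the result is a valid one.

```python
import random


def random_latin_square(treatments, seed=None):
    """Return an n x n Latin square over the given distinct treatments."""
    rng = random.Random(seed)
    n = len(treatments)
    # Start from the cyclic square: row i is 0..n-1 rotated left by i.
    square = [[(i + j) % n for j in range(n)] for i in range(n)]
    # Permuting rows, columns, and symbols each preserves the Latin property.
    rng.shuffle(square)
    columns = list(range(n))
    rng.shuffle(columns)
    labels = list(treatments)
    rng.shuffle(labels)
    return [[labels[row[c]] for c in columns] for row in square]


def is_latin(square):
    """Check that every row and every column contains each symbol once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({row[c] for row in square} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


if __name__ == "__main__":
    for row in random_latin_square(["A", "B", "C", "D"], seed=1):
        print(" ".join(row))
```

Each printed row could be read as the order in which one subject receives treatments A through D, so that no treatment is systematically first or last.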
They perform the experiment and obtain result B. This disproves Newton's theory immediately, and confirms Einstein's theory. We don't know that it is right, only that it has successfully made a prediction about something that hadn't been tested at the time the theory was developed.
Testing a place in space-time that should not be twisted would provide another element of confirmation; however, the strongest and most interesting tests are those where the two theories differ in their predictions, not where they agree.
The experiment set out to experimentally test the accuracy of Einstein's predictions. The results of the experiment showed that under these specific circumstances, the predictions line up extremely well with observations.
There's no "control group" for this kind of test. Of course, they could potentially run multiple experiments to verify Einstein's predictions under different circumstances. That might provide useful information, but the lack of such information in no way discredits the results of this particular experiment.
If you want to read more about the experimental protocol, check out the final results paper: http://arxiv.org/pdf/1105.3456v1
That would put a different spin on the notion of Einstein "being right" as I think a lot of folks subconsciously equate GR with the "strangeness" of the Minkowski metric and non-Euclidean space-time manifold.
I will never imagine general relativity the same way again.