There is a certain amount of overhead involved in creating the dataset: hiring people to call and collect answers, filtering the data, formatting it, fielding questions about it, etc. The University of Michigan seems to pay for this at least partially through the fees they charge Reuters. I am not sure whether they make a profit on it or, if so, how much. But you could probably found a company to collect and provide the same amount and quality of data more cheaply.
However, even if you were able to beat them on cost, you wouldn't be able to charge nearly as much as UofM does for it. The reason is that the UofM data goes back over three decades, so traders can do all sorts of backtesting with it. UofM is also a recognized "brand" in this field, and over time, sources of trading information like the UofM report become ingrained in traders' minds as "indicators" of one type or another. So, for various practical and psychological reasons, your hypothetical startup would have to charge a much lower price for the same data and spend many years earning the confidence of traders.
Ironically, you might be able to make more money by selling the data as an early predictor of the UofM report. Since traders know that the UofM report can be a market moving event, any early predictor of it (even one with ~90% accuracy) would be valuable. This would probably just force UofM to collect and release the data earlier and with less polish though, negating much of your advantage.
Your last point is kind of what I had in mind with my question: not to try to compete with UofM, but to gather data to predict their results and use it for your own trades. You could keep the results private and play the market to beat even the high-frequency guys. You could even do a shoddy job of collecting the data to keep costs down and still maintain, say, a 90% confidence in your prediction of UofM's numbers. That's enough to beat the market 9 out of 10 times.
You're not factoring in the number of times UofM 'gets it wrong'. Even given their influence, you wouldn't be winning 9 out of 10 times. But it could still be an edge.
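To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes a simple binary up/down world where your prediction of the UofM number and the market's reaction to that number are independent. The 90% figure is from the comment above; the 80% figure for how often the market actually moves the way the UofM number implies is purely made up for illustration, and `win_probability` is just a name I picked.

```python
def win_probability(p_predict: float, p_market_follows: float) -> float:
    """Chance that a trade based on your own prediction ends up on the right side.

    p_predict: probability you correctly call the direction of the UofM number.
    p_market_follows: probability the market moves the way the UofM number
        implies (hypothetical; this is the "UofM gets it wrong" factor).

    Assumes binary outcomes and independence between the two events.
    """
    # You win if you call UofM correctly and the market follows UofM ...
    both_right = p_predict * p_market_follows
    # ... or if you miscall UofM and the market also goes against UofM.
    both_wrong = (1 - p_predict) * (1 - p_market_follows)
    return both_right + both_wrong


if __name__ == "__main__":
    # 0.9 comes from the thread; 0.8 is an assumed, illustrative value.
    print(f"{win_probability(0.9, 0.8):.2f}")
```

With those assumed inputs the win rate comes out around 74%, not 90%. It only reaches 90% if the market follows the UofM number every single time.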
Yeah, the mind-bending part of the whole thing is that traders AREN'T trading on this information because it predicts the long-term (or even short-term) profitability of any company; they are trading on it because they know the number will affect the STOCK PRICE of the company... at least temporarily.
It doesn't matter how good your own data is if the data's release doesn't affect stock prices.