Some other indices of open data sets I've found:
+ For individual requests, come over to https://opendata.stackexchange.com/ and ask!
+ Wikidata has loads of structured data, but SPARQL is often a barrier. You can request help with a query, though: https://www.wikidata.org/wiki/Wikidata:Request_a_query
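To make the SPARQL barrier a bit more concrete, here is a minimal sketch of what a simple Wikidata query looks like from Python, using only the standard library. It assumes the public Wikidata Query Service endpoint at https://query.wikidata.org/sparql and uses the well-known "instance of" property (P31) with the house cat item (Q146) as a stand-in example. The function names and the User-Agent string are made up for illustration, so treat this as a starting point rather than a recommended client.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the public Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(class_qid, limit=5):
    """Build a SPARQL query listing items that are an instance of (P31) class_qid."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_url(query):
    """Encode the query as a GET request URL asking for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return ENDPOINT + "?" + params


def run_query(query):
    """Send the query and return the list of result bindings (needs network access)."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical User-Agent; replace with something that identifies your tool.
        headers={"User-Agent": "example-sparql-sketch/0.1"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return data["results"]["bindings"]


# Example usage (performs a live request, so it is left commented out):
# for row in run_query(build_query("Q146")):
#     print(row["item"]["value"], row["itemLabel"]["value"])
```

Swapping in a different Q-id or property is usually the hard part, which is where the request page above comes in.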
I've actually found FOIA requests and downloads from government websites to be the easiest and most effective way to get robust datasets.
And data is simple: it's parameters plus timestamps plus a lot of storage.
Realtime access is harder, but it's a well-specified problem.
The issue is inference. No one does inference extremely well except in limited circumstances. It's one of our greatest bottlenecks as humans, and our software is going to be limited by it as well, insofar as our understanding of what to build is controlled by what kind of inference we want.
Very few companies have complete datasets of buildings in all cities throughout the country. The leader is CoStar, which has the most complete and accurate data. What I have noticed with other CRE data companies is that they focus on leasing activity or one other part of the CRE process. CoStar became the leader by valuing its data above all else; if other companies want to compete, they need to do the same.
Additional link to his data site: http://people.stern.nyu.edu/adamodar/New_Home_Page/data.html
Unfortunately, almost every result for the word 'philosophy' is borderline garbage imho. Keyword indexing of datasets may need improving?
Overall a bit underwhelming.