I was looking for these types of sources this week to populate a document database. One I ended up using for a demonstration was the "startup company information" hosted at http://jsonstudio.com/resources/ (apparently an extract from CrunchBase, mentioned in Jen's blog post).
I naively thought I could just grab a pile of tweets or something, but most public APIs require registration as a developer.
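One quick tip: if you're dealing with JSON dumps that are a series of objects (e.g., {} {} {}) and you want to wrap them in an array (e.g., [{}, {}, {}]), you can "slurp" them with jq (https://stedolan.github.io/jq/) using its --slurp (-s) option, for example: jq -s . objects.json > array.json (the file names here are just placeholders).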
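If jq isn't at hand, the same wrapping can be sketched in Python using only the standard library. This is a rough equivalent of what jq's slurp option does, not a drop-in replacement; the function name slurp_json and the sample string are made up for illustration, and it assumes the whole dump fits in memory.

```python
import json


def slurp_json(text):
    """Parse back-to-back JSON values in `text` and return them as one list.

    Hypothetical helper, roughly mimicking `jq --slurp .`.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    length = len(text)
    while True:
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


if __name__ == "__main__":
    sample = '{"a": 1} {"b": 2}\n{"c": 3}'  # illustrative input only
    print(json.dumps(slurp_json(sample)))
```

To use it on a real dump, read the file into a string, pass it to slurp_json, and write the result back out with json.dump.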
Yes. I used the GDELT data set in a Geo-intelligence hackathon and it's very powerful. Just keep in mind that if you use Google BigQuery (actually the easiest way to use the data set), it will cost you money.