I'm the person who convinced them to use GitHub. They were looking for an extremely easy way to bring changes from the community back into the city's data, and I suggested GitHub and GeoJSON because I envisioned them taking pull requests from citizens interested in adding more detail to their data or correcting existing data.

You're right, though: GitHub is horrible for large blobs of data like this. At the time I didn't know how big the data releases would end up being.

Tom and I have plans to talk more about future data releases and how they might be made with a more appropriate tool.

I have a hard time imagining a more appropriate tool.

The best tool would offer

  * Discoverability
  * Updatability
  * Transparency (who, specifically, is behind it)
  * Traceability
  * Time stamping
  * Linkability, with über-stable URIs
  * Public issue tracking
  * Documentation, including the possible crowd-sourcing thereof
How is this not GitHub?

Edit: In this case we have identifiable and passionate individuals behind the initiative. This is far from faceless and cursory, as most data-dumps are. What's not to love here?

The only issue is that with huge data dumps (the buildings dataset here is ~2 GB uncompressed as GeoJSON and ~1 GB as a shapefile) it becomes difficult to make direct pull requests against the data. Indeed, they zipped the JSON file before uploading it, so pull requests are impossible. I originally suggested GeoJSON because a pull request against it can be read by a human, whereas a shapefile diff cannot.
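To illustrate the point about human-readable diffs, here is a minimal sketch, not the city's actual workflow. It assumes the data is a GeoJSON FeatureCollection and writes one feature per line, so that a correction shows up as a single changed line. The function names and the two sample buildings are hypothetical.

```python
import copy
import difflib
import json


def to_feature_lines(collection):
    """Serialize each feature as one JSON line with sorted keys (stable output)."""
    return [json.dumps(feature, sort_keys=True) for feature in collection["features"]]


def feature_diff(old, new):
    """Return a unified diff between two FeatureCollections, one feature per line."""
    return list(difflib.unified_diff(
        to_feature_lines(old), to_feature_lines(new),
        fromfile="before", tofile="after", lineterm=""))


# Hypothetical sample data: two buildings, one with a misspelled name.
before = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"id": 1, "name": "City Hall"},
         "geometry": {"type": "Point", "coordinates": [-87.63, 41.88]}},
        {"type": "Feature",
         "properties": {"id": 2, "name": "Public Libary"},
         "geometry": {"type": "Point", "coordinates": [-87.62, 41.87]}},
    ],
}

# A citizen's correction: fix the spelling of the second building's name.
after = copy.deepcopy(before)
after["features"][1]["properties"]["name"] = "Public Library"

diff = feature_diff(before, after)
# Keep only the added/removed lines, skipping the "---"/"+++" file headers.
changed = [line for line in diff
           if line[:1] in ("+", "-") and not line.startswith(("+++", "---"))]

for line in diff:
    print(line)
```

With one feature per line, the correction touches exactly one line of text, which is the kind of change a reviewer can read in a pull request. A zipped file or a binary shapefile gives the reviewer nothing comparable to look at.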

At AmigoCloud, we are building a platform that can also be used to disseminate and crowdsource geodata. We have delta exports and a dashboard to review edits. Users can edit the geodata with our Android/iOS apps or our APIs (GeoJSON). The system can sync the deltas back in 50+ different formats, including ArcGIS Server. Full disclosure: I am the founder of AmigoCloud http://www.amigocloud.com

Your website doesn't work:

Error 310 (net::ERR_TOO_MANY_REDIRECTS): There were too many redirects.

Thanks for letting me know, yellowbkpk. I was switching servers and you probably accessed it during the 10-minute downtime we had scheduled.

I think it is excellent, and the haters should fork the data and put it where they want it.

Can you keep us in the loop with how this progresses so we can cover it?

There are far better solutions, including things like http://ckan.org. It sounds like very little research was done before this decision was reached.

Thanks for your constructive feedback.

When I was searching for suggestions I didn't see any way for people to submit changes to the released data back to the organization releasing the data. Getting community feedback is what they cared about the most so I suggested GitHub.

