NYC Space/Time Directory (nypl.org)
163 points by jcolman on Jan 30, 2015 | 9 comments

So for anyone interested in this, I did my PhD thesis in computer vision on automatically reconstructing 3D models of cities as they change over time purely from historical and modern photos -- you can see my results for lower Manhattan (1928-2010) in these slides from CVPR 2010: http://www.cc.gatech.edu/~phlosoft/files/schindler10cvpr_sli...

We were also able to estimate the date of historical photos fairly accurately by first incorporating them into a 3D reconstruction and then reasoning about the visibility of structures in the scene.
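To make the visibility reasoning concrete, here is a minimal sketch of the idea, not the method from the thesis. It assumes each reconstructed structure already has a known existence interval in years, and that for a given photo we know which structures were matched (`observed`) and which should have been in view but were not found (`missing`). The function name, parameters, and the hard yes/no constraints are all hypothetical simplifications; a real system would have to handle occlusion and uncertain intervals.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # (first_year, last_year), inclusive; assumed known


def estimate_date_range(
    observed: Iterable[Interval],
    missing: Iterable[Interval] = (),
    year_min: int = 1800,
    year_max: int = 2015,
) -> List[int]:
    """Return every candidate year consistent with what the photo shows."""
    observed = list(observed)
    missing = list(missing)
    candidates = []
    for year in range(year_min, year_max + 1):
        # Every structure matched in the photo must exist in that year.
        if not all(start <= year <= end for start, end in observed):
            continue
        # A structure that should be in view but is absent must not exist yet
        # (or must already be gone) in that year.
        if any(start <= year <= end for start, end in missing):
            continue
        candidates.append(year)
    return candidates
```

The more structures a photo constrains, the narrower the returned range of years becomes.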

Some videos of the time-varying 3D point cloud here: http://www.cc.gatech.edu/~phlosoft/

That page also has an interactive demo of a unified 3D model of Atlanta from 1864 to the 2000s based purely on image-based reconstruction. Here's a photo of Atlanta from the Civil War (1864) with a modern skyline rendered into it by projecting the reconstructed building geometry and textures from modern photos into the recovered camera position of the Civil War-era photo: http://4d-cities.cc.gatech.edu/atlanta/img/atlanta1864_2008....

Grant, I'm from the team at NYPL working on the Space/Time Directory effort. This is absolutely amazing. We'd love to have you (and everyone here) help out!

We've been so surprised by the excitement of the response that we've had to bump up our timeline to catch up with the energy of everyone who's gotten in touch.

There are a few big things we're focused on right now:

* Improving performance of CV recognition of historical buildings [https://github.com/nypl/map-vectorizer] from insurance atlases.

* Parsing data from historical records. We love old documents that are basically databases in print form. Old playbills, city directories, tax records; lots of tabular data with locations, people, businesses - the names of history. Getting those names out and tied to locations is what produces the searchable elements for places.

* Rebuilding the underlying gazetteer. We worked with the team from Topomancy to prototype the underlying system a few years ago [https://github.com/topomancy/gazetteer]

There's a lot more, but that's where we're starting. If you're interested, drop us a line at spacetime@nypl.org or sign up for our (comically nascent) forum at http://talk.spacetime.nypl.org where we're trying to coordinate most of the effort.

Really interesting! Could you further explain how you grouped points into buildings (as shown on slide 10)? I would imagine that different sections of the city have different distance thresholds, for instance. Were they determined by hand or automatically?

The distance threshold was determined by hand on a per-scene basis, and then we find connected components in the graph of neighboring 3D points, so it's a bit of a hack, but there are two important things to note. First, we can get away with an overly permissive threshold because we also require that any two grouped points are observed together in at least N images -- meaning SIFT features were detected for both 3D points in the same image (so they exist at the same point in time). This filters out lots of spurious groupings. Second, this approach was inspired by super-pixels, which are an OVER-segmentation of an image into groups of pixels that are somewhat coherent -- each super-pixel probably lies on a single semantic object in the world, but the segments are by no means complete. Still, it's massively better than reasoning about individual pixels (or individual points).

So we err on the side of dividing single buildings up into multiple semantic objects. If our data included detailed reconstructions of the streets between each building, then the whole thing might be connected and we'd need more criteria to separate them out -- we do automatically estimate a ground plane so that's one way: just ignore everything near the ground for grouping purposes.

There's slightly more detail in the paper: http://www.cc.gatech.edu/~phlosoft/files/schindler10cvpr.pdf
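The grouping described above can be sketched roughly as follows. This is an illustrative reading of the comment, not the code from the paper: every name here (`group_points`, `seen_in`, `max_dist`, `min_shared_images`, `ground_z`, `ground_margin`) is hypothetical, the ground plane is assumed to be horizontal at height `ground_z`, and the brute-force pairwise loop stands in for whatever neighbor search a real pipeline would use.

```python
from itertools import combinations
from math import dist
from typing import List, Optional, Sequence, Set, Tuple

Point = Tuple[float, float, float]


def group_points(
    points: Sequence[Point],
    seen_in: Sequence[Set[int]],       # image ids in which each point was detected
    max_dist: float,                   # hand-tuned, per-scene distance threshold
    min_shared_images: int,            # the "N" co-visibility requirement
    ground_z: Optional[float] = None,  # assumed horizontal ground plane height
    ground_margin: float = 0.0,
) -> List[List[int]]:
    """Group point indices into connected components (an over-segmentation)."""
    parent = list(range(len(points)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def near_ground(p: Point) -> bool:
        return ground_z is not None and p[2] <= ground_z + ground_margin

    # Ignore everything near the ground so streets do not bridge buildings.
    active = [i for i, p in enumerate(points) if not near_ground(p)]

    for i, j in combinations(active, 2):  # O(n^2); fine for a sketch
        if dist(points[i], points[j]) > max_dist:
            continue
        # Both points must be detected together in at least N images.
        if len(seen_in[i] & seen_in[j]) < min_shared_images:
            continue
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    groups = {}
    for i in active:
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because the co-visibility test rejects pairs that never appear in the same images, `max_dist` can be set generously without merging structures from different eras.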

I find this fascinating, and I think it's a great reminder of what public institutions are capable of, especially since they are oftentimes the sole stewards of this sort of information.

Aside: I am really interested in contributing to these sorts of efforts, except, ideally, at a level higher than a mechanical turk. [0] Might anyone be able to suggest self-directed learning resources (Coursera courses, ebooks, websites, open source projects, etc.) in the realm of geographically-oriented web development or the latest in geographic/GIS data wrangling? Any suggestions for languages to learn or frameworks to become familiar with? Thanks.

[0] http://buildinginspector.nypl.org/

One thing to keep in mind is that they recently won a Knight Foundation prototype fund grant. These public institutions are only capable of these amazing projects when they are funded.

Check out the recently-released Turf.js, http://turfjs.org/

Really well done by the smart folks at Mapbox.

Many (many) years ago I was briefly involved in this: http://sydney.edu.au/arts/timemap/

It was a larger-scale, lower-resolution version of this. The PoC demo showed the borders of the Mongol empire over time. It was, as far as I can tell, a few years ahead of its time in 1997(ish).

It was also the first time I got to use a proper SGI workstation. Damn, those things could make every other computer you used feel like a jalopy.

EDIT: This has since been bundled into the "Heurist" project: http://heuristnetwork.org/

I've always wondered whether you could visualize, delineate, and track ethnic neighborhoods in NYC from the ground up, without human input. All of the historic naturalization documents and ship manifests have been digitized and OCR'd now and include country of origin and nationality. By tracking the addresses from naturalization and census forms, a computer should be able to build a really accurate map of how neighborhoods changed and shifted from the 19th century through the early 20th, up until the data becomes private.

For example, everyone in NYC knows that "Little Italy" is really a shadow of its former self and is mostly "Chinatown" now. If all the data were processed, you could visualize changes like that on a map over time and increase accuracy by adding more data sets (e.g. business directories, also in City archives).
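As a rough sketch of that aggregation step, assuming the records have already been OCR'd and geocoded into `(year, area, origin)` tuples (a hypothetical format, as is the function name), one could count origins per area per decade and report the dominant one with its share:

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[int, str, str]  # (year, area such as a block or street, origin)


def dominant_origin_by_decade(
    records: Iterable[Record],
) -> Dict[Tuple[str, int], Tuple[str, float]]:
    """Map (area, decade) to (most common origin, its share of records)."""
    counts = defaultdict(Counter)
    for year, area, origin in records:
        decade = (year // 10) * 10
        counts[(area, decade)][origin] += 1

    result = {}
    for key, counter in counts.items():
        origin, n = counter.most_common(1)[0]
        result[key] = (origin, n / sum(counter.values()))
    return result
```

Plotting the dominant origin for each area decade by decade would show shifts like the one described above; the hard parts (OCR cleanup, address matching, defining the areas) are outside this sketch.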
