
Finding the perfect house using open data - phiggy
http://dealloc.me/2014/05/24/opendata-house-hunting/?utm_content=buffere775a&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
======
gdudeman
This is awesome. I love analysis like this.

Funny enough, something very similar to this was the impetus for me starting
Estately. Early on, we had a bunch of wild spatial queries that would let you
even filter by distance to specific transit line stops:
[http://blog.estately.com/wp-
content/uploads/2008/06/near_tra...](http://blog.estately.com/wp-
content/uploads/2008/06/near_transit.gif)
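
A distance-to-stop filter of this kind can be sketched in a few lines. This is only an illustration, not Estately's implementation: the function names, the dictionary layout of a listing, and the sample coordinates are all made up, and it uses plain haversine distance rather than a spatial database.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def near_any_stop(listings, stops, max_m):
    """Keep listings that lie within max_m metres of at least one stop."""
    return [
        home for home in listings
        if any(haversine_m(home["lat"], home["lon"], s_lat, s_lon) <= max_m
               for s_lat, s_lon in stops)
    ]


# Made-up sample data, for illustration only.
stops = [(45.5200, -122.6800)]
listings = [
    {"id": "a", "lat": 45.5210, "lon": -122.6810},  # roughly 140 m from the stop
    {"id": "b", "lat": 45.6000, "lon": -122.6800},  # roughly 9 km from the stop
]
nearby = near_any_stop(listings, stops, max_m=800)
```

With many listings and stops, a spatial index would be the more usual choice; the brute-force loop is just the easiest thing to read.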

When we ran stats on what searches people were actually doing, we found that
very few consumers actually used these kinds of filters. We found we had a
small and passionate audience for specific geo features and a much larger
audience for "great site to search for and learn about homes." It took me way
too long to come to terms with this. We still have walkscore search and
distance from a neighborhood, school or address search even though they are
rarely used.

Why wasn't it popular? My guess is it's 80% because the market was small and
20% because we didn't present them right.

The thing about real estate - especially in the current market - is there just
are few enough homes for sale that meet your price / proximity to work /
school / bedrooms criteria that most people want to just glance at all of them
and rule them in or out. Additionally, most people start with a bunch of
"rules," then they bend one, change their mind on another and finally
compromise on the last one.

The typical home search starts so clean, but dissolves over time as you tour
houses, learn more about what you do and don't like and find something that
works.

~~~
Dwolb
Did you ever try to give people the option to filter by school district or
school district ranking?

~~~
gdudeman
We let people search by specific school and we are considering adding school
district ranking soon. We've had a bunch of users requesting it.

~~~
Dwolb
Seems like one of the big factors that affects property values. Maybe since
it's so important some people would rather search by top school within a
radius or boundary than know the district ranking associated with the house.

------
cjoh
The biggest users of the open data movement are, in fact, real estate
professionals. It's companies like Trulia and Estately and Zillow and what not
that are actually taking open data from government and putting it to good use.

It'd be interesting to see some kind of startup come out of this. Real estate
is still a fairly hands-on market, though, and with real estate lobbying being
what it is, it's likely that agents aren't going anywhere anytime soon.
Building tools that allow them to make better recommendations to their clients
or to find better buyers might be something really interesting.

~~~
kaylarose
I don't know if Redfin uses the same feeds, but their update times seem vastly
superior to Trulia, Zillow & Estately. After spending 1 1/2 years trying to
find a house in a turnkey-scarce market, and using RedFin (after it launched
in our market a few months ago), Estately, Zillow, and Trulia in parallel -
this was the killer feature of Redfin for me.

For instance, I knew an offer we had made on a house was going to fall through
when I got an alert from Redfin half an hour after sending the offer, that the
house was pending. Similarly - I found, toured, offered, and signed on a house
(and was alerted to the "new listing", "pending", and "sold" status by Redfin)
before Zillow even had it registered as "For Sale".

So I am genuinely curious how their feeds differ from the rest. I originally
assumed they all just scraped the MLS, but maybe this isn't the case...?

~~~
zippergz
Redfin is an actual real estate brokerage, and is therefore a real member of
the MLS, with access to the data that provides. The others are simply scraping
publicly available data. This is a major difference.

~~~
gdudeman
Close, except Estately gets access to MLS data too. (Founder & CEO here)

~~~
mrfusion
Do you guys have an API? I have a couple cool ideas for apps but don't know
where to get data.

~~~
gdudeman
Hi mrfusion, we unfortunately have strict licensing restrictions on our data.
Sorry!

------
akgerber
The "moneyball" move is buying land right before the mass transit extension is
completed and the neighborhood develops:
[http://en.wikipedia.org/wiki/MAX_Orange_Line](http://en.wikipedia.org/wiki/MAX_Orange_Line)

But this is a really fun article on how to use open-source GIS tools to do
something practical, which is something I've wanted to delve into for a while—
I'll try it later tonight, thanks!

~~~
jweir
Yeah, Governor Christie's brother likes that strategy.

"Governor Christie's brother invested in real estate near new PATH station in
Harrison"

[http://www.northjersey.com/news/governor-christie-s-
brother-...](http://www.northjersey.com/news/governor-christie-s-brother-
invested-in-real-estate-near-new-path-station-in-harrison-1.667498)

~~~
akgerber
It's only problematic when you're doing it before the extension is public.

Likewise, property on Second Avenue on the Upper East Side of Manhattan has
been cheap lately, because blasting out subway stations is unpleasant. But a
new subway coming to an area that was recently relatively far from one means
property values are shooting up.

------
Jemaclus
I work in online real estate, and this is exactly the kind of thing we love to
do when it comes to utilizing data. How can we take X, Y and Z data points and
generate something useful? Taking into account crime reports, graffiti
reports, available parking spaces, etc, can all help you decide whether this
place is good or not. The requirements here are kinda strange (proximity to a
grocery store? not crime rates or anything like that?), but to each his own, I
suppose. I'm really impressed with the resourcefulness of the author, though.
I think I would probably just go to Zillow or Trulia and see what they say...

~~~
enjo
I'm in the midst of a pretty similar house-hunt to this guy here in Denver. I
currently live Downtown and love it. I'm not willing to move outside of the
city, so we're focusing on a proper house in one of the city-center
neighborhoods.

Proximity to a grocery store is absolutely at the top of my list. I like to
buy my food as fresh as possible and generally only buy for the meal I'm
cooking that night. So having not only a grocery store, but a high-quality
one, near me is incredibly important. The difference of even a block or two is
huge when it comes to something you want to walk to on a very regular basis.

Sure I pay attention to things like the crime rate, but in most cities it's a
nearly meaningless stat. The actual chances of being involved in a crime are
very low in any neighborhood in my city (and this is true of _almost_ every
city in the U.S.). I think this is true of people predisposed to urban living.
We're simply willing to tolerate a bit of crime risk in exchange for
everything else that goes along with it.

From a technology point of view, I imagine it makes online real-estate very
difficult to optimize for. There are so many different priorities that people
have that you have to find ways to build general data-mining tools that are
still actually usable by most folks. That's a tough challenge, one I don't
think anyone has come close to cracking. For instance walk scores seem to be
the closest to what I want and they tell you virtually nothing. Don't get me
started on how we "rank" schools too:)

~~~
maxerickson
I bet walkable fresh correlates pretty well with lower crime (especially
within an urban context, but I would expect it to work even when you ignore
development patterns/density).

------
thrownaway2424
Might be worth combining with walkscore data?
[http://www.walkscore.com/professional/research.php](http://www.walkscore.com/professional/research.php)

Not sure how freely they distribute this.

~~~
oniTony
walkscore can already map out all "within walking distance of grocery + rail"
locations in Portland (and elsewhere). E.g.
[http://www.walkscore.com/apartments/search/Portland-
OR?zoom=...](http://www.walkscore.com/apartments/search/Portland-
OR?zoom=12&sort=14_low&hood=off&nearby=%7B%7D&places=%5B%5B%22Groceries%22%2C%2210%22%5D%5D&transit=%5B30%2C%2219.81.83-85.99%22%2C45.537169%2C-122.65015900000003%5D&lat=45.535517621410435&lng=-122.64287605285642)

The actual scores would have to be looked up one at a time, on per-address
basis (e.g. [http://www.walkscore.com/score/320-pioneer-way-mountain-
view...](http://www.walkscore.com/score/320-pioneer-way-mountain-view-
ca-94041) )

------
aembleton
Wow; this is great to see. I have recently done something similar in the
Greater Manchester, UK area. I've been looking for a house and wrote some code
to scrape rightmove.co.uk and zoopla.com to get the house data. Checked their
locations against data from [http://data.police.uk/](http://data.police.uk/).
Checked time to walk to the nearest metro or railway station and time to cycle
to work. Metro and Railway station locations were scraped from wikipedia and
time to travel was taken from Google Maps.

From all this, plus the price in £ per m^2 of floor space, I ordered the
houses by least compromise.
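
What "least compromise" means exactly isn't spelled out here. One possible reading, sketched below under that assumption, is to scale every criterion to a 0..1 range (0 being the best value seen) and sort by the sum. The field names and the sample numbers are invented for the example.

```python
def normalise(values):
    """Scale values to 0..1, where 0 is the best (lowest) value seen."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rank_by_compromise(homes, criteria):
    """Sort homes by the sum of their normalised criteria (lower is better)."""
    columns = {c: normalise([h[c] for h in homes]) for c in criteria}
    scored = [
        (sum(columns[c][i] for c in criteria), h)
        for i, h in enumerate(homes)
    ]
    return [h for _, h in sorted(scored, key=lambda pair: pair[0])]


# Invented sample data; every criterion is "lower is better".
criteria = ["crimes", "walk_min", "cycle_min", "gbp_per_m2"]
homes = [
    {"name": "A", "crimes": 10, "walk_min": 5, "cycle_min": 20, "gbp_per_m2": 2000},
    {"name": "B", "crimes": 30, "walk_min": 15, "cycle_min": 40, "gbp_per_m2": 1500},
    {"name": "C", "crimes": 20, "walk_min": 10, "cycle_min": 25, "gbp_per_m2": 3000},
]
ranked = rank_by_compromise(homes, criteria)
```

Equal weights are an arbitrary choice; multiplying each normalised column by a weight before summing would let one criterion count for more than another.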

Currently trying to clean it all up and turn it into a webapp to make it
useful for anyone else who wants to find property in an area.

I'm about to complete on the purchase of my first house and am really looking
forward to it. It's not perfect; but I know for the price, I can't get much
better.

If I finish the webapp, I might try to include data such as walking distance
to the nearest pub from CAMRA's Good Beer Guide and maybe something similar for
restaurants.

~~~
freyfogle
You may wish to check out Nestoria's API of property listings and house price
trends
[http://www.nestoria.co.uk/help/api](http://www.nestoria.co.uk/help/api)

------
joshdotsmith
I didn't do anything this intense, but back in 2008 I took some data I
gathered from city-data and mapped neighborhood ratings for San Diego. After
living here for 6 years they're still pretty accurate.
[https://www.google.com/maps/@32.7740233,-117.1799609,12z/dat...](https://www.google.com/maps/@32.7740233,-117.1799609,12z/data=!4m2!6m1!4s215611903328390102803.0004515da0f446355193e)

I'd just take an address from Craigslist and paste it in to quickly see
whether I wanted to live there.

Today I live in one of the "bit dumpy and rough, but OK" neighborhoods.

~~~
petersellers
Thanks, this is really interesting. I love how Kensington turns from
"Affluent" to "Gang-influenced" as soon as you cross El Cajon Blvd. That might
be a slight exaggeration, but I've definitely noticed a few times walking
through there how drastically the feel of the neighborhood changes after just
a block or two.

------
chrisamiller
I was worried about commute times from three different points when looking for
a house, so I wrote a quick and dirty little ruby script to hit the Google
Driving Directions API. It tiles across the area you specify and spits out a
kml file that can be loaded into google earth and lets you see which areas fit
your parameters ([https://github.com/chrisamiller/commute-
times](https://github.com/chrisamiller/commute-times)).

There's lots of room for improvement in the code, but it's in the same vein as
what this guy did and helped us quite a bit.
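
The tile-then-export idea can be sketched roughly as follows. This is not the code from the linked repository (which is Ruby), just a Python illustration of the approach: `fake_commute_minutes` is a hypothetical stand-in for a real directions API call, and the bounding box is made up.

```python
from xml.sax.saxutils import escape


def grid_points(south, west, north, east, rows, cols):
    """Yield (lat, lon) centres of a rows x cols grid over a bounding box."""
    lat_step = (north - south) / rows
    lon_step = (east - west) / cols
    for r in range(rows):
        for c in range(cols):
            yield (south + (r + 0.5) * lat_step, west + (c + 0.5) * lon_step)


def to_kml(points_with_minutes, max_minutes):
    """Build a KML document with one placemark per tile within the limit."""
    placemarks = []
    for (lat, lon), minutes in points_with_minutes:
        if minutes <= max_minutes:
            placemarks.append(
                "<Placemark><name>%s</name>"
                "<Point><coordinates>%.6f,%.6f</coordinates></Point></Placemark>"
                % (escape("%d min" % minutes), lon, lat)  # KML order is lon,lat
            )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(placemarks)
        + "</Document></kml>"
    )


def fake_commute_minutes(lat, lon):
    """Hypothetical stand-in for a directions API call; not real travel times."""
    return round(abs(lat - 45.52) * 600 + abs(lon + 122.68) * 600)


tiles = list(grid_points(45.40, -122.80, 45.60, -122.60, rows=4, cols=4))
kml = to_kml([(p, fake_commute_minutes(*p)) for p in tiles], max_minutes=60)
```

Writing `kml` to a `.kml` file gives something Google Earth can open. With several workplaces, one option is to take the worst commute per tile before filtering.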

------
fryguy
When I was looking for a place to live, I made my own program to scrape the
MLS, and did similar things to this. My rectangles were hand drawn, not
scraped from open data though, and were more based on the neighborhoods that
had houses I knew I didn't like.

------
ChuckFrank
Can you add a legend to your final map? I'm not sure what the different colors
represent.

~~~
Caged
I'll try to get one on there. In the meantime, the colors represent rankings
(scored by feature intersections). You can see the ranking selectors here
(pink, blue, green, yellow-green):
[https://cloud.githubusercontent.com/assets/25/3075424/5497c2...](https://cloud.githubusercontent.com/assets/25/3075424/5497c28a-e376-11e3-9d74-3d8430519db2.png).

~~~
jaggederest
Thought I'd mention to you that your 'supermarkets' data is pretty out of date
or marginal quality. I see at least 3-4 supermarkets opened in the last year
not on there.

------
schooldistrict
[http://schooldistrictfinder.com/](http://schooldistrictfinder.com/)

Finding the perfect school (one of the many steps to find the perfect house)
using open data

~~~
huherto
Great, Just today I was thinking about this.

So, how do I know the district area? I can see the pin. Is that the center?
Sorry, I am just getting familiar with this subject in the US.

------
mrfusion
Where do you get the house listing data? Do you then geo code the addresses
and filter them against your acceptable locations?

(Sorry if I missed reading that part)

~~~
Caged
Are you referring to real estate listings? In my case, I use Trulia to search
the zones after I narrow down the locations. However, if you're referring to
the building footprints I have a Make target that pulls GIS data from public
sources: [https://github.com/caged/portland-
atlas](https://github.com/caged/portland-atlas)

~~~
mrfusion
I guess I'm asking how you find only listings in the zones you want?

Does trulia have an API?
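
For what it's worth, once listings have been geocoded, the filtering step itself is small. The sketch below assumes you already have listing coordinates and zone polygons from somewhere; it does not use any Trulia API, and the names and sample coordinates are made up. It treats longitude/latitude as flat x/y, which is a reasonable approximation for city-sized zones.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def listings_in_zones(listings, zones):
    """Keep listings whose coordinates fall inside at least one zone polygon."""
    return [
        h for h in listings
        if any(point_in_polygon(h["lon"], h["lat"], zone) for zone in zones)
    ]


# Made-up zone (a simple rectangle) and listings, for illustration only.
zones = [[(-122.70, 45.50), (-122.60, 45.50), (-122.60, 45.55), (-122.70, 45.55)]]
listings = [
    {"id": "in", "lon": -122.65, "lat": 45.52},
    {"id": "out", "lon": -122.75, "lat": 45.52},
]
matches = listings_in_zones(listings, zones)
```

GIS libraries and spatial databases offer the same test with proper handling of edge cases and projections; the hand-rolled version is only meant to show the idea.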

------
qrybam
I'm working on something exactly like this at the moment (not US) - it's super
exciting to see people make use of the data available to them.

Kudos :)

------
andbberger
My startup does exactly this and we just released!

[https://www.dwellaware.com/](https://www.dwellaware.com/)

------
misuba
Finding unaffordable neighborhoods, more like.

~~~
Eric_WVGG
ikr. Three problems with this article.

First, he better be rich if he’s serious about living in these zones.

Second, his data analysis is blowing past a multitude of equally livable, more
affordable PDX neighborhoods.

Third, within his zones are actual unlivable subzones (nobody wants to live in
actual-downtown-portland).

Impressive analysis, but he'd be better served by living for a month in a PDX
AirBnB and learning his way around the city.

------
frik
Great, way to go!

Reminds me of Bill Gates's book "The Road Ahead" (1st ed, 1995, p. 158) - it's
about his vision of a next-gen network that would replace the middleman,
among other innovative things:

    
    
      [...] most market places are very inefficient. For 
      instance, if you are trying to find a doctor, lawyer, 
      accountant, or similar professional, or are buying a house,
      information is incomplete and comparisons are difficult 
      to make.
      The information highway will extend the electronic 
      marketplace and make it the ultimate go-between, the 
      universal middleman. Often the only human involved in a 
      transaction will be the actual buyer and seller. All the 
      goods for sale in the world will be available for you to
      examine, compare, and, often, customize. When you want to
      buy something you'll be able to tell your computer to 
      find it for you at the best price offered by any 
      acceptable source or ask your computer to 'haggle' with 
      the computers of various sellers. Information about 
      venders and their products will be available to any 
      computer connected to the highway. Servers distributed 
      worldwide will accept bids, resolve offers into completed 
      transactions, control authentication and security, and 
      handle all other aspects of the marketplace, including 
      the transfer of funds. This will carry us into a new 
      world of low-friction, low-overhead capitalism, in which 
      market information will be plentiful and transaction 
      costs low. It will be a shopper's heaven. [...]

~~~
gohrt
Amazing that the book was written by a man who failed to communicate that
(rather obvious) idea to the employees of the multi-billion dollar company he
owned and directly managed.

~~~
drzaiusapelord
He had the financial incentive to not make that happen, namely to lock people
onto IE/Windows and direct them out of AOL and into a 1990's MSN walled
garden.

If anything, Gates the futurist vs Gates the businessperson shows us that,
regardless of what the pro-laissez-faire types say, most often corporations
left unfettered don't produce the best results, but only results that directly
benefit themselves and that often translates into things like vendor lock-in,
poor software, poor support, lack of progress.

I mean, does Gates even question why the markets for the things he complains
about are so inefficient? It's because they're full of guys like himself.

~~~
burntsushi
Corporations left unfettered are no longer corporations.

This isn't just semantics. An unfettered corporation no longer has the
benefits of government _either_. I'm thinking of things like subsidies, added
regulation that raises barriers of entry and limited liability.

> regardless of what the pro-laissez-faire types say

Not all pro laissez-faire types are strict utilitarians. Most that I know tend
to appeal to ethics in addition to a freed market's utility.

> but only results that directly benefit themselves

That doesn't make any sense. Microsoft has surely provided benefits to a great
number of people. Perhaps you'd argue that Microsoft benefited more in certain
transactions, but that's an entirely different claim (and it doesn't imply
that others didn't benefit _at all_ ).

~~~
deveac
>Not all pro laissez-faire types are strict utilitarians. Most that I know
tend to appeal to ethics in addition

Utilitarianism is actually a normative moral theory concerned with
determining whether an action is right or wrong. Any strict utilitarian would
dispute that ethics are in addition to their view rather than a core component
:)

