
Introducing the Priceonomics Business Model: Data Crawling Services - kevinburke
http://priceonomics.com/introducing-the-priceonomics-business-model-data/
======
natural219
I think this is excellent! I loved the original idea behind Priceonomics (I
had built some toy side projects exploring the same concept) and have enjoyed
the direction change to focus on understanding pricing rather than just
returning dumb results with a web scraper.

This announcement ties it all together. Philosophically, what matters is less
the actual products/services they're outputting than that they're sticking to
their original mission -- discovering the best price for a commodity and
sharing it with the world.

While we're on the subject of philosophy and prices, I really liked this
ribbonfarm article about bartering that seems relevant; I think you will
enjoy it:

[http://www.ribbonfarm.com/2008/03/16/bargaining-with-your-right-brain/](http://www.ribbonfarm.com/2008/03/16/bargaining-with-your-right-brain/)

------
gjreda
Do you guys have any concerns about websites' terms of use restrictions?

I've done a bit of crawling and have always been curious how selling this sort
of service would handle those potential use restrictions / legal issues.

~~~
harvestmoon
I've been building a product search tool and wanted to scrape data to power
it. It'd be similar to the caching and scraping Google does.

But I spoke to a lawyer and she strongly advised against it (she said if the
Terms of Service of a site say it isn't allowed, you have to follow that).

Dunno. It confuses me how something is supposedly illegal yet many companies
do it all the time without a concern.

~~~
vijayr
These people seem to be making a decent business out of selling scraped data -
[http://www.aggdata.com/](http://www.aggdata.com/) There might be others too.

------
zissou
Economist and long time web scraper here.

In your original business model you wanted to understand the price of
everything. In what ways did the problem of a lack of information on the
demand side come up? That is, it is easy to scrape the price in many markets
(supply side), but what kinds of conversations came up within your team about
the lack of information on how many units were actually sold at a posted
price?

By the way, glad to see you guys were able to make a business out of crawling.
I've landed a handful of freelance gigs since leaving grad school based on
scraping data for clients, but never tried to expand it to anything beyond
consulting projects.

~~~
chad_c
Not an economist, but I have been mulling over a project surrounding scraping
and pricing.

Without having access to the actual monetary transaction data, how does one
know what was sold and for how much? Without this (or a mechanism by which the
lister closes or updates the listing), how do you know anything was actually
sold?

~~~
binarysolo
Also an economist and a data scraper/consultant here -- depending on the data,
sometimes all you need to figure out is a correlation: frequency of updates,
listings being live for X time, clusters of listings around Y days, etc.

In terms of a few real-life examples, on the one hand you have eBay, which
provides you with sold data (via an API, through Terapeak). On the other hand
you have Craigslist, which is kinda opaque and hates scraping, but you can
monitor listings and their half-life. (Listings that disappear quickly
presumably got sold quickly; listings that stick around for weeks, relisted
over and over, presumably have lower liquidity and/or are priced high.)

~~~
zissou
eBay's completed listings is definitely one of the best sources of sales data
on the Internet that I'm aware of. Besides that, in some cases there are ways
to imperfectly estimate quantities when best-seller rankings are available
(e.g. at Amazon) -- Chevalier and Goolsbee were the first to suggest this
approach back in 2003.[1]

As you mentioned, monitoring half-life is another imperfect approach, but it
is of course plagued by false positives (a listing goes away but no sale was
made). There was a Google Tech Talk many years ago where some economists took
this approach[2], except they were looking at pricing power instead of
measuring quantity sold.

[1]
[http://www.hss.caltech.edu/~mshum/ec106/chevaliergoolsbee.pdf](http://www.hss.caltech.edu/~mshum/ec106/chevaliergoolsbee.pdf)

[2]
[https://www.youtube.com/watch?v=SfjAezl3-cU#t=27m20s](https://www.youtube.com/watch?v=SfjAezl3-cU#t=27m20s)
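For the ranking approach in [1], the idea is a Pareto (power-law)
relationship between sales rank and quantity sold, roughly
log(quantity) = a - b * log(rank). A minimal sketch -- the coefficients
below are illustrative placeholders, not the paper's estimates; in practice
a and b are calibrated by regressing log sales on log rank for items whose
true sales are known:

```python
import math

def quantity_from_rank(rank, a=9.6, b=1.2):
    """Pareto-style rank-to-quantity mapping: log(q) = a - b * log(rank).
    a and b are placeholder coefficients for illustration; calibrate them
    against items with known sales before trusting the estimates."""
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return math.exp(a - b * math.log(rank))
```

The estimates are imperfect for the same reason the half-life trick is:
you only observe a noisy proxy (rank) for the quantity you actually want.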

------
sksk
Is this service going to be very different (in final output) from what you
would get with [http://import.io](http://import.io)? Import.io takes the
position that they provide you with the tools to scrape a page, so they are
not really the ones doing the scraping (legally speaking). They also provide
an API, so it's easy to consume the scraped data in your programs.

I guess with Import.io, if something breaks, you will have to redo the
scraping logic yourself. Maybe PE will manage that on the user's behalf and
that's their value add? Looks like PE may be doing some intelligent scraping
as well (not everyone is going to list "Dell LCD Monitor, $400" in their
title, but they will do the normalization for you -- just a guess).

I have used import.io and it works very well for most sites and it is fairly
easy to create a scraper.

------
agibsonccc
Great pivot, guys! If any of you are working on textual insights with
scraping, please let me know. I handle PDFs and text docs as well as more
traditional HTML.

I think vertical-specific ecommerce pricing data like what these guys are
doing is a hard problem (I have clients in this space as well), and it's
great to specialize in something like this. I think the real value comes from
an end-to-end service like what these guys are offering.

Current projects for clients include custom sentiment analysis engines, real
time event streams, ad intel type projects, location data, and even menus.

Much of this is done with deep learning and more generalizable models that are
likely to fit your domain.

I have a dashboard I'm working on with some customers now that will allow you
to avoid having to contact a service provider.

Email is in my profile if there are any requests. I hope to put up a good
portfolio site soon. I only started this a few months after building out my
own text analytics engine.

Also again: best of luck to priceonomics here. It's a lucrative market if done
right.

------
staunch
The idea of doing a bunch of different kinds of data is a bit spooky. It's
_really_ hard to have the best data in the world in just a single vertical.

------
jey
What was the original Priceonomics product?

~~~
codegeek
Their very first blog post [0] said

"We’re building the price guide for everything, from bicycles to boats, Aeron
chairs to iPads. We hope as you look to buy or sell things you can use our
data to get a good deal (or at least prevent yourself from getting ripped
off). For example Priceonomics can show you how much a used iPad 2 3G should
cost, as well as a whole range of prices"

[0] [http://blog.priceonomics.com/post/14567999429/how-to-use-priceonomics](http://blog.priceonomics.com/post/14567999429/how-to-use-priceonomics)

------
LogicX
Sounds just like the Needlebase project ITA Software developed in the late
2000s, which was shut down after Google acquired them.
[http://www.quora.com/Needlebase](http://www.quora.com/Needlebase)

------
benackles
It's good to see a publisher trying a new business model other than
advertising and paywalls. I like to see companies leveraging their core
competency and providing it as a service. Priceonomics is killing it!

------
xxcode
Basically, you seem to be going the lifestyle-business route. Good for you!!!

------
ChuckMcM
Answers the previously unanswered question about where they get their data :-)
So yes, they crawl Craigslist and other sites. Now is it more 80legs or
something else? Hard to say.

------
mdgrech23
You guys are great at blogging. Have you thought about pivoting?

~~~
rohin
Rohin from Priceonomics here. Thanks for the kind words.

Well, we're trying to build a company that one day has hundreds of writers and
engineers working together. So, we've been writing all along and want to do
more of it for sure.

------
sixQuarks
How much would it cost to scrape about 100,000 data points each month?

------
cynusx
What's the frequency of crawling you guys support? E.g. hourly updates, daily
updates, etc.

