
Experiments at Airbnb - lennysan
http://nerds.airbnb.com/experiments-at-airbnb/
======
nostromo
Airbnb could likely get a lot more bang for their buck by letting hosts run
experiments on pricing than by testing button colors and whatnot.

I ran an online marketplace at a previous gig. Our service providers always
complained that they didn't know what to charge to maximize their business.
They couldn't see the forest for the trees. Because we had the data for all
providers, we started letting them know if they were under- or over-priced,
and we saw more conversions and revenue.

Dynamic pricing (like Uber does on holidays) alone could be hugely valuable.

~~~
coffeecheque
I agree. I've had to spend some time lately on the host pricing page for an
AirBnB rental and I would love some more advanced features.

I've previously thought about writing a little program that keeps an eye on
the major hotels in the city, so I can work out when rooms are more in demand
and amend my pricing.

Even a "this is the average cost of similar rentals within an X radius of you"
would be helpful.
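
A rough sketch of what that radius average could look like, assuming each listing is a plain dict with hypothetical `lat`, `lon`, and `price` keys (none of this is AirBnB's actual data model, and the "similar rentals" filter is left out):

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def average_nearby_price(me, listings, radius_km):
    """Mean nightly price of listings within radius_km of `me`.

    Returns None when no listing falls inside the radius.
    """
    prices = [
        listing["price"]
        for listing in listings
        if haversine_km(me["lat"], me["lon"],
                        listing["lat"], listing["lon"]) <= radius_km
    ]
    return sum(prices) / len(prices) if prices else None
```

A real version would presumably also filter on room type, capacity, and dates before averaging.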

It would be beneficial to AirBnB too. They'd get more money in fees and money
flowing through their system. It also works both ways - if I reduce pricing in
low demand times, it's possible I'd get bookings I wouldn't have before -
again increasing the amount for AirBnB.

~~~
zaroth
Classic 'push this button to make more money' app, with a nice path to
acquisition baked in.

Someone with an afternoon free should really step on this. Just make sure it
doesn't end up starting price wars with itself!

Call it Autopilot, or something.

------
wkonkel
A simple hack is to run an A-A-B-B test instead of an A-B test. Rather than
splitting 50-50, use 25-25-25-25 splits. When A1==A2 and B1==B2, then you know
that you have statistically relevant data and you can compare A to B.
Depending on the dataset, this could happen in minutes or weeks.
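
For the 25-25-25-25 split itself, here is a minimal sketch, assuming users have stable integer IDs and that hashing the ID is an acceptable way to randomize (the bucket names and the `experiment` label are made up for illustration):

```python
import hashlib
from collections import Counter

BUCKETS = ["A1", "A2", "B1", "B2"]


def assign_bucket(user_id, experiment="example-experiment"):
    """Deterministically map a user to one of four equally sized buckets."""
    key = f"{experiment}:{user_id}".encode()
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce modulo 4.
    return BUCKETS[int.from_bytes(digest[:8], "big") % len(BUCKETS)]


# Quick sanity check of the split over 10,000 hypothetical user IDs.
counts = Counter(assign_bucket(user_id) for user_id in range(10_000))
```

Each bucket should end up with roughly a quarter of the users, and a given user always lands in the same bucket for a given experiment label.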

~~~
bjlorenzen
Using four buckets instead of two like that will improve your confidence in
the results, but will also double the required sample / testing duration. You
could just as easily use two buckets and wait twice as long to achieve the
same effect.

~~~
blauwbilgorgel
A/A testing (Null testing) or A/A/B testing gives a different effect than A/B
testing.

Microsoft Research suggested
([http://ai.stanford.edu/~ronnyk/2009controlledExperimentsOnTh...](http://ai.stanford.edu/~ronnyk/2009controlledExperimentsOnTheWebSurvey.pdf))
that you continuously run A/A tests alongside your experiments. An A/A test
can:

 _\- Collect data and assess its variability for power calculations_

 _\- test the experimentation system (the Null hypothesis should be rejected
about 5% of the time when a 95% confidence level is used)_

 _\- tell if users are split according to the planned percentages_
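
The second point can be checked with a small simulation. This is only a sketch under stated assumptions: a two-proportion z-test with a normal approximation, a made-up 10% conversion rate, and 500 users per arm. Both arms are identical, so every rejection is a false positive.

```python
import math
import random


def two_sided_p(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, nothing to reject
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def aa_false_positive_rate(runs=1000, n=500, rate=0.10, alpha=0.05, seed=0):
    """Fraction of simulated A/A tests in which the null is rejected."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(runs):
        # Both arms draw from the same conversion rate.
        a = sum(rng.random() < rate for _ in range(n))
        b = sum(rng.random() < rate for _ in range(n))
        if two_sided_p(a, n, b, n) < alpha:
            rejected += 1
    return rejected / runs
```

If the experimentation system is healthy, the returned fraction should sit near `alpha`; a value far from it suggests a problem with bucketing or logging.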

------
thinkmoore
Statisticians have spent time thinking about the right way to deal with these
sorts of problems for a long time:
[https://en.wikipedia.org/wiki/Sequential_analysis](https://en.wikipedia.org/wiki/Sequential_analysis).

Funnily enough, the page they reference for calculating the right sample size
actually talks about sequential analysis, but AirBnB doesn't mention this in
describing their solution...

------
sutterbomb
HN user btilly has a really helpful essay on the math behind stopping tests
earlier than your predetermined sample size. It calls for setting a maximum
duration, and provides stopping points along the way. Works similarly to the
method AirBnB describes.

[http://elem.com/~btilly/ab-testing-multiple-looks/part2-limi...](http://elem.com/~btilly/ab-testing-multiple-looks/part2-limited-data.html)

------
bjlorenzen
As a developer working for a major competitor to airbnb on a shopping page,
and having implemented hundreds of experiments on my page, I can say that
these guys are way too obsessed with statistical certainty.

Rate of deployment of experiments is a better focus; since all your opponents
are bound to copy your winners anyways, you have to rely on the few months
edge you've earned before they do so, and constantly maintain that lead.

~~~
thinkmoore
Unless random high bias means your "edge" is exactly the opposite.

~~~
001sky
This is about proportionality. Finite window trading strategies need to take
into account the link between implementation overhead (time) and overall
profitability (linked to time). Just the same way they need to link pricing
and profits (linked to pricing). That seems to be the crux of the issue.

~~~
thinkmoore
Yes, I'll agree with that. Just warning about throwing the baby out with the
bath water. Applying quantitative decision making tools is worthwhile; being
picky about it can protect us from our emotions and preconceptions.

------
coherentpony
This article contains some serious p-value abuse. The p-value should be
adjusted to account for multiple testing. You do this to reduce the chance
that a hypothesis is accepted purely due to random chance.

Try setting your significance threshold to your Type 1 error rate _divided by
the number of tests you perform_. It will be _much_ smaller, and this is a good thing.
Significance should really test for significance, not random chance.
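
Dividing the error rate by the number of tests is the Bonferroni correction. A minimal sketch, with hypothetical function names and made-up p-values:

```python
def bonferroni_threshold(alpha, num_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


def significant_after_correction(p_values, alpha=0.05):
    """Flag which p-values stay significant once the threshold is divided."""
    if not p_values:
        return []
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]


# Four made-up p-values: the threshold drops from 0.05 to 0.0125,
# so 0.04 no longer counts as significant.
flags = significant_after_correction([0.04, 0.004, 0.2, 0.0001])
```

Bonferroni is conservative; it trades some power for a bound on the chance of any false positive across the whole set of tests.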

------
jessriedel
I wish AirBnB would make the cost scale logarithmic, to match the fact that
this is roughly how the prices will be distributed too. I'm usually only using
the left-most 5% of that slider.
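
Something like this mapping is what I have in mind, as a sketch with made-up price bounds of 10 and 1000 (the slider position is assumed to be normalized to the range 0 to 1):

```python
import math


def slider_to_price(position, min_price=10.0, max_price=1000.0):
    """Map a slider position in [0, 1] to a price on a logarithmic scale."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0 and 1")
    return min_price * (max_price / min_price) ** position


def price_to_slider(price, min_price=10.0, max_price=1000.0):
    """Inverse mapping: where a given price sits on the slider."""
    if not min_price <= price <= max_price:
        raise ValueError("price is outside the slider range")
    return math.log(price / min_price) / math.log(max_price / min_price)
```

With those bounds the midpoint of the slider lands at about 100 rather than 505, so the cheap end where most listings sit gets half the track.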

~~~
ansimionescu
Isn't there any better alternative to sliders, though? I never use them, as I
usually know exactly how much I'm willing to spend, and will probably not feel
comfortable going out of my bounds anyway.

------
cbovis
Can anyone point out a good introduction to some of the methods used in the
article? Terms such as the p-value, treatment effect etc.

------
RA_Fisher
The cult of statistical significance is alive and well. A 0.05 p-value implies
a 1:20 chance of "alternative" performing worse upon final installation.
That's rather risk averse. It also implies that "alternative" is worse from
the get-go. When is that the case? Type 1 and Type 2 errors are much more
balanced in web apps. Anyone care to show me why that's a bad mentality?

~~~
thinkmoore
No, it doesn't. It means that there is a 1 in 20 chance that you would have
seen results as good or better if your change had no effect (assuming a
standard one-sided hypothesis test). Thus if the effect appears to be good,
you should take the test as some evidence that it is worth implementing.
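
To make that reading concrete, here is a sketch that estimates a one-sided p-value by brute force: simulate many experiments in which the change has no effect and count how often the lift is at least as large as the one observed. The conversion counts and arm size in the usage note are made up, and both arms are assumed to have the same number of users.

```python
import random


def simulated_one_sided_p(conv_a, conv_b, n, runs=2000, seed=0):
    """Estimate P(lift >= observed lift) when both arms share one true rate."""
    rng = random.Random(seed)
    null_rate = (conv_a + conv_b) / (2 * n)  # pooled rate under "no effect"
    observed_lift = conv_b - conv_a
    as_extreme = 0
    for _ in range(runs):
        a = sum(rng.random() < null_rate for _ in range(n))
        b = sum(rng.random() < null_rate for _ in range(n))
        if b - a >= observed_lift:
            as_extreme += 1
    return as_extreme / runs
```

Calling `simulated_one_sided_p(50, 70, 500)` asks how often 20 or more extra conversions would show up by luck alone. The answer says nothing directly about the chance that the alternative performs worse once shipped.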

~~~
RA_Fisher
Right, that's a specific use, but I'm speaking of a two-sided test where
you're indifferent between alternatives.

------
205guy
Ok, I'll be "that" guy who heckles every AirBnB post, even if this one did
have some nice graphs (and ideas).

When is AirBnB going to experiment with helping their hosts follow the law? I
bet I can predict that graph. Why, look at all those illegal rentals in SF
right there in the sample screenshots--oh the irony.

Remember, DON'T FUCK UP THE CULTURE! But it's OK to fuck up your host city for
a buck or 2 billion.

~~~
chris_jg
You rant about "follow the law" but who exactly is being hurt? The person
making rent off his/her extra room? Please give examples of real harm rather
than, "follow the law" statements. I suppose you're the kind of person that'd
turn in Anne Frank. That'd be following the law.

~~~
205guy
We have a Godwinner!!!

In case you haven't heard:
[http://en.wikipedia.org/wiki/Tragedy_of_the_commons](http://en.wikipedia.org/wiki/Tragedy_of_the_commons)

Also: there is a local law on the books, duly enacted by a democratic process,
but you're arguing nobody needs to follow that law because you think nobody is
being impacted? Is that your position? I have to assume you're an AirBnB host,
so I wonder if you've contacted every one of your neighbors to see if they're
cool with your gig.

~~~
chris_jg
(1) Tragedy of the Commons? Are you saying an extra room in your own home is
"The Commons"? (2) "... nobody needs to follow that law because you think
nobody is being impacted? Is that your position?" Yes. That is my position. At
one point in the US it was illegal to be gay. It was illegal for blacks to
drink from a white person's fountain. Those were laws "duly enacted by a
democratic process". I'm saying that before you comply with a law, or suggest
others should comply with it, you should ask yourself, "is this law just?" If
you think so, great. Back up the virtue of the law itself. My point remains: a
law that is unjust doesn't deserve to be followed, regardless of the process
that enacted it.

~~~
205guy
For my definition of the commons being eroded by AirBnB, see my reply to
eddieroger above.

In a sense you are right that "democratic" doesn't necessarily mean just or
morally good, but I don't think you can compare what are essentially civil
rights with property rights. I agree that laws should be backed with arguments
about why they are needed and beneficial--though you realize it's not always
possible to go into such details when I'm already so far off topic :-)

So here is my quick defense of SF's banning of short-term rentals (essentially
making all AirBnB rentals in SF illegal): Regulating property is about zoning
and controlling the market so that financial forces don't overwhelm the people
involved. It is the city's responsibility to keep the city livable for its
residents. This regulation defends local communities and avoids the
instability of speculation properties, evictions for AirBnB conversions, etc.

Note that I think the total ban is both slightly too strict and probably
expensive to enforce. That's why I advocate for a new policy: allow any owner
(and renter) to do short-term sublets up to a limit of 30-40 days per year.
For renters, they are limited to collecting the full amount of their rent in
any calendar month, and any additional money collected belongs to the
landlord. And finally, any 3rd party booking service must enforce these
limits, collect hotel and sales tax, and turn records over to the city. That
way people (even renters) can rent out when they go on vacation or make a
little extra money, the city gets an elastic supply of rooms for big
conventions, concerts, and sports events, but residential stays residential
the other 330 days of the year and housing doesn't get bought up by
speculators, nor does it have a bubble due to the value of AirBnB conversion.

