

Synference - an API for A/B testing with reinforcement learning - randomtask
http://gigaom.com/2013/12/20/synference-thinks-ab-testing-can-get-a-lot-smarter-with-machine-learning/

======
bonzoq
Congrats on launching. It's an interesting application of machine learning.
Might give it a try with my tiny website.

------
feral
I'm one of the founders - happy to answer any questions.

I believe this sort of bandit-meets-machine-learning approach is going to have
a big impact on web optimisation.

~~~
hooande
How do you plan to deal with the "cold start" problem, not having enough
information about the objects in the prediction? The majority of visitors will
be viewing pages on a given site for the first time. In these cases there is a
limited amount of information available: referring url, browser, perhaps first
or third party cookies, visit time, etc. The whole point of A/B testing is to
find a configuration of page elements that will appeal to the largest number
of people, because there is so little information available about each
individual.

I know of some cold start solutions, my personal favorite being variants of
regression based latent factor models [1]. I'm not expecting you to reveal
your secret sauce, but I am curious about how you plan to address the problem
of having so little information about each person.

[1] http://dl.acm.org/citation.cfm?id=1557029

~~~
feral
You're really describing two different problems there.

Problem 1 is the typical recommendation-system/ML problem of 'cold start',
where there aren't enough training examples to produce good predictions.
(Recommender systems based on clustering items and users are particularly
vulnerable to this.)

Problem 2 is different: you only have a small amount of data about each user
(low-dimensional feature data), so even if you had a lot of examples, you
wouldn't have enough information about each one (i.e. enough features) to
make predictions.

Problem 2 isn't really as big an issue here as you might think. You mention a
limited amount of available information: "referring url, browser, perhaps
first or third party cookies, visit time, etc." That is actually a fair amount
of feature data if you treat it in the right way - even just browser+IP is.

If our system has just an IP address, we'll build geolocation features, and
derive features like predicted income level from a combination of geo and
device/browser.
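
As a rough illustration of how a few raw signals expand into a feature vector
(this is only a sketch of the idea - geo_lookup and income_estimate below are
hypothetical stand-ins for a geo-IP database and a demographic model, not our
actual pipeline):

    from datetime import datetime, timezone

    def geo_lookup(ip):
        # stub: a real system would query a geo-IP database here
        return ("IE", "Dublin")

    def income_estimate(country, region, device):
        # stub: a real system would use a trained demographic model
        return "mid"

    def derive_features(ip, user_agent, ts):
        country, region = geo_lookup(ip)
        device = "mobile" if "Mobile" in user_agent else "desktop"
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        return {
            "country": country,
            "region": region,
            "device": device,
            "hour_of_day": when.hour,
            "is_weekend": when.weekday() >= 5,
            # a derived feature combining geo and device signals
            "predicted_income_band": income_estimate(country, region, device),
        }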

Now, those features are only useful if they are predictive - if it's the case
that the set of options you want to choose between varies in some way that
correlates with those features.

However, this is often the case. If you have a global website, users from some
geographies will have different preferences. Time-of-day is also an important
feature. Further, if you want to adjust the discount you give people, how
modern their hardware is (inferred from the user agent) is a good signal.
Those are just examples - there's a lot in there, if you treat it right.

Even these very common, lowest-common-denominator features can thus give you
better results than blindly pretending your population is one homogeneous
whole, which is what existing bandit approaches do. Our API also supports
custom features.
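
For instance, a decision request might send both derived and site-specific
features along with the options to choose between. To be clear, the endpoint
and payload shape below are made up for illustration - don't read them as our
documented API:

    import requests

    resp = requests.post(
        "https://api.synference.example/v1/decide",  # hypothetical endpoint
        json={
            "experiment": "homepage-headline",
            "options": ["variant_a", "variant_b", "variant_c"],
            "features": {
                "country": "IE",           # features we derive server-side
                "device": "mobile",
                "plan": "pro",             # custom, site-specific features
                "visits_last_30d": 12,
            },
        },
    )
    choice = resp.json()["option"]  # serve this variant, then report the
                                    # conversion back as the reward signal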

Problem 1 is a more technical question.

The honest answer is that the particular type of framework we use doesn't
really suffer from that problem.

Our solution is adaptive. If there are very few training data points, then our
system will treat the problem as if the user population is relatively
homogeneous, and will attempt to predict the best option for the population as
a whole - like a traditional bandit algorithm would, or like the result you'd
get from a simple A/B testing framework.

However, as the amount of data gets large, our system functions more like a
predictive model (i.e. like a ML system) and less like a simple bandit
algorithm.

This means that there's really no cold-start problem with the learning
algorithm - instead, there's a transition to finer and finer predictions as
the data density increases. So it ends up using the data _roughly_ as
efficiently (not quite as efficiently, but close) as a custom solution
calibrated to the level of available data.
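
To make that behaviour concrete, here's a minimal textbook-style sketch (not
our actual algorithm): Thompson sampling with a Bayesian linear model per
option. With little data the posteriors are wide, so choices are near-uniform
exploration across all visitors, much like a plain bandit; as observations
accumulate, the per-feature weights sharpen and the choices become
increasingly personalised:

    import numpy as np

    class ContextualThompson:
        def __init__(self, n_options, n_features, prior_precision=1.0):
            # one Bayesian linear-regression posterior per option
            self.A = [prior_precision * np.eye(n_features)
                      for _ in range(n_options)]
            self.b = [np.zeros(n_features) for _ in range(n_options)]

        def choose(self, x):
            # sample a weight vector from each option's posterior and
            # pick the option whose sampled model scores the context best
            scores = []
            for A, b in zip(self.A, self.b):
                cov = np.linalg.inv(A)
                w = np.random.multivariate_normal(cov @ b, cov)
                scores.append(w @ x)
            return int(np.argmax(scores))

        def update(self, option, x, reward):
            # standard conjugate update for Bayesian linear regression
            self.A[option] += np.outer(x, x)
            self.b[option] += reward * x

    # x is the visitor's feature vector (with a bias term); reward is 1
    # if the visitor converted and 0 otherwise
    bandit = ContextualThompson(n_options=3, n_features=4)
    x = np.array([1.0, 0.0, 1.0, 0.3])  # [bias, mobile, weekend, ...]
    opt = bandit.choose(x)
    bandit.update(opt, x, reward=1.0)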

And that's an awful lot more efficient than running an A/B test for a fixed
length of time and waiting to see the results.

If you want more detail, maybe have a read of our FAQ:
http://www.synference.com/faq.html or send me an e-mail.

