Laughed at that one. Booking.com is so full of dark patterns that I dread using it.
But if I find an alternative that has the same width of offers and a booking process that doesn't feel like a drill sergeant constantly yelling "BOOK NOW YOU WORTHLESS SCUM, BOOK, BOOK, WHAT ARE YOU WAITING FOR YOU IMBECILE, CLICK IT, BOOK, NOW, NOW" - what do they think will happen?
Booking.com A/B tests everything: the drill-sergeant-like funnel probably has higher conversion rates than any gentler variation. So to answer your question - they think you might not book through them without the shoutiness.
I'm going to guess the answer is no - which is why organizations have to be careful which metrics they measure and incentivize on. Granted, this failing is industry-wide, as the longest view on most orgs' dashboards is YoY. When that metric starts free-falling, it will most likely be too late to do anything about it, but most of the staff (up to the CEO) will have padded resumes with amazing numbers for improved conversion/revenue, which will get them to the next job.
One lesson from all these things is that they maximize for what's easiest to measure, not what's most important. Conversions aren't the be-all and end-all; nobody wants to come back to the store with the pushy salesperson.
I made sure to provide feedback.
When people book with you but give you 1/10 stars, that's probably a pretty strong warning sign that the customer isn't happy with the site and the first usable competitor that they find will get their business.
But that leaves you at the mercy of the competition, which in an open-data business (I mean, airlines are more than happy to tell you which flights they have available) implies that your product is undifferentiated. So eventually you resort to these tactics: as soon as a user gets in, do everything humanly possible to convert that sucker.
That's in essence what's wrong with Booking. This trickles down, unfortunately: Ryanair hides most of its costs by effectively forcing you to upgrade to Premium in order to be treated as something just a bit better than livestock, because the assumption that travelers' main concern is price pervades the industry, even if the price they are shown isn't the price they pay in the end.
Just like the unholy abomination that professional project managers have turned “agile methodologies” into.
I bet a "How to dox and stalk people with Python" post would be flagged down, so maybe I'm just complaining about the prevailing ethics on the site.
Now I understand why it's so bad: "user interface optimization models"
I refuse to believe this brings real value. The more plausible reality is, they have fantastic SEO and a tightening stranglehold on marketshare, and some AI to squeeze a few more pennies out along the way. Whatever metrics they are seeing, it won't be worth it in the long run. This kind of UX and product won't last.
For example, LinkedIn has a flow with an e-mail and password box, which will get a less attentive user to just re-enter their LinkedIn credentials. But it's actually a phishing form for your e-mail, so if your LinkedIn and e-mail passwords are the same, you have now "consented" to having your address book scraped and your contacts spammed.
Or, in the case of Booking.com:
* Every step has items designed to pressure you to book NOW because it'll be too late otherwise:
- "booked x times in the last x hours" on the listing, or
- "Only 1 room left!" (they now add "on our site" after they lost a consumer protection lawsuit)
  - Showing booked-out listings: "You missed it"
- Various notifications like "last booked X minutes ago" and "limited supply" popping in while you're scrolling to raise the pressure
* Misleading or deceptive claims
- "Jackpot, this is the cheapest price you've seen" (emphasis should be on "you've seen", this will be shown even if you look at overpriced properties)
- They seem to have stopped the "one person looking at this property" thing (to make you think that it may be gone if you don't book now - that one person is you), probably after being forced to do so by court
  - a misleading rating system (the lowest possible rating is 2.5/10, and you rate category by category, which means that if the staff is friendly and the hotel is in a good location etc., but the rats and cockroaches ate your luggage while you slept, that's still an 8/10 property). In practice, you should assume that anything below 8 is not good, below 7.5 is bad, below 7 is catastrophic, and below 6 you may not survive.
- I'd also assume that they mess with the reviews in various ways, like showing mostly positive ones etc., but I haven't verified that.
Overall, I like to compare the booking experience to a drill sergeant yelling into your ear to convert (book) right now, NOW, DO IT, NOW, YOU MAGGOT! They seem to have improved significantly over previous experiences with them, probably due to some combination of me getting used to ignoring the yelling, them realizing that such a bad experience pushes customers away, and their practices getting banned one by one.
It's a shame, because other than the drill sergeant, their site is great.
See more examples here: https://www.darkpatterns.org/
Someone pointed out "confirmshaming" to me a few years ago...and since then I feel like it shows up on > 50% of the sites I visit.
In its best form, perhaps. But more likely than not it looks something like this:
[ ] YES, I want to fight racism by subscribing to CrappyPublisher.net's twice-daily newsletter!
[ ] NO, I am a racist (and also a pedophile)!
If I'm cancelling Amazon Prime because it "costs too much" but you say "are you sure you want to miss out on all the fast shipping?", someone who is easily manipulated may continue to subscribe.
You could instead ask: "Why are you cancelling?":
- Don't use it enough
- Other (please specify:)
Do websites usually just use t-tests? Like adding one feature at a time?
Are you talking about more than one experimental design in terms of comparing the exp/control distributions or something else?
> your t-test results should then be valid if the two features are independent.
Assuming your assumptions about the interaction effect are correct.
You can do a hypothesis test on that assumption while including both factors (the two features), which will clear away any doubt with 95% confidence. Or hire a statistician =).
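For illustration, here's a sketch of such an interaction test for two binary features in a 2x2 experiment, using only the standard library. The cell counts below are made up; a real setup would pull them from the experiment logs.

```python
import math

def interaction_z_test(cells):
    """Difference-in-differences interaction test for a 2x2 factorial A/B test.

    cells maps (feature_a_on, feature_b_on) -> (conversions, visitors).
    Tests H0: the lift from feature A is the same whether or not
    feature B is on (i.e. no interaction effect).
    """
    p = {k: conv / n for k, (conv, n) in cells.items()}
    # Variance of the difference-in-differences under independent cells.
    var = sum(p[k] * (1 - p[k]) / cells[k][1] for k in cells)
    effect = (p[(1, 1)] - p[(1, 0)]) - (p[(0, 1)] - p[(0, 0)])
    z = effect / math.sqrt(var)
    # Two-sided p-value from the normal CDF (via math.erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return effect, z, p_value

# Synthetic counts: both features help on their own, and together they
# help much more than additively -> a real interaction.
cells = {(0, 0): (100, 1000), (1, 0): (150, 1000),
         (0, 1): (160, 1000), (1, 1): (350, 1000)}
effect, z, p_value = interaction_z_test(cells)
```

If the p-value is large, the no-interaction assumption is plausible and separate t-tests per feature are defensible; if it's small, the features need to be analyzed jointly.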
> developing an organisational capability to design, build, and deploy successful machine learned models in user-facing contexts is, in my opinion, as fundamental to an organisation’s competitiveness
You hear that, right? Already in 2019, you have to have AI and do it well to be competitive. I just wanted to point out how cyberpunk that is.
>... I just wanted to point out how cyberpunk that is.
Nah, that is corporate flavor of the month/year/etc. It's not the 90s, so they're not "synergizing" any more but otherwise, whatever.
In #devops is turtle all way down but at bottom is perl script. - @devops_borat
Comedy: You, trying to launch a startup from scratch using Java. Tragedy: Me, trying to debug 27k lines of legacy Perl that brings $113MM/yr - @NeckbeardHacker
Today's ML is really good at speech and image recognition, which makes for some very eye-popping layman demos.
Whereas what businesses really want is time series prediction, and modern ML really sucks balls at solving this problem.
I agree on the former, and quite strongly disagree on the latter, even if it means redefining ML to be dressed-up statistics.
This is interesting. Sometimes people from the business side consider AI the solution to all problems (as if there were just one catch-all AI solution), and some academic people think that the top-performing model for some classification task is the way to go, and they all forget that the goal is to earn money.
First of all it turned out that the winner wasn't actually all that useful for various reasons such as computational intensity.
But, more interestingly, it also turned out that the goal of the model (the "best" recommendations) isn't actually the goal of Netflix at all, which is much more interested in customer retention and similar metrics. The two things may be correlated, but they're certainly not the same thing.
I don't remember all the details but I thought it was a really good insight at the time.
Relevant news.yc discussion from a month ago: https://news.ycombinator.com/item?id=20876158
Yep, from my experience with booking.com it seems that instead of using highly trained AIs the decision was made to simply slap every dark pattern known to man onto the site and auto-subscribe every customer to a dozen newsletters.
But what really amazes me is the market failure that hotels and other accommodation providers can't come up with a co-op booking site. I am sure there are issues that are difficult to solve from a competition point of view, but are they really so difficult that the rent-seeking fees of current booking sites are justified?
And people like you and me, who really don't want to book at Booking.com and make an effort to book elsewhere, are often out of luck, because they own a bunch of other booking sites too.
Any idea what these are? Especially the pre-computation/caching and batching. I'm not able to see what advantage batching brings, or how you can really cache a prediction request.
Pre-compute the recommended hotels for your top users every night. Now when such a user comes back, they see a slightly stale recommendation, but it's lightning fast.
You can also pre-compute and cache some of the inputs to your model, like maybe a vector representation of the description of a hotel.
For the same hardware load, you can process several samples instead of just one.
Pre-computation means running your model on samples in advance, before the model result is needed, so it's ready to use instantly when needed.
Caching works probably because there are model results that are reused again and again, so it makes sense to cache them. For example, there are deep models that process room pictures, room and customer characteristics. Only customer characteristics change between customers, so it makes sense to cache the features output by the deep CNN that processes the room pictures.
Once you start doing prediction at scale, there are lots of these optimizations to pick up.
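A minimal sketch of the precompute-and-cache ideas above. The feature extractor, user features, and hotel IDs here are all invented stand-ins; in reality the cached part would be something like the output of a deep CNN over room pictures.

```python
from functools import lru_cache

def embed_room_pictures(hotel_id):
    # Imagine an expensive deep model here. The output depends only on
    # the hotel, not on the customer, so it is safe to cache.
    return [hotel_id * 0.1, hotel_id * 0.2]

@lru_cache(maxsize=100_000)
def cached_hotel_features(hotel_id):
    # First call per hotel is expensive; every later call is a dict lookup.
    return tuple(embed_room_pictures(hotel_id))

def score(customer_features, hotel_id):
    hotel_vec = cached_hotel_features(hotel_id)
    return sum(c * h for c, h in zip(customer_features, hotel_vec))

# Nightly batch job: precompute the best hotel for each active user, so
# the request path at serving time is a dictionary lookup, not a model run.
user_features = {"alice": [1.0, 0.5], "bob": [0.2, 2.0]}
precomputed = {
    user: max(range(1, 50), key=lambda h: score(feats, h))
    for user, feats in user_features.items()
}
```

Batching is the same trade-off on the compute side: running the model once over N stacked samples amortizes per-call overhead (dispatch, memory transfers) across the whole batch instead of paying it N times.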
All our models are balanced using multi-armed bandits, so for our recommendation engine, we run lots of arms that depend on the incoming channel, where in the app the recommendation is being shown, etc., and just combine the outputs of the models.
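For readers unfamiliar with bandits, here is a minimal epsilon-greedy sketch of the idea (not the poster's actual implementation; each "arm" would be one model or variant, and "reward" would be a conversion):

```python
import random

class EpsilonGreedyBandit:
    """Minimal multi-armed bandit: explore randomly with probability
    epsilon, otherwise exploit the arm with the best observed reward."""

    def __init__(self, n_arms, epsilon=0.1, seed=42):
        self.counts = [0] * n_arms
        self.values = [0.0] * n_arms   # running mean reward per arm
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def select_arm(self):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))          # explore
        return max(range(len(self.counts)), key=self.values.__getitem__)  # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        self.values[arm] += (reward - self.values[arm]) / n      # incremental mean

# Toy simulation with made-up conversion rates; arm 2 is clearly best,
# so the bandit should route most traffic to it over time.
bandit = EpsilonGreedyBandit(n_arms=3)
true_rates = [0.01, 0.02, 0.5]
sim = random.Random(0)
for _ in range(5000):
    arm = bandit.select_arm()
    bandit.update(arm, 1.0 if sim.random() < true_rates[arm] else 0.0)
```

The practical appeal over a fixed A/B split is that traffic shifts toward the winning model automatically instead of waiting for a test to conclude.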
This is one of the reasons I am a big believer in having a system to track model research and deployment lineage. (I personally use Domino Data Lab for this. I also work for Domino, but use it in my own modeling work and that of others I mentor.) No matter which system you use to track lineage, I've found it important to have a strict history of retraining, versioning, and experimentation. When models are used in systems downstream from the one they were originally intended for, it becomes even more critical to be able to explain and reproduce the 'research' that led up to deployment.
x1 * a + x2 * b + x3 * c + ... + x1000 * zzz + ...
If a, b, c, ..., zzz are all fixed constants already discovered by your learning algorithm, that's a very fast calculation, and doesn't take anything like 50 ms.
Also, in the real world, you can establish a significance cutoff for a lot of these constants and get something like this as your final equation:
x13 * m + x523 * cdf + x777 * wdc + x893 * ydz
And the learning side of things should have culled that list of thousand features down to a list of 5 - 10 that mattered.
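That culled model is just a sparse dot product. A toy sketch, with made-up feature indices and weights:

```python
# Coefficients the training step kept after the significance cutoff
# (feature index -> weight); every other coefficient was culled to zero.
weights = {13: 0.7, 523: -1.2, 777: 0.05, 893: 2.4}

def predict(features):
    """Score one sample given a dict of feature index -> value.

    Cost is O(number of surviving weights), not O(total features),
    so evaluating it is microseconds, not tens of milliseconds.
    """
    return sum(coef * features.get(i, 0.0) for i, coef in weights.items())

# Feature 42 is present in the sample but was culled, so it's ignored.
sample = {13: 1.0, 523: 0.5, 777: 2.0, 893: 1.0, 42: 9.9}
```
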
It really sounds like the off-the-shelf stuff isn't built for efficiency.
The team behind the paper built a model that had good performance on training data. They're a smart lot so they knew they needed to cross-validate. The results held up in cross-validation! Hooray, the model works! ...right?
That's as far as a lot of data scientists go. This paper points out that you need to have a model that does (at least) three things:
1. Generates good scores with training and testing data
2. Outperforms existing models in the real world
3. Runs really really quickly
There are a lot of data scientists who have no idea how to do #2 and #3. This paper says "These parts are really important!!!"
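Point 1 is the only one of the three you can verify offline. For reference, a minimal k-fold split helper in plain Python (libraries like scikit-learn ship an equivalent `KFold`; points 2 and 3 need live experiments and latency budgets instead):

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Each sample lands in exactly one test fold; fold sizes differ by at
    most one when k doesn't divide n_samples evenly.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

folds = list(k_fold_indices(10, k=5))
```
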
Features likely include more than just the single user's history, so they need to be updated often enough while still letting the model predict quickly. E.g. you want your model to capture that many people are booking from the same area at once because a sports game just ended, but you don't want to run an expensive query for every user of the page.
Definitely not the first thing to worry about in a startup, but better performance at Booking.com's scale is serious $$$.