There are, I believe, three main reasons why the recommendations are so poor on YouTube:
a) YouTube doesn't know anything about the content itself; it can only use metadata
b) The algorithm itself is biased towards creators who post often and keep users hooked the longest, which is almost always vloggers (ask any animator what they think of YouTube)
c) Many recommendation systems today create many buckets, and once you watch something from one bucket (you show your intent), the algorithm will focus on that bucket only. (You can see this working extremely poorly on Amazon, which tries to sell you a fridge right after you just bought a fridge; a toy sketch of this failure mode is below.)
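To make the bucketing failure concrete, here's a toy sketch in Python. The catalog, purchase history, and recommender are all invented for illustration; this is not how Amazon or YouTube actually work, just the naive behavior being described:

    # Toy sketch of the "bucket" failure mode: once a user shows intent
    # in one category, a naive recommender keeps pushing that category,
    # even after the need is already satisfied. All data is made up.
    from collections import Counter

    purchases = ["fridge"]  # the user just bought a fridge

    catalog = {
        "fridge": ["fridge A", "fridge B", "fridge C"],
        "cookware": ["pan", "pot"],
        "books": ["novel", "cookbook"],
    }

    def naive_recommend(purchases, k=3):
        # Pick the single most-engaged bucket and recommend only from it.
        top_bucket, _ = Counter(purchases).most_common(1)[0]
        return catalog[top_bucket][:k]

    print(naive_recommend(purchases))
    # ['fridge A', 'fridge B', 'fridge C'] -- more fridges for a fridge owner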
It's very hard to build a great recommendation system (look at Spotify's Discover Weekly), but because this is the 101 of any machine learning course, it's the primary thing that companies refuse to outsource (I built a company around it and failed badly).
I've found that the YouTube recommendations do a good job of picking a "next" video to watch, but an exceptionally poor job of constructing the front page.
If I watch "Some Video (part 1)", the recommendations reliably pick "Some Video (part 2)" next, with the other parts as the other related videos and similar content further down. If I watch a random video from a particular channel, the recommendations show more videos from that channel. If I watch a video of a particular game or a reaction to a given episode of a show, the recommendations show more videos of that game or more reactions to that same episode. If I listen to music by a particular artist, the recommendations show more music by that artist.
On the other hand, the front page consistently shows me 1) old videos I've already seen, 2) collections of highly viewed content that I have no interest in, even if I've already hit the "not interested" X on it, and 3) popular videos by channels I already subscribe to (I don't want to know what's popular, I want to know what's new).
Yes, the front page curation is broken in exactly the ways you've described. I've seen people who work on this write in threads like this on HN. I really wish our voices were heard. YouTube could be infinitely better with improvements addressing these points.
YouTube does have automatic transcription for videos. It's not too hard to link this to a topic hierarchy (maybe they already do this; a minimal sketch of the idea is below). It still seems like a hard problem at their scale, though, since unlike Spotify's genres, the full list of topics isn't knowable in advance.
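As a rough illustration of what "linking transcripts to a topic hierarchy" could mean at its simplest, here's a keyword-overlap sketch. The taxonomy, keywords, and transcript text are invented; a real system would need far more robust NLP than substring matching:

    # Minimal sketch: tag an auto-generated transcript against a
    # hand-made topic hierarchy via keyword overlap. All data invented.
    taxonomy = {
        "home repair/plumbing": {"faucet", "pipe", "leak"},
        "home repair/electrical": {"outlet", "breaker", "wiring"},
        "cooking/baking": {"oven", "dough", "flour"},
    }

    def tag_transcript(transcript: str) -> list[str]:
        words = set(transcript.lower().split())
        # Keep every topic whose keyword set overlaps the transcript.
        return [topic for topic, keywords in taxonomy.items()
                if words & keywords]

    print(tag_transcript("Today we fix a leak under the kitchen faucet"))
    # ['home repair/plumbing']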
I've been building a search engine for lectures as a research project. For a small list of videos, I find that browsing a topic taxonomy is really nice compared to recommenders that try to guess your intent.
There are commercial systems for automatically tagging text (e.g. Watson) with hierarchies that don't go into niche areas; e.g. the Watson taxonomy tagger covers about 1,000 tags.
For more niche topics, I've explored Watson's entity recognition system, e.g. to recognize the names of diseases. The advantage is that it picks up terms it hasn't seen; the problem is you can only identify entities that someone has trained a system to recognize.
The UI challenges are interesting as well. If Spotify identified 100 genres that interested me, they could pick any arbitrary subset of playlists and I'd be pretty happy. If I used YouTube to get home repair videos, and then it showed me videos about repairing parts of my house that aren't broken, it would get pretty irritating.
d) The recommendations algorithm is one of the primary ways that YouTube users find videos to watch. No matter how bad its recommendations are, a lot of users will still act upon them, simply because the recommended videos are so visible. This becomes a self-fulfilling prophecy: videos that are frequently recommended are viewed many times because they are seen so often; the high view count on those videos makes them high-priority candidates for recommendation, and so on.
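A toy simulation of that feedback loop, with arbitrary made-up numbers, just to show the rich-get-richer dynamic:

    # Toy simulation: recommended videos get more views, and views drive
    # future recommendations. Whichever video gets an early lead
    # accumulates almost all the views.
    import random

    views = {"video A": 100, "video B": 99}  # nearly identical start

    for _ in range(1000):
        # Recommend the most-viewed video; users click it most of the time.
        recommended = max(views, key=views.get)
        watched = recommended if random.random() < 0.9 else random.choice(list(views))
        views[watched] += 1

    print(views)  # one video runs away with nearly all the new views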
a) They really don't need to know the content; they can infer most of it from who watches and how much, a.k.a. collaborative filtering (a minimal sketch follows this list).
b) It shouldn't bias towards whoever "posts often", especially if the videos are bad. "Hooked the longest" would rank a video highly for you if you get hooked the way the already-hooked users did, which makes sense, no?
c) Those are poor systems, and actually, I think Amazon has one of the much better rec systems.
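For point a), here's a minimal collaborative-filtering sketch: infer that two videos are related purely from who watched them, with no metadata at all. The watch histories are invented:

    # Item-item collaborative filtering in miniature: two videos are
    # "related" if the same people watch them. No content knowledge used.
    from math import sqrt

    watches = {
        "alice": {"v1", "v2"},
        "bob":   {"v1", "v2", "v3"},
        "carol": {"v3", "v4"},
    }

    def viewers(video):
        return {user for user, vids in watches.items() if video in vids}

    def similarity(a, b):
        # Cosine similarity over the sets of viewers of each video.
        va, vb = viewers(a), viewers(b)
        return len(va & vb) / (sqrt(len(va)) * sqrt(len(vb)) or 1)

    print(similarity("v1", "v2"))  # 1.0 -- same audience, so "related"
    print(similarity("v1", "v4"))  # 0.0 -- disjoint audiences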
YouTube has one of the worst. Just today their 4th-ranked video for me was the 2-hour-long 9/11/2001 broadcast. What? Then, they never boost new videos from a channel I subscribe to and whose every video I've watched over the last 3 months. I literally have to check the channel's most-recent-video list daily to see if I missed something.
I think YouTube's weak recommender system is more a result of them having a hammer (deep learning) and seeing every problem as a nail.
a) That's the general problem. If you are encapsulated by the algorithm, there is no way out of the bubble, so in the end it doesn't help that they use collaborative filtering, because the majority of people end up watching the same things.
b) They chose the metric that made the most business sense. That's why, if you spend 6 months working on your video and somebody else produces one video a day, you'll never show up anywhere close to the top of the suggested videos.
c) Agree. Amazon has one of the better ones, but it's still terrible.
You hit the nail on the head with your assumption. This is Google's approach in general. But even with deep learning, the system is heavily biased, intentionally.
On a): if the vast majority of users ended up watching the same videos, then they wouldn't need a fancy ranking system; they could instead rank videos by most viewed per day, or something like that. My gut feeling is there are ways to segment users by their views and infer many characteristics of the videos a user wants to watch by looking at what similar users watch (a rough sketch of that idea follows this list). It's still highly dimensional, though.
b) My guess is they're optimizing for viewership engagement with the added side-benefit of video-creator engagement.
d) Revenue considerations, which could be related to b). That would be something that could degrade the recommendations.
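A rough sketch of the segmentation idea in a): find the users most similar to you by watch-history overlap and suggest what they watched that you haven't. The histories and video names are invented:

    # User-based neighbor recommendation: rank other users by Jaccard
    # overlap of watch histories, then suggest their unseen videos.
    def jaccard(a: set, b: set) -> float:
        return len(a & b) / len(a | b) if a | b else 0.0

    watches = {
        "me":    {"yoga basics", "yoga flow"},
        "user1": {"yoga basics", "yoga flow", "meditation 101"},
        "user2": {"top 10 goals", "match highlights"},
    }

    def recommend_for(user, k=1):
        history = watches[user]
        peers = sorted((u for u in watches if u != user),
                       key=lambda u: jaccard(history, watches[u]),
                       reverse=True)
        suggestions = []
        for peer in peers:
            suggestions += [v for v in watches[peer] if v not in history]
        return suggestions[:k]

    print(recommend_for("me"))  # ['meditation 101']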
On a tangential note: ads. They sell those ads as "targeted", but when you play your yoga video... BAM! A Coca-Cola ad. So why build those "targeting" algorithms?
I was just talking about how terrible YouTube's recommendations are with my brother today, and I realize this idea is naive, but I think it would work better than the current machine learning system:
- Gather up all the channels that are followed by channels that I follow and/or have liked videos on.
- Recommend me videos from those channels.
I'm pretty sure that in my case this would give much better results than what I currently get shown at any given time.
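For what it's worth, the idea above is simple enough to sketch. The data structures and channel names are hypothetical, and YouTube's real API works differently:

    # Second-degree channel recommendations: surface recent videos from
    # channels that my own subscriptions follow, minus what I already follow.
    my_subs = {"channelA", "channelB"}

    follows = {  # who each channel follows
        "channelA": {"channelC", "channelD"},
        "channelB": {"channelC"},
    }

    recent_videos = {
        "channelC": ["C's new video"],
        "channelD": ["D's new video"],
    }

    def second_degree_recs(my_subs):
        candidates = set().union(*(follows.get(c, set()) for c in my_subs))
        candidates -= my_subs
        return [v for c in sorted(candidates) for v in recent_videos.get(c, [])]

    print(second_degree_recs(my_subs))
    # ["C's new video", "D's new video"]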
If you are a parent, remember that YouTube results and suggestions can sometimes be rather "suggestive". Every society needs some baseline of a moral code, and an algorithm doesn't understand that.
Imagine if your child asked an adult neighbor about the movie "Beaches" and they responded with the same answers YouTube does. Go ahead, search "beaches". Or "beach", or "vine".
I just tried this, and my first four results were for the Bette Midler film.
The rest are about beaches (most dangerous beaches, weird things found on beaches, top 5 beaches in Brazil, etc.).
What is striking, and I've noticed this before on YouTube, is that the thumbnails all feature nearly nude women. You'd perhaps expect this to happen randomly for beach-related videos, but I've noticed that if there's even a fleeting bit of nudity in a film trailer or similar, it seems to end up in the thumbnail.
Does a human scan through and choose that moment, trying to maximise clicks? Or does an algorithm try random frames and then keep the most clickbaity ones?
That's a human doing it. The algorithm just picks a timestamp to screenshot, and before people were allowed to choose their own thumbnails, they would put a single-frame picture at the timestamp the algorithm was expected to use.
What are the methods in ML and predictive modelling that are used to counter "bucketing"? As much as it is necessary for product creators to land us in groups by behavior, I think it is also necessary to have counter-techniques to eventually split those groups and create new ones, no?
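One standard family of counter-techniques is forced exploration, bandit-style: reserve a fraction of recommendation slots for items outside the user's inferred bucket. Here's a minimal epsilon-greedy re-ranker sketch; the candidates, scores, and parameter values are invented:

    # Epsilon-greedy slate construction: most slots exploit the model's
    # in-bucket guesses, a fixed fraction explores outside the bucket.
    import random

    def epsilon_greedy_slate(in_bucket, out_of_bucket, slate_size=5, eps=0.2):
        n_explore = max(1, int(slate_size * eps))  # reserve slots to explore
        slate = in_bucket[: slate_size - n_explore]         # exploit
        slate += random.sample(out_of_bucket, n_explore)    # explore
        return slate

    in_bucket = ["cat video 1", "cat video 2", "cat video 3", "cat video 4"]
    out_of_bucket = ["woodworking", "jazz concert", "math lecture"]

    print(epsilon_greedy_slate(in_bucket, out_of_bucket))
    # e.g. ['cat video 1', 'cat video 2', 'cat video 3', 'cat video 4', 'jazz concert']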
This is super interesting. How would I go about working with these guys, considering that they're in Palo Alto and I'm in London? I understand there are a bunch of hoops you have to jump through in terms of visas, but I've never really looked into it.
These guys may be based in California but Google has a London office, and if you want to work on deep learning then Google DeepMind is the obvious place to go and they're based in London as well.
Actually, a significant portion of the tech team (for instance Content ID) is based in Zurich, not in San Bruno. Check out their jobs page [0]. You don't need a friend to recommend you, but it would be great if you find somebody on a team you're interested in joining who would root for you.
Why do you say that? What they recommend is rarely what I end up watching, but it's usually in the right ballpark. So much of what motivates your daily YouTube viewing is external, so there's only so much you can do. They are aware of this:
> Historical user behavior on YouTube is inherently difficult to predict due to sparsity and a variety of unobservable external factors.