Ask HN: Why is making a good recommendation system for YouTube hard?
43 points by nassimslab2 on Dec 23, 2020 | 52 comments
YouTube has been criticized for its recommendation algorithm, which often recommends the same thing over and over.

This got me thinking: what makes it so hard to build a good recommendation system?

On my side, I'm currently working on a new way to do so. I called it Channel Tree. The goal is to start out with a list of YouTube channels provided by the user.

Then my software will look at the Channels section of each channel to find other channels there. This has the effect of building a deep tree of different channels related to each other through the Channels section. The user will only have to specify how many layers of the tree he would like to have so the algorithm can stop there.

Finally, you'll only have to look at the tree and look out for channels that may seem interesting based on who's the parent in the tree.
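
A minimal sketch of the Channel Tree idea described above, not the poster's actual implementation. It assumes a hypothetical `fetch_featured_channels` callable that returns the channels listed in a channel's Channels section; here it is backed by a hard-coded dict rather than any real YouTube API. The walk is breadth-first from the user's seed channels, stops at the requested depth, and records each channel's parent so the tree can be inspected afterwards.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def build_channel_tree(
    seeds: Iterable[str],
    fetch_featured_channels: Callable[[str], List[str]],
    max_depth: int,
) -> Dict[str, Tuple[Optional[str], int]]:
    """Map each discovered channel to (parent, depth); seeds get (None, 0)."""
    tree: Dict[str, Tuple[Optional[str], int]] = {}
    queue = deque()
    for seed in seeds:
        if seed not in tree:
            tree[seed] = (None, 0)
            queue.append(seed)
    while queue:
        channel = queue.popleft()
        depth = tree[channel][1]
        if depth >= max_depth:
            continue  # the user-chosen number of layers has been reached
        for child in fetch_featured_channels(channel):
            if child not in tree:  # first discovery wins; also guards against cycles
                tree[child] = (channel, depth + 1)
                queue.append(child)
    return tree


# Hypothetical data standing in for scraped Channels sections.
FAKE_SECTIONS = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["A"],  # links back to a seed
    "D": ["E"],
}


def fake_fetch(channel: str) -> List[str]:
    # Channels with an empty Channels section simply contribute no children.
    return FAKE_SECTIONS.get(channel, [])


tree = build_channel_tree(["A"], fake_fetch, max_depth=2)
```

With `max_depth=2`, channel `E` is never reached because `D` already sits on the last allowed layer. Channels that list nothing in their Channels section become leaves, and channels that nobody lists are never discovered at all.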




It's possible that YouTube's current recommendation algorithm actually is close to optimal for maximizing watch time / ad revenue, even if it isn't optimal for finding videos related to ones you've watched already.

I would assume that Google has tried various tweaks to their algorithm and that it is the way it is for a reason.


Yes, this is the real answer. What you want to watch, or what you would like youtube to recommend to you, is at odds with what youtube would like you to watch (and potentially share), which is enraging, clickbaity, low quality material.

As an n=1 data point I'd like to add that I watch youtube mostly for music, and in that sense the algorithm is not so bad (since there isn't much room for "viral" clickbait), and my only complaint is that it usually shows me the same videos over and over, but it still shows me related videos that I usually like. Whenever I commit the mistake of clicking on a non-music video, my recommendations get polluted for something between one day and one week.

I never log in, so if for some reason my cookies get deleted or corrupted I suddenly get a fuckton of shit with celebrities, "outrageous" political opinions and so on. The recommendations don't get back to normal until at least 3 or 4 weeks. This has happened 2 or 3 times, and it's so infuriating that I wonder if youtube does it on purpose to nudge me into logging in and have a view history associated to my account.


Is that what it's detecting you might like due to having profiled you based on other aspects of your google identity unrelated to that cookie reset? I think if you were a teen, for example, you might get skateboarding videos or Billie Eilish or certain games or whatever teens do or like these days, rather than "X TOTALLY DESTROYS Y" (two people in suits and ties in a political debate).


I don’t log in to YouTube on any of my devices and the same thing happens to me if the cookie ever resets. You can kinda see what “real” YouTube is by just opening a private tab. After you click on something it tries really hard to funnel you into a specific demographic. I’ll see videos I’ve watched before as it’s testing the waters on what I will click on.


If I hit reload on the home page I get the same 8 videos recommended to me that I've been intentionally not watching for two weeks. Where's the ad revenue in that?


This is the answer. Before, you always had a list of related videos next to a video; now it's just personalized recommendations. That behavior confirms it.


I hate how Related Videos has been reduced to 1 related video and then a lot of rubbish.


Agreed. There are just too many variables at play, and since we have no information on the YT algorithm, we cannot assume anything.


You can make a great recommendation algorithm, certainly better than YouTube's, if you use different criteria for success. YouTube's criterion is to recommend whatever causes you to watch a lot of YouTube.

It could be that most people's goal in life is not to watch the maximum amount of YouTube.

One thing I've wanted since I saw a prototype Joe Edelman made for Chrome is to prompt the user, after a video is played, for a rating and a reason for the rating. Joe had a taxonomy of human values for this, but it could also be a freeform tag-based system.

Then when you go to YouTube next time, you say that the reason you're at YouTube is to "increase my knowledge of machine learning." Or maybe it's Friday evening and you just want to "make me laugh."

Most people probably won't pick "make me outraged or scared", but that does give good engagement metrics...
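
A rough sketch of how the rating-plus-reason idea above could drive recommendations, assuming a freeform tag system. The feedback rows, the 1 to 5 rating scale, and the `videos_for_intent` helper are all made up for illustration; nothing here reflects Joe Edelman's prototype or YouTube's systems.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (video_id, rating on an assumed 1-5 scale, reason tag) -- illustrative data only.
feedback: List[Tuple[str, int, str]] = [
    ("ml_lecture", 5, "increase my knowledge of machine learning"),
    ("standup_clip", 4, "make me laugh"),
    ("ml_paper_review", 4, "increase my knowledge of machine learning"),
    ("standup_clip", 5, "make me laugh"),
    ("rage_debate", 1, "make me laugh"),
]


def videos_for_intent(
    feedback: List[Tuple[str, int, str]],
    intent: str,
    min_rating: float = 3.0,
) -> List[str]:
    """Videos whose average rating under `intent` is at least `min_rating`, best first."""
    ratings: Dict[str, List[int]] = defaultdict(list)
    for video_id, rating, reason in feedback:
        if reason == intent:
            ratings[video_id].append(rating)
    averages = {v: sum(r) / len(r) for v, r in ratings.items()}
    keep = [v for v, avg in averages.items() if avg >= min_rating]
    # Sort by descending average, then by id so ties are deterministic.
    return sorted(keep, key=lambda v: (-averages[v], v))


laugh = videos_for_intent(feedback, "make me laugh")
learn = videos_for_intent(feedback, "increase my knowledge of machine learning")
```

The stated intent acts as a filter, so a video that scored badly for "make me laugh" is not surfaced for that goal even if it would generate engagement.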


if anyone here works at YouTube i'd actually genuinely like to know if it is an explicit goal to make people "watch the maximum amount of YouTube" (or something spiritually similar). It sounds unfathomably wrong to me to maximize for this, on this side of the fence. you couldn't pay me enough to work on something like that.


I don't work at YouTube, but I'd like to play advocate for the devil and see how far I'll get.

Premises:

1. YouTube's primary function is entertainment. Many people also derive educational and informational value from it, but that's incidental. Wikipedia and Google News probably serve that purpose better.

2. The most straightforward way to measure how well YouTube serves as an entertainment channel is viewer minutes. There are drawbacks to measuring per-session or per-view minutes (short term outrage spikes can backfire later), so it's better to maximize life-time viewer minutes. These can't be measured directly, but need to be modeled/predicted. But it's a refinement of viewer-minutes, not a fundamentally different metric.

Contention:

Maximizing viewer lifetime minutes is the best practical way for YouTube to serve its primary function.


The non-premium user is the product, on Youtube. They're selling your eyeballs to advertisers unless you're a Youtube Premium subscriber, but even then they use the same algos because they want to maximize engagement. It's the same tired BS as Facebook.


I don't work for YouTube.

But what if I told you that 30% of users who stop watching YouTube go on to use a different media service, even though they could have watched exactly the same content on YouTube? Would you feel comfortable informing these people that they could watch on your site?


given these terms then yes, sure. i don't feel it's a realistic scenario though.


That's the sort of scenario that would be presented to you. If it happened to engage some people that would otherwise have spent some time with their family, that would not be mentioned. I didn't have to argue very hard to get you to agree to this job.


1) Someone says "look at this idiot" and sends me a link to a video. I watch it, and reply to the person "yep, the person in the video is an idiot". The video website has no option for me to say "don't show me any more of this kind of stuff".

2) COPPA means youtube doesn't allow under 13s to have an account. But those children want to use the like and subscribe and notification bell. This means that youtube gets all of my viewing and all of my child's viewing, but can't tell the difference.

3) There's no way to tell youtube what I like. I watch tiny channels doing original songs and covers. Youtube thinks that's music, and so pushes general chart shit at me. Or it thinks it's some genre of music and pushes huge channels that are roughly that genre at me. I have no way of telling YT that it's the small channels with fewer than 1000 / 10000 subs that I want to see.


> The video website has no option for me to say "don't show me any more of this kind of stuff".

There are actually two different actions you can take on recommended videos that help. Just click the 3 vertical dots under a thumbnail:

"Not Interested" or "Don't recommend channel."

> There's no way to tell youtube what I like.

I can't imagine a scenario where there's a feasible way to "tell" YouTube exactly what recommended videos you want other than what it's doing now unless the content creators themselves were all on board to properly title & tag their videos. Bigger channels use clickbaity titles (eg. "I was shook when I heard this"). And smaller channels don't know how to utilize keywords.

I personally have never had an issue with YouTube recommendations, especially after using the above method to filter out things I don't want to see.


Both features are broken. "Not interested" does nothing aside from temporarily hiding the video. It's not a signal for "I'm not interested in this category of videos". These videos keep coming back anyway and the problem gets much worse if the same content is uploaded to multiple channels. There are videos I've blocked at least five times. "Don't recommend channel" is a cruel joke. It works at first but it gets reversed when you view a single video from that channel. There's no way to enable it permanently. Sometimes I like certain videos from a particular author but dislike the others and don't want them recommended to me, and I can't click on a video added to my favourites without reversing the channel block.


If you want a channel gone from your recommendations, subscribe to it, then watch a video, then unsubscribe. At least for me, that makes the channel never show up even if you watch videos from that channel later.


Additionally, you can go to your history and choose "Remove from Watch History". I've found this can do a good job keeping certain "highly 'viral' videos" from infecting your recommendations.


For point one, give a thumbs down to the video? One would imagine that is used as a negative recommendation signal. Or delete the video from your watch history.

For point two, I don't think you're correct. Kids under 13 can have accounts, but only with parental approval.

For point three, you're totally right. But if you think through the implications, every aspect of that seems like an incredibly hard problem to solve. And at least my expectation is that you wouldn't get most users to actually interact with that system, since it is too complex. The perfect example of a power user feature that would get killed a few years later as expensive to maintain and little usage.


> For point two, I don't think you're correct. Kids under 13 can have accounts, but only with parental approval.

Sure, but how many parents are going to go to the trouble of setting up a separate account? Much more likely they'll just let the kid use their account.


Can you show me where on Youtube I can set up an account for an under 13 year old?

https://support.google.com/families/answer/7124142?hl=en

> When you use Family Link to create a Google Account for your child under 13, your child can use the YouTube Kids app where it’s available. However, they can't use any other YouTube apps, websites, or features until they turn 13 and manage their own Google Account. Your child will be able to use YouTube if you added supervision to their previously existing Google Account.


It seems to me that the last sentence you quoted (on accounts that had supervision added to them) has to be talking about use of YouTube generally, not just YouTube Kids.

If they did behave the same way, why list this as a special case but with totally different language?


You can also click on the vertical ellipsis menu -> Not Interested, and tell them the reason. My recommendations on youtube are extremely on point.


You can give feedback by clicking thumbs up/down or selecting "not interested".


Thumbs up / down has no impact on what videos Youtube pushes.

Not interested and don't recommend channel only removes that channel. It doesn't do anything to remove all the other similar content, and it doesn't do anything to add more relevant content.


Please cite a source. It seems like it would be an obvious signal to use in recommendations.


I'm unsure how popular or unpopular this opinion is, but I've been very happy with the current recommendation algorithm. Sure, every now and then something stupid pops up, like today it recommended me the channel of a software engineer where all his videos are of him complaining about his employers, family and how he has it more difficult than anyone. But beyond those hiccups, I've found great channels, videos and even music. I could be the exception but idk...


I find all my new music from YouTube recommendations. I’ve never had Apple Music or Spotify (when I briefly used it) recommend me something new that I liked.


Recommendation algorithms themselves are the problem. It's a net negative. They encourage you to turn your brain off and follow wherever they take you. When you do this with your own thoughts, we consider that a form of mental illness. The inability to engage in executive control over where your mind wanders, resulting in endless chasing of tangents without any purpose is practically the definition of schizophrenia. Yet this is precisely what recommendation algorithms encourage. Better to step back, think about the things that interest you and you actually care about, then directly search for those, with perhaps keyword tagging related videos displayed at most.


We don't have much control over what our desires are, even if we can focus our thoughts on how to attain them. In this sense, we are only superficially different from the schizophrenic mind you describe.


>Then my software will look at the Channels section of each channel to find other channels there. This has the effect of building a deep tree of different channels related to each other through the Channels section. The user will only have to specify how many layers of the tree he would like to have so the algorithm can stop there.

Your proposed algorithm wouldn't work for me because the majority of the good videos I'm exposed to in education/science come from channels that are not listed in the channel sections of my subscribed list.

E.g., here's a channel about machine learning that doesn't list any other channels: https://www.youtube.com/c/YannicKilcher/channels

Here's another machine learning channel that doesn't reference any other AI channels: https://www.youtube.com/c/K%C3%A1rolyZsolnai/channels

Neither of the above channels reference each other but both have videos that are relevant to my interests.

Because the Youtube algorithm doesn't depend on building a "channel tree" from whitelisted channel listings, it can suggest quality videos from both of those channels that your algorithm would miss.


I'm a huge fan of the Youtube recommendation algorithm, and I find it works extremely well for me.

Also as a person working on recommendation algorithms at a large competitor, I would say Youtube does a pretty good job overall.


> I'm a huge fan of the Youtube recommendation algorithm, and I find it works extremely well for me.

This fascinates me. I'd guesstimate that roughly 60% of the recommendations I get from YouTube are for videos I've already seen. (To be fair I "watch" a decent amount of music on YouTube, so maybe I'm disproportionately likely to rewatch videos I've seen before.) Another 30% is content based on misinterpretation of my political beliefs, like "you watched a critical analysis of a Prager U video, maybe you'd like these 6 Trump-stanning videos from Newsmax".

The political thing seems like an obvious blindspot in the YouTube recommendation algorithm to me. Going all the way back to 2015 YouTube seems unable to distinguish between "critical of establishment Democrats" and "intellectual-dark-web / pro-Trump" interests. I'll watch one lefty "breadtube" video that's critical of someone like Biden or Pelosi or complimentary to someone like AOC or Sanders and get Newsmax/OAN/Fox-and-Friends video recommendations for a week.

More generally it seems to me that YouTube's recommendation engine is far too volatile and heavily influenced by recently watched content. A one-off, slightly out-of-character view will noticeably taint my recommendations for a week or two.

In contrast I'd say Spotify, Google Music (before it was folded into YouTube), Netflix, Twitter, Amazon Prime, or even Pocket-via-Firefox does OK - not great but acceptably well - so I don't think I'm especially inscrutable. I'm curious about your experience with this. I have moderately broad interests, I feel like it shouldn't be hard to come up with something I would find interesting but it virtually never seems to happen organically on YouTube.

I've always just assumed they were optimizing for something other than things I would consciously find interesting or engaging. For example I'm guessing that triggering "outrage" is probably a good way to drive engagement (comments, shares, etc.). I don't really do either of those things regardless but I'm pretty sure that strategy works in general.

Do you find that YouTube frequently recommends novel content you enjoy? Is that even what they (or your company) are trying to do?


Yes I find YouTube typically recommends new content for me that I enjoy.

No I don't get too much political content, although when I see political content I generally tell YouTube that I am "not interested".

I don't work there so I don't know what they are optimizing for. Based on their papers it is probably a combination of total watch time, total interactions (likes, comments), and minimizing certain kinds of "integrity problems" like views on eventually-removed content.

Yes, my company, and I would guess most others, are trying to recommend content that people enjoy and that provides long term value for the viewer, not just short term watch time. It's hard to do this for everyone, but ML-based recommendation systems are better than any heuristic for 95%+ of people.


I have no idea why people complain about the recommendation engine that much. I either ignore it or, if there is something particularly stupid, I say stop recommending. I stick to my subbed videos, and I don't have to care.

I just think it's trendy to hate on things even when they give you tools to tweak their behavior.


How does your recommendation algorithm idea handle channels that have no other channels listed in their Channels Section? How does it handle suggesting small or new youtubers who likely don't have themselves in anyone else's Channels Section?

I just did a small random sampling of my own subscriptions, and I'd say about 30% or more have no channels listed in their Channels Section. And of the ones that do, it's all much larger youtubers in their own circles that I'm usually already aware of.


I’m not an ML expert, but my understanding is that a model is going to spit out a list of recommendations, each with a score. As you design your model, you need to figure out what it is you're optimizing for.

I understand your approach to be more of a heuristic: build a tree-like structure by traversing through the channels for each channel.

If I’m interested in A, how do you determine where to start traversal in this tree? And how do you pick out a set of recommendations and rank them?

I have a cursory understanding of ML. In addition to finding the relevant entities, you need to rank them.

An important question is, “what are you optimizing for?” For YouTube, presumably they want to optimize for watch time. If they want to suggest ads, that model might want to optimize for maximizing revenue. I’m going on a tangent here, but when we say “YouTube’s recommendations are bad” we should keep in mind YouTube might be optimizing for revenue... which isn’t the same as optimizing for seconds watched or optimizing for clicks.
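
To make the "what are you optimizing for?" point concrete, here is a small sketch in which the same candidates are ranked under three different objectives. The `Candidate` fields and all numbers are invented for illustration; they are assumptions about what a model might predict, not a description of YouTube's actual signals.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    video_id: str
    p_click: float             # assumed predicted probability of a click
    expected_minutes: float    # assumed minutes watched if clicked
    revenue_per_minute: float  # assumed ad revenue per watched minute


def rank(candidates: List[Candidate], score: Callable[[Candidate], float]) -> List[str]:
    """Order candidates by the given objective, highest score first."""
    return [c.video_id for c in sorted(candidates, key=score, reverse=True)]


def by_clicks(c: Candidate) -> float:
    return c.p_click


def by_watch_time(c: Candidate) -> float:
    return c.p_click * c.expected_minutes


def by_revenue(c: Candidate) -> float:
    return c.p_click * c.expected_minutes * c.revenue_per_minute


candidates = [
    Candidate("clickbait", 0.30, 2.0, 0.010),
    Candidate("long_lecture", 0.05, 40.0, 0.004),
    Candidate("product_review", 0.10, 10.0, 0.020),
]

clicks_order = rank(candidates, by_clicks)
watch_order = rank(candidates, by_watch_time)
revenue_order = rank(candidates, by_revenue)
```

Each objective puts a different video on top, which is why "the recommendations are bad" depends on whose objective you measure against.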


Garbage in equals garbage out.

The foundational error, I feel, is trying to capitalize on people's attention as opposed to aiding in the public good.

Sure when youtube and like services were starting out it was the wild west and all about gaining users and creating traction, however that was a long time ago. We now know where that party was headed and how it ended, and it almost ended society!

At this point any improvements I feel should be focused on minimizing damage and maximizing the public good.

This is the antithesis of anticipating interests as that approach has failed to deliver and even worse has only served to exacerbate echo chambers and divide populations.

Better to gauge interests and make suggestions from a large and vetted list of diverse sources as a primary ingredient of a larger cake of suggestions with actual user defined interest suggestions being the subtle and minimal spice.


I would say it was good at one point. They used to just recommend videos based on whatever tags the video had, or the title and description.

This led to much more relevant results than what it does now, it seems, especially when looking for more niche topics.
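
A small sketch of what tag-based "related videos" could look like, using Jaccard similarity between tag sets. This is an assumed stand-in technique chosen for illustration; the thread does not say how YouTube's older system actually matched tags, and the catalog below is invented.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related_videos(current: str, catalog: Dict[str, Set[str]], top_n: int = 3) -> List[str]:
    """Other videos that share at least one tag with `current`, most similar first."""
    current_tags = catalog[current]
    scored = [
        (jaccard(current_tags, tags), video_id)
        for video_id, tags in catalog.items()
        if video_id != current
    ]
    scored = [(s, v) for s, v in scored if s > 0]
    # Descending similarity, then id so ties are deterministic.
    scored.sort(key=lambda sv: (-sv[0], sv[1]))
    return [v for _, v in scored[:top_n]]


# Hypothetical catalog: video id -> tags.
catalog = {
    "lathe_basics": {"machining", "lathe", "tutorial"},
    "lathe_threading": {"machining", "lathe", "threading"},
    "mill_setup": {"machining", "mill", "tutorial"},
    "cat_compilation": {"cats", "funny"},
}

related = related_videos("lathe_basics", catalog)
```

Because the score depends only on the video being watched, a niche video surfaces other niche videos regardless of the viewer's history. The trade-off is that it relies entirely on uploaders tagging their videos well.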


Sometimes I thank yt as it's not like tiktok. Tiktok gets you hooked for hours and addiction starts the next day.

Nowadays notifications are irrelevant: a comment you posted gets a notification, and you start searching for what you actually posted. Those are hard to find.

YT is imperfect and I am fine with that.

What I hate, though, is when a video gets deleted from your playlist. They should have the courtesy to say the title of the video that got deleted. It just blindly says the video is deleted or restricted in your country, and I have no clue what video that was.


> On my side, I'm currently working on a new way to do so. I called it Channel Tree. The goal is to start out with a list of YouTube channels provided by the user.

Feedback from someone who is vaguely disappointed in YouTube's recommendation algorithm: I don't love every video by a given channel. Which videos I specifically enjoyed watching matter quite a bit.


The recommendation algorithm is fine. The problem is that angry, toxic people (the kind that spit bile on HN) love to click on angry, toxic videos. Your youtube suggestions are a reflection of your mind. Youtube isn't therapy.


Depends on the definition of "good". The question's assumption is that it's "good for the viewer/user". YouTube isn't necessarily focused on that definition of good.


The system should be configurable imo. You should be able to adjust parameters for how far-fetched suggestions you would get and whether you'd like to see videos you have already watched, and so on.
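
One way such user-facing settings might look, as a sketch only. It assumes each candidate already comes with a `distance` in [0, 1] describing how far it is from the user's usual interests; that score, the `RecommenderSettings` fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class RecommenderSettings:
    max_distance: float = 0.5      # how far-fetched a suggestion may be (0 = very close, 1 = anything)
    include_watched: bool = False  # whether already-watched videos may be suggested again


def filter_candidates(
    candidates: List[Tuple[str, float]],
    watched: Set[str],
    settings: RecommenderSettings,
) -> List[str]:
    """Keep candidates allowed by the user's settings, preserving input order."""
    result = []
    for video_id, distance in candidates:
        if distance > settings.max_distance:
            continue  # too far-fetched for this user
        if not settings.include_watched and video_id in watched:
            continue  # user asked not to see repeats
        result.append(video_id)
    return result


candidates = [("a", 0.1), ("b", 0.4), ("c", 0.9)]
watched = {"a"}

conservative = filter_candidates(candidates, watched, RecommenderSettings())
adventurous = filter_candidates(
    candidates, watched, RecommenderSettings(max_distance=1.0, include_watched=True)
)
```

The defaults hide repeats and far-fetched suggestions; loosening both settings lets everything through.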


Start backwards: what metric of "goodness" are you using? Then, once you have a goal in mind, you can decide on the best way to get there.


like others have said, the goal of the YouTube algorithm is to get people to watch YouTube. this set of incentives is similar but still very different than maximizing "worthwhile" content (however you may define that)


Do you have a page set up for this project yet (or even an email) where you can take suggestions about features and ideas? Fixing the many obvious shortcomings of the YouTube platform seems like one of the biggest no-brainer projects out there, so much low hanging fruit for the picking.


Define "Good". It could be different for everyone.


Well, in my personal use case, what I would like youtube to recommend me is new and upcoming movie and television trailers. I love watching trailers! They have enough plot in them that it's almost like a mini movie or short story. And also I like to know what new shows and movies are coming out so I can watch them later or read the books they were based on. I've subscribed to movie trailer channels and like, watch and upvote them all the time. Yet youtube constantly recommends me other nonsense and I have to direct-search for movie trailers. Maybe I haven't trained the algorithm to show me movie trailers, or maybe youtube is smart enough to KNOW I will now hunt and search for the movie trailers on my own, so they will only recommend me other things? I click "do not show me this" and "I don't like this" on the other videos.

Anyways, it's my personal youtube white whale and I haven't made any progress, so suggestions welcome!


It’s an actively adversarial environment.



