Twitter's Recommendation Algorithm (blog.twitter.com)
1700 points by jonknee on March 31, 2023 | 1185 comments

Context: I teach at Princeton and study social media and recommendation systems.

From a very quick skim of the repositories, this appears to be quite limited transparency. The documentation gives a decent high-level overview of how Tweet recommendation works—no surprises—and the code tracks that roadmap. Those are meaningful positive steps. But the underlying policies and models are almost entirely missing (there are a couple valuable components in [1]). Without those, we can't evaluate the behavior and possible effects of "the algorithm."

[1] https://github.com/twitter/the-algorithm-ml

I work on Google Assistant Suggestions and I don't think it's very practical to open-source an algorithm like that including the models and the underlying data. Both of them can live in separate services and be frequently updated.

I am assuming that open sourcing the code aims to increase transparency about the business logic of the ranking decisions. At the same time you don't want spammers to be able to easily run experiments against a cloned version of your system.

> But the underlying policies and models are almost entirely missing (there are a couple valuable components in [1]). Without those, we can't evaluate the behavior and possible effects of "the algorithm."

Haven't gone through it yet, but yeah, if that's the case, all this is, is a glorified framework to plug your own in. Not exactly what was promised.

Did you also skim the accompanying (or rather, main) repo, https://github.com/twitter/the-algorithm ?

From a quick clone and line-count, it has:

  235 kLOC .scala
  136 kLOC .java
  22  kLOC .py
  7   kLOC .rs
So I don't think you did, since you posted so quickly and that's a LOT of code.
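
For anyone who wants to reproduce a per-language count like the one above, here is a minimal sketch. It assumes the repo has already been cloned locally; the function name and the default extension list are just illustrative choices, and it counts raw lines (blank lines and comments included), so the totals may not match the figures quoted above exactly.

```python
from collections import Counter
from pathlib import Path


def count_lines_by_extension(root, extensions=(".scala", ".java", ".py", ".rs")):
    """Return a Counter mapping file extension to total raw line count under root."""
    totals = Counter()
    for path in Path(root).rglob("*"):
        # Only regular files with one of the requested extensions are counted.
        if path.is_file() and path.suffix in extensions:
            # Binary mode avoids decoding errors on files with odd encodings.
            with path.open("rb") as handle:
                totals[path.suffix] += sum(1 for _ in handle)
    return totals
```

Calling `count_lines_by_extension("the-algorithm")` on a local clone and dividing each total by 1000 gives kLOC figures comparable to the table.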

I also haven't skimmed this code except very superficially, but perhaps you should since you're out there making statements with your Princeton credentials.

(I posted this comment with the heads-up a few minutes after your comment above and then expanded it as you didn't respond.)

I think you misunderstood. He's saying the training models are not there.

For example, MostRecentCombinedUserSnapshotSource seems to be influential (such as for calculating "tweepcred"), but we can't see how it's calculated.

Wouldn’t that make them easy prey for “spam SEO”? However, given the framework, isn’t it still possible to guess the models?

The spam SEO issue should have been dealt with and thought about _before_ engaging in the whole adventure, and having to guess how it could work if properly implemented defeats the "open source" spirit of it.

More credit would be given if the very idea of open sourcing the algorithm hadn't already been discussed to death, with predictions of the difficult points and of how it probably won't happen in any sane way.

And then be pilloried for not doing it, or not doing it fast enough. Damned if you do, damned if you don’t.

I’m starting to think the problem with Elon is mostly personal; he’s just a proxy, and wrong by default.

(Not that I approve of his behaviors, but I can’t enjoy this whole mobbing that he’s getting. Not that he cares, so I’m not worried he’s getting traumatized in any way; it’s just how it’s become an identitarian trait for a certain group that irks me.)

Makes me wonder if a way to override people SEO hacking the algorithm is to create a market of open-source algorithms that each individual can choose and then it's not trying to hack THE algorithm but having to hack many and not knowing which algorithm an individual is using.

You don't have to target each 'algorithm' all at once. You can target them one at a time. Hell, you can run A/B tests to single out the easiest targets.

Yes, but right now 100% of the users are using the one algorithm (or chronological). If one doesn't know what percentage or which people are using which algorithm, it becomes harder to know which ones to try to hack to have the biggest result.
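
The trade-off being argued here can be put into a toy calculation. This is only a sketch under stated assumptions: the algorithm names and user shares below are made up for illustration, and it ignores the cost of attacking algorithms one at a time, which is the objection raised above.

```python
def reach_of_gaming(shares, targeted):
    """Fraction of users reached when only the algorithms in `targeted` are gamed.

    `shares` maps an algorithm name to the number (or share) of users on it.
    """
    total = sum(shares.values())
    return sum(shares[name] for name in targeted) / total


# Today: one algorithm, so gaming it reaches every user.
single = {"the_algorithm": 100}

# Hypothetical market of algorithms with made-up user shares.
market = {"algo_a": 50, "algo_b": 30, "algo_c": 20}

full_reach = reach_of_gaming(single, ["the_algorithm"])
partial_reach = reach_of_gaming(market, ["algo_c"])
```

With these made-up numbers, gaming the single algorithm reaches all users, while gaming `algo_c` alone reaches a fifth of them. An attacker who does not know the shares cannot even tell which target is worth the effort, which is the point being made; the reply above is that nothing stops them from working through the list.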

What about these? https://huggingface.co/Twitter

Those look older to me. They all have last updated dates for October and November 2022.

FB open source algo looks much better, right? /s

Is it valid to focus on tracking a Dem/Rep split when that split is an exclusionary design for many Americans? Or is it not exclusionary in your belief? I'm curious about a social science perspective.

Ignoring the global nature of Twitter for a moment.

So why did they opensource it?

So they could pretend to be open. It's the "Open"AI model. Open-washing?

This is a very cynical take. They should be commended for publishing recommendation code at all, which no other major social network does.

Well if they say “we will open source the algorithm” and then what they really open source is a little bit of slightly relevant code that doesn’t allow us to understand the algorithm, then what we can deduce is that they are trying to weasel out of public commitments.

I can’t say for sure if that happened, but if they made a clear promise and then did something else, it’s perfectly reasonable to call that out.

Devil's advocate though: imagine you were to open source (probably with quite a short deadline) some 'algorithm' used in whatever you work on, but the rest should stay private; how would you go about that?

I don't think it's easy, there's inherently some interface(s!) where it's a hand-wavey 'get the thing from the private bit', and defining that sensibly is hard, and if you try to do it well will probably lead to a lot of meetings, scope creep, etc. - and as far as that goes it's not easy anyway, since it's highly technical and implementation-specific yet also a management/policy decision to make.

It depends on what your goal in open sourcing is. Are you looking to provide a base for others to build software on, and to provide a way for others to contribute back to your code? Then publishing the code makes sense.

Are you looking to build public trust in you and your organization? Then dumping a bunch of code with no context isn't going to help much, as it's not code but behavior that builds or destroys trust.

Are you looking to lean into a polarized partisan environment, pushing a narrative where it's you and your supporters against an unfair group of "others"? Then a big splashy move high on symbolism and low on substance that will inspire lots of high profile, divisive media coverage is a great way to go.

If you were doing it in good faith, you wouldn't need to publish the actual code. Most likely you should publish an article and a flowchart explaining how the algorithm works. Publishing a partial chunk of code just creates a story that supporters who don't understand can parrot that "they opened their algorithm".

Exactly. Publishing what they have is the worst of both worlds - hopefully people will create flowcharts based off it, though, although it sounds like there will still be a low level of accuracy.

I still hear reverse-FUD about nvidia supposedly fully open-sourcing their Linux driver, when in reality they opened a tiny kernel portion of it that allows the main proprietary blob to connect to necessary kernel interfaces. You have to call out this bullshit when you see it.

Wait, what? AFAIU what you say is true, except for the part where the “main proprietary blob” does not run on the CPU. This isn’t as glorious as an actual open-source driver would be, but it does have meaningful advantages: e.g. you now have a ghost of a chance of implementing Nvidia GPU support on a non-Linux kernel, by uploading the GPU-side blob and rewriting the CPU-side shim as required. Or is the blob license-restricted from being used like that?

The "main proprietary blob" they're talking about is the userspace portion of the driver; the portion which does all of the heavy lifting. That definitely runs on your CPU. The only part they open-sourced is the kernel portion of the driver, which just exists to facilitate communication between the userspace driver and the hardware.

Hey, we can get even more cynical. Why should we trust that this code is even similar to what they run in production currently?

I can't imagine deliberately special casing Elon's account in something they made from scratch to fool people.

Let's have reasonable goals, shall we? "Their shit doesn't stink as bad as others'" is nothing commendable, especially after such publicity.

I say "why not both". Even if they are doing it only for good PR, we encourage it by giving them praise, because we should encourage things we want. (While remembering that they are not our friend, they are an entity we should pressure, and the way we pressure is by giving praise when they do things we like, and criticism when they do not.)

I’d give them more credit if they’d been honest and kept it secret than lie to my face and pretend they didn’t.

They should be commended for open sourcing something they don't understand because they fired all of the people who understood it? Elon admitted as much.


Because the way he acts gives people every right to. I agree that he may be misrepresented, but if he is, then he has to shoulder at least some of the blame.

The question is: are they right?

Any time a billionaire buys a media company it's bad for the health of democracy.

Not necessarily. What if the media company was bad for the health of democracy, and the billionaire's incompetence destroys the company's social standing and thus its ability to do more damage (even in the billionaire's own interests)?

Yeah, have to wonder how many people, if they had the money, would want to buy out Twitter just to wipe it out. Doesn't a huge chunk of HN hate Twitter and wish it were dead?

(Regardless I think that would be useless in the long run, since the millions of stranded users will still want another Twitter-like platform. And Twitter imploding without a designated archive will wipe out a tremendous amount of digital history.)

A lot of his decisions look pretty incompetent on the surface: how could he not see that charging for verification devalues the system by handing it to whoever has the money?

Instead it could just be an intentional ploy to completely devalue Twitter disguised as incompetence. He can justify firing employees and charging for API access/verification as money-saving strategies, even if they're terrible strategies that have little chance of succeeding. And he could make enough people believe he's an idiot who makes things up as he goes rather than someone specifically driven or apathetic enough to run Twitter into the ground. Not to mention he was forced to buy them after changing his mind. Almost feels like a "so that's what happens" response.

I wonder how higher powers would be able to distinguish fake incompetence from real incompetence. Would they care how Twitter as a private company ends up if it's the case that it implodes from its own legitimately bad business decisions? It reminds me of how employers won't directly fire employees for discriminatory reasons, instead they make the employees' lives miserable so they're compelled to leave on their own, thus they escape scrutiny.

This is basically at the level of "9/11 was an inside job to bring down WTC 1, but WTC 2 was destroyed in an unrelated but simultaneous terrorist attack."

> Yeah, have to wonder how many people, if they had the money, would want to buy out Twitter just to wipe it out. Doesn't a huge chunk of HN hate Twitter and wish it were dead?

> (Regardless I think that would be useless in the long run, since the millions of stranded users will still want another Twitter-like platform.

If there's not an obvious successor right when it's shut down, a lot of those people might get their habit broken and find something better to do. I know Mastodon was held up as a successor, but it's unclear to me if that's actually capable of scaling to that level.

Mastodon is way too flawed to be anything but a niche tool for tech people and activists. I highly highly doubt such a system can cross the chasm. That doesn’t mean that’s a bad thing though.

Or, he’s as incompetent as he looks.

Can you name one relevant media company owned by someone from the working class?

If you personally own a media company, you are by definition bourgeoisie. But see:


Which is why HN was so incensed about Bezos buying the Washington Post.

And when a highly scrutinised, highly visible billionaire buys it off a different bunch of billionaires which you know little about?

I wasn't referring to him buying Twitter, I was referring to him saying he was going to open source the recommendation engine and then doing it.

I agree billionaires owning media companies is a huge problem.

Do you believe billionaires can do good? Is their existence an existential threat to democracy?

Yes. There are plenty of philanthropic billionaires. Yes. That much money buys a destabilizing amount of influence.

Billionaires are billionaires not by literally storing cash. The rest of the society values their contributions and creations in the companies/corporations they run. Sure, they have some liquidity but the entire concept of resentment towards billionaires is essentially equal to resentment for the betterment of the world. There are some exceptions but for the most part, in a well oiled market, you can't just become a billionaire by fucking over people. See Adani and how it turns out for him: https://www.ft.com/content/5c0b6174-e66d-4fa5-89a5-6da1d69ab...

Every major media company is owned by a billionaire


It's because there was close to zero newsworthy information in them, just nonsense being disseminated by wannabe-journalists.

I encourage you to watch the C-SPAN recordings of the senate sessions where they brought in Twitter employees and journalists to cover what was in the Twitter files.

From your comment it sounds like you’ve been consuming the 30-second sound bites from those hearings and the misinformation spreading around the internet.

A long list of 3 letter agencies were compiling lists of citizens and journalists and sending them to social media companies to review for ToS violations.

There is a very real threat to civil rights here. When this cannon swings around and points back at LGBTQ, racial equality, stopping the war on drugs, etc. this is going to be “not pretty.”

And the hearings covering them were unbelievably shameful. Senators talked past the guests in the room. Refused to abandon their “sick burn” scripts regardless of where the conversation went. Insulted their guests. Went in random directions of questioning that had little to do with the root problem…

At the core of this, 3 letter agencies (seemingly across the board) have decided that it’s acceptable to ask social media companies to prevent citizens from communicating on their platforms by selectively directing the attention of their moderation teams towards individuals. Whether this is legal, or a violation of 1st amendment rights, is for sure an open question.

Only one senator directly addressed that and only briefly by saying “maybe they’re trying their best” - a statement that doesn’t exempt anyone involved from following the law.

Is the government allowed to censor citizens by weaponizing their ToS for selective enforcement and, if the government can do that, where is the line drawn? How specific are they required to be? Can a platform ban all political speech and then only selectively enforce requests from the government without doing their own moderation? How far can we launder the 1st amendment through a public-private collaboration of enforcing ToS?

Honestly I’m not sure what the hearings were really meant for; the government is unlikely to hold itself accountable. At this point I do believe the ball is in the citizens’ court to bring suit against the agencies named in the Twitter files like we did with the presidential surveillance program.

The government requesting that the ToS of a private company be upheld seems rather mild to me. Did we get the reasons for the requests in the released files? Were they trying to reduce foreign propaganda or public health misinformation or something else important?

You like your government trying to tell a private company what's true and untrue?

More than I like a private company telling me what's true and untrue.

You clearly are oblivious as to what they contain


Please don't break the HN guidelines like this. It's not what this site is for, and destroys what it is for.

What's worse, if you have a true point, then posting like this actually discredits the truth and gives people a reason to reject it. That isn't in your interest and in fact hurts everyone.



"Way too negatively"? We're talking about one of the world's most influential people who uses their power to randomly accuse innocent normal people of being pedophiles. There is no portrayal too negative.

This is like FB open sourcing the compiled frontend code you can see yourself using inspect.

If we commend them for this we're helping promote and encourage this faux open source virtue signaling

No, that's very different.

There is clearly a lot of information to share. It's worth considering this could be step 1 of n as opposed to assuming the worst possible intention.

It's healthy to have a normal amount of cynicism. They released it for a reason. "The goal of our open source endeavor is to provide full transparency to you, our users, about how our systems work."

Why be transparent (or try to appear transparent)? To convince people to trust your platform (or to recruit - which seems to be another goal of the post). Why would Twitter want or need to do this now? Well, there is a bit of context. This disclosure doesn't exist in a vacuum.

I love this take. Doomed if you do, doomed if you don't.


I agree, which is why I wonder what your motivation is to defend Twitter. You're posting about this for a reason. If I were a social media company, I'd probably have paid agitators to defend them.

If we are willing to not assume some borderline "it's what they want you to think" conspiracy play, obviously there were always going to be a lot of highly interested and qualified people taking a very close look at this and, at some point, there was always going to be a very definitive conclusion about what exactly they released.

If your play was "it's some source code, hence people will think we are open, and that should be really good for us", that would make you a very special kind of idiot in this space.

That was one of Elon’s core statements when he first talked about buying Twitter. If he had gotten it out sooner there would be an easier link between the two, but if you want more context just go read the old tweets and articles from the Twitter vs Elon days.

If we can't build anything with this, is it "source"?

"Does not include batteries"

You must be new to Musk's business practices.

It's no secret that Twitter, like any other social media platform, is driven by user engagement and ad revenues. The more time we spend on the platform, the more valuable it becomes for them. With this new open-source algorithm, they're essentially crowdsourcing improvements to their system to better serve us the content we crave.

This move could be seen as a strategic PR play to boost their public image amidst the growing concerns around algorithmic bias and lack of transparency. By inviting the community to collaborate and address these issues, they're not only shifting some of the responsibility onto the users but also deflecting potential criticism.

Because they let go many of the engineers working on it?

No one has mentioned this before, and I don't know if it's really related, but afaik the European Union is thinking about requiring social media platforms to be more transparent when it comes to recommendations etc. If you can already say "hey we have a lot already online!" then maybe the laws will become less strict.

Because he has no devs anymore and thinks the community will fix it for free.

PR and it was already leaked last week.


> But the underlying policies and models are almost entirely missing... Without those, we can't evaluate the behavior and possible effects of "the algorithm."

And neither can spammers find and test the cracks and edge cases that would allow them to break the system. That does sound reasonable to me. If they were public there would be an arms race between spammers/those wishing to game the system and Twitter engineers.

Then don’t pretend to release “the algorithm.”

They’re explaining how it works without giving the specifics. Much like the US military explains how the nuclear deterrent works without disclosing detailed plans and control codes.

It's an open algorithm, but it's not open data! (joking)


imagine thinking you need to read every file in a project to understand the architecture and which pieces are important for specific functionality you're looking to understand. Have you ever picked up a bugfix ticket for some code you didn't write?

It's fast to read stuff when you have the domain knowledge. The weights won't be a 5 kB Scala file: they'd probably be a big binary file, which is easy to search for on GitHub or locally after cloning.

Otherwise, if they are provided, someone in the thread will surely point to them.
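
As a rough illustration of the "search locally after cloning" step, the sketch below lists files above a size threshold in a local checkout. It rests on the assumption made in the comment above, namely that model weights would ship as large binary files; the function name and the 1 MB default are arbitrary choices for illustration, and a large file is only a candidate, not proof that weights are present.

```python
from pathlib import Path


def largest_files(root, min_bytes=1_000_000):
    """Return (size, path) pairs for files under root of at least min_bytes, largest first."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        # Skip git's own object store, which always contains large pack files.
        if ".git" in path.relative_to(root).parts:
            continue
        if path.is_file():
            size = path.stat().st_size
            if size >= min_bytes:
                hits.append((size, path))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)
```

An empty result from `largest_files("the-algorithm")` would be consistent with the claim that trained weights were not published, though it would not rule out weights stored in some other form.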

You missed this in your rush to display your newly acquired sarcasm101 skills:

  "Skim": To read quickly or cursorily, to glance over, or to omit details in order to get the gist of something.

Context: I studied at Oxford

Fair point, I missed that when I skimmed OP's comment.

class project, 200 students, 1500 LoC each. Time for grading.

there are contexts in which this may be well practiced.

We should really all just bow in awe as we are clearly inferior.

Princeton has a Code Reading 101 that all postdocs/professors must take, however in exchange for the Secrets of Speed Reading you must acknowledge every message with where you learnt those skills.


The context is relevant for indicating that they’re familiar with the problem and have thought about these issues in depth. It’s also useful for not being accused of hiding their identity if someone thinks they have an unmentioned agenda. Argument from authority is bad when it’s of the form “I am an expert, therefore you shouldn’t question this claim”, not when it’s used to provide an identity to a previously-unknown name while also providing a cogent argument and supporting evidence.

What did you expect?

I don’t know if the parent’s expectations matter here. This is more about making sure others don’t misunderstand the meaning here.

Good point. I didn't see it like that. Thanks!

Can I audit your class for free?

It's disappointing the comments are so obsessed with the political angle to this that there's a total lack of appreciation (or discussion) of opening up the most influential social media platform in the world.

This is transparency theatre, not actual transparency.

There's no way to actually use this limited release to understand how or why any tweet is boosted, so we're in exactly the same boat we were in yesterday.

This sentiment correlates highly with drawing conclusions from a very time-limited information set. This isn't the only part that is going to be posted to GitHub.

What is the net benefit from rushing to condemn something that can only be a net positive compared to the past alternatives? I don't understand the purpose of that approach. Help me.

> can only be a net positive compared to the past alternatives

This seems to be unsubstantiated. Are you really claiming that selective disclosure is always superior to complete lack of transparency?

The degree to which it is selective has yet to be determined.

Are you claiming total ignorance is superior to partial revelation? I think we would all do ourselves better to go live on a desert island and abandon everything about modern life. A shovel might be useful to bury our heads while we're there.

> Are you claiming total ignorance is superior to partial revelation?

I am claiming that this is at least sometimes true, yes. Not always, but sometimes.

You're the one claiming that partial revelation is always, without exception, superior to total ignorance. That seems unlikely. Propaganda is often partial revelation; are you saying it is always better to receive only propaganda than to receive no information at all?

I think there is objective value to understanding propaganda's origins and goals. Those who do not study history are doomed to repeat it. Those who do not understand propaganda are highly likely to be controlled by it.

Propaganda can be a pretty vague term by the way. Can you describe how the public coming to a better understanding of the inner workings of the world's most influential social media site is merely propaganda?

> I think there is objective value to understanding propaganda's origins and goals

Of course, but this requires the propaganda be contextualized, which wasn't a part of the situation I was suggesting.

> Can you describe how the public coming to a better understanding of the inner workings of the world's most influential social media site is merely propaganda?

You're begging the question.

Does Twitter's disclosure of parts of the algorithm used (but certainly not all!) actually lead to a better understanding of the inner workings of Twitter? Or is such a release actually serving some purpose beyond transparency?

If we had that, I'd agree it would be good. But I'm not convinced we do.

Elon's done the exact same thing before at Twitter with his selective disclosure of material to friendly journalists in the "Twitter files". That, in my opinion, led to an overall worse understanding of Twitter's actions, not better.

Then the analogy fails to hold up to reality. There's plenty of information to contextualize. Assuming by default everyone but you is a naïve doe lost in a forest and therefore lacks the intellect to contextualize anything for themselves is undemocratic.

>But I'm not convinced we do.

That would likely be because, as mentioned previously there is still more to come out; it's not possible to be reasonably convinced yet. Going back to the original question:

>>What is the net benefit from rushing to condemn something that can only be a net positive compared to the past alternatives?

> There's plenty of information to contextualize.

There's nothing, that's what the point of this thread/argument is. The portion that Twitter has opened (or even simply committed to opening) is not remotely enough to hold them accountable.

The code they've released here is less helpful than a single Helm chart.

>>>What is the net benefit from rushing to condemn something that can only be a net positive compared to the past alternatives?

Because this is useless, and worse yet, it's pointless and blatant virtue signalling. I stood up in defense of Musk's private bid for Twitter, but there's nothing worth licking his boot over here. The suits don't care. The Open Source community gains nothing. The users will never see, interact with or modify the recommendation code. Nobody will be able to meaningfully audit anything until Musk stops selectively burning the books at Twitter HQ.

If Elon wants Twitter, he has the money to go get it. If he wants my respect, he's got to do an awful lot more than making "the algorithm" public. This release is so pathetic that it's probably colored my opinion of Musk more than any of the opinion rags I've seen yet.

What analogy?

> Assuming by default everyone but you is a naïve doe lost in a forest and therefore lacks the intellect to contextualize anything for themselves is undemocratic.

Luckily not a claim I'm making. It's better for conversation if you reply to what I say, not misrepresentations thereof.

> Going back to the original question:

And I replied to it already, but I'll reiterate: I don't believe this change "can only" be a net positive, what makes you believe that is the case?

Yep "Limited Hangouts" are a common disinfo technique

Yes. 1>0

What is "one" and what is "zero"? Are you saying that information can't ever be misleading?

Like if I tell you that your boyfriend has been having secret meetings with some woman you don't know, with full knowledge that the secret meetings are because she's a photographer and he's planning to propose, have I improved things by disclosing the information to you in that manner? Were my actions a "net positive"?

I did find the article more enlightening than the source code. I always suspected the quality of a tweet did not factor in to its promotion, only its engagement, and now they have confirmed this to be the case. Now I understand better why twitter seems to be filled with people angrily retweeting what I consider to be low quality clickbait tweets.

As long as they don’t try to tackle tweet quality at all separately from engagement twitter will remain unappealing to me.
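
To make the "engagement only, not quality" point concrete, here is a toy ranking sketch. The weights and example tweets are hypothetical and chosen purely for illustration; they are not taken from Twitter's code. What the sketch shows is the structure of the complaint: the score never looks at the text of the tweet.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; NOT taken from Twitter's code.
HYPOTHETICAL_WEIGHTS = {"likes": 1.0, "retweets": 2.0, "replies": 3.0}


@dataclass
class Tweet:
    text: str
    likes: int
    retweets: int
    replies: int


def engagement_score(tweet):
    """Weighted sum of engagement counts; the tweet text is never inspected."""
    return (
        HYPOTHETICAL_WEIGHTS["likes"] * tweet.likes
        + HYPOTHETICAL_WEIGHTS["retweets"] * tweet.retweets
        + HYPOTHETICAL_WEIGHTS["replies"] * tweet.replies
    )


def rank(tweets):
    """Order tweets from highest to lowest engagement score."""
    return sorted(tweets, key=engagement_score, reverse=True)


careful = Tweet("A careful, sourced thread", likes=10, retweets=1, replies=0)
clickbait = Tweet("You won't BELIEVE this", likes=5, retweets=20, replies=30)
ranked = rank([careful, clickbait])
```

Under these made-up weights the clickbait tweet scores 135 against 12 for the careful one, so it ranks first even though it has fewer likes. Angry replies and quote-retweets count the same as approving ones, which is one way an engagement-only objective can end up promoting content many readers would call low quality.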

What does "quality of a tweet" mean, how might you measure it?

We pretty much knew this is how all social media works, because a) engagement is what they want, why wouldn't they be optimizing for it, and b) how else might you measure 'quality'? Back when this started, I have no trouble believing some well-intentioned engineers thought that engagement was a good proxy for quality. A bunch of users give it the "like", isn't that a collective assessment of quality? Who is to say what quality is, overruling the users in aggregate?

I agree it has the negative effects you mention; and I've read lots of people writing about this, it's of course not a new observation.

But I agree it's good to have an explanation of what's going on, even when parts of it are what we basically knew was happening on all social media networks. Confirmed is better than "basically knew" for understanding how these things that affect our experiences (and our society) work.

"Quality of a tweet" is impossible for a computer to understand. That's like asking to quantify the value of a work of art just from the image.

But why not shower Twitter with praise for doing more than any other social media company in a similar market position? In the best case scenario, we might inspire a transparency arms race. In the worst case, we merely signal that transparency of any kind is rewarded, and that's a good thing if we want more transparency.

> But why not shower Twitter with praise for doing more than any other social media company in a similar market position?

Because selective disclosure is often propaganda. If a third party had chosen what to release or verified that this is what's actually running in production, I would praise them.

Considering Elon's incessant lying, self-promotion, and manipulation, it's impossible to be complimentary of this at all. Everything he does is in bad faith.

This is definitely incorrect. Weights for images, links, misspellings, etc are all laid out in the code and have been detailed by multiple people on Twitter already.

The funny thing is that angle owes itself to Elon coming through on his promise to open source this.

This is a great thing.

It would also be a total Elon move to confuse the open sourcing of Twitter's internal code with actual transparency.

You would need, at a minimum, a neutral third-party audit of Twitter's servers to conclude that the source code we see on GitHub is, in fact, the source code running Twitter. How often will they keep their GitHub repo in sync with their internal code, I wonder.

Presumably Twitter uses a version control system. But they scrubbed the history so that's also a point against their "transparency" claims. Without knowing the when and the why of changes you can't understand what you are looking at. People are pointing to that "author_is_elon" without knowing whether that was done before Elon bought Twitter or after.

But even then, git history can be faked.

> This is a great thing.

I disagree. It's the opposite. It provides the illusion of openness without the quality of openness, thus killing the debate once and for all.

Just read the article and not the comments. Comments here used to be somewhere you could learn new stuff; apparently that is not the case anymore.

I'm sure we can all think of examples where a power structure (a company, a country, a prison, a family) invited people in for a supervised tour that was less than honest in its presentation.

But really, if people respond to Twitter's actions politically, that response exists within a context that was certainly influenced by Twitter's prior actions.

"Opening up"? You must be kidding. Nothing is open there. It's just open-washing. A few nice diagrams, but how the services _actually_ work is still hidden.

If you ignore the hundreds of thousands of LoC... then yeah I guess it's just diagrams? Are you sure you actually looked at the main code repo?

> Most of the recommendation algorithm will be made open source today. The rest will follow.

So let's wait and see...

I had to go to the second page to find this. I completely agree. I clicked this thread looking forward to seeing some intelligent discussion of the merits of the source and issues/interesting tidbits, instead I see a bunch of Elon Musk ranting and complaining, and people pointing out poor drive-by pull requests and issues. I really wish the adults would start to talk.

I would definitely love a techy community that focused on technology like the hackernews of 10 years ago. If you know of one then please link me to it.

lobste.rs is pretty good, in terms of higher tech discussion signal to noise, I think, though it still does have a bit of a political bent. Email me if you want an invite.

One of the things that makes my spidey sense tingle is when people say oddly sycophantic things about Elon Musk. Twitter is big, it's important. It's not "the most influential social media platform in the world".

I only see one social media site's posts being constantly reposted by global news organizations. Are all of the corporations, world governments, and leaders with tens of millions of followers actually wasting their time by dedicating their social media teams' time to Twitter instead of focusing on some other social media site that's more influential? Which one is it then?

It's quite odd to attribute objective analysis to sycophancy. I intentionally didn't mention him but here you are bringing him up and fulfilling my point. Who is the sycophant?

I have to admit I am geeking out whilst skimming source code I barely understand.

I would love for the system to be somehow auditable, to verify this algorithm is THE algorithm.

Attacks on Elon have been growing since he started calling out corruption.


I would love to know where the OG users went before the reddit invasion completes.

If anyone knows then please tell me!

They had kids, stopped using social media, and/or started charging for their insights.

Besides, if you have to ask, it’s not for you.

I would agree the quality of posts has gone downhill, and is way too political for my liking, but there aren't many other places to go that aren't elitist.

Discord. As much as I hate Discord the platform (and I really do hate it, far more than, say, Twitter), Discord does have a lot of good communities.

I don't think Discord is a substitute for HN. If I know an exact community with a shared interest I want to join then yes, I can find their Discord and join it. But how am I supposed to come across that topic and become interested in the first place? HN (in its ideal form) is a place where I am introduced to a variety of random but potentially interesting technology-related topics, from programming languages to passion projects to lessons in business etc, some of which I will pursue and seek out a community for, and others which I only care about minimally, not enough to want to join a dedicated community for.

If you want to compare HN to Discord, I would say that HN is like joining 500 Discord servers that together comprise the type of content discussed and posted about on HN, and every day checking each one's announcements channel, and skimming through general chats. But you still have the issue of being introduced to new topics, for example if you're interested in the AI trends then your selection of Discord servers that comprise what's posted on HN would have changed dramatically in the past few months.

I've found Discord worse in every way. Almost every server I've been in is highly reactionary to any kind of accidental phrasing. Walking on eggshells is a constant process in any Discord server. It also descends into the same type of problems as any other chat services as it encourages short messages rather than long well-thought-out posting.

It seems to be that way but it's also in the rules not to complain about that. I do wonder if it's getting worse.

It has indeed felt a bit like r/LateStageCapitalism or r/antiwork in here, with so many low quality comments that all read exactly the same. I figured it was just because of the tech layoffs that that sort of sentiment has been gaining traction though.

It's actually shocking to see some of the discussion and understanding of the code here. I'd say the "part-time programmers" qualifier is even generous.

Sad to see.

Great! But nothing is going to change until people realize that the problem is the feedback loop. It's not the recommendation engine itself, it's the fact that there's no way "out" of the feed that the engine produces. It recommends you stuff, you have little choice but to engage with it, and then it trains on that information.

This is the problem with most of social media today. It is a very well known problem in ML [1], but nobody is willing to do anything about it because it's a fundamental UX change. Facebook, Twitter, YouTube, TikTok, they have defined themselves by their recommendation engines.

[1] https://towardsdatascience.com/dangerous-feedback-loops-in-m...
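
For what it's worth, the loop is easy to reproduce in a toy model. This is a minimal sketch under made-up assumptions, not anything taken from Twitter's code: a hypothetical greedy recommender (`simulate_feedback_loop`, with an invented `engage_rate`) that only ever learns from engagement with what it chose to show, and a user who likes every topic equally.

```python
def simulate_feedback_loop(topics, rounds, engage_rate=0.5):
    """Greedy recommender that trains only on engagement with what it showed."""
    shown = {t: 0 for t in topics}
    engaged = {t: 0.0 for t in topics}
    for _ in range(rounds):
        # Recommend the topic with the most observed engagement so far
        # (ties go to the earliest topic in the list).
        pick = max(topics, key=lambda t: engaged[t])
        shown[pick] += 1
        # The user likes every topic equally; engagement is modelled as a
        # fixed expected value so the run is deterministic.
        engaged[pick] += engage_rate
    return shown


print(simulate_feedback_loop(["roosters", "ducks", "horses"], rounds=100))
# {'roosters': 100, 'ducks': 0, 'horses': 0}
```

The user has no real preference here, yet whichever topic happens to be shown first ends up with every impression, because the model never sees evidence about anything it didn't recommend.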

reminds me of a story about a guy who was given a gift, a decorative plate with a rooster on it i think it was. didn’t care for it too much, but out of politeness put it on display on an empty cabinet he had. a while later someone noticed he had it and figured he liked it, so got him a similar decorative plate with a rooster on it. again, out of politeness, he put it next to the old one. now other people started to think he just really liked roosters, and started giving him little rooster statues and knickknacks. Eventually he just has a whole display cabinet of rooster themed gifts that he never really cared for to begin with, but people just assume he likes them because people keep giving them to him.

I had a friend who this happened to with ducks. You would go to her house and there were little duck icons and duck figurines and duck themed things everywhere.

When questioned, it turns out she never cared about or liked ducks. Someone gave her one and then it became the go to present for her for every occasion, for decades.

Decades of ducks.

I am probably part of this problem - an enabler, so to say. I was hosted by a nice lady with a house full of duck themed things (this was in southern England, probably not the same lol), and left her two (rather cute) duck-themed things when I moved assuming that she'd like those.

She was unequivocally happy when I did, but perhaps it was just because of the gesture.

(EDIT: also, the rooster/horse/duck example of feedback loop would make for a terrific blog post)

I have a whole shelf full of horse stuff because of this!~

this is a hilarious example for a serious problem. well done!

and in the end, he starts to like those decorative plates with roosters, or at least thinks a rooster on a plate must be a nice decoration if it is produced that much.

I think Instagram in particular is bad in this regard. It seemingly becomes convinced that I care deeply about the subject of any post that I even momentarily linger on.

Don't you have a choice.. to not engage with it? If you didn't like it then assuming the metrics system is working correctly, this would be negative feedback to the ML model, causing said content to not be shown in the future.

Couple problems:

1. Actively supplying negative feedback is sometimes hidden behind secondary menus, making it much higher friction compared to just...scrolling past. So most users don't spend the effort. Even with a dislike button, it's unclear what the system is learning. It can't know that I don't like this particular video because it's a conspiracy theory, and to stop showing me those. These platforms often don't even support explicit categories, so how would they know?

2. It's extremely high friction to teach the algorithm you're interested in something that it doesn't suggest to you! There's the whole unknown unknowns problem: how do you teach the algorithm you're interested in something that you've never seen before?

I still think Reddit has handled this the best. No system is perfect, but Reddit's challenges are much more manageable than the quagmire that TikTok, Facebook, and YouTube have gotten themselves into. I can just unsubscribe from r/conspiracy, and I'm out. Basically impossible to teach that to YouTube without weeks of careful curation. They think they're smart enough to know what I like, but they're not and never will be.

Eh, 1 is sort of why I see implicit negative feedback as more useful here. Namely, tracking the duration a user is probably giving their attention to a given item, weighted in accordance with how long you'd expect someone to give their attention to an item based on how long it is.

For example, I might see some specific word or pattern of words in a tweet and quickly skip to the next one. That's very low friction but also a powerful signal. There are drawbacks (e.g. the long boring video that looks like something is about to happen but never does), but that's mitigated by combining this with other signals.
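
A rough sketch of what I mean by weighting dwell time against expected attention. Everything here is an assumption for illustration: the function names are made up, and the reading speed of 4 words per second is a placeholder, not a measured value.

```python
def expected_dwell_seconds(word_count, words_per_second=4.0):
    """Assumed reading-time model: longer items are expected to hold attention longer."""
    return max(word_count / words_per_second, 1.0)


def dwell_signal(dwell_seconds, word_count):
    """Ratio of actual to expected attention; well below 1.0 suggests a quick skip."""
    return dwell_seconds / expected_dwell_seconds(word_count)


# A 40-word tweet skipped after 1 second vs. read for 12 seconds.
print(dwell_signal(1.0, 40))   # 0.1
print(dwell_signal(12.0, 40))  # 1.2
```

The ratio is only one input; it would still need to be combined with other signals to handle cases like the long boring video.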

With 2, it's an explore/exploit trade-off. You try to explore as far and wide as you can while trying to avoid the things the user may dislike, all while sprinkling in just enough of the stuff that you know they'll like.
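
The simplest version of that trade-off is epsilon-greedy, sketched below. It is a stand-in for illustration, not what any of these platforms actually run; `pick_item`, the `appeal` scores, and the 10% exploration rate are all hypothetical.

```python
import random


def pick_item(estimated_appeal, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any topic uniformly at random.
        return rng.choice(sorted(estimated_appeal))
    # Exploit: pick the topic with the highest current estimate.
    return max(estimated_appeal, key=estimated_appeal.get)


rng = random.Random(0)
appeal = {"pianos": 0.9, "terrariums": 0.4, "webcomics": 0.6}
picks = [pick_item(appeal, epsilon=0.1, rng=rng) for _ in range(1000)]
print(picks.count("pianos") / len(picks))
```

With `epsilon=0.1` the best-known topic dominates while the others still get occasional exposure, which is the "sprinkling" described above.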

You're talking like an ML engineer. :)

Yes, there are signals in human behavior that you can feed to your model. But no, it is never going to learn "on Mondays he works on his comics, so he'd prefer to see webcomic-related content". Don't sprinkle in your explore/exploit experiments. I know what I want, just let me decide based on what I'm in the mood for!

TikTok-style feeds are the absolute worst offender here, where they couldn't care less about what you think. They will serve you content, and you will either say "yes" or "no". So the only option for you as the user is to just wander through their content hyperspace. There's no structured way to jump between topics because everything lives in this formless content soup.

The other problem is that many social media platforms have you subscribe to the content streams of individuals directly. Individuals are high variance. How can you teach these engines that "I only care about this person's posts about pianos, not their terrarium hobby."

> Namely, tracking the duration a user is probably giving their attention to a given item,

I pay a lot of attention to classes of things I don't want to see at all.

Then, according to the algorithm, you do want to see them.

That's exactly the problem! I don't want to see them, but the algorithm refuses to ask me.

If they think that the length of time I spend on a post is 100% correlated with the amount that I want to see it, they really don't understand people at all. It's an embarrassingly shallow model of how humans behave.

> It's an embarrassingly shallow model of how humans behave.

It's a trade off. Viewing rates give back feedback 100% of the time. Asking users for a thumbs up or down gives feedback almost none of the time, and still might not be accurate.

Twitter could recreate a similar system: 1) auto-tag tweets with labels, rather than users needing to submit to subreddits 2) auto-sub most people to some default set of labels 3) let them un-sub if they want.

They just don't want to.
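
The three steps could be sketched roughly like this. It assumes tweets already carry labels from some tagger, and the names (`DEFAULT_LABELS`, `visible_tweets`) are invented for the example rather than taken from any real system.

```python
DEFAULT_LABELS = {"news", "sports", "music"}  # hypothetical default subscriptions


def visible_tweets(tweets, unsubscribed):
    """Keep tweets that carry at least one label the user is still subscribed to."""
    subscribed = DEFAULT_LABELS - unsubscribed
    return [t for t in tweets if t["labels"] & subscribed]


tweets = [
    {"id": 1, "labels": {"music"}},
    {"id": 2, "labels": {"sports"}},
    {"id": 3, "labels": {"music", "news"}},
]
print([t["id"] for t in visible_tweets(tweets, unsubscribed={"sports"})])  # [1, 3]
```

The filtering is the easy part; the hard part is the auto-tagging that produces the labels in the first place.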

Auto-tagging is not a business they want to be in, because they will invariably get it wrong sometimes and people will start drama over it. And they'd be right to: human communication is nuanced and you can't just "guess" what someone means within 280 characters.

People already use hashtags, what would happen if you could subscribe to them? It would create a whole new way of interacting with the platform, something closer to Reddit where people post to streams. But the fact that these streams can't be moderated (or even downvoted) might be a nonstarter.

Wait, Twitter already has this in the form of Topics. It's very coarse-grained, but still.

Regarding 1. I would expect like rate on recommended tweets to be a better signal than impressions.

Thank you! Working on a concept for a big org on what may become a large ML-based system one day. I knew about this feedback loop issue, but was too dumb to actually remember and face this problem. :) It's all over today's rec engines -- and yet, just like the things we're not shown in these systems, the problem itself seems to become invisible. Because it requires new thinking.

Worth exploring.

Reddit is a good example of an alternative system, and it works because subreddits serve as lower-variance composable streams as well as shared digital spaces. Even though your main feed is partially driven by a recommendation engine, it never contains any content that you didn't subscribe to (ignoring ads, of course). Instead, you may see a "posts from other subreddits" section that breaks up the feed explicitly to offer alternatives. If you're interested in a particular topic, you can drop into that subreddit, where you'll see pretty much the same view as everyone else. In all these cases, you're always in control.

I wouldn't be surprised if Reddit actually has the best data on their users' preferences, because they let them explicitly decide what they want rather than playing this "hot and cold" game like TikTok. Those platforms also only let you subscribe to individual accounts, so they have to infer what your interests actually are.

I feel like the YouTube one is good. You can mark videos and channels as "not interested" and YouTube really knows me due to my account age and usage... It recommends me unknown videos and I tend to like them but also more mainstream stuff.

It doesn't matter how good they try to make their recommendation system, they will never know you like yourself. For example, when I go to the YouTube home page, there is a list of categories at the top that it's identified for me. I didn't choose these. I can't add or remove them myself if it's wrong. I just have to hope that I watch the "right videos" and it picks up on a new interest that I have.

But I already know what interests I have! I want to have videos about terrariums on my home page now, not in a week when I've watched enough. This is what I mean by recommendation systems not being good enough. They need to give the user more control over what they want to see, because they can never read my mind. Their recommendations will get even better with that information!

Isn’t the “I’m not interested in this tweet”, “Show me fewer tweets from X”, etc options working for you? They seem to have an effect on my end.

What did they learn though? They can't possibly know why I'm not interested in that tweet, because there's just not enough information. So all they'll do is stop showing me tweets from that one person. No way to express "yeah I'm interested in his tweets about music, not gardening." The problem is that mostly following people means that you can't teach the algorithm about your particular interests.

Topics are better, but there doesn't seem to be a way to make sure your tweet ends up in a particular topic. Again, you have to pray to The Algorithm that it will label tweets correctly. And if your particular interest isn't a Topic, you're out of luck.

So, all we need to do is to open source humans?

Haha no, we just need to stop pretending as if ML can read our minds. It's a tool, it's good at recommending me content, but it will never know what I'm in the mood for on a particular Friday night.

Great pull request here which improves the algorithm: https://github.com/twitter/the-algorithm/pull/17

That would be great (unweighting bluechecks) but they actually plan to go in the other direction: Starting April 15th non-bluechecks won't show up in the "For you" section (the algorithm timeline) at all. Unpaid users are being written completely out of the algo.


Inaccurate. Musk stated that people you follow will continue to show up. I only use the 'for you' feed when I'm bored and want stupid dopamine hits, I leave it on Following almost all the time. But that's on desktop, my understanding is that it keeps resetting itself for mobile users (of whom I am not one).

He only said that later; he seems to be working on a "say we'll do feature and then change it all after people yell at him for weeks" process.

Sounds like moving fast and adjusting to user feedback, usually something commendable in tech, right?

No. There's an implied willingness to listen to people in the first place when you're agile in response to feedback. You shouldn't have to bruise Elon's ego to get him to do not-stupid things.

This distinction you’re making relies entirely upon how you personally interpret the motives of the people involved. An interpretation that you’ve entirely invented for yourself.

Approximately every US news organization and Lebron James announced they weren't going to pay for checkmarks yesterday, so he announced a surprise twist where the top 10,000 famous people get free checkmarks.

…so he's now not charging people who can pay, while charging people less able to pay.

Though, the main problem is that Twitter is about as big in Japan as it is in the US, but Elon only thinks about US culture wars because his only friends are a few VCs and a retired postal worker crank named catturd2. So none of his new ideas are going to work there.

Criticizing somebody by speculating about their motives really only reflects badly upon you, especially if you’re going to accompany it with that much hyperbole. All you’re communicating is that you’ve replaced your ability to reason about a topic with some version of the fundamental attribution error.

> the main problem is that Twitter is about as big in Japan as it is in the US

It’s not even close. When you account for the value of ad impressions, the US market is worth about 3.5x its next biggest market (Japan), and it drops off very sharply after that. You can argue that Twitter’s policies should be less US-centric. But the reason they are that way in the first place is because the US market is their most valuable market by a long shot.

> Criticizing somebody by speculating about their motives really only reflects badly upon you, especially if you’re going to accompany it with that much hyperbole.

No, you can tell who he cares about because he replies to them, and because his emails were released in discovery from the earlier Twitter lawsuit and it turned out his friends put him up to it because they were mad the Babylon Bee got banned for a US culture war pronouns joke.

I certainly don't have to expect he'll make good decisions, since he has no experience with the business, is a completely atypical user, was forced to buy it in the first place, and has lost $20 billion so far by his own valuation. (May have lost a few million more by firing that disabled Icelandic acquihire in violation of his contract and a few discrimination laws.)

The network effect is very very strong though. Even if someone makes a good competitor (Instagram is prototyping one apparently) I'd be surprised if bad decisions actually killed it.

> When you account for the value of ad impressions, the US market is worth about 3.5x its next biggest market (Japan), and it drops off very sharply after that.

I suggest reverse adjusting for how poor their ad targeting is. I've used it both places; the ads are actually good and relevant in Japan (…except for being for domestic apps, so not valid for tourists) but in the US they've always been nonsense. eg if you follow any doctors it will just assume you're also one and burn the budget of every medical ad it's got showing them to you.

More like rewarding someone for putting out a fire that they started. Not exactly a model of good governance.

Musk said from day one of his Twitter purchase that there will be mistakes. Move fast and break things as that other guy said. :)

I have had it reset a couple times on mobile a few months ago but at least recently I have not noticed it happening anymore.

Just pressing the home icon twice takes it back to For You.

Which is incredibly annoying, as it used to scroll you to the latest tweets, I have years of muscle memory for hitting the home button

I don’t see a way out of this with GPT/AI able to create fake personas in an instant.

What does that matter? If people find the content engaging then it will be amplified. If not, it shouldn't be there in the first place. This whole "AI / Bot swarm" excuse is just smoke and mirrors for "I want more people to pay twitter".

If "bots indistinguishable from humans" find your "account that posts pro-russian propaganda" engaging it will be amplified onto your For You page.

If _accounts_ find the tweet engaging

If you don't care whether the content is generated then you don't need to use a social network at all. GPT can generate you an endless interesting feed no problem.

I believe LeBron James said recently he isn't going to waste his money on a blue checkmark, so it should be interesting to see what stays and what goes.

Most of the major news outlets are not doing so either. The Elon stans are crowing that this will be the long-overdue end of legacy media, but it strikes me that the new 'blue check twitter' might end up becoming even more of a social bubble than what it replaced. There are so many low quality accounts sporting a checkmark now that users who value substance will soon be incentivized to just block anyone they find annoying.

yeah, they, uh seem to have realized that's gonna be a problem.


and of course all this means is that the organizations most likely to be able to afford it won't have to

Good find! It'll be funny to see if the incumbents respond with 'don't do me any favors.' Also to see whether Musk's frens sulk about him selling out to the elites or so - their gratitude has an extremely short half-life.

NY Times, WaPo, LA Times and other major accounts too https://www.thewrap.com/ny-times-la-times-not-pay-for-twitte...

Seems dumb of them. Cost is trivial and their competition that isn’t so politically motivated will have a much further reach.

The smart move would be to stay silent on the policy change, pay, and support rival platforms as they can. Instead they will eventually pay and look like they lost.

It's smart because Musk already blinked

The top 10,000 are getting exemptions and won't have to pay


> Seems dumb of them. Cost is trivial and their competition that isn’t so politically motivated will have a much further reach.

It's wild hubris for twitter to try to invoice/penalize the very users and organizations that make twitter anything but insolvent. There should be money exchanged here, but it should be flowing generously and most importantly in the other direction.

maybe they can make their own twitter and give themselves a big blue check mark there

For the NYT to verify their official accounts plus those of their reporters (using the Twitter Blue Affiliations feature) would be $1m annually. This, for a budget line item that has heretofore been $0. In this economy, that's a reach.

I don't think the NYT is worried about "reach."

> I don't think the NYT is worried about "reach."

LOL they are desperate for reach. Incredibly so; have you not listened to any podcast by them? They are begging people to go to their site. They get a fraction of the organic traffic they used to and nearly everything is driven from other sites like Twitter, Google News, Facebook, etc. The internet age has not been kind to classic news orgs.

The NYT has been doing great recently. They're probably the legacy news company that's doing the best online of anyone.

That's on the strength of having a lot of verticals like games, recipes and Wirecutter though.

have you seen @nyttypos and (to a lesser extent) @nyt_diff? NYT online editorial standards are hilariously abysmal.



Yeah and it doesn't matter.

correctly writing words and punctuation on the page digitally printed by "the legacy news company that's doing the best online of anyone" doesn't matter at all? isn't that like the bare minimum of what their job consists of?

I was responding to a post saying they weren't getting online traffic by saying they are getting online traffic. Nothing about the quality of their content.

> I don't think the NYT is worried about "reach."

Then why are they on Twitter?

Having a presence doesn't mean you can infer they are worried about not having a presence.

How do you get the 1 million figure?

An affiliated account to a verified org is $50 per month per seat, so NYT would have to authenticate 1,647 affiliated accounts to reach 1 million dollars per year

Exactly this. Use the published number of reporters at the NYT and multiply.
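
Spelling the multiplication out, as a quick sketch. The $50 per seat per month is from the comment above; the $1,000 per month base fee for the organization itself is my assumption and is not stated anywhere in this thread.

```python
AFFILIATE_FEE_PER_MONTH = 50   # dollars per seat, from the comment above
ORG_FEE_PER_MONTH = 1_000      # assumption: base fee for the verified org itself


def annual_cost(seats, include_org_fee=False):
    """Yearly cost in dollars for a given number of affiliated seats."""
    monthly = seats * AFFILIATE_FEE_PER_MONTH
    if include_org_fee:
        monthly += ORG_FEE_PER_MONTH
    return monthly * 12


print(annual_cost(1647))                        # 988200
print(annual_cost(1647, include_org_fee=True))  # 1000200
```

So the seats alone come to a bit under $1m a year, and it only crosses $1m if the assumed base fee is included. Under the same assumptions, 100 seats would come to $72,000 a year.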

Are all of these 1647 reporters (they have that many??) posting on Twitter? That’s a lot of traffic generators, or not. Surely they could just do the bulk with 100 or so.

You're suggesting the NYT further tier its reporting ranks, along with all the internal difficulty that would entail. For ex: obviously the 100 have to include the most senior reporters, who are also older and therefore the least likely to create the viral content NYT wants affiliated with their account, so immediately they probably need to look at a much larger number. For another example: social media is different for each reader, or from the other side, each reporter has a constituency. In one season, the fashion reporters are driving views, while the following season it's the European war correspondents or the economics reporters (and all of these desks have subdivisions that wax and wane in popularity).

And all that discussion so that they can spend $72k annually with Twitter, a y/o/y increase of $72k from last year. With no guarantees of reach, because the whole paid-only verification thing is an experiment that began an hour ago. Let me just say that this whole pitch is going to be...difficult... at the point in the economic cycle where we find ourselves.

The NYT has revenues of 2.1 billion. I’m sure they have a marketing budget and probably already spending money on Twitter to get traffic. This isn’t something strange.

Facebook did the same thing btw, just more gradually. For years they changed the algorithm slowly to take away reach from Pages only to offer it back as long as you paid.

Seems like a fair cost to pump spam

We’re in a time where ideology trumps revenue for some companies. You know, “Get woke, go broke.”

Everyone says that about things that made tons of money though.

Oh, I don't disagree. I wasn't necessarily (always) agreeing with it. As with anything it's a matter of degrees.

The phrase is "Better Broke Than Woke", we'll see how it works out for folks

I had to search who this person is, and I still could not care less.

Depends on how they weight the is_user_china_mouthpiece variable

So you need a blue check mark to reach non followers and Lebron won’t get one? Sounds like it’s making Twitter better already.

LeBron doesn’t get $84 of value from Twitter? Definitely not a political statement going on there.

Parent didn’t say it’s not “political”. It’s reasonable for a wealthy person to feel that a system that discriminates against the poor is not a system they want to participate in.

(Note that I use discriminate in the literal sense, as a simple statement of fact.)

But the example you give is an appeal to a universal moral good. Not partisan politics. So despite saying it’s not not political, your justification is that it’s not political.

Also, how did you get a blue check before being able to buy one?

Some people unfortunately view concern for the poor as political. However my point of mentioning politics is to say that “it’s political” is not any kind of gotcha when it was never denied as being political. Regardless of the actual justification being political or not, the “political” gotcha is nonsense.

That’s bullshit. Virtually everyone agrees poverty is a problem. Sure, the welfare state feeds the cycle of poverty, but it’s not like that was the goal.

Absolutely not bullshit. Some cynical people on the internet believe this, and that's what I thought the person I was replying to was saying. It is an extremely low bar to say "some people believe X", and I don't know why you care to question that. Even with your own reply you say "virtually all people agree" and your use of "virtually" acknowledges that not everyone agrees with you, therefore some people do believe what I say. This is anyway such a silly tangent and was not even my point.

The exception proves the rule though.

Are you sure about that? If I break out of the poverty cycle who am I going to vote for?

Republicans I guess, but the point is that doesn’t mean Democrats hate poor people. That’s just political framing.

lol you don't actually believe that's his reason do you?

LeBron started a school where poor kids can get free food and clothes. The man obviously cares.

yeah he gave some money and shows up for photo ops sometimes, sticking the taxpayers with the rest of the bill - swell guy

Twitter doesn’t get $84 of value from Lebron?

Lebron famously uses the free version of Spotify. Maybe he just does not see the value of paying for the blue mark.

He may well, but he may have also concluded that the indirect cost of having a blue tick outweighs the benefits.

Like it or not but it's the twitter that gets value from celebrities. How many people are on social networks just to see what their fav celebrities are doing?

It obviously goes both ways. Social media is a megaphone and ego boost for celebs.

The problem for twitter is it isn't the only game in town when it comes to social media, not by a long shot. They're not even in the top ten. They're a megaphone in a large pile of megaphones, and those other megaphones don't bite the hand that picks them up.

Twitter needs the LeBrons of the world far more than they need Twitter.

As someone who was on Twitter long before Oprah, or Elon, or Obama, or most other celebrities and politicians: I strongly disagree.

(Speaking about Twitter the product, not necessarily Twitter the company.)

That was an entirely different era. If you yearn for it, come on over to Mastodon.

Twitter the multi-billion dollar advertising company needs the big whales. I think it was pretty clear the company was under discussion.

What their fav celebrities say -v- what 'the media' says?

Followed users will still show

But I agree it is a bad idea. The worst actors have money to buy the blue marks of an army of accounts.

This is basically making it easier for authoritarian governments to abuse it

Like that wasn't already happening?

not with blue marks

I'm not in the US

It's actually non-US authoritarian governments that are the biggest abusers, using social media against their own people


it's a shame we can no longer short twitter stock

Aside from the spam PRs, there is actually one PR that fixes a bug: https://github.com/twitter/the-algorithm/pull/242/files

Modern Java actually allows `10_000` for that very reason, as does Scala (https://scala-lang.org/files/archive/spec/2.13/01-lexical-sy...)

Add golang to the list.

It removes the extra weight to Twitter blue tweets?

If the property names are to be believed it sets a weight multiplier to 0. So it prevents recommending them entirely.

It sets the default to zero, but apparently can range up to 100. So... what modifies it? (The answer is probably in there somewhere, but I'm sure someone will find it before I do.)
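
I don't know what does the overriding in the actual repo. As a generic illustration only, a parameter with a default and bounds usually behaves something like the sketch below. `BoundedParam`, `boost`, and `base_score` are hypothetical names, and only the 0 default and the 0 to 100 range come from the comments above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedParam:
    default: float
    minimum: float
    maximum: float

    def resolve(self, override=None):
        """Use the override if one is given, clamped to [minimum, maximum]."""
        value = self.default if override is None else override
        return min(max(value, self.minimum), self.maximum)


boost = BoundedParam(default=0.0, minimum=0.0, maximum=100.0)  # hypothetical
base_score = 2.5
print(base_score * boost.resolve())     # 0.0
print(base_score * boost.resolve(4.0))  # 10.0
```

If the multiplier really is applied like this, a default of 0 zeroes the score unless something supplies an override, which is why it matters where the override comes from.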

My feed has lately been full of accounts that have blocked me. Like, I see a tweet from someone unknown, click their profile and it says I'm blocked.

So I wonder if some value is wrong in one of those constants. Anyways, the blocking feature is broken..

I think it's okay to give Twitter Blue users a boost, as it's most likely not spam (unlike the 95% of my non-blue followers who are bots).

I think it makes sense for out-of-network. However I see no value for boosting among people that I have followed. If I have followed them they definitely aren't spammers (from my PoV).

I agree! Haven't thought of that.

Drive-by pull requests that break the intention of something aren't ever going to be taken by a maintainer.

Why do companies even bother to put source up on github? To put up a front that they're open source? What a joke.

Yes please! I definitely put my thumbs up in there!

That will definitely do something! Good job!!


The trouble with spammy jokes like this is it discourages companies from bothering with open-source in the future. I know I'd be less likely to champion an initiative like this if I thought it might blow up in my face.

Mature. I'm sure companies looking to outsource work with Twin Prime Media would be stoked to see this level of maturity.
