Twitter takes its algorithm ‘open-source,’ as Elon Musk promised

The company looks at things like how likely you are to interact with a user in the future and what communities and tweets are trending.

Twitter has released the code that chooses which tweets show up on your timeline to GitHub and has put out a blog post explaining the decision. The post breaks down what the algorithm looks at when determining which tweets to feature in the For You timeline and how it ranks and filters them.

According to Twitter’s blog post, “the recommendation pipeline is made up of three main stages.” First, it gathers “the best Tweets from different recommendation sources,” then it ranks those tweets with “a machine learning model.” Lastly, it filters out tweets from people you’ve blocked, tweets you’ve already seen, or tweets that are not safe for work, before putting them on your timeline.
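To make the three stages concrete, here is a minimal Python sketch of a gather, rank, and filter pipeline. It is an illustration of the shape the blog post describes, not code from Twitter's release: the `Tweet` fields, the function names, and the idea of passing in a `score` function as a stand-in for the machine learning model are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    # Hypothetical fields chosen for this sketch only.
    tweet_id: int
    author: str
    text: str
    nsfw: bool = False


def gather_candidates(sources):
    """Stage 1: pool tweets from several sources, dropping duplicate IDs."""
    seen, pooled = set(), []
    for source in sources:
        for tweet in source:
            if tweet.tweet_id not in seen:
                seen.add(tweet.tweet_id)
                pooled.append(tweet)
    return pooled


def rank(candidates, score):
    """Stage 2: sort by a score, highest first (stand-in for the ML model)."""
    return sorted(candidates, key=score, reverse=True)


def apply_filters(ranked, blocked_authors, seen_ids):
    """Stage 3: drop blocked authors, already-seen tweets, and NSFW tweets."""
    return [
        t for t in ranked
        if t.author not in blocked_authors
        and t.tweet_id not in seen_ids
        and not t.nsfw
    ]


def build_timeline(sources, score, blocked_authors, seen_ids):
    """Run the three stages in order and return the surviving tweets."""
    candidates = gather_candidates(sources)
    return apply_filters(rank(candidates, score), blocked_authors, seen_ids)
```

Calling `build_timeline` with a toy scoring function such as `lambda t: len(t.text)` is enough to see the stages interact; a real ranker would predict engagement rather than measure length.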

The post also explains each step of the process in more detail. For example, it notes that the first step looks at around 1,500 tweets and that the goal is to make the For You timeline around 50 percent tweets from people that you follow (which are called “In-Network”) and 50 percent tweets from “out-of-network” accounts that you don’t follow. It also says that the ranking is meant to “optimize for positive engagement (e.g., Likes, Retweets, and Replies)” and that the final step will try to make sure that you’re not seeing too many tweets from the same person.
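One way to picture the 50/50 split and the limit on tweets from a single person is the short Python sketch below. It assumes two already-ranked lists and simply alternates between them, skipping any tweet whose author has hit a cap. The strict alternation, the `mix_timeline` name, and the `max_per_author` value are hypothetical choices for illustration; the blog post does not say Twitter does it this way.

```python
from collections import Counter, deque


def mix_timeline(in_network, out_of_network, size, max_per_author=2):
    """Alternate between two ranked lists, capping tweets per author.

    Each item is an (author, text) pair. The cap of 2 is an assumed
    value for this sketch, not a figure from Twitter.
    """
    queues = [deque(in_network), deque(out_of_network)]
    per_author = Counter()
    timeline = []
    turn = 0  # even turns draw from in-network, odd from out-of-network
    while len(timeline) < size and any(queues):
        # Fall back to the other list if the preferred one is empty.
        queue = queues[turn % 2] or queues[(turn + 1) % 2]
        author, text = queue.popleft()
        if per_author[author] < max_per_author:
            per_author[author] += 1
            timeline.append((author, text))
            turn += 1  # only advance after a tweet is actually placed
    return timeline
```

Because the turn only advances when a tweet is placed, a skipped tweet is replaced by the next one from the same list, which keeps the mix close to even as long as both lists have enough eligible tweets.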

Of course, the most detail will come from picking through the code itself, which researchers are already doing.

CEO Elon Musk has been promising the move for a while — on March 24th, 2022, before he owned the site, he polled his followers about whether Twitter’s algorithm should be open source, and around 83 percent of the responses said “yes.” In February, he promised it would happen within a week before pushing back the deadline to March 31st earlier this month.

Musk tweeted that Friday’s release was “most of the recommendation algorithm” and said that the rest would be released in the future. He also said that the hope is “that independent third parties should be able to determine, with reasonable accuracy, what will probably be shown to users.” In a Space discussing the algorithm’s release, he said the plan was to make it “the least gameable system on the internet” and to make it as robust as Linux, perhaps the most famous and successful open-source project. “The overall goal is to maximize on unregretted user minutes,” he added.

Musk has been preparing his audience to be disappointed in the algorithm when they see it (which is, of course, making a big assumption that people will actually understand the complex code). He’s said it’s “overly complex & not fully understood internally” and that people will “discover many silly things” but has promised to fix issues as they’re discovered. “Providing code transparency will be incredibly embarrassing at first, but it should lead to rapid improvement in recommendation quality,” he tweeted.

There is a difference between code transparency, where users will be able to see the mechanisms that choose tweets for their timelines, and code being open source, where the community can actually submit its own code for consideration and use the algorithm in other projects. While Musk has said it’ll be open source, Twitter will have to actually do the work if it wants to earn that label. That involves figuring out systems for governance that decide what pull requests to approve, what user-raised issues deserve attention, and how to stop bad actors from trying to sabotage the code for their own purposes.

The company does say it’s working on this. The readme in the GitHub repository says, “We invite the community to submit GitHub issues and pull requests for suggestions on improving the recommendation algorithm.” It does, however, go on to say that Twitter’s still in the process of building “tools to manage these suggestions and sync changes to our internal repository.” But Musk’s Twitter has promised to do many things (like polling users before making major decisions) that it hasn’t stuck with, so the proof will be in whether it actually accepts any community code.

The decision to increase transparency around its recommendations isn’t happening in a vacuum. Musk has been openly critical of how Twitter’s previous management handled moderation and recommendation and orchestrated a barrage of stories that he claimed would expose the platform’s “free speech suppression.” (Mostly, it just served to show how normal content moderation works.)

But now that he’s in charge, he’s faced a lot of backlash as well — from users annoyed about their For You pages shoving his tweets in their faces to his conservative boosters growing increasingly concerned about how little engagement they’re getting. He’s argued that negative and hate content is being “max deboosted” in the site’s new recommendation algorithms, a claim outside analysts without access to the code have disputed.

Twitter is also potentially facing some competition from the open-source community. Mastodon, a decentralized social network, has been gaining traction in some circles, and Twitter co-founder Jack Dorsey is backing another similar project called Bluesky, which is built on top of an open-source protocol.