Here are three specifically interesting things about Takahē:
1. The "multiple domains" feature. I'm running my own Mastodon instance right now purely so I can have my simonwillison.net domain as my identifier there (and protect myself from losing my identifier if the server I am using shuts down). This feels pretty wasteful! I'd much rather be able to point my domain at a Takahē instance shared with some of my friends, each with their own domains for it.
2. It's a Django app that's taking full advantage of the async features that have been added in the most recent releases of that framework. Async is a perfect match for ActivityPub due to the need to send thousands of outbound HTTP requests when publishing a message. And Takahē creator Andrew Godwin is the perfect person to build this because he's been driving the integration of async into Django for the past four years: https://www.aeracode.org/2018/06/04/django-async-roadmap/
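The fan-out that async enables can be sketched with plain asyncio. Everything below (the instance names, the delivery stub) is illustrative; a real server would send signed HTTP POSTs with an async client rather than sleeping:

```python
import asyncio

async def deliver(inbox_url: str, payload: dict, sem: asyncio.Semaphore) -> str:
    # In a real server this would be a signed HTTP POST via an async client;
    # here we just simulate network latency.
    async with sem:
        await asyncio.sleep(0.001)
        return f"delivered to {inbox_url}"

async def fan_out(inboxes: list[str], payload: dict, limit: int = 50) -> list[str]:
    # Cap concurrency so thousands of deliveries don't open
    # thousands of sockets at once.
    sem = asyncio.Semaphore(limit)
    tasks = [deliver(url, payload, sem) for url in inboxes]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    inboxes = [f"https://instance{i}.example/inbox" for i in range(1000)]
    results = asyncio.run(fan_out(inboxes, {"type": "Create"}))
    print(len(results))  # 1000
```

The same thousand deliveries done synchronously would block a worker for the entire duration; with a semaphore-limited gather they overlap, which is exactly the workload ActivityPub publishing creates.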
This is really interesting. Thanks for writing this comment and sharing.
Async is good for lots of IO work and managing independent tasks with low coupling.
I am interested in task scheduling and asynchronous code: programming language development, parallelism, concurrency without parallelism, and cooperative versus preemptive scheduling.
As an experiment inspired by Protothreads (a C library for implementing cooperative multitasking with a switch statement), I recently implemented async/await in Java as a giant switch statement inside a while loop.
Provided that each coroutine only runs once, the amount of memory used should not grow. The goal is to be stackless.
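The comment above describes a Java switch statement; a rough Python transliteration of the same Protothreads-style idea (all names here are made up) keeps the resumption state in a single integer instead of a stack frame:

```python
# A Protothreads-style stackless coroutine: resumption state lives in a
# single "program counter" integer, and a dispatch loop (standing in for
# the switch statement) jumps back to the saved state on each call.

YIELD, DONE = "yield", "done"

def make_counter(limit: int):
    state = {"pc": 0, "i": 0}  # pc + locals; no suspended stack frame

    def step():
        while True:  # the dispatch loop replaces the switch statement
            if state["pc"] == 0:
                state["pc"] = 1              # fall through into the loop body
            elif state["pc"] == 1:
                if state["i"] >= limit:
                    state["pc"] = 2
                    continue
                state["i"] += 1
                return (YIELD, state["i"])   # suspend; resume at pc == 1
            elif state["pc"] == 2:
                return (DONE, None)

    return step

step = make_counter(3)
print([step() for _ in range(4)])
# [('yield', 1), ('yield', 2), ('yield', 3), ('done', None)]
```

Because the only persistent state is the small dict, memory use stays constant no matter how many times the coroutine is resumed, which is the stackless property being described.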
I began writing a programming language that looks similar to JavaScript but targets an imaginary interpreter that is multithreaded. I am still working out how to represent async/await so that the high-level language can target the interpreter, and what code I need to generate to implement it.
I played around with C++ coroutines, but someone told me that the approach I used is not C++20.
The problem is that you can't have multiple domains point to a single Mastodon instance. I'd like to share my single instance with friends who can bring their own domain name.
Basically the problem is that current Mastodon only supports a single value each for the LOCAL_DOMAIN and WEB_DOMAIN settings.
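Concretely, those two settings live in Mastodon's `.env.production` file, and each accepts exactly one value (the domains below are placeholders), which is the limitation being described:

```shell
# Mastodon .env.production (illustrative values)
# LOCAL_DOMAIN is the domain used in handles (@user@example.com);
# WEB_DOMAIN is where the web application actually runs.
# Neither accepts more than one domain.
LOCAL_DOMAIN=example.com
WEB_DOMAIN=social.example.com
```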
I know of an organization that just advertised their new Mastodon instance as being at social.[domain].com. Is it too late now for them to start using WebFinger and advertise Mastodon handles at simply [domain].com?
Perhaps naive but is it possible to create some sort of Mastodon proxy that exists independent of any specific instance? Rather than run your own instance or point to a shared instance, a proxy could be a fairly simple system that uses DNS records (?) to route requests to the appropriate instance -- much like email.
Unfortunately that doesn't quite work with out-of-the-box Mastodon.
I'm running a bit of a proxy at https://simonwillison.net/.well-known/webfinger?resource=acc... but it still needs to point to my own dedicated instance, just because Mastodon can't have multiple domains pointed at a single instance of the software yet.
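The proxy trick boils down to serving a WebFinger (JRD) document from the vanity domain that points at the actor hosted on the dedicated instance. A minimal sketch of building that document (the handle and domains are placeholders):

```python
import json

def webfinger_response(handle: str, actor_url: str) -> str:
    """Build a WebFinger JRD that maps a vanity-domain handle to an
    ActivityPub actor hosted on another server."""
    doc = {
        "subject": f"acct:{handle}",
        "links": [
            {
                "rel": "self",
                "type": "application/activity+json",
                "href": actor_url,  # the actor on the dedicated instance
            }
        ],
    }
    return json.dumps(doc)

# e.g. serve this from https://example.com/.well-known/webfinger
print(webfinger_response("simon@example.com",
                         "https://fedi.example.com/users/simon"))
```

The catch the comment describes is that the `href` must still point at an instance configured for that one domain; the proxy only redirects discovery, it doesn't let one instance serve many domains.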
Ahh, it's so exciting to see so much happening in this space all of a sudden! My quest to get a personal instance running has been a long slog.
I had been working on an ActivityPub server in Node.js/TypeScript for a while before the Twitter migration. It's got most of the features I'd want in a small server but it's basically bring-your-own-client at the moment.
Finding all the resources to build a complete server that can interact with other instances isn't easy, so maybe this can help someone. The spec is well worded, but the checklist is confusing, the test server is down, Mastodon has its own rules, etc. Plus you have to have at least a cursory knowledge of JSON-LD/RDF.
I had the idea of running a single-user server on Cloudflare Workers and using D1 (their SQLite-based database). A lightweight JS/TS implementation would be perfect for that. Looks like you have Postgres planned; it would probably be possible to expand from that to SQLite.
Yes, there are multiple people here, but the general sentiment was negative in most threads.
Go ahead and look over any Mastodon thread from a year ago or earlier.
Generally it was dismissed with "oh it's too niche" or "moderation will be too difficult".
People ignored the communities already on it and the tech overall.
Only when people got pissy about Elon running Twitter instead of hedge funds did the general sentiment here change.
It wasn't about the tech, and not even about Elon specifically; it was that the Twitter safe space got taken away.
But now hopefully the people who want that safe space will isolate themselves in Mastodon instances that block all others, and we can all live in peace from them.
I've been in the Mastodon world for longer than the Twitter drama, so I'm well aware.
When you block other instances you realize you are islanding yourself off, right? Not the other way around.
Everyone is federated until you block, so you are isolating yourself from the norm when you block others.
It's not too noticeable as the English-speaking instances are currently small, but those who don't want Wolfballs and friends, or this or that, are more isolated than those who do.
It makes sense that those who want to hide views from others are the outliers. Those who want to be open and allowing of diverse thought are more interoperable.
I think this refers to handling JSON ActivityStreams objects at the `/outbox` endpoint for a logged-in user, and then broadcasting those out appropriately. If so, then yes that's the only API that's used. It also handles the uploadMedia endpoint and a few other details that are included in the spec.
I have tried unsuccessfully so far to set up an OAuth provider server along with it, so that you could log in on your phone, etc.
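For reference, the spec's outbox behavior (a bare object POSTed by a client gets wrapped in a Create activity before delivery) can be sketched like this; the helper name and the type list are simplified illustrations, not any server's actual code:

```python
# A simplified subset of ActivityStreams activity types.
ACTIVITY_TYPES = {"Create", "Update", "Delete", "Follow", "Like", "Announce", "Undo"}

def handle_outbox_post(body: dict, actor: str) -> dict:
    """Normalize a client's POST to the outbox: bare objects (e.g. a Note)
    get wrapped in a Create activity, as the ActivityPub spec describes."""
    if body.get("type") in ACTIVITY_TYPES:
        activity = dict(body)  # already an activity; pass through
    else:
        activity = {
            "@context": "https://www.w3.org/ns/activitystreams",
            "type": "Create",
            "object": body,
        }
    activity["actor"] = actor
    # A real server would assign an id, store the activity,
    # and queue delivery to follower inboxes here.
    return activity

note = {"type": "Note", "content": "Hello, fediverse!"}
print(handle_outbox_post(note, "https://example.com/users/alice")["type"])  # Create
```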
That is great news, one of the few implementations that does it. :D Do you have a demo server set up anywhere? Mine (based on my own activitypub code) is at https://federated.id :D
> Features on the long-term roadmap:
> “Since you were gone” optional algorithmic timeline
That's exciting! The fediverse severely lacks algorithmic curation, presumably due to the belief that it's inherently evil (I'd strongly disagree; what's bad is merely that the algorithm isn't user-controllable).
Fully agree: the algorithmic timeline (sprinkling in some likes and comments from other people that might be interesting to you) is one of Twitter's best features, even if many people (who mostly use third-party clients for that reason) would not agree.
Do you people know that the "# Explore" section on the right of Mastodon already lists posts which are gaining traction? It also lists trending news.
It says:
"These posts from this and other servers in the decentralized network are gaining traction on this server right now."
I don't know what the logic is, but on big servers it's listing a lot of content.
That's very different from the Twitter timeline though, which shows you "good" content from people you already follow that happened since the last time you used Twitter. So if you refresh a bunch of times you'll always see more interesting tweets / likes / comments.
Yes, you are right, but IMO it's algorithmic curation already, like the parent said. It's just not from your timeline but from the whole fediverse / the server you are on.
It's probably not that difficult to add one based on your own feed if there is a global one already.
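As an illustration of how simple such a "since you were gone" heuristic could be (the scoring weights below are entirely made up for the sketch):

```python
from dataclasses import dataclass

@dataclass
class Post:
    author_followed: bool
    likes: int
    replies: int
    age_hours: float

def since_you_were_gone(posts: list[Post], top_n: int = 3) -> list[Post]:
    """Rank posts from followed accounts by a simple engagement-vs-age score.
    The weights are arbitrary; the point is only that the ranking is an
    explicit, inspectable heuristic rather than a black box."""
    followed = [p for p in posts if p.author_followed]
    score = lambda p: (p.likes + 2 * p.replies) / (1 + p.age_hours)
    return sorted(followed, key=score, reverse=True)[:top_n]
```

A user-controllable timeline would expose exactly these weights (and the choice of inputs) as settings, which is the distinction the thread is drawing against opaque curation.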
To me, the main difference to Twitter seems to be that you have to explicitly go to the "Explore" section to look for trending content, leaving your "default" home timeline chronological.
So if you want to check out what's the current buzz, you can, but you won't unknowingly be missing posts from those you follow (which seems to be the common complaint on Twitter).
The word "algorithm" has suffered wild semantic drift at the hands of journalists. Let's see if we can start to fix that now by making sure that on HN and in adjacent communities of all places we use the appropriate words for the thing we are talking about.
We are talking about heuristics here, not algorithms.
Why does it need a PostgreSQL server? For just a handful of users, isn't SQLite the leaner yet sufficient choice?
How does it compare to GoToSocial, which requires 50-100 MB of RAM? They are also in the alpha stage, and I like their approach of keeping the web UI separate.
Author here - it's just to reduce support surface area. I know I'll need PostgreSQL's full text indexing and GIN indexes for hashtags/search eventually, and I probably also want to use some of the upsert and other specialised queries, and it's easier to just target one DB I know is very capable.
For reference, when I say "small to medium", in my head that means "up to about 1,000 people right now".
People were getting priced out of hosting an instance with "only" 10-20k users, and the instance-hosting services quote at most 4k users, with the 4k end costing more than US$100/month. Even the "low end" 1-200 user plans come with 4 cores, 5 TB of monthly bandwidth, etc.
The general sense I have is that Mastodon (the default software, at least) is extremely resource-heavy for relatively low user counts. My assumption/hope is that the bulk of this is because the server software has never really been under sufficient pressure to improve, and Takahē seems to indicate that there's at least some room for improvement on the server side (i.e. performance problems aren't entirely protocol/architecture problems).
Is there any advantage to using a traditional database as opposed to a graph database, since JSON-LD is just a text representation of graph nodes?
I was thinking the easiest path would be to have the server deal with all the ActivityPub stuff and expose something like a GraphQL interface for a bring-your-own-client implementation. Of all the things GraphQL has been shoehorned into, this seems like a valid fit; it's as if they were made for each other.
For better or worse, many servers are targeting Mastodon API compatibility to be able to leverage the existing clients. Adding GraphQL increases surface area without solving the bigger issue of creating the clients.
I didn't get as far as looking into the Mastodon API for clients, but that makes perfect sense; I just assumed it was an overlay on the more general API.
Mostly I was thinking about how one could implement something in the most efficient way, and graph databases / GraphQL were literally designed for this stuff.
I tried swapping that for SQLite and successfully ran the test suite about a week ago, but I've not tried that again against the large number of more recent changes.
SQLite is magical and incredibly lean, but it is not leaner than Postgres if you need real database features. You end up reimplementing a lot of features in code that belong in the db.
This doesn't match my experience from the last few years. SQLite in WAL mode is extremely capable.
The only thing I really miss from PostgreSQL is that PostgreSQL has more built-in functions for things like date handling - but SQLite custom functions are very easy to register when you need them.
It also has excellent JSON features. JSON may be stored as text rather than a binary format like JSONB in PostgreSQL, but the SQLite JSON functions crunch through it at multiple GB per second, so it doesn't seem to matter.
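Both points, registering custom functions and crunching JSON stored as text, are easy to demonstrate with Python's built-in sqlite3 module (the table and data here are made up for the example):

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")

# A custom scalar function filling a gap in SQLite's date handling,
# registered in one line from Python.
conn.create_function("iso_year", 1, lambda s: datetime.fromisoformat(s).year)

conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("INSERT INTO posts (data) VALUES (?)",
             ('{"author": "alice", "published": "2022-11-20T10:00:00"}',))

# The JSON is stored as plain text, but json_extract parses it on the fly.
row = conn.execute("""
    SELECT json_extract(data, '$.author'),
           iso_year(json_extract(data, '$.published'))
    FROM posts
""").fetchone()
print(row)  # ('alice', 2022)
```

The custom function runs inside the query, so the "missing" PostgreSQL date functions can be patched in per-connection without any schema changes.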
Nice to see a Python/Django implementation of ActivityPub. Having a nice, lean implementation of ActivityPub that I can customize to my liking is the only thing that keeps me from using the Fediverse more regularly. So I am watching the space closely.
What I find a bit unfortunate about Takahe is the coupling with Docker.
An even leaner ActivityPub implementation seems to be MicroBlogPub. I have not yet managed to set it up though.
Anybody interested in collaborating on a MicroBlogPub install script that turns a fresh Ubuntu installation (or container) into a running MicroBlogPub instance?
It's not coupled with Docker. Docker is purely one suggested way of running it - it's a classic Django app so running it directly on Ubuntu should work the same as any other Django application.
When I saw "Prerequisites: Something that can run Docker/OCI images" in the documentation, my interpretation was that containers are needed. It also says "You need to run at least two copies of the Docker image". Maybe you want to change the wording a bit.
I would also collaborate on writing a setup script for Takahe then!
I really like to write a setup script instead of following manual installation guides. So for every software I try, my first step is to write a script that turns a fresh Debian installation into a running instance. (MicroBlogPub needs Python 3.10 which is not in Debian stable, so I would use Ubuntu)
Hmm.. does not look good for the non-Docker setup. The developer replied with "I am deliberately avoiding offering a non-Docker install path" and closed the issue:
> So, I am deliberately avoiding offering a non-Docker install path that is supported right now as it leads to a lot of support burden with different OS package versions and the like!
That doesn't mean you can't write and share a script for people who want to install it without Docker.
It means that he doesn't want to take responsibility for non-Docker installation scripts as part of the official documentation (yet), because if he did that he'd be on the hook to keep researching and updating those scripts in the future.
> What I find a bit unfortunate about Takahe is the coupling with Docker.
While I don't love it, it's very understandable for a single-dev application. Anything else involves blizzards of questions and bugs filed against people using their distro version of Django vs. their downloaded version of Django, and the many versions of distros, and the many conventions for Python environments, and...
Surprised it hasn't been attacked over this yet, as there's so much needless hand-wringing about anything non-AGPL being a threat to the anti-capitalist views of the Fediverse.
You’re wrong. Just because it’s not the absolute peak of efficiency, written in C with asm routines to talk to the db, doesn’t mean it’s not efficient.
Performance and efficiency aren’t the same thing. Django does a lot of things other frameworks ranked here don’t do.
Such framework rankings are also utterly irrelevant when you want something widely used enough to easily find contributors and integrations. That restricts you quite a bit more than "any so-called framework that just handles HTTP".
I see honk hasn't been mentioned on this thread. It's also an activitypub server which is very lightweight (golang) and easy to set up your own server.
https://humungus.tedunangst.com/r/honk
It's unfortunate, because Honk appears to be well designed otherwise, but I found it difficult enough to grok the idiosyncratic naming conventions that I gave up.
I see the sibling comment about obfuscation, but not sure I follow either of you. Is this code not clear?
To me the code reads with humor and creativity, while every bit as self-evident as a Gary Larson Far Side cartoon on second glance. I mean, what else is nomoroboto going to do than what it does?
I've never seen this tone in the wild before, but got a kick out of it; I might even find it refreshing to maintain.
Anyway, you're right: all code should be written in haiku form to maximize creativity and succinctness, plus keep methods short! True elite coders ensure variable names are always a prime number of characters.
I'm very interested in this federated renaissance happening, but having trouble understanding how all the pieces fit together. Is there a good overview I can read? I think ActivityPub is the (a?) protocol, and Mastodon is one particular implementation of it, just like the software linked here? Are there other relevant or competing protocols? How does Matrix fit in exactly? What about identity? Is OpenID a part of all this somehow?
The Fediverse is the loose network of servers that exchange data with the ActivityPub protocol. Mastodon is a server that implements a chunk of AP. Mastodon also specifies a client API that is not AP, but is fairly often implemented by other packages because it's convenient.
AP is not entirely Twitter-style microblogging. It can be used to exchange photos, video, audio, documents, invitations and meeting appointments (as data or as links). The default privacy assumption for all AP content is that it is basically public.
Matrix is not built on AP. Matrix is a real-time communications protocol suitable for private messages and public chats. Its mission appears to be to bridge every other protocol, so there's at least one Matrix-ActivityPub bridge module, MXToot.
OpenID is not part of these. Some Matrix servers can use OpenID for authentication. As far as I know, no ActivityPub servers currently use OpenID.
Nice project! I like the goals, and hope we see more good ActivityPub options like this. Just a note that the web UI isn’t formatting correctly on iOS!
Takahē are interesting birds too. There is a related bird, the pūkeko, which was also blown to NZ but at a different time; it has relatives in Australia and South America as well. The takahē was thought to be extinct at one point, due to predation by introduced pests and introduced deer eating the grasses they rely on for food. Now there is a population of about 400, I believe.
Counterpoint: If the fediverse has any hope of achieving critical mass, it can’t confuse users by constantly breaking federation for every minor disagreement.
It does break constantly, and it's great. There are big air-gaps between the US Lefty and US Righty fediverses. If you want to talk to both sides of that divide then you'll want accounts on multiple servers. If you decide to only be on one side then it's easy to migrate your accounts around between servers.
Nah. There are various people who call for policies like this, but then most reasonable instance admins ignore them, and the fediverse continues to operate.
The same thing has happened in the free software movement in general; some folks have called for copyleft-only, and the rest of the world has largely ignored them and is fine with shipping BSD licensed software along with GPL licensed software.
There are constant threats like that in the Fediverse. For instance, some threatened to defederate if Tumblr implements ActivityPub and doesn't remove ads and whatnot.
In reality, the biggest servers still federate with most servers. With the latest changes in Mastodon you can now see whom they don't federate with. It's not a huge list, and if you browse the servers they don't federate with, it's pretty obvious why.
A lot of those defederated services advertise themselves as some sort of free-speech-absolutist alternative.
My point was moreso that this kind of attitude among the group of people who are technically adept enough to run servers is unattractive to the average user, who couldn't care less about the license of the open source software they're using.
I have come here to rail against (pun intended) the use of the name Takahē for a piece of software. The author is well-intentioned and there is some aptness to the name, but many people here in Aotearoa / New Zealand, are sensitive to the use of the names of our tāonga / treasures for businesses, technologies and other objects. Specifically, the takahē is an indigenous species tāonga as described in https://www.taiuru.maori.nz/branding/. As such it would be best to consult with the Ngāi Tahu people before using the takahē’s name in this manner. In this link from NZ’s Department of Conservation (https://www.doc.govt.nz/news/media-releases/2021-media-relea...), you can see that NZers take this sort of iwi partnership seriously.
More broadly, I find it sad when the names of natural species and features are adopted in the business and technology world without any deep connection. A canonical case would be Amazon the company, which has prospered and become a household name while the Amazon itself, with its people and ecosystems, has suffered and declined. An egregious case relevant to NZ is Kiwi Farms.
The trend of using species names in technology perhaps started with the O’Reilly books. The argument can be raised that such use raises awareness of endangered species such as the takahē. But perhaps that is best left to other means, for fear that the mauri of a species should be captured and harmed.
As with most things it's not that black and white. Ngāi Tahu are actually the biggest polluters of the water supply where I live as they cut down acres of forest to create massive dairy farms. The creator of this software picked a really great name since it actually promotes NZ conservation. I've donated my time and money to plant native species in NZ and I find your position ridiculous. Does everyone need to run their language past some woke authority now to make sure it doesn't offend some guys on the internet? If you want to help out with conservation why don't you spend your time volunteering for one of the many tree planting charities?
The intentions of the creator of the product do seem genuine and worthy, but it's still cultural appropriation without consultation or acknowledgement of the tāonga behind the name.
Just to be clear, it's actually the bird itself I care about. I think its identity is something that deserves respect and shouldn't just be randomly adopted by someone who thinks it is cool. I am glad that some Māori people are asserting a degree of hegemony over the use of NZ names and identities, and I appreciate their intent. If you think this is ridiculous, just try to call your technology product Walmart, Amazon, The Warehouse or even the All Blacks and see what happens.
3. The way it handles task queueing is super interesting. I've not fully got my head around it yet but it's the part of the codebase called Stator and it's modeled on things like the Kubernetes reconciliation loop - Andrew wrote a bit more about that here: https://www.aeracode.org/2022/11/14/takahe-new-server/ - Stator code is here: https://github.com/jointakahe/takahe/blob/main/stator/runner...