Tom MacWright

tom@macwright.com

Playing with ActivityPub

Mastodon

ActivityPub, WebFinger, and Mastodon are getting some attention because of chaos at Twitter.

It’s anyone’s guess how this all shakes out. As an active user of Twitter, I’ll be sad if it goes away. But in the meantime, let’s have some fun with ActivityPub.

ActivityPub

Under the hood, there’s ActivityPub, WebFinger, and a number of other neat standards like JSON-LD, but most people just use Mastodon, the application. Mastodon is the software that you sign into and use as a Twitter alternative, and it’s built on all of those standards. There are a few other implementations of social networks based on the same standards, like Friendica and Pixelfed, but right now, Mastodon is where the people are.

Mastodon is decentralized through federation: users can choose a Mastodon server on which to create an account, and they can follow and interact with users on other servers. You’re relying on someone else to host the server and protect your data, but instead of Twitter, you have a choice of servers. If one Mastodon host crashes, those users will lose their accounts, but other hosts will keep going.

So Mastodon doesn’t offer the sort of serverless decentralization you can get with something more radical like Secure Scuttlebutt, but on the other hand, it’s much more user-friendly. Just like Twitter, you log into a server with a username and a password, and you can easily access it on an iPhone and share content on Mastodon with a link.

But anyway, if we’re going to have this federation system, we might as well take it seriously. One of the benefits of Mastodon is that you can run your own instance. The benefit of Mastodon being built on standards like ActivityPub is that you can interact with Mastodon without running the Mastodon application software in particular: you can build your own. So why not: why not make macwright.com an ActivityPub host?

Context

This blog runs on Jekyll, one of the original static site generators. It’s hosted on Netlify, which has branched out to support a bunch of products, but started out as a static site host.

I’m not going to abandon these systems to support ActivityPub. Jekyll works great for me: I’ve been using it for over a decade and have few complaints. There are spectacular examples of what you can do with custom code and indieweb standards, like Aaron’s site, but that’s not for me.

So, ActivityPub needs to be a simple addition on top of this existing site. What’s the absolute least I’ll need to implement?

I started by reading the ActivityPub specification, and then Mastodon’s documentation of ActivityPub. Right off the bat I had a few takeaways:

  • It isn’t possible to implement ActivityPub without a server and a database. You can’t do it with just a static site.
  • ActivityPub is the kind of specification that’s so generic that everything implemented on top of it is a particular “flavor” of the specification. There’s an opinionated kind of ActivityPub that Mastodon speaks, which is different from what BookWyrm or Pixelfed speak.
  • The documentation for all of this is sort of spread out - to implement something compatible with Mastodon, you’ll need both WebFinger and ActivityPub support, and you’ll need to make sure that you’re making compatible decisions. Plus, you’ll need some specialized cryptography for HTTP signatures - something that the ActivityPub spec doesn’t specify. It’s good that we’re reusing existing specifications instead of inventing a whole new thing, but it fragments documentation and makes it a lot harder to get to a working implementation. So for the sake of getting something done, it’ll be better for me to just find a reference.
  • There are still things, like unfollowing, that aren’t implemented in the reference implementation, and aren’t well-documented anywhere.

And a reference arrived, thanks to Darius Kazemi, perhaps the internet’s most famous bot maker and experimenter. He’s been after this for years: writing ActivityPub servers on Glitch, writing guides to ActivityPub, the whole thing.

So, the whole time I was doing this I was looking at express-activitypub, one of Darius’s projects. It’s great - simple, but it works. Most of my work here was making it even simpler - removing some of the configurability and hardcoding things like accounts - and porting code that was dependent on Node.js to code that could run in Netlify’s edge functions, which are a whitelabeled layer on top of Deno and thus use standard web APIs instead.

What needs building

After spelunking in the express-activitypub reference implementation, I eventually ended up with the following extremely minimal ActivityPub essentials, listed nearly in order of difficulty:

  • A WebFinger endpoint that returns account information.
  • A user endpoint (https://macwright.com/u/photos) that returns more account information if you use an Accept: application/json header.
  • An inbox (https://macwright.com/api/inbox) that receives follow requests.
  • A process to post new photos when I publish them.

With all these together, the photos section of this website is a “user” that you can follow from a Mastodon server: @photos@macwright.com.

WebFinger

Step one is WebFinger. Computer history buffs might remember the finger protocol. This is that, for the web, without the infamous security exploits, hopefully. It’s an endpoint that you can hit to get account information. Mine only supports one user:

https://macwright.com/.well-known/webfinger
  ?resource=acct:photos@macwright.com

So, when you search for @photos@macwright.com from a Mastodon host, this endpoint is what it hits: it extracts macwright.com from the username, assumes that .well-known/webfinger is there on the server, and finds the account. Simple as that. Here’s the code - it’s nothing all that interesting.
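To make the lookup concrete, here is a minimal sketch of both sides of it. This is not the code linked above: the function names `webfingerUrl` and `webfingerLookup` are hypothetical, and the shape of the response (a `subject` plus a `self` link of type `application/activity+json`) is my assumption about the minimum a Mastodon server needs in order to find the user endpoint.

```typescript
// Sketch only: function names are hypothetical; the account and URLs come from the post.
const DOMAIN = "macwright.com";
const USER = "photos";

interface WebFingerLink {
  rel: string;
  type: string;
  href: string;
}

interface WebFingerDoc {
  subject: string;
  links: WebFingerLink[];
}

// What a remote server does: turn "@photos@macwright.com" into a lookup URL.
function webfingerUrl(handle: string): string {
  const [user, host] = handle.replace(/^@/, "").split("@");
  if (!user || !host) {
    throw new Error(`not a user@host handle: ${handle}`);
  }
  // Left unencoded to match the URL shown above; ":" and "@" are legal in a query string.
  return `https://${host}/.well-known/webfinger?resource=acct:${user}@${host}`;
}

// What this site does: answer for the one hardcoded account, 404 for anything else.
function webfingerLookup(
  resource: string | null,
): { status: number; body: WebFingerDoc | null } {
  const expected = `acct:${USER}@${DOMAIN}`;
  if (resource !== expected) {
    return { status: 404, body: null };
  }
  return {
    status: 200,
    body: {
      subject: expected,
      links: [
        {
          rel: "self",
          type: "application/activity+json", // assumed media type for the actor link
          href: `https://${DOMAIN}/u/${USER}`,
        },
      ],
    },
  };
}
```

In an edge function, `resource` would come from the request’s query string and `body` would be serialized with `JSON.stringify`; both are left out here to keep the sketch independent of any particular runtime.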

User endpoint

This, like WebFinger, was easy to implement. It’s just an endpoint that returns some JSON. Here it is.
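As a rough idea of what that JSON might contain, here is a sketch under a few assumptions. The `id` and `inbox` URLs come from this post; the `Person` type, the `#main-key` key ID, the two `@context` entries, and the helper names `actorDocument` and `wantsJson` are illustrative guesses at what a Mastodon-compatible actor needs, not a copy of the real endpoint.

```typescript
// Sketch only: field choices are assumptions about a Mastodon-compatible actor.
const ACTOR_URL = "https://macwright.com/u/photos";
const INBOX_URL = "https://macwright.com/api/inbox";

interface Actor {
  "@context": string[];
  id: string;
  type: string;
  preferredUsername: string;
  inbox: string;
  publicKey: { id: string; owner: string; publicKeyPem: string };
}

// Build the actor document. The public key lets other servers verify
// the HTTP signatures on messages this site sends.
function actorDocument(publicKeyPem: string): Actor {
  return {
    "@context": [
      "https://www.w3.org/ns/activitystreams",
      "https://w3id.org/security/v1",
    ],
    id: ACTOR_URL,
    type: "Person", // assumed; other actor types exist
    preferredUsername: "photos",
    inbox: INBOX_URL,
    publicKey: {
      id: `${ACTOR_URL}#main-key`, // assumed naming convention
      owner: ACTOR_URL,
      publicKeyPem,
    },
  };
}

// The same URL can serve HTML to browsers and JSON to servers that ask for it.
// Matches application/json as well as "+json" types like application/activity+json.
function wantsJson(acceptHeader: string | null): boolean {
  return (
    acceptHeader !== null &&
    /\bapplication\/([a-z]+\+)?json\b/i.test(acceptHeader)
  );
}
```

The point of `wantsJson` is that a person visiting the URL in a browser still gets the regular page, while a server asking for JSON gets the actor document.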

Inbox

Here’s where things get a lot more complicated. The /api/inbox function needs to:

  • Implement some HTTP signatures cryptography, which is, as far as I can tell, still a work-in-progress specification and isn’t very well described anywhere.
  • Store follow requests, and respond to them with a signed message.

So, there’s more complexity in the specific code file (which you can see here) as well as in the system. We need persistence to be an ActivityPub host: we’ll need to store a list of all our subscribers so that we can send them updates.
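The two jobs of the inbox can be sketched roughly like this. It is a simplified model, not the code linked above: a `Set` stands in for the database, the `Accept` ID scheme is made up, and `signingString` only builds the text that would be signed, following the draft HTTP signatures convention of signing `(request-target)`, `host`, `date`, and `digest`. The actual RSA signing and the verification of incoming signatures are left out.

```typescript
// Sketch only: names, the Accept id scheme, and the in-memory store are hypothetical.
const ACTOR_URL = "https://macwright.com/u/photos";

interface FollowActivity {
  id: string;
  type: string;
  actor: string; // URL of the account that wants to follow
  object: string; // URL of the account being followed
}

interface AcceptActivity {
  "@context": string;
  id: string;
  type: "Accept";
  actor: string;
  object: FollowActivity;
}

// Stand-in for the database table of followers.
const followers = new Set<string>();

// Record a follow request and build the Accept message to send back.
// Returns null for anything that is not a Follow of our one user.
function handleInbox(activity: FollowActivity): AcceptActivity | null {
  if (activity.type !== "Follow" || activity.object !== ACTOR_URL) {
    return null;
  }
  followers.add(activity.actor);
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    id: `${ACTOR_URL}/accept/${encodeURIComponent(activity.id)}`,
    type: "Accept",
    actor: ACTOR_URL,
    object: activity,
  };
}

// Build the string that gets signed with the actor's private key when
// POSTing the Accept to the follower's inbox. `digest` is expected to be
// "SHA-256=" followed by the base64 hash of the request body.
function signingString(
  path: string,
  host: string,
  date: string,
  digest: string,
): string {
  return [
    `(request-target): post ${path}`,
    `host: ${host}`,
    `date: ${date}`,
    `digest: ${digest}`,
  ].join("\n");
}
```

The signature over that string then goes into a `Signature` header along with the key ID from the actor document, which is how the receiving server knows which public key to check it against.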

This is where it sinks in: ActivityPub is totally different from RSS. Of course it is - this is a federated realtime messaging system. But think about it:

  • You can implement an RSS feed with basically any system. A static site generated by a static site generator like Jekyll? Sure! You can even write an RSS feed by hand and upload it with FTP if you want.
  • Your RSS feed doesn’t know who’s reading it. If you have 1 million people subscribed, sure, that’s fine. At most you’ll need to use caching or a CDN to help the server serve those requests, but they’re just GET requests, the simplest possible kind of internet.
  • RSS has obvious points of optimization. If 10,000 people subscribe to my RSS feed but 5,000 of them are using Feedbin, those 5,000 can share the same GET request that Feedbin makes to pull the latest posts.
  • An RSS feed reader only needs a list of feed URLs and an XML parser. It doesn’t need to have its own domain name or identity in the system. A feed reader can be a command-line script or a desktop application.

RSS (and Atom) might be the most successful “worse is better” standards of all time, up there with Markdown and JSON. Really S-Tier stuff.

Because with ActivityPub:

  • If 10,000 people follow my blog, I have a database with 10,000 entries in it.
  • Every time I publish something, I send an update to every subscriber. If this blog gets popular, it’ll send an enormous amount of updates. Maybe there’s a more efficient way to get this done, but I couldn’t find it.
  • There are many Mastodon hosts and they don’t share any kind of cache, so popular posts themselves have been known to DDoS websites.
  • There’s nothing like a “feed reader” in the world of ActivityPub. If you want to subscribe to someone’s content, you need an account and to send and receive messages. You need to be addressable on the internet.

So, given the requirements of being a participant in ActivityPub, this is the edge function that uses a database. I’m using PlanetScale, because it’s fun and a good learning experience, but anything would work.

Publishing

So, with the Inbox receiving new followers and recording them in a database, when I publish I’ll need to send messages to those followers.

I publish this site by pushing to GitHub: that’s the setup that Netlify gives me, and what I prefer for deploying overall. It’s a nice setup. It also means that, unlike a WordPress site or a hosted service, there’s no “Publish” button.

So, to publish something, I need to devise a trigger and a way for the publishing script to find new content. Here’s the publishing script I cooked up. Connecting this to Netlify’s webhooks did the trick for a trigger: when the site deploys, it hits the publishing script (which is part of the site) and publishes new updates to followers. It pulls the follower list from the database, pulls posts from the RSS feed, and pushes them.
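The core of that loop, every feed item to every follower, can be sketched as a pure function. This is an illustration rather than the script linked above: the type and function names are hypothetical, the `Create` and `Note` field choices are my assumption about what Mastodon accepts, and I’m assuming each follower’s inbox URL is already stored. Fetching the feed, signing the requests, and doing the actual HTTP POSTs are left out.

```typescript
// Sketch only: names and activity fields are assumptions, not the real script.
const ACTOR_URL = "https://macwright.com/u/photos";
const PUBLIC = "https://www.w3.org/ns/activitystreams#Public";

interface FeedItem {
  url: string; // permalink from the RSS feed
  title: string;
  published: string; // ISO 8601 timestamp
}

interface Follower {
  actor: string;
  inbox: string; // assumed to be stored when the follow was accepted
}

interface CreateActivity {
  "@context": string;
  id: string;
  type: "Create";
  actor: string;
  to: string[];
  object: {
    id: string;
    type: "Note";
    published: string;
    attributedTo: string;
    content: string;
    to: string[];
  };
}

interface Delivery {
  inbox: string;
  activity: CreateActivity;
}

// Wrap one feed item in a Create activity. The IDs are derived from the
// permalink, so publishing the same item again produces the same IDs.
function buildCreate(item: FeedItem): CreateActivity {
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    id: `${item.url}#create`,
    type: "Create",
    actor: ACTOR_URL,
    to: [PUBLIC],
    object: {
      id: item.url,
      type: "Note",
      published: item.published,
      attributedTo: ACTOR_URL,
      content: `<a href="${item.url}">${item.title}</a>`,
      to: [PUBLIC],
    },
  };
}

// One delivery per (follower, item) pair: followers x items messages in total.
function buildDeliveries(followers: Follower[], items: FeedItem[]): Delivery[] {
  const deliveries: Delivery[] = [];
  for (const follower of followers) {
    for (const item of items) {
      deliveries.push({ inbox: follower.inbox, activity: buildCreate(item) });
    }
  }
  return deliveries;
}
```

With 2 followers and 3 feed items this yields 6 deliveries, which is the chattiness described earlier: the number of outgoing requests grows with the follower count on every deploy.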

You might notice - this doesn’t check to see what’s new, it just publishes all the RSS items to all the subscribers. This is because I’ve found that publishing, in ActivityPub, is idempotent: each post has an ID, and if you push that post multiple times, Mastodon servers will check that they already have a post with that ID and ignore it.
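Here is a toy model of why that works. It is not Mastodon’s code, just a sketch of the behavior described above, with a hypothetical `Timeline` class standing in for a receiving server that keys posts by ID.

```typescript
// Sketch only: a toy receiver that models the dedupe-by-ID behavior.
interface Post {
  id: string; // stable ID, for example the post's permalink
  content: string;
}

class Timeline {
  private seen = new Map<string, Post>();

  // Store the post unless one with the same ID is already known.
  // Returns true only when the post was new.
  receive(post: Post): boolean {
    if (this.seen.has(post.id)) {
      return false;
    }
    this.seen.set(post.id, post);
    return true;
  }

  get size(): number {
    return this.seen.size;
  }
}

// Push the whole feed, old items included, and count how many were new.
function pushAll(timeline: Timeline, feed: Post[]): number {
  let added = 0;
  for (const post of feed) {
    if (timeline.receive(post)) {
      added += 1;
    }
  }
  return added;
}
```

Pushing the same feed twice adds nothing the second time, so the publisher doesn’t need to track what it has already sent. The catch is that this only holds if each post’s ID stays the same from one deploy to the next.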

Architecture

So, in the whole loop, this website receives follow requests, stores them in a database, and then sends new posts when I publish something to all of the followers.

My site is still deployed as a static website using Jekyll, but the ActivityPub and WebFinger endpoints are served by Netlify Edge Functions. This, to me, is a pretty good setup: I keep the simplicity and efficiency of static content, only layering in server-like dynamic systems where necessary.

The publishing flow - a webhook that triggers an edge function - is a hack, and something I’ll change if I can figure out a better way to do it.

It works, so far, with my photos page.

Fin

So, how does this make you feel? Excited? Overwhelmed? A little of both?

Hacking on ActivityPub was a fun project, but it was chaotic. ActivityPub in practice is a grab-bag of specifications and implementation-specific details. It was hard to find documentation for a lot of things and hard to debug requests that didn’t have their intended effect on Mastodon.

ActivityPub is a distributed architecture, so it’s going to be a lot more complicated than RSS. People smarter than me rightfully wish that ActivityPub were more sophisticated and more on the side of “better” than “worse.” And the chattiness of the protocol - the fact that if I have thousands of subscribers I’ll have to send out thousands of updates - comes with the territory. Just look at how much overhead there is in BitTorrent.

What I built isn’t an ActivityPub system as much as a Mastodon-compatible one. I think this is the key contradiction of the ActivityPub system: it’s a specification broad enough to encompass many different services, but ends up being too general to be useful by itself. There are other specifications like this - things like KML which are technically open and specified but practically defined by what Google Earth supports and produces.

With this frame of mind, the question becomes: if ActivityPub probably isn’t going to be a self-contained standard but instead the basis for one or two popular, homogeneous implementations, and if federation is probably going to be a secondary property of those implementations, is the specification technically good enough, useful enough, correct enough, that a future Twitter competitor will use it? I’m not sure.