I’m quite interested in instant messaging technology! I’ve been an avid user of internet relay chat (IRC) – probably the oldest chat protocol in existence – for years; I went through a brief period of using Matrix.org for everything; and I’ve of course sampled most of the smorgasbord of proprietary chat solutions, including WhatsApp, Discord, Facebook Messenger (eww!), and probably some others I’m forgetting. This blog post is essentially a data dump of opinions and thoughts on the various open chat standards and protocols that are out there (and if you get all the way to the very end, you get to hear me propose yet another one of them…!).

The dream of the universal standard

There have been countless blog posts on how it sucks that we have lots of things to choose from, and there isn’t any universal standard yet for everything. Of course, there’s even a relevant xkcd:

xkcd #1810: Chat Systems

Writing a universal standard is hard, and indeed there’s yet another relevant xkcd about that one as well[1]. Part of the problem is that, well, instant messaging provides various ways for people to express themselves, and so different people are likely to prefer different platforms – some people actually like IRC for what it is, a basic, purely textual form of communication, whereas others want features like avatars, images, and other modern niceties. Some people want encryption or privacy and would prefer that to take centre stage, whereas others want to use custom emoji, have native GIF support, and stuff like that. In some ways, the reason we have so many chat protocols and services available is due to this difference in preference and target market:

  • WhatsApp is, or was originally, “simple SMS equivalent that doesn’t cost as much and that lets you have group chats”
  • Discord is “chat for gamers”
  • Slack is “chat for Big Important Corporate Uses”
  • IRC is “basic text-mode chat for developers and people like SirCmpwn”

While it would conceivably be possible to unify all of these competing applications under one common standard, doing so would seem a bit awkward. WhatsApp, for example, is the only one of the above platforms that supports end-to-end encryption, and it doesn’t store any chat history on the server either; conversely, Discord and Slack treat your chat history more like an append-only database, with the whole thing being easily searchable (indeed, Slack try to make money by charging you for access to this database).

Bridging

The eventual fate of an open messaging standard is not generally one where everyone on the planet drops whatever tools they’re using and starts to use it. After all, standards are paper, and paper isn’t worth much on its own; you need to be able to talk to the friends you already have! To remedy this, most protocols worth their salt have an ecosystem of various bridging tools to go with them that let you do just that; for example, my own sms-irc project lets you talk to your WhatsApp contacts using your IRC client.

Of course, not many chat services open up their protocols and make this easy to do – the last mainstream one to really do so was Google Talk, which was based on the open XMPP standard, and even supported federation (i.e. you could spin up your own XMPP server and talk to people using Google Talk, which was nice). Services like WhatsApp are based on standards like XMPP, but they’ve heavily locked everything down so you can’t just go and connect with your XMPP client. Discord has an API, but using it on user accounts (instead of their officially sanctioned bot account option) is forbidden; and I could go on. The main point is, of course, it’s not beneficial for 99% of these chat companies to allow people to connect their own custom software; their business model usually heavily relies on you using their version, so they can either serve you with ads, perform some tracking, etc., or so they can pull a Slack and prevent you logging your chat history elsewhere. In fact, Slack even shut down their IRC gateway a while ago – the reason they gave was along the lines of “we have custom features that don’t come through well via gateway”, but everyone secretly suspected it was to increase vendor lock-in.

Nevertheless, bridging tools do exist – bitlbee is one of the most popular ones for IRC, XMPP has Spectrum 2, and Matrix has their collection of bridges. As an example, I personally use IRC, but most of the people I talk to use other chat platforms; IRC is just a way to bring all of my communication together in one place. Indeed, this is one of the main selling points for open protocols and platforms: “in the future, people will realise our standard is better, but for now, you can still talk to your old friends”.

Competitor analysis

In terms of protocols (not services) that I could use to set up a personal messaging system, there are really only three competitors: IRC, XMPP, and Matrix. Of these three, they each have their advantages and disadvantages:

IRC

  • …is rock-solid reliable, perhaps to a fault
    • If you send a message to someone, the message is going to arrive within a few seconds of you sending it, maximum.
    • If it doesn’t, you will get some obvious error as to why.
  • …doesn’t handle mobile connections / Multiple Points of Presence (MPoP) well
    • People nowadays expect to be able to use their mobile device. IRC is still really tethered to the desktop.
    • Mobile IRC clients exist, but require usage of a bouncer or something to multiplex your desktop and mobile connections together, and maintain the connection when your phone goes offline.
    • On Android, the mobile story is a bit dire, if you exclude things like Weechat Android and Quasseldroid that are frontends for some bouncer or other desktop client.
  • …is incredibly simplistic at the protocol level
    • You can chat on IRC using just telnet. The protocol really isn’t very involved, and is quite easy to write parsers for.
    • IRC doesn’t use JSON, XML or any other sort of formal data format.
  • …is incredibly simplistic in terms of functionality
    • There’s no chat history, sending messages to people who are offline, or anything modern, really.
    • No typing or delivery notifications either.
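
To give a flavour of how little there is to the IRC wire format, here is a minimal sketch of a line parser in Python. It assumes the classic "optional prefix, command, parameters, optional trailing parameter" layout and ignores newer additions such as IRCv3 message tags; the names IrcMessage and parse_irc_line are made up for this example and don't come from any particular library.

```python
from typing import List, NamedTuple, Optional


class IrcMessage(NamedTuple):
    prefix: Optional[str]  # sender, e.g. "nick!user@host"; None if absent
    command: str           # e.g. "PRIVMSG", or a numeric reply like "001"
    params: List[str]      # middle parameters, plus the trailing one if any


def parse_irc_line(line: str) -> IrcMessage:
    """Parse one line of the classic IRC wire format (no IRCv3 tags)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # The optional prefix runs up to the first space.
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is one trailing parameter,
    # which is the only parameter allowed to contain spaces.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        raise ValueError("IRC line has no command")
    params = parts[1:]
    if sep:
        params.append(trailing)
    return IrcMessage(prefix, parts[0].upper(), params)
```

Feeding it a line such as ":eta!eta@example.org PRIVMSG #chat :hello there" gives back the sender, the command PRIVMSG, and the two parameters "#chat" and "hello there". A real client would want to be stricter about malformed input, but the core of it really is about this small.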

XMPP

  • …uses XML
    • XML is crap[2] and hard to write parsers for.
    • Older programming languages like Java and friends handle XML fine, but it doesn’t necessarily translate that well in a world where everything is JavaScript or Rust.
    • Even if you think XML is okay, it’s still arguably too complex for a messaging protocol.
  • …suffers from extreme fragmentation, both protocol-wise and app-wise
    • XMPP comes as a base standard that then gets extended by a bunch of XEPs (extensions to the spec). This is dangerous, since quite a few clients don’t bother implementing all the useful XEPs you actually need to make XMPP worthwhile.
    • There seems to be quite a large disparity between best-in-class XMPP clients like Conversations and the rest of the ecosystem.
  • …has built-in support for Multiple Points of Presence (MPoP)
    • At least they thought this one through; you can chat on multiple devices at once, and that’s natively supported by the protocol.
  • …doesn’t really seem to be used (as an open protocol, that is) by many
    • Quality iOS XMPP clients, for example, don’t really exist. Sure, there are a few, but they’re quite hacky.
    • Weirdly popular in Germany, though (?)
  • …has some support for end-to-end encryption
    • See this website, which tracks implementation in multiple XMPP clients.
  • …subjectively never really worked for me
    • I did give XMPP a try once, and found it quite shoddy. Group chats would fail to work in mysterious ways or not sync across devices, with no real reason as to why; chat history didn’t always work as advertised; and I generally got a bad impression of the whole protocol and its ecosystem.
    • The fact that multi-user chats were an extension instead of being built in to the protocol is perhaps a cause of some of the jankiness.
    • I acknowledge that I’m probably not doing XMPP justice, but there you go; I didn’t see anything in it when I tried it!
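
For comparison with IRC's line-based format, here is a rough sketch of what pulling the fields out of a single XMPP chat message looks like, using Python's standard xml.etree.ElementTree. The addresses and the helper name parse_message_stanza are invented for illustration. This is also a heavily simplified picture: a real XMPP connection is one long-lived XML stream with namespaces on it, so an actual client needs an incremental parser rather than parsing each stanza as its own little document.

```python
import xml.etree.ElementTree as ET

# A standalone <message/> stanza, written without the stream namespace
# that would surround it on a real connection (a simplifying assumption).
STANZA = (
    "<message from='alice@example.org/phone' to='bob@example.net' type='chat'>"
    "<body>hello there</body>"
    "</message>"
)


def parse_message_stanza(xml_text: str) -> dict:
    """Extract sender, recipient, type and body from one message stanza."""
    root = ET.fromstring(xml_text)
    if root.tag != "message":
        raise ValueError(f"expected a <message/> stanza, got <{root.tag}/>")
    return {
        "from": root.get("from"),
        "to": root.get("to"),
        "type": root.get("type"),
        # findtext returns None when the stanza carries no <body/> child
        "body": root.findtext("body"),
    }
```

With a decent XML library the happy path is short enough; the complexity tends to show up around streaming, namespaces, and the extensions layered on top.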

Matrix

Oh, boy.

  • …uses JSON and HTTP
    • This is better than XMPP, since these standards are actually used nowadays, and are relatively lightweight; pretty much everything can speak JSON.
    • Some people do complain about HTTP being still a bit too heavyweight, in contrast to something like XMPP where you don’t need to pay the price of making a new HTTP request every time you want to send something.
    • However, Matrix’s long polling model for fetching new messages is something I actually think is quite clever; it’s clearly been designed thoughtfully to allow clients to perform well with patchy connections, which is a pain point in older protocols like IRC.
  • …has a growing number of people, organizations and development effort rallying around it
  • …has a healthy bridging ecosystem
    • Taking a look at their bridging page shows as much; this is one thing they are pretty good at.
    • The specification even includes a separate part for Application Services (ASes), which are specifically designed to do things like bridging.
  • …has a somewhat problematic, slow reference implementation
    • Essentially, the problem is that the protocol requires a “state resolution” algorithm to verify permissions in a chatroom. This is the root of all sorts of performance issues, and also has been the source of security issues in the past (q.v.)
    • There are long-standing related issues like #1760 that still aren’t fixed and have lots of users complaining.
      • (although there’s now a somewhat hacky workaround for this that involves sending dummy events into the room)
    • I mean, just browsing issues flagged major is pretty enlightening, and hints at some real problems with the way the reference implementation is built.
    • In my personal experience, I ran a Matrix homeserver for the best part of a year until I got fed up with it; I had to enable zswap / zram on the servers I was running it on (since it had a habit of eating all the RAM available), and had to contend with 100% CPU usage spikes every time I logged on.
  • …is questionably reliable as a messaging platform
    • When I used to use it, I’d frequently encounter problems with messages just not being delivered, or push notifications breaking.
    • I once had an incident where a friend of mine, trying to send messages to me on my self-hosted server, was sending things for about a week without me getting any part of it, until I figured out something was amiss and rebooted it.
    • I personally value the messages actually being delivered above all else, and Matrix at least in my experience is pretty bad at this…
  • …has a questionable security story
    • The official matrix.org server has been hacked in the past, although this is nothing to do with the protocol.
    • The original version of the protocol had a bug known as “state resets”, where room state would be reset back to some earlier version. This caused all sorts of fun security issues – one user was even able to wrest control of the main Matrix HQ chatroom back in the day and ban the project developers from it – until they fixed it.
      • This also resulted in the hilariously named “Hotel California bug”, where people could leave a chatroom, but would end up being forcibly rejoined whenever a state reset occurred.
      • This was eventually solved with a new implementation of the “state resolution” algorithm – which required upgrading rooms to support the new version, essentially creating a new chatroom and trying to get everyone to join it.
    • Although they’ve cleaned a lot of things up for their Synapse 1.0 release, and it’s improving, looking at their list of open and closed security bugs isn’t exactly reassuring.
  • …has a questionable data model for chat purposes
    • Really, the root cause of a lot of Matrix’s problems seems to be that each chatroom is actually a distributed database, stored in the form of a directed acyclic graph (DAG) – that anyone can append to, given it’s all publicly available over an open federation.
    • Trying to do things in a way where the chatroom is completely independent of any of the servers it’s hosted on is cool, but also quite unwieldy.
    • Arguably, a more lightweight protocol based on message passing would be better suited to chat – but hey, it’s a free world, and different implementations can exist for a reason.
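
To make the "chatroom as a DAG" idea a bit more concrete, here is a toy sketch in Python. It is not how Synapse or any other homeserver is implemented; Event, Room and tips are names invented for this example, and everything that makes the real thing hard (signatures, permissions, state resolution) is left out. The one idea it borrows from Matrix is that each event points back at the events it was appended after (the prev_events field), so that two servers sending at the same time create a fork which a later event joins back together.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Event:
    event_id: str
    sender: str
    body: str
    prev_events: Tuple[str, ...]  # IDs of the events this one was appended after


class Room:
    """A toy room: nothing but a set of events linked by prev_events."""

    def __init__(self) -> None:
        self.events: Dict[str, Event] = {}

    def append(self, event: Event) -> None:
        # Every parent must already be known, which keeps the graph acyclic.
        missing = [p for p in event.prev_events if p not in self.events]
        if missing:
            raise KeyError(f"unknown parent events: {missing}")
        self.events[event.event_id] = event

    def tips(self) -> Set[str]:
        """Events that nothing points back at yet (the current ends of the graph)."""
        referenced = {p for e in self.events.values() for p in e.prev_events}
        return set(self.events) - referenced
```

If two users on different servers both reply to event "$1" at once, the room briefly has two tips; the next event lists both of them in prev_events and the graph is back down to one. A plain message-passing protocol never has to track any of this, which is more or less the trade-off I'm getting at.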

A gap in the market

This is getting quite long (especially with the rather dense bullet point list presented above!), so I’ll wrap things up: essentially, I think there’s a gap in the market somewhere between IRC and Matrix for a new standard, or at least an attempt at one. Matrix does a lot of things right, but is a bit too ambitious; they’re trying to tackle the problem of making a distributed (in such a way that it doesn’t depend on any single server), end-to-end-encrypted, fault-tolerant database, which is arguably a cool thing! However, I posit that something more like IRC or XMPP is better suited to chat: it seems like it could be implemented in a far more lightweight manner while retaining quite a lot of the functionality – as XMPP does, but with less XML and fewer hacky standards.

Bonus round: Mastodon

The Mastodon decentralized social network is a pretty good example of what I mean, actually; it’s a relatively simple, open protocol (ActivityPub) that doesn’t try to implement the world (users are still tied to a single server, for example, and data is lost if that server dies) – but its simplicity has allowed for the development of other so-called “fediverse” servers, like Pleroma, and even ultra-minimalist ones like honk[3], without too much hassle.

It’s simple, reliable, and still federated; it works well enough for non-geeks to use it, because the Mastodon UI essentially imitates Twitter. IRC is simple, reliable, but not federated and hard for non-geeks to use. XMPP is, well, XMPP. Matrix is not simple and not reliable, but it is federated and reasonably easy to use.

So, I suppose, why don’t we just build a Mastodon, but for chat?

(Stay tuned; this isn’t the only thing I’m going to post about this…!)


  1. Which I’m not going to include inline, otherwise this whole blog post is gonna be a string of xkcd comics. 

  2. “XML is crap. Really. There are no excuses. XML is nasty to parse for humans, and it’s a disaster to parse even for computers. There’s just no reason for that horrible crap to exist.” ~ Linus Torvalds 

  3. I use this! Check out honk.theta.eu.org (you can follow @eta@honk.theta.eu.org from your Mastodon if you like!).