I've been revisiting a move to self-hosted Matrix roughly every 4 months for 2 years now, and every single time I failed.
The reasons vary; initially synapse refused to work, then I got stuck trying to set up a multi-domain service.
That said, this document verifies what I feared in the background: what Matrix offers as self-hosted is too simple to be true, and thus it's not surprising I never got it completely running.
XMPP has its own issues, but when I self-host it, it's there, nowhere else. No identity servers, no push servers, no jitsi servers in the background.
It seems like I'm going to be with XMPP for a much longer time.
Matrix tries to do a lot more than XMPP. In my experience, people find XMPP too limiting, so they don't use it.
It's not that simple. Many people don't know about OMEMO, Jingle, etc. when it comes to XMPP, or about XMPP bridges like biboumi.
Matrix is doing the same thing, but on different - and way more complicated - infrastructure ideas.
Prosody is definitely much simpler to configure, even with multiple domains, especially when per-domain settings are needed.
So you can't just say, let's use XMPP. You have to be very specific and make sure people use the right versions.
You can say that Prosody is easy. I don't find the following list easy: https://prosody.im/doc/modules
And you probably need 'Component "conference.example.org" "muc"' for any kind of 'room' support.
The next question: does Prosody have the equivalent of federated rooms in Matrix? Here is a list of XMPP extensions in the documentation:
I guess the answer is, there are no federated rooms in prosody.
Another question is whether it is possible to send someone an XMPP message when that person is offline. I have no idea how to search for that.
I guess newer Prosody supports this and much more out of the box but the generic configuration instructions I used are here: https://serverfault.com/questions/835635/what-prosody-module...
> The problem with omemo, jingle, etc. is that you have to make sure that the XMPP components you use support them. It would be fine if all popular implementations support those features, but that's not the case.
Isn't this also the case with Matrix that no implementations except the official ones support E2E encryption?
This isn't true. There are independent working E2E implementations in weechat-matrix and pantalaimon (python-nio), nheko (mtxclient), and even a read-only one in purple-matrix. Meanwhile lots of independent apps build on the official SDKs (e.g. Seaglass on macOS, bots like Matrix-Recorder, the various Riot forks, etc)
In Matrix, clients are supposed to implement the full Client-Server API. If a client leaves out e2e then it cannot claim to implement the matrix protocol.
Given that 1.0 of the full Matrix protocol was published only a few days ago, it makes sense that anything other than the official implementations is behind. E2E in the official implementation is not that old either.
Even the official implementations need some work to be useful. For example, cross device signing is not there yet.
Simply retaining messages until the client is online has been supported for ages; I don't think there is a XEP for that, but I could be wrong. A more elaborate scheme with persistence, multi-device access, paging etc. is here: https://xmpp.org/extensions/xep-0313.html
> In Matrix, clients are supposed to implement the full Client-Server API. If a client leaves out e2e then it cannot claim to implement the matrix protocol.
Yes, XMPP has XEP suites that serve the same purpose: https://xmpp.org/extensions/xep-0387.html
A general purpose project to provide all clients with E2E encryption is being developed (and is usable right now): https://github.com/matrix-org/pantalaimon
Right now it runs as a daemon on a user's local machine.
I too ran an XMPP server for years, used plaintext and OTR, it was nice.
This was in the days before everyone had tablets, phones and laptops. I used ChatSecure (formerly Gibberbot) on my phone and Pidgin on my PC.
OMEMO wasn't invented then and nobody else had the double ratchet so I had to just deal with the fact there wasn't multi device support and E2E.
Google and Facebook both offered XMPP bridges too. Google and Facebook have discontinued such services. Voice/Video never worked with them and file transfer with Pidgin never worked with Google Talk.
Now in today's world, how am I seriously going to convince my friends to use XMPP when they will say, can we use camera/video, oh do we have group E2E too?
Am I seriously going to say "let's use this XMPP client for chatting in text because it supports OMEMO, and let's use this other client because now we want to have a video call"? What am I going to do when they're on Android? Conversations.im is nice, but there's no voice/video with that.
The problem simply is there's no reference client that does everything. Many of the clients are ancient fugly GTK clients, and if they do Jingle it's only on Linux (Pidgin, Gajim, Telepathy based etc)
If we all use different clients for different things how am I supposed to say "Mom you click here to do that".
Then you do have some promising clients like Coy.im that look nice. And they've said NOPE NO OMEMO HERE. https://github.com/coyim/coyim/issues/233#issuecomment-21200...
Oh you can have video here, but no OMEMO https://github.com/jitsi/jitsi/issues/199#issuecomment-17017...
If Conversations offered voice/video, that would have been a different story.
That said, self hosting matrix seems to be similarly hard to execute at this point in time - simply too many opaque and moving components on the server side.
The riot client is also incredibly slow for my taste.
In terms of too many opaque moving components serverside: the baseline is just a homeserver. pip install matrix-synapse and off you go. Configure your client not to use an identity or integration server if you are worried about them.
How about removing the ability to copy text? Or the ability to take screenshots?
The following stack will be used as reference, with users
connecting via web, desktop and smartphone clients:
Client: Riot-web v1.2.1,
Riot Desktop v1.2.1,
Riot Android v0.9.1
Server: Synapse v1.0.0
Unfortunately, it might not be a good idea to trust that a version number consistently maps to a specific URL, or that a server will give the same file to everyone each time they ask for a URL. We know that sending different versions to different people is common ("A/B testing"). If you're investigating the security of something, or worse, you suspect you might have sentient opponents actively trying to deceive you, then version numbers are no longer sufficient: you should also include cryptographic checksums! The only way you can know that the file you received is the same is if you have e.g. SHA-2 hashes as proof. Even better, if it's important, include the RIPEMD-160, SHA-1, CRC32, and any other available hash/checksum, because why not add redundancy and give people options.
$ sha256sum *
In the interest of making a reproducible investigation, it might be a good idea to include hashes for the specific packages being investigated.
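As a rough sketch of what recording those hashes could look like, the snippet below computes a SHA-256 digest for every file in a directory, much like `sha256sum *`. The function names `sha256_of_file` and `hash_manifest` are made up for illustration and are not part of any Riot or Synapse tooling.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large release archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_manifest(directory):
    """Map each regular file name in `directory` to its SHA-256 digest."""
    return {
        entry.name: sha256_of_file(entry)
        for entry in sorted(Path(directory).iterdir())
        if entry.is_file()
    }
```

Publishing the resulting name-to-digest mapping next to the version numbers would let a reader check that they are looking at the same bytes.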
> Give the git commit hash maybe?
That would probably work? This gets into the problem of reproducible builds, where builds from different environments might not be identical. This means documenting that you used "a build of version 1.2.1 git commit 7446799e4b0e3e65122f5642b5f3a8c59aae15bf" means something slightly different than saying you used "riot-v1.2.1.tar.gz with SHA256 8020cc617367a4318be090b1562a26571f1a3417b0d4a52b2d4f19e03d6126c1". That said, obviously having literally any hash to work from is much better than using version numbers alone.
Github links that include the commit hash might be useful, but it seems like you cannot link to both a tag and a hash? I wonder if github supports links that are a combination of https://github.com/vector-im/riot-web/releases/tag/v1.2.1 and https://github.com/vector-im/riot-web/commit/7446799e4b0e3e6... ?
We acknowledge the need for reproducible investigation, but the document did not explain in a scientific manner how we reached such outcomes. We had to draw a line to keep the document on point with our message. Adding hashes wouldn't really make a significant difference.
We'll make sure to keep this in mind if we do write a follow-up with details on reproducible checks though. Thank you for your insight!
Much to improve.
Agreed that we should do better at presenting a max-privacy config preset and explaining how identity/integ/notary servers work to users (without making the UX unusable), but to throw away the whole project over this is throwing out the baby with the bathwater, imo.
Example: for Scalar, the issue says that Riot talks "too much" to it. The research is not about how many times Riot talks to it. It is that Riot talks to it before the user explicitly requested the service, and in a way that the user does not expect.
As we wrote in the paper: "Privacy protection is a mindset". It is not about fixing individual issues and then have new ones pop up because the underlying problem is not fixed. It is about having a process in place so it cannot happen again.
Matrix devs, instead of battling the reviewer here, please make a proper blog post and explain what is really going on here. Tell us the truth about your data handling and the data retention.
The reviewer did his own share of work. If there are mistaken parts in his reporting, please correctly explain them in a civilized way in a possible blog post.
Yeah, the issue with all these replies is that the real replies that address the issues can be lost. I really recommend that you put that on your official blog so we all can benefit from reading your response.
Replying in a blog post to a living research document intended to be used by another project/protocol feels silly. Whatever you write will certainly no longer be up to date with the corrections we made thanks to the Matrix community that joined the room, and the Grid community that kept on discussing the doc.
Where is Matrix.org tho? This could also be a good occasion for the three new guardians to join the community.
"Riot needs permission to access your address book contacts to find other Matrix users based on their email and phone numbers. Please allow access on the next pop-up to discover address book users reachable from Riot."
That said, this analysis does have a few valid points in it, specifically:
* We should probably provide a click-thru when users interact with 3rd party identity lookup servers or integration managers
* We should hash contacts when doing bulk lookups
* Riot/Web has a bug where it talks to the integration manager too frequently (https://github.com/vector-im/riot-web/issues/5846)
* Notary servers should eventually be removed entirely (as per MSC1228).
However, most of the rest of it is alarmist and disproportionate FUD, plus the author has sadly forgotten to disclose that he's working on a hostile fork of Matrix. A point by point response is at https://matrix.org/~matthew/Response_to_-_Notes_on_privacy_a... fwiw (apologies for the PDF, but Google Docs doesn't seem to expose a read-only view of commented docs.)
Please don't say please to make people perform questionable privacy violations. How about:
"If you want Riot to determine which of your contacts also use Matrix and to easily enable you to talk to them via Riot, you can allow Riot to access your contact list."
Or something like that, whatever it really does.
Disclaimer: I like matrix :)))
I know very well that concise wording in UIs is incredibly hard, but this wording is misleading. It makes it sound like Riot will not work unless I allow access to the address book. I'd suggest something like this:
> Riot can check your address book to find other Matrix users based on their email and phone numbers. If you agree to sharing your address book for this purpose, please allow access on the next pop-up.
Why can't Riot "check my address book to find other Matrix users" without sharing it to their server? Could the client make a one-way hash for each contact and send those to the server to compare against hashes of other contacts?
So in your wording, it's not clear that B (agree to sharing my whole address book) follows from A (Riot wants to check my address book for a purpose). Actually it's not clear if "agree to sharing your address book" means sharing it with the client app, or sharing it with a remote server that someone else controls.
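The client-side hashing idea asked about above could be sketched roughly as follows. This is not the Matrix identity API: `hash_contact`, `matching_contacts` and the `pepper` value (assumed to be handed out by the server) are hypothetical names, and the sketch only shows the general shape of comparing hashes instead of plaintext addresses.

```python
import hashlib


def normalise(address):
    """Lower-case and trim an address so both sides hash the same string."""
    return address.strip().lower()


def hash_contact(address, pepper):
    """One-way hash of a contact; `pepper` is an assumed server-provided value."""
    material = normalise(address) + " " + pepper
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


def matching_contacts(address_book, server_hashes, pepper):
    """Return the local contacts whose hash appears in the server's hash set."""
    return [
        address
        for address in address_book
        if hash_contact(address, pepper) in server_hashes
    ]
```

Note that hashing alone does not make low-entropy identifiers such as phone numbers secret: anyone who can compute the same hash can enumerate candidate values and compare.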
In terms of actually handling the issues, the scalar issue is one we brought up with Ben months ago in private as per your disclosure policy, and yet nothing was done. This is just an example of a long list of issues brought up over the years.
The point of the document is not to find justification for what is happening, but to inform users that it is happening. An attacker got access to your systems which contained logs from which such data can be gathered. It is important that users who self-host and do not expect such data to get out realize that it does so they can take appropriate action.
The document might feel alarmist, certainly. It does not feel alarmist because we wrote it. It feels alarmist because the behaviour described is happening and nothing is done about it. It is not discussed anywhere. Attempts to do so are shut down. But it does not change anything: leaks are happening right now on thousands of servers and for millions of users (up to 9M, as per Matrix.org figure) and every person who we showed this to before publishing had the same reaction: "I never expected such data to go out like this. I am worried".
As for Grid, we made a specific effort out of respect for the Matrix.org people not to mention it or steer towards it. Yes we have forked Matrix. No it is not hostile, despite your continuous claims to label it as such.
We think it is time to stop talking about all the good reasons why, in the 5 years it took to get Matrix out of beta, there was just no time to deal with such leaks. We think it is time to start talking about how we can make sure it stops from happening and which decisions lead to it happening for so long unnoticed.
You wrote the software. Start respecting your users' privacy.
Especially your ongoing notion of metadata as private information which should be hidden is funny: how do you intend to do that? Short of wrapping your application into Tor (which seriously impacts performance letting your average family member happily pass it), I can't think of any method not including any BS-Bingo (how about a blockchain...).
I agree that the vector.im-identity service seems really unnecessary, and it reminds me of Mozilla's approach to sync (yeaah, the ones with your cited manifesto cough); still, I was well aware that this means regularly contacting this server and probably also checking my contacts DB against it (as well as having metadata on my browser, like every other website + its 23 ad-networks, uuh). Also, for anyone interested in actually hosting a server, it's really spelled out plainly that this is a measure for convenience and you can still host your own server. Btw: did you ever try to integrate federation into sydent (you might show the world your archived Issue/PR...)?
The part about the integration server is indeed worrying (but not you, putting at the end?!?) because without it, I don't really see the value proposition of matrix compared to plain old XMPP (and I wonder how you intend to monetize on kamax...). And I wasn't really aware of it...
The other parts
- I didn't give an email, which wasn't a problem for me, and I seriously can't imagine any way to resolve this w/o the aforementioned BS-bingo or yet another piece of personal information (a private/public key, which is beyond scope for most people and creates its own set of problems, such as people with unencrypted keys on their machines...)
- so the only way for matrix to read messages is by adding a bot? can the scalar.vector.im server initiate that too? otherwise your claim that vector.im can read all your messages is just BS
- you never mention that encryption by default would be cool. How will kamax.io handle this?
We did the next best thing after improving sydent: we wrote our own implementation of an Identity server: mxisd. We linked it several times in the doc. You should give it a look. That's one example of how you can be better at privacy.
If the content of the document does not surprise you, and you were fully aware of all that was going on, it is also a win! Sadly, this is not our experience with the many users we came in contact with. They did not know, but wanted to know in detail.
We do not mention End-to-End encryption would be cool indeed because it would not change what is happening here. In Matrix, the encryption would only cover the content of the event, but not its metadata (sender, source, timestamp, etc.). The document is clear that the vast majority of the leaks are around metadata (who sent what, who did what, when, from where) and not data itself (the message itself).
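To illustrate the content-versus-metadata distinction, here is a simplified sketch of an encrypted room event. The field names are modelled on the Matrix event format as best I recall it, all values are invented, and `visible_to_server` is a hypothetical helper that only shows which parts stay readable without the decryption keys.

```python
# Simplified, illustrative event; values are invented, not captured traffic.
encrypted_event = {
    "type": "m.room.encrypted",
    "sender": "@alice:example.org",
    "room_id": "!room:example.org",
    "origin_server_ts": 1560000000000,
    "content": {
        "algorithm": "m.megolm.v1.aes-sha2",
        "ciphertext": "<opaque base64>",
        "session_id": "<opaque>",
    },
}


def visible_to_server(event):
    """Return the parts of an event a server can read without any keys."""
    # Everything outside `content` (who, where, when) is plaintext.
    visible = {key: value for key, value in event.items() if key != "content"}
    # Inside `content`, only the ciphertext itself is unreadable.
    visible["content"] = {
        key: value
        for key, value in event["content"].items()
        if key != "ciphertext"
    }
    return visible
```

Under these assumptions the sender, room and timestamp remain visible even though the message body is not.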
This document only scratches the surface of privacy in Matrix, by being specific to Matrix.org and its choice of recommended software. It gets worse as we start investigating the protocol itself. It is your choice to see this as FUD. It does not make it less true, and while you might not care, some do. We published the document for those who care and do not have the means, time or capacity to do such research themselves.
For the perception/expectations of average Joe on privacy/obscurity on the internet I recommend you read the recurring threads on any platform whenever there is a new "scandal" centered on WhatsApp (Europe): half of your commenters will just tell you that they are gonna use Telegram (yeah, the ones where you don't know exactly who's behind it and who think that encrypted group chat is too much of a hassle).
Regarding your comments that the protocol is broken, I really wonder how you intend to tackle this. Why the hell are you using the very same protocol, driven by a body which you claim is non-transparent and non-cooperative? If all your allegations are true, you would have been better off rolling your own; your software won't be compatible for long if you take your own writing seriously...
P.S.: care to elaborate who's "we"? Your projects have a surprisingly low number of contributors (which hopefully changes now), so I can't really figure out, why you are not just saying "I". Also don't know what's so bad on taking a stand in a civilized public discussion (if "we" decided to be anonymous)?
You went to quite a length to work in this irrelevant slight.
I don't know enough about the design/implementation and the overall context to add anything of technical value to conversation and I won't even try. I would, however, like to point out that both your 'tone' and 'demeanour' come across as incredibly hostile and unnecessarily personal.
Also, dismissing a possibly valid criticism or review of something because it doesn't present immediate solutions to the problems highlighted doesn't mean the criticism itself has no worth, and to discount it out of hand for this reason is folly.
Other than that I suppose this is personal, since this seems to be the personal pet project of the author (and he constantly assumes a "we" as if multiple people were signing his rant).
And if you read my comments carefully, I never criticize the lack of solutions. My criticism rests on the fact that he starts with very clickbaity "facts" which are then elaborated on the assumption that the average Joe is running his own homeserver without knowing the tradeoffs of it. He then happily mixes up problems of any non-Tor messenger (of which matrix.org matrix is one...) and specific problems of running a Matrix homeserver with the recommended settings, using the vector.im phonebook (and apparently a bunch of these settings are bugs...).
Such as it stands, this is convoluted FUD. If he wanted to make a constructive contribution he might as well have stated the problems upfront (the current working matrix implementation relies on proprietary/centralized services) and then gone into a discussion of every problem step-by-step (and advertizing the very solutions he tries to sell). The problems of matrix/vector/riot are as real as Mozilla pivoting to the Firefox Service Company and integrating a host of proprietary tools, but the problem has deserved better than this rant...
> Bitmessage was conceived by software developer Jonathan Warren, who based its design on the decentralized digital currency, bitcoin.
So, looks like "blockchain"? Anyone else?
From the FAQ: https://bitmessage.org/wiki/FAQ
> On average it should take 8 minutes from the time you click the send button to the time you receive a response.
Whew. Privacy really costs.
This is way better than the behavior of proprietary programs, but I'd much prefer if the program actively discouraged uploading private information. I can deny the permission myself, but there's not much I can do about other people's behavior, and with the way Riot currently works they are going to end up uploading my personal information.
A client that isn't even able to access contacts is one of my top wishes for Matrix right now. As soon as that appears I'm going to start recommending it to everyone over miniVector.
> If you don't specify an email address, you won't be able to reset your password. Are you sure?
In your response on pg. 4, you say:
> Commented : Yup, this is the point of the service -to map email addresses and phone numbers to matrix IDs.
Is it possible to specify an e-mail address to be able to reset passwords without making this e-mail address public? Clearly, this should be the default setting if someone enters an e-mail address after the above prompt by Riot.
We could also split it into separate actions (one to set it for password reset, and one to use it for discovery), and indeed before Riot this is how it used to be (there was a checkbox in Matrix Console at registration to let the user choose whether to bind their email). This got lost in Riot because of concerns that it made the registration UX too noisy and complicated (especially with custom HS & IS URLs flying around the place), so it currently binds their email by default. I've just filed https://github.com/vector-im/riot-web/issues/10054 to track addressing this.
You said "making this e-mail address public" in your question - it's worth noting that binding a 3PID does not publish it in a public list; instead, it means it can be used as a key to look up your MXID for users who already know your email address.
In terms of the other valid points the analysis raises, I've also filed a bug to track hashing contact details when doing lookups (https://github.com/matrix-org/matrix-doc/issues/2130, although I could have sworn we had one already). The other two issues (Riot/Web talking to Scalar too much, and the desire to remove notary servers entirely) already have bugs: https://github.com/vector-im/riot-web/issues/5846 and https://github.com/matrix-org/matrix-doc/issues/1228 respectively.
> You said "making this e-mail address public" in your question - it's worth noting that binding a 3PID does not publish it in a public list; instead, it means it can be used as a key to look up your MXID for users who already know your email address.
The domain part of e-mail addresses is public anyways due to certificate transparency, meaning that an interested party would only have to enumerate the local part to find all e-mail addresses from a specific domain used by Matrix users. In this respect, the lookup answers the question "Does this address exist?" and as such makes it public.
Once again, it's not about brute listing things. It's about knowing a 3PID from another source, like a dump of email/phone number on the darkweb which can then be used to query for a mapped Matrix ID. Or simply an email given for another purpose to the same server.
It is all fun and games until you start correlating data sets, like claudius points out correctly with other public lists.
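The correlation concern can be made concrete with a toy model. The dictionary below stands in for an identity server's address-to-MXID bindings; `lookup` and `correlate` are hypothetical names and say nothing about how a real identity server is implemented or rate-limited.

```python
def lookup(bindings, address):
    """Toy identity lookup: return the bound Matrix ID, or None if unbound."""
    return bindings.get(address.strip().lower())


def correlate(leaked_addresses, bindings):
    """Match an externally obtained address list against the lookup service."""
    matches = {}
    for address in leaked_addresses:
        mxid = lookup(bindings, address)
        if mxid is not None:
            # Each hit links an identity from another data set to a Matrix ID.
            matches[address] = mxid
    return matches
```

In this model nobody has to list the bindings: holding addresses from another source is enough to learn which of them map to a Matrix ID.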
access != upload
This is the same wording Facebook/LinkedIn/etc. used to get our contact lists.
> a hostile fork
What? Matrix is Free Software. There is no such thing as "hostile fork".
Of course there is.
however, it does provide some defence-in-depth against Identity Server inspecting the email & phone number details in plaintext, so we'll go sort it out as per https://github.com/matrix-org/matrix-doc/issues/2130
* You could fully encrypt push notifications?
We encourage anyone who already read the initial version to check out the revisions of it for new content or re-visit the document.