It's easier to understand this if you look at the onion protocol. Broadly, it introduces noise between all users on the network by constantly sending and receiving random bytes to and from each node. This prevents external listeners from figuring out where the main server is. Originally designed to protect naval command ships, it was later used for the "dark web". If you don't know which node is the server, you can't shut it down, or read data off of it.
SimpleX does something similar. A connects to B. B connects to C. And A and C connect. They all chat, but there is no way to identify A, B, or C, because from the outside, it all looks like: X connects to Y, X connects to Y, X connects to Y. So who spoke to whom?
This is great. Even if "the authorities" demand access to chat logs, first, they won't know what to ask for. Chats between whom? Second, they still won't know who spoke to whom even if they have all the data. It's anonymized chats. They would have to sift through all of it.
It still won't prevent someone invading privacy if they have physical access to your device, since the identities are stored locally for your usage convenience.
The solution to this is of course to simply outlaw the use of communication systems that cannot be monitored by law enforcement. India has it, the EU is working on it, and I'm sure the US will do something like that as well.
These super-anonymous communication technologies are touted time and again to solve the problem of a surveillance state, while they do nothing of the sort. You cannot solve a social problem with technology.
The point being made was that it doesn’t matter how secure and unbeatable something is if a sovereign state wishes to simply criminalize its use. It can then utilize its full power to enact violence upon any S̵u̵b̵j̵e̵c̵t̵ citizen who is caught using it.
> It still won't prevent someone invading privacy if they have physical access to your device, since the identities are stored locally for your usage convenience.
Does it mean that if a lot of SimpleX users band together and sift through all their local identities, they can connect the dots?
They really bury the detail IMO after the banner claim front and centre on the website (I guess because it's hard/awkward to explain without it sounding just like a difference in nomenclature).
What makes it work afaict is the combination of:
- there are still queue (inbox) IDs
- key (and (just initial?) queue ID) exchange out of band
So messages are still delivered to an identifier, it's just that every user has tonnes of identifiers (per contact/group), there's no server tracking and handling their exchange, and possibly they rotate via encrypted messages once established anyway.
Exchanging out of band gets you the secrecy, and having one per-chat protects you from a contact turning out bad/leaking/compromised - it's fine that they have metadata about their own chat with you, because they have that & the plaintext anyway.
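To make the "tonnes of identifiers" point concrete, here is a minimal sketch of the idea as I understand it from the description above. It is not SimpleX's actual code or wire format: the `Client` class, `new_invitation`, and the 16-byte random IDs are all made up for illustration.

```python
import secrets

class Client:
    """Hypothetical client that keeps one random queue ID per contact."""

    def __init__(self):
        # Local label -> queue ID. This mapping lives only on the device.
        self.queues = {}

    def new_invitation(self, contact_label):
        # A fresh random identifier per contact, to be handed over out of band.
        queue_id = secrets.token_hex(16)
        self.queues[contact_label] = queue_id
        return queue_id

alice = Client()
id_for_bob = alice.new_invitation("Bob")
id_for_carol = alice.new_invitation("Carol")

# What a server would see: two unrelated random identifiers,
# with nothing marking them as belonging to the same person.
server_view = {id_for_bob, id_for_carol}
print(server_view)
```

The labels "Bob" and "Carol" never leave the device in this sketch, which is also why physical access to the device remains the weak point.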
That isn't what GP is saying. In order to effectively use the chat, you are going to tag that connection identifier with a name like Joe on your endpoint device, so that when a message comes in you remember this was the conversation with Joe. The server may not know that this is you and Joe, but you do.
Yeah, that was my take. The issue with end to end encryption is the ends. The ends are the weak link. I think SimpleX may be the best option in this regard, though. The user has to know who they are speaking with, but here the user can choose not to tag the connection, and therefore take on the mental load of remembering which connection identifier refers to which person.
Having not used this chat I don't know how easy things might be, but I do remember, before mobile phones were a thing, being able to remember at least 8 phone numbers that I used to call regularly. Certainly if it called for it you could do this with SimpleX?
Let's say the whole network is just 4 nodes. A, B, C, D. And A and B are connected. And C and D are connected. And I have physical access to all 4 devices.
In phone A, you will have B's contact stored locally, let's say as "Dan".
It's a little hard to grok the details, but it sounds like it's a collection of "dead drops". Like, if done in a physical way, imagine locations around a city where you can drop off scraps of paper with cipher-encoded messages on them. Encoded not just with sender/recipient keys, but a dead drop location specific key too. And an agreement that replies to a received message get dropped off at a different, randomly selected one.
Everyone can see and read the cipher text on all the papers, but each of the 4 people can only decode the things meant for them.
So, if that's how it works, you could certainly learn who was talking to who if you had access to all the devices. But access to one device only shows you what came and went to the device, but no data about which of the other three users were involved in those reads/writes. You would have to gain access to each device, in turn, to prove whether it was in contact with the first device.
I specifically said a lot of users. A lot is the opposite of one. Imagine a lot (>40% of total users) of impostor devices acting in accord to deanonymize some of the X and Y. Is it vulnerable to that, like Tor apparently is?
You would at most be able to deanonymize a certain percentage within the impostor network itself. Kind of pointless.
Physical access to devices is the only way to have some chance of deanonymizing some of the users (always fewer than the number of devices you have access to).
That’s my understanding, but the maker of this thing is here, and maybe can respond better?
I'm overall intrigued by SimpleX and I'd be excited to see it undergo a proper audit. Even if the "no IDs" thing is more of a gimmick than a useful feature, a competitor to Signal would be great given the direction it appears to be heading.
I'm a little put off by parts of their advertising though. Their homepage states Signal can be MITMed given the "operator's servers are compromised", which I don't doubt is true to the extent of the actor stealing, maybe, phone numbers and metadata? But my understanding of Signal's protocol is that a compromised server couldn't intercept _messages_, which is what I feel they're implying.
According to GitHub, their (mono?)repo[1] is split 33% between Haskell, Kotlin, and Swift, which is nice.
The direction people refer to is the company making a few decisions that deviate from being 100% idealistic. Forget the fact that the client is open source, the parent company being a non-profit, and the app being ad-free. Instead, focus on delayed server source code releases, a weird US crypto currency thing, and the company not supporting private/standalone Signal networks. Oh, also, super concentrate on the client using phone numbers as if it's a fatal flaw rather than a design choice. Instead, keep using WhatsApp or Telegram which have way more problems (like being owned by Meta or having completely closed source server code).
Companies generally trying to do the right thing (e.g. Signal and Mozilla) are held to impossible standards. They are by no means perfect but they're leagues better than the alternatives.
I generally believe that a for-profit, venture-funded company, with some IP held in a non-profit, has a much better chance of delivering a privacy-preserving service.
Initially, in 2020, I thought that SimpleX Chat should be non-profit, but then after a long chat in April 2020 with Joseph Jacks, who has been evangelising VC investment in decentralized open-source tech for a long time, he convinced me both 1) that a dual model is better for the users and for the scale of change that can be achieved, and 2) to make a dive into it - the idea at the time seemed too crazy to do something about for real.
So here we are, with a for-profit company building a privacy-preserving communication network, that will have more than one provider by design.
I really, really don't understand the obsession with the server source code. The client verifiably performs E2E encryption so who cares what the server does. Moreover, there is basically no way for external parties to verify what code is running on the servers. In the era of practically every free service selling out their users to the lowest bidder, it literally blows my mind the amount of hate Signal gets because of an old complaint about their server source (repo looks quite lively now).
The tech community is seriously guilty of letting perfect be the enemy of good.
For the record, I'm not a huge fan of the crypto wallet thing at all but I just don't use it.
> For the record, I'm not a huge fan of the crypto wallet thing at all but I just don't use it.
Why not, if it solves payments between Signal users (I believe the Signal-MobileCoin integration, disregarding Moxie's involvement in both for a second, is a direct response to the now-defunct Libra/Diem)? Personally, I see MobileCoin / wallets as a genuine path to monetization for the Signal Foundation. Though, it remains to be seen whether it will be successful.
>I really, really don't understand the obsession with the server source code.
You just used "closed server source code" as an argument against telegram? I also agree that it doesn't matter at all. But then why trashtalk telegram?
Normally, a user identity in an end to end encrypted messaging system ultimately takes the form of a ridiculously long number. From the SimpleX white paper:
>SMP is initialized with an in-person or out-of-band introduction message, where Alice provides Bob with details of a server (including IP, port, and hash of the long-lived offline certificate), a queue ID, and Alice's public key for her receiving queue.
So the identity here is the ridiculously long number formed from "Alice's public key for her receiving queue" combined with all that other stuff, isn't it? In other words, a user ID?
Would it perhaps be more accurate to describe this as a system with separate user IDs for each contact?
> So the identity here is the ridiculously long number formed from "Alice's public key for her receiving queue" combined with all that other stuff, isn't it? In other words, a user ID?
If Alice's public key is an ephemeral key used only for a single connection -- as with e.g. HTTPS -- then it can't meaningfully be called a "user ID". Then it really is just a random number negotiated as part of making the connection and thrown away when the connection is terminated.
I like this! As a feature request I'd like to be able to select 'one-time pad' as the encryption method. If I'm already meeting someone in person I no longer have all the constraints that fancier cryptography tries to overcome. I'll happily reserve space for thousands of messages for each of my contacts. I'd also like to be able to password protect that specific storage.
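For anyone unfamiliar with what a one-time pad involves, this is a minimal sketch of the textbook scheme, not anything SimpleX offers. The `otp_xor` helper and the 64-byte pad size are hypothetical. It assumes the pad was exchanged in person, is truly random, and that each pad byte is used exactly once.

```python
import secrets

def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with pad bytes; the same call encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad is shorter than the message")
    return bytes(d ^ p for d, p in zip(data, pad))

# Pad material generated once and shared in person (hypothetical size).
pad = secrets.token_bytes(64)

message = b"meet at noon"
chunk = pad[:len(message)]            # these pad bytes must never be reused
ciphertext = otp_xor(message, chunk)
recovered = otp_xor(ciphertext, chunk)
print(recovered)
```

The catch is bookkeeping: both sides have to track how much of the pad has been consumed, and the reserved pad storage is the thing that would need the password protection.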
do you mean archive your contacts log? (their conversations with others)
It could be a per-user option with new layers of encryption applied to the log at some slow interval. You would get something like a soft delete that would take n hours to decrypt even if you had the keys. It could gradually make stale conversations harder to decrypt, with the oldest messages the hardest.
You could have part of your data stored with each contact in decryptable form and the rest stored with them too, but in an increasingly unreasonable-to-decrypt form. If enough contacts preserve your data you can easily decode your logs; with fewer it takes increasingly more CPU cycles.
The Android app has an option for periodic notifications, where it only checks for messages every 10 minutes. It positions this as battery saving, but in our information saturated, always on world, I think this is a great feature for user experience. It reminds me of old school email, where you could set the polling time. I like the idea that I can set the rhythm of my devices again.
I have never had my iPhone do anything other than get email when I load my email client. Why would I want that? The only notifications I have are SMS/WhatsApp, phone calls, and a 2FA app, and even then it’s only messages from my family that get pushed.
I used SimpleX Chat (and hosted my own server) for a while last year, when it was going through some growing pains such as image transfers and group chats. Overall, it left a good impression.
I liked the concept, IMO the accountlessness makes it much easier to invite people via link and start talking immediately (compared to any other popular messenger where some sort of account is needed). The apps’ UI was clean and simple.
The main issue was that one user’s Android app would not get timely notifications no matter what we tried. My best guess is that it was an over-eager OS killing background jobs according to a whitelist (Conversations worked fine, for example). The iOS app worked really well though (I know it has to use the push notification crutch). Another issue that bit me was that in order to preserve connections between users, the server has to be configured to create a form of backup, and has to be shut down gently, not with SIGTERM. So a power outage (or carelessness) could sever existing connections.
The biggest problem with this kind of product is not implementation but trust.
The home page says that everything is private etc etc. WhatsApp also made similar claims before they were acquired, then suddenly these things didn't matter.
If this gets critical mass, what's stopping them from selling out? Bonus if they "accidentally" started tracking users between update v4.2 and v5.2. Also, the TOS magically changed overnight.
The sad part is that if they don't sell out, they'll be buried under lawsuits and eventually banned. Just like KATorrents, yt-dl, Internet Archive, Dread Pirate Roberts etc.
Small group messenger. Compared with session/signal/status you miss out on large and public groups.
The transmit to every node with fake data all the time approach quickly becomes unsustainable for large groups of >100 people so even if you had a public user ID it wouldn't be seen by that many people anyway. But if they solve this feature disparity without IDs or bloat it will be very interesting.
Yes, we know that (I'm the founder). The large groups support is coming, currently the largest group I know of approaches 600 people and is not too usable. We've just released an experimental directory service for finding user groups.
Why would you want to be a part of a >100 user chat in the first place? I don't know anyone among my colleagues or friends who takes an active part in such groups.
The largest group I have on WhatsApp has a few dozen people, and after some invisible threshold is passed, activity in the group drops, as people are hesitant to chat with so many unknown people.
Having read the overviews of SimpleX's message queue and chat protocols, I feel that the project would be accurately characterised as well meaning, yet also needlessly reinventing existing technology in some places. I would like to offer my initial impressions and feedback here.
The base message queue protocol chooses sensible cryptography, but leaves some important security aspects under-specified. This can lead to mistakes when implementers do not fully understand the security implications. In the context of an open ecosystem, being under-specified here also opens up more risk of implementers causing vendor lock-in (deliberately or otherwise) by breaking interoperability.
For instance, "servers have long-lived, self-signed, offline certificates whose hash is pre-shared with clients over secure channels". Exactly how that pre-sharing is performed is left unprescribed, although SimpleX proposes that clients could introduce other clients to new servers' addresses and public keys. Key distribution is always a challenging problem in cryptography, but traditional x509 PKI is not generally considered to be an issue, so I do not understand what additional benefit to privacy would be found when distributing the keys using the SimpleX protocol itself.
The overview goes on to explain how all client-server communication takes place using blocks of data fixed at 16KiB to make it more difficult for passive observers like ISPs to collude and work out which two parties are communicating. This protection is just taking advantage of statistics - the larger a block is, the more possible messages it could contain, and so the sender is more ambiguous. The downside, though, is that it's very wasteful: assuming that you need to refresh at least once per second for a text chat to feel responsive, that would entail sending a minimum of just under 1MiB of data each way every minute, which is approaching the bandwidth required for a two-way VoIP call! Such an extreme level of resistance against active attempts at monitoring is not usually required by most people, so it would seem more sensible to me to merely ensure that SimpleX is compatible with a lower-level private protocol like Tor for when that protection is necessary.
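The "just under 1MiB per minute" figure can be checked with a few lines. This is only the arithmetic from the paragraph above, under its own assumption of one 16KiB block per second in each direction. The once-per-second polling rate is the commenter's assumption, not something taken from the protocol spec.

```python
BLOCK_BYTES = 16 * 1024      # fixed block size quoted above (16 KiB)
BLOCKS_PER_MINUTE = 60       # assumption: one block per second

bytes_per_minute = BLOCK_BYTES * BLOCKS_PER_MINUTE
mib_per_minute = bytes_per_minute / (1024 * 1024)
kbit_per_second = BLOCK_BYTES * 8 / 1000

print(bytes_per_minute)   # 983040
print(mib_per_minute)     # 0.9375
print(kbit_per_second)    # 131.072
```

So the estimate holds if a block really is sent every second: 0.9375 MiB per minute, or about 131 kbit/s sustained in each direction.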
Moving on to the overview of the chat part of the protocol, I see a fairly uninteresting JSON-based format. There is nothing immediately wrong about it, but it's hardly state-of-the-art either: there's no CBOR to reduce overhead, no JSON-LD to improve extensibility, no MIME types to account for different types of attachment.
From a long-term community perspective, the seemingly arbitrary choice of sub-protocols within the chat protocol (currently group chats, file sharing, contacts and WebRTC calls) makes me hope that SimpleX have a plan in place to avoid what has happened to Matrix. Matrix has numerous built-in features (some of which are in a rather half-baked state), often with a tight coupling between the protocol and the expected user interface for the features, making Matrix notoriously difficult to implement.
Ultimately, I fully support what SimpleX is trying to do: remove the dependence on long-term, globally-unique user IDs in internet communication, which at the moment hampers even federated applications like those that use ActivityPub or Matrix. However, I get the sense from reading the introductory documents that those behind SimpleX aren't aware of (or worse, don't care about) existing standards upon which they could build, and so the novel features are obscured by a lot of rather pedestrian protocol definition that has to be implemented. I fear that SimpleX won't be able to achieve mainstream success unless it fits better into an existing ecosystem of protocols, and whilst I wish them good luck, I must admit that I'd be putting my money on W3C's Decentralized Identifiers (DIDs) to liberate us from centrally-managed online identities.
Do you have examples of protocols that have a JSON-LD extensibility mechanism that is embraced by the developer ecosystem to create interoperable extensions, i.e. where the mechanism works more than in theory? I know ActivityPub, where JSON-LD is actively shunned because devs hate it, and the Solid project, where the specs around using it appropriately get ever more complex. I've heard that DID/VC has another extension mechanism fleshed out, but I don't know about uptake.
> but leaves some important security aspects under-specified
Not sure what you mean by underspecified - it is specified to the level of wire encodings. Possibly you looked at the wrong doc?
> There is nothing immediately wrong about it, but it's hardly state-of-the-art either: there's no CBOR to reduce overhead, no JSON-LD to improve extensibility, no MIME types to account for different types of attachment.
We considered all that, and it seems that they all offer poor value given their lower ubiquity. Also, given that messages are padded to a fixed 16kb size, there is no value in reducing JSON overhead, and files are sent as binary anyway. Being boring where it doesn't matter is good.
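To illustrate why trimming JSON overhead buys nothing once everything is padded, here is a minimal sketch of fixed-size padding. The layout (2-byte big-endian length prefix, `#` filler) and the `pad`/`unpad` names are assumptions for illustration, not a statement of the actual SimpleX encoding.

```python
BLOCK_SIZE = 16 * 1024  # fixed block size discussed in this thread

def pad(message: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Prefix the message with its length, then fill up to block_size."""
    if len(message) > block_size - 2:
        raise ValueError("message too large for one block")
    filler = b"#" * (block_size - 2 - len(message))
    return len(message).to_bytes(2, "big") + message + filler

def unpad(block: bytes) -> bytes:
    """Read the length prefix and return only the original message."""
    length = int.from_bytes(block[:2], "big")
    return block[2:2 + length]

short = pad(b'{"msg":"hi"}')
longer = pad(b'{"msg":"' + b"x" * 1000 + b'"}')
print(len(short), len(longer))  # 16384 16384
print(unpad(short))
```

Both blocks come out at 16384 bytes, so a more compact encoding of the payload would not change what an observer sees. In a real system the padded block would be encrypted before it goes on the wire.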
> avoid what has happened to Matrix
Messaging clients are hard to implement indeed, and forking the UI is usually easier than rebuilding it. We purposefully don't want to encourage the development of alternative clients too early, before the spec has stabilised, to avoid the fragmentation that happened both with XMPP and with Matrix.
> to avoid the fragmentation that happened both with XMPP and with Matrix.
what fragmentation are you thinking of with Matrix? to my knowledge, we have zero fragmentation so far. some clients implement more features than others, but we don’t have any classic “my client sends different reactions to yours” or “my client archives messages differently” or “my encryption is incompatible” style problems. otherwise this smells a bit FUDy…
> we have zero fragmentation so far. some clients implement more features than others
The Matrix spec has many versions and many features. Clients implement and keep up with varying parts of it due to varying reasons usually involving varying amounts of manpower and funding. Same as with XMPP. I don't see the difference.
The difference is that Matrix is curated as a single spec (currently at v1.7: https://matrix.org/blog/2023/05/25/matrix-v-1-7-release/), which ensures that competing implementations for new features don’t fragment incompatibly but only a single one-true-way to talk a given feature exists. Anything else is a transient experiment. Meanwhile, we’ve never yet broken backwards compatibility in the spec, meaning that in theory any client can talk to any other client as long as it has implemented the required features. The inspiration here is HTML5 (albeit with versioned releases, and a clearer spec proposal process).
In other words, I’m defining fragmentation to be incompatible features - not just clients/servers which haven’t yet implemented a given feature (which is inevitable, just like browsers lag behind specced HTML and CSS features)
One way of putting this is that we’ve traded off the risk of fragmentation (but with free-for-all governance) for the risk of more centralised governance by the Matrix.org Foundation, with associated high drama when folks don’t agree with the curation decisions we make in what gets merged into the official spec.
Both are valid approaches with different tradeoffs; I was just trying to flag the confusion upthread accusing Matrix of being fragmented when it really isn’t (to a fault!)
> Matrix has numerous built-in features (some of which are in a rather half-baked state), often with a tight coupling between the protocol and the expected user interface for the features, making Matrix notoriously difficult to implement.
citation needed? If anything Matrix doesn’t have enough coupling between the API and the expected UI for features - making UIs trivial to implement (which is why there are so many unfragmented Matrix clients - in fact, I’m not aware of any fragmentation?), but the lack of UI coupling then makes performance harder.
So ironically on the Matrix side we’ve been busy adding tighter APIs like Sliding Sync (MSC3575) which make more opinionated choices about the UI in exchange for better-than-telegram perf.
It doesn't have user IDs... seems like an odd selling point... if you have good end to end encryption who cares if you have user IDs? I imagine you could do network analysis and so on to try and determine a network of people to deanonymize... but if you have encrypted chats, you have no context... unless guilt by association is enough.
On the other hand, a bigger problem than IDs would probably be IP addresses, etc. Random IDs per-conversation would also go a long way.
That's pretty tame compared to what potentates the world over do with metadata.
If you are considering writing a communications tool for humanitarian reasons, you have to consider that people are thrown into holes and forgotten for looking like they might be in cahoots with someone deemed bad by the powers that be (but not necessarily by the world at large)
If they aim at tech enthusiasts and privacy-conscious people then that won't be a problem. But I doubt this messenger will gain any attention among a wider regular audience, precisely because of the lack of a user ID system. The majority of people need an easy way to get in touch with family, friends and everyone else, and that's done with IDs. Scanning QR codes is easy nowadays, but building a messenger around this just feels like a misstep.
Somehow I'm also amused by this clip where a woman points a QR code at a webcam.
There are three types of people who seek strong anonymity: whistle blowers, criminals, and tech-nerd posers. For this reason any viable, strong anonymous comms solution is doomed to be misused 2/3 of the time.
But in reality there is not an equal distribution between these 3 groups. And there is a high probability that the user base is not as limited as in your pseudo factual simplification. (journalists come to mind for example etc. pp)
Sometimes who someone is talking to is enough to cause them problems. If we all had a unique ID for every person we talk to, there's no way to build a social graph. Even if IDs are pseudonymous, there's no forward secrecy there, once you've identified a participant in one social connection you've identified them in all of their social connections. Simplex solves this. Who everyone knows, even if you don't know the surname of everyone involved, is useful information to an adversary.
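A toy comparison of what an observer can reconstruct in each case. This is a sketch only: the names, the `q1`...`q6` labels, and the `contact_graph` helper are all hypothetical, and it ignores other signals such as IP addresses and timing.

```python
from collections import defaultdict

def contact_graph(observed_pairs):
    """Build an adjacency map from (identifier, identifier) pairs an observer saw."""
    graph = defaultdict(set)
    for a, b in observed_pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Case 1: one stable ID per person. Every conversation links back to "alice".
stable = contact_graph([("alice", "bob"), ("alice", "carol"), ("alice", "dave")])

# Case 2: fresh identifiers per connection. Each edge stands alone.
per_connection = contact_graph([("q1", "q2"), ("q3", "q4"), ("q5", "q6")])

print(len(stable["alice"]))                                   # 3
print(max(len(peers) for peers in per_connection.values()))   # 1
```

With stable IDs, identifying "alice" once exposes all three of her contacts. With per-connection IDs, no identifier has more than one neighbour, so there is no graph to walk.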
One "user ID" I could imagine right now would be the username.
In my experience with my volunteer job, we sort of do this. Querying a user's profile via the API only needs a username. I assume they're going to actually use the user ID in the next major release, but I highly doubt it.
I fear you are not aware of what AI models can pick out these days. Even encryption is not enough unless padded with constant-broadcast. If you know a medium is a text channel, the size, frequency, and rate of communications can accurately determine user emotional state, age, and gender, as well as possible subject matters and the relative position of power of conversation participants (who is boss, who is right hand, who are minions, etc)
How are you going to monetise without collecting a whole bunch of personal data? Crypto? Even if you believe that the future of payments is crypto, it's not like the average person has easy access to crypto funds, which massively limits your ability to monetise.
"Proposed", not "incoming". We will react appropriately IF it is passed and IF it applies to us. The UK as a jurisdiction has many advantages over alternatives.
Could we make a text chat where all (encrypted) message data is public knowledge in a globally shared bitcoin-like ledger?
There are ~3 billion users on the world's biggest messenger app, sending ~100 billion messages per day. Assume a message averages 15 words. State of the art compression achieves ~0.65 bytes per word [1]. That's 1 TB of data per day, for all human conversation.
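The estimate above works out as follows. The inputs are the figures quoted in the paragraph, taken at face value. The sustained bit rate is an extra derived number, and the whole thing ignores encryption overhead, headers, and media attachments.

```python
MESSAGES_PER_DAY = 100 * 10**9   # ~100 billion messages per day (figure quoted above)
WORDS_PER_MESSAGE = 15           # assumed average from the paragraph above
BYTES_PER_WORD = 0.65            # compression figure quoted above
SECONDS_PER_DAY = 86_400

bytes_per_day = MESSAGES_PER_DAY * WORDS_PER_MESSAGE * BYTES_PER_WORD
terabytes_per_day = bytes_per_day / 1e12
mbit_per_second = bytes_per_day * 8 / SECONDS_PER_DAY / 1e6

print(round(terabytes_per_day, 3))   # 0.975
print(round(mbit_per_second, 1))     # 90.3
```

So roughly 1 TB per day checks out, and as a continuous broadcast it would be a stream of about 90 Mbit/s that every phone has to receive and filter.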
Imagine any phone can send out an encrypted message, and it gets added to the global ledger.
The data stream of all ledger entries could be broadcast to all phones worldwide (LTE networks support broadcast), and your phone could just pick out the messages aimed at it.
End result: Even if every device in the network is evil, nobody can ever determine who you are communicating with.
Current e2e systems might hide the contents of your messages, but they still reveal who you are communicating with to intermediate servers. That info can be used to find and arrest your friends if you're a terrorist, protestor, activist, or just a bit gay in a country that doesn't allow it.
No tor-like system can prevent this if an adversary can run a bunch of the nodes.
> Also, the founder appears to be of Russian origin [0]
It could be argued that he has more need of a reliable private chat than others, except maybe someone of Chinese origin :)
> you can’t just take their phone number and check if they have a WhatsApp/Signal/etc account
So why the hell should I have to give away my phone number?
Edit: someone pointed out in a different comment thread that the company is registered in the UK. Considering the upcoming anti encryption legislation over there, maybe that's what should worry you.
Have you been living under a rock for the past couple of years? Some have never learned anything from the original Red Scare, so now we're full into Red Scare 2.0.
Reminds me of that HN comment I saw where some US company started a rewrite of their frontend when they discovered that the 100% FOSS UI components library they had been using was developed by a Chinese company. Cut off your nose to spite your face and all that.
Wait until they learn how much code of Chinese and Russian origin there is in the Linux kernel.
Supply chain attacks are a thing. Given that we are in an active war with Russia and a trade war with China (which is reasonably expectable to turn into an active war as well), it makes sense to drop everything out of your supply chain where the people with commit and release access are in the reach of these nations' secret services.
Hardly anyone walls off their on-prem/on-cloud CI services, the amount of damage that a dedicated and actually skilled actor can do before being stopped is immense - we're lucky that most of the malware in the NPM ecosystem has been credential stealers (which were then used to mine cryptocoins) and cryptocoin miners, so relatively harmless in comparison to what an attacker might do in a war.
Don't forget how Russia killed off a bunch of windmill remote management systems as they executed a hack on a sat-internet provider early in the Ukraine war. No one is safe from being collateral damage.