How Tutanota replaced Google’s FCM with their own notification system (f-droid.org)
187 points by grammers on Sept 9, 2018 | 66 comments



Is it possible for every app developer to stop using FCM? Yes, in China.

The result is a chaotic market. App devs have to use multiple push mechanisms for different devices: on Huawei phones, Huawei's own push service is the most reliable; on other phones, you might want to use SDKs from Tencent/Baidu/Alibaba so their apps (WeChat/Baidu search/Taobao/Alipay) can "wake up" your app to receive pushes. Battery life becomes miserable and push becomes unreliable. It has reached the point where the government has started "regulating" the market (the Unified Push Alliance, http://www.chinaupa.com, is led by CAICT, a think tank of the Chinese government).


But do we really need "push providers"? The problem is (IMHO) that app developers don't know how to use the connection efficiently, causing the radio to wake up constantly (an idle socket connection has very low battery impact). Instead of pushing every darn detail, queue the less important ones and send them out when something more important arrives; your client app can signal the server that the app is in the background (or the phone is going to sleep) so that it sends less frequent updates. But companies will abuse everything they can to notify users about absolutely everything. Currently I'm running a phone without Google Play Services and it works just fine (with the exception of Viber, which can't deliver messages correctly without push services, but it's not that important for me).
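
A minimal sketch of the "queue the less important ones" idea, assuming a server-side sender that only writes to the socket when something urgent shows up. The class and field names (`BatchingSender`, `urgent`) are made up for illustration and do not come from any real push library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Notification:
    text: str
    urgent: bool = False


@dataclass
class BatchingSender:
    """Holds low-priority notifications until an urgent one flushes the queue."""
    pending: List[Notification] = field(default_factory=list)
    sent_batches: List[List[Notification]] = field(default_factory=list)

    def submit(self, note: Notification) -> None:
        self.pending.append(note)
        if note.urgent:
            self.flush()

    def flush(self) -> None:
        if self.pending:
            # Stand-in for a single network write: one radio wakeup
            # on the device carries the whole batch.
            self.sent_batches.append(self.pending)
            self.pending = []


sender = BatchingSender()
sender.submit(Notification("someone liked your post"))
sender.submit(Notification("weekly digest ready"))
sender.submit(Notification("new direct message", urgent=True))
```

Here the two low-priority items ride along with the direct message, so the device is woken once instead of three times. A real sender would probably also flush on a maximum-age timer so that nothing waits forever.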


If you don't have a push provider mediating the many apps trying to get notifications, you end up in a tragedy of the commons scenario, where every single app will try to implement its own push service and will prioritize its own push service, quickly resulting in a race to the bottom where the device is constantly connecting to a different push service, draining battery and exhausting your data connection.

It isn't very different from preemptive/cooperative multitasking.


Why can't the device just choose a single wake schedule and communicate that to each push provider? Hopefully UDP can be used so there's no need for any keepalives or handshakes during the wake periods?

Are clocks too unreliable or is there too much NAT because everybody's too lazy to implement IPv6 still?


This actually happens on Android (read about Doze mode), but the issue with your approach is that you don't get realtime delivery, only periodic sync. That is actually less efficient, because the device must wake up at every interval, activate the radio and poll for changes.

FCM (and Apple push) run a single keep-alive TCP connection and incoming packets will cause radio and application process wakeup only when necessary.

This change significantly improved Android battery life (remember that HN Apple lovers really loved to bash Android battery life before this FCM push happened). The fact of the matter is that a lot of developers didn't care one bit about battery life and abused Android background services when there was no need to do so (e.g. scheduling periodic wakeups to check for a preset alarm needed once every 24 hours), and if we learned anything from this experiment, it is that you simply can't trust developers to respect users anymore.


Dev here : you are 100% right that battery life is not optimized for in most cases.

For one it is hard to measure, and many devs don't really know/care about it.

FCM is a boon in this regard. So are the restrictions that Android is adding to background processing. They are technically not that necessary ... except that you can't trust any app not to abuse it, so the new model is so much better.

Also, thanks Google for finally forcing apps to target the last version or so of this OS so they have to adopt all these new optimizations, whether they want to or not.


Does the one TCP connection not require the radio to wake up to receive pushes?


It does, but not really periodically - the radio is just listening and not transmitting until a packet comes. Then it wakes up the application processor to handle it.


I guess the question is, if a single listening TCP connection is not a problem because everything can sleep until a packet arrives, what's the problem with multiple apps each opening such a connection?


Increased chatter, more keepalives, problems with NATs (see the other post in this thread) which don't whitelist non-Google/Apple endpoints, etc.

But mainly, it makes little business sense for Google to support that. It can only make Android look worse for little gain to them.


I think you're describing a pull workflow... which more apps ought to be using, but does fill a different role than pushes.


Other posts answered most of your question, but there is still one point regarding IPv6. IPv6 doesn't solve any of this problem, because the problem isn't actually the connection, but the synchronization between all the apps using push services, the wake state of the device and the radio utilization.

IPv6 would solve a single push service, but with every app deploying its own push service, you get back to the tragedy of the commons problems.

As has been seen again and again on Android, you absolutely cannot trust app developers to be good citizens and stop running services in the background that quickly wipe out your battery (not to mention even more shady stuff), and there is no way you'll be able to trust developers to run push services that don't do the same.

If you need proof, just look at China, it is the wild west of push services.


Would it not be possible for an OS to implement a push receiver service with an open port? With IPv6, after registering through an app with a server, the server can just send a UDP packet to the open port? Maybe you could do an "hmac firewall" like in OpenVPN to avoid processing nonsense packets?
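
A rough sketch of what such an "hmac firewall" check could look like, assuming the device and the server agreed on a shared key at registration time. The framing (a 32-byte tag followed by the payload) and the function names are assumptions for illustration, not OpenVPN's actual wire format.

```python
import hashlib
import hmac
import os
from typing import Optional

TAG_LEN = 32  # SHA-256 digest size in bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Server side: prepend an HMAC-SHA256 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def open_datagram(key: bytes, datagram: bytes) -> Optional[bytes]:
    """Device side: return the payload, or None if the tag does not verify."""
    if len(datagram) < TAG_LEN:
        return None
    tag, payload = datagram[:TAG_LEN], datagram[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison, so junk is dropped without leaking timing.
    if not hmac.compare_digest(tag, expected):
        return None
    return payload


key = os.urandom(32)  # hypothetical key shared at registration
packet = seal(key, b"wake: new mail")
```

Only packets carrying a valid tag get any further processing; everything else is discarded after one hash. This does nothing against replayed packets, so a real design would also need a counter or timestamp inside the authenticated payload.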


> With IPv6, after registering through an app with a server, the server can just send a UDP packet to the open port?

That's your problem, if the radio is off and you're using UDP, the message will simply be lost. You can leave the radio on permanently, but you'll kill the battery.


OK, dumb question - do all applications need push communication? Apart from IMs (and maybe e-mail) I would say that everything else should rather do pull than push.

And Google is doing virtually that, by enforcing push via their services and controlling Doze, so some apps receive notifications with a (sometimes significant) delay.

What's more, I would argue that in quite a lot of cases NOT having immediate push notifications could work wonders on our attention span, but this is a slightly less technical problem :-)


As soon as you are maintaining multiple connections, battery life gets worse.


We need the core operating system to provide a push provider that we trust. Do we trust Google to act as a push provider?


This is really the chance for a great open source project. So many open source apps (Signal, Mastodon clients, riot.im, telegram-foss and more) have to do hacks to be able to deliver push notifications without GCM/FCM.

What if there was open source software one could set up on a server that would provide push services for all these apps and would interact with one open source client running on Android?

That way one would have the energy saving benefits of only handling one server connection, but the privacy benefits of a private server for oneself/friends/people one trusts.

This is definitely not easy and would require coordination between many open source projects and also additions in Android to run on a system level (maybe in lineageos and similar), but I really think it would be worth it.


One of the keys to a low-power push system is the ability to manage the radio. It is obviously much easier if you have a single connection that you manage, and know when it is possible to drop it or when to send the heartbeat.

Once multiple processes start doing it without cooperating, your battery is going to be shot.

Apps also should never send anything private over a push notification. It should be only a ping to the app, telling it to check its event source.


I wish there were an open standard for push notifications with some sort of mux/proxy support. Let me run a daemon (or use a service provider) that collects all my notifications in a data-center, then send a datagram, sms, or even pocsag message to let me know notifications are available for pickup.


There is an open standard - it's called Web Push (https://www.w3.org/TR/push-api/). It's already used by Firefox and Chrome to implement push notifications for web pages (via service workers). The notifications are proxied via Mozilla/Google server respectively, but there is no reason you could not run your own server.


It's not an open standard, but this was one of the interesting pieces of RIM's software stack. Their push system worked reasonably well, used little power on the handset, and organizations were able to run their own on-site push proxy supporting their own in-house applications.


BlackBerry cracked push like this years ago, though it wasn't open. The phones used to connect over the carrier's network to a private APN (over VPN or a leased line to the carrier), so they always knew your BlackBerry's IP address. If you were out of range they'd store everything up, and when your device came back into coverage they'd send it all down. No hanging connections; the device was directly exposed to the BlackBerry network.


AFAIK, Signal doesn't work without GCM.


It does, but again via its own implementation of WebSockets (I think), which creates an additional connection that drains power.


> Wouldn’t it be great if the user could just pick a “push notifications provider” in the phone settings and OS managed all these hard details by itself? So every app, which doesn’t want to be policed by the platform owner, didn’t have to invent the system anew? It could be end-to-end encrypted between the app and the app server. There’s no real technical difficulty in that, but as long as our systems are controlled by big players who do not allow this, we have to solve it by ourselves.

Fking this. Ideally, this would also involve a standard API on the backend for how to send push notifications. E.g., something like:

1) App on phone queries OS for selected push provider.

2) Phone OS returns some metadata about the push provider to the app, including a backend URL for sending messages.

3) App sends that URL to its own backend server.

4) When a push message should be sent, the app's backend server invokes that URL with the message to be sent in some standardized way.

If - which is likely - push services require that developers register their apps with them before use, this could be expanded by the phone OS returning a list of providers instead of just one.
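
The four steps above could be sketched roughly like this. Everything here is hypothetical: no such OS API exists today, the names (`PhoneOS`, `get_push_provider`, `AppBackend`) and the example URL are invented for illustration, and the actual HTTP call in step 4 is replaced by an in-memory outbox so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PushProvider:
    name: str
    send_url: str  # backend URL the app server sends messages to


class PhoneOS:
    """Stand-in for the OS setting where the user picks a provider."""

    def __init__(self, selected: PushProvider) -> None:
        self._selected = selected

    def get_push_provider(self) -> PushProvider:
        # Steps 1 and 2: the app asks, the OS returns provider metadata.
        return self._selected


class AppBackend:
    """Stand-in for the app's own server."""

    def __init__(self) -> None:
        self.send_urls: Dict[str, str] = {}
        self.outbox: List[Tuple[str, bytes]] = []

    def register_device(self, user_id: str, send_url: str) -> None:
        # Step 3: the app uploads the provider URL for this user.
        self.send_urls[user_id] = send_url

    def push(self, user_id: str, message: bytes) -> None:
        # Step 4: a real backend would POST the (ideally end-to-end
        # encrypted) message to the URL; here it is only recorded.
        self.outbox.append((self.send_urls[user_id], message))


phone = PhoneOS(PushProvider("example-provider",
                             "https://push.example.org/send/device-123"))
backend = AppBackend()
provider = phone.get_push_provider()
backend.register_device("alice", provider.send_url)
backend.push("alice", b"encrypted-payload")
```

The app backend never needs to know which provider the user picked; it only stores an opaque URL. Returning a list of providers, as suggested above, would mean `get_push_provider` returns several such records and the app picks one it is registered with.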


I would encourage app developers to take a look at the network architecture of the major US carriers while determining how to give their app notifications from the server. In the US, the 4 major LTE cellular carriers operate IPv6 only networks with NATed IPv4 access tunneled in as an unsupported afterthought.

From my testing with IPv6 enabled VOIP for inbound calls, compared to FCM with calls coming from an IPv4 only server, only the former setup would reliably get calls to a test device on the first ring. If you need very low latency push notifications, IPv6 and FCM/Apple Push Notification are needed for best performance. Thus far, inter-carrier roaming generally does not support IPv6, so you still must support legacy IPv4 to a degree.


To this end, Apple has been leaning on developers to support iOS devices that are on IPv6 only networks: https://developer.apple.com/support/ipv6/


People migrating from GCM to FCM need to be aware that in FCM analytics are turned on by default: https://github.com/siacs/Conversations/issues/3041


From your link:

>I’m closing this issue. I think current master successfully disables analytics. Thanks for bringing this to my attention.


"Current master" obviously refers to Conversations, not FCM libs.


>SSE fits our needs better than WebSocket would (it is cheaper and converges faster, because it’s not duplex). We’ve seen multiple chat apps trying using WebSocket for push notifications and it didn’t seem power efficient.

This is just not true unless there is some serious fuckery going on with the websocket implementation. Both require full duplex communications on the transport layer.


I'm wondering how it works with different messaging apps. I'm using mostly Facebook Messenger, because of the network effects. A few months ago I tried to use several apps to communicate with my wife. Both Signal and Wire failed to reliably deliver messages in under 30 minutes or so. More often than not the message would be delivered after a few hours. That's totally unacceptable. If there are some magic settings in Android, they were not properly advertised. Our phones run fairly stock Android 8.1 and 7. So I'm back to Messenger.

Reading this, it seems that there is a known solution to this problem. Does Mastodon offer something like private messages? Or is there a messaging app that doesn't use FCM, but works the way described in the article?


If your phone has Google Play Services running, Signal should use GCM/FCM; they have a websockets mode, but only for devices lacking the former.

I don't use Signal, but I don't think that delay is typical, could there be an issue with the connectivity to their servers?


Nope, this is not normal behavior for Signal, it is a sign that the sending or receiving phone has FCM and data connectivity issues.

I have seen this issue on Motorola Android phones with known dying LTE Modems (eg: these phones couldn't maintain a data session for more than a few minutes, GPS off WiFi is also unreliable).


One option I like is the app "Conversations" (available on F-Droid and Play Store). It's a nice XMPP client with Signal-based (OMEMO) encryption, a great UI (in my opinion anyway), and it's extremely battery friendly in my experience -- my device doesn't have GCM/FCM.

You'll need an XMPP server though, and to use the E2E encryption you'll need a server which supports certain XEPs. The app dev also runs a server at conversations.im that is, of course, compatible. It costs €8/yr though. That said, there are several public servers with OMEMO support, plus self-hosting is an option.


Mastodon has “private” messages but it is primarily about being Distributed Twitter. I put “private” in quotes because they’re not encrypted in transit between servers, and an unethical admin could easily intercept them.


In nearly two years of frequent Signal use so far, I have had one - one - message not get delivered instantly, and that was due to a network breakdown at my end.

There's plenty not to like about Signal, but reliability is second to none.


Final thought: Every user should be able to choose a “Notification Provider” for every app

At Qbix, we thought deeply about this problem. Ideally, to maintain anonymity, each notification would come from some random endpoint on the cloud. But that’s not how the Internet works these days, you have to connect periodically to SOMETHING. So you can choose your own notification providers.

The trouble is if you have many apps, then your device is constantly connected to many servers.

What we settled on for now is a background process with WebSockets, but perhaps this works better. The operating systems weren’t designed for this use case and the phones aren’t optimized for it. For example, how would you do the same on iOS?

However, what IS possible is tunneling through the native iOS VoIP notifications support and encrypting your notifications. You can even process them on the client side with some “IFTTT” type logic. To do that for Android, however, we had to implement this background process approach. But it’s a hack.


Each app having its own notification provider comes at the cost of extra CPU cycles, network calls, and interrupts, worsening battery life.

It's likely in the user's benefit to only have one notification provider.


Right. I think they should have one other mechanism “in the cloud” somewhere and no need to have too many, one for each domain.

I am starting to realize that DNS is the main source of problems on the Internet. We need to replace it with a DHT using something like Kademlia.

It was originally designed to have human-readable domains on the Internet. But it’s become a glorified federated search engine, essentially, whereas we can have lots of different search engines, one per app or location. Today, there is very little benefit to having human-readable URLs. If it’s anything much longer than a hostname, you won’t even say it or read it. So it only works for a tiny subset of internet resources but serves to centralize control in the hands of a few large websites. It leads to centralized databases that attract the NSA and advertisers — and all the stuff from Cambridge Analytica to the Equifax breach are downstream results that can be fixed by apps switching from DNS to a DHT.


Even if we ignore the readability of URLs (which I wouldn't), DNS is a nice service discovery protocol with aligned incentives. To find out which server to contact to get Hacker News, you ask the DNS server for .com, which has an incentive to answer honestly because Hacker News pays them to do so, and then the DNS server for ycombinator.com, which has an incentive to answer honestly because it's operated/paid for by the people running Hacker News. Alternatively you can ask the DNS server of your ISP, which has an incentive to answer honestly because otherwise the ISP risks losing customers (assuming working capitalism, i.e. competition).

In a DHT, anyone with very different incentives can add as many nodes as he likes. If the node ID assignment isn't designed very carefully, any attacker can even dominate whatever part of the DHT he is interested in and deny or modify information, or just log it.

Sure, the DNS system isn't immune against state actors, especially the US. But DHT is vulnerable against everyone willing to spend some money on spinning up hosts.


Why would you care about readability of URLs, when each client can just show the cached metadata? Which is what happens anyway, kind of, in your browser history and anywhere you share the links. Most people won’t read the url.

Now, as for the incentives. In fact, DNS does get poisoned and that’s why DNSSEC was invented. The same exact thing can be done on a DHT with private keys and certificates authenticating the author of a document. Except you don’t have the kind of dynamics that lead to centralized databases and breaches. Instead, we can even have content addressing, so the same exact resource can be cached even if it would have been “on a different domain”. And you know that resource hasn’t been changed from under you. Talk about incentives: there is a huge incentive for hackers to access these giant centralized honeypots and change what’s being returned from a URL, or do phishing based on a slightly similar domain name.

Look up the MaidSAFE project and tell me, how exactly is the Kademlia DHT there vulnerable against everyone willing to spin up hosts? You can still use certificates and domains can still sign everything. The domain system is basically a search engine, which also has its own incentives.

And finally, Capitalism is the 2nd best system for organizing people and information, the first being open source. The Web, Wikipedia, WebKit, MySQL, PHP and Linux have long ago beaten AOL, Britannica, MSSQL, ASP.NET and Windows NT Server.


What's the deal with push notifications if the modem has to be switched on every couple of minutes to keep a connection alive? I was under the impression that push notifications were implemented similar to a call handshake that only activates the phone if there is a call.

Are the FCM service and the iOS equivalent also IP based or can they use some lower level, more energy efficient protocols to wake up the phones?


They are IP based.

There are several problems that make replicating FCM hard, which is one reason why Google tries to push you to using it (wordplay definitely intended).

It all boils down to IPv4. The IPv4 address space is exhausted, so carriers have had to deploy carrier NAT in front of cell towers, often multiple levels of carrier NAT. Well, NAT requires translation tables to be held in RAM and carrier-grade boxes are very expensive in general, so to keep the machines alive they need to aggressively garbage collect dead translations. Otherwise they'd run out of RAM.

This is the origin of the 'keep alive' problem - NATs want to close your connection to free up their own resources, and you want the connection to stay open so you can receive push messages. So phones have to wake up every so often and send keepalive packets or do a connection rebuild.
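
A toy model of that garbage collection, just to make the keepalive problem concrete. The 300-second idle timeout and the 240-second keepalive are invented numbers, and `NatTable` is a simplification for illustration, not how any particular carrier box works.

```python
from typing import Dict


class NatTable:
    """Tracks when each translation last carried a packet."""

    def __init__(self, idle_timeout_s: int) -> None:
        self.idle_timeout_s = idle_timeout_s
        self.last_seen: Dict[str, int] = {}

    def packet(self, conn_id: str, now_s: int) -> None:
        # Any packet on the connection refreshes its translation entry.
        self.last_seen[conn_id] = now_s

    def collect(self, now_s: int) -> None:
        # Drop translations that have been idle for too long.
        dead = [c for c, t in self.last_seen.items()
                if now_s - t >= self.idle_timeout_s]
        for conn_id in dead:
            del self.last_seen[conn_id]

    def is_open(self, conn_id: str) -> bool:
        return conn_id in self.last_seen


nat = NatTable(idle_timeout_s=300)   # hypothetical carrier timeout
nat.packet("phone-a", 0)             # sends keepalives every 240 s
nat.packet("phone-b", 0)             # opens a connection, then stays silent
for now in range(0, 601, 60):
    if now % 240 == 0:
        nat.packet("phone-a", now)
    nat.collect(now)
```

After ten minutes the silent connection has been collected and a server can no longer reach that phone, while the one sending keepalives just under the timeout is still mapped. The closer the keepalive interval sits to the real timeout, the fewer radio wakeups are wasted, which is why knowing each carrier's timeout matters.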

Google and Apple have an interesting solution to this problem: it's a mix of learning and data analytics to figure out what the NAT timeouts used by each carrier are, and cutting deals with carriers directly to adjust the timeouts for their IP ranges specifically. Therefore you cannot compete directly with Google or Apple on energy efficiency. This is something a lot of hackers don't realise. Getting to the level of efficiency FCM has is very hard and takes a lot of work, and basically requires you to be a giant company. This is why it's an OS level service.

They also use very tight protocols, batch things together and of course provide oodles of server side disk space for buffering messages to disk until the devices return, lots of other MQ type things that are hard to do at scale.

IPv6 solves this problem by allowing carriers to lose the NAT boxes. No more machines that need stuff in RAM for every TCP connection, which means connections are no longer a scarce resource that must be culled from time to time. If you only have to maintain connections to devices that support IPv6 you could theoretically maintain very long lived connections if the kernel, radio and server cooperate. Of course there are still timeouts: the TCP connection requires some state server-side, so the servers will kill off connections from time to time because devices may roam across different IP addresses. But it should be a lot more stable and not require cutting deals anymore.


> and cutting deals with carriers directly to adjust the timeouts for their IP ranges specifically

In theory, seems like they could just sidestep the NAT entirely. Co-locate a front-end server inside the carrier's private network.

Then the phone and front-end server talk to each other with private addresses. Once it hits the front-end server, everything can be multiplexed over a single TCP connection back to the Google data center. (You'd need a way to discover the server, but that's doable. For example, the regular server can refer you based on cell carrier, source IP, etc.)

Google already has reasons to want servers co-located, such as serving static web content and making sure new TCP connections warm up fast. Apple too, somewhat. So the hardware might be there already and this might be just a configuration change, possibly including the addition of a private address.

Point being, the economies of scale here might be even greater. They might have a nearly free way to do it, and without the need to convince the carrier to give up precious NAT resources.


Thank you for the clear explanation.

Haven't most big land and mobile carriers deployed IPv6 on their core network by now for other benefits? If so, one could create an alternative cloud notification service based on IPv6.


I doubt the accuracy of what you are saying. Treating certain IP ranges differently would not be net neutral, which is exactly why we have net neutrality: otherwise it's impossible for new entrants in the market to compete. Also, most ISPs don't run CGNAT (except historically notoriously many in Italy, and by now also a few in the USA). And "oodles" sounds like a lot but running a push server doesn't sound like it would require that much disk space. At Google's level, sure, but for your average app... it can easily be run without Google.

The real problem (afaik) is that people change address whenever they change network (so it doesn't all boil down to v4, either), so you need regular keepalives. If every app starts doing keepalives individually, your phone will be using its radio every 30 seconds 24/7 (assuming you have four apps that need push, and those devs want a maximum delay of 2 minutes, and their timers are roughly evenly distributed). By having a single server to keepalive, the issue is avoided.
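
The arithmetic behind "every 30 seconds" can be checked with a few lines. This assumes, as in the comment, four apps that each send a keepalive every 2 minutes with evenly staggered timers; the function name and the one-hour horizon are arbitrary choices for the sketch.

```python
from typing import List


def wakeup_times(num_apps: int, interval_s: int, horizon_s: int) -> List[int]:
    """Seconds at which the radio wakes if each app keeps its own timer.

    App timers are assumed to be evenly staggered across one interval.
    """
    times = set()
    for app in range(num_apps):
        offset = app * interval_s // num_apps
        times.update(range(offset, horizon_s, interval_s))
    return sorted(times)


separate = wakeup_times(num_apps=4, interval_s=120, horizon_s=3600)
shared = wakeup_times(num_apps=1, interval_s=120, horizon_s=3600)
gaps = {b - a for a, b in zip(separate, separate[1:])}
```

With four independent timers the radio wakes 120 times an hour, always 30 seconds apart; with one shared connection it wakes 30 times an hour. If the timers happened to line up instead of being staggered, the separate case would look better, but nothing coordinates independent apps to make that happen.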


> Also, most ISPs don't run CGNAT (except historically notoriously many in Italy, and by now also a few in the USA).

My understanding is that essentially every mobile network in the world uses CGNAT. I happen to have AT&T, T-Mobile, and Verizon SIMs handy at the moment, and all 3 of them are behind NAT.


It's interesting to look at how Exchange ActiveSync handles this. They do a hanging request to the server, and the client keeps track of when the connection closes, and negotiates with the server to push the keepalive ping time up and up, until the device hits the carrier TCP timeout. The client then knows how long it can stay asleep before having to wake up to "PING" to keep the connection open.
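
A simplified sketch of that "push the ping time up until it breaks" idea. This is not the actual ActiveSync negotiation; `survives` is a hypothetical probe that reports whether the connection was still alive after staying idle for the given number of seconds, and the 900-second carrier timeout is invented for the example.

```python
from typing import Callable


def find_heartbeat(survives: Callable[[int], bool],
                   start_s: int = 60,
                   step_s: int = 60,
                   max_s: int = 3600) -> int:
    """Grow the heartbeat interval until the next step would get cut off.

    Assumes the connection is known to survive `start_s` seconds of idling.
    """
    interval = start_s
    while interval + step_s <= max_s and survives(interval + step_s):
        interval += step_s
    return interval


# Hypothetical carrier NAT that drops connections idle for 900 s or more.
def carrier_allows(idle_s: int) -> bool:
    return idle_s < 900


heartbeat = find_heartbeat(carrier_allows)
```

In this run the client settles on a 14-minute heartbeat (840 seconds), the longest interval in 60-second steps that stays under the assumed timeout. A real client would have to rerun the search after changing networks, since each carrier and Wi-Fi router has its own timeout.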


Great answer, thanks for taking the time


> Wouldn’t it be great if the user could just pick a “push notifications provider” in the phone settings and OS managed all these hard details by itself?

Having a choice is a good thing, but in this particular case I don't think it would be good if third-party notification providers were available. This should be an integral part of the OS, not a field for competition. Knowing that Google itself can harvest data from notifications, we can't exclude the possibility that companies would appear that are interested only in that activity, while enticing users with pretty, simple notification design or whatever else.


Well, any Android app can harvest data from notifications right now, by setting up a notification listener. I don't see how this would be any different.


The difference is that it's server side. I'm not aware whether the third party server logs my messages, but I am aware when an app has local permissions (you can see it in the permissions overview) to read my notifications.


Of course this "harvesting" needs to be explicitly approved by the user.


You would need a standard API, like the intent API.


Another difficulty was caused by the Doze mode, introduced in Android M. The Doze, which is turned on after a period of inactivity, among other things prevents background processes to access the network. As you can imagine, this prevents our app from receiving notifications.

We mitigate this problem by asking users to make an exemption from battery optimisations for our app. It worked fairly well.

This is why we were migrating to FCM from our own custom built MQTT solution that was invented back in 2012.


MQTT still requires a network connection to publish to (or receive subscription events from) an MQTT server, so you still have the exact same problem with Doze preventing that. What are you doing to get around that problem, if not exempting your app from battery 'optimisations'?

Edit: Would anyone care to explain how I am wrong in addition to downvoting me? If Doze blocks network connections to your app when it is backgrounded, how can it receive MQTT events?


With every release Android makes it harder to have a service running in the background. With the introduction of Doze there is no alternative to using FCM. FCM has a privileged service that is able to wake up the device from Doze. A custom push service will never be able to do this.

There might be hacks, but I'd rather take Google's approach than work around it on each and every release.


All of the 4 major LTE cellular carriers operate IPv6 only networks with NATed IPv4 access tunneled in as an unsupported afterthought.

Having done quite a bit of testing with IPv6 enabled VOIP for inbound calls as compared to FCM with calls coming from an IPv4 only server, only the former setup would reliably get calls to my device on the first ring.


GP knows that. They said they're migrating away from MQTT, to the Google solution.


Ah, somehow I missed that :(


They are saying that they had to use FCM because MQTT was killed by Doze (i.e. the same thing you are saying).


This is a very exciting development, it's great to see more developers take this centralisation issue seriously.

On this subject, many people including myself have been failing (hard) to convince Moxie that dropping FCM (or GCM or whatever the name) is the way to go for Signal, if anyone wants to lend a hand...

https://www.reddit.com/r/signal/comments/9ekawn/moxie_drop_t...

https://github.com/signalapp/Signal-Android/issues/7638


Wow, reading this it seems like Android push is a total mess (outside of FCM). Things are much simpler on the iOS side...


As a user and dev of both iOS and Android, I can tell you push notifications are miles ahead on Android compared to iOS.

The article is about not wanting to use Google's solution for non-technical reasons.



