This is really useful for testing new features before they land in master and are deployed. We have tooling that lets you specify a branch you want to try out, have your client quickly pull built assets from that branch, and even share a link that someone can click to test their client on a certain build. Just the other week we shipped a non-obvious bug, and I was able to bisect to the offending commit using our build-override tooling.
The worker is fully integrated into our CI pipeline, and on a typical day we push a bunch of worker updates to our different release channels. The backing store for all assets and pre-rendered content is a multi-regional GCS bucket that the worker signs, proxies, and caches requests to.
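The shape of that worker is roughly the following (a simplified sketch rather than our production code; the bucket name is a placeholder and request signing is elided):

    addEventListener('fetch', event => {
      event.respondWith(handleRequest(event.request));
    });

    async function handleRequest(request) {
      const url = new URL(request.url);
      // Map the incoming path onto the multi-regional GCS bucket. The real
      // worker also signs requests for private objects before proxying.
      const origin = 'https://storage.googleapis.com/example-assets-bucket' + url.pathname;
      // Cloudflare's edge cache then handles repeat requests.
      return fetch(origin, request);
    }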
We build the worker's JS using webpack, and our CI toolchain snapshots each worker deploy, and allows us to roll back the worker to any point in time with ease.
The value comes from 1) What can trigger that code to run and 2) What services that code can interact with.
And on those two points, AWS still wins hands down. They have by far the most possible triggers for Lambda, and they have by far the most services that Lambda can interact with.
It's cool that Cloudflare built something faster, but unless you're running in a vacuum, speed is the least of your concerns.
If you are referring to SNS, you've always been able to send SNS messages to HTTP endpoints. (https://docs.aws.amazon.com/sns/latest/dg/SendMessageToHttp....)
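For example, with the Node SDK (topic ARN and endpoint URL are placeholders):

    const AWS = require('aws-sdk');
    const sns = new AWS.SNS({ region: 'us-east-1' });

    // Subscribe an HTTPS endpoint to a topic. SNS POSTs a confirmation
    // message to the URL first; deliveries begin once it's confirmed.
    sns.subscribe({
      TopicArn: 'arn:aws:sns:us-east-1:123456789012:example-topic',
      Protocol: 'https',
      Endpoint: 'https://example.com/sns-handler'
    }).promise().then(console.log).catch(console.error);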
That is the cynical way to look at it. It also creates value because it lets you do more with what you already have.
> If you are looking to execute code when something visits a URL you have more options.
Sure, but unless that code works in isolation, it probably needs at the very least access to some sort of data store.
Realistically, you can get as locked into Amazon as you want; Lambda alone does not create inescapable lock-in by any measure. So I would argue Jeremy still has a point in that tools become more useful when you can use them to do more work (ecosystem)...
We are rolling out a CDN, with a goal of 20 ms latency in most countries. We want more granularity than AWS offers - and some zones are just not well served (no AWS in Africa, an incomplete offering in Brazil, etc.)
Still, we figured we would use Route 53, as you can do Latency Based Routing even with non-AWS servers. Computing latency, or using EDNS0 as a proxy for it, is not rocket science, so we thought DNS would not be a limiting factor.
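(For reference, a latency record for a non-AWS server looks something like this with the Node SDK: you tag each record with a SetIdentifier and the AWS region the server is nearest to. Zone ID, names and IP below are made up.)

    const AWS = require('aws-sdk');
    const route53 = new AWS.Route53();

    route53.changeResourceRecordSets({
      HostedZoneId: 'Z123EXAMPLE',
      ChangeBatch: {
        Changes: [{
          Action: 'UPSERT',
          ResourceRecordSet: {
            Name: 'geoiplbr.domain.com',
            Type: 'A',
            SetIdentifier: 'jnb-pop',  // one record per POP
            Region: 'eu-west-1',       // nearest AWS region stands in for latency
            TTL: 60,
            ResourceRecords: [{ Value: '203.0.113.10' }]
          }
        }]
      }
    }).promise().then(console.log).catch(console.error);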
Oh boy, how wrong we were! After wrongly blaming the bad performance on Cloudflare caching, further tests revealed Route 53 takes as much as 0.7 s to reply to some DNS queries - and it is even worse when fronted by Cloudflare, as for some reason Cloudflare seems to ignore the DNS TTL. The latency only drops after about 4 queries, which makes me think they have some kind of round-robin that does not share the DNS results (I could be wrong).
In the article, the author says: "Most of that delay is DNS however (Route53?). Just showing the time spent waiting for a response (ignoring DNS and connection time)". No, you should not ignore the DNS delays! Route53 performance is very bad - 2 full seconds in your case!
We are fortunate it did not take 2 s for us. Still, having servers all over the world that reply in 20 ms is useless when the first DNS query takes 700 ms.
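(If you want to reproduce this, a quick way to see the first-query penalty is to time repeated lookups through a fixed resolver. The hostname below is a placeholder.)

    const { Resolver } = require('dns').promises;

    async function timeLookup(hostname, nameserver) {
      const resolver = new Resolver();
      resolver.setServers([nameserver]);
      const start = process.hrtime.bigint();
      await resolver.resolve4(hostname);
      const ms = Number(process.hrtime.bigint() - start) / 1e6;
      console.log(`${hostname} via ${nameserver}: ${ms.toFixed(0)} ms`);
    }

    // The first query pays the full round trip to Route 53;
    // later ones are served from the resolver's cache.
    (async () => {
      for (let i = 0; i < 4; i++) await timeLookup('geoiplbr.domain.com', '8.8.8.8');
    })();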
We ended up leaving for Azure: Traffic Manager outperforms Route 53 by a factor of 2.
Eventually, we will roll our own GeoIP with DNS resolvers on an anycast subnet.
I do not understand how this level of "performance" can be tolerated. At 2 seconds for a DNS query, you are better off using your registrar's free DNS service!
Later queries are fast of course, as the results are cached (TTL).
Even if the DNS is very poorly configured, all queries after the first one will benefit from the cache! So the first few queries matter much more, and this is what we should be talking about instead of distributions and percentiles.
Said differently: if each of your visitors has to wait a second or two until the site comes up the first time, even if the site works normally afterwards, it may still give them a bad impression.
I personally measured the DNS delay on the first Route53 reply to be over 700 ms. For the author it is 2000 ms. These results are in the same order of magnitude, and make Route53 unsuitable for many applications. Of course, you could start hacking - keeping Amazon's cache warm by issuing queries through cron, or setting extremely long TTLs and hoping your visitors' DSL modems will keep your A records in cache as long as you asked - but these are just hacks trying to compensate for a first DNS query that takes SECONDS to process.
Route53 LBR DNS is not sold as "slow and requiring hacks". It's supposed to be fast, simple to run, and to integrate with different ecosystems. To me, it seems to be none of that.
After writing off Route53 as FUBAR, I switched from AWS to Azure: Traffic Manager offers the same features, and the first request takes less than 350 ms. There must still be some cruft in there, but at least it is manageable.
BTW, if you're into modern C++ and this kind of work interests you, please e-mail me at kenton at cloudflare.com. We're hiring!
The setup is: domain.com -> geoiplbr.domain.com with Cloudflare caching enabled. Nothing else fancy that could cause delays.
If I measure the TTFB for domain.com, I see a large DNS delay until about the 4th consecutive query - and then the DNS is no longer the limiting factor.
The same measurements on geoiplbr.domain.com normalize after the 2nd query.
It seems to me you have some kind of round-robin going on that does not share the DNS results.
Or maybe the caching is not done at the POP level?
I also looked into using Cloudflare Workers to write my own custom edge CDN, but they currently don't allow you to change where in the call chain requests are processed, or to tell Cloudflare what to cache vs. not cache. If they had some functionality that allowed you to easily write your own multi-layered CDN, this would be interesting.
The statements in the second paragraph are fortunately incorrect. With the exception of some security features, Workers totally takes over the incoming request. It can use flags on its subrequests to configure the cache as you need, and will soon have access to the raw Cache API.
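Concretely, a Worker can do something like this on each subrequest (a minimal sketch; the TTL value is arbitrary):

    addEventListener('fetch', event => {
      event.respondWith(handle(event.request));
    });

    async function handle(request) {
      // The flags on the subrequest - not the Worker's position in the
      // pipeline - decide what Cloudflare caches and for how long.
      return fetch(request, {
        cf: { cacheEverything: true, cacheTtl: 3600 }
      });
    }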
On the first paragraph: we have shifted some computationally heavy, horizontally constrained functions from our own servers to Lambda, which allows us to instantly scale to meet our inconsistent demand. The Lambda workers we are using average 5 to 11 s of execution time with approximately 800 MB of memory, and they use the CPU heavily. If Cloudflare Workers ever expanded to allow a similar scope, I would definitely take a second look at it.
Since you're guessing that Cloudflare's superior JS runtime plays a big role, it could be interesting to see whether it can compete against a Golang Lambda as well.
I guess one of the main challenges is ensuring that all resources are properly released when a worker is shut down. Releasing memory sounds pretty easy, since V8 does it for you. But releasing all I/O resources might be a bit harder, especially if they are shared between isolates.
Luckily, in the CF Workers environment, it turns out that all I/O objects are request-scoped. So, once a request/response completes, we can proactively release all I/O object handles bound into JS during that request/response. If JS is still holding on to those handles and calls them later, it gets an exception.
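To illustrate from the JS side (a contrived sketch of the observable behavior, not code from our runtime):

    let stashed; // globals survive across requests within an isolate

    addEventListener('fetch', event => {
      event.respondWith(handle(event.request));
    });

    async function handle(request) {
      if (!stashed) {
        const upstream = await fetch('https://example.com/');
        stashed = upstream.body; // an I/O handle scoped to *this* request
        return new Response('stashed');
      }
      try {
        // On a later request the handle has been released; reading throws.
        await new Response(stashed).text();
        return new Response('still readable (unexpected)');
      } catch (e) {
        return new Response('exception: ' + e.message);
      }
    }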
Yes, part of my question was whether the destructors/finalizers that the JS object bindings in C++ might impose are called fast enough to guarantee isolation and prevent resource leakage. It looks like in your case that happens through the request scoping.
Comparing them to Lambda@Edge makes sense, but Lambda@Edge is not a very good product.
(Full disclosure: my company competes with Cloudflare Workers.)
The fundamental problem I run into with Lambda@Edge is just that their request stages aren't a great abstraction (OpenResty/nginx has a similar problem). It really limits what kinds of problems you can solve.
But you also have a bunch of other options like:
PS. If you're a storage expert and building a hyper-distributed storage system interests you, e-mail me at kenton at cloudflare. We're hiring.
1) Provision a (free) connection secret from fauna.com
2) Import the FaunaDB driver (npm install faunadb)
3) Create a client object using your connection secret.
After that you are using FaunaDB, which is purely pay-as-you-go, with ACID transactions, joins, indexes, etc.
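In code, the three steps plus a first query look roughly like this (the secret and class name are placeholders):

    const faunadb = require('faunadb'); // step 2: npm install faunadb
    const q = faunadb.query;

    // Step 3: create a client with the connection secret from step 1.
    const client = new faunadb.Client({ secret: 'fn_your_connection_secret' });

    // Example: create a document, then read it back.
    client.query(q.Create(q.Class('posts'), { data: { title: 'hello world' } }))
      .then(doc => client.query(q.Get(doc.ref)))
      .then(console.log)
      .catch(console.error);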
It would be simple to write a tutorial like this for Cloudflare. Hello world on Azure functions: https://blog.fauna.com/azure-functions-with-serverless-node-... and on Lambda: https://blog.fauna.com/serverless-cloud-database
Once this is merged, it clears the way for us to add Terraform support (as terraform-provider-cloudflare wraps cloudflare-go).
There's been lots of interest from our customers in being able to manage Workers using Terraform, so it's high on the list.
 - https://github.com/terraform-providers/terraform-provider-cl...
 - https://github.com/serverless/serverless/issues/4948
At least you acknowledge that it's a bit silly to use a global benchmark to compare a global service with an intentionally-regionalized service.
For the specific use case you tested, workers on the edge absolutely make more sense than Lambda, but I think the headline is a bit click-baity.
Feel free to email zack [at] cloudflare.com directly if you like.
Not a very interesting benchmark. This would only measure network latency and spin-up time.
One cool property of our app is that it's rarely updated and mostly read.
And the goal is to achieve the lowest possible latency at the edge.
It scales beautifully; I am not sure which other architecture could help us keep this afloat with only 4 developers working on it.
So, we have a DynamoDB table which is replicated to multiple regions using DynamoDB Streams and Lambda.
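The replication function is conceptually tiny - here is a sketch of the pattern (not our actual code; table and region names are made up, and it assumes the stream view type includes new images):

    const AWS = require('aws-sdk');
    const replica = new AWS.DynamoDB({ region: 'eu-west-1' });

    // Triggered by the source table's stream; applies each change
    // to a replica table in another region.
    exports.handler = async (event) => {
      for (const record of event.Records) {
        if (record.eventName === 'REMOVE') {
          await replica.deleteItem({
            TableName: 'my-table-replica',
            Key: record.dynamodb.Keys
          }).promise();
        } else {
          await replica.putItem({
            TableName: 'my-table-replica',
            Item: record.dynamodb.NewImage
          }).promise();
        }
      }
    };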
For us, Lambda means achieving a lot without many developers and system administrators but I understand that not all problems yield gracefully to this pattern.
It seems that using Cloudflare Workers to trigger our Lambda function instead of API Gateway could prove to be cheaper.