
Cloud Computing Without Containers - zackbloom
https://blog.cloudflare.com/cloud-computing-without-containers/
======
kentonv
Hi, I'm the tech lead of Workers.

Note that the core point here is multi-tenancy. With Isolates, we can host
10,000+ tenants on one machine -- something that's totally unrealistic with
containers or VMs.

Why does this matter? Two things:

1) It means we can run your code in many more places, because the cost of each
additional location is so low. So, your code can run close to the end user
rather than in a central location.

2) If your Worker makes requests to a third-party API that is also implemented
as a Cloudflare Worker, this request doesn't even leave the machine. The other
worker runs locally. The idea that you can have a stack of cloud services that
depend on each other without incurring any latency or bandwidth costs will, I
think, be game-changing.

~~~
discoball
At 1/10,000th of a server per tenant (assuming 10,000 tenants), you could
achieve “hard” isolation and far more CPU/memory per tenant by dedicating a
TinkerBoard or something like that to each client and charging a flat $2/mo
with no hourly fees. That’s a business model I can see materializing that
could eat into your business.

EDIT to clarify: I’m assuming they can do 10,000 tenants per server. If
someone did 1 tenant per TinkerBoard and charged a flat $2/mo with no hourly
fee, that would be an interesting business model IMO, and it achieves hard
isolation between tenants.

~~~
roland-s
Scaleway out of France has been doing baremetal cloud nodes for years. I used
them during the beta and they are fantastic (despite slight latency due to
being in France). Their smallest baremetal node starts at €3/mo for 4 ARM
cores/2GB RAM/50GB SSD. Pretty cool infrastructure to be able to auto
provision physical nodes on demand, but I think it's a different use case from
the whole lightweight serverless processes on demand thing.

~~~
discoball
If only they supported NixOS... I would use them if/when they make the leap.

~~~
spindle
Are you sure they don't? NixOS can be installed on top of any Linux
distribution, unless the hardware is really weird.

ETA: It looks like it is complicated, but might work. See
[https://nixos.wiki/wiki/NixOS_on_ARM/Scaleway_C1](https://nixos.wiki/wiki/NixOS_on_ARM/Scaleway_C1)

------
boulos
Disclosure: I work at Google Cloud.

Can you say more about the assumed security model here? We built gVisor [1]
(and historically used nacl or ptrace) precisely because things like Isolates
aren't (historically?!) trustworthy enough for side-by-side. Plus the whole
V8-only ;).

Edit: But let me say, I still think this is awesome (and love seeing more
people jumping into the "just run code" space).

[1] [https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime](https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime)

~~~
kentonv
Hi, tech lead of Workers here.

This is tricky to answer concisely because we've implemented a huge number and
variety of defense-in-depth measures. (Plus, I'm in transit right now and
can't type much -- maybe I'll edit to add more details later.)

First, we believe Chrome has done a pretty good job hardening V8 over the
years, and we get a lot of comfort knowing there's a $15,000 bug bounty for V8
breakouts. We update V8 continuously, tracking the version shipping in Chrome.

That said, we obviously don't simply rely on V8 for everything. We've modeled
what a V8 breakout is likely to look like, and added a variety of mitigations.

Presumably, a V8 breakout bug is likely to allow an attacker to run arbitrary
native code within the Workers runtime process. That's bad, but they will
quickly run into some barriers to weaponizing this. For example, we obviously
run with ASLR, so an attacker looking for other isolates' data would be flying
blind and would likely raise segfaults. Any segfaults in production raise an
alert and are investigated. Similarly, we run a tight seccomp filter that
denies all filesystem and network access, and if an attacker ever tries to
invoke such syscalls, it will raise an alert and be investigated.

It's worth noting that we do not allow eval() (or any other mechanism of "code
generation from strings"), hence all code we execute has to have been uploaded
through our code deployment pipeline. This implies we have a copy of all code.
When a segfault raises an alert, we immediately look at the code that caused
it. Anyone using a zero-day against us is very likely to burn their zero-day
long before they manage to pull off a useful attack.

You're probably also wondering about Spectre. Here's a previous comment of
mine on that topic:
[https://news.ycombinator.com/item?id=18280156](https://news.ycombinator.com/item?id=18280156)

Again, this is just a couple of the things we're doing... there's really too
much to list in a HN comment. I hope to find time to write this up more
formally in the future. :)

~~~
jamescostian
> we do not allow eval() (or any other mechanism of "code generation from
> strings")

This is interesting, because preventing code generation from strings goes
beyond `eval()` and `new Function()` - it also includes things like the output
www.jsfuck.com can produce (which can indirectly call things like `eval()` and
`new Function()` without ever mentioning `eval` or `new Function` in its
source). How do you prevent things like that?

~~~
kentonv
V8's embedding API provides an explicit option to disable them.

[https://github.com/v8/v8/blob/53d3f5ba2a7ee77244bdee882da9da...](https://github.com/v8/v8/blob/53d3f5ba2a7ee77244bdee882da9da188daa4c39/include/v8.h#L9097-L9110)

Line 9097, SetAllowCodeGenerationFromStrings()

------
zapita
Of course a language-specific sandbox with a highly restricted domain-specific
API will be faster than an OS container or VM. The tradeoff is that it can run
fewer applications unmodified, so fewer people can use it.

I think Cloudflare workers are a cool implementation of domain-specific
scripting. I’ve used it and it’s great. And with their scale I’m sure they’ll
advance the state of the art in various ways. But there’s no radically new
insight here.

~~~
writepub
With the advent of WASM, the sandbox mentioned is no longer language-specific.
Threads and garbage collection are coming soon to WASM, which should open up
most relevant languages in use today to this more efficient sandbox.

~~~
ori_b
> With the advent of WASM, the sandbox mentioned is no longer language-specific.

Which removes many of the advantages.

~~~
MrEldritch
How do you figure?

~~~
skybrian
One issue might be that Cloudflare won't have the source and can't recompile
the code.

But perhaps web assembly is easy enough to decompile?

~~~
vnorilo
Wasm doesn't, as of now, have any means of code generation independent of the
JS API. There is also no way to obfuscate call targets or corrupt the heap
outside the module's linear memory. Any sandbox-breaking bugs that don't
involve JS would be codegen errors, AFAICT; in that case what you need to
analyze is just the wasm blob and its interaction with the codegen pipeline.

------
chaitanya
I can’t believe Cloudflare would run untrusted customer code side by side with
that of other customers. V8 isolates sound great but did the Chromium team
have this threat model in mind when designing them?

Not that long ago Cloudflare was bitten by a nasty bug where their own parser
mixed up HTTP responses from different customers
[https://bugs.chromium.org/p/project-zero/issues/detail?id=1139](https://bugs.chromium.org/p/project-zero/issues/detail?id=1139)

Cloudflare’s response to this incident did not inspire a lot of confidence.
What I am trying to say is that I won’t trust their workers implementation
unless a competent external security researcher audits it.

Edit: Related discussion
[https://news.ycombinator.com/item?id=18417257](https://news.ycombinator.com/item?id=18417257)

~~~
zackbloom
What about our response was lacking to you? I can tell you from the inside
that huge amounts of code are being moved to Rust to prevent that type of
vulnerability from ever happening again.

~~~
chaitanya
> What about our response was lacking to you?

Several observations recorded in the Project Zero report about CF's response
were problematic:

"I had a call with cloudflare, and explained that I was baffled why they were
not sharing their notification with me."

"They gave several excuses that didn't make sense, then asked to speak to me
on the phone to explain. They assured me it was on the way and they just
needed my PGP key. I provided it to them, then heard no further response."

"I can already see that the (huge) list does not contain many pages that still
have cached content from before the bug was fixed."

"Cloudflare did finally send me a draft. It contains an excellent postmortem,
but severely downplays the risk to customers."

Granted, I have not been keeping up with what has changed at CF since then, so
maybe things are better. But the decisions taken wrt Workers don't inspire a
lot of confidence. Even a Google Cloud engineer mentioned that Isolates
probably aren't trustworthy enough to run untrusted code side-by-side:
[https://news.ycombinator.com/item?id=18417257](https://news.ycombinator.com/item?id=18417257)

------
neilwilson
For quite a while now I've wondered why we do this. We take a physical
computer and put a program on it that slices the machine up into multi-tenant
abstractions called 'processes' so you can run more than one. Then somebody
runs just one program on that machine, in a process that uses the whole
machine, which then slices the machine up again into multi-tenant abstractions
called 'X' (for various versions of X - isolates, containers, Erlang actors,
database triggers).

If the Unix process abstraction is no longer fit for all purposes, why don't
we get around to de-layering the operating system so that the Unix process
abstraction is just another 'Virtual Machine' sat on top of something more
fundamental?

Is it time for the return of the pure hypervisor?

~~~
blasdel
You're asking for unikernels, which have had a recent renaissance:
[http://unikernel.org/projects/](http://unikernel.org/projects/)

Some of the density advantages depended on Paravirtualization, which is quite
difficult to make safe against Meltdown:
[https://xenbits.xen.org/xsa/advisory-254.html](https://xenbits.xen.org/xsa/advisory-254.html)

------
_wmd
> Simultaneously, they don’t use a virtual machine or a container, which means
> you are actually running closer to the metal than any other form of cloud
> computing I’m aware of

I recognize this is just a little marketing (which is fine), but I'm not sure
what this is supposed to mean any more.. containers have identical efficiency
to running on a raw VM, and VMs only trap to emulation in very few
circumstances (e.g. IO) any time in the past decade. Running pure compute load
in a container, guest VM or host VM should have almost identical performance.

~~~
madmax96
Especially since containers use vanilla processes and lightweight
access-control mechanisms - some of which I'd wager V8 uses too.

But they are running multiple clients' services in the same process, so that's
where they're saving: less context switching, smaller memory footprint, etc.

I'm interested in _how_ isolated the clients' applications actually are...

~~~
iforgotpassword
Well, theoretically, perfectly. Practically, there are some exploits lingering
in there for sure. But that's true of virtualizers too, so...

As the article says, faster spinup (since V8 is already running) seems to be
the main advantage for customers; saving an order of magnitude of memory is
certainly a big deal for the operator.

I think since you can compile pretty much anything to WebAssembly, porting
over some existing tools might even be viable.

------
ilovecaching
As someone who works for a large CDN and who has been working in the
networking space for over two decades, the one thing that really bothers me
about CF is how hyperbolic they are on their blog and twitter.

>> Unlike essentially every other cloud computing platform I know of, it
doesn’t use containers or virtual machines.

This is light in comparison to their posts about QUIC, anycast geolocation,
etc.

They surely have done a lot of interesting work, but they'd lead you to
believe they're hosting half the internet, and invented every advancement to
CDNs in the last decade.

~~~
zackbloom
I'm sorry you feel that way. In general, we feel our biggest target with this
language is the millions of people who wouldn't care about TLS 1.3, QUIC, etc.
if we didn't make it clear how important they are. To us it's not about the
people who know and love CDNs; it's all the people who don't that we're trying
to speak to.

------
cfstras
Gary Bernhardt kind-of predicted this.
[https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript](https://www.destroyallsoftware.com/talks/the-birth-and-death-of-javascript)

~~~
MrRadar
The point about this approach being "closer to the metal" than other cloud
providers definitely brought this talk to my mind. I just hope the nuclear war
he also predicted for 2020 doesn't come to pass :/

------
thosakwe
I know Dart isn't that loved on here, but I had a pretty similar idea a couple
of weeks back, a PoC serverless platform using Dart's Isolates, which are
shared-nothing:
[https://github.com/thosakwe/serverless_dart_platform](https://github.com/thosakwe/serverless_dart_platform)

I know that the article briefly addressed security, but in my head, I can't
even work out how they manage to securely manage multiple tenants within the
same process. This isn't a knock on them, it's a lack of knowledge on my part,
and I'm wondering how that would even work.

Isolates in Dart can be run based on different dependency graphs (i.e. you can
have an Isolate for user A's project, and a separate one for user B's, within
the same process). However, things like the current working directory are
shared by default. There's an IOOverrides class that lets you change this
contextually, so theoretically you could hack something together that patches
the entry point code to use it. I'm not sure how that could work in Node,
though. What if someone were trying to overwrite another user's files?

Overall, though, the #1 thing that I can't reconcile in my head is that if all
the isolates are running in the same process, wouldn't they all have the same
permissions? What is there to prevent someone from touching another person's
files, or, if they run in the same process as the server, more sensitive data?
AFAIK you can't run a Worker/Isolate as a different user within the same
process?

Not only that, but what about fork(2) bombs, etc.? The only thing I can think
of is for Isolates to run under the context of some user who not only has no
home directory, but also cannot spawn processes, or read/write/delete files.

Lastly, the only other thing I can't work out is whether the Isolates run in
the same process as the server (and if so, how they do this securely), or else
how they communicate with a separate process, given that technically any
Isolate could console.log (maybe via TCP sockets?).

The whole CloudFlare workers concept is super cool... I just don't understand
how it works. I feel like I'm missing some key here.

~~~
kentonv
Like JavaScript running in your browser, workers don't get direct access to
files or other resources on the machine. We provide a limited API which only
allows safe operations. Currently that means a Worker can send and receive
HTTP requests, and read and write from Workers KV (a distributed storage
system). We'll add more storage options in the future, but not raw disk
access.
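A minimal sketch of what such a Worker looks like, per the service-worker style the platform uses. The stub globals exist only so the sketch runs outside the Workers runtime, and the KV binding name `MY_KV` is made up for illustration:

```javascript
// Stubs standing in for what the real Workers runtime provides natively
// (addEventListener, Response, and a configured KV namespace binding).
const handlers = {};
const addEventListener = (type, listener) => { handlers[type] = listener; };
class Response {
  constructor(body, init = {}) { this.body = body; this.status = init.status || 200; }
}
const MY_KV = {
  store: new Map(),
  async get(key) { return this.store.has(key) ? this.store.get(key) : null; },
};

// The Worker itself: only safe operations are exposed -- HTTP in/out and
// Workers KV -- no filesystem, raw sockets, or process APIs.
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  const path = new URL(request.url).pathname;
  const value = await MY_KV.get(path);     // read from Workers KV
  if (value !== null) return new Response(value);
  return new Response('not found', { status: 404 });
}
```

Everything the script can touch goes through that limited API surface, which is what makes the "no raw disk access" guarantee above enforceable.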

~~~
thosakwe
Ahhh okay, makes a lot more sense. Thanks.

------
zhobbs
Fastly's CEO (a Cloudflare competitor) gave a pretty great tech talk on what
it means to be isolated:
[https://www.youtube.com/watch?v=FkM1L8-qcjU](https://www.youtube.com/watch?v=FkM1L8-qcjU)

Guessing we'll see a competitive product from Fastly in the near future.

~~~
whorleater
It's important to note that for caching purposes, you can already run VCL
(Varnish Cache Language) on Fastly's edge servers.

------
nickjj
Really interesting article but the ending is a little weird.

> This might mean Isolate-based Serverless is only for newer, more modern,
> applications in the immediate future.

It sounds like you're implying that unless your app is written in Node or
Go/Rust, it must be legacy. That's not really fair to say.

Plenty of modern apps are built with Python, Ruby and Elixir. I don't know if
they are all webassembly compile targets, but in either case, you should
probably rewrite that to be less "better write everything in Node or you're a
dinosaur!".

------
js4ever
Cloudflare Workers seems amazing, and I would love to migrate the few thousand
Node.js Lambdas I have over to CF Workers... But... I noticed the 1MB limit on
function size; most of my functions are between 1.5MB and 8MB depending on the
npm packages required. Also, execution time for my functions is between 200ms
and 30sec (loading remote data, transforming it, generating PDFs, ...), and
Workers are limited to 5-50ms!

~~~
remyg
Cloudflare Workers execution time is counted a little differently. When you
call fetch() in a Worker and are waiting for remote data, Cloudflare doesn’t
count that waiting towards the “execution time” limit. Other providers do.

Note: I work at Cloudflare
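The distinction can be seen outside Workers with Node's `process.cpuUsage()`; here a `setTimeout` stands in for the `fetch()` wait (a sketch of the billing concept, not Cloudflare's metering code):

```javascript
// A worker that awaits I/O accrues wall-clock time but almost no CPU time --
// and CPU time is what the Workers limit meters.
async function waitLikeFetch() {
  const cpuStart = process.cpuUsage();
  // Stand-in for `await fetch(...)`: 100 ms of waiting on "remote data".
  await new Promise(resolve => setTimeout(resolve, 100));
  const { user, system } = process.cpuUsage(cpuStart);
  return (user + system) / 1000; // CPU milliseconds actually consumed
}

waitLikeFetch().then(cpuMs =>
  console.log(`CPU time during a 100 ms wait: ~${cpuMs.toFixed(2)} ms`));
```

The 100 ms of wall time barely registers as CPU time, which is why a Worker that mostly waits on upstream requests stays within a 5-50ms CPU budget.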

------
luddy
The article says that pushing a new Lambda@Edge function takes 30 minutes.
That has not been my experience. I routinely push Lambda@Edge versions from
the AWS console, and the new versions typically run within a minute or so.
Have others seen anything like a 30-minute propagation delay?

~~~
zackbloom
Lambda@Edge is built on CloudFront, and pushing any change to CloudFront
generally takes 20-30 minutes in my experience. Can you tell me more about the
change you're making? Is it code or config that's changing?

~~~
luddy
I'm talking about using the Lambda@Edge console to push a new, numbered
version of a Lambda function written in Node.js. I have pushed 100s of
versions this way, and I would guess that the mean time until the new version
begins running is on the order of one minute. By "begins running", I mean that
the new version number appears in CloudWatch output.

No other changes to config.

------
jeswin
Do you have plans to open source this work?

It'd be huge for p2p/decentralized apps. As of now, you can only decentralize
data - this allows decentralized compute as well.

~~~
iovoid
[https://github.com/laverdet/isolated-vm](https://github.com/laverdet/isolated-vm)
is similar, and used by a competitor with a similar offering

------
lwansbrough
This is awesome. I noticed the "In an Isolate universe you have to either
write your code in Javascript (we use a lot of TypeScript), or a language
which targets WebAssembly like Go or Rust" -- this is a very interesting
statement! Being able to run wasm code means it could be possible to run .NET
Core apps (really all I care about in the world right now) using such a system
soon, thanks to work done for CoreRT
([https://github.com/dotnet/corert](https://github.com/dotnet/corert))

Really really interested to see where this could lead.

------
fulafel
> In an Isolate universe you have to either write your code in Javascript (we
> use a lot of TypeScript) [or Wasm]

Not really: There are many nice managed languages that use JS as compile
target. You can use ClojureScript, Scala, Purescript, Fable (F#), Haste
(Haskell) etc.

------
askaboutit
$5 minimum. Overpriced compared to any other faas service. Can only run one
worker at a time. No thanks!

------
nl
There was a company Rackspace bought maybe 6 years ago that was building a
semi-POSIX compatible layer on the v8 VM. They had Python running on it and it
looked pretty interesting.

~~~
nl
[http://www.zerovm.org/](http://www.zerovm.org/)

Not sure what progress has been made on it.

~~~
bklaasen
Looks like none in the past four years:
[https://github.com/zerovm/zerovm](https://github.com/zerovm/zerovm)

------
tschellenbach
Wonder if you could build something similar using Go. A goroutine's overhead
is only about 2KB. No idea how easy it would be to sandbox it.

~~~
wmf
It wouldn't be easy at all to sandbox it; that's the problem with every
language that isn't Java or JS. (And I think people gave up on Java sandboxing
a decade or more ago.)

~~~
madmax96
What about JavaScript makes sandboxing easier than, say, Go, Python, etc?

~~~
wmf
Just the fact that it was designed for sandboxing hostile code from the
beginning.

~~~
madmax96
How does that design actually manifest itself? I don't know very much about
this - what technical details about JavaScript make this easier? And then why
doesn't Node support this kind of sandboxing, but V8 does, if it's so much
easier to support sandboxing than other languages?

~~~
jerf
The primary advantage is that it's already done, really. It's not hard to
_define_ isolated environments in almost any language you could name ("I
should be able to have two runtime environments that support as much of the
original language as possible, and they should not be able to affect each
other" - a thousand i's to dot and t's to cross, but that pretty much
captures the idea), but _implementing_ them is another thing entirely. So much
of the rest of language design tends to cut the other way, making sure that
things are hooked together deeply for the convenience of the programmer.

I'm not particularly convinced Javascript is that much better than any other
language as the language goes, but the implementation is far ahead of almost
anything else. I know of other implementations of this idea in other languages
("Safe" in Perl [1], various attempts at sandboxing in Python which have
generally failed, Safe Haskell [2], many others), but without the browser use
case driving them forward, they tend to be poorly tested and even decay over
time. It tends to be a use case popular enough to get some stab at support,
but not popular enough to quite result in a robust, real-world-tested library
support. The browser is an exceptional use case.

[1]: [https://perldoc.perl.org/Safe.html](https://perldoc.perl.org/Safe.html)

[2]:
[https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell](https://ghc.haskell.org/trac/ghc/wiki/SafeHaskell)

------
whoisjuan
But this thing only runs JavaScript. The constant AWS bashing in this article
doesn't make sense when the only fair comparison they can make here is AWS
Lambda with a Node.js runtime against Cloudflare Workers.

~~~
zackbloom
You can run any WebAssembly targeting language in a Worker.

------
ilaksh
So I'm guessing that in order to use modules from npm you use WebPack?

And what's available is the same as a Service Worker in Chrome. So you can use
http fetch but there is no raw socket etc. Does it support WebRTC (not sure if
there is a way to actually use that on the server but maybe there is some
weird use case for communicating between Workers).

Also wondering what options there are for files or databases etc.

It sounds like the design is for them to handle individual requests, but just
out of curiosity, how long can the workers keep running?

~~~
treelovinhippie
Yep I've got an index.js file that creates a simple API gateway via a basic
switch() on the url pathname. Then individual files for each function. Bundle
them into a single script with webpack and copy that into the Workers UI
(there's an API for deployment too... eg can use Serverless Framework).
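A minimal sketch of that switch()-based gateway (the route names and handler return values are made up for illustration):

```javascript
// Dispatch on the URL pathname; in the real bundle each case would call
// into a separate per-function module before webpack merges everything
// into a single script.
function route(url) {
  switch (new URL(url).pathname) {
    case '/api/users':  return 'users handler';
    case '/api/orders': return 'orders handler';
    default:            return 'not found';
  }
}
```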

1MB max size, max 5-50ms CPU time, max 128MB RAM:
[https://developers.cloudflare.com/workers/writing-workers/resource-limits](https://developers.cloudflare.com/workers/writing-workers/resource-limits)

That said I've got a worker script running a pass-through websocket to GraphQL
origin server and it keeps the connection open.

I'm doing authentication within the worker too. Long-term I can see the
ability to run absolutely everything there: microservices, GraphQL server
(Apollo has a beta), auth, static SPA files, SSR, etc. $0.50 per million calls
($5/mth minimum). And they intend to be within 10ms ping of 99% of the global
population.

If they can work out how to do databases on the edge (they're working on it),
then I can see a future where you could build scalable apps that serve
millions of users for maybe less than $100/mth across the entire stack.
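Back-of-envelope on the pricing quoted above (using only the figures from the comment):

```javascript
// $0.50 per million requests with a $5 monthly minimum: the minimum alone
// already covers 10 million requests per month.
const costPerMillionRequests = 0.5; // USD
const monthlyMinimum = 5;           // USD
const requestsCoveredByMinimum = (monthlyMinimum / costPerMillionRequests) * 1e6;
```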

------
z3t4
If you know how long a request takes, you can roughly calculate how many
requests per second your server can handle... When working in
JavaScript/Node.js, one thing that is both sync and very slow (almost 1ms) is
console.log! So next time you run benchmarks, try removing the console.logs =)
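The arithmetic behind that, spelled out (the 1 ms figure is the commenter's estimate for one synchronous console.log, not a measured value):

```javascript
// If each request blocks the event loop for ~1 ms, a single-threaded
// server tops out around 1000 requests per second.
const msPerRequest = 1; // estimated synchronous cost per request
const maxRequestsPerSecond = 1000 / msPerRequest;
```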

It's a mystery to me why we need millions of instructions just to put "hello
world" in a pipe though. I'm thinking of learning low level programming, but
I'm too scared of what I might find.

~~~
ilaksh
That's what's nice about programming retro computers like the C64. The number
of layers you have to dig through is small and doesn't lead to a lot of
unnecessary bloat or lag.

------
polskibus
The article says: "Unlike essentially every other cloud computing platform I
know of, it doesn’t use containers or virtual machines. "

Surely there were others before like Azure Service Fabric ?

~~~
inopinatus
I’ve seen the automated provisioning and scheduling of jails and zones for
application deployment, pretty much since they were invented. Automated UID
provisioning for isolation and resource constraint and application package
deployment, at least since the nineties.

Ultimately most of us want to put files on a server and start a process.
There’s just an endless variety of constantly re-invented ways to do that.
Choose your scaffolding.

~~~
icebraining
Jails and Zones are essentially containers - they're ways of isolating
processes from each other by basically constraining what they can see and
change, at the kernel level.

What they're describing is actually running code from multiple people on the
_same_ process.

~~~
inopinatus
A process is any instance of a program that's being executed. That definition
doesn't change whether it's a program running as an ordinary Unix process, as
an Isolate in a V8 runtime in a container on a virtual machine on a bare-metal
blade in a DC rack, or as a hypothetical MIPS-like CPU on my whiteboard.

By "choose your scaffolding", I'm just saying "how much of the OS kernel would
you like to re-invent?"

That isn't a wholly cynical position. Doing so might introduce new
capabilities or cost efficiencies, which seems to be Cloudflare's pitch. But
very often we get excited by the possibilities and lose sight of the tons of
additional stuff now required to achieve quite simple things. In the worst
case (e.g. VMware) we end up re-inventing hardware abstraction, scheduling,
networking, memory management, and filesystems i.e. most of the damn kernel.
Docker is a less egregious offender but still deserving of the "too much
scaffolding" label.

In the very best outcome, the capability gets stripped down to the core
enhancement and pushed back into the expectations for a base OS, which is for
example why practically every OS now comes with a hypervisor and a container
mechanism.

------
sholladay
So awesome to see this on the front page just a day after I was complaining
about Docker on another HN thread.

Seems like a very pragmatic system. Get some great benefits for a reasonably
well defined and common use case (JavaScript apps that don't need to run
external binaries), without doing anything crazy just to support apps that
don't quite fit in the architecture. It might not work for everything, but for
many apps this will be a very simple and low overhead solution. Kudos!

------
mleonhard
Worker execution is limited to 15s of wallclock time. What happens if the
client has a slow network connection and takes >15s to receive the HTTP
response of a worker?

~~~
zackbloom
It's actually only required that requests _start_ within 15 seconds; they can
take longer to complete.

------
fuball63
I'm working on an experimental function-as-a-service based on CGI apps. It has
a lot of the advantages from the article:

- Process isolation

- Close to hardware

- No cold start

But it also has some advantages of its own:

- Any language, including compiled languages

- Local testing, backed by a standard

I am still experimenting with the scaling potential; my biggest concern is
memory requirements. Currently I'm writing the project's blog using the
project, and then am going to do some performance tests.

It's hosted on a 5 dollar digital ocean instance at bigcgi.com.

[edited for formatting]

~~~
jchw
How would this not suffer from similar issues to Lambda? The start times are
going to be worse per request than application servers today, because all of
the initialization can no longer be amortized. Long before containers became
the hot new thing, people were using FastCGI to amortize these costs and,
importantly, to scale - CGI requires expensive process creation on every
request. You can't take advantage of fast, usermode context switching like
with goroutines or event-based architectures like Node.js, because each
request is a process in old-school CGI.

And doesn't that put you back at square one with isolation? Cloudflare is
running multi-tenant loads; you ideally want fine-grained control over
resource utilization and access controls. With CGI, you pretty much have to
reinvent Docker to get the same level of isolation...

I'm not sure I follow how this is superior to other solutions except for that
CGI is easier to test locally.

~~~
fuball63
A FastCGI implementation is something I've looked into if plain CGI turns out
not to be fast enough. Initially I want to see exactly how the performance
compares with plain CGI, and then iterate from there.

As for resource utilization/constraints, it runs on freebsd and uses rctl to
limit memory usage and number of processes. One thing I want to find out is
how to balance user resource constraints with the total load of the shared
server.

~~~
jchw
Oh, I see - so FreeBSD's resource controls are used for this. At least for
process isolation that seems like a solid approach.

Still, I do wonder how much of the problem with containers is container
overhead. To me a big difference is about what costs can be amortized. With
CGI, pretty much every cost is paid at every request. With long running
daemons, the per request overhead is removed. With Cloudflare's functions, the
scheduling, request, and startup overhead are (theoretically) removed, leaving
mostly just execution.

With just CGI there's some win since you can drop the scheduling overhead, but
I think you would probably lose that advantage as soon as you hit the upper
limit where forking per request stops scaling. With FastCGI you can enable
better scaling but since you now have to manage long running processes, you
are officially in the scheduling business, though you could probably beat a
container scheduler like Kubernetes in terms of raw overhead.

I don't know. Maybe this is a good idea, but I'll admit my first impression
was mostly "isn't this just CGI as it always was?"

------
mleonhard
I read through nearly all of the docs looking for testing guidance. How can I
include a CloudFlare Worker in my automated integration tests?

------
newnewpdro
The implication this article makes that "containers" are somehow further from
the metal than sandboxed javascript is disingenuous.

~~~
zbentley
How so? Context switching has a cost, and the solution outlined within seems
to achieve similar functionality while not paying nearly as much. The same is
true for quota/resource management (what cgroups etc. would provide you with
containers on Linux).

I haven't used this, just read the same article you have, but the claim seems
reasonable.

~~~
newnewpdro
You can run javascript in a container and its distance from the metal is no
different than if you just ran the javascript outside of a container.
Containers are just processes with limited visibility into the rest of the
host, they're nothing like VMs.

We've _deliberately_ incurred the context switching costs of using multiple
processes for _decades_ because the hardware-enforced isolation of address
spaces via the MMU is _desirable_. Of course you can gain efficiency by
throwing away the multiprocess model and running separate security domains in
the same address space of a single process, we're not breaking new ground here
by going backwards.

------
sriku
Been exploring this idea in the app development context for a while now. I
wished PNaCl had filled this role, but now think WebAssembly may take up this
space instead as it has similar performance characteristics. The attraction of
both these for us is that many languages can compile down to be run in wasm.

------
ing33k
This tech can be a game changer for applications where latency matters more
than anything else. My question is about portability. If I spend enough time
and port certain parts of my application to this, will it be tied to the CF
Workers runtime? Is it possible to run Workers on other clouds?

------
dblotsky
> An Isolate-based system runs all of the code in a single process and uses
> its own mechanisms to ensure safe memory access.

What property of this application allows it to share CPU time and memory more
efficiently than the OS can?

------
hinkley
The longer we go down this road, the more the sales pitches feel like Erlang.

------
miguelmota
Can someone shine some light on how secure Workers are compared to containers
that use cgroups and namespaces for virtual isolation?

------
vloz
Are you planning to release any free plan on Workers, as Firebase Cloud
Functions does? The current offer is $5/month...

~~~
geostyx
Is $5/month too much?

~~~
kamaln7
(not OP) It is for a side project that gets very little traffic. I'm still
paying that $5/mo just to play with this tech but I can't say it's not too
much for my use case. I guess that's not the target market for Workers anyway

------
mleonhard
Can I get an automatic email when my Workers begin timing out?

Will there be logs or graphs to help investigate such issues?

------
weberc2
Do workers allow you to run init code on cold-boot only, or do you have to do
all of your init with each request? If the latter, presumably that would also
mean that a language like Go (compiled to wasm, obviously) would have its
runtime start up with each request?

~~~
kentonv
Yes, you can run init code at cold-start time. JavaScript's global scope is
evaluated once per isolate created, and you would normally instantiate your
WASM modules at the global scope as well.

------
liquid153
So does this mean my function(s) must be written in JavaScript? Or can it run
wasm assemblies?

~~~
geostyx
Both! [https://blog.cloudflare.com/webassembly-on-cloudflare-workers/](https://blog.cloudflare.com/webassembly-on-cloudflare-workers/)

------
ZoomStop
> Even more importantly: Amazon is bad. It is monopolistic [...] Its owner
> should have his immoral hoard of wealth forcibly expropriated by the state
> before his power grows so great that all of society is warped by it.

That is a heck of a statement to make.

~~~
ZoomStop
This was a reply to a different article, sorry.

------
kees99
Very cool tech. But ridiculously expensive. According to [1] you pay $5 per
10,000,000 requests, <10ms each. That works out to less than 27.8 hours of CPU
time per month. And that's assuming near-perfect utilization of that 10ms
window and ignoring upfront cost of Pro plan, which this is an add-on to.

Compare that to Linode/Hetzner/OVH/etc. where you get half-CPU for that money
(i.e. 360 hours CPU time per month).

And yeah, sure. Not apples-to-apples. FaaS, infinite scalability, yadda,
yadda. But computing http responses is still computing http responses. Is this
FaaS magic dust really worth 1400+% premium?

[1] [https://www.cloudflare.com/products/cloudflare-workers/](https://www.cloudflare.com/products/cloudflare-workers/)
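The back-of-envelope arithmetic behind those figures, for anyone checking:

```javascript
// CF Workers side: $5 buys 10M requests at <=10ms CPU each.
const requests = 10_000_000;      // included requests per month
const cpuMsPerRequest = 10;       // per-request CPU ceiling
const cpuHours = (requests * cpuMsPerRequest) / 1000 / 3600;
console.log(cpuHours.toFixed(1)); // ~27.8 CPU-hours/month, at best

// VM side: a half-CPU VPS running all month.
const halfCpuHours = (30 * 24) / 2;
console.log(halfCpuHours);        // 360 CPU-hours/month
```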

~~~
kentonv
Another reason why this is not an apples-to-apples comparison is that most
people don't run their VMs at 100% utilization. In fact, I'd wager the vast
majority sit at single-digit percentages on average. Linode will still charge
you for the full CPU, of course.

~~~
kees99
Well, CF always charges you for full 10,000,000 requests/mo too.

Also, it is super-common (especially on low end) to offer burstable CPU VMs
[1,2]. So you are _not_ getting charged for full CPU, unless you specifically
ask for it.

[1]
[https://www.hetzner.com/cloud?country=us](https://www.hetzner.com/cloud?country=us)
(bottom of the page, toggle between "default" and "dedicated CPU")

[2] [https://aws.amazon.com/ec2/instance-types/t3/](https://aws.amazon.com/ec2/instance-types/t3/)
(product details, "Baseline Performance/vCPU" column)

~~~
geostyx
You aren't paying for Workers for the CPU time IMO, you're paying to be in 155
locations around the world so that your code runs closer to your user. You
couldn't split a hetzner/ec2 box across locations like that. A lot can be
accomplished in 10ms on Workers in my experience since they don't count
waiting on network requests as part of that 10ms.

If requests only require <10ms of CPU anyway, $5 per 10 million requests is
pretty reasonable to me.

------
rat9988
Does the fact that a worker can run in one of the many data-centers of
Cloudflare guarantee some sort of multi-regional redundancy/high-availability
in case they have a problem with a region?

~~~
zackbloom
Yes. One of our goals with this, not yet mentioned, is to get rid of the idea
of availability zones or regions. Every worker runs in every data center. Any
data center can be taken offline and traffic will be seamlessly rerouted at
the BGP level.

~~~
rat9988
This is great news! In my opinion, you should emphasize this a bit more in
your writing; it's one of the most important factors. It was the central
discussion point for us when we were designing a platform for our webapp. We
may reevaluate Cloudflare for the next iterations if you can provide this
multi-region high availability for your k/v store.

------
davidjnelson
This is awesome. Would be even better if the cpu time could be increased.
Right now it’s 5ms for smaller accounts, 50ms for business accounts (~$200/mo),
or custom for large customers.

~~~
geostyx
What kinds of things are you wanting to do with more CPU time?

~~~
davidjnelson
Database and third party api access were the use cases that initially came to
mind.

~~~
zbentley
Do you mean total runtime or actual CPU time? Total runtime would be affected
by the use cases you mention; CPU time would not (or would not be affected
nearly as much, at least). A function waiting on an HTTP response is not
burning CPU.

~~~
davidjnelson
Interesting, I thought they were measuring runtime when they said cpu time.
AWS Lambda is priced on runtime so I figured it would work the same. Thanks
for pointing this out.

I re-read the article and it is indeed cpu time not runtime for Cloudflare
Workers. Sweet!

The part of the article that clarifies this:

> This is not meant to be a referendum on AWS billing, but it’s worth a quick
> mention as the economics are interesting. Lambdas are billed based on how
> long they run for. That billing is rounded up to the nearest 100
> milliseconds, meaning people are overpaying for an average of 50
> milliseconds every execution. Worse, they bill you for the entire time the
> Lambda is running, even if it’s just waiting for an external request to
> complete. As external requests can take hundreds or thousands of ms, you can
> end up paying ridiculous amounts at scale.

> Isolates have such a small memory footprint that we, at least, can afford to
> only bill you while your code is actually executing.

> In our case, due to the lower overhead, Workers end up being around 3x
> cheaper per CPU-cycle. A Worker offering 50 milliseconds of CPU is $0.50 per
> million requests, the equivalent Lambda is $1.84 per million. I believe
> lowering costs by 3x is a strong enough motivator that it alone will
> motivate companies to make the switch to Isolate-based providers.

~~~
elithrar
Yep: 50ms CPU time is a lot. Network calls aren’t meaningful CPU time, which
is why you can use Workers to call out to third-party APIs. You may want to be
mindful of doing that during a user request as it adds a dependency &
latency, but it’s certainly possible to call out to a translation API or
mapping API & cache the results...

~~~
davidjnelson
> Network calls aren’t meaningful CPU time

I understand how async IO works. I'm used to AWS Lambda charging from request
start to end, regardless of CPU usage.

------
xte
Yes, "The Network is the Computer" (Sun's motto), and the best implementation
of it I know is Plan 9's. Which means _me_ (the user) at the center, with _my_
control over _my_ system.

Running anything on someone else's computer is not "The Network is the
Computer"; it's simply someone else's computer. It also can't scale, no matter
how much you work on it, no matter what wonderful tech you create/integrate.
Many now talk about edge computing because of that, and I fear it may succeed
for a while, but it still can't scale.

We are people, not puppets; we need to be autonomous and social, not a herd of
sheep with very few shepherds.

------
craftoman
Containers were always crap. Everyone brags about speed and security. That's
all bullshit; it's like jails for BSD. There are tons of exploits, and the
same goes for speed. Nothing compares to bare metal. Have you ever benchmarked
your applications? My team did, and found results disturbing enough to write
an article and be king for a day. Docker is made for rock star developers, not
rational system engineers.

~~~
elithrar
That is completely untrue. Containers (distinct from the Docker /daemon/)
provide significant security gains.

Try breaking [https://contained.af/](https://contained.af/)

~~~
craftoman
I know it's kinda old but think about how many exploits are out there in the
jungle.

[https://www.twistlock.com/labs-blog/escaping-docker-container-using-waitid-cve-2017-5123/](https://www.twistlock.com/labs-blog/escaping-docker-container-using-waitid-cve-2017-5123/)

------
z3t4
This is cool. But why should I run stuff in a Cloud Worker when it can be run
in the browser's service worker!?

~~~
elithrar
Different use-cases. You can do crypto operations in a hosted worker
(sign/validate requests, auth clients), make caching decisions that don’t
require you to consume (precious) client bandwidth, and make routing decisions
based on both client properties (geo-location, auth state, etc) and server
properties (backend uptime, latency, etc).

A good example is HTTP middleware in your web framework of choice: there are
things it does that can’t be done safely or correctly on the client, but that
you may want to do at the edge before it hits your backend entirely.

------
WrtCdEvrydy
I would love to check this out... locally on my own computer... for evaluation

~~~
jeromegn
This is what we do at fly.io:
[https://github.com/superfly/fly](https://github.com/superfly/fly)

It's based on node.js right now. It's a bit clunky, but we have many success
stories of customers running it locally, on their CI and deploying to
infrastructure.

Currently the open source version is not the same as the production version
(due to the distributed nature of our platform.) We're working on fixing that
very soon and going all-in open source.

~~~
WrtCdEvrydy
And now you have my attention.

------
throwaway487548
What, BSD jails all over again?

