> The call to hello.Greet looks like a regular method call
That’s a departure from how components interact in boq — an internal and widely used production platform that has _some_ of the features from the paper. There component interfaces _are_ RPC interfaces (e.g., Stubby / gRPC + protocol buffers), and interaction between them is possible exclusively through the component interfaces. Hence it’s very explicit at the call site that an RPC is being made (which could happen to execute locally with all the standard RPC functionality — context and deadline propagation, etc.).
RPCs looking like regular method calls sound a bit scary (easy to miss in code reviews); I wonder if enforced naming conventions + IDE + code review tool support would be enough.
Edit: it seems to require passing a context object, so readers won't confuse it with a local call (from https://serviceweaver.dev/):
sum, err := adder.Add(ctx, 1, 2)
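To illustrate, a minimal sketch of that shape in plain Go (hypothetical names, not the actual Service Weaver API): the component is an interface whose methods take a context, so the call site reads the same whether the implementation behind it is local or a remote stub.

    package main

    import (
        "context"
        "fmt"
    )

    // Adder is a hypothetical component interface. Every method takes a
    // context.Context, which is the call-site hint that the call may cross a
    // process or machine boundary (deadlines and cancellation travel with it).
    type Adder interface {
        Add(ctx context.Context, x, y int) (int, error)
    }

    // localAdder is an in-process implementation; the runtime could just as
    // easily hand the caller a generated RPC stub satisfying the same interface.
    type localAdder struct{}

    func (localAdder) Add(_ context.Context, x, y int) (int, error) {
        return x + y, nil
    }

    func main() {
        var adder Adder = localAdder{} // or a remote stub, chosen at deploy time
        sum, err := adder.Add(context.Background(), 1, 2)
        fmt.Println(sum, err)
    }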
---
Also, the paper claims that most benefits come from a non-versioned serialization format:
> Most of the performance benefits of our prototype come from its use of a custom serialization format designed for non-versioned data exchange [...]
However, I don’t understand why local RPC calls have to serialize protocol buffer messages — can’t they already pass them as-is to the local handler?
(disclaimer: a googler, no internal knowledge on ServiceWeaver)
> However, I don’t understand why local RPC calls have to serialize protocol buffer messages — can’t they already pass them as-is to the local handler?
I didn't read the paper in enough detail to know the answer to this, but mightn't this enable different implementation languages for different components? In my experience, it's difficult to accomplish that reliably without using a language-agnostic serialization format (like proto).
Even if that's the goal, it seems like a handler could determine whether it could elide the serialization depending on the implementation details of the components.
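To sketch that idea (hypothetical types, not from the paper): a caller could hold either the co-located implementation, which receives the request object as-is, or a cross-process stub, which is the only path that needs to serialize.

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // GreetRequest is the message exchanged between caller and callee.
    type GreetRequest struct{ Name string }

    // greetLocal is the co-located path: the request object is handed over
    // as-is, with no serialization at all.
    func greetLocal(req *GreetRequest) string { return "Hello, " + req.Name }

    // greetRemote stands in for a cross-process call: only this path marshals
    // (JSON here is just a placeholder for whatever wire format is used).
    func greetRemote(send func([]byte) []byte, req *GreetRequest) string {
        payload, _ := json.Marshal(req)
        return string(send(payload))
    }

    func main() {
        req := &GreetRequest{Name: "world"}
        fmt.Println(greetLocal(req)) // same address space, no marshaling

        // A toy "transport" standing in for an RPC client.
        send := func(payload []byte) []byte { return []byte("Hello from another process") }
        fmt.Println(greetRemote(send, req))
    }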
I see I may have been unclear — I was surprised they don't use protobufs (which one should be able to pass as-is without serialization to the locally-deployed component), but apparently using a custom optimized format for non-local calls is the primary motivation (not local calls with grpc requiring serialization — that shouldn't be the case).
However, now that I think again about the serialization format choice, it may limit the size of monoliths (in terms of the number of people / teams contributing to them). As the number of contributors grows, the likelihood of bugs in a binary grows, teams adopt more elaborate qualification processes, and they become much more sensitive to binary rollbacks as the remedy for bugs discovered in prod. They may then institute policies like requiring every change to be protected by a feature flag (aka an experiment).
If a non-versioned serialization format is used, the platform cannot possibly roll back a single component. However, versioned serialization alone won't be enough to support per-component rollbacks — it at least requires independent component qualification (where each component is tested against "stable" versions of the other components) plus rollback testing, to make rollbacks from A2 -> B2 to A2 -> B1 safe.
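To make the rollback hazard concrete, here's a toy sketch (JSON standing in for the wire formats; this isn't the paper's actual encoding): a positional, non-versioned layout breaks as soon as writer and reader are built from different source versions, while a tagged, versioned one lets an old reader skip fields it doesn't know.

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // "Non-versioned": fields are written positionally in declaration order,
    // so both sides must be built from the same source version. A v2 writer
    // emits three fields.
    func encodePositionalV2(id int, name, extra string) []byte {
        raw, _ := json.Marshal([]any{id, name, extra})
        return raw
    }

    // A rolled-back v1 reader still expects exactly two fields and cannot
    // tell which field is which, or that one is extra.
    func decodePositionalV1(raw []byte) (int, string, error) {
        var fields []any
        if err := json.Unmarshal(raw, &fields); err != nil {
            return 0, "", err
        }
        if len(fields) != 2 {
            return 0, "", fmt.Errorf("layout mismatch: got %d fields, want 2", len(fields))
        }
        return int(fields[0].(float64)), fields[1].(string), nil
    }

    // "Versioned": fields carry tags (JSON keys here, field numbers in proto),
    // so an old reader simply ignores the field it doesn't know about.
    type RequestV1 struct {
        ID   int    `json:"id"`
        Name string `json:"name"`
    }

    func main() {
        // Non-versioned: a v2 writer talking to a rolled-back v1 reader fails.
        _, _, err := decodePositionalV1(encodePositionalV2(7, "alice", "x"))
        fmt.Println("positional decode:", err)

        // Versioned: the unknown "extra" field is skipped, the rest still parses.
        var req RequestV1
        _ = json.Unmarshal([]byte(`{"id":7,"name":"alice","extra":"x"}`), &req)
        fmt.Println("tagged decode:", req)
    }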
I wonder if it's an explicit design choice — i.e., whether Service Weaver supports monoliths up to a certain organizational size (and then you should split into separate service weaver apps)?
In your experience, how does this kind of approach behave with asynchronous dependencies?
Let's say you start from a codebase with four portions (call them services, modules, whatever): A, B, C, D. A sends a (synchronous) remote procedure call to B, which sends a message over a message bus that is also used by C and D. C and D do not talk to each other except over the bus.
It sounds like this approach would identify the remote call dependency between A and B (which could be split into different deployment units), but not the message bus usage. Or, at least, it can't identify who is subscribing to a topic where B pushes its events.
As a result, you would get two deployment modules:
They're claiming that the runtime will figure it out for you.
It wouldn't be too surprising if this is the sort of thing that an optimizing compiler or query planner could do better than a human.
If not, you're probably at the scale where performance regressions are caught and rolled back at early phases of rollout.
Like most magic, it's either going to make things 100x better or 100x worse, depending on how leaky the abstraction is at its current state of maturity.
I'm going to (at least I should) design my application logic very differently if I know in advance that the call might take a while or time out completely. If I'm not offered that info at development time, it's just going to turn into a terrible mess in production. Ain't nothing any framework can do about it if the language itself lacks the semantics to express the developer's constraints.
Interesting! This has been my recommended pattern for years now: a monorepo with multiple entrypoints. Worker and api and emailer etc etc services are running independently, but it's one codebase running with different options. All the developer benefits of a monolith and all the devops benefits of sane and isolated deployments.
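A bare-bones version of the pattern in Go (the role names are made up): one binary, and a flag picks which service(s) a given deployment runs.

    package main

    import (
        "flag"
        "log"
    )

    func startAPI()     { log.Println("api: listening") }   // serve HTTP here
    func startWorker()  { log.Println("worker: polling") }  // drain job queue here
    func startEmailer() { log.Println("emailer: sending") } // send emails here

    func main() {
        role := flag.String("role", "all", "which service to run: api, worker, emailer, all")
        flag.Parse()

        services := map[string]func(){
            "api":     startAPI,
            "worker":  startWorker,
            "emailer": startEmailer,
        }

        if *role == "all" {
            // Development mode: everything in one process.
            for _, start := range services {
                start()
            }
            return
        }

        // Production mode: one role per deployment, same binary for all of them.
        start, ok := services[*role]
        if !ok {
            log.Fatalf("unknown role %q", *role)
        }
        start()
    }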
Also often allows you to simplify the development experience dramatically -- instead of spinning up twenty processes and running a complex live rebuild process or manually rebuilding and restarting certain parts, you just need to start the whole thing, because in development it all runs in a single process (or, at least, fewer processes than when it runs in production)
> Also often allows you to simplify the development experience dramatically
Always the afterthought. At a previous job we had a bunch of microservices developed on people's computers, and the only place they all ran together was the handful of integration environments.
Interesting paper, I've thought about this topic a lot.
In previous projects I defined systems in a single code base, and parts could be deployed separately by providing different configuration files.
It was a very productive approach: one could run the whole system in a single process during development, which made writing integration tests a lot easier than spinning up dozens of docker images.
It still required some manual work, and deployments were still too static for my liking. Ideally it should be possible to split off and scale subparts dynamically.
In the Clojure world there are several projects now that explore splitting up services in a transparent way.
For example Electric Clojure splits up your code into frontend and backend parts, making the frontend-backend split transparent.
Another project is Rama, which does something similar but for distributed stream processing and partitioning.
I'd love to explore something like this but for enterprisey service meshes: the programmer just defines services, and a compiler decides how to split these over different machines, and all the RPC/serialization/deserialization is done for you.
Regarding your last paragraph, interesting idea about the compiler doing the work, but isn't RPC/serdes basically built into CORBA or OLE transparently? Never used the former, and the latter was so long ago that I've forgotten it, but I thought that was the idea.
Google has always been ahead of the curve: from MapReduce to Bigtable and gRPC. They're always building for scale and productivity at the company. Service Weaver is an evolution of that: the idea that developers themselves don't have to decide how to separate and manage services. There's so much plumbing to gRPC that's unnecessary for developers to deal with. But the service boundary and development style that lets you write modular code is really important. The thing is, most code has overlapping concerns and cross dependencies. Network-based boundaries using APIs were a way to allow teams to operate in isolation while providing external APIs to access different services, but it comes at a huge cost. It was the hammer for people and team scale, not compute scale. Now the tools are evolving: we can actually operate on monolithic codebases at huge scale, as Google and others have shown, and that means the deployment technology can also evolve to cater to that mode of development while seamlessly handling the technical details of separating and deploying the code for specific services across data centers and the cloud at large.
What does it all mean? Well, it's a technology built for Google scale. It may have merits in other places, as a lot of tech has, but at the same time, for 90% of teams this doesn't matter. You have a monolithic code base in a single repo and you can deploy and vertically or horizontally scale quite easily depending on your requirements. For companies with 200+ engineers split across 15-20 teams this might matter. They may already be doing some sort of microservices or service splitting while still using a monorepo. Being able to remove a lot of platform-level code that you'd otherwise manage yourself, since it's now an open source thing, is advantageous because you can go back to focusing on the business case, not the glue code.
There is nothing about being ahead of the curve with gRPC.
That is only people getting the point that parsing JSON and XML all over the place doesn't scale, and that there is a reason why SUN-RPC, DCE, CORBA, DCOM, Java RMI, and .NET Remoting existed in the first place.
Feels a lot like old school CORBA - quoting from the paper: “Components may be hosted by different OS processes (perhaps across many machines). Component method invocations turn into remote procedure calls where necessary, but remain local procedure calls if the caller and callee component are in the same process.”
This looks like it parallels the shift from MapReduce's paradigm of one binary per execution graph node to FlumeJava (and Apache Beam's) monolithic binary for all worker nodes that reconfigures itself as necessary for each stage. My experience is that Flume/Beam is a lot nicer in almost all ways, so I'm not surprised that the same thing works for services too.
Looks like it. I used to work with it some years ago. I will never use it again or any other actor model based platform with remoting and magic schedulers.
I think this paper is well-intentioned, but is trying to treat the symptoms rather than the cause.
The paper focuses on microservices, and then tries to avoid claims of "they just don't like microservices" by describing the ways in which microservices are improperly used. Do they go back and compare this to monoliths or other architectures? Nope; it's really just "hey I have another microservices idea", heavily gilded. They mention "monolithic applications divided into logically distinct components", but you could just claim your microservices are divided into logically distinct components.
They also seem to completely ignore the problem that a logical separation doesn't mean your components are better off. In a complex system, often completely separate components still need to be integrated together in order for the system to function at all, much less operate efficiently. It's not a design flaw to combine different things. It depends on the application. So just separating things logically isn't some scientific computing advancement, it's just categorization.
In reality, their solution (a "single binary business logic application" and "an interface that can combine them") is literally a description of shell scripting with Unix tools. Don't get me wrong, that obviously works great, since it's been popular for 44 years (older than IPv4). But if you want to come up with some kind of modern paradigm for distributed computing, maybe we should flesh it out a bit more. What we have here is a Google engineer's attempt to make a paper suggesting we make shell scripting for the web, without much to show for it.
(Personally, I think the more people try to control the interface, the worse things get. The best and most long-lived solutions in all of computing have had almost no interface at all; a raw TCP stream, 3 raw file descriptors, a set of random arguments, and a set of random key=value pairs, have enabled all modern computing paradigms to flourish)
- Making remote calls seem like they are local resulted in poor design decisions; the benefit of SOAP/REST was that people considered what the interface of a useful service should be.
- Why not flip it and look to move groups of microservices onto the same machine, updating how the apps communicate?
- If component boundaries are fine-grained, the number of local/remote combinations of services relative to each other increases, along with the testing burden; even though the system hides remote deployment, it still should be tested for.
- Incorporating this with storage, e.g. dynamic shard rebalancing, would be super cool
> If the code of the components can be written so that the communications mechanisms, and process separation mechanisms are irrelevant, then those mechanisms are details. And details are never part of an architecture.
> That means that there is no such thing as a micro-service architecture. Micro-services are a deployment option, not an architecture.
I have read the paper, although maybe not deeply enough.
It is an interesting idea, but I'm not sure I'm fully convinced.
Sure, you can parse remote calls and package imports to find dependencies. But this assumes applications are integrated using remote procedure calls. What if applications talk to each other using a message bus? Asynchronous patterns are meant precisely to decouple publishers and subscribers - in number (how many processes will read a message?), identity (who are those processes?) and time. It looks like this system would not be able to decouple portions of code talking to each other using a message queue.
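A rough sketch of why the bus edge is invisible to that kind of analysis (hypothetical types, not from the paper): the A-to-B dependency is an interface reference the framework can see, while the B-to-C/D coupling is just a topic string.

    package sketch

    import "context"

    // A -> B: the dependency is a typed interface reference, so static
    // analysis (or the runtime) can see that A calls B and decide whether to
    // co-locate or split them.
    type B interface {
        HandleOrder(ctx context.Context, order string) error
    }

    type A struct {
        b B // explicit, analyzable edge
    }

    func (a *A) Submit(ctx context.Context, order string) error {
        return a.b.HandleOrder(ctx, order)
    }

    // B -> C, D: the only "edge" is a topic name on a bus. Nothing in the
    // types ties the publisher to its subscribers, so the framework cannot
    // tell who consumes "orders.created", or that C and D need to be deployed
    // at all.
    type Bus interface {
        Publish(ctx context.Context, topic string, payload []byte) error
        Subscribe(topic string, handle func(payload []byte))
    }

    func wireUp(bus Bus) {
        bus.Subscribe("orders.created", func(payload []byte) { /* C's handler */ })
        bus.Subscribe("orders.created", func(payload []byte) { /* D's handler */ })
    }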
Implementing the proposed approach is not easy without hurting maintenance, scalability, and readability. At Google, there is a dedicated framework for writing distributed systems that implements the proposals from the document.
Sounds a lot like NextJS's concept of framework-defined infrastructure: https://vercel.com/blog/framework-defined-infrastructure . A single file that includes both backend and frontend code infers a static bundle deployed to a CDN + server-based resources for the backend.
I've been experimenting with this sort of thing for a while, but without automation to determine the deployment topology (nice!). Lacking a better name, I've been calling it a monolithic microservice architecture. Obviously nothing here prevents you from creating a distributed monolith, but it can allow a smaller team to enjoy better horizontal scaling without as much cost, while still retaining ease of forward development.
Interesting indeed. While still doing two services we manage local development using gems in ruby which allow us to seamlessly include the other service’s code as a library and we write unit tests using that integration. The two binaries are built as two gems with versions and tested locally. When it comes to deploy we deploy both at the same time. We get many advantages mentioned here while the services operate independently.
I recently wrote a blog post about how I think WasmCloud is very close to this with the new component model. They use IDL instead of language constructs to specify boundaries. There are pros and cons to either approach.
Performance is evaluated against a single example implementation (section 6.1) and 9x improvement was achieved when co-locating into a single process. To be fair this seems a reasonable thing to do if you can get the application to be efficient enough to allow it - but there may be very good reasons not to do it in many applications.
Yeah - I think that their argument would be that their approach enables you to do that, and that there are significant gains on the table without that last step.
My main critique is that it's one example, and it would be good to see the technique exercised across a number so that we can see the strengths and weaknesses.
~2.5x when not co-located isn't bad either; and the beauty of it is that the programming model lets the runtime system perform these relocations, based on the application profile.
The component model seems like it's just a re-invention of JBoss-like application servers.
Maybe it is explained somewhere in the paper but how do atomic updates work if at least one of the components has to be a singleton that does not support rolling upgrades?
I am a big proponent of the fat lambda paradigm, but it falls short in a few areas, and then you need to split:
- security boundaries (e.g. isolating user code)
- dependency overheads
The proclet name brings back so many old memories of the wonderful technology at startent-networks! What is described here is how (in principle) various things (inter)worked.
I think that this might actually be something new - you have the same topology as a microservice architecture, but rather than actually programming and deploying one binary per microservice, you build a single binary with all of the functionality and it gets deployed as a fleet of microservices, running a different subset of all of that functionality depending on what it's deployed as. Which means you have only one build and one deploy step and atomic deploys, like a monolith, but you also get to scale each component independently and isolate data like with microservices. A really nice looking middle ground.
There’s absolutely nothing new about a binary that can be run in a number of different roles depending on configuration (either statically configured or assigned by some controller process).
For example, FoundationDB does exactly this (via dynamic assignment), as do many other databases. All of the HashiCorp runtime tools also do it. I’m sure there are also much earlier examples.
Mobile code and "agent systems" were very fashionable 20 years ago. Java introduced built-in RMI with automatic stub downloading. In 1998 Sun published https://en.wikipedia.org/wiki/Jini that was an extension of this idea. Several higher level frameworks emerged (JavaSpaces among the most prolific).
Reading the paper two thoughts come to mind:
- "What's old is new again"
- "those who do not learn from history are doomed to repeat it"
I find it very amusing that most of the people who abhor dynamic linking of libraries and executables as an over-complex, error-prone mistake of computer history these days seem to have not the faintest trace of restraint when they're about to turn EVERYTHING into a RPC/RMI over complex, multi-layer (and I don't mean OSI), often completely proprietary and opaque network abstractions like there's no tomorrow.
In the same vein, I now get a kick out of people criticizing Java and .NET application servers, while at the same time praising delivering WASM containers with Kubernetes YAML spaghetti into production.
Need to get that VC money into the hot WASM space.
The paper unfortunately hides that in reality you have to pass a context object in your RPC calls, hence there is no ambiguity whether you are calling a potentially remote object.
Regarding the comparison with RMI, the authors did mention it:
> Java RMI use a programming model similar to ours but suffered from a number of technical and organizational issues [58] and don’t fully address C1-C5 either
Apart from that, it looks like Java RMI allowed remote objects returning other remote objects, rather than only immutable values. With that you could abuse it by making a call to one java.rmi.Remote object, getting another java.rmi.Remote object in response, then passing it around, and then finding a totally different subsystem suddenly make RPCs (however, such abuse probably would be easy to spot in a code review, as it requires a modification to the remote object interface).
---
The authors also acknowledge that it doesn’t solve the distributed computing challenges:
> our proposal does not solve fundamental challenges of distributed systems [53, 68, 76]. Application developers still need to be aware that components may fail or experience high latency
I think, at least in terms of latencies, their platform could occasionally inject latency into some percentage of the tasks and then verify whether any alerts fire, if there is a fear that components have become dependent on a certain deployment shape (within a cluster).
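For instance (a hypothetical decorator, not anything the paper describes), the runtime could wrap a component's stub and delay a small fraction of calls:

    package sketch

    import (
        "context"
        "math/rand"
        "time"
    )

    // Adder is a placeholder component interface.
    type Adder interface {
        Add(ctx context.Context, x, y int) (int, error)
    }

    // latencyInjector delays a fraction of calls, to flush out callers that
    // silently assume the component is co-located (and alerts that should fire).
    type latencyInjector struct {
        next  Adder
        frac  float64       // fraction of calls to delay, e.g. 0.01
        delay time.Duration // injected latency, e.g. 50 * time.Millisecond
    }

    func (l latencyInjector) Add(ctx context.Context, x, y int) (int, error) {
        if rand.Float64() < l.frac {
            select {
            case <-time.After(l.delay):
            case <-ctx.Done():
                return 0, ctx.Err()
            }
        }
        return l.next.Add(ctx, x, y)
    }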
One can see how powerful the approach is by looking, for example, at the Jini concepts of a Lease and the LeaseRenewalService:
A server program can register an object (the client-side implementation of a service) in a ServiceRegistrar (to make it discoverable and downloadable). Registration is lease-based, so the server has to renew it periodically. But renewal can be delegated to a LeaseRenewalService that does it on the server's behalf, so that the server can go to sleep (i.e. not use any server machine resources).
All of the above happens without any party a-priori knowledge about any code that needs to be present at use site - code is downloaded automatically on-demand - the only thing common to client and service is a Java interface.
There is no way (or no need) to be that generic. You should have a choice (at least that's my choice), because your app is separated by functionalities, not technical aspects.
Unless I'm misunderstanding, separation by functionality is only useful for engineering teams. Once it's in production, the only thing that matters are the technical aspects. By technical aspects, I'm interpreting to mean mem/cpu usage, throughput, security requirements, etc.