
Colossus: A New Service Framework from Tumblr - r4um
http://engineering.tumblr.com/post/102906359034/colossus-a-new-service-framework-from-tumblr
======
pothibo
What I'm about to write is abstract so you may want to skip this comment.

I see many big products releasing open-source frameworks, and most of those
frameworks are very similar to other frameworks that already exist.

Moreover, the goals of those frameworks are also very similar.

But they do it nonetheless. My understanding of why this happens is that
every serious application with enough load is unique in terms of architecture
and also in terms of culture.

And I believe it's a mix of the two that makes it very hard for any company to
use something that is already available. If you have the workforce to build
something that really looks and talks like you, why would you use something
that does only 90% of the job?

And this makes me wonder a lot about why frameworks like that are
open-sourced. Is it really for the Greater Good? Or is it a recruiting tool?

I'm ambivalent, and I don't have any answer.

~~~
lbotos
It may not be "directly usable" but I'm sure the code could be examined and
patterns gleaned in a way that it contributes something to the greater dialog
of code. I know that I spend a lot of time reading different code from all
kinds of projects with no intention of actually using the code, just trying to
figure out how it's built.

------
amelius
> The general structure of a microservice is that it concurrently processes
> small requests from potentially many clients and keeps little to no internal
> state.

So what is a microservice exactly? Is it just a piece of code that runs based
on a trigger coming from the browser? And what is the main responsibility of
the framework? Is it the routing of requests and responses to/from the actual
microservices?

Further questions that come to mind: (1) Are the results of microservices
cached somehow? (2) Also, if a cached item is no longer required, is it
automatically freed? (3) If the output of a running service is no longer
required, is that service automatically stopped?

~~~
tekacs
In light of your further questions, I feel like I should post.

We're working on a microservice framework which:

    
    
        - does the routing to and from the actual microservices
        - provides a structured enough interface (through our small language), that it can:
          - cache the results of requests out to any service
          - allow services to find and rely upon each other to compose functionality
          - run one lightweight runtime per service per machine (which never shuts down, but has little overhead due to its lack of state)
          - serve static assets both to clients and to microservices requesting them
          - allow services written in differing languages to talk to each other
    

Perhaps this gives you a picture of what might fit the job description of a
microservice framework? :P
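
To make the routing and caching bullets a bit more concrete, here is a minimal
Python sketch of a router that dispatches requests to stateless handlers and
caches their results. This is not our framework's API (nor Colossus's): the
names `Router`, `register`, `call`, and `evict` are hypothetical, and the
sketch assumes handlers are pure functions of hashable arguments, which is
what makes caching their output safe.

```python
from typing import Callable, Dict, Hashable, Tuple


class Router:
    """Hypothetical router: maps route names to stateless handlers and caches results."""

    def __init__(self) -> None:
        self._routes: Dict[str, Callable[..., object]] = {}
        self._cache: Dict[Tuple[str, Tuple[Hashable, ...]], object] = {}
        self.handler_calls = 0  # counts real handler invocations (cache misses)

    def register(self, route: str, handler: Callable[..., object]) -> None:
        self._routes[route] = handler

    def call(self, route: str, *args: Hashable) -> object:
        key = (route, args)
        if key not in self._cache:
            # Cache miss: route the request to the handler and remember the result.
            self.handler_calls += 1
            self._cache[key] = self._routes[route](*args)
        return self._cache[key]

    def evict(self, route: str) -> None:
        """Drop cached results for one route, e.g. when they are no longer needed."""
        for key in [k for k in self._cache if k[0] == route]:
            del self._cache[key]


router = Router()
router.register("/greet", lambda name: f"hello, {name}")
first = router.call("/greet", "amelius")
second = router.call("/greet", "amelius")  # served from the cache
```

Because the handler keeps no state of its own, the router is free to cache,
evict, or re-run it at any time, which is the property the bullets above rely
on.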

... and if this sounds like what HTTP servers do, you'd be partially right, but
building such apps today is pretty non-trivial and hard to orchestrate.

The focus on microservices is making it much more feasible to build apps like
this, and we're personally betting on an approach which focuses on allowing
each microservice to express as much as possible in its own little corner
(using our language, which we use both for app structure and communications,
and which gives different programming languages a neutral meeting point).

We're over at [http://wym.io/](http://wym.io/), where there's a demo of our
language builder, for now. Hopefully this is low key enough. :P

~~~
amelius
This sounds very useful.

One further question. Say a service A calls service B. And let's say service B
needs several seconds to complete, and emits a progress indicator every second
(I assume this is possible?) Will A be able to read these progress indicators
also every second, and send its own progress indicator to its caller?

~~~
tekacs
It depends on the case, but an easy way to emit progress indicators in our
system is to make the long-running service a generator. It looks like this:

    
    
        g = wym('/email') #=> Generator(id=<...>)
        g.on('value', (v) ->
            ...
        )
    

and /email looks like:

    
    
        function next(inp) { ... }
        return generator(next);
    

Sending values in uses a slightly different syntax, and manually walks the
generator through. :)

A more trivial way is to pass in a callback for the service (we call them
routes) to send its updates back to, but this requires more code to send
values in.

Edit: to be clear, using a Generator as above isn't very microservice-ey and
will trigger our framework to start tracking local state to keep the generator
around, which it will auto-free when it believes it can. This is a necessary
evil with any long-running code, though. It is better to break up the code so
that it may be called multiple times to progress state, and to return
generator('/someRoute') if possible. :P
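
For the original question (A reading B's progress every second and emitting
its own), the relay pattern can be sketched with plain Python generators. This
is only an in-process illustration: `service_a` and `service_b` are
hypothetical names, there is no network hop, and it does not use our
`generator(...)` helper or the `wym(...)` call shown above.

```python
from typing import Iterator


def service_b(steps: int) -> Iterator[int]:
    """Long-running service: yields percent complete after each unit of work."""
    for done in range(1, steps + 1):
        # A real service would do one slice of its work here.
        yield done * 100 // steps


def service_a(steps: int) -> Iterator[str]:
    """Caller: consumes B's progress as it arrives and emits its own progress."""
    for percent in service_b(steps):
        yield f"A: waiting on B ({percent}%)"
    yield "A: done"


updates = list(service_a(4))
# updates[0] == "A: waiting on B (25%)", updates[-1] == "A: done"
```

Each value is produced lazily, so A's caller sees an update as soon as B emits
one rather than after B has finished.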

------
shizcakes
Why not just use Finagle, which is almost the same thing, NIH aside?

[1] [https://twitter.github.io/finagle/](https://twitter.github.io/finagle/)

~~~
dansimon
Lead dev here:

Before writing Colossus I had worked with Finagle for several years, and we
still use Finagle for some of Tumblr's older systems. There's definitely
overlap in functionality between the two and Colossus definitely borrows some
ideas from Finagle, but there are also several significant differences:

For the use cases that led to its development, Colossus is significantly
better performing. In the blog post, when I mention "performance problems we
faced with existing frameworks we tried", Finagle was the framework we had
moved away from. For example, with Colossus you can keep most I/O related code
in the event loops without having to use Futures or any other type of out-of-
thread concurrency. This alone has a pretty big impact on throughput for the
kinds of small requests our services handle. Overall we gained a 4-5x
improvement in throughput and much better tail latency.
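
As a rough illustration of what "keeping I/O related code in the event loop"
means, here is a minimal single-threaded sketch in Python. It is not Colossus
code (Colossus is written in Scala) and it says nothing about the throughput
numbers above; `handle`, `serve_once`, and `demo` are hypothetical names, and
the sketch assumes the handler is small and never blocks.

```python
import selectors
import socket


def handle(request: bytes) -> bytes:
    """Hypothetical request handler: small, fast, and non-blocking."""
    return request.upper()


def serve_once(sel: selectors.BaseSelector) -> None:
    """One turn of the event loop: read, handle, and reply on the loop thread itself."""
    for key, _events in sel.select(timeout=1.0):
        conn = key.fileobj
        data = conn.recv(1024)
        if data:
            # The reply is computed and written right here: no Future is
            # created and nothing is handed off to another thread.
            conn.sendall(handle(data))


def demo() -> bytes:
    """Send one request over a local socket pair and return the reply."""
    client, server = socket.socketpair()
    sel = selectors.DefaultSelector()
    sel.register(server, selectors.EVENT_READ)
    try:
        client.sendall(b"ping")
        serve_once(sel)
        client.settimeout(1.0)
        return client.recv(1024)
    finally:
        sel.close()
        client.close()
        server.close()
```

The trade-off is that any handler which blocks stalls every connection on that
loop, so this style only suits the kind of small requests described above.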

Finagle still has its own implementations of Future, Try, Duration, and other
types that have been part of the Scala standard library for quite some time
now. This makes it very unwieldy to use with other popular Scala libraries,
especially Akka, which we use quite heavily now. Finagle also still does not
build for Scala 2.11 (though it looks like they're close).

Lastly, because Finagle is written on top of Netty, I think a lot of Netty's
Java-ness bleeds through: factories, builders, thread pools, etc. Also,
between Finagle, its companion util lib, and Netty, you have a massive amount
of code to sort through to understand how something works. That becomes a
pretty big barrier to entry when trying to do anything beyond the basics, and
was a big factor preventing us from writing more services.

~~~
nnythm
Engineer on Finagle here. I used to work at Tumblr when we started running into
problems with Finagle. In large part it was because we didn't understand how
it worked, and were doing things like reinventing different pieces of Finagle,
like load balancing and service discovery.

I think saying Colossus is "significantly better performing" is a bit of a
stretch, although I haven't seen it in the past year. I think the main win was
that you called epoll faster than Netty, which I think was because you were
using a faster timer. What are the performance comparisons you've done
recently?

------
remon
I'm curious about the motivation to use Scala for such a relatively small
codebase.

~~~
johncoltrane
A good enough occasion to learn the language?

~~~
remon
Not sure if "learning opportunity" should necessarily be on the priority list
for a framework that apparently is intended to serve as the backbone for a lot
of service development within a company the size of Tumblr ;)

------
tempodox
That name is no good. It is reminiscent of “Colossus: The Forbin Project”, a
sci-fi apocalypse from 1970 that demonstrates how a single stupid user
decision destroys the computing industry for all time. You don't want to
remind anyone of that embarrassment.

~~~
pjc50
It reminds me of the 'Colossus' computer built by Turing, Flowers et al., and
I've never heard of that movie.

It is a slightly odd name for a framework though.

~~~
jimmcslim
Especially one that espouses building a system out of a collection of loosely
coupled microservices, as opposed to a single 'colossal' monolith!

~~~
akamaozu
First thing I thought when I read the description of the name. Ironic much?

