
LambCI – A continuous integration system built on AWS Lambda - hharnisch
https://github.com/lambci/lambci
======
adamb
There's a lot of harsh commentary, but I think people here are missing the
point. The fact that so much software needs root access to an (often mutable)
global environment in order to properly build is a _bug_.

There are an increasing number of build systems that encourage squashing these
bugs. The resulting build outputs are simpler and are often more portable.
They're also easier to reason about. That translates to simpler deployment,
simpler operations, and fewer edge cases to debug.

IMHO, the most promising answer to the 5 minute limit is finer granularity and
better caching of dependency inputs.

------
somesaba
Cool! I had an idea for something like this, but instead of having each build
be its own lambda event, I wanted to make each individual test its own lambda
event. The goal is to have the build time for a complex project boil down to
the time it takes to set up + run the longest test.
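A rough sketch of that fan-out idea, assuming a hypothetical `run-single-test` Lambda function and boto3's async `InvocationType="Event"` (function and field names here are illustrative, not LambCI's actual API):

```python
import json

def build_payloads(repo, commit, tests):
    """One event payload per test, so each test becomes its own Lambda run."""
    return [{"repo": repo, "commit": commit, "test": t} for t in tests]

def fan_out(payloads, client=None):
    """Fire one async invocation per payload.

    In real use, client = boto3.client("lambda"); passing None dry-runs
    the loop without touching AWS."""
    for p in payloads:
        if client is not None:
            client.invoke(
                FunctionName="run-single-test",   # illustrative name
                InvocationType="Event",           # async, fire-and-forget
                Payload=json.dumps(p).encode(),
            )
    return len(payloads)
```

With every test running concurrently, wall-clock time approaches setup time plus the single slowest test, as described above.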

~~~
falsedan
We're doing something similar, but using Mesos. We bundle tests up so we
don't have to eat the setup costs for every test, and we track test pollution
by splitting bundles when they fail and running the halves separately again.
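That bundle-splitting step can be sketched as a recursive bisection (a toy model; `run_bundle` stands in for whatever actually executes a batch of tests):

```python
def isolate_failures(bundle, run_bundle):
    """Re-run a failed bundle in halves to separate genuine failures from
    test pollution (tests that only fail in combination with others)."""
    if run_bundle(bundle):        # whole bundle passes: nothing to report
        return []
    if len(bundle) == 1:          # a single test still fails: real failure
        return bundle
    mid = len(bundle) // 2
    return (isolate_failures(bundle[:mid], run_bundle)
            + isolate_failures(bundle[mid:], run_bundle))
```

A bundle that fails as a whole but whose halves all pass in isolation yields an empty failure list, which is exactly the signature of test pollution.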

~~~
dominotw
Where do you store build caches? npm/gems etc.?

~~~
falsedan
Dependencies? On the host; they're ephemeral but they'll run multiple builds
between coming into existence and disappearing.

~~~
dominotw
But the first build always takes a hit, which makes build times unpredictable
and build-time regressions hard to track.

------
dkarapetyan
Cute.

> No root access
> 5 min max build time
> Bring-your-own-binaries – Lambda has a limited selection of installed software
> 1.5GB max memory
> Linux only

There's a reason Jenkins is still used so widely. It's not because of
utilization or all the other things pointed out. When your project gets big
enough, managing the CI pipeline turns into a distributed systems problem with
distributed queues, locks, error/failure recovery, and all the other headaches
that such systems bring. Heck, reporting alone on a test suite with 12k tests
is a problem in and of itself.

~~~
adamb
In my (limited) experience, treating your CI pipeline like a distributed
system is a design smell. It leads to build processes that are difficult to
test, fix, and iterate on.

When a build system can only be effectively invoked by CI/CD, it starts to
pervert developer incentives. People need to check things in before they can
be sure they work. They don't bother with tiny fixes because of the inertia.
Flaky jobs get a quick rebuild, because reproducing a build failure locally is
complex enough that they'd prefer to avoid it if they can.

Over time, these add up to a system that grows through accretion, which is the
enemy of both agility and understandability.

Better is a build process that uses simple, reusable components that work
equally well on developers' machines. These tools can be tested, refined, and
replaced incrementally, using the same build processes that the rest of your
code base does. You can do this without coupling your build processes to the
specific way(s) that Jenkins (and company) model builds or their
configuration.

~~~
dkarapetyan
Here's a problem statement for you. You have ~12k tests that take > 40 hours
to run sequentially. What do you do?
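Back-of-envelope, here's the parallelism those numbers imply (ignoring setup overhead and uneven test durations):

```python
# Rough arithmetic for the problem as stated.
TESTS = 12_000
SEQUENTIAL_HOURS = 40

# Average test duration: 40 h of work spread over 12k tests.
avg_seconds = SEQUENTIAL_HOURS * 3600 / TESTS       # 12 seconds per test

def workers_needed(target_minutes):
    """Workers required to hit a target wall-clock time,
    assuming perfect load balancing and zero setup cost."""
    return SEQUENTIAL_HOURS * 60 / target_minutes

# ~480 workers for a 5-minute wall clock; ~60 workers for 40 minutes.
```

Even under these idealized assumptions, you're managing hundreds of workers, which is the distributed-systems problem being described.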

I know how we've solved the problem to provide as much validation as possible
before shipping something to production and at pretty high rates of code
churn. Whereas what you're suggesting is untenable on a large enough project.
That's like saying drink your milk and have a hearty breakfast. Nice
platitudes, but not actual engineering. Our solution is not unique, in fact:
Shopify and other big shops follow the exact same practices
([https://www.youtube.com/watch?v=zWR477ypEsc](https://www.youtube.com/watch?v=zWR477ypEsc)).
Not because they don't know any better and haven't heard of setting up proper
build pipelines using principles from immutable infrastructure but because at
large enough scale you need mutability.

Jenkins was just an example. We don't use Jenkins but you do need something
that manages workers and their lifecycle. Saying reduce your test runtime to 5
minutes and have better engineers and tools doesn't cut it.

~~~
DanielBMarkham
Good discussion guys. Please keep going.

Isn't the architecture of your build directly related to both the architecture
of your system and your deployment?

If so, why would somebody think that a monolithic app, even one with threading
and workers built in, would be better than simply engineering your own as you
go along? After all, this is supposed to be engineering, right? Not "How to
use Jenkins"

I agree that platitudes aren't solutions, but code smells are the kind of
thing that lead one to actually take ownership instead of perhaps using the
same paradigm only larger, yes?

Apologies if I missed the point, dkarapetyan.

~~~
dkarapetyan
Code smell is a little ill-defined. Given two experienced enough engineers,
they'll smell different things based on the experiences that have led them to
that point. The general rough guideline is, I guess, "things should be as
simple as possible but not simpler," and depending on what set of requirements
you've optimized for, it might not smell right to someone who values a
different set of requirements.

------
illumin8
This looks very cool, but the 5 minute build time limit (an inherent
limitation of the Lambda service) makes this less than ideal for a build
system. The author does address this by recommending that you use Docker
containers on ECS as an alternative for long running builds.

------
a_imho
Hardware is usually the cheapest component of software manufacturing, but we
found we wanted CI/CD to spin 24x365 as much as it could, increasing the
resolution to roughly single commits with the shortest possible cycles. With a
sizeable codebase and a thorough test suite, the AWS bills went up so quickly
that even the proponents decided it was not worth it. We restored our old CI
infra and were able to add a couple of new servers too. Throughput increased
considerably, with money to spare. Still an interesting experiment, but it
showed that burning money on Amazon doesn't default to moving faster.

------
nzoschke
Very nice!

This looks very close to the ideal CI infrastructure. I'm used to waiting on
queues and long VM or container boots and configuration on other services.

We can almost certainly count on Lambda getting longer execution times and
higher memory limits. We can also count on containerization solving the root
problem.

We should also be building software with the goal of tests that run within
reasonable limits like this.

`time make test` takes 39 seconds on my business's Go projects. I'd consider
a 5m test suite serious tech debt. The time that developers wait for feedback
on tests and deployment is becoming a business bottleneck in the continuous
delivery age.

~~~
rgbrgb
Wow, what's your secret to such fast builds? We're at ~35 minutes to build
and test our Rails monolith on Wercker. I'm guessing you're not hitting a db
too much or loading PhantomJS for end-to-end tests of a web UI?

~~~
Intermernet
At a guess, it's because they're using Go. The build and test processes are
refreshingly (ludicrously) fast in that language.

Note that doesn't mean you should switch to Go, as the ridiculously fast
compile times could be outweighed by other factors (retraining, lack of
specific features, etc.)

I would however investigate it as an option as it seems to have found a place
in the hearts of many former rails shops!

------
nulltype
Is a Google App Engine application also serverless?

~~~
buckbova
Perhaps that's PaaS? Lambda and others are now described as FaaS (function as
a service).

This might help:

[http://martinfowler.com/bliki/Serverless.html](http://martinfowler.com/bliki/Serverless.html)

Google has a separate cloud functions offering:

[https://cloud.google.com/functions/](https://cloud.google.com/functions/)

~~~
hactually
Kinda surprised Google doesn't have Golang as a supported language. AWS
offers Node.js, Java, and Python. I wonder if it's because they can charge
more for slow-to-start, VM-based languages?

------
empath75
You win buzzword bingo for today.

------
dllthomas
"LambCI - A continuous integration system built on surface dwellers"

------
oneplane
How is it serverless if it runs on an Amazon server? Also, how is it
serverless if you need to consume a service? (AWS in this case)...

Every time I see something nice, there is this increasing chance that I'm
gonna end up sad because it requires some sort of external provider like AWS,
DO, Heroku, GCE... I don't have any of those and I don't want any of them.

~~~
btown
"Serverless" here refers to not needing a pre-provisioned server to exist in
advance of the request. Sure, the physical hardware exists in AWS, but
previously you'd need some sort of CI system running on a server that is up
and accepting web requests at any time you might push to GitHub. With AWS
Lambda, which LambCI runs on, a lightweight server boots up in real time in
response to that webhook, runs code, and shuts down. So you have a CI server
that can respond any time of the day or night, but only consumes resources
when it's actually running.
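A minimal sketch of that model: a handler invoked once per push webhook, then gone. (The field names follow GitHub's push-event payload; the build step itself is a placeholder, not LambCI's actual code.)

```python
def handler(event, context=None):
    """Invoked per push; the 'server' exists only for this call's duration."""
    repo = event["repository"]["full_name"]
    commit = event["after"]            # head SHA of the push
    # ... check out `commit`, run the build, report status, then exit.
    # No long-lived CI daemon runs between pushes.
    return {"repo": repo, "commit": commit, "status": "queued"}
```

You pay only for the seconds the handler actually runs, instead of for an always-on CI box waiting for the next push.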

~~~
inopinatus
Serverless also describes the application invocation model.

The design intention for serverless applications uses pub-sub events rather
than client-server calls. That's a major enabler of async processing and
containers-on-demand, and it looks like LambCI has followed the pattern to a
tee.

If you have to accept requests from a non-event-driven world, AWS offers its
API Gateway to provide a listening server endpoint, but I think it's telling
that this was not available when Lambda was released, and that LambCI does not
need it.

