Launch HN: LayerCI (YC S20) - Staging servers that act like (and replace) CI
174 points by colinchartier on Jan 31, 2021 | 61 comments
Hi HN, Lyn & Colin here. We’re co-founders of LayerCI (https://layerci.com), which gives you a modern DevOps experience (CI/CD & staging environments) with as little work as writing a Dockerfile.

Most teams need CI/CD (run the build and deploy every time a developer pushes) or staging (host a server with my app in it to share), but current approaches always have at least one of these problems:

- Simplistic (only run unit tests)

- Slow (wait 10 minutes to run the same repetitive setup steps like "npm install")

- Complex (cache keys, base images, a Slack channel to reserve staging servers, …)

We’ve spent over a year iterating with our customers to build a product that solves all of these problems.

Our configuration files (Layerfiles) look like Dockerfiles, so regular developers can write and maintain them. Here's one that creates a staging server for create-react-app:

FROM vm/ubuntu:18.04

RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add - && curl -fSsL https://deb.nodesource.com/setup_12.x | bash && apt-get install nodejs python3 make gcc build-essential

COPY . .

RUN npm install

RUN npm test

RUN BACKGROUND npm start

EXPOSE WEBSITE http://localhost:3000

We charge a flat $42/mo/developer on our paid plan. Because it's a flat fee and not usage based, we're incentivized to make things as fast as possible: Our current margins come from a custom-built hibernating hypervisor that lets us avoid running "npm install" thousands of times per day.

We’ve upgraded the free tier to 5GB of memory for new installations this week. It’s perfect for personal projects or small MVPs where you’d like a powerful demo server that will build on every push and automatically hibernate when it’s not being used.

The easiest way to try out LayerCI is to follow our interactive tutorial: https://layerci.com/ or look at the docs: https://layerci.com/docs/

We would love to hear your thoughts about CI/CD, staging, and what we’ve built!




Great approach. I've been running ops teams for a long time, and every team has eventually ended up spending some time optimising its CI tooling to add some degree of caching - so you're definitely on to something.

Before I clicked through to your pricing I was really hoping for an affordable plan to run this on BYO resources, but only your enterprise plan seems to cover that. I always get an iffy feeling when I have to build my software on external resources I have zero control over. Your downtime will prevent teams from shipping their code - but I guess that thought is part of your upsell to the enterprise plan.

Anyway, it's one of those ideas I wish I'd had years ago, so congrats to you.


BYO resources for CI sounds good in theory, but considering costs it's almost always a worse proposition:

- 10 hours / month of maintenance + 200 hours of setup @ $60/hr = $13k in the first year, not including any infrastructure costs.

If you just redirected that money to a hosted offering you'd get significantly more powerful servers without needing to allocate engineering / management resources - it also makes it significantly easier to push updates & monitor uptime on the host's end.

Even traditionally hosted tools like Atlassian's suite are moving to cloud for the same reason: https://www.atlassian.com/migration/journey-to-cloud


> - 10 hours / month of maintenance + 200 hours of setup @ $60/hr = $13k in the first year, not including any infrastructure costs.

Impressive numbers! I'd love to do this for you for a tenth of the price you're quoting and still make something like a 400% margin. Setting up a GitLab runner is just installing the package with apt and running the register command once; updates then happen automatically when you upgrade the system.
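
For reference, the whole setup is roughly this (a sketch - the exact repo script and register flags depend on your distro and GitLab version):

  # add GitLab's package repo and install the runner
  curl -L "https://packages.gitlab.com/install/repositories/runner/gitlab-runner/script.deb.sh" | sudo bash
  sudo apt-get install gitlab-runner
  # one-time registration against your GitLab instance
  sudo gitlab-runner register \
    --url https://gitlab.com/ \
    --registration-token <PROJECT_TOKEN> \
    --executor docker \
    --docker-image alpine:latest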

As for infrastructure, you will find dedicated servers that perform extremely well for the equivalent of $49/mo/developer.


The type of CI you're describing would be in the "simple" category in the OP - if your goal is just "run a single docker build and check if it fails", there are lots of free hosted tools you can set up (GitHub Actions, GitLab CI, etc.)

There is in fact a burgeoning industry of devops consultants that set up CI pipelines for companies, I'd encourage you to consider it if you like the work!


Actually we do, looking for more projects!


Will let you know if we know anyone that's looking for a consultant - do you have a website I could refer folks to?


While this is likely true in many cases, I'm in a very sensitive industry (banking) and we tend to self-host things not for cost reasons but for security reasons. We spend a lot of time going through pen tests, getting SOC2 compliance, etc.

Handing off something this critical can make the audit even more painful in many cases, so just a thought to consider: cost is sometimes not the only factor.

Looks like a really cool product though!


We're actually in the process of getting SOC2 and pen tests ourselves - another benefit of a hosted offering is it can (eventually) integrate into your compliance system (e.g., vanta)

A lot of our customers are in fintech (payroll, banking, etc) so we've spent a lot of effort on our security model: https://layerci.com/security


Not in the banking industry, but the startup I'm working with is in the middle of getting SOC2 compliance.

Any pointers I should consider on the engineering side?


Congrats on the launch :)

The pricing incentives sound smart.

Do you see the ability to replicate more complex architecture as a differentiator? For example, one of your homepage quotes mentions "running a Kubernetes stack inside of a Layerfile". Can you elaborate more on how Layerfiles enable this?


With Kubernetes, a typical workflow might look like this:

1. Start kubernetes cluster

2. Build docker images

3. Deploy docker images (helm, kubectl, argocd, ...)

4(a) Run unit tests (kubectl exec -l k8s-app=web rake test)

4(b) Run e2e tests (kubectl run cypress)

4(c) Create ephemeral environment (EXPOSE WEBSITE localhost:8000)

Because we take memory snapshots after each step, you effectively get a fresh, fully provisioned kubernetes cluster immediately after pushing instead of re-running all the steps every time (you skip straight to step 3), and then run 4{a,b,c} in parallel (they each "fork" the VM and get a separate copy of all of the resources).
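
To make that concrete, a hypothetical Layerfile for steps 1-3 might look something like this (the k3s install, deployment name, and manifests are placeholders for whatever your stack uses; the parallel 4{a,b,c} steps would typically live in separate Layerfiles further down the tree):

  FROM vm/ubuntu:18.04
  # 1. start a lightweight kubernetes cluster - the memory snapshot taken after
  # this step is what later pushes resume from (the "skip to step 3" above)
  RUN curl -sfL https://get.k3s.io | sh -
  # 2 & 3. build images / deploy manifests on every push (image build elided here)
  COPY . .
  RUN REPEATABLE kubectl apply -f k8s/
  RUN kubectl rollout status deployment/web --timeout=300s
  # 4(c) ephemeral environment
  EXPOSE WEBSITE http://localhost:8000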

Here are a few links that go into more detail:

- https://layerci.com/docs/tuning-performance/run-repeatable

- https://layerci.com/blog/ci-at-layerci/


Ok, so the more complex the infra, the more advantage there is in caching.


The idea is it starts as simple as a docker build (e.g., as easy or easier than GitHub actions, orbs, etc) but scales as your code/infra does.


Hi @colinchartier!

From what I see your product looks great for running a single app with a ready-to-use staging environment.

In practice, a staging environment is often not limited to one app: it's at least one frontend inter-connected with at least one backend (and often many frontends connected to many micro-services).

Does your product support this kind of architecture?


Yes, it's quite simple if you use something like docker compose.
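
A minimal sketch of that, assuming a standard docker-compose.yml with the frontend served on port 3000:

  FROM vm/ubuntu:18.04
  RUN apt-get update && apt-get install -y docker.io docker-compose
  COPY . .
  # bring up frontend + backend + database together; REPEATABLE reuses
  # state from the previous run instead of rebuilding from scratch
  RUN REPEATABLE docker-compose up -d --build
  EXPOSE WEBSITE http://localhost:3000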

If your frontend/backend are in separate repositories, I'd reference this doc page: https://layerci.com/docs/advanced-workflows/routing


I see developer tools on HN. I upvote!

Congrats on launching! I know this is a good idea because we actually invested time in building something similar internally.


Thanks, we're big fans of teleport - funny how small the developer tools world actually is.


This sounds really exciting. I love your live demo, really slick introduction and tutorial - and answered some questions I had about E2E! I've got a couple of other questions though.

> 12 full stack preview environments per commit with up to 16GB of memory and 6 CPUs

Quote from your pricing. If we had a monorepo with over 12 small test/integration jobs, but not 12 full preview environments, is this usable? Are they one and the same, or can we have eg. a suite of unit tests that don't count toward this full preview limit? Do some teams keep some unit tests off Layer and just use you for the more interesting pieces?

Second, do you have any documentation about which databases you support? Concretely can you restore a MongoDB snapshot super-fast?

Third, do you have any story around secrets if we want the staging server to hold some secret API keys? Currently we can do this with AWS (own account) CI machines in Gitlab deploying to ECS with AWS secrets - they stay end-to-end encrypted and nobody sees them. Is there any similar way we can get secrets onto a staging server without you having access to them? I suspect this would be a deal breaker for the staging use case for my team.

Similar to above, our database snapshots are stored in non-public S3 buckets, how would that work? Again currently it's a case of giving the CI AWS role permission to access them, not sure what an equivalent would look like.

One more, is it possible to access the built docker images? We deploy Docker images to ECS, and currently they're the exact same ones built and tested in CI which is a nice reassurance. Do your customers have an out-of-band process for building + deploying to production outside of Layer?

Aside, I think this may be a typo on your pricing page? "We'll never increase your the terms of your bill once you start your subscription."


I am curious about the third point, e2e encryption of the secrets. Is the problem that the hosting provider would see plaintext secrets, or that someone with access to the AWS account could see them? You could also, theoretically, input the encrypted secrets into any service and then decrypt them in your application. Just curious what kind of considerations you have when choosing a hosting provider.


It's basically the same concern as having them in source code: we want to be able to control/restrict access to them. In AWS we can do that because they're encrypted with a KMS key, and unless a user/role has permission to decrypt with that key it can't decrypt the secret. This does assume that AWS aren't lying about encrypting things, and that their employees can't access our KMS keys.

We currently use Gitlab for our CI/CD pipelines, but using our own runners in our own AWS account. So if we want to deploy a staging environment from there, it's actually deploying from an AWS role that we control, we're not leaking any secrets to Gitlab or anywhere else.

I'm just wondering how people get around this in Layer/any hosted CI/CD setup where you can't have your own runners inside your AWS account. Especially because they don't replace the production deploy, so ultimately those secrets are staying in AWS - perhaps in addition to wherever they need to go with Layer.

FWIW I did find something about secrets in the docs: https://layerci.com/docs/layerfile-reference/secret-env#secr... - but I'm not sure how you get secrets in from that.


12 is the max parallelization, so you could define more but some of them might queue.

All databases that run locally (mongo included) are supported, here's the doc page for that: https://layerci.com/docs/advanced-workflows/layerfiles-can-s...

We have a secrets dashboard, they are stored encrypted in our database (though many users have something like hashicorp vault with a central repository, with only the access key stored in our database). The secrets are only viewable by admins of your organization.

Most of our users build their docker images within LayerCI, then push to ECS by adding a write-only access key as a secret. Deployment is often done with something like Terraform or ArgoCD.
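
As a rough sketch of that flow (registry URL, region, and image name here are just placeholders, and it assumes docker + the aws CLI are installed earlier in the Layerfile):

  # injected from the secrets dashboard, never committed to the repo
  SECRET ENV AWS_ACCESS_KEY_ID
  SECRET ENV AWS_SECRET_ACCESS_KEY
  RUN docker build -t myapp:staging .
  # log in with the write-only key and push to your private registry
  RUN aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
  RUN docker tag myapp:staging 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:staging
  RUN docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:staging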

Thanks for mentioning the typo!


This sounds very tempting!

I run Rust CI for Windows/Linux (on Azure Pipelines) and it needs at least 1h to complete, with several tricks already applied.

When something breaks, iterating on a fix is very slow (you wait 1h, then find the problem, then push, wait another hour, realize you forgot something...)

So I have a local deployment so I can fix things faster, but that kind of defeats the whole point.

If yours can help deploy Rust faster, chime in at https://www.reddit.com/r/rust/ because it's a major pain point!

P.S.: I assume this doesn't include OSX/Windows?


We definitely support rust - you could even use "RUN REPEATABLE" to reuse the state from prior builds, the same way you'd build locally.
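
A bare-bones Rust Layerfile might look something like this (a sketch; the rustup install and test command are whatever you already run locally):

  FROM vm/ubuntu:18.04
  RUN apt-get update && apt-get install -y build-essential curl
  RUN curl https://sh.rustup.rs -sSf | sh -s -- -y
  COPY . .
  # REPEATABLE reuses state from prior runs, so the target/ directory and
  # registry cache survive between pushes and builds stay incremental
  RUN REPEATABLE . $HOME/.cargo/env && cargo test --release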

Happy to chat more and send you some swag if you try it out and write a blog post about your experience :) colin@layerci.com

You can't 'FROM vm/osx' or 'FROM vm/windows' yet, it's relatively difficult to navigate licensing as a startup unfortunately.


Yeah, let's try. If I can cut deployment times for the web sites it's still a win. I can wait for the Windows builds; those are for utilities...

> You can't 'FROM vm/osx' or 'FROM vm/windows' yet, it's relatively difficult to navigate licensing as a startup unfortunately.

Maybe allow connecting to Vultr/DigitalOcean? They have cloud instances with Windows.


I'm looking at your docs, and the RUN REPEATABLE command seems really powerful. But if the state is broken after a run, like if you have some pods stuck at Terminating, how would you recover things?

Another question I have is how would you handle state that sometimes needs to update and sometimes doesn't? For example, it would be ideal to have a staging database that can keep having migrations and data added to it when new features are added, but we only want to checkpoint the changes to it from testing when the PR is actually merged.


If the state is broken, you can "rerun without snapshots" via the dropdown in the top right - and since future runs load the latest snapshot, they'd use a "clean" one.

For databases - usually users have a named S3 bucket and use a secret to authenticate. Since we take memory snapshots, the top of your Layerfile can be "start the database and populate it from this specific anonymized dump", and then you can edit the file in S3 and re-run without snapshots if you'd like to reload it.
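
Concretely, the top of such a Layerfile might look roughly like this (bucket, dump name, and MongoDB are placeholders, borrowed from the question above):

  SECRET ENV AWS_ACCESS_KEY_ID
  SECRET ENV AWS_SECRET_ACCESS_KEY
  # the ubuntu mongodb package starts mongod automatically
  RUN apt-get update && apt-get install -y awscli mongodb
  # pull the anonymized dump from your private bucket and load it; the memory
  # snapshot taken afterwards means these steps only re-run when they change
  RUN aws s3 cp s3://my-staging-dumps/staging.archive .
  RUN mongorestore --archive=staging.archive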

Here's the doc page for that: https://layerci.com/docs/advanced-workflows/layerfiles-can-s...


> - Slow (wait 10 minutes to run the same repetitive setup steps like "npm install")

That should only happen when you change dependencies if your Dockerfile is set up efficiently, not every time you change a line of runtime code ...

Maybe that:

COPY . .

RUN npm install

Is not what you should do for an efficient docker build phase - did you try this instead?

COPY package.json ./

RUN npm install

COPY . .

Multistage builds will help if you build in two languages in the same Dockerfile.

Anyway, it's still slow when you add/remove a package because it redownloads everything; my best trick is to share the ~/.npm cache directory, but I manage this with buildah instead of docker build.
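
For plain docker build, BuildKit's cache mounts get most of the way there too (a sketch; needs DOCKER_BUILDKIT=1 and a recent dockerfile syntax):

  # syntax=docker/dockerfile:1
  FROM node:12
  WORKDIR /app
  COPY package.json package-lock.json ./
  # the npm cache persists across builds, so adding/removing one package
  # doesn't re-download everything
  RUN --mount=type=cache,target=/root/.npm npm ci
  COPY . .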


One of the benefits of Layerfiles vs Dockerfiles is that those two examples will do the same thing - we monitor which files are read so you don't have to micromanage which files get copied before each step.

But I suppose "npm install" is a simplistic example, "npm run setup_our_ci_environment" would be a more likely step for a larger CI pipeline


Impressive, and scary. How does it work? Some kind of ptrace wrapping of each command?

If this genuinely works reliably, this alone could be a good product. As someone said below, you should promote this feature more specifically.


OS-level VFS wrapper (there are a few places you can put probes here; another approach might be eBPF)

We used to sell the features, but that actually didn't work super well - maybe we can add a "technical details" page for the nerds ;)
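
For a rough feel of the data involved, the ptrace-style approach mentioned above would look something like this (purely illustrative - not our actual probe):

  # log every file opened while a build step runs
  strace -f -e trace=openat -o /tmp/reads.log npm install
  # the set of paths read is (roughly) the cache key for that step
  awk -F'"' '/openat/ {print $2}' /tmp/reads.log | sort -u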


Yes, please.

If that impressive hypervisor tech can handle production, I see endless possibilities, including mitigating downtimes by teleporting workloads to different cloud vendors and different regions. Easier said than done, of course!


Seconding the request for more details.

Hopefully, you'll put up a quick blog post or even a gist and share the URL with us :)


Nice work! I played around with the demo and launched an SSH session for the last layer (hosting the website). I was able to trigger a VM reboot from the shell, and now the website is not up anymore. Curious whether there's any monitoring or alerting I can see when something like this happens? And how do I go about debugging it/restoring functionality?


Hey Alok,

RUN BACKGROUND basically does "./thecommand &", so it will stop after a restart; the onboarding example is ephemeral for simplicity.

If you want things to persist across restarts you'd have to add a systemd script or docker container the same way you'd run it in production.


Makes sense. Thanks!


How is this different from the likes of https://vercel.com/ ?

It doesn't need to be – but understanding the difference / similarities with other tools that have a similar sounding proposition helps with user acquisition / defining your niche.


Many production-hosting solutions offer similar functionality, but the catch is that they only really work if you're already using that system.

If you're already deploying on Vercel, then you likely just want to use Vercel's solution. The same goes for Netlify Preview Deploys, Heroku Review Apps, Render Pull Request Previews, etc.

If you are not using one of these PaaS's, trying to tap into one for just this functionality tends to end up rather convoluted, and somewhat defeats the purpose.


We're halfway between CircleCI and Vercel - not for production hosting, but easily allowing for backends / databases to run (including a separate backend for every push)


Why wouldn't I just deploy with every commit to Vercel? Either using GitHub actions or GitLab CI/CD?


It takes significantly longer - we have advanced snapshotting logic to skip steps like "start the database" that you'd otherwise have to run every time you pushed.

We also auto-hibernate the instances, so you don't have to pay for a bunch of separate databases running in parallel.

Don't just listen to me though, try the free tier for yourself! A lot of our users combine LayerCI for developer tooling and vercel/netlify/GCP/AWS for deployment; they're quite complementary.


How does dependency caching configuration work? I get wanting to make things as fast as possible, but I struggle a bunch with Python dep caching in CircleCI... is there a way to ensure a 100% rebuild? Also interested to see how you plan to support open source.


There's a "CACHE" directive if you want traditional dep caching, but if you don't edit requirements.txt, a Layerfile will just entirely skip the pip install

You can "SNAPSHOT disabled" if you'd like certain steps to always run


This is super cool. I would like to know more about VM hibernation!


Hey Kurt, we're big fans of fly.io!

Happy to chat about tech sometime, I think you folks are using firecracker for a similar use-case? Drop me a line at colin@layerci.com


Hi Colin, congrats on the (re-)launch! Fellow SUS 2019 graduate here.

I know layerci's tech precedes snapshot support in Firecracker [0], but are snapshots similar to layerci's magic, or is there still a significant way for them to go before they're on par with layerci's tech? I ask because I remember you mention layerci uses techniques (from how Linux hibernates PCs) which are good enough for tests and staging but not production; [1] whereas Firecracker obviously can't rely on such tech, so there must be key differences? Thanks.

[0] https://github.com/firecracker-microvm/firecracker/blob/mast...

[1] https://news.ycombinator.com/item?id=23036776


Firecracker is definitely heading in the same direction we did with their "diff snapshots", though they don't let you create a COW chain yet, so there's still quite a bit of development to be done on that front.


Firecracker snapshots to device mapper could be pretty cool.

We don't do much with snapshot/restore but we do a _lot_ with device mapper. It's pretty badass.


Colin's cofounder here :) Excited to hear from you all


"LayerCI is an all-in-one devops platform, but we provide three main value propositions to our users:"

  - Run e2e tests on every commit.
  - Collaborate easier with per-PR staging environments.
  - Set up CI/CD to build and deploy to production
Anyone can do this without your company. If that's the value you bring, then I don't need your company. What value do you bring that I can't get somewhere else?

Also, red flag: mis-use of the term DevOps for marketing purposes.

"There are a few ways of thinking about these files named ‘Layerfile’:"

  - Auto-discovered Dockerfiles that build entire virtual machines instead of containers, just as quickly.
  - Define a tree of virtual machines. Each subsequent layer can inherit all the running processes in its parents.
  - Trigger specific actions like build, push, test, and deploy in parallel every time you push new code.
It's not explained well what benefit there is to using VMs rather than containers. Later on it briefly mentions possibly better caching than Docker containers. If it's actually better than Docker, you are severely burying the lede. "Better than Docker" should be on the front page. It's also not explaining whether I need to throw away all the time and money I might have already invested in containers, or what this is and isn't compatible with.

A lot of this solution is really built around optimizing one specific problem, which is caching during re-running whole CI pipelines. This limits the viability of your business. As soon as your customers figure out a new way to solve this problem, they don't need your custom platform anymore. I would consider LayerCI more of a specific feature of a much larger offering.

(In particular, the 'correct' way to solve this problem is to stand up an environment for your PR and structure your CI so you only re-run the parts that affect the particular change you want to make, rather than re-running the entire pipeline. It involves putting more thought into running your pipeline but often means the only tasks that get executed are copying a file and restarting your app)

P.S. You're installing the yarn GPG key and then not doing anything with it. Your node setup script isn't installing the yarn repo or yarn itself, so this is a throwaway command.


> In particular, the 'correct' way to solve this problem is to stand up an environment for your PR and structure your CI so you only re-run the parts that affect the particular change you want to make, rather than re-running the entire pipeline. It involves putting more thought into running your pipeline but often means the only tasks that get executed are copying a file and restarting your app

This is actually exactly what our platform helps with - instead of micromanaging this (and updating the configurations) you can rely on a friendly SaaS to do it for you.

I wouldn't call us "better" than Docker, we're more like "Take Docker and make it work better for CI/CD and staging servers"


This comment reminds me of the comments people made about Dropbox originally: https://news.ycombinator.com/item?id=9224


Can we get SSO for not a bajillion dollars?

Seems like this functionality is really inflated across the entire SaaS space.

Is SSO really that hard? that expensive?


Most developers log in to LayerCI via GitHub or GitLab - so if you have a GitHub plan with SSO, you'll get SSO with Layer as well.

We even support self-hosted GitLab, so if you use a custom SSO provider there, developers could use that oauth flow for authentication + authorization.


This looks great. I’m interested in your approach to databases, and specifically the data in staging environments.

I’ve found setting up the data in databases for staging environments can take as much time as CI, and often works in quite a different way. Shared databases can work, but then may limit staging changes that include migrations.


We'll snapshot the state of the database after you set it up once, and then re-load that snapshot every following time you need the database to exist.


As an early user -- the speedups are very real! Feels great to just skip build step after build step, due to Docker-like layers.


Thanks, Malthe - it's been great working with your team.


How do you know which steps to skip or rerun on each run? In your react example, how do you know when to reinstall yarn to the latest version and when to skip it? https://layerci.com/docs/examples/react


We monitor which files are read by which steps and map that back to the changes you've made automatically!


If you'd like to chat with me (CEO @ LayerCI) personally, I'm at colin@layerci.com


What's the difference between this and something like hashicorp waypoint?



