
Gravitational (YC S15): Software On-Prem or in Cloud from Single Code Base - twakefield
http://techcrunch.com/2016/02/14/gravitational-helps-deliver-software-on-prem-and-in-cloud-from-single-code-base/
======
twakefield
For some additional context, a few reasons why we started working on this:

1) We think developers have had to take on too much responsibility for
operating the software they build (DevOps). We hope to create more clearly
defined roles for developers and operators through better tooling for
operators.

2) A lot of the engineering challenges with SaaS come from multi-tenancy,
which is one of the reasons Dev and Ops have merged into one role.

3) A big reason to use cloud software is that you (as a user) don't have to
operate it, but that has been conflated with the requirement that you give up
control of your data.

4) Adoption of containers (Docker[0], Rkt[1]) and container orchestration
(Mesos[2], Kubernetes[3]) technology has created the opportunity to rethink
the traditional model for delivering and operating software.

[0]: [https://www.docker.com/](https://www.docker.com/)

[1]: [https://github.com/coreos/rkt](https://github.com/coreos/rkt)

[2]: [http://mesos.apache.org/](http://mesos.apache.org/)

[3]: [http://kubernetes.io/](http://kubernetes.io/)

Edit: added some examples of container and orchestration tech with links

~~~
icebraining
I find (2) curious, as our pains come from trying to build a single-tenant (as
in 1 process per client, not 1 machine per client) hosted SaaS offering.
Everywhere I look, there's plenty of tooling written for multi-tenant,
distributed webapps, but barely anything if you want to automate the creation
and management of isolated "instances" (subdomain + process + database).

We've built some manual deployment scripts for creating new instances, but
having to build everything ourselves has been time-prohibitive. I really wish
I could find some ready-made tools for our use case.

~~~
old-gregg
Hmm... interesting. I wonder how you define a "tenant" and how tenant affinity
is applied to a connection (routing). I doubt you're talking about
process-per-web-request; in that case good ole CGI would do.

Shoot me an email at ev@kontsevoy.com; I'll be glad to chat more about this.

~~~
icebraining
Not process-per-web-request, but process-per-account. We route using DNS, by
giving each account its own subdomain. Essentially, think something like
WPEngine, except for different software.

~~~
old-gregg
1st idea that comes to mind:

    
    
        1. Package your per-tenant application into a Kubernetes 
           pod (think of it as a lightweight VM, a collection of 
           Docker containers running on a same host).
        2. Write a request router: someone needs to look up tenants 
           based on the virtual hostname in a request.
        3. Use k8s API to find a pod with label == tenant. If 
           doesn't exist -> create one.
        4. Forward the request there.
    

I'd imagine you'll have to "garbage collect" and gracefully shutdown pods when
they sit idle for a while. I'd probably pack that logic into a pod itself.
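
A minimal sketch of steps 1-3, assuming the official Python `kubernetes`
client. The `tenant` label, the helper names, and the `myapp:latest` image
are placeholders for illustration, not anything Gravitational ships:

```python
def tenant_from_host(host):
    """Map a virtual hostname like 'acme.example.com' to a tenant
    name ('acme') for the label lookup."""
    return host.split(".")[0].lower()


def pod_manifest(tenant, image="myapp:latest"):
    """Build a per-tenant pod spec, labeled so it can be found
    again via a label selector."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": "app-" + tenant,
            "labels": {"tenant": tenant},
        },
        "spec": {
            "containers": [{"name": "app", "image": image}],
        },
    }


def find_or_create_pod(v1, tenant, namespace="default"):
    """Look up the tenant's pod by label; create it if missing.

    `v1` is a kubernetes.client.CoreV1Api instance, e.g.:
        from kubernetes import client, config
        config.load_kube_config()
        v1 = client.CoreV1Api()
    """
    pods = v1.list_namespaced_pod(namespace,
                                  label_selector="tenant=" + tenant)
    if pods.items:
        return pods.items[0]
    return v1.create_namespaced_pod(namespace, pod_manifest(tenant))
```

The router (step 2) would then call `find_or_create_pod(v1,
tenant_from_host(request_host))` and proxy the request to the pod's IP.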

~~~
icebraining
I have to thank you; I had seen the Pods concept, but didn't look into it
deeply, since it sounded like Pods only managed the application itself, while
I also need to manage data (primarily client databases) within the pods.

But now I see that Pods support some really interesting data volume management
systems that could support moving it around if we need to migrate Pods between
hosts.

Thanks!

------
nickmerwin
I gave a talk on delivering Rails apps behind corporate firewalls at
RailsConf 2015[0], but have since migrated Coveralls Enterprise[1] from our
home-baked, pre-packaged VM to Replicated.com[2].

I'd recommend checking them out too since they're Docker based and have a
significant roster: Travis, NPM, Code Climate, etc.

[0]:
[https://www.youtube.com/watch?v=V6e_A9VzPy8](https://www.youtube.com/watch?v=V6e_A9VzPy8)

[1]: [https://enterprise.coveralls.io](https://enterprise.coveralls.io)

[2]: [https://www.replicated.com](https://www.replicated.com)

------
avifreedman
Hi, looks interesting...

Some questions:

All based on what we do at Kentik, where we run a large distributed column
store with high-speed data ingest layers feeding it, and a relatively simpler
core cluster of portal/high-level metadata nodes. Ingest nodes are 20-core
low-disk, and data nodes are 36-core with 24x2TB disks w/ ZFS, which we rely
on for compression.

We use Docker, but on top of known equipment types with deterministic latency
and performance. Machines are netbooted and much of the system is controlled
by puppet push. Typical clusters range from 10 to low hundreds of nodes. We do
on-prem, but typically on top of bare metal and using our control stack.

So with that as background...

For an application that requires high speed native (in particular disk)
performance and persistent disk to run, how is Gravitational requiring that at
deployment time?

Is there an ability to add metrics to dashboards for on-prem 'operators' to
access? Send those through logged mail gateways for SaaS companies to watch
on-prem status remotely, at least at an aggregate level (dashboards plus
perhaps some log excerpts)? Have part-time (customer-controlled) logged ssh
gateways for supervised and/or logged remote access by SaaSco engineers?

How do you do versioning, deployment, and distributed rollout to running
systems with 24x7x100% uptime expectation? Is there the ability to support
partial upgrade of only some nodes of specific roles followed by automated
and/or human checks? How about rollback?

Thanks, and good luck with the company!

~~~
old-gregg
Hey Avi,

I'll try to be concise:

    
    
      - There is no performance penalty.
      - We can do hardware profiling prior to installation/upgrade to 
        guarantee capabilities as defined in the system requirements 
        for an application.
      - Updates are applied in a "rolling" fashion with rollbacks in 
        case of a failure.
      - No, I am afraid we cannot do 24x7x100%; the application needs 
        to be designed to deliver this.
    

Happy to dive into the details. Your profile doesn't have an email address
listed, ping me at ev@gravitational.com!
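
For reference, in stock Kubernetes the "rolling updates with rollback"
behavior described above maps roughly onto a Deployment's update strategy.
The names and numbers below are illustrative, not Gravitational's actual
configuration:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # upgrade one pod at a time; the rest keep serving
      maxSurge: 1        # allow one extra pod during the update
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: myapp:1.2.0
```

A failed update can then be reverted with
`kubectl rollout undo deployment/myapp`.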

~~~
avifreedman
avi @ kentik is my email but I'll ping you.

So HW profiles can include IOPS and throughput testing, disk space, etc?

The 24x7x100% is more of a concern re: not restarting all VMs at once, or at
least being able to specify subsets and rollout orders.

Will take to email to discuss further.

thanks

------
old-gregg
I'm the CEO of Gravitational, AMA about transitioning from single-cloud SaaS
model to launching an Enterprise on-prem version, or going multi-
region/international.

~~~
koolba
How much does it cost? (I don't see any pricing information on your website)

Is the business model taking over the management of on site SaaS applications?

If so, what's the difference between this and outsourcing maintenance (i.e.
DevOps) of your application to a third party?

~~~
twakefield
We don't have standard pricing yet. Right now, it is priced on a case-by-case
basis after discussing requirements with the customer.

Our objective is to enable operators (whether the operators are at the
software vendor or at third parties) to maintain applications across many
environments, effectively.

Once the application is ported, operations become much more efficient, which
increases the odds of success if a third party is operating it.

------
pm
Curious, was just thinking about this the other day. What does Gravitational
offer in the way of special sauce that can't be achieved just by working with
Docker/Kubernetes directly? What cost to the developer? How are you taking on
customers at the moment?

Just asking because I was in the planning stages for something and this was
the first step to what I wanted to achieve. I don't know if it's worth
learning Docker/Kubernetes or signing up for something like this before I've
coded anything.

~~~
old-gregg
Sure,

Docker and Kubernetes are fine tools for deploying and orchestrating an
application into an environment you have _full control_ over (typical SaaS
scenario). Our "distribution" of Kubernetes is optimized for running in
"hostile" unknown environments with varied networks, storage, etc. It comes in
HA-configuration with self-updating and self-monitoring turned on.

Secondly, once you have thousands of instances deployed across the globe, we
give you unified and secure remote access to do ops in a _scalable_ way. The
true cost of going on-prem is not deployment (it's not _that_ hard, you're
right: Kubernetes does most of the heavy lifting).

_The true cost of going multi-region/on-prem is operations._ And, of course,
the security complications that come with it. Our OpsCenter tackles this
problem: it gives you access to thousands of absolutely identical instances of
your app, while staying compliant with on-site security policies.

Give us a call, we'll have a solution engineer spend some time with your team
and we'll do all of the integration work.

The end result: you have the same codebase which runs in your own 5 AWS
regions, plus 5000 on-prem locations, and you don't need to:

    
    
      a) fork your codebase
      b) hire a ton of ops engineers

~~~
pm
I am the team, I'm afraid, and only at the planning stage, so there is no code
to speak of yet. I'm in Australia, so it might be better to send an e-mail
and see what can be done. Thanks for the info.

------
bjoerns
This is great. We went through a lot of pain keeping our cloud and on-premise
offerings [0] in sync. Eventually we took our infrastructure back to the
drawing board and settled on a Docker-based solution with some custom
orchestration on top. Cloud is Docker Swarm and on-prem is Docker on a VM. A
service like Gravitational would have saved us a lot of time.

[0]: [http://www.pathio.com](http://www.pathio.com)

