Hacker News
Show HN: Relay – Event-driven DevOps automation (relay.sh)
100 points by product1087 18 days ago | 32 comments

Hi everyone! PM for Relay here.

Relay is an event-driven DevOps platform. It listens to events from 3rd party services like AWS, Datadog, PagerDuty, Jira and more to trigger simpler, smarter workflows that automate tedious tasks. A lot of existing solutions either require a lot of upfront DIY work (AWS Lambda or running your own script) or they weren’t built for DevOps teams (Zapier, IFTTT).

Relay provides out-of-the-box workflows for common use cases like cloud cost optimization, security, incident response and more. If those don't work for you, you can write your own workflow using integrations with dozens of cloud provider services, open source tools and APIs that can be composed together in a YAML-based workflow (yes, I know, more YAML).
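For a flavor, here's a simplified sketch of what a workflow can look like (the trigger, step names, and image here are illustrative, not lifted verbatim from our docs):

```yaml
# A minimal Relay-style workflow sketch: one webhook trigger, one step.
# Names and the image are made up for illustration.
apiVersion: v1
triggers:
  - name: instance-terminated
    source:
      type: webhook
steps:
  - name: notify-slack
    image: relaysh/slack-step-message-send
    spec:
      channel: '#ops'
      message: An instance was terminated.
```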

Some features we think are interesting:

- Visual execution graph shows exactly what's happening in your workflow run.

- Connections make it simple to authenticate to other services.

- Audit log shows who initiated every workflow run, whether manual or by an external service.

- Human in the loop approvals give you full confidence in your automation.

- Growing community of 30+ open source integrations with AWS, Azure, GCP, Datadog, PagerDuty, VictorOps, Jira, Terraform, Helm, and more.

We also just recently put out a blog post about some pretty novel uses of Knative and Ambassador to create user-generated triggers: https://blog.getambassador.io/user-defined-webhooks-in-puppe...

Looks great! "Zapier for DevOps" would convey it a lot faster. It didn't click for me until I got to that part. I'm excited to try it out!

A couple things:

1) Please, please support some other config file other than just YAML. Even JSON would be an improvement, since you wouldn't have to change your config reader library. YAML has been a miserable nightmare to use in other tools like Open API Spec. I'm not alone in my YAML hate, either[1][2][3][4].

Implicit typing and semantic whitespace in a data file is insane. At least in Python, you're likely to get an error if your whitespace is off. In a config file, you might just end up with dangerous or confusing behavior.
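For instance, YAML 1.1's implicit typing gives you the classic "Norway problem" (a made-up fragment; exact behavior varies by parser):

```yaml
# All of these look like strings but parse as something else in YAML 1.1:
country: no     # boolean false, not the string "no"
version: 1.10   # the float 1.1, not the string "1.10"
time: 12:30     # sexagesimal integer 750 in some parsers
```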

2) I'm surprised you chose the name Relay, which Facebook already uses for its GraphQL client. I don't think Facebook is going to care that much, but it's going to be really hard to Google this product.

1. https://dev.to/jessekphillips/stop-using-yaml-3kec

2. https://hitchdev.com/strictyaml/why/implicit-typing-removed/

3. https://devopsdays.org/events/2019-stockholm/program/philipp...

4. https://www.arp242.net/yaml-config.html

> 1) Please, please support some other config file other than just YAML.

Although we don't directly advertise it, you can in fact write a workflow in JSON and there's an underlying JSON interchange format that is authoritative in our system. We expect that any other languages we support would "compile" to this JSON format. What other languages are interesting to you? We've looked at CUE[0], HCL, and of course Puppet itself.

[0]: https://cuelang.org/

Also, Relay is built by Puppet, so not a separate company :)

I think the YAML hate is a bit exaggerated. It's not perfect but I would rather write out YAML than JSON or XML.

Besides, it seems like this company (Relay) has gone to the effort of making a VS Code language extension, which enables autocomplete and syntax checking in the YAML files. Which is cool.

FWIW we didn't make a separate extension, it uses the vscode-yaml extension from Red Hat, you just configure it to associate certain filetypes with our json-schema and custom tags. It is awesome for sure.
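For anyone curious, the relevant settings look roughly like this (VS Code settings.json allows comments; the schema URL and tag names below are placeholders, not our real ones):

```json
{
  // vscode-yaml (Red Hat) settings; schema URL and tags are placeholders
  "yaml.schemas": {
    "https://example.com/relay-workflow.schema.json": "**/workflow.yaml"
  },
  "yaml.customTags": [
    "!Connection mapping",
    "!Parameter scalar",
    "!Secret scalar"
  ]
}
```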


> Even JSON would be an improvement

JSON is valid YAML, so that should "just work" already. Have you tried it?
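A quick check with PyYAML (needs `pip install pyyaml`; JSON is a subset of YAML 1.2, and common parsers accept it modulo a few edge cases like duplicate keys):

```python
# A YAML parser accepts JSON syntax unchanged, so a JSON workflow
# file should load with no changes to the config reader.
import yaml

doc = '{"name": "ssh", "spec": {"port": 2222}}'  # plain JSON
parsed = yaml.safe_load(doc)
print(parsed)  # {'name': 'ssh', 'spec': {'port': 2222}}
```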

I just meant that they could add support for the JSON file extension, since their YAML lib already reads it.

(When) Is a self-hosted version coming? Does Relay use Lyra in some way, or is it a brand new engine?

The usual DevOps automation web systems like Jenkins, GoCD, Rundeck are all disappointing in one way or a dozen. Java, to start with. I'm desperate to see a modern, full-featured, self-hosted alternative.

On-premises support is in progress. No release date yet. That said, would you need it to be fully self-hosted or would it be viable to have an on-premises agent?

Relay uses Tekton under the hood to run workflows and Knative to process triggers. What kinds of workflows are you thinking about running?

Fully self-hosted is a requirement in my case. Letting in an agent that's essentially externally controlled would be a hard sell in a corporate environment.

In my case, I need very simple workflows that e.g. run Ansible playbooks, or do non-urgent monitoring, bookkeeping, and essentially trigger scripts. I usually need a single step - no serious need for a pipeline/workflow even. Cron job and manual triggering would be enough to cover the base usage (though git hook would be next on the list).

The more "convoluted" parts that I'd want to see are jobs controlling themselves (e.g. a job disabling itself or others on error), jobs as code, a flexible notification system, easy bulk descheduling/disabling of jobs, and easy extension with plugins.

Granted, Rundeck has most of what I need, but the implementation is Java, which also includes its SSH functionality.

What made you go for yaml rather than your own DSL? For a service that's as programmable as relay, it seems nice to have better support for branching control flow for example.

Hey there, I'm an engineer on Relay. Our PM is busy so thought I'd jump in here.

Great question on why we went with YAML instead of our own DSL. The main reason is that YAML has become a bit of a lingua franca for managing configuration in tools similar to Relay. We wanted to keep Relay simple and as familiar as possible, so YAML was our go-to choice.

We have some control flow primitives that you can configure, but for really complicated logic we figured it's best to push that into the steps rather than the configuration (thereby keeping the configuration simple). Here's a link to our (very simple) conditional execution docs: https://relay.sh/docs/using-workflows/conditionals/
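For example, a conditional step looks something like this (simplified from memory; the image name is illustrative, so check the docs for the exact form):

```yaml
parameters:
  env:
    description: deployment environment
steps:
  - name: deploy-production
    image: relaysh/kubectl-step-apply   # illustrative image name
    # Step only runs when the env parameter equals "production"
    when:
      - !Fn.equals [!Parameter env, production]
```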

Where is Relay's data (configurations, step results/logs) stored?

It varies. We have a primary PostgreSQL database for non-sensitive data including some workflow configuration. We store sensitive data in Vault and logs in Google Cloud Storage. A bit more info here: https://relay.sh/docs/how-it-works/#where-and-how-is-my-data...

I'm happy to answer any questions about our storage and security architecture too!

Sounds interesting. Have you any comparisons between this and StackStorm? At first glance, this looks like it competes in a similar market.

Additionally, is this only going to be a SaaS offering, or will we have the option to self-host this as well?

Hey there, as mentioned elsewhere I'm an engineer on Relay. Our PM is busy right now so thought I'd jump in.

> Have you any comparisons between this and StackStorm?

Relay is very similar to StackStorm! One of the biggest differences between Relay and StackStorm, however, is that every step and trigger in Relay is just a docker container. You can basically run any container you want, but containers that are authored specifically for Relay are, obviously, better (integration with the execution environment, our secrets management service, etc.).
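As a sketch of what that means in practice (the `command` key here is illustrative; Relay-authored images additionally integrate with secrets and step metadata):

```yaml
steps:
  - name: run-anything
    # Any public image can serve as a step.
    image: alpine:3
    command: ["sh", "-c", "echo hello from an arbitrary container"]
```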

> is this only going to be a SaaS offering

This is another key difference between Relay and StackStorm: We only offer a SaaS right now. That's not to say that we won't eventually offer a self-hosted version (or some sort of hybrid solution for that matter)--it's just the approach we're going with for now.

> every step and trigger in Relay is just a docker container

Ooh, that makes sense. This is also reminding me a bit of Concourse CI (https://concourse-ci.org) as that's all docker-based too, though last I checked it wasn't as kubernetes-native (which I appreciate)

There's a lineage -- Concourse was an inspiration for Tekton, and Relay uses Tekton as part of its substructure[0].

[0] https://relay.sh/docs/how-it-works/#workflow-execution

So I didn't see anything on your site on my cursory glance at it, but I'd like to know if you have any use cases replacing Jenkins Pipelines and Helm deployments into an OpenShift/Kubernetes environment? I'm writing a lot of Jenkins pipelines now and groovy is killing me!

We do have a Helm integration that might be able to help you out: https://relay.sh/integrations/helm/

The limiting factor right now would be whether your cluster is accessible to the internet, which of course isn't super common. We are working through what our story for connecting to on-prem infrastructure looks like, so if you can provide any additional details on your environment, it would be helpful!

I think we can do the same things using GitHub Actions now. P.S. we have self-hosted runners in our environment, where each step is a docker container too.

We have complex flows where we use Slack, Spinnaker(Webhooks), Terraform, PagerDuty ... and much more.

Is there something better that we can achieve with this?


This is a good question, thank you! The flow of events into our system is somewhat different than GitHub Actions. We're trying to be a consumer of all sorts of events, including, say, data published to AWS SNS or via a Docker Hub webhook[0]. All of that isn't quite in place yet, but we want to act more like an event broker than a CI/CD solution alone.

One concept we're throwing around is supporting CloudEvents[1] and dispatching workflows based on event types. If anyone has experience with CloudEvents we'd love to hear if that would be something useful to you.
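For reference, a CloudEvents 1.0 envelope (JSON binding) looks like this; dispatch would key off the `type` attribute (the values below are made up):

```json
{
  "specversion": "1.0",
  "type": "com.example.instance.terminated",
  "source": "https://example.com/cloud/us-east-1",
  "id": "b25e2717-a470-45a0-8231-985a99aa9416",
  "time": "2020-07-28T16:00:00Z",
  "datacontenttype": "application/json",
  "data": {"instanceId": "i-0123456789abcdef0"}
}
```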

[0]: https://github.com/relay-integrations/relay-dockerhub/blob/m...

[1]: https://cloudevents.io/

I'm writing a book on Knative and have some material in there about CloudEvents. It's on the MEAP program and the relevant chapter was put up yesterday, as it happens: https://livebook.manning.com/book/knative-in-action/chapter-...

Thank you! I'll take a look through this! Would it be okay if I sent you an email in the next week or two for follow-up questions?

Please do.

Sadly, it doesn't seem to support SSH, so I can't use it to replace Fabric scripts.


I wrote an SSH step for you. Here's the source code:


And since it isn't documented yet, here's how you would use it in a workflow:

  - name: ssh
    image: relaysh/ssh-step-exec
    spec:
      connection: !Connection {type: ssh, name: my-ssh-connection}
      username: relay
      port: 2222 # defaults to 22
      knownHosts: |
        server1.example.com ssh-rsa AAAAEXAMPLE
        server2.example.com ssh-rsa AAAANOTHEREXAMPLE
      # or
      #strictHostKeyChecking: false
      hosts:
        - server1.example.com
        - server2.example.com
      commands:
        - whoami
        - uptime
        - cat /etc/passwd
Please feel free to shoot me an email (check my profile) and I'd be happy to help you write a workflow for your use case!

This looks like it uses tekton pipelines and triggers under the hood.

Yep! Relay uses Tekton under the hood to execute the workflows. It actually uses Knative and Ambassador to process the triggers. You can read more about the trigger system here: https://blog.getambassador.io/user-defined-webhooks-in-puppe...

What's the pricing model? I don't see it on the website

We're thinking about a few different models, but haven't settled on one yet. What kind of model would you be interested in? Metered or other usage-based, per-seat or per-user, or something totally different?
