
We use Collins (https://tumblr.github.io/collins/) as a Configuration Management Database, Ansible (https://www.ansible.com/) for automation, Terraform (https://www.terraform.io/) plus a bunch of homebrew for orchestration, and Packer (https://www.packer.io/), powered by Ansible, for multi-cloud (and hypervisor) image creation and maintenance. Every single thing is committed to a series of Bitbucket (https://www.bitbucket.org) repositories.

We connect Ansible and Collins through ansible-cmdb (https://github.com/fboender/ansible-cmdb), then tie the entire thing to our ticketing systems, ServiceNow (https://www.servicenow.com/) and Jira Service Desk (https://www.atlassian.com/software/jira/service-desk), and finally ensure we have history tracking with Slack (https://www.slack.com).

As a given, we yank test the entire world. If it doesn't pass a yank, it straight up doesn't exist.

Whether it's bare-metal, virtualized, para-virtualized, dockerized, mixed-mode, or cloud, we 100% do this all the time. There is not a single change across any environment that isn't fully tracked, fully reproducible, fully auditable, and fully automated.

what do you mean by "passing a yank test"? i assume "yank test" refers to unplugging the network cable abruptly from the server under test, but what exactly are you looking for when you do that?

A yank test on process and infrastructure is more than a 'did it come up'. It's an "if we totally nuke the thing" test: say we were to rip the hard drives out of a server, fry it, and recreate it. Does it come up identical(ish)?

That way we know our CMDB is accurate, our workflows are accurate, and so are credentials, Ansible, Terraform, images, etc., right down to tickets.
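
To make the "does it come back the same" check concrete, here is a rough sketch in Python. It is not our actual tooling: the function name `yank_diff`, the fact keys, and the sample values are all made up for illustration, and it assumes you can get the recorded state and the rebuilt host's state as plain dictionaries (for example, exported from a CMDB and from gathered host facts).

```python
def yank_diff(expected, actual, ignore=frozenset()):
    """Return {key: (expected, actual)} for every fact that differs after a rebuild.

    Keys listed in `ignore` are skipped; these are the values that are
    allowed to change when a host is recreated.
    """
    keys = (set(expected) | set(actual)) - set(ignore)
    return {
        k: (expected.get(k), actual.get(k))
        for k in sorted(keys)
        if expected.get(k) != actual.get(k)
    }


# Hypothetical facts recorded before the host was destroyed...
before = {
    "hostname": "web01",
    "os": "debian-9",
    "packages": ["nginx", "openssl"],
    "boot_id": "a1",
}
# ...and hypothetical facts reported by the rebuilt host.
after = {
    "hostname": "web01",
    "os": "debian-9",
    "packages": ["nginx"],
    "boot_id": "b2",
}

# boot_id is expected to change on every rebuild, so it is ignored.
drift = yank_diff(before, after, ignore={"boot_id"})
print(drift)
```

An empty result means the rebuild matched the record. Anything else (here, the missing package) means either the automation or the record is wrong, which is exactly what the test is meant to surface.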

It's how we manage all of our cloud customers.
