Ask HN: DevOps learning resources
288 points by durian89 on June 28, 2017 | 64 comments
Hey guys,

I've been a developer for some time now, but I've never had to do anything with deployment, servers, and all these good things.

I'd really like to have the basics so would you guys have some nice resources to gain some general knowledge about it. Thanks!

Well, the best way to learn is to actually do something. Create a small service in your favourite framework. Something really small, an echo service will probably suffice.
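
For reference, an echo service really can be this small. The following is a minimal sketch using only the Python standard library; the `EchoHandler` and `make_server` names, the port, and the GET health response are assumptions made for illustration, not anything prescribed above.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Replies to every POST with the exact bytes it received."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self._reply(body, "application/octet-stream")

    def do_GET(self):
        # A trivial health response gives load balancers and monitoring
        # something to poll once the "Ops" work starts.
        self._reply(b"ok\n", "text/plain")

    def _reply(self, body, content_type):
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the sketch quiet; a real service would log requests.
        pass


def make_server(host="127.0.0.1", port=8080):
    """Build the server without starting it, so callers decide how to run it."""
    return HTTPServer((host, port), EchoHandler)
```

Calling `make_server().serve_forever()` starts it in the foreground. Everything after that (packaging, deploying, monitoring) is the actual exercise.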

Then stop "developing" and switch on "Ops mode".


- builds (build & packaging scripts)

- deployments - try all 3 major approaches:

    - push deployment: running a command on a central server that orchestrates everything (Ansible, Salt, chef-solo, ...)

    - pull deployment: agents running on your target server, that pull the latest changes (Chef, Puppet, ...)

    - immutable infrastructure: VMs or containers that are never modified, only recreated (CloudFormation, Docker, ...)

  include database updates in your deployment orchestration and possibly include even environment pre-warming/pre-caching
- functional tests, especially fast smoke tests


- high availability/load balancing (Nginx, HAProxy, Apache, Elastic Load Balancer)

- detailed technical monitoring and graphing (Nagios, Zabbix, Cloudwatch)

- availability monitoring (Pingdom)

- a status page (can't give you a decent example; you can build your own, but host it somewhere else than your main "app")

- log collection and shipping (Splunk, Graylog)

Basically, for almost everything I listed google options and pick an "Ops stack". Then implement that as best you can.

Oh, and by the way, while working on the "Ops stack", only "develop" things in support of this Ops work in your "app".
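
To make the "fast smoke tests" item from the list above concrete, here is one possible sketch in Python. The `SmokeCheck` structure, the function names, and the example URLs are all hypothetical; the idea is only that each check hits one URL right after a deployment and verifies the status code and, optionally, a piece of expected text.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass


@dataclass
class SmokeCheck:
    name: str
    url: str
    expected_status: int = 200
    must_contain: str = ""


def http_fetch(url, timeout=5.0):
    """GET a URL and return (status, body_text); HTTP error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def run_smoke_checks(checks, fetch=http_fetch):
    """Run every check and return a list of (name, passed, detail) tuples."""
    results = []
    for check in checks:
        try:
            status, body = fetch(check.url)
        except OSError as err:  # connection refused, DNS failure, timeout
            results.append((check.name, False, f"request failed: {err}"))
            continue
        if status != check.expected_status:
            results.append((check.name, False, f"status {status}"))
        elif check.must_contain not in body:
            results.append((check.name, False, "expected text missing"))
        else:
            results.append((check.name, True, "ok"))
    return results
```

The `fetch` parameter is injectable so the checks themselves can be tested without a network; a deployment script would fail the rollout if any tuple reports `passed` as `False`.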

The only thing I'd add is be sure to do it manually first - you'll need the solid foundation of understanding how it works (and how it fails) in order to automate it. Note the key word - understanding - and how that means that just googling the steps won't help you in the long run.

Set up & configure Jenkins to pull and build. Set up and configure nginx, MySQL/PostgreSQL, init files for your app, iptables to firewall off unnecessary ports, DNS, and LetsEncrypt SSL certs (and the associated nginx configs). Make sure nothing that doesn't need to be is running as root. Make sure that what is running needs to be running, and nothing else.

Then, start automating it. Toss in cloud features like ELBs, RDS instances, cloudformation (or terraform, etc), autoscaling groups, autoscaling container services, etc.

This will take you a very long time (and you will probably have enough skills to be hired as a "DevOps" engineer well before you're done), especially if you strive for understanding. But it's really the only way to get where you want to get and be good at it.

There are two major types of monitoring you should be concerned with: log aggregation and metrics collection.

For log aggregation the most mature open-source solution I know of is the ELK stack (Elasticsearch for storage, Logstash for aggregation, Kibana for visualization). Set that up (bonus points if you use Puppet/Chef/Ansible/Whatever to set it up for you), and make sure you can detect anomalies from your visualization interface.
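
Log aggregation tends to go more smoothly when the application already emits structured logs. As a rough sketch (the field names and the `build_logger` helper are assumptions, not an ELK requirement), Python's standard `logging` module can write one JSON object per line, which a log shipper can then forward without fragile text parsing:

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream=sys.stdout):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Usage would look like `build_logger("echo-service").info("request handled in %d ms", 12)`.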

For metrics collection, Prometheus has a special place in my heart, I love its architecture. That said there are quite a few options, InfluxDB is another (arguably simpler) option. You can then use something like Grafana to visualize the metrics.
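
To get a feel for what a metrics endpoint serves, here is a small hand-rolled sketch of a counter rendered in the Prometheus text exposition format. The `CounterMetric` class is hypothetical and written only for illustration; in practice you would most likely use an official client library rather than formatting this yourself.

```python
from collections import defaultdict


def _escape(value):
    """Escape a label value: backslash, double quote and newline."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class CounterMetric:
    """A monotonically increasing counter with optional labels."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = defaultdict(float)

    def inc(self, amount=1.0, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        # Sort labels so the same label set always maps to the same series.
        key = tuple(sorted((k, str(v)) for k, v in labels.items()))
        self._values[key] += amount

    def render(self):
        """Return the HELP/TYPE header followed by one line per label set."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            if key:
                label_text = ",".join(f'{k}="{_escape(v)}"' for k, v in key)
                lines.append(f"{self.name}{{{label_text}}} {value}")
            else:
                lines.append(f"{self.name} {value}")
        return "\n".join(lines) + "\n"
```

Serving the output of `render()` over HTTP gives a scraper something to collect, and the resulting series can then be graphed in Grafana.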

For configuration management tools, Chef is complicated and you better know Ruby and have a few months to dig into it. Puppet is simpler, but has a lot of weird quirks. Ansible is simple, but not as powerful (or used); so pick your poison, they all are pretty similar conceptually.

Basically at the end of your journey you should be able to kill any server (or network connection, or application running on a server) and be alerted to it in near realtime. You should be able to diagnose what's wrong quickly using Kibana and Grafana, and finally you should be able to destroy and recreate the server using a few well understood commands and have it be completely reconfigured and ready to accept connections. Of course this is pie in the sky (like having perfect unit/integration testing for a very large application), but that should be the goal.
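
The "kill a server and get alerted" goal can be prototyped in a few lines before reaching for a full monitoring stack. This is only a sketch under simple assumptions: a plain TCP connect stands in for a real health check, and `alert` is a hypothetical callback you would wire to email, chat, or a pager.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(targets, alert, interval=10.0, rounds=None, sleep=time.sleep):
    """Poll each (host, port) and call alert(host, port, is_up) on state changes.

    Targets are assumed to be up initially, so the first failed check alerts.
    With rounds=None the loop runs forever.
    """
    last_state = {}
    completed = 0
    while rounds is None or completed < rounds:
        for host, port in targets:
            is_up = port_is_open(host, port)
            if last_state.get((host, port), True) != is_up:
                alert(host, port, is_up)
            last_state[(host, port)] = is_up
        completed += 1
        if rounds is None or completed < rounds:
            sleep(interval)
```

Alerting only on state changes keeps a dead server from paging you every ten seconds, which is the same idea the dedicated tools implement far more thoroughly.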

I forgot to mention 3 very important things "to add":

- security (you should already know a great deal from your work as a developer, the general things such as no open ports, no plain text passwords, etc.)

- related to security: secret management (Chef, Puppet - I think, Vault)

- backups, backups, backups: for the database and basically anything which is not reproducible from source (which in general should only be your database data)

Interesting only one comment made a clear call to backups. Strongly second the notion of backups as a high priority to learn and implement. Could make it more "devops-y" by also automating the verification of them with restores/testing.
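
As a sketch of what "automating the verification with restores" might look like, the snippet below backs up a database and then opens the copy to check it. SQLite is used purely as a stand-in so the example is self-contained; with MySQL or PostgreSQL you would restore a dump into a scratch instance instead. The function names and the row-count check are assumptions for illustration.

```python
import sqlite3


def backup_database(source_path, backup_path):
    """Copy a live SQLite database to backup_path using the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def verify_backup(backup_path, expected_counts):
    """Open the backup, run an integrity check, and compare row counts per table.

    expected_counts maps table name -> expected number of rows. Table names
    are assumed to come from trusted configuration, not from user input.
    """
    conn = sqlite3.connect(backup_path)
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            return False
        for table, expected in expected_counts.items():
            (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            if count != expected:
                return False
        return True
    except sqlite3.Error:
        # Missing tables or a file that is not a database both count as failure.
        return False
    finally:
        conn.close()
```

A backup that has never been restored is only a hope, so running something like `verify_backup` on a schedule (and alerting when it returns `False`) is the part worth automating.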

> - log collection and shipping (Splunk, Graylog)

I'll also throw my hat in for setting up (and securing, without Shield) an ELK stack here. You'll learn a lot along the way.

Agreed, recently set up the Elastic "stack" with fluentd shipping logs from Kubernetes nodes. Fun and very rewarding.

I'll have to look into that.

I'm finding Beats + Logstash to be very convenient though.

> - a status page (can't give you a decent example; you can build your own, but host it somewhere else than your main "app")

Does statuspage.io have free trial accounts? Integrate with them for your demo/trial using API calls. Bonus points if you pull in applications metrics to your status page (response time, etc).

No. They start at $29/month

With respect, Puppet is push-based, not pull-based.

Chef is pull-based by default, but you can also use it in a push-oriented manner.

Having worked with both Chef and Puppet, it is my opinion that Chef is the more scalable of the two, and the one that makes more logical sense to me. YMMV.

While I agree that learning by doing is the best way to start, once you get beyond the basics it's nice to have some patterns and best practices outlined so you can try to do things more "properly".

Friends don't let friends configure Nagios...

How is he supposed to learn the Ops part of DevOps without pain and suffering? :D

The first thing to realize is that DevOps is an ambiguous term (at least partly by design, it seems).

My belief—shaped by many at the forefront of the DevOps movement—is that it is a cultural focus rather than a technical one. In many ways, it's an extension of agile philosophies, with a focus on fast feedback, transparency, heightened interactions between teams, etc. There is also a heavy focus on automation (CICD), but the automation is there to serve the cultural goals. Just because you do CICD doesn't mean you're necessarily doing DevOps, and you can adopt a lot of DevOps principles without doing full CICD.


* The Phoenix Project: introduces a lot of concepts (such as lean principles) that are foundational to the movement

* Effective DevOps (O'Reilly)

* The DevOps Handbook


* Arrested DevOps

* DevOps Cafe


* IT Revolution


* DevOpsDays conferences

* Local meetups

* Velocity conferences

* DevOps Enterprise Summit

Having a good grasp of both development and operations skills is helpful. But it's far from complete. If you solely focus on the technical aspects without examining the cultural, you're missing the foundation of the movement.

For those who've gone, is the DevOpsDays $175 price tag worth it?

They're offered all over the world, so the quality probably varies quite a bit (especially since it's a fairly decentralized organization). However, for the price tag, I'd say it's a steal—the networking value alone is worth more than that.

DevOpsDays has been around since the very beginning of the movement—in fact, the term was popularized at DevOpsDays Belgium in 2009.

Have you written a webapp as a side project before? If yes, then great:

1. Find a bare VM provider, e.g. Linode or Digital Ocean or EC2.

2. Figure out a way to get your code up there, even if it's ghetto. e.g. git pull from github. Get the app server running.

3. Figure out how to expose your app to the internet. Buy a domain name, get it to point to your IP addr. Soon enough you will realize that DNS load balancing is terrible...

4. Then you should install Nginx or HAProxy and put your app behind it. Run your app on localhost; only nginx should be exposed.

5. Once everything is up, iptables is your next concern, expose only ports you want and disable everything else.

6. Repeat 1-5 on a second instance.

7. Soon enough you will find that repeating the same thing sucks, so you will write a Python script using the Fabric library. But then you will realize someone else has done this already: after some quick googling, you will find Ansible or Salt. Use those instead.

8. Rinse and repeat for other networked things. e.g. databases, mail relays, cache servers, etc.

9. Evolve your approaches, rewrite ghetto stuff, make your artifact as immutable as possible with very few network dependencies...

That's pretty much the entire devops evolution up until 3 years ago. Once you get good at these...

1. Start reading about Linux containers and why they are useful.

2. Install docker and try to get your toy project in a dockerfile.

3. Repeat the learning exercise again on deploying Docker containers.

Hope that helps :)
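
The "write a script so you stop repeating yourself" stage from step 7 of the first list might look roughly like the sketch below. It uses plain `ssh` through `subprocess` rather than Fabric so that nothing beyond the standard library is needed; the `deploy` user, the host names, and the step commands in the usage note are all hypothetical.

```python
import subprocess


def build_ssh_command(host, remote_command, user="deploy"):
    """Return the argv list for running one command on one host over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", f"{user}@{host}", remote_command]


def run_everywhere(hosts, steps, runner=subprocess.run):
    """Run each step on each host in order.

    On a given host, stop at the first failing step and move on to the next
    host. Returns a dict mapping host -> the step that failed there.
    """
    failures = {}
    for host in hosts:
        for step in steps:
            result = runner(
                build_ssh_command(host, step), capture_output=True, text=True
            )
            if result.returncode != 0:
                failures[host] = step
                break
    return failures
```

Something like `run_everywhere(["web1", "web2"], ["cd /srv/app && git pull", "sudo systemctl restart app"])` gets you through instance number two. Once you start adding retries, inventories, and templates to it, you are rebuilding Ansible or Salt, which is the point of that step.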

The Phoenix Project is a really inspirational novel, good for 'getting' the devops mindset and also as a tool to get other people in your organisation on board.


The DevOps Handbook is a sister book to The Phoenix Project which is more technically oriented around the practicalities of closer integration between Dev and Ops.



One word of caution. The future really will be having a Dockerfile and finding a place to run code (Docker Swarm, ECS, etc). When you commit code it will be run through a CI system (unit tested, staged) and then deployed.

The salts, puppets, chefs, Vaults, running your own Kubernetes (wtf), VMs, Nagios, sendmails, vpcs and all this other drama will be latin in a few years.

The future will be running the code on your MacBook (with your container of choice), then committing. The end.

Sorry but Kubernetes is production ready. Swarm is everything but. If you struggle with K8s, give Deis Workflow a try.

ECS is proprietary, learning how Kubernetes works gives you flexibility to move between cloud providers. GKE is a beast.

This could be true, from a developer's perspective. But the OP was in the spirit of expanding one's knowledge and understanding. But even if a single-interface, fully-managed container orchestration service is available to all, the value of understanding how the underlying systems actually function will not cease to exist. Anyone building moderately complex systems will still want to know about OS, networks, storage systems, access control, and all the rest, because those areas of expertise are applicable and useful across the stack.

This is not particularly helpful advice. All of these things are worth learning and they are all still in widespread use and aren't going to disappear overnight. On top of that it's not like docker is some magic wand that magically makes DevOps happen. If that's the future it's not here yet and doesn't answer the OP's question.

> not like docker is some magic wand that magically makes DevOps happen

Exactly. So I would start studying this: things like container security, how CI works with containers, how ECS works. That beats terrible advice like 'learn how puppet works'. Further, it is helpful advice. If someone asked me how they can get involved in ActionScript coding I would warn them too.

I work at one of the large insurance companies in a DevOps role wherein I convert the various product build processes into a CI pipeline. We use Jenkins (we use all of them, but our department uses Jenkins). Everyone has roughly the same process in general - you store some code in SCM, and then something checks it out, runs various "Quality Gates" (corporate speak for tests), and then you eventually build the product and commit it to some secure artifact repository. You put some logs on there that you're somewhat convinced employees can't modify or delete, and call that an audit trail.
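
Stripped of any particular CI product, the process described above is a sequence of gates where the first failure stops everything. A toy sketch of that control flow follows; the stage names in the usage note are hypothetical and this is not how Jenkins itself is implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineResult:
    passed: bool
    log: List[str] = field(default_factory=list)


def run_pipeline(stages):
    """Run (name, callable) stages in order and stop at the first failure.

    Each callable returns a truthy value on success. The returned log doubles
    as a very crude audit trail of what ran and how it ended.
    """
    log = []
    for name, stage in stages:
        try:
            ok = bool(stage())
        except Exception as err:  # a crashing stage counts as a failed gate
            ok = False
            log.append(f"{name}: error: {err}")
        else:
            log.append(f"{name}: {'passed' if ok else 'failed'}")
        if not ok:
            return PipelineResult(False, log)
    return PipelineResult(True, log)
```

A run might be declared as `run_pipeline([("checkout", checkout), ("unit tests", run_tests), ("publish artifact", publish)])`, where each function wraps the real work; the important property is that nothing gets published if an earlier gate fails.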

Ok, so that said, I would say that for an open source core like Jenkins, the source code is hands down the only documentation you can trust. And that's very, very frustrating at first. Trusting what's in a wiki is the easiest way to burn a day, believing something that was once (possibly) true to still be true. Most interfaces are not documented at all, beyond system generated documentation. A certain class of developers are fine with that, but many of us need a little push up the hill in order to script our way around these ecosystems effectively.

After accepting that the source code is the single source of truth, the job got a lot easier (and made me come across as far more of an expert). But it's a difficult sell to get other people interested in sharing this work.

I made a mind map for learning DevOps:


Clicking on nodes with a map will go to other mind maps with resources.

You might want to add in networking components as well. Throw in (OS) packaging as well (not all services are put into containers...). I'd change 'automation' to 'config management', since the former concept is also part of a number of other nodes (particularly CI and orchestration). Datastores and their efficient use are another set of nodes (databases, caches/cdns, things like s3). Secrets management (fun!) is another one, though that belongs under config management.

I have a question. How are automation and CI different?

I don't know much about this stuff and the only thing I have ever touched is Jenkins. So how is Jenkins actually different from Ansible, for example?

Nice. Hope you don't mind me pointing this out, but cluster manager seems like odd semantics, wouldn't that grouping be container orchestration?

Yes, I think that is more appropriate too. Will change it now.

Very useful, thanks for sharing

For that Linux administration book, how do you know whether it's good or not? It's not out yet.

For the build/test/integration/deployment cycle, there's "Continuous Delivery" by Humble and Farley: https://smile.amazon.com/Continuous-Delivery-Deployment-Auto...

I found that very good for understanding the principles and reasons. Implementing it was a very frustrating experience for me, which is why I wrote a (much shorter) book on how to do it practically, with worked examples: https://leanpub.com/deploy

Also, if you can afford it, the Site Reliability Engineering book is an excellent comprehensive resource on practices, theory, return from experience: https://landing.google.com/sre/book.html (and you can read it online for free, too).

Instead of buying the book, why not get Safari (the website, not the browser) and read that book plus many more? And if you are an ACM member you already have access to it through http://learning.acm.org

This is a less general answer, but if you want to work with AWS and you have a Kindle, download the manuals for free and read them during down time. I did this and it helped me a lot.

Do you have a website / blog or any other hobby project?

You can learn basics by automating its deployment.

Let's say you will go with public cloud: Amazon or Google.

Create a small basic infrastructure with some automation tool such as terraform.

For example a VPC with private subnets and two small instances in different availability zones. Another instance for NAT and a load balancer.

Then use Ansible / Puppet / Chef to automate deployment of your application, make sure you can deploy without any outage to the service (set up some health check to verify your app is always up).
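
The "deploy without any outage" requirement usually comes down to updating one instance at a time and checking its health before touching the next. Here is a sketch of that loop with every side effect injected as a callback; `drain`, `deploy`, `is_healthy`, and `restore` are hypothetical hooks you would implement with your load balancer and configuration management tool of choice.

```python
def rolling_deploy(instances, deploy, is_healthy, drain, restore):
    """Update instances one at a time, halting on the first unhealthy one.

    Returns (updated, failed): the instances successfully updated and put
    back in rotation, and the instance that failed its health check (or None).
    """
    updated = []
    for instance in instances:
        drain(instance)    # take it out of the load balancer first
        deploy(instance)   # install and start the new version
        if not is_healthy(instance):
            # Leave the broken instance drained; the untouched instances
            # keep serving the old version, so users see no outage.
            return updated, instance
        restore(instance)  # put it back in rotation
        updated.append(instance)
    return updated, None
```

With two instances behind a load balancer, as in the setup described above, at least one is always serving traffic during the rollout.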

Perhaps deploy the app in a container. Automate security settings to lock down ports. Automate SSL certificate renewal with Let's Encrypt.

You can even take it a step further and automate deployment of a PaaS like Kubernetes or CloudFoundry and deploy your blog / website there.

Next task would be setting up CI pipeline and continuous deployments. Integrate it with PaaS.

Where do you store all secrets and credentials? Look at solutions such as Vault from HashiCorp or Ansible Vault.
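
Whichever secret store you pick, the application side usually ends up looking similar: secrets arrive at runtime instead of living in the repository. The sketch below uses environment variables as the delivery mechanism, which is a simpler technique than Vault or Ansible Vault and not a replacement for either; the names `load_secrets`, `redact`, and `DB_PASSWORD` are made up for the example.

```python
import os


class MissingSecret(RuntimeError):
    """Raised at startup when a required secret was not provided."""


def load_secrets(names, environ=None):
    """Return {name: value} for every required secret, failing fast if any is absent."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise MissingSecret("missing required secrets: " + ", ".join(sorted(missing)))
    return {name: environ[name] for name in names}


def redact(value, visible=2):
    """Mask a secret for log output, keeping only the first few characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)
```

Failing at startup is deliberate: a deployment that is missing a credential should be caught by the health check, not by the first user request that needs the database.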

There's a lot to learn by just automating everything about a simple hobby project or blog.

Of course after you're done with this scale down all this crazy infrastructure so you don't pay $200 for a blog.

Work through all of these tutorials on containerization and orchestration found here: https://github.com/docker-training/orchestration-workshop

When running the Dockercoins example, how can you scale it up to mine as many coins as possible? How would you set up alerts for if one of the services went down? How would you diagnose the problem? What other tools might you use?

Play around and see what tools you like and don't like. Have fun!

Over at DZone, we have a big community of developers sharing tutorials and content that could help introduce you to DevOps.

For starters, we have an entire web portal with thousands of free community-written articles about DevOps-related topics: https://dzone.com/devops

We also have several Refcardz (cheatsheets) on a variety of DevOps related topics:

Deployment Automation Patterns https://dzone.com/refcardz/deployment-automation-patterns

Continuous Integration Servers https://dzone.com/refcardz/continuous-integration-servers

Continuous Delivery Patterns https://dzone.com/refcardz/continuous-delivery-patterns

And for an introductory overview, we provide topical research and best practices Guides. Here is our latest 2017 DevOps Guide: https://dzone.com/guides/devops-continuous-delivery-and-auto...

DevOps is such a broad term and can mean a lot of things. If you are looking for something related to orchestration, I am putting together some Ansible recipes specific to AWS. They are at https://github.com/gshakir/ansible-recipes

DevOps is a moving target right now, and a lot of great fundamentals are linked here already. I'd also recommend subscribing to, and reading old issues of:

https://sreweekly.com/

http://www.devopsweekly.com/

https://lastweekinaws.com/

https://weekly.monitoring.love/

The last two are fairly young, but have good content so far. I wouldn't call it a well-curated list, but tons and tons of great posts are linked (in loosely chronological order, as weekly mail blasts tend to be).

I just searched "list of curated devops github", this is what I got.

Awesome Devops. A curated list of resources for DevOps


The new cloud space has changed a lot in the last few years. A short time ago I'd have said you should learn about creating VMs, using Ansible, scripting, etc.; now Kubernetes and Docker have changed much of that. It's worth looking at which space you want - with Kubernetes etc you barely need to know any Linux or scripting.

People learn differently - but I'd recommend starting there. Get the generous Google Compute intro special and write some apps. There are a bunch of free resources on the web - I can't point to anything specific. Linux Academy and Safari Books Online are huge but not cheap.

(In case the handle is missed, I created Ansible and I'm a little opinionated on this)

DevOps isn't really a thing, but an amalgam of things. Some people think it's about culture or something, which I think is too obvious to be a thing. In the beginning when many people were using the word lightly, most people really just used it to mean automated systems administration, which is more likely called "Operations" now. That's fine. Some people use it to mean groups of people who make ops tools to allow developers to self-deploy their own stuff. That's also fine.

Most likely what you are looking for is to learn how to do IT Operations stuff the way people are currently doing it.

Reading a lot of articles is fine, trying lots of tools is fine. Talking to people at your company that DO ops is huge. Make friends with the guys who run the build systems, do security, or anything like that for starters and they can show you lots. Plus I strangely find that ops guys are much better to go to lunch with than developers. Don't know why :)

You should read up on AWS lots, as it will likely be across your career path at some time. Try a configuration management tool (or two). Learn about monitoring systems and logfile collection/analysis systems. Do a little bit of reading on computer security. Vagrant is probably useful, but optional, though you should at least get going on a virtualized Linux box. Reading up on Immutable Systems is worthwhile. Pick up either CloudFormation or Terraform, or both if on AWS. I don't know Google as well, but it has a lot of similar things.

DevOps Days conferences can be good sometimes but often they are too cultural to get down into technical bits. But they are cheap and usually close by, so they are things.

If you have a local meetup group that can be absolutely great.

The really nice thing about AWS now is there are tons of parts and it is pretty cheap to try things out, where before you probably couldn't get your IT guy to let you play with a load balancer or get you your own database instance. Now you can, so that makes it a lot easier to learn than it was before.

IMHO I don't like books because they are often written by people who don't DO things (DevOps has an unfortunate "thought leader" problem, which impacts conferences and tries to get everyone to believe the same things), and podcasts/videos are too slow for me, and my brain is a lot more random access.

Don't get caught up in assuming you must do any one particular thing. For instance, Continuous Deployment is a spectrum, it's not appropriate for everyone.

And at most people's scale, you have no need for something like Docker or Kubernetes when basic AWS instances require a lot less to keep going in your head.

Few resources I haven't seen mentioned yet:


If you're looking for a community of people to interact with I've found the following Slack teams to be very active with lots of helpful people:



How do you get an invite to the latter slack?

I believe it was https://devopschat.co/

Like most comments here, I'd suggest you get started by deploying your side project to any of the cloud providers out there. This process alone will teach you a lot about deployment. Also, I find https://serversforhackers.com/ and https://sysadmincasts.com/ very useful.

Linux Academy. Not free but has a very nice DevOps learning track.


Love that site.

Shameless plug - I've got a 15 hour course on building a full infrastructure on AWS:


We go from zero to fully scalable/loadbalanced etc. infrastructure on AWS. :D

devops is more of a principle than a job. that's my opinion though.

i've been reading through this book and it does a decent job of covering the principles for CICD, builds, tests, releases: https://www.amazon.com/Continuous-Delivery-Deployment-Automa...

you'll run across various "thought leaders" in devops and it's important to remember that a) each employer treats devops and cicd differently and you'll want to learn their practices as you bring your own ideas to the culture and b) you should form your own opinions: just b/c thought leaders and books are out there, it's still important to learn what you like to do and improve how you like to do it.

Check out my "Training program to make a Novice System Administrator": http://verticalsysadmin.com/blog/training-program-to-make-a-...

I use pocket to collect (and tag) links related to the field. I hacked together a small UI around it that lets you filter them by tags. There's a lot of good content there IMHO.

You can check it out here: http://links.ozkatz.com/

Where do you live? If you'd like to meet up, I'd be happy to braindump as much as possible on you.

I'm trying to build up the effort to finish writing my book on monitoring, let me know if you'd like to read it and it might encourage me to push on.

Vendor specific but a lot of good tutorials http://learn.chef.io

get a digital ocean account and a domain name. start a lamp server (or equivalent), use the excellent (imo) documentation to get things running. add services as you go along. I learned a lot from just that, now I have my own webserver, source control (git server with gitea) and other things running. tons of fun and learning

Check out:

- The DevOps Handbook

- State of DevOps Report 2017

- Guide to Sysadmin Body of Knowledge www.sabok.org
