Software Developers Should Have Sysadmin Experience (professorbeekums.com)



This might be controversial, but I don't think you get to be a half decent developer without being a reasonable sysadmin.

Maybe my experience is unusual, but I've never worked anywhere that the sysadmins knew more than the developers about how best to run their code in production, or, when things go wrong with it, how best to find the cause of the issue.

And I've never thrown code over a wall without having tested it in a representative environment.

The worst sysadmins get in the way of developers. Ones that scale down your CI server to the cheapest, most throttled one the hosting company has, leaving $800/day contract developers waiting nearly an hour for builds that run in 20 seconds on their laptops. And then they argue the toss about whether the CI server is cost effective, and every few months scale it down again despite the CTO saying it needs to be left alone.

When a sysadmin sees an issue in "their" environment that they understand there's a tendency for some of them to just see that issue as the only thing the developer has had to deal with that month. In all likelihood, in a productive company, it's the most trivial issue the developer has had to resolve that day.

Often this stuff goes more smoothly where the developers manage production (I mean, it's not as though, if you're going to drop one of the two groups of people, it's going to be them going) and there aren't people with separate job titles and the resulting friction between them.

Sorry. There must be great sysadmins out there struggling with terrible developers, I'm sure of it. I just haven't seen it.


I've done the dual sysadmin/developer thing for a small company, and the problem I experienced there was completely incompatible working modes.

Sysadmins must deal with interrupts (requests, crises, things driven by external schedules etc) and then in the rest of their time build systems to manage or reduce the interrupts. Developers are expected to produce work on a predictable schedule. Interrupts disrupt that schedule and obliterate the time for proactive work unless your management is very good at making it a priority.

The "prevention of information services" problem is certainly real though. Perhaps it could be addressed by embedding the sysadmins in the dev teams rather than having a department of their own, but then you have to fight org hierarchy.


I had exactly the same experience at a previous employer, almost word-for-word.

Having said that, the central assertion is still correct: the absolute best developers I've ever worked with were also top-tier sysadmins (or linux experts, depending on what you want to call it).


AMEN!!! In my current job, I am wearing both hats, and while I like that there is a certain variety in my work, users calling for help is highly disruptive when programming or doing some other stuff that requires deep focus.

The upside is that in a three-person IT department there is very little bureaucracy to fight, just the odd "organically grown" legacy system.


As an aside, did you know that the word "Amen" is actually an acronym in Hebrew for "El melekh neʾeman" (AMN), which translates to "God, trustworthy King"? (source: https://en.wikipedia.org/wiki/Amen)

I figured the etymology of that word was rather interesting. But yeah, I get the whole SysAd/Dev dual job. They're tough to balance and do effectively. SysAds are firefighters. When the nag(ios) alarm rings, we come a-callin.


From that same Wikipedia article:

The Talmud teaches homiletically that the word amen is an acronym

The etymology section shows the word has much more prosaic roots. The Talmudic acronym seems rather to be an interesting backronym.


> As an aside, did you know that the word "Amen" is actually an acronym in Hebrew for "El melekh neʾeman" (AMN), which translates to "God, trustworthy King"? (source: https://en.wikipedia.org/wiki/Amen)

I did not know that. ;-) Thanks!


The fact is that nobody really knows. The Talmud was compiled around 200 AD and is an exegesis. Egypt is the elephant in the room of Hebraic history. It is possible it is an Egyptian loan word, just like Moses -- "born" from water -- is an Egyptian name.


I've been in this situation for a long time; the worst part is that every so often you get assigned a PM who wants you to accept responsibility for meeting artificial development deadlines.


Developers IMO benefit greatly from having general engineering experience. It helps them understand how the part they build fits into the full product, where the narrow spots are, what is likely to break first, where formal documentation is insufficient / contradictory / wrong, etc.

Sysadmins, who often manage crises, acquire this experience one way or another (e.g. by researching options to fit a square peg in a round hole without leaks), so developers with sysadmin experience tend to all have it. I think, though, that the key part is the "engineer" part, and it can be acquired and used without sysadmin-imposed hassles (interrupts, crises, being underappreciated).


The two hats don't fit well at the same time; 27 years of both has not been easy, but the experience of wearing each regularly is invaluable.


that's a great point about the types of schedules for each discipline.


> I mean, it's not as though, if you're going to drop one of the two groups of people, it's going to be them going

I'm sorry you've had such an awful experience with sysadmin colleagues that you've developed such a corrosive attitude towards them. I've worked in lots of good environments, where dev/ops was being effectively practiced, and sysadmins there were the most effective force multipliers imaginable.


> awful experience with sysadmin colleagues that you've developed such a corrosive attitude

This is going to be a sensitive topic, but can we talk about "BOFH" culture somewhere on this thread? (Maybe I'm old and it's now dead, but I think some of it persists.)

I kind of understand it as a product of working in an environment where everything is urgent and nothing is appreciated, but when sysadmins come to resent the people they're supposed to be supporting then the force multiplier turns negative. Sysadmins develop strategies for reducing the number of requests at any cost, usually by making the experience as opaque and unhelpful as possible.


When I was reading BOfH a few years ago I got a different gist.

The BOfH is the archetype of someone who is excellent both with technology and politics. When you are in a service role, you have two competing priorities. You must deliver people the things they want, but also keep things nice and stable for yourself so you don't go crazy.

An important dynamic in organizations is laziness vs. intimidation. Political savviness allows you to apply intimidation to get the lazy to do what you want. You can threaten to fire, or raise an issue that could possibly get them fired and, even if it doesn't, won't make them look good. The BOfH is someone who can respond to political intimidation with adroit technical interventions to ensure that his second priority, ensuring a smoothly-running system, isn't threatened.

If you read the BOfH stories carefully, you see that the operator knows where his bread is buttered and is careful to remain on good terms with the people who really have the power in the company. The whole thing is a phenomenal read on organizational dynamics.


> Sysadmins develop strategies for reducing the number of requests at any cost, usually by making the experience as opaque and unhelpful as possible.

This is sometimes an organizational problem. I worked in a support role at VMware for about 5 years and this is what I observed:

- Support & IT departments typically have enough staff in the beginning

- The organization grows & the department grows to match the new work that exists

- At a certain point, the organizational view of Support & IT/Ops changes, and it's now viewed as a cost that you want to keep down.

- Leaders try to minimize the increases in budget, but the workload per sysadmin/engineer increases.

- The sysadmins/engineers have no control over the flow of new work, which affects the quality of work that gets done and can create a toxic environment.

It literally becomes impossible to handle all the incoming requests. Different people handle it differently. Good sysadmins would learn to prioritize properly, but due to the toxicity some people have trouble handling it so they end up developing strategies to make a certain number of requests "go away".

Anyway- just my two cents.


The SOLUTION to this problem is multifold:

1) Leadership: Stop viewing the IT/Ops/Support department as a "cost to keep down".

2) Leadership: Treat the department like they are manned by people.

3) Realize that not all requests are created equal. Some take minutes, some take months.

4) Determine a reasonable number of requests/tickets per sysadmin/engineer. Make sure to add padding for things like project work, sick time, vacation, professional development, and so on (a rough sketch of this follows at the end of this comment).

5) Hire proactively to prevent the determined threshold above from being surpassed.

6) From the IT/Ops department's perspective: realize that the incoming requests are coming from people who need your help and they are effectively your clients/customers. Treat them as if customer satisfaction is extremely important!

There are also other strategies where you give a subset of people the ability to work on projects and designate a different subset to be interrupted with urgent requests, and rotate the role. There are all kinds of things you can do to improve the situation :)
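To make point 4 above concrete, here's a back-of-envelope sketch (every number is hypothetical):

    hours_per_week = 40
    project_work = 8     # proactive/project time (part of the padding)
    overhead = 6         # meetings, professional development, admin
    focus_hours = hours_per_week - project_work - overhead  # 26 left
    avg_hours_per_ticket = 1.5
    capacity = focus_hours / avg_hours_per_ticket
    print("capacity: %.0f tickets/engineer/week" % capacity)  # ~17
    # Shave off ~10% more for sick time and vacation, then hire (point 5)
    # before sustained intake exceeds that number.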


> where everything is urgent and nothing is appreciated

in such an environment i wouldn't expect any other result. fix the environment, not the rational human response to it.


Sysadmins are, 99% of the time, responsible for production uptime. Sadly, management includes all of the software being written in-house in this expectation. Change means instability, means that Sysadmins' feet are held to the fire - this makes them resistant to change.

Software developers, on the other hand, are responsible for making changes. Adding features, pushing fixes, and so forth.

These two points of view are inevitably going to cause friction. Developers are only recently starting to be held responsible for production uptime and the pages that come along with that - and it's a good thing for both sides.

That "most trivial issue" for a developer is something the sysadmin was woken up for 3x in the past, and doesn't want to be woken up for again, so he pushes back. How can he not?


In 1997, when my code first started running in Live, I was given a pager and told "welcome to Ops". Every piece of software that company ran had a single dev name, easily read out from the binary, that was used as the contact when trouble occurred.


>>> The worst sysadmins get in the way of developers. Ones that scale down your CI server to the cheapest, most throttled one the hosting company has, leaving $800/day contract developers waiting nearly an hour for builds that run in 20 seconds on their laptops.

How likely is it that the sysadmins were told to 'just make it run cheaper, I don't care' by someone higher in the food chain?


> How likely is it that the sysadmins were told to 'just make it run cheaper, I don't care' by someone higher in the food chain?

Having worked in ops for > 10 years, this is how it usually goes.

The SA's job tends to involve a lot of scepticism and caution. You look for problems and try to solve them proactively. One (often easy) way to solve many classes of problems is to throw hardware at them.

Management always pushes back on this tactic. That's reasonable; they need to justify capital expenses (especially if you're self-hosted).

The core issue though is that capital expenses are easy to quantify, while "lost productivity" is much harder to fully account for. If I complain that some hardware upgrade which costs $x could improve productivity, I just don't have hard numbers on my end - it's all napkin math.

In many places reluctance to spend money on infrastructure is also, I think, a symptom of headcount-itis. Managers love to have more employees, and love to have more for them to do, because that makes managers seem more impressive to the org. My manager might have perverse incentives; keeping the SAs busy fighting scaling fires both makes his team look impressive because they're busier, and makes him look better because the capital expenditures are lower.

Obviously, head count is expensive, so this is usually a game of appearances rather than an effective strategy to improve the bottom line. Good insight into productivity is required to catch this kind of stuff, but in the real world I've found that a lot of places just don't have an org structure capable of weighing cost / benefit properly when it comes to infrastructure.


Thanks for such a detailed response :)


How likely is it that the sysadmins were told to 'just make it run cheaper, I don't care' by someone higher in the food chain?

If you're blindly following "orders" to reduce costs and doing things that push up costs elsewhere then you're not doing a good job. A good sysadmin (or the sysadmin's boss) should be able to pull up some numbers and say "Build tasks are being queued for an hour before they run. What impact is that having?", and call a wider meeting that brings together the higher-up-the-foodchain manager, the development team, and anyone else who might be affected. Ideally it'd be the higher up manager who calls that meeting of course, but they may not understand the technical issues.


I have long thought that one of the most effective acts of workplace sabotage available to a sysadmin is to implement management plans without question.


>If you're blindly following "orders" to reduce costs and doing things that push up costs elsewhere then you're not doing a good job.

It is the responsibility of the person above to know whether or not the orders they give should be given. If they need to ask for information from people below them, fantastic, please help them along.

Please don't fall on the sword for incompetent managers.


It is the responsibility of the person above to know whether or not the orders they give should be given.

Yes, and part of that is the people in their team(s) helping them and understanding that they're fallible and may fail to ask a pertinent question. Equally, the manager needs to be open to updates volunteered by their team without a prompt. Ultimately everyone does better if the entire group works together.


Everyone was told to see if they could find cost savings. This just wasn't one, though, in the bigger picture, and even after being told it wasn't one, by their line manager and separately by the CTO, they pursued it over and over. I don't believe anyone else further up the food chain was directly involved on their side.

All it needed was for the question to be asked on the company's internal board, and for them to listen to the answer. Had they tried it once or twice, I'd probably have forgotten about it in short order. This went on for a couple of years though!


My favorite is being told I can't store stuff I need on some enterprise storage solution because it is running out of space, when I know in my head a couple more terabytes of storage costs way less than the amount of effort that went into discussing it by all parties involved, and there's no effort to help me find an alternative. So it goes into a different bucket, like S3, which is what we were trying to avoid (for various on-premise benefits) in the first place.


I agree. I also think it should work both ways. The worst jobs I've ever had were when the sysadmins had the mindset that they own the servers and were unwilling to deviate from what they've read at the behest of the developers.

"I will force AV on reads on the developer boxes." "I will install AV on the production DB servers without telling anyone in the development group, then make the developers prove AV was the cause of production slowness before removing it two weeks later." "I will force this crazy group policy on developers and when they complain, I will totally ignore them."

A bad or uncompromising sysadmin (one and the same) makes development work a complete nightmare.

I half believe the reason developers are embracing cloud architecture so much is to take sysadmins out of the equation.

On a side note, a tip for developers. Always make friends with the sysadmins. Buy them lunch or something. Right or wrong, they can make your lives much better or much more miserable.


> I half believe the reason developers are embracing cloud architecture so much is to take sysadmins out of the equation.

It was literally true at one of my previous jobs. We couldn't install anything on our own dev machines without approval from Net Ops, not even Notepad++ (I don't think I ever got that installed, never got approval).

We once asked for a new server which mirrored the software of an existing server with two months lead time and got complaints that two months is not enough time to get a new server. I think we ended up getting it in three months, after the new project was supposed to be deployed to it.

Meanwhile we were starting to get into Azure, and we had a new server in Azure up and spinning with everything we needed installed on it in about 15 minutes.

The Lead Developer said, "We need to get as much stuff on the cloud as we can so we can stop dealing with this mess." We dealt with a lot of PHI there, though, so there was only so much we could do.


I've seen this as well. "Timeline from internal IT for provisioning a box and deploying our app is 6 months, and subsequent changes go through a ticketing system with a 2 week average turnaround. Or, we can have it running ourselves on AWS in 30 minutes."


Shit. So, you're saying that my mess of a team is kinda awesome by implementing reliable production-ready deployments within 2-4 weeks, and implementing changes to environments within like 30 minutes ("set key foo to bar in configs please?") to a week ("we need persistence!").

I guess IT in this place really is getting up to speed.


Yeah, that sounds spookily similar to the process we had, including the ticketing system changes timeline.


What you're probably not seeing is the CIO/CSO screaming at the SA to get AV deployed on every machine in the company by the end of the month, to meet some audit requirement checkbox or PCI compliance.


Exactly this. Audits don't care if there isn't any practical malware or if nobody can access the system outside 3306 and 22. Audits say "all production systems implement antivirus software" as a binary checkbox.


So the problem is effective communication? Why can't the sysadmin in this imagined scenario explain their actions this way?


When I have seen this problem, it's because the sysadmins are instructed (or have learned via experience) not to explain their reasoning to developers or end-users. Because if they did, it would become a discussion or argument that becomes a time sink, since there was very little chance they could change the mandate even if they agreed.

So they become intentionally opaque to move that discussion out of their laps and make it come via the development team managers confronting the operations managers and having the fight on that turf.

Such situations occurring is a sign that the organization is not set up effectively. This sort of confrontation shouldn't need to be happening.

Ideally the development team's lead and/or project managers are involved with, are informed ahead of time, or are even contributing to the policy decisions on the operational side.


Because telling someone that you did something because of compliance doesn't help. They still blame you personally even though the compliance standards are usually industry-wide or even defined by Congress as an act of law.


One of our web teams wanted to do a simple Wordpress deployment on LAMP. As sysadmins, that was no issue, even with clustered mariadb. But our DBA team doesn't have any experience outside of Oracle/MSSQL, and squawked about mariadb. After an hour of this BS, the manager of the web team spun up a few EC2 instances and got to work. Of course we don't have anyone familiar with EC2, so supporting that will be a learning curve for someone, but the manager is happy, he has his own sandbox without hassles from the DBAs.


"This might be controversial, but I don't think you get to be a half decent developer without being a reasonable sysadmin."

Couldn't this argument be applied to any developer for any discipline/speciality?

Sure, more knowledge/context is always better if reasonable to attain, but my experience suggests that your above concerns could also be addressed via team organization rather than expecting all developers to know all things.


Not particularly controversial at all, from my POV.

I was a sysadmin with various ISPs in various countries for 15 years before I "turned to the dark side". I'd been using Ruby for a few years with Puppet and Chef, and after dealing with one too many "flaky coders", I picked it up.

I have to say, coding is far more enjoyable, though both come in handy in my day-to-day life.

It sounds like you've dealt with a few "BOFH" sysadmins. Don't worry, we're not all like that, and those that have been on both sides of the team will probably see your way.

Tell your boss I'm available (remotely), by the way ;)


I have seen this situation too many times (exaggerated a bit):

D: I have noticed that task Frobnicate has not been running in Production for a month, then checked and it is not even added to scheduler!

SA: There is no mention of Frobnicate in the pipeline for scheduled tasks.

D: What pipeline? FancyPancyScheduler is bundled with application and tasks are defined in DB, I have done it in Staging and everything worked, Frobnicate is all the fuss in the team, you must have heard about it, why don't you check for changes in Staging?

SA: We have well defined pipeline to manage scheduled tasks, currently the executing agent is Cron, not FancyPancyScheduler.

---------------------

Developers and Admins have more or less the same goals (stable, maintainable and extensible), but on different pieces of the system (code vs infrastructure). In my short career I have seen problems arise where one party makes plans and changes according to current or even past (it worked like this earlier) state of the other party. This applies to both developers and admins.

So I sort of agree with your sentiment that developers need an understanding of system administration. Though, depending on team size, I believe it is entirely sufficient to have someone in Development who understands system administration and the actual infrastructure, and someone in Operations who understands development and the actual stack. This is where I hope DevOps will end up: arbitration between Development and Operations to ensure smooth sailing forwards. Because the debate "I will do it in code" versus "this must be done on the edge" (e.g. static assets in a website: served by the application or the web frontend?) will never be resolved.

Edit: formatting


> Developers and Admins have more or less the same goals (stable, maintainable and extensible), but on different pieces of the system (code vs infrastructure).

I disagree, because this generalizes both developers and admins too much for my own comfort. I've seen sysadmins get really sloppy in the name of getting something into production quickly, out of hubris, without thinking about the full lifecycle of an application (common with developer-turned-sysadmin engineers - I am one, and I tend to be more reckless because most of the errors I've observed would not have been caught by going super slow; adding more test code does not necessarily find the most critical errors, it just increases confidence). And especially in enterprise software, most developers are sitting on features and are nearly allergic to new trends, because their organizations value avoiding revenue loss far above losing growth opportunities.

Of course the stereotype is that operations wants things stable and manageable at the behest of the business, while developers want to deploy new stuff faster (because the idea of development in most places is to create something new). As modern infrastructure becomes increasingly code-driven and emergent, as opposed to manually formed and restrictively managed, sysadmins will have more room for errors, and that may change this in the future. Meanwhile, developers are increasingly under greater scrutiny by society when rolling out features, such that nobody can ignore the concerns, and they may eventually be forced into nearly waterfall-like development patterns. We can already observe this with the infection of Agile with enterprise bureaucracy / overmanagement back into the rest of the software industry, as many of the former smaller, agile tech companies become big behemoths themselves.


The DevOps approach definitely shifts some sysadmin-type roles towards the development teams. That said, there are things that belong in the realm of sysadmin responsibilities, both "old school" - say, setting up DB replication, VPNs or a puppet master server (not that "old school", but still) - and "new": things like Kubernetes and Fleet/CoreOS, for example, have plenty of configuration/maintenance complexity that is better suited to being handled by a dedicated sysadmin.


> thrown code over a wall without having tested it

this is the weirdest part of the whole devops mantra. like, I know how to evaluate the complexity and memory requirement of code way before I write it, and I guess most compsci grads should be able to do the same.

so either it's yet one more attempt at getting cheap labor into workable territory, or plenty of people, wherever this myth originated, are being cheated out of their money for a graduated curriculum that teaches nothing of value.


>the complexity and memory requirement of code

Those are only tiny slices of real production bugs. No amount of complexity analysis of your code ahead of time is going to protect you from all of the issues that arise with integrating any large system dealing with lots of requests. You run into all kinds of things like query optimization, kernel TCP tuning, load balancer problems, cache thrashing, high latency clients, out of spec clients, power failures, etc.

If you think knowing the theoretical behavior of your program in an ideal environment is enough, you are exactly the type that throws code over a wall without having tested it.


funny how most of the things you list are either stuff that can be audited in code alone (query optimization, cache thrashing) or totally out of the developer's control (load balancer issues, tcp tuning)

sure, if you lump them all together like that it might look like you have a point, except it falls apart when you attribute concerns properly.

or please explain, how would dealing with kernel tcp tuning part-time help Joe Random developer write better code?


Query optimization can't be audited in code alone. The indexes you need depend heavily on the database system that you are using in production. Do you know at what point your DBMS stops loading the whole table into memory? Do you know what data structure and algorithm it's using when you do a LIKE query?
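To make that concrete, here's a minimal sketch using SQLite (no DBMS is named in this thread, and the table and column names are made up):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
        CREATE INDEX idx_users_email ON users(email);
    """)
    # SQLite's LIKE is case-insensitive by default, which disables the
    # prefix-index optimization on a default-collated column.
    conn.execute("PRAGMA case_sensitive_like = ON")

    for where in ("email LIKE 'bob%'", "email LIKE '%@example.com'"):
        plan = conn.execute(
            "EXPLAIN QUERY PLAN SELECT * FROM users WHERE " + where
        ).fetchall()
        print(where, "->", plan[0][-1])
    # The prefix pattern can SEARCH the index; the leading wildcard
    # forces a full SCAN, and nothing in the application code alone
    # would tell you that.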

Cache thrashing also can't be audited in code alone without understanding the architecture that the app is going to be deployed on. It's highly unlikely that the servers will have the same processor cache sizes, memory sizes, and NUMA architecture as the dev's laptop.

Load balancers are something a developer should know about as well. A developer has to consider the behavior required by the application (e.g. backend session persistence, headers injected, etc).

>please explain, how would dealing with kernel tcp tuning part-time help Joe Random developer write better code?

Joe might learn that connections aren't as cheap as he thinks and maybe it isn't a great idea for each client to require 50 connections for the app to function. He might also learn that TCP isn't very efficient on high bandwidth, high latency, lossy networks and decide to switch to UDP with error correction.
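If you want to let Joe feel that cost, here's a rough, self-contained sketch over loopback (real networks add a full handshake round trip per connection, plus kernel buffer memory and TIME_WAIT entries):

    import socket, threading, time

    # A throwaway local listener that accepts and immediately closes.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))
    listener.listen(128)
    port = listener.getsockname()[1]

    def accept_loop():
        while True:
            conn, _ = listener.accept()
            conn.close()

    threading.Thread(target=accept_loop, daemon=True).start()

    # Pay the full connection setup/teardown cost 1000 times.
    start = time.perf_counter()
    for _ in range(1000):
        s = socket.create_connection(("127.0.0.1", port))
        s.close()
    print("1000 connect/close cycles: %.3fs" % (time.perf_counter() - start))
    # Even with zero network latency the cost is measurable; now multiply
    # by a 100 ms WAN round trip and 50 connections per client.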

Long story short, a good developer should know everything about the environment in which the app is intended to run. "It performed ideally on my laptop" is throwing code over the wall.

Civil engineers don't design bridges without understanding where the bridge will go. The same applies to software.


> Joe might learn that connections aren't as cheap as he thinks and maybe it isn't a great idea for each client to require 50 connections

so we're back to point one: you need devs that go through basic education, and to stop cheaping out by hiring Joe / or Joe should ask for a refund of his tuition fees.

> snip of stuff that one does not know off the bat

sure but it is knowable, it's not exactly hard. databases are predictable, building indexes in the right places is not an esoteric practice that can only be done by trial and error and rituals etc etc. literature is quite abundant, easy to process and complete with tradeoffs about different approaches and how they impact performance, maintainability etc.

99% of programmers aren't breaking new ground.


>so we're back to point one: you need devs that go through basic education

There is no basic education that covers the associated costs of a TCP connection in the kernel of a modern operating system or in the load balancers it passes through on the edge of the network.

>sure but it is knowable

So you're saying it is important for a developer to understand the infrastructure the code will run on. Thank you.

The reason I brought up all of those points is because they are things not covered in CS educations and they hammer "hands off" devs all of the time.

I've worked with tons of junior devs from all kinds of good schools (Stanford, MIT, UC Berkeley, etc) and they almost always get bitten by this stuff because they throw their code over the wall and don't make an effort to understand the operational environment. It has nothing to do with a good education, it has to do with a mindset of not operating in a vacuum.


True for most devs that do any backend work.

Many devs out there who work with Windows or do mostly front end often have little experience in that domain.

Seeing a lot of work get done at uni by students: one who actually did some backend work (a friend did a blockchain project recently) in fact did very little backend discovery; the job of getting the env up and running was delegated to another student.


> Maybe my experience is unusual, but I've never worked anywhere that the sysadmins knew more than the developers about how best to run their code in production, or, when things go wrong with it, how best to find the cause of the issue.

I've had the exact opposite experience. In most of the organizations I've worked in the "sysadmins" (mostly Systems Engineers/Operations Engineers actually) were stronger developers than the people who were actually developing the software. But that could just be a title shift, because what I've seen happen is that people who care about systems but have a development background gravitate towards operations roles and end up filling in as the "actually Senior" developer for the dev teams.

In the 15 years I've been doing this, I've only occasionally met someone who has stuck hard to the development side of the house but actually is competent when it comes to systems. Most developers have zero care about any of the lower level things like networks, hardware, and even backend software/databases which are required for their application to succeed. A common scenario is that the devs choose an inappropriate backend stack because they chose the easiest things to deploy rather than what is best suited for the use case. Then when things blow up, they beg for an ops team to be created, which usually starts by hiring people who are competent enough developers they can relatively painlessly replace the entire backend with something sane (e.g. Mongo to Postgres shifts are commonplace, because Mongo is a dog in the real world).

> The worst sysadmins get in the way of developers. Ones that scale down your CI server to the cheapest, most throttled one the hosting company has, leaving $800/day contract developers waiting nearly an hour for builds that run in 20 seconds on their laptops. And then they argue the toss about whether the CI server is cost effective, and every few months scale it down again despite the CTO saying it needs to be left alone.

Yeah, that does sound terrible. I agree. My top 5 jobs as a systems person are the following, in priority order:

1. Make sure production stays up for our customers so we keep making money. (5 9s targets)

2. Ensure the security (and compliance) of our systems so we don't get hacked and we maintain customer expectations about compliance.

3. Ensure the performance of our product/systems is up to customer expectations.

4. Make sure deployment automation is solid and streamlined so that deployments are frictionless.

5. Make sure new code is actually being deployed regularly and remove impediments to deployment so customers get features faster.

You'll notice a trend here I'm sure. The most important thing is the customer, then the developer. The biggest frictions I've seen between systems/development teams is when the development team believes that their desires/needs are the highest priority. The systems team is /not/ there to be at the beck and call of the development team, it's to be at the beck and call of the customer who is paying the company money. As much as possible I try to ensure the development team is having a frictionless experience, but if something will negatively impact the customer it is 100% my job to throw a roadblock in the way of the development team to prevent that. The customer of the company is my priority, and everything else is secondary.


I believe in small cross-functional teams, but I don't agree that you need to be a sysadmin to develop. Perhaps it's more your opinion of what a good developer means; most teams benefit from variety, in my experience. It sounds like you're biased towards certain types of organisations where there's a big gap between departments.


> This might be controversial, but I don't think you get to be a half decent developer without being a reasonable sysadmin.

It is an interesting exercise to generalize this statement in the context of general engineering.

It seems either your conclusion is held to be incorrect, or, we reach the conclusion that software development is not engineering.


There's no value in arguing over the semantics of "engineering". There are huge differences between software development and e.g. civil engineering, to the point that I would be dubious about any analogy that treated them as the same thing.


Chemical engineering, hydroelectric engineering, and power engineering come immediately to mind as engineering disciplines that deal with active systems that require operational management and control.


Sure, but those are still all very different from software development.


Of course. (In my opinion, /high software/ has more in common with mathematics, music, theatre-film-dance, and architecture than it has with engineering, and /low software/ is beginning to resemble boiler room operations.)

But here, as an example, is my BSEE alma mater: http://eng.rpi.edu/academics

And to this day, we hear about "software engineers" and "software engineering".

Per my OP: "It seems either your conclusion is held to be incorrect, or, we reach the conclusion that software development is not engineering."

Possibly, one reason for the prevalent problems in the pedagogical & human resource fulfillment aspects of the field is due to a miscategorization of the field.


Process engineering (~manufacturing) and logistics (~supply-chain) are not dissimilar to modern software workflow. The basic tools (modular management of complexity, discrete processing, statistics, monitoring, redundancy in processes/providers, feedback) are equivalent. In fact, I feel like a huge part of a successful software career is learning to see the similarities in disparate fields and draw from them positive architectural benefits, while keeping other-profession-spire-dwellers properly onside/placated.


Well, that is certainly correct, but it should be pointed out that one can say that about most organized (pseudo-)industrial production endeavors. But it seems incorrect to posit that that is the 'defining' characteristic of software development.

> In fact, I feel like a huge part of a successful software career is learning to see the similarities in disparate fields and draw from them positive architectural benefits, while keeping other-profession-spire-dwellers properly onside/placated.

Fully agreed. In fact that has been my guiding light in my own approach to software development. To clarify my view, I think software, very much like architecture, is a polyglot yet distinct discipline. It is not engineering. It is not mathematical logic. It is not process engineering. It is not logistics (provisioning). Etc. (Just like architecture is not civil engineering. It is not philosophy. It is not art. It is not environmental systems engineering. Etc. It is architecture.)

-- p.s. edit --

I would like to bolster my earlier statement that software development has more in common with architecture, theatre, film, etc., than with engineering:

I would like to propose and roughly define a notion of 'semantic gap'. A sort of soft measure of the degree to which the formally expressible definition of a 'production' falls short of permitting the realization of the 'product' without the intervention of the 'designer'.

With that definition in hand, I propose that "engineering" disciplines are those creative productions that have minimized the semantic gap to a degree that permits strict divisions of labor in the production.

Whereas the "arts" are those creative endeavors that are faced with an intrinsic constraint on the degree to which the semantic gap can be minimized, and this maximally reducible semantic gap requires subjective and/or contextual 'interpretation' of the formally expressed design.


I like your semantic gap notion; however, I am less convinced that overall mutual comprehension is the issue. Rather, the issues are clarity of expression of vision (at the earlier/design stage) and clarity of interface (beginning at the implementation stage).

By way of example, there are many successful artistic projects that utilized the talents of multiple artists in parallel (lots of murals and mosaics, for instance).

In larger scale computing projects, frequently the (mechanics of the) interfaces provide bigger problems than the vision statement or overall goal, whereas in artistic projects indefinable aesthetics may be the showstopper, despite perfect comprehension and collaboration.


>Often this stuff goes more smoothly where the developers manage production (I mean, it's not as though, if you're going to drop one of the two groups of people, it's going to be them going) and there aren't people with separate job titles and the resulting friction between them.

This is not personal criticism, but you know how I know you're not working in a highly regulated environment? Check out the Carnegie Mellon Capability Maturity Model (CMM) as a counterexample of where some companies go. Development is not at one remove but two from production support. There's an "operate" team between them and production environments, and in a regulated environment operate doesn't have privileged access either. That'll be a third team, due to separation of duties requirements.

Now imagine you're paged out to a call where your code is slow or failing and you're not even allowed to login to where the issue's happening. Fun, right?

This is why I'm absolutely loving the devops changes we're seeing now - because developers can control the environment without retaining control of it. My ideal is to apply some sensible defaults (no, you can't have all my crashdump space for your app logging; ask for more disk instead; no, you can't run GHOST/glibc/POODLE-vulnerable versions of libraries) and otherwise let the developers spec the OS as a template or dependency for their app. It's much better for me, since if I'm required to troubleshoot I know my requirements are met, and otherwise the developer may do as they wish. Everyone wins, and my control requirements are satisfied, because remember: developers are never allowed production access in regulated environments.

>Maybe my experience is unusual, but I've never worked anywhere that the sysadmins knew more than the developers about how best to run their code in production, or, when things go wrong with it, how best to find the cause of the issue.

I guess it depends on what you mean? The developer is in the best position to know what logging there is and how to enable it or increase verbosity. But they may be completely ignorant of how the operating system's TCP stack, memory management or other mechanisms work. Have you ever had to explain to someone that a Java out of memory error had nothing to do with the fact that Linux is using otherwise idle memory to buffer I/O and that they're misreading top output? That the actual issue is their object management and just increasing the JVM's heap size is at best a bandaid?
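For the top-misreading half of that, here's a minimal, Linux-only sketch (field names come straight from /proc/meminfo; MemAvailable needs a 3.14+ kernel):

    # Memory that top shows as "used" is often just the kernel's
    # reclaimable page cache, so a JVM OutOfMemoryError says nothing
    # about the host being out of RAM.
    def meminfo():
        fields = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, rest = line.split(":", 1)
                fields[key] = int(rest.split()[0])  # values are in kB
        return fields

    m = meminfo()
    print("MemTotal:       %10d kB" % m["MemTotal"])
    print("MemFree:        %10d kB  <- looks scarily low..." % m["MemFree"])
    print("Buffers+Cached: %10d kB  <- ...because of reclaimable cache"
          % (m["Buffers"] + m["Cached"]))
    print("MemAvailable:   %10d kB  <- what's actually usable" % m["MemAvailable"])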

If you have a developer who insists every issue is the operating system, sometimes the SA has to know how to dig in and run stack traces, probe tools (systrace, dtrace, whatever), jmx queries, etc until they can pinpoint the offending code.

As another example if you have an application that isn't draining queues quickly enough and therefore sending back tcp zero window frames upstream, what's the solution? A hypothetical lazy developer will say "it's the OS not queuing enough data, increase the OS buffers." A hypothetical lazy SA may say "it's the app not consuming packets quickly enough, rewrite the app."

In reality if we've all been paged to a priority one bridge the solution will probably be the combination of the two - tactical fix of increasing buffers to create some time for development to understand why the code isn't doing what it should and fix it.
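The tactical half of that fix, sketched on Linux (the sysctl path and the doubling behaviour are Linux-specific; exceeding the cap needs SO_RCVBUFFORCE and CAP_NET_ADMIN):

    import socket

    # Kernel-wide ceiling on what setsockopt() may grant:
    with open("/proc/sys/net/core/rmem_max") as f:
        print("net.core.rmem_max:", f.read().strip())

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print("default SO_RCVBUF:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 4 * 1024 * 1024)
    # Linux doubles the requested value for bookkeeping and silently
    # caps it at rmem_max, so always verify rather than assume:
    print("after asking for 4 MiB:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))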


It's funny that you bring up CMMI in the discussion. CMM(I) is nearly the antithesis of Deming's approaches toward quality management. What's really interesting is that Deming's approaches were adopted by Toyota decades ago to historically great effect (sadly with few other large named successes in the business world), while Taylorist approaches (including CMMI) were adopted by a handful of other Japanese zaibatsu and especially by US companies back to the 19th century. These companies have had vastly different growth trajectories over time, but when it comes to quality, most consumers in surveys will associate Toyota with it over Hitachi, Mitsubishi, and Fujitsu (I believe all of these companies are full-blown CMMI 2.0+ adherents and champions). Similarly in the US, what has Six Sigma really done for companies that have adopted it? GE is hardly known for anything in the public eye resembling technical chops, for example, and most studies show that more than 70% of companies that adopt Six Sigma lag the S&P 500 upon adoption, with no long-term recovery afterward either (perhaps Six Sigma adoption is not a cause but simply correlates with poor performance, similar to private equity oftentimes getting a bad rap in the public eye).

When even the US military - one of the world's foremost investors in management and leadership research - has largely abandoned command and control (the military equivalent of Taylorism) we really need to ask whether structures that enforce a management/worker caste vs. one that empowers those closest to a problem are effective beyond any meaningful scale.


No doubt valid points about CMMI; I was exposed to it as part of a program to improve quality and in that specific instance it was constructive. However the program office oversight was shut down as all portions of the business were certified as "level 2" and without that level of structure and control most of the process died within a year.

Even so, my original point remains - in some kinds of highly regulated shops there's enough external pressure for controls and separation of duties that the developer simply cannot have access to production. I'm not defending either practice (CMMI or separation of duties), I'm just saying in some places it's reality, regardless of perceived drawbacks or overhead.


> Have you ever had to explain to someone that a Java out of memory error had nothing to do with the fact that Linux is using otherwise idle memory to buffer I/O and that they're misreading top output? That the actual issue is their object management and just increasing the JVM's heap size is at best a bandaid?

I've not seen a professional developer be confused about Linux using otherwise idle memory to buffer, no. I have seen that with sysadmins who mostly look after Windows boxes and were somewhat unfairly dropped in at the deep end.

I've seen JVM heap OOM errors be caused by both object management issues and applications that would legitimately benefit from larger heap sizes. Many, many times.

I think I'd fall off my seat if I saw a sysadmin use JMX to find an issue. I did see a security guy (so not really a sysadmin, but he was doing a related job) use strace once. He was remarkable enough to have his own Wikipedia page.


> Have you ever had to explain to someone that a Java out of memory error had nothing to do with the fact that Linux is using otherwise idle memory to buffer I/O and that they're misreading top output? That the actual issue is their object management and just increasing the JVM's heap size is at best a bandaid?

Is there somewhere to read about this? I think this might have come up with one of our projects. (At the time, from googling, I suggested they try mark and sweep - I didn't really have any idea, but was of the opinion they had lots of small objects.) I don't have much experience in Java but was trying to be helpful!


> but I don't think you get to be a half decent developer without being a reasonable sysadmin

I think you've just described me. And I don't see the argument against sysadmin.

> leaving $800/day contract developers waiting nearly an hour for builds that run in 20 seconds on their laptops.

Sorry I just can't take you seriously.


Honestly, I've heard worse out of sysadmins in some places.


Question: What must go wrong for a build to be 180x slower on a server than on a laptop?


Maybes: Packaging latency in archive formats (compress before upload, decompress after). Network latency on the upload/download. Block IO performance on the server. Virtualization overhead. Memory or processor constraints. Assumption of equivalence is spurious (eg. server is doing multi-architecture builds and full suites of tests including eg. regression tests). Yep, something like that.


Using an EC2 t2.nano instance for a build server.

When they're out of CPU credits it's game over.


Overcommit of resources, causing near constant thrashing to disk.


Hi, I'm a Sysadmin, and I've been a grumpy one through a larger part of my 15 years of experience. My main issue was that Developers were acting like Users: they don't care about what you have to deal with, they want things to 'just work'. In return, I've treated them like children, and in some instances yelled at them when they did dumb stuff. I've tried to educate them when possible, and was angry when the education didn't stick. At the time I was the 'King of the Hill' type of sysadmin - natural leader of a very small and tight team, kind of irreplaceable, and with enough years in the company behind me to consider myself a demi-god.

When I switched companies, I came across better developers. Some had decent sysadmin skills, but the main difference was that they actually took interest in how things worked past the 'git push', and when I asked / required them to make some changes that would make my life easier, they listened, discussed and adopted them when appropriate. With those same guys, I took interest in what they were doing, what their actual job was, and came up with ideas that would make things easier and run smoothly on both ends. After a while I figured out that they weren't actually better developers - they were better people. (Also, I figured out that being grumpy was not the best approach, and that patience, kindness and gratitude could get people to do more than snark, humiliation and flame-throwers.)

I guess my point is: you don't really NEED to have sysadmin skills to be a decent developer; what you really need is to care about what sysadmins do - be curious, talk with them and trust them when they say that your brilliant idea won't work in production.


I think there are definitely developers out there that give little to no respect to systems administrators.

I've seen this ignorance even in college professors. My first programming class in college was a CS class that had both CS and IT students in it, since it was required for both. The (CS) professor kept trying to convince students how much better CS was, and gave some good arguments (i.e. salary), but the most arrogant thing he said was that IT is a subset of CS and that by doing a CS degree you would understand everything it takes to be in IT. He also said that in IT you would be constantly fixing other people's computer problems, but as a software engineer you wouldn't need IT's help since you could fix things yourself.

The funny part is that part-way through my degree I realized the college didn't even offer a real CS degree; it was called "CIT with Computer Science Emphasis", and none of my advisers or professors mentioned that it would cause issues getting jobs outside of Utah. The best thing I did was leave that school and finish my CS degree elsewhere, which cost me a lot of credits and almost felt like starting over. I feel like I got scammed, but that's beside the point: I have yet to work for a company where a software engineer gets to manage his own computer without following IT guidelines, like my CS prof had described.


I couldn't agree more.

I didn't realize the importance of all the "admin stuff" before our newly hired sysadmin came to me and asked if I could help him figure out how to deploy the project I was working on. This ended up being a looong chat about monitoring, redundancy, architecture, security... you name it. What I've always thought of as installing and configuring software turned out to also touch on designing the software so that it works reliably and is easy to maintain.

I don't think I'll ever have plenty of sysadmin skills, but knowing even the general idea of what's important to sysadmins helps a lot. Also, being able to become another interruption in their day and consult them on ideas is priceless. :)


This, so much this. You don't need to be the sheep with 5 legs every manager wants, what you need is to be accessible to collaborate with people on problems.

That's an individual skill as well as a systemic one, though.


I've worked as both a dev and sys admin, and I think this is the most reasonable response so far.


I think it's more than that -- it likely stems from how the IT department is run at a particular company, too. I've dealt with many sysadmins that offer absolutely no transparency into their processes, and in many cases actively obfuscate it. So any attempt to take an interest seems to be interpreted as some sort of threat to their position.

Either way, this is a two-way street. And often times the culture of one group or the other gets in the way. Which is really unfortunate.


Mostly this.

Having been an admin myself, gradually moving more and more towards development, I understand what things in software are annoying for an admin to have to deal with. Most developers simply don't care. Grant full permissions or don't expect anything to work. Any objections and you're a troublemaker. A better attitude would have gone a long way, but firsthand experience works best.

In addition, it helps me greatly when there is no (decent) admin around. I know whether to suspect the software or the system it's running on, how to keep things running on a less than ideally configured/maintained system without completely compromising security, can help users when the problem they're having is not a problem with the software, but a problem to them anyway – they love the extra mile – et cetera.

It must be said that some admins are just as shortsighted. Knowing what kind of measures actually work for stability, security, and so on, I've come to strongly dislike those who only complicate the situation to no benefit, as well as those who point their finger at the software when it really is their system that's causing problems.


As a sysadmin, I love the devs who think "devops" and I make a point of saying nice things about them to the CTO.


As a developer with some very basic sysadmin knowledge, I'd say you have to have enough sysadmin skill to set up and administer your own system, but not enough to keep it running and deal with attacks and security in the OS. (Obviously, attacks and security in the software you're writing are still your responsibility.)

I say this because if I had to wait for a sysadmin every time I wanted to see if something worked, I'd spend a lot of time doing nothing. And it's likely that I couldn't even solve a lot of problems.

So I think you not only have to know what they do, but some of how to do it.


As a grumpy evil sysadmin, I think the good Professor misses where the real disconnect is, at least nowadays: stack management.

Why do things like Docker exist? Because developers got tired of sysadmins saying "sorry, you can't upgrade Ruby in the middle of this project". Why does virtualenv exist? A similar reason.

Containerized ecosystems (which is to say basically all of them now) are really a sign of those of us on the sysadmin side of the aisle capitulating and saying that developers can't be stopped from having the newest version of things, and I think that's a bad idea.

15 years ago, when a project would kick-off, as a sysadmin I'd be invited in and the developers and I would hash out what versions of each language and library involved the project would use. This worked well with Perl; once the stacks started gravitating to Ruby and Python it was a dismal failure.

Why? Because those two ecosystems release like hummingbirds off their ritalin. Take the release history for pip[1] (and I'm not calling pip out as particularly bad; I'm calling pip out as particularly average, which is the problem): in under two years, pip went from version 1.5.6 to 8.1.1 (!) through 24 version bumps, introducing thirteen (documented) backwards incompatibilities. Furthermore, there were more regression fixes from previous bumps than feature additions. You'll also notice that none of these releases are tagged "-rc1", etc., though the fact that regressions were fixed in a new bump the next day means they were release candidates rather than releases. Ruby is just as bad; the famous (and I've experienced this) example is that an in-depth tutorial can be obsoleted in the two weeks it takes you to work through it.
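For what it's worth, the usual defence against that churn is pinning exact versions so a rebuild can't silently move the target - e.g. a requirements.txt kept under version control (versions here are illustrative, not recommendations):

    # requirements.txt - generated with `pip freeze`, reviewed by hand,
    # and only bumped deliberately, one dependency at a time.
    pip==8.1.1
    requests==2.9.1
    SQLAlchemy==1.0.12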

Devs are chasing a moving target, and devs who haven't been sysadmins may have trouble seeing why that's a bad idea.

[1]: https://pip.pypa.io/en/stable/news/


As a Sys Admin turned Automation/Tools Engineer, I think you're missing part of the point. You've got the beginning of it right in saying that Sys Admins used to be involved in pinning down versions, and even in why that was necessary, but I believe you're incorrect in saying that the containerization technologies are bad for removing that.

Those technologies don't exist so Developers can get around Sys Admins and ignore your helpful advice. They exist to solve the problem that made the Sys Admin's role there necessary. They remove the underlying need for a Sys Admin to worry about the versions. Admins should see this as a good thing, but in my experience many dislike it because it takes them out of their Gatekeeper role. We shouldn't WANT to stop the Developers from having the newest version of things. They aren't kids playing with toys that we need to nanny over; they're doing work that creates value, and the fewer things we do to get in the way of that, the better.

If something breaks due to version changes, their testing should catch it. If things are breaking in production, we ought to get involved because there's some other problem, but before that we, as a profession, need to learn to get out of the way and let people work by letting technology handle the problems. The "Gatekeeper" mentality needs to die as quickly as it possibly can.


> They aren't kids playing with toys that we need to nanny over

In my experience (university), yes they are, and they should do that at home.

Why do you need the latest bleeding versions in the first place?

In my sysadmin experience, people believe software gets bad and deprecated as soon as the glorious next breaking version appears. I don't think I need to argue why this is an illogical stance.

With my developer's hat on: bumping to the next version mid-process reliably introduces more friction than it's worth. People think the next version solves that one weird issue but ignore that it introduces two new ones and that the software must be changed to fix five new incompatibilities.

But the solution is reliably to just not use the weird feature that caused the bug in the first place, and to think about what a clean solution would have been. And guess what: the result is a cleaner and more compatible code base. It's a tip that works for me again and again: if there is friction, think before spending the next several hours on an update that will soon lead to new problems.

It's great that you can for example compile Linux without too much friction. It's great that arcane shell scripts can run on any system. Stability (in a compatibility sense) is not a nice-to-have, it's basic sanity.


My comment to you two is -- why not both?

Stability, sanity, all that is amazing, and a must have.

But also bug-fixes, security improvements, and performance improvements are wonderful too, which tends to come with using up-to-date dependencies.

The problem with the latter, as you mentioned, is when it introduces breaking API changes and is wholly not backwards compatible. This is not a "kids playing with toys wanting to experiment" problem; this is a bad-software problem, which is why I like Go, and why I liked Java when I was doing it full time. If the language you use treats backwards compatibility as a first-class citizen, most likely the package authors will act that way too, and then the maintainers, and eventually the developers. Limit your software choices to those who care about not breaking everyone's shit every 2 weeks. Heck, even when I write my own APIs now, knowing only my company is going to use them internally, I am thinking about this.


Backwards compatibility is seriously underappreciated. When I tell developers to ensure that their changes are backwards compatible, they tend to look at me like I'm green.

I do not understand the disconnect that developers have with understanding all of the benefits that it brings. Yes, you have some extra code in your code base so it's less clean. You also have a stable environment as a result. The first affects only your personal preference. The latter affects all of your developers and users.

Unless you have a situation where it's impossible to maintain, not insisting on it is pure self interest.
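
The "extra code" in question is usually nothing more exotic than a shim. A minimal sketch in Python (the function names are invented):

  import warnings

  def fetch_rows(query):
      """The new, preferred API."""
      ...

  def fetchRows(query):
      """Deprecated spelling, kept so existing callers don't break."""
      warnings.warn("fetchRows() is deprecated; use fetch_rows()",
                    DeprecationWarning, stacklevel=2)
      return fetch_rows(query)

Old callers keep working, new callers get nudged; the cost really is just a few "unclean" lines.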


> I do not understand the disconnect that developers have with understanding all of the benefits that it brings.

Because they've never worked on an old codebase: front-end technologies change so often that everything just gets re-written anyway. It's a waste of time worrying about this when the code won't make it to its first birthday.

If you were speaking to seasoned C and DB developers about stability in the tools and the platform, you'd be preaching to the choir.


This gets to the complaint that so much of the open source ecosystem gets to version 0.8.6 (whether it's named that or not) and then gets completely rewritten "this time the right way". That's not actually a good thing.


As jwz put it,

> It hardly seems worth even having a bug system if the frequency of from-scratch rewrites always outstrips the pace of bug fixing. Why not be honest and resign yourself to the fact that version 0.8 is followed by version 0.8, which is then followed by version 0.8?


> Yes, you have some extra code in your code base so it's less clean. You also have a stable environment as a result. The first affects only your personal preference. The latter affects all of your developers and users.

Backward compatibility has real costs. You cannot restructure your code base as easily, you cannot deprecate bad ideas, you cannot extend it as easily, and so on. Sure, it also has real benefits (as you've stated), but highlighting the advantages while missing the disadvantages is not a useful approach; it only shows "your personal preference".


> Yes, you have some extra code in your code base so it's less clean. You also have a stable environment as a result. The first affects only your personal preference. The latter affects all of your developers and users.

No, the former results in a bloated code base full of old legacy crap that no one understands and everyone is afraid to touch because it might break. You have to insert weird workarounds because that bug is now a feature to some idiot, and since you provide backwards compatibility it lives forever.


Has software quality increased now that everyone is refactoring and rewriting everything for every release?

I don't think so.

It's just plain arrogance to believe you are a better developer than the guy who came before. Having some fear of breaking the code base is healthy, the same way that having a little fear that the chainsaw is going to cut off your leg makes you safer.


I like your answer. It's civilized, balanced, and I agree with every word of it :-)

I'll add as an anecdote that I do follow your practices (limiting dependencies). It works wonderfully on Debian stable (most of the software there is now >2 years old, the next version has just been soft frozen). I have the occasional package pulled from testing: I recently toyed with Perl6. And currently I use a newer version of python3-sphinx for a nicer doc syntax but I could do without. It causes no headaches at all.


For one thing, if there is any place to treat software like a toy and to play around with the latest version, it would be a college or university.

I don't particularly care about having my software on the latest version. I personally prefer using the old version for six months while the newest version gets the bugs worked out of it.

I know sysadmins value reliability and security, but it's really frustrating when every upgrade takes dozens of hours of work to approve. Questions like "What features do you need in the new version" miss the point. It isn't about the features of the software, it is about maintaining a modern code base.

Upgrades always have the potential to break things, but when you keep up with the upgrades it is easier to achieve the stability and security goals the sysadmin wants. When you upgrade often, it is easier to read the documentation and find where changes might break something, and when things do break it is easier to fix them. Upgrades that jump over several versions at a time are a nightmare to debug, and they create a lot of technical debt that you have to work out later.

Ultimately, sticking with a version of software because it works is trading a little stability now for an absolute mess down the line.


I said it elsewhere, but personally I'm a Debian stable evangelist. There is one major upgrade every 2 years or so. It often goes without friction. The rest is mostly security updates. Breakage between major upgrades is very rare.

I don't think this thread is about "maintaining a modern code base" at all, whatever that's supposed to mean -- my impression is that you've fallen victim to the hype train.

In my perception the thread is about always catching up with the latest and greatest. Would you say in all earnestness that my code is not modern because I make a point of developing against solid standards and don't constantly long for things that are not in my distribution (the software there is usually 0.5-2.5 years old)?

You can check some of my code at https://github.com/jstimpfle. Is it "not modern"? I'm a reasonable but not outstanding developer, and not saying that everything will work on your computer (since I'm usually the only tester) -- but I'm pretty sure I can get everything there to run on your computer with minor effort.

> When you upgrade often, it is easier to read the documentation and find where changes might break something, and when things do break it is easier to fix them.

No. Breakages are less frequent because the software is not brand new, and they are better known because all people using the stable release are on the same version. Documentation comes with the distribution, but I don't have any problems googling it by giving the version string either.

> Upgrades always have the potential to break things, but when you keep up with the upgrades it is easier to achieve the stability and security goals the sysadmin wants.

This thread was never about security, and I don't buy that argument. I don't think you are familiar with the concept of a stable release.

> Upgrades that jump over several versions at a time are a nightmare to debug, and they create a lot of technical debt that you have to work out later.

No. If you develop against solid standards you have less breakage. It's not about incompatibility with the most recent versions. That would be a stupid idea. It's about compatibility with releases other than the latest and greatest. This means not depending on the hot new features that are only in these versions, simple as that.


> I said it elsewhere, but personally I'm a Debian stable evangelist. There is one major upgrade every 2 years or so. It often goes without friction. The rest is mostly security updates. Breakage between major upgrades is very rare.

That's fine for an OS, but what do you think business customers would say if you said "sorry, that feature won't be added until the next release in 2 years' time"? That's where tools like pip come in: they let the software move faster, which it often needs to.


> what do you think business customers would say if you said "sorry, that feature won't be added until the next release in 2 years' time"?

We say that all the time; we have a two-year release cycle. And in our field (aviation) that's considered breakneck.


So what exactly are the latest-and-greatest libraries that you absolutely need to implement your own business critical features?

Please list more than only one. It's simple to make exceptions for exceptional requirements.


Well for me, all of the libraries I use because none of them exist as packages for any OS.

With most of them security fixes will only go into the latest version though, so once you get behind your system is insecure.

Applications aren't something you build and forget. An unmaintained project is a dead one.


> Well for me, all of the libraries I use because none of them exist as packages for any OS.

In conclusion you don't use any libraries that are packaged for any OS.

What libraries?

Also assuming that some libraries you use don't exist for your OS, that doesn't mean that you absolutely need the latest and greatest in a business critical way. So, not approved.

All in all, not too fond of the reasoning and the evidence you provide.


Your logic would prevent just about any app from just about any non-C ecosystem from running. Java, ruby, python, dotnet, rust, go -- they all have their own library management, and very few of those libraries will be available in an apt repository (let alone a compatible one).

Your policy may work in a university, but you'd be fired from any real business.


You are still not providing evidence.

You're also making bold claims ("any app from just about any...") that my personal experience simply cannot validate. It's very easy to write applications without fancy dependencies. Recently I did algorithms, systems and applications in C, C++, python, sdl, alsa, X11, Unix shell scripting, lp_solve, and some web programming in python, javascript, sqlite3. All rock solid and stable -- all of it will probably run on any Linux box from the last 5-10 years (python3 only arrived around 2008; forget about the lp_solve bindings and just use the command-line tool).

There are 2050 python3-* packages on my system. Not that I think it's a good idea to use most of them. What's "compatible"?

So what are the libraries you absolutely need? What is this week's secret sauce?


.net MVC, nUnit 3, jquery, knockoutjs, and nHibernate to name a few. I could name a dozen similar tools on the java stack. Pretty soon I'll be experimenting with rust with libui.

>There are 2050 python3-* packages on my system. Not that I think it's a good idea to use most of them. What's "compatible"?

2000 of them are random versions someone made a package for that are unknown to the core team and probably not receiving updates.

Your list of projects sounds like typical academic ones, not tools used by businesses that employ most software developers.

You're also missing the other benefit of these tools, that we develop against the deployed version. There are no compatibility issues because we develop on ubuntu and host production on redhat.


If you consider making websites, and tools and web applications for internal processes, "academic"... I also did some contract work where I created a server and a client for displaying advertisement media on commodity screens, in a tiled fashion. The tools were all there -- C, python, bash, X11, some media libraries -- and the versions were all fine (I worked around a bug in mplayer, though).

jquery, knockout: I don't know the first thing about bundling dependencies for client-side javascript code (I don't go super fancy there and don't need jquery or knockout -- I've tried writing single-page apps by hand and they are hugely complex), but anyway, isn't that independent of server installations? Don't you bundle these libraries in-tree? If so, it doesn't relate to the discussion.

.NET... It's MS, do you run on Mono? How does the question of requiring the latest version apply?


> There are no compatibility issues because we develop on ubuntu and host production on redhat.

Sorry, but that is a hilarious argument, almost straight from Gentoo is Rice: https://fun.irq.dk/funroll-loops.org/


> Your policy may work in a university, but you'd be fired from any real business.

Nah. If we define "real business" as something with a decent turnover, employing over 250 people, and being in business for over 8 to 10 years; a business that isn't actually in the business of writing software (the majority of what makes up global stockmarkets, or "real business" in most peoples' eyes) then you will find that OP's attitude and policy-making philosophy is right on the money. (Source: Was CTO in exactly the above type businesses for many years)


> Why do you need the latest bleeding versions in the first place?

Because the newest version has several features that we would like to take advantage of immediately?

Look at PHP 7.0 which introduced return types, and 7.1 which introduced nullable return types. These are features I really want in my application, so we upgrade.


No offense intended, but your university probably isn't competing for top engineers. Grad students and postdocs aren't professionals yet either.

The sysadmin role has traditionally been a focus in that environment (e.g. controlling access to cluster resources).


Define "professional". And let me claim that "top engineers" are actually the prudent ones - which you didn't refute.


I think we agree on the prudence of professional engineers.

The definition of 'professional' is up for debate, but I'd encourage people to weigh in on the following criteria (which are IEEE's, not mine):

- an appropriate engineering education background (ABET/EAC)[1]

- at least four years of engineering experience in your field and under the supervision of qualified engineers

- passed two exams (the Fundamentals of Engineering [FE] exam, which is now a computer-based test available essentially year round, and the eight-hour PE exam)

- kept current by as a minimum meeting your state's continuing education requirements.

-- http://insight.ieeeusa.org/insight/content/careers/97473

[1] I think it would be worthwhile to consider apprenticeships, equivalent to the 'law office study' path to attorneys' bar certification.


I think the pain of keeping any project fully up-to-date is (sometimes far) less than the pain of updating an out-of-date project.


If it is, your project has a problem.


> Those technologies don't exist so Developers can get around Sys Admins and ignore your helpful advice. They exist to solve the problem that made the Sys Admin's role there necessary. It removes the underlying need for a Sys Admin to worry about the versions. Admins should see this as a good thing, but in my experience many dislike it because it takes them out of their Gatekeeper role. We shouldn't WANT to stop the Developers from having the newest version of things. They aren't kids playing with toys that we need to nanny over; they're doing work that creates value, and the fewer things we do to get in the way of that, the better. If something breaks due to version changes, their testing should catch it. If things are breaking in production, we ought to get involved because there's some other problem, but before that we, as a profession, need to learn to get out of the way and let people work by letting technology handle the problems. The "Gatekeeper" mentality needs to die as quickly as it possibly can.

Well, that's all well and good when it's framed so that "Gatekeepers" are viewed as blockers.

However, we "Gatekeepers" are the ones that get paged and / or yelled at by a CTO when an application keels over. Not the developers. The developers get to sit in their sandbox (otherwise known as "production" in 2016/17) of ever-changing library versions that were only rapidly tested in QA. Then they play a game of Starcraft II, scan HN and go to bed. When something runs out of memory or crashes in the middle of the night, we get paged. So, hell yes we should be involved in the process.

Sincerely,

Gatekeeper


Sounds like an organizational failure to me.

When I started at my current company the traditional silo between dev and systems was there (although we were allowed to deploy our own stuff) - they managed everything we ran our apps on and we just deployed to servers they had already configured.

Over the past ~3 years we've made a lot of changes: the department manager for our IS team is present in our daily standup calls to relay information between our two teams, and we now have a couple of separate VMWare clusters dedicated to our applications, with the VMs running on them our responsibility for the most part. We are the first to get called for issues with our applications, and where necessary we work collaboratively with our systems team to resolve them - we don't throw blame around; it does no good.

I should add most of this is only possible because we have real DevOps people on our team (well, really, it's just me right now - we lost our other one and still need to hire a replacement) - not developers who know enough to copy a blob of crap to a server to run, but people who have real skills in both aspects. We are trusted to maintain things because we can do it right, and while it took a lot of work (and some unfortunate infighting) to get to this point, both of our departments are working great with this arrangement.

There are still kinks that need ironing out. We've not done an adequate job of writing documentation so our systems team can help with some failures (primarily on our Linux VMs; our whole systems team is Windows admins) if we aren't available - but it's on the radar, as is getting PagerDuty set up to escalate alerts to them if we don't respond in time (like having our PostgreSQL data volume fill up over the weekend, not a call I wanted to get at 10AM on Sunday).

So yeah, fix your culture issue, get people communicating daily between your teams, share responsibility for issues instead of placing blame.


> However, we "Gatekeepers" are the ones that get paged and / or yelled at by a CTO when an application keels over. Not the developers.

And that's why people are moving away from that model. It's part of the reason DevOps is being embraced as a model. Developers should be on call to support the applications they build. You get benefits all around.


"Should" is the key word here. As a sysadmin, I'd love to work closer with devs, especially during outages. Unfortunately, every time we bring up on-call, the room goes silent. This is very anecdotal, but IME devops has just become a way for devs to bypass sysadmins.

I wonder how many companies are doing it right vs doing it wrong? Any anecdotes from a proper devops group?


and we get yelled at when we can't deliver a feature because some gatekeeper is sitting atop his little throne in the kingdom of servers saying no. ;)

This is institutional failure.


Then place the blame on the gatekeeper. As a sysadmin, I'd be more than happy with you pointing the finger at me as the reason why you can't deliver a feature. Assuming, of course, that you've run the proper tests and gotten QA's approval.


Honestly. Developers should stop thinking that their job is to release features all the time at all costs. That's simply not true, and it's counterproductive for the business.


If only the business might learn that it's not their jobs to request new features all the time at all costs...

If only customers weren't fickle and might learn not to demand new features all the time whatever the costs...

It's turtles all the way down.


And in the end, it's the developer who always makes the call: does he ship half-assed, half-finished work every single time, or does he take the time to do some testing and not break production?


Developers do what the business wants them to.


Example?


The problem with your characterization of things is companies like the one I'm in where us "Developers" have replaced the gatekeepers entirely. We have no dedicated SysOps, and yet our production environment stays up just fine.


> It removes the underlying need for a Sys Admin to worry about the versions.

No, they really don't: they remove the ability of system administrators to administer versions of software across the total system.

This is bad, e.g. when a new OpenSSL vulnerability comes out (it being a day ending in -y) and every piece of software has to be updated.

> We shouldn't WANT to stop the Developers from having the newest version of things. They aren't kids playing with toys that we need to nanny over; they're doing work that creates value, and the fewer things we do to get in the way of that, the better.

I am a developer, and I disagree. We are, by and large, kids playing rather than adults making carefully considered decisions. We'd rather use v3.0.rc-1-awesome than 2.17.12, because the former is the version that adds an API that saves us from writing twenty lines of code, never mind that it is also untested, unstable and very likely insecure.

We need adult supervision. We need oversight. That's why I argue for using stable, LTS-style distributions, and running against the distro packages unless there is a very good business reason not to (and yes, 'we can't implement necessary functionality in a cost-effective timeframe' is a valid business reason). I'm not opposed to using the bleeding edge when it makes business sense; I'm opposed to developers using the bleeding edge because they like it, and keeping the business in the dark.


I agree with OP and would amend his statement to "15 years ago, when a project would kick-off, [sysadmins & architects would] be invited ..."

From an architectural point of view, microservices take the reductionist approach to system design to an absurd limit, and per my professional experience (fwiw & ymmv) are due to the general architectural illiteracy of the rank and file practitioners in this field.


Right, microservices isn't "architecture". It is - whether they know it or not - an admission that "we can't do architecture so we'll chuck it over the fence and let the ops people worry about how it all hangs together".


> microservices isn't "architecture".

Yes the 'no-architecture architecture'. It's very Zen. /s

> how it all hangs together

In case you are interested in rescuing :) young but promising talent in the field, next time you find yourself in a discussion about microservices "architecture", point out what a single-node application looks like under this approach: every function is a process, the call stack requires IPC, and the 'linker' is considered obsolete and outdated technology.


I once worked on an application that - no joke - comprised 5000 VMs each of which was running one "service" in a dedicated JBoss. It was laughably bad.


At what point do you determine who is responsible for securing the environment? The "Gatekeeper" mentality stems from this. There is no clear line in any organization, and I see the blame game all the time.


This is exactly the example I give when I try to explain to someone that managing through personal responsibility, as opposed to team and organizational responsibility, will grind your company's productivity to a crawl.


That's kind of like asking which employees at a bank are responsible for keeping cash in the vault. Hopefully it’s a group effort.


There are actually generally rules for which employees have responsibility over the vaults in banks. The employees aren't equally responsible. Roles and responsibility are defined strictly. Generally, the lowly tellers aren't allowed the same access to the money as the general manager, and they aren't held responsible to the same degree if money in the vault goes missing.


This has actually been a very insightful response to me.

After reading your comment it now occurs to me that Docker and other container systems are actually a huge organizational tool. One issue I have encountered at companies is keeping the IT and development departments on the same high level organizational incentives to keep political barriers from coming up between them (and conflicts arising).

Containers can help keep everyone's incentives aligned because system admins can focus on the actual administration aspects of the systems and infrastructure (things devs do not need to be concerned about, like vnet layouts and whatnot) while devs can focus on actual development and deployment without having to have everything confirmed and approved by the IT department.


At almost every shop I've ever seen, "sysadmins" are also the ones whose responsibility it is to at least attempt some sort of security practice and business continuation. Which opaque, "just run it" containers actively fight against. Did the developers actually audit what they've pulled in as dependencies? Did they make sure that they can be rebuilt if whatever package source goes away? Where is everything documented? Containers, as usually implemented, rather than "keep everyone's incentives aligned", instead damages the ability of the adults in the room to keep everything from falling apart.

(I have been both the sysadmin saying "no" and the developer mad at sysadmins saying "no". But going slower and doing our homework has never, ever hurt me or my employers.)


I've never seen sys-admins verify the security of dependencies outside of the major ones (like the language VM version). Security-wise they have been much more concerned with the security of data storage systems (databases, elastic search, etc...), the operating system, and the network in general.

Security of the application is very much the responsibility of developers, not system admins, as the developers have the best vantage point for understanding the implications of the software they are developing and integrating with.

If there are routine violations of security at the application level that aren't being caught by the developers working with those systems, then the company as a whole needs to sit down and make sure the development teams have the proper security procedures in place. Putting a department in charge of security that has all the accountability but no power to remedy the situation is a recipe for political fights between departments, and a disaster. Proper code reviews and experienced team leads should be able to catch more security issues than sysadmins will.

If your sysadmins are in charge of security review of the application, then they have to be in charge of security review of every low-level dependency at the individual package level. Otherwise your developers won't think about it because it's not their problem ("IT will review and let me know if anything's bad"), and it encourages them to lack accountability for the security of their own software.


Doesn't that constant updating and lack of version oversight create security risks?

Developers may not be as aware of those topics as sysadmins.


Isn't it just as bad to keep versions locked for months and years so we don't upset the delicate balance of versions on the production env?

I've worked at a place that was running PHP 4.4.9 until about 8 months ago. And they were upgrading to 5.4! I get that it was work to convert a lot of the older code base to 5.4, but it was already past EOL when they were switching to it. And 5.5 wasn't far behind it.

So in the near future they'll need to upgrade again (though they probably won't), and they'll probably jump to 5.6, which EOLs in two years (so they'll probably do it two years after it EOLs).


The lack of updating is also a massive security risk.


That's terrible. If I'm a Python dev, why should a sysadmin who doesn't even know Python tell me what version of a library I will use?

I have regression tests to catch if an upgrade breaks anything. What does a sysadmin have to approve or deny an upgrade? A little beard stroking and changelog reading?

I think the movement towards containers is, like you said, about keeping sysadmins off the code. Sysadmins add value in setting up the infrastructure and keeping it running. They subtract value when they want to tell developers what version of a library to use.
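
To be concrete about what "regression tests instead of beard stroking" looks like, assuming a pytest suite and the usual requirements.txt layout (both of which are my assumptions; any test runner works):

  $ pip install --upgrade -r requirements.txt   # take the version bump
  $ pytest tests/                               # the regression suite gates it
  $ git commit -am "Bump dependencies"          # lands only if the suite is green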


> why should a sysadmin who doesn't even know Python tell me what version of a library I will use

Because he maintains that installation and you don't? But, yeah, that's why virtualenv, Docker, etc. were invented, because devs kept getting sick of installations having consequences.

> What does a sysadmin have to approve or deny an upgrade?

Check for conflicts of this version of this library with other software currently in use (by other developers maybe, or even by the same developer). Add it to the watchlist on the dozen or so security mailing lists and newsfeeds he checks daily. Read the changelog and look for implementation problems. Read fora and look for performance problems people are reporting. Yes, beards get stroked during this process, but time and again we see that developers refuse to do this, and wind up coming to us when they break something because of that...


Developers tend to fall into the trap of believing we are better than sysadmins, but there is immense value in having talented admins around. I have recently experienced this first hand, watching someone breeze through server archaeology and virtualization tasks that I struggled through and might never have been able to accomplish in a reasonable amount of time. When admins and programmers recognize each other's strengths and play to them, it is a rewarding experience. We just have to realize that we're on the same team.

Also, docker (and containerization in general) is a wonderful thing for both of us. It decouples the fickle apps from systems (also moving targets) and the other apps which are constantly seeking out new and creative version incompatibilities. It makes migration and maintenance a much less frustrating endeavor with fewer surprises along the way.


>Because he maintains that installation and you don't?

So why is that an acceptable mentality for "in-house" developed software but if you buy something proprietary from a third party where you have zero say over what lib/langs are used, it's A-OK?

>Check for conflicts of this version of this library with other software currently in use (by other developers maybe, or even by the same developer).

That's not the case when using containers properly. Every service gets its own environment, so whatever version of lib-xyz is needed, even if incompatible with other parts of the project, is walled off for only the service that needs it.

>Add it to the watchlist on the dozen or so security mailing lists and newsfeeds he checks daily.

Ok this is where I completely agree with you as we have been working on this at our company. My personal solution seems pretty logical though so hear me out.

1) Build a Dockerfile that fully documents the install of your service as well as any OS-level dependencies. Ensure that any config files are external to the container to allow sysadmin access. (A sketch follows after this list.)

2) Document in a central location (say, an internal wiki) the external services, servers, repositories, developers, and admins responsible for the service.

3) Automate builds of containers from the repo and add automated testing post-containerization.

4) Sysadmins monitor repositories for changes to Dockerfiles, and wiki articles for new services, databases and libraries, taking note of library versions. If an issue with a particular lib or service is discovered, the config files can be edited to point to a new service, or a new container build can be triggered with zero changes to the source code but a forced update to the OS packages for the container.

In a tight situation where a developer might not be available on-call, the sysadmins have more control over a similar proprietary product but don't have a workflow for messing with source code (which they are likely not familiar with regardless).

There are solutions to the issues you raise (often trivial ones at that), they just require an adjustment to workflow and an increase in communication between developers and their sysadmins.
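
For step 1, a minimal sketch of such a Dockerfile (the service name and paths are invented):

  # Dockerfile -- doubles as install documentation for the service
  FROM python:3.5-slim     # a pinned, vetted base image, not :latest
  COPY app/ /opt/app/
  RUN pip install --no-cache-dir -r /opt/app/requirements.txt
  # config lives outside the image so sysadmins can change it without a rebuild
  VOLUME /etc/myservice
  CMD ["python", "/opt/app/main.py", "--config", "/etc/myservice/service.conf"]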


I see your point, and like I said I agree this is why Docker was invented and it's the best-in-breed at what it does (namely, being a tourniquet for a self-inflicted wound). My biggest concern really isn't "my problem" since I'm in ops: it's the leftpad worry. I still have teams starting Dockerfiles with "FROM centos:latest" because that's just the mindset they have: "Latest will fix any problems" rather than "Latest will introduce new problems".

And, ultimately, Docker lets that not be my problem, because they have to deal with it when the next leftpad happens. So, yeah: they should have at it. I guess I still think there's something to be said for the cathedral pace, though.


>I still have teams starting Dockerfiles with "FROM centos:latest" because that's just the mindset they have: "Latest will fix any problems" rather than "Latest will introduce new problems".

Well there are two ways to approach this IMO.

One: be proactive. Create a vetted centos (or whatever OS) base image for them to build off of.

The problem is, if you don't keep on top of it as a sysadmin, the developers will just figure out another way to wall you off.

Alternatively, accept that it doesn't matter what underlying OS they use, because a patched OS >> an unpatched one, and because, done correctly, the jailed nature of containers means minimal exposure even when the service has an exploitable lib.

Assuming "latest introduces new problems" too often builds an aversion to patching which can lead to worse issues down the road.

>And, ultimately, Docker lets that not be my problem, because they have to deal with it when the next leftpad happens. So, yeah: they should have at it.

Exactly! The only one responsible for libs are the parties directly leveraging them. Not that developers shouldn't make that info known. It has to be documented to remove the bus-factor of 1, and if it isn't the sysadmins should work with the devs to get it documented.

> I guess I still think there's something to be said for the cathedral pace, though.

I think it depends a lot on your resources as a department/company. You should always execute as quickly as feasible given your team size and work load. Otherwise technical debt has a way of piling up faster than you can offload it.


At my organization, leftpad is not a reason for SRE to tell developers they can't use dependencies. Instead, leftpad is a reason for SRE to run internal package mirrors for all our supported packaging systems (debian, pip, glide, Maven, etc) and ship forks of the build tools so that when you reference a 3rd party dependency, the URL is rewritten to one at our internal mirror. The internal mirror, in turn, goes out and downloads anything it doesn't already have.

They also maintain the base docker images that we're expected to use, as well as the docker build infrastructure.

Facilitation with guardrails, not blockers.
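
For the pip case, the URL redirection can be as small as a site-wide config file pushed to every build host (the mirror URL here is made up):

  # /etc/pip.conf
  [global]
  index-url = https://pypi.mirror.internal/simple/

Developers keep their normal "pip install" workflow; the mirror quietly caches whatever they pull.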


I have some perspective as someone who has done a bit of all of these jobs over the last 10 years as well as working in a hosting company that handled release management for large Fortune 500s.

> So why is that an acceptable mentality for "in-house" developed software but if you buy something proprietary from a third party where you have zero say over what lib/langs are used, it's A-OK?

Proprietary software generally has a support agreement and SLA for fixing things instead of getting the response "it works in dev!"

> That's not the case when using containers properly. Every service gets its own environment, so whatever version of lib-xyz is needed, even if incompatible with other parts of the project, is walled off for only the service that needs it.

That's why containers are great, but you have to remember most of the world isn't as fast as this community to adopt things, a lot of things are still being managed the hard way on shared servers with literally thousands of dependencies. Migrating to containers in these instances can't happen fast enough.

> There are solutions to the issues you raise (often trivial ones at that), they just require an adjustment to workflow and an increase in communication between developers and their sysadmins.

Implementing even trivial changes to processes that impact hundreds of people across multiple continents is often not trivial. Devs in India, devs in the US, hosting teams, release management, etc. A lot of those people are doing just enough to get by and not up-to-date tech wise, so not only are you implementing new tools and processes, but you're building out training programs around using them, etc.

These processes are old and will be modernized in time but that's the reality for a lot of "sysadmins."


>> Check for conflicts of this version of this library with other software currently in use (by other developers maybe, or even by the same developer).

> That's not the case when using containers properly. Every service gets its own environment, so whatever version of lib-xyz is needed, even if incompatible with other parts of the project, is walled off for only the service that needs it.

This illustrates why developers should have some experience with administering systems: do not deploy unrelated services on the same machine.

And you know what happens as a byproduct of this rule of hygiene? Suddenly the version conflicts disappear, at least for things that aren't broken anyway.


That's the tail wagging the dog (and I manage the sys admin department at my company).


> Because he maintains that installation and you don't?

Or not. Your devs are on call, aren't they? They are maintaining their own software, right?


> why should a sysadmin who doesn't even know Python

Because a Python sysadmin has been through all the transitions of packaging systems, all the nasty corners of "backwards compatible" changes, and knows how underlying changes to the operating system will affect your code, how the storage behaves under load, and why one tech is not "better" than another. If you really hired an admin (cough, "reliability engineer") for a Python codebase who doesn't know Python, well, that's a different question altogether.

> I have regression tests to catch if an upgrade breaks anything

You don't know what you don't know.

When you can reason about the multiple ways the above statement can fail, congratulations! You are now a seasoned sysadmin, the scorn of junior developers who just want to get things done (who incidentally read a great blog post the other day about a new packaging system that we should immediately transition to and by the way it's all backwards compatible).


>You don't know what you don't know.

That's the point of regression tests. The sysadmin also doesn't know. Unless he's the one writing the tests (and IME he's not) or he's painstakingly regression testing everything by hand (trust me, he's not doing that either), making him a gatekeeper for all library upgrades achieves very little except adding bureaucratic friction.


Look, do you want a gatekeeper or not? For your small little web project you don't need one, and a little downtime probably isn't catastrophic. But as soon as you are under audit rules you need it, and we call this specialized role the admin. When you grow bigger this will likely branch out to a dedicated change manager, at which point you hopefully have other specialized roles for security and other things as well.

I understand this does not make sense when you are not more people than can fit around a table, but as you grow you will feel the need for more and more specialized roles to fit the changing requirements. The first specialized role is probably the sysadmin (devops, reliability engineer, whatever you call it), and he or she should preferably be the one on the team with the most knowledge of how things work "under the hood", because that person is the one who can save you when things go haywire. Unless you trust this person to be more knowledgeable than you are in those areas, as they rightfully should be, you're going to have a problem.


>Look, do you want a gatekeeper or not?

No, ideally not - that's the idea behind https://en.wikipedia.org/wiki/Continuous_delivery

Where gatekeepers are required (because regression testing is not yet fully trusted enough for continuous delivery), QA should be the gatekeeper, not sysadmins.

>For your small little web project

My comments are based upon working on projects with a turnover of > ~1-1.5 million USD / day.

>But as soon as you are under audit rules you need it, and we call this specialized role the admin. When you grow bigger this will likely branch out to a dedicated change manager

Every time I've worked with somebody whose role was "change manager" this role was introduced:

* As a response to repeated downtime in the past caused by some kind of idiocy.

* They were required to "sign off" on releases purely as an added bureaucratic step to cover some manager's ass.

* They never once prevented or caught a production issue.

* They always slowed down releases.

> The first specialized role is probably the sysadmin (devops, reliability engineer, whatever you call it), and he or she should preferably be the one on the team with the most knowledge of how things work "under the hood", because that person is the one who can save you when things go haywire. Unless you trust this person to be more knowledgeable than you are in those areas

Ironically the whole idea behind devops (which I fully agree with) is that it should not be a specialized role - developers and ops teams should be blended.

This is precisely because if the two teams are separate and one throws code over the wall to the other then things will go wrong. Then a manager will insist on a gatekeeper.


I think a lot of this debate argues for the sysadmin role being part of the dev team. The only real way to satisfy both constraints (production stability and up-to-date fixes/features) is to have fast feedback between the stakeholders on the two sides.

In the python-specific case -- the requirements.in / .txt files for the virtualenv should be part of the software VCS, but the sysadmin should be able to edit & pin things just like the devs, so that they can bring their expertise to the container, rather than having to fight it.
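
That division of labor is roughly what the pip-tools workflow gives you; a sketch (the package choices are illustrative):

  # requirements.in -- loose, intent-level constraints, editable by devs and sysadmin alike
  requests>=2.9,<3
  celery~=3.1        # sysadmin pinned us to the 3.1 line after an incident

  $ pip-compile requirements.in   # emits requirements.txt with every version pinned

Both files live in the VCS, so a sysadmin's pin shows up in review like any other change.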

---

Mind you, my opinion might not scale - I'm part of a small enough team that I'm holding both those roles, but I try to make sure to spend time wearing both "hats", so that one role doesn't hog all the man-hours clocked.


IMO if a sysadmin wants to have visibility into the requirements.txt that's fine.

If they want to enforce a policy of pinning versions, that's very welcome (though I would do that anyway).

If they have specific, relevant comments about upgrades of specific packages - again, fine (though in practice they never do).

If they want to be a gatekeeper for changes to that file they can fuck off.


Another perspective: Why don't you want your software to be compatible with the system that your sysadmin provides? (Assuming that system is not completely obsolete).

Minimize your dependencies. It's incidentally also what leads to clean code bases.


>Minimize your dependencies. It's incidentally also what leads to clean code bases.

Oh hell no. I have wasted far too much of my life maintaining buggy, technical-debt-ridden reinvented wheels where there was a well-maintained package that could just have been used instead.


> Minimize your dependencies. It's incidentally also what leads to clean code bases.

You also crush velocity. Smart use of libraries lets you ship code 10x faster. Two identical businesses: one writes all their own code, one is smart about using libraries. Which one makes it to IPO first, and which goes bankrupt?


Are you arguing that you can keep velocity while building up technical debt?

The company "smartly" using libraries might stuck maintaining a monster of dependencies that only ever was meant to be an MVP. It will require 10x more engineers and while they might move fast at the beginning, they will only slow down over time.

The company minimizing its dependencies and paying attention to its stack will be able to add complexity over time without breaking a sweat. Its costs will be a tenth as much, and it will be able to run profitably.

I am not anti-library or anything, but e.g. adding SciPy to your python project just because you need a gaussian function in one place in your code is just lazy.

Managing dependencies wisely is one of the hardest things in software development. It's right after cache invalidation and naming things ;)
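
To make the SciPy example concrete, the gaussian in question is a few lines against the standard library (the parameter names are mine):

  import math

  def gaussian(x, mu=0.0, sigma=1.0):
      """Normal probability density at x -- no SciPy required."""
      return (math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
              / (sigma * math.sqrt(2.0 * math.pi)))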


In my experience, the vast majority of problems we've had with our mobile apps, especially on Android, have come from a developer deciding to use some random SDK to solve a simple problem because he didn't want to take the time to write it himself.

Then the app is broken or has memory crashes, or the final binary is 10x the size it needs to be. 9 out of 10 times it's a third party library. This is why I ban the use of them unless absolutely needed.


That only holds true (as much as it does) if you assume the goal is always an IPO.


If the world has already written 90% of the code I need, how likely is it that the remaining 10% is valuable enough to make a viable business?


Really likely. Code doesn't mean much; it's mostly about how you present it.


To call out something that nobody else mentioned: because when you go to use that newer version of pycurl or pyASN1, it's probably going to break the system-level tools for running patches, handling license enforcement, and keeping the database online for other teams.

I've had this problem over and over again (the last big one being with the US Census). Folks insisted on upgrading a python library, and auth to the hosts stopped working.


> That's terrible. If I'm a Python dev, why should a sysadmin who doesn't even know Python tell me what version of a library I will use?

What if the sysadmin can code in the same language as you, faster and with fewer bugs?


Better yet: what if the sysadmin can code better and faster in the same language and also three others?


That would be me!


  > 15 years ago, when a project would kick-off, as a sysadmin 
  > I'd be invited in and the developers and I would hash out
  > what versions of each language and library involved the
  > project would use
To wit: the new ideologies in infrastructure management are actually designed to solve the underlying problem that necessitated that kind of working setup. Why should the version of a lib in one part of the software somehow pose an existential threat to the infrastructure? Engrain the dependencies into contained, independently deployable pieces, and make it so that app-level code can evolve without bringing down the world with it. Make it easy to revert, and/or utilize phased rollouts, and you've got the ability to iterate quickly and keep pace with external dependencies, and it no longer has to be some scary thing that requires big back-and-forth meetings over mundane details.
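
Concretely, the "easy to revert" property mostly falls out of immutable, versioned artifacts; a sketch with invented registry and image names:

  $ docker build -t registry.internal/app:1.4.2 .   # an immutable, versioned artifact
  $ docker push registry.internal/app:1.4.2
  $ docker run -d registry.internal/app:1.4.2
  # rollback is just redeploying the previous tag; nothing on the host changed
  $ docker run -d registry.internal/app:1.4.1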

(As for software that releases often, maybe it's an over-correction, but there's a reason things don't work as they did in the glory days, and that's because they were never really that glorious.)

This doesn't necessarily rule out the expertise of systems administration, because the platforms for all of this need to be built and maintained, and there's still a lot of work to be done on network border security, etc. It's a movement that refocuses systems administration on systems administration, instead of having to be this big org arbiter of microdecisions, with all the baggage that goes along with trying to be the gatekeeper of all.


> Why should the version of a lib in one part of the software somehow pose an existential threat to the infrastructure?

Because that's how software developers wrote every dominant packaging system :P

There are tradeoffs to self-contained units. Disk space isn't so much of a practical concern these days, but security is very real: with a dozen apps, you could be at the mercy of a dozen different entities to update their embedded OpenSSL libraries.


Or the statically compiled application that "just works" and is "so easy to build and maintain". Lookin' at you, Golang.


Counter-wit: why can't people get their software to work with the existing libs? Hint: it's very rare that the existing libs are actually disqualifying.


This is the core of the problem between devs and sysadmins. Sysadmins come from a mindset of a polished, working system which never needs to change. They deliver stability and reliability to the business.

Devs come from a mindset of actively creating change, to add new features and deliver new value and product to the business. As a dev I do have to say that many devs don't have enough experience in operations to properly understand how to help sysadmins; many don't understand the complexities of that job.

These two perspectives are at odds, and they should be. The new tools, like docker, start giving everyone what they want... Devs pick their dependencies, and in theory, can't stomp on the sysadmins pristine environment.

To respond directly to your question: because there are new things available in new libraries that allow us to develop new features!


> To respond directly to your question: because there are new things available in new libraries that allow us to develop new features!

If it were only that, we would have an easy time. The new things you need to develop new features are few and far between.


> The new things you need to develop new features are few and far between

99% of web software written these days could fulfil identical use cases on an IBM 3270 from 40 years ago. You enter something into a form and it gets stored in a database. You enter something into a field and it generates a report. That's all Amazon, Facebook, Google, any e-commerce site are.

Sure it might be nice to use a new version of that new JS framework that all the twitterati are going crazy about, but does it deliver value to the business that justifies the risk and investment?


And yet none of those things did arise 40 years ago. All of the nuances of all the code written since then make a difference, despite duplicating "identical use cases".


Amazon was founded in 1994, so over 20 years ago and somehow they managed to succeed without AngularJS 3.7 or whatever the fashion of the month is.


You didn't come up with that idea, but it's not about "40 years ago".


People could do great things just with punch cards, yet somehow technology kept marching on.

If developers want to use newer stuff usually they have a good reason. The ability to hack around the deficiencies of old dependencies does not mean that one couldn't get a better, cheaper solution with newer technology.


> People could do great things just with punch cards, yet somehow technology kept marching on.

That's not the situation I've described - punch cards disqualify.

The situation I mean is where developers insist on writing software on version X, which doesn't compile on X-1 and is buggy on version X (and might not compile again on X+1). For a concrete example: new C++ features that aren't correctly implemented, and that lead to harder-to-read code and worse error messages when applied to the day-to-day problems these features were never meant for.


If it is as you say, then why upgrade ever? How would we even discover bugs in software until it is used?

To have progress we need to change things. When we change things, we may break things, regardless of tests.

To quote Dijkstra: "testing can be a very effective way of showing the presence of bugs, but it is hopelessly inadequate to show their absence" (from "The Humble Programmer").

Production is the only way to eventually discover the stability of any software, even with 100% test coverage. It's a necessary evil in the support of progress.


All I disagree with is testing bleeding edge third party software by heavily depending on it in your production systems.

Software needs to be tested. But your view that the whole world needs to jump on it at once is very black-and-white.


That is not my view. If you're relying on prerelease software, you're definitely playing with fire.


Pick your poison.

If you run into a bug or problem with a 3rd party component (open source library, commercial tool, whatever), one of the first things they are going to ask you to do is upgrade. The fact you're on an old version of some library is an easy (and sometimes correct) scapegoat for problems.

Put yourself in the 3rd party's shoes: if you spend a bunch of time trying to fix a problem that turns out to be a bug in a separate library that's already been fixed, that's entirely wasted time.

The same goes for direct usage: you're likely to spend time fixing problems that have already been fixed.


Upgrading the version of the library wouldn't be a problem if the concept of stable ABIs were as prevalent as it was 15 years ago. Back then, the major.minor version number system was used as a signal that it was safe to upgrade to a newer version of a library without worrying that the entire application stack was going to come falling down around itself because the developer of said library decided to rework some part of it without providing any backwards compatibility.

Put another way, a sysadmin could feel confident that moving from 1.52->1.53 would be a painless and transparent operation and that the provider of said library would continue to release 1.x branches with little ABI changes for some length of time. The expectation was that at some point the library provider would release a 2.0, which would require a more careful testing/deployment schedule likely with other upgrades to the system.

Today, that is all out the window; very few open source projects (and it's infecting commercial software too) provide "stable" branches. The agile throw-out-the-latest-untested-version mentality is less work than careful plan/code/test/release cycles followed by fix/test/release cycles.

This is a major rant of mine, as upgrading the vast majority of open source libraries usually just replaces one set of problems with another. Having been on the hook for providing a rock solid stable environment for critical infrastructure (think emergency services, banks, power plants, etc) I came to the conclusion that for many libraries/tools you had better be prepared to fix and backport bug fixes yourself unless you were solely relying on only libraries shipped in something like RHEL/SLES (and even then if you wanted it fixed fast, you had better be prepared to duplicate/debug the problem yourself).


> Put another way, a sysadmin could feel confident that moving from 1.52->1.53 would be a painless and transparent operation

This is what Semantic Versioning [1] aims to achieve, but as you highlighted, it still requires the maintainer(s) of the project to actually deliver stable software, regardless of what the version is. I think some people took "move fast and break things" a bit too literally.

A project that follows SemVer and has good automated test coverage is definitely on the right track though, and in general should be a pretty safe upgrade (of course it's important to know their track record).

"Move fast and break things ... in a separate branch with continuous integration running an extensive test suite" isn't quite as catchy but is what should be happening.

[1] http://semver.org/


Was there no automated testing that allowed you to go from 1.52 -> 1.53 with some degree of confidence?


> The same goes for direct usage: you're likely to spend time fixing problems that have already been fixed.

That depends on whether it's a feature or a fix release. Feature releases might or might not include bug fixes, but they typically include new bugs. I welcome localized fixes; however, they are not as common, because of constrained resources. (Fix releases are the idea behind Debian stable. Of course it only works to an extent.)

A different perspective: I prefer to keep the bugs that I already know about, and know how not to trigger.
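For what it's worth, PEP 440's compatible-release operator expresses exactly that preference mechanically; a minimal sketch, again assuming the 'packaging' library:

    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    fixes_only = SpecifierSet("~=1.52.0")   # shorthand for >=1.52.0, ==1.52.*

    print(Version("1.52.3") in fixes_only)  # True: bug fixes welcome
    print(Version("1.53.0") in fixes_only)  # False: new features mean new bugs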


Because those libraries have bugs, sometimes catastrophic ones. Sometimes they must update, due to API changes or other factors outside of their control. If your organization relies on keeping things static as a means to stability, one day that rule will have to break, and you may be pretty underprepared for it.


Because many of these old library versions go unmaintained.


> Why does virtualenv exist? A similar reason.

The reason why virtualenv exists is because different apps may have conflicting requirements, and you have apps that need to be deployed in different environments with different versions of different libraries. I know that even if I were developing against versions of libraries in system packages, I'd still end up having to use virtualenv in development (EDIT: I wrote 'production' here by accident) because my stuff gets deployed on different versions of Debian and RHEL, necessitating virtual environments if only so that I can make my development environment as close to production as possible.
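The stdlib 'venv' module (which has since absorbed most of virtualenv's job) shows the idea in a couple of lines; a minimal sketch:

    import venv

    # create an isolated interpreter + site-packages under .venv, with its own pip
    venv.create(".venv", with_pip=True)

    # after activating (". .venv/bin/activate"), installs land only in .venv,
    # so app A can pin libfoo 1.2 while app B pins libfoo 2.0 on the same box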

> In the year 2015, pip went from version 1.5.6 to 8.1.1 (!) through 24 version bumps, introducing thirteen (documented) backwards incompatibilities.

Much of that has been down to efforts in recent years to finally fix the major issues with Python packaging. It has settled down quite a bit. Also, the 1.* to 8.* change is because the initial '1' was dropped: 8.* is essentially 1.8.* in the old versioning scheme.

I'm not saying that this couldn't have been handled better, but it's not just a 'hummingbirds off of their ritalin' situation: Python spent many years with packaging stagnated, and what you're seeing is rapid development to fix the mess that years of PJE-related neglect caused.


> Because developers got tired of sysadmins saying "sorry, you can't upgrade Ruby in the middle of this project".

As a Ruby developer, I can only laugh at this particular example. No Ruby project I've ever worked on ever upgraded their gems midway through a project, much less the version of Ruby. Developing procedures for this kind of ongoing maintenance is just way too much to ask.

This stuff tends to get done years after the original devs have all moved on. Maybe they tried that kind of thing back in the early days, before I started working with Ruby, definitely not today.


The time I'm thinking of was a team that wanted to switch from Rails 1.2 to 2.0 along with whichever interpreter bump was required to make that happen (IIRC 1.8.5 to 1.8.6, but this was a decade ago; I'm pretty sure 1.9 hadn't come out yet). Halfway through a project.


Unreal.

Yep, that sounds like that 'long time ago' I was talking about. Nowadays you can do that, no sysadmin to tell you not to, but nobody bothers.


DevOps here - yes, they do bother. Pinned set of dependencies, but one of them updates? Upgrade all of the dependencies. But don't worry, it's all in a Docker container (I have a completely separate rant about Docker's compatibility-ignorant hummingbird).

Ironically enough, I think the current DevOps culture emerged partially because sysadmins got tired of saying no (if only so they could sleep through the night), so now they let developers tie their own nooses so they can be woken up at night.

It's wonderful to hand all of those software pages back to developers. And the developers do seem motivated to fix the bugs which wake them up at 3am, so it turns into a win all around. It's still hard to watch a new team come up to speed though, knowing how little sleep they will be getting over the next month because they made their new Docker program stateful...


I upgrade Python deps all the time. Java too. How often depends on the scope of the project.


Ironically, all-aboard-the-update-train was the actual reason I jumped off IIS back in the 90s, when I got badly burned by updating Windows NT. Automatic database connection pooling for IIS was dropped, and I started getting annoyed phone calls from clients whose websites were dying after updates.

One had to read MSDN every day to keep up with what might break on sites you had no control over.


The problem is that sysadmins don't know everything.

> in the year 2015, pip went from version 1.5.6 to 8.1.1

The only releases in 2015 were 6.x and 7.x.

There were 8 documented backwards incompatibilities, 4 deprecated the previous year, and 3 documenting a couple bugs that were fixed several days after the 7.0.0 release.

These are the sorts of thing an aware Python developer will know.


You're right; it was the period from December 22nd, 2014 to March 17th, 2016, so about 15 months centered around 2015.

We may be counting regressions differently; I'm including both adding and removing the spinner as a regression, for instance (since both the addition and removal added unexpected behavior).

Note that the undeniable regressions that occurred in releases during those 15 months included:

1. Exceptions raised in any command on Windows

2. Switching from not installing standard libraries to installing them back to not installing them

3. Blocking if the particular server pypi.python.org was down

4. An infinite loop on filesystems that do not allow hard links

Note that in that time they also added yet another internal package management system (incompatible with the existing two), changed the versioning semantics twice, and dropped support for versions of Python that were 3 years old at that point.

And, again, there's nothing particularly wrong with or bad about pip; this is just what a younger generation of developers are used to.


> You'll also notice that none of these releases are tagged "-rc1", etc., though the fact that regressions were fixed in a new bump the next day means they were release candidates rather than releases.

Releasing an RC often results in nobody using it, and hence the bug not being found even in several weeks - yet it gets caught almost instantly in a release… At least, that's my experience shipping various RCs that led to next-day regression fixes once the release proper shipped.

Yes, better testing would solve such issues, but at some point the line has to be drawn at "good enough", because there's ultimately a limit to what is reasonable.


Oh, definitely, and I want to be clear I'm not wagging my finger at pip here; it's a good project. It's just that for somebody like me on the far side of 40 (and sysadmins in general are a grayer cohort than devs), that's absolutely not a release tempo that I grew up with. I still shudder when I see a Dockerfile that begins "FROM whatever/latest..." because I have no idea if whatever/latest is going to be the next leftpad.
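The defensive habit is pinning; a hedged sketch using the Docker SDK for Python (the 'docker' package - my choice of illustration, and the image tag is arbitrary):

    import docker  # pip install docker; assumes a running Docker daemon

    client = docker.from_env()

    # client.images.pull("python", tag="latest")  # mutable: whatever got pushed last
    image = client.images.pull("python", tag="3.9-slim")  # pinned: same bits every deploy
    print(image.id)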


That might be a bit naive, but isn't that more of a package management problem than anything else? I can tell you that while I know this issue, I also know that on systems like Alpine Linux or FreeBSD it is less of a problem than it used to be, and containers (so, virtualization - even when it is OS-level) are potentially overkill and certainly not a solution. virtualenv and others seem way saner in this case.

For development as a whole it is really great though in my opinion.

Or you simply use something like this: https://bazel.build/


Interesting. Is this because the Perl ecosystem is more mature or because of the philosophy of backwards compatibility?

The "backwards compatibility" philosophy isn't so explicit for the ecosystem, mostly the language? Is the test-on-install-by-default making a big difference there?


That's a good question. I think it's not a coincidence that CPAN predated the widespread use of distributed source control systems whereas pypi and gems blew up just as mercurial and git were unseating svn and cvs. It's a different release philosophy (remember, in the 1990s you often didn't even get to see pre-release CVS commits of open source projects; that was an innovation of OpenBSD).

I also think the widespread use of VPSs rather than accounts on shared servers (again, containerization) was a factor. In the 90s and early 2000s, you usually (even in a corporate setting) had an unprivileged account on a server with a given version of apache and perl, your own cgi-bin directory, and possibly some latitude on a personal CPAN install directory. The lack of containerization meant you had to compromise between using newer software and breaking existing use cases.

So I guess I think it's not so much about Python vs. Perl per se but about the technologies available when those languages became popular among developers.


That would seem to be ironic. As a longtime Perl developer who switched (for pragmatic reasons -- a job) two and a half years ago, my impression is that Python [as a language] is much better suited to a business environment. What makes Perl such a wonderful language, and why I enjoy it so much, is exactly why it blows as a business language. Python's rigidity is very useful if you ever want someone to read and understand code written by someone else. So the idea that Perl ends up with the more mature package management is exactly the opposite of what I would have predicted.

That said, I haven't had any more problems with PyPi packages than I did in the past with CPAN. Yes, pip always wants to upgrade itself, but that sort of every-damn-day software upgrade cycle seems to have become quite prevalent, not just in the Python world.


You mean that Python has just one possible layout standard and so on, I guess? That is a different subject. (And sure, having no coding standard is bad for a project; Python removes that discussion to a degree.)

I think the real problem with Python/Ruby/etc is the surprising lack of an analogue to CPAN Testers.

It isn't just that all of CPAN is tested on different OS/Perl version combinations; it also stress-tests the Perl versions.


I'd say too much effort goes into reasoning about the wrong problem. What worries me the most is the "why": why do (too) many software developers not know about sysadmin work?

I have been involved as a consultant in large software projects in the last two years, and the vast majority of money lost in delays and bugs was caused by devs not understanding:

1) the difference between virtual memory and physical memory

2) the difference in costs of data storage per storage medium

3) the concept of network round-trips

4) hardware bandwidths

5) how to install and configure a web server on a workstation

6) how DNS works

7) how AD authentication works

8) what ORM frameworks do

9) how to write a raw database query (not necessarily SQL)

10) the difference between navigating through database records on a database server vs. an application server vs. a client

11) HOW TO INSTALL THEIR OWN WORKSTATION AND TROUBLESHOOT IT!!!

N) etc. ...and those are just the topics that I can immediately remember.

As I see it, it's not about "they should". For me it's about understanding how so many devs deal with such a level of ignorance of the systems they interact with on a daily basis. This situation hurt my feelings every time it happened, and I struggled to accept it. I am not a sysadmin nor a developer, but my daily work is insanely improved by my (even basic) understanding of how my workstation works and how to manage it.


I've worked with all kinds - from Windows devs who can't figure out how to install Visual Studio, to people who understand Windows, Linux, and macOS as well as basic system administration for each platform. The people who are most successful at rapidly developing high quality software are mostly in the latter group.

Would you trust an RF engineer who couldn't troubleshoot his own radio designs? Why would you trust a software engineer who can't troubleshoot his own software as deployed in a real world environment?


I feel the same. Someone who isn't willing to investigate issues with their own work machine, or to spend some time configuring it, is more likely than not to be less of a continuous learner (imo).

Looking at it short term, a well paid developer troubleshooting all issues on their work laptop could be seen as a waste of resources, but for me software development is a beautiful craft. I also wouldn't trust a carpenter who doesn't obsess over wood and tools. If I overhear two developers comparing notes on their tmux setup somewhere, I mentally upgrade them into the interesting category right away.


I recently questioned someone about this very subject. They wanted to hire a "CSS expert" because the "developer" didn't have a grasp of CSS after having developed the project in JS/HTML. I was so confused as to how that's possible.


There's a large gap between basic understanding of CSS and actually creating good CSS. Personally I avoid touching CSS as much as possible.


I suck at CSS and do a lot of js/html, but most of my work is on the back end. I can do OK with CSS, but it will definitely take me longer than someone who knows what they're doing. Usually we contract out the design/CSS for a few pages and I adapt that for the rest of the website.


CSS is not consistent and complete like a programming language. As someone else mentioned, doing things the correct way in CSS (responsive, cross-browser) is actually very hard, and often requires memorizing weird hacks.


In fairness to the folk who can't install Visual Studio - it's a genuine pain in the butt.

Last time I tried, half of the download links were absolutely non-functional. Their documentation didn't help much, either, since they pointed to the non-functioning links. I got a lot of shit for that one, but felt completely vindicated when it happened to someone else a year later.


> In fairness to the folk who can't install Visual Studio - it's a genuine pain in the butt.

It takes a while but I've never found it to be a pain in the butt and I've used it, off and on, since version 6 in the late 90s.

If the install errors out it gives you an error message, you google for it, figure out what the problem is (most common messages are easy to find solutions for), fix the problem, reinstall, done.


Then you were lucky.

I remember on the first day of a new job being given a folder containing all the MSDN subscription disks in the mid-2000s and being told to install visual studio.

I'd only ever used notepad as an editor before.

This was a massive stack of DVDs with multiple disks, but worse, there were a bunch of CDs listing different versions of a thing called "Visual Studio".

After 15 minutes of struggling and surreptitious googling because I didn't want to look stupid on my new job, a colleague walked by, went "oh", picked out the right disk and said "that's the one you need". And I had to do the same for multiple new starters.

Even today when you have to install something from MSDN, you search for "Office" and get a bunch of irrelevant language packs listed at the top which is definitely not what you want, then also have to know what x86 and x64 means, something a novice will not know, and know what "SP" means and that "SP2" is better than "SP1".


> Then you were lucky.

I worked in 3 Microsoft dev shops and 1 that had a mixture. As far as I know, nobody had issues getting it installed except for the rare, occasional error that could be Googled and fixed. I'm not sure I'd call that luck; it sounds like you just had a bad experience.

But yeah, back in the day it was a stack of discs (I think the last disc version I used had 2 discs for Visual Studio and 4 for the MSDN) but they were always clearly labeled: one for Visual Studio, one for additional add-ons and stuff, and the rest for MSDN documentation.

¯\_(ツ)_/¯


There were 40 DVDs in an MSDN subscription in 2004, so I'm not sure what you're talking about - you might have had the cheapo one?


I've worked on VS since VS 2003.

The first versions (200X) had their challenges at times, starting with locating the installer for the right edition and the license. The minimal setup for a working environment was split across 5 different installers to be executed in order (one VS pack per language + the Windows SDK + the debugger kit + the ATL/MFC package + the driver kit [if you dev drivers] + the DirectX SDK [if you need it]). Then you had to configure some PATH and library settings to link all of that together.

Last I checked, in the 201X editions, a lot has been regrouped into a single setup. That's enough for most development. And the optional packages have auto detection (and it ain't fucked if you run it twice).


OHHH

So the MSDN subscription is different. The MSDN subscription is the full Microsoft catalog of software. Every version of Visual Studio, every Windows, Office, MSDN documentation; it's literally everything.

The parents above were talking about just installing Visual Studio. When you purchased Visual Studio it was usually 2-6 discs in my experience (most containing the MSDN documentation). But the MSDN subscription is a very different beast. Granted, there should still have been a Visual Studio disc for the specific architecture you were using, and your group should have known whether they were using Professional, Team, etc., as you'd likely need the same.

That was a fun misunderstanding though :)


The MSDN subscription has (almost) all Microsoft software - it's tangential to Visual Studio. The MSDN documentation is separate and was only a few CDs.


`choco install VisualStudio2015Professional`. ;)

After all--we're all developers. Automate!


I'm a quite good programmer who is a pretty terrible sys admin.

Of your 11 points, I understand 1-10 quite well, but I'm not great at 11.

I think the skill sets are quite different, despite the fact that a lot of people have both. I was never really that "into computers", but I have a burning passion for building large software systems fast and well.

Metaphor: I love to travel to exotic locations across the planet. That doesn't mean I'm also interested in building airplane engines.

To be clear, I'm not bragging. It would be great if I was good at this stuff.


I don't think the OP is talking about low-level OS stuff like perhaps a flaky hard drive, but about installing the things that turn a vanilla machine into your working environment, plus being able to sort out things when something goes wrong with that environment. If you're in IDE Foo all day every day, you should know your tool enough to fix it when it goes wrong.

I've not had them myself, but I have talked to other sysadmins who've had devs that couldn't install their own IDE. These people weren't seniors, admittedly, but they were still drawing pay...


> the difference between virtual memory and physical memory

In my defense (I'm a dev), OSes don't make it clear. macOS becomes extremely slow when I load a big virtual machine, and yet displays "Swap 450KB, 500MB RAM free". Or with a sole text editor open after a long session it may say "Swap 750MB". In both cases my logic tells me the swap and free memory should read the opposite, so I can't match my knowledge to the OS's behaviour. Then comes Java, which adds another layer of memory limitations.
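A side-by-side view helps untangle the two ledgers; a minimal sketch with the psutil library (an assumption on my part - any OS-level tool works):

    import psutil  # pip install psutil

    vm = psutil.virtual_memory()  # physical RAM: total, available, etc.
    sw = psutil.swap_memory()     # pages the OS has pushed out to disk

    print(f"RAM  total={vm.total >> 20} MiB  available={vm.available >> 20} MiB")
    print(f"Swap total={sw.total >> 20} MiB  used={sw.used >> 20} MiB")

    # 'available' already accounts for reclaimable caches, which is one reason
    # the raw numbers in OS dashboards can look contradictory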

> how many devs deal with such a level of ignorance

I can speak to this because I was ignorant for 4 years, then met the right teams. It's impossible to learn, and to gain confidence in what you've learned, if you start ignorant - and ignorant devs know it. We constantly need help and don't understand how ticking a weird checkbox in Eclipse changes the compilation: without directly executing the original command line, you can't learn anything, and architects in those kinds of companies give you too many proxy tools ("SDKs") that you can't improve. You're on an old 14" screen anyway. Also, Windows is so inconsistent and weird that you just assume sysadmin work is for people from another planet. My skills only took off 4 years later, when I installed Linux, then Mac, and was thrown into open-source libs. It was so easy, in retrospect, and I'm so happy to have been in the right context.


It's more than just swap. The second answer has more detail: http://stackoverflow.com/questions/4970421/whats-the-differe...

Udacity has a pretty good OS summary course. I had personally forgotten what a TLB was until I watched it.


The list of things to learn is endless. The only way we can get anything done on a daily basis is to gloss over the magic being done somewhere else in the system.


But most of the outstanding ones are renaissance people.


> The list of things to learn is endless.

Yes; however, there are items (ACLs, AD, security, OIDs, etc.) that are a little more important than the new shiny JS framework.

I'd rather burn brain cells learning Haskell than trying to understand why so much effort is being spent on JS.


I've been a consultant on large projects and the vast majority of money that I've seen flushed down the toilet was on

1) Building the wrong piece of software/end-users not having enough influence on what gets built.

2) Lack of delegation, having people make technological/feature decisions about a product they only spend 5% of their time thinking about.

3) Organizational incentives not aligned correctly.

4) Not following software engineering principles that were discovered in the 70s (they also don't follow any software engineering principles discovered since then, but I'll give them a pass).


Should sysadmins have software development experience (e.g. DevOps)? For what values of X and Y should "X should have Y experience" hold? Should we go as far as...

"A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects." — Robert Heinlein, Time Enough for Love


Think about frontend designers (user experience people). They should know a little bit about TCP, SSL, HTTP, networking, monitoring, statistics, and everything related to frontend markup bloat, bandwidth, compression, etc.

I've worked with many of them: very good with Adobe products and the latest Apple gear, but clueless about why their final code is of poor quality.

Fortunately, many of them toned their egos down when the web started to move from markup (HTML+CSS) to code (JS). Until then, they took any feedback from a sysadmin as an attack on their work.

About developers and architects... they should simply be the final support (24/7 calls included) for their work and changes.

As a side note... a sysadmin IS A developer (a systems developer). An application developer should usually also be a systems developer if he/she is working on a project that includes a platform to run on (unless he/she is just uploading an app to an app store). An operator with limited privileges is just that. The concept of a "pure developer" or "pure sysadmin" is so 199x...


> They should know a little bit about TCP, SSL, HTTP, networking, monitoring, statistics, and everything related to frontend markup bloat, bandwidth, compression, etc.

This is why Ilya Grigorik's book, High Performance Browser Networking, exists. It's a great reference to that end as well.


This book is also available to read for free on http://chimera.labs.oreilly.com/books/1230000000545/index.ht...


Agree with most of the list, but why is TCP important for frontend developers? I'm struggling to find the argument for that one that isn't abstracted away, much like CPU pipelines are for backend devs.


A front end developer should understand that an HTTP request rides on a TCP connection, which begins with a handshake; receiving a file from a server over a fresh connection takes roughly 5-6 exchanges to and from the server. So if your UI consists of 1000 100-byte files, that's on the order of 6000 exchanges. You would probably be better off concatenating them into a single 100KB file.

But then you need to understand the differences between HTTP 1.1 and 2.0 and how they handle multiple requests from the same page. With 1.1 you would probably be better off with a single concatenated file; with 2.0, perhaps several. And then what about compression? Any good UI developer should be considering compression of that file, cache control, etc.
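The connection-reuse half of this is easy to demonstrate even outside the browser; a minimal sketch with the 'requests' library (the URLs are placeholders):

    import requests  # pip install requests

    urls = ["https://example.com/a.css", "https://example.com/b.css"]

    # naive: a fresh TCP (and TLS) handshake per file
    for u in urls:
        requests.get(u)

    # keep-alive: one connection reused across requests -- the effect HTTP/1.1
    # persistent connections and HTTP/2 multiplexing are after
    with requests.Session() as s:
        for u in urls:
            s.get(u)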

I see too many front end developers shy away from this stuff and create bloated monstrosities. They feel it's someone else's job... but if not the UI developer, then who??


I guess my point was more that if you just understand HTTP, you'll get all those insights already. A bit pedantic, I suppose.

This stuff does get pretty complicated. Mainly because the tech is evolving fast (browsers do a lot of tricks now). Even in HTTP 1.1, there are times where multiple files might be more interesting due to how large files are handled


But can you really understand HTTP without understanding TCP?


I prefer to work with a frontend engineer that knows a little bit about packet loss, routing, load balancing, white/black listing, retransmission, packet fragmentation, latency, mobile connections, socket limits, common error messages in network diagnostics, etc., than with an engineer that just knows "the page is slow".

Well, indeed, the higher level protocols, like HTTP, DNS, etc, are more useful on a daily basis.


I'm with the Heinlein sentiment. Practically everyone doing one will benefit from a stint as the other. I've had a foot in both graves my entire career and it's an amazingly useful skill group.

How you get there is up to you. My own path was freakishly meandering. Don't be afraid to get out of your depth, but try to have a mentor around to stop you drowning.

And remember that DevOps is a mindset, not a job. If someone tries to set up a "devops team", run away.


> How you get there is up to you. My own path was freakishly meandering.

Same here, though I was aided by being able to project real confidence when people came calling with the checkbook, and by being a quick study once I'd landed a gig. I learned to do what we would now call "sysadmin stuff" (manual administration of hardware) as a kid because I broke my Linux machines a lot. Then I went into the web development gristmill for a while. Ended up leading a multi-platform mobile team with zero mobile experience because "you're a good developer, you'll pick it up" (I did); I literally went into a devops role knowing no Ruby (to say nothing of Chef) under pretty much the same rationale.

"Fake it till you make it" is real, but then you gotta make it. ;)


To complement Heinlein:

Therein lies the best career advice I could possibly dispense: just DO things. Chase after the things that interest you and make you happy. Stop acting like you have a set path, because you don't. No one does. You shouldn't be trying to check off the boxes of life; they aren't real and they were created by other people, not you. There is no explicit path I'm following, and I'm not walking in anyone else's footsteps. I'm making it up as I go. - Charlie Hoehn


I'm in much the same boat; my path to get where I am was meandering and long.

I've worked with developers who fully understand the systems they are deploying on, and developers who develop for an abstraction of services rather than a real world environment. I tend to prefer the former to the latter because the deployment process is much less taxing, even though the latter tend to have better luck moving their software to another platform in the future.

But I'll echo you, every time I've learned something hard, it's because I got way out of my depth and had to learn how to swim all over again.


> Should sysadmins have software development experience (e.g. DevOps)?

I think this is kind of a given, today. I don't know many healthy, growing organizations for whom their "sysadmins" are not either originally developers or proficient in writing at least domain-specific code (I have always said that I am a software developer whose output is systems rather than web apps, because it's true; right now, with my current projects, I just wear both hats and get on with it!). Even the term "sysadmin" has largely disappeared in my neck of the woods; it has been largely replaced with "SRE" or similar, but that sort of position invariably seems to have development connotations.


I've been thinking a bit about my own lack of specialization lately...and, its negative impact on my value to a larger organization than my own company.

I suspect I'd have a hard time finding employment doing the kind of work I'd want to do at the kind of salary I'd expect to earn. I'm a decent programmer with broad but rarely deep experience, a better than average sysadmin with ridiculously broad experience, a passable designer (better than half of the "real", but merely average, designers I've worked with over the years, but so far behind the good ones that I'm hesitant to use the same term to describe what I do when I build websites), a passable sales person, a pretty good writer and copy editor, and the list goes on and on, because I've run my own companies for the past 17 years. I've touched everything that a business has to do, and I've somehow muddled through and kept the bills paid and the customers coming back.

But, I'm not a "rock star" at any particular task. I couldn't wow anyone with my algorithms knowledge, though I've always figured out how to solve the problems I needed to solve. That's not a very compelling sales pitch when talking about a $100k+/year job for a company that has a specialist in all of the above-mentioned roles.

So, I think it really depends on what you want out of life. If you want to maximize security and income, focus on a high value skill. Become the best in your market, or as close to it as you can manage. Eschew all distractions from that skill; don't fuck around with weird Linux distros, figuring out how DNSSEC works, building your own mail server, setting up CI, self-hosting all of your own services and web apps, or otherwise becoming a "jack of all trades, master of none". If, on the other hand, all of those distractions sound like the best reason to be in tech (and, that's the way it's always added up for me, even when it's cost me time and money), and you're willing to take on a lot more risk building your own business (whether consulting or building products), I guess being a jack of all trades isn't so bad.

But, and this is a big but: There's only so many hours in the day, and so many productive days in your life (and you also have to take time away from productivity to have a life outside of work/tech). As I get older I realize more and more that I have probably valued my time less than I should and valued my ability to effectively DIY my way to success too highly. I've spent many hours fucking around with stuff that I could have paid someone a few (or a few hundred, or a few thousand) bucks to make the problem go away, and it would have been worth it in a lot of those cases.


+100... if I could. I find myself in the same boat. I've refused to move to management, and along with that my ability to earn has taken a hit (for someone in their mid-thirties). Now, realizing I don't want to take the risk of building a business, I'm wondering if I should just focus on expertise in one specialization.


Development is sadly the job that is becoming the kitchen sink of "You ought to have skill X" where X is anything from sysadmin, computer science, business, security, product, bare-metal, networking, infrastructure, management, statistics, operating systems, math, social, and industry knowledge.

If you aggregate all of the "Every developer should know X" posts and blogs, the list would probably be very long. It only promotes shallow signaling instead of actual competence (I only need to know enough about X to make people think I know about X).

Meanwhile, your salary will still only compensate you for one skill set: software development.


Writing software is less useful than writing working, relevant software. The latter needs far more breadth than simply writing code.


Product managers, especially technical ones, can bridge that gap without being too deep in either skillset. A PM, on the other hand, isn't expected to know all of the nuances and gotchas or frameworks of whatever language the devs are working in.

Devs will slowly learn the relevant knowledge anyway, just at a slower pace than the immediate needs demand. And yes, after a certain number of years, that dev could be good at both. But you can't hire devs who are good at both for every position at every company.


I haven't really looked, but I can't think of posts of the form "Every manager should know X". From my (lack of) data, I posit that it's interesting how management is seen as a transferable skill, but software development isn't.


I think yes.

I see no harm in having basic experience in multiple fields, especially when those fields are related. Actually, I kind of "market" this concept among my circles: a mobile developer should know the basics of web development and vice versa. That way they communicate better.

I claim that already happens naturally. It's our drive to quickly build a niche (e.g. "I'm a professional X developer"), in addition to insecurities that lead to saying "X is not my responsibility", that gets in the way of expanding our fields of expertise.


It's pretty much impossible to be a sysadmin without writing some software, though I acknowledge we sysadmins have a tendency to use duct tape (there's a reason Perl is the sysadmin language) rather than a more solid adhesive. But whereas I've met tons of developers who have never maintained an installed system, I've never met a sysadmin who has not written a good deal of code.


> there's a reason Perl is the sysadmin language

That would be mostly Python these days. If someone were to touch Perl on my systems, they'd have a very bad day. Sure, I wrote stuff in Perl back in the day, but those days are over.

I see this as Perl's problem: to be good enough at Perl, you'd have to frequently use it, but Perl is in my eyes only suitable for quick hacky run-once scripts - which should not be written frequently, so you shouldn't be good at Perl. My Perl is pretty damn rusty these days.

If someone is still writing large scripts/apps in Perl these days, I question their judgement in technologies and ability to keep up with the times. Sure you can write larger scripts in Perl - but what's the point? You have to take care not to make things unreadable, while when using something like Python, it's much harder to make a script that's unreadable.


IDK. My blog is in Catalyst; I really like it for that.


> Should sysadmins have software development experience (e.g. DevOps)?

Yes. Where I work, they're all expected to be able to cut code. They might pair with devs at various stages, but a sysadmin who can't write code isn't going to be able to work with some of the more modern frameworks for orchestrations because they are in essence DSLs: they _are_ code.
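For instance, a minimal sketch with Fabric - one arbitrary example of such a framework, and the hostname is made up:

    from fabric import Connection  # pip install fabric

    # orchestration expressed as plain Python rather than a config format
    with Connection("web1.example.com") as c:
        result = c.run("uname -s", warn=True)  # warn=True: don't abort on nonzero exit
        print(result.stdout.strip())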


I strenuously disagree with some of these.

> Plan an invasion

This is actually a massive undertaking. An undergraduate at MIT taking a semester-long course on this will barely scratch the surface of it. Furthermore, you're never going to suddenly and unexpectedly need to know this. Any situation where you plan an invasion is going to be preceded by spending a long time getting into the position where people trust you with their lives and the fates of their nation.

> die gallantly

You're only ever going to be in this situation once, and probably not even that. Why does it matter how gallant your heart attack is?

These skills make a bit more sense in a world where most of us need to march off to war. Happily, we don't live in that world.

You should prepare for the situations you are only mildly unlikely to be in and where your skill matters.


It's literature, and SF, so it would be a mistake to take the quotation too literally. But since we're here; the thing to remember about Heinlein is that he was a romantic and a futurist at the same time. Hence his militarism wasn't so much about the enemy as about providing an opportunity for romantic heroism. Similarly with the other items in the list. It's not grounded in practical necessity but in a "renaissance man" / hero of a novel approach to being able to not just handle situations but show off in them.


>These skills make a bit more sense in a world where most of us need to march off to war. Happily, we don't live in that world.

Depends how old you are... some historians are keen to point out that the global political climate is very similar to the conditions just before WW1. Will you be of conscription age if WW3 does break out in the next decade?


> > die gallantly

> You're only ever going to be in this situation once, and probably not even that. Why does it matter how gallant your heart attack is?

I think it doesn't exactly refer to the act of dying, but is more along the lines of "The object of war is not to die for your country but to make the other bastard die for his." So you don't exactly want to die; rather, you want to avoid it, but should it happen, you want to make it very expensive.

But of course, it could also have nothing to do with war; it could be about how you face death, and how much of a burden you leave to your loved ones. (E.g. don't commit suicide leaving a note that tries to shift the blame to your family, or whatever).


As others have pointed out, Heinlein was writing literature, not management or career advice. That said, planning a modern, distributed web service probably isn't far off in complexity from the invasions he would have been picturing. As for dying gallantly... you're right that we'll all face it only once. Some go bitterly, some go cravenly, some 'rage against the dying of the light'. I'd say part of a life well lived is to face its end with brave, 'gallant', acceptance, whatever circumstance precipitates that end.


> Should sysadmins have software development experience (e.g. DevOps)?

Maybe. Not only do you better understand the needs of software developers, you are also able to automate a large chunk of your work. Paraphrasing Larry Wall: sysadmins should be Lazy. "Laziness: The quality that makes you go to great effort to reduce overall energy expenditure. ..."


I used to work on a product where the three major teams were Client, Server and Ops. Tens of millions of customers used our stuff.

Client and Server folks were on separate floors of the same building. We didn't interact much at all, except by trading bugs back and forth. The management chains met at a VP. Three or four times a year we tried to coordinate a release, and it always took at least a month. Getting a feature out the door might take a year, with all the paperwork and pipelining of release schedules.

Ops was in another building. The only things that both the Client and Server teams were sure of were that (a) Ops did a bunch of customization of our stuff so that it would work, and (b) they hated us.

Support was in another state. We were not allowed to talk to customers. Maybe once a year Support would fly in to talk to the teams about pain points. I think we did an okay job addressing these, but it took a long time, and customers suffered a lot.

I won't talk about the disaster that ensued when Scrum was thrust upon us, or the splinter projects that spun off to try to fix things (but wound up being lots worse).

You have to be close to the customer or you won't know if you're succeeding, or even on the same page. You have to know what your software is doing in production or you're just sitting in an ivory tower pontificating about angels-on-the-head-of-a-pin nonsense. You have to spend time in the trenches measuring and fixing stuff or you're hatching an unmanageable disaster. The good news is that most of this is actually kind of fun. The bad news is that, when managed badly, this can turn into a horrible and soulless grind of pager duty and making legacy code even more legacy to fix wee-hours downtime.

I still get a rush when I get feedback from a real, live customer, and I think that isolating your teams from customers is one of the worst things you can do to a team and to a product. Getting teams to work on ops and support aren't bad ways to improve this.


Sysadmin knowledge definitely helps, but so does an MBA, knowledge of writing, public speaking, design, user experience, networking. Oh, and the domain of the problem itself.

The skills I require of my developers depend on the rest of the team and the project.


The difference with this is that server-side software has direct operational consequences.

A server-side developer who never deals with ops is like a chef who never tastes the food. It's in theory possible to get right, but in practice the results tend to be poor.


That's true for backend, but frontend developers don't necessarily need a lot of ops knowledge if the team has a good separation of concerns. If frontend devs have to worry much about sys admin issues, I'd say that likely points to a flaw in the way things are being done.

Of course, the more someone knows, the better. Knowledge and experience in any area of technology can improve understanding of all the others. But nobody has time to learn everything, and there's way too much to learn. Time given to learning more about sys admin issues is time lost to other potentially valuable knowledge.

It's a good area to learn about, and it is essential to being a strong backend developer, but through good architecture and management practices, there are still plenty of ways to make very high value contributions for a developer who hasn't spent much time focusing on sys admin.


I think your definition of "sysadmin issues" is probably a little more limited than mine or 'wpietri's. I've had frontend developers insist that they could bake configuration settings into their webpacked artifact...which needed to be deployed into multiple environments because of course we weren't rebuilding something that had already been okayed in QA when we wanted to send it to prod.

You might say that's not a "sysadmin issue," but I have seen it happen three or four times now and in each case it was the "sysadmin" (read: devops engineer) who caught the problem and explained it to the offenders in question. (Maybe it's a "build engineer issue"...but at most companies I've seen, he or she is probably the "sysadmin", too.)


I would consider that sort of thing to be part of the required basic knowledge to be a competent frontend dev, so if you want to call that sys admin, then sure, there's a bit of it involved. If it's on your side of the fence then yes, it's your responsibility. I'm not sure how many actual sys admins would consider configuring webpack to be 'sys admin knowledge', but there is a spectrum and it's true that there are some ABCs that everyone needs to know, particularly when there are security implications. Still, it's a pretty far cry from unix, web servers, and databases.


You might consider it basic. I would have, insofar as I considered it pretty obvious even when I hadn't written a line of frontend code in five years, and webpack etc. weren't even on my radar then. ;) But my experience has led me to believe that it's not.

I think that's kind of the point of devops, though: something like a build system is a cross-cutting concern, and architectural decisions for an application need to involve people across the stack. Classifying somebody as "a sysadmin" is the problem in the first place, which is why I caveated my post as heavily as I did. My experience is that your "devops" or "sysadmin" people functionally become the "junk drawer programmers" who are relied upon for all sorts of weird stuff. I've been at jobs (and at clients) where I had to teach senior backend devs how to use VisualVM, or what the implications of using Kafka and CQRS are. I've been at gigs like the aforementioned where I had to explain the ramifications of webpack to people knowledgeable and capable enough to make React dance. And so my definition is probably necessarily more broad... but it's also stuff I've had to do in practice, so, enh.

And, 'cause fair is fair, I think a "sysadmin" who couldn't step in and write production-quality code (allowing for a little ramp-up) is probably an endangered species over the next ten years, too.


> but through good architecture and management practices, there are still plenty of ways to make very high value contributions for a developer who hasn't spent much time focusing on sys admin.

Huh. Hands up everybody who is working at a place with good architecture and management practices?

Knowledge of and experience with the direct downstream and upstream of one's work is different than general knowledge. It can in theory be made up for by high-ceremony processes and strongly controlled interactions. But in practice that mostly doesn't work in software development.

Front-end developers don't need much server ops knowledge because that's not generally the direct upstream or downstream for them. But the principle still applies; they will be better if they have experience with their upstream and downstream.


> [...] frontend developers don't necessarily need a lot of ops knowledge if the team has a good separation of concerns.

If there is a sysadmin on your development team and this sysadmin has a say in the system's architecture and development workflow, then OK, you don't need any of your programmers to be a decent sysadmin. Otherwise, you need to get a sysadmin.

> But nobody has time to learn everything, and there's way too much to learn. Time given to learning more about sys admin issues is time lost to other potentially valuable knowledge.

And then we have Homebrew, which I constantly hear is atrocious and breaks software on updates (incorrectly managed dependencies), and whose developers don't understand the necessity of using cryptography for transporting packages. We also have this whole idiocy of deploying software in production by recompiling it with `pip install', `rbenv install', and `npm install'. And then we also have incorrectly built RPM and DEB packages published by software developers (InfluxDB and ElasticSearch are common offenders, Riemann being another when I checked a long time ago).

And mind you, I don't use macOS, I don't develop web applications, and I package and host additional software myself - so how big must these issues be if they still got my attention? What was that again about your developer's valuable learning time?


We took it one step further on my team. The best way to ensure that the pager(duty) doesn't go off at 2am is to put developers in the first on-call group. Not only did it work, but the developers got a much more nuanced view of how systems operate. When we first started, I'd see developers using ping to determine whether a nameservice entry was correct. After a few months of handling almost all of our own ops, developers knew how each piece of the puzzle worked and how it all fit together, rather than the hazy, largely abstracted view they had before.
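(The ping example is the classic one: ping conflates name resolution with ICMP reachability. The direct check is a couple of lines; the hostname below is a placeholder:)

    import socket

    # ask the resolver for the actual records behind the name, nothing more
    for family, _, _, _, sockaddr in socket.getaddrinfo("myservice.internal", 443):
        print(family, sockaddr)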

But we learned that we needed it to go the opposite direction too. We had ops people who, when given a corporate-wide mandate to apply a security patch or some such task, would log into every machine and apply the patch, despite the fact that we'd been practicing immutable infrastructure with zero-downtime deployments. We hadn't given them the necessary exposure to the dev side to understand that you had to apply those fixes to a base image and trigger a redeploy. There was a lot of finger pointing a couple of weeks later after it was discovered that the fixes were overwritten by an application deploy.


I hope you increased their salaries together with the new 2am pager duty.


That's what I did with my tiny engineering team (3 folks), and the results have been great.

If you have a small organization (say <10 engineers), it's crucial that every developer writing server side applications can also do at least some sysadmin work. As the article says, it leads to deeper understanding and it really helps when thinking about scalability, fixing certain kinds of issues, etc. It also often shortens the feedback cycle and requires less throwing over the wall. As a bonus, you increase the bus factor.

Automation is the key though. If everyone connects to the boxes and does random things manually over ssh, nothing good will come out of it.

Still, you need to have a person or two who are responsible for the vision of the architecture/systems and who make sure that things don't go off the rails.


Somewhat related, I feel that mechanical engineers who design cars should have some experience servicing vehicles.

Sometimes engineers will not leave enough space, use weird fasteners, etc. that make a simple job much more complex.


It could be a deliberate form of obfuscation/discouraging of repair, or just the common trend of making things more complex than they really need to be.


That's possible, but lack of insight into what's easy or hard in manufacturing or construction is a pretty common problem. I saw it a number of times when working for a few different manufacturing companies, though none of them built cars.

A failure to understand construction concerns also played a role in the Hyatt Regency walkway collapse. The original design was poor, but redesign to address construction difficulties accidentally weakened the walkways further.

> Havens Steel Company, the contractor responsible for manufacturing the rods, objected to the original plan, since it required the whole of the rod below the fourth floor to be screw threaded in order to screw on the nuts to hold the fourth floor walkway in place. These threads would probably have been damaged and rendered unusable as the structure for the fourth floor was hoisted into position with the rods in place. Havens therefore proposed an alternate plan in which two separate sets of tie rods would be used: one connecting the fourth floor walkway to the ceiling, and the other connecting the second floor walkway to the fourth floor walkway.

> This design change proved fatal. In the original design, the beams of the fourth floor walkway had to support only the weight of the fourth floor walkway, with the weight of the second floor walkway supported completely by the rods. In the revised design, however, the fourth floor beams were required to support both the fourth floor walkway and the second floor walkway hanging from it.

https://en.wikipedia.org/wiki/Hyatt_Regency_walkway_collapse...


There was another problem -- the beams were spec'd as box section, but what was on the shop drawings was 2 channel sections with the flanges welded together. (like this: []). The attachment point that supported twice the design load was also compromised by the weld and less competent section. Whoever was checking the shop drawings didn't pick up on the importance of the change.

IIRC, either change on its own would have been marginally OK; the two together weren't. (By marginally, I mean it probably wouldn't have killed people but wouldn't be to code.)

When I was going through civil engineering, there was a big push to use a statistical basis for loads and resistances, rather than a blanket factor of safety. Loads vary and strengths vary - potentially normally distributed, probably not - but they're described by statistics at any rate. Blunders of this sort aren't, at least not on a per-project basis.


The BMW i8 is a good example https://www.youtube.com/watch?v=fxe_b2GRwok


That is a function of design requirements, not lack of foresight. If your manager tells you the cab needs to be 20% roomier without changing the outside dimensions, or they change the engine going into the thing without changing the structural integrity of the car, there goes your maintenance space.


I feel that managers of mechanical engineers who design cars should have some experience servicing vehicles.


And if the customers are imposing these ridiculous demands, then they should have some experience servicing vehicles.


I spent a couple summers working for the sustaining engineering department of a manufacturing company. There was quite an oil boom and they were having trouble hiring enough people for the shop floor, so they had the engineering staff come down to help for a few days. The mechanical engineers would bolt things together, the electrical engineers would do the wiring, etc.

Afterwards, one of the supervisors showed me the difference between wiring done by an electrician and wiring done by an electrical engineer. He pulled on each wire of the relay panel. The wires done by the electricians didn't move, but most of the wires done by the electrical engineers popped right out.


In a similar vein: some houses are so impractical to live in that architects should be punished by being made to do the household chores, especially the cleaning!


I don't think "should have" is the right statement; more like having at least one person on the team with sysadmin experience is extremely helpful. I know it has been for me.

Of course in HS/college I ran a website that was a frequent target of hate in the late 90s/early 2000s and it taught me about XSS and CSRF before people invented fancy terms to describe them. It also taught me about HTML/JS escaping, DoS/DDoS, SQL injection, and how all your defenses are useless if someone social-engineers their way into root and nukes everything. I have the assumption that everything is compromised and user input is toxic waste burned into my subconscious.
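The SQL injection lesson in particular compresses into a few lines; a minimal sketch with stdlib sqlite3 (the table and input are illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    user_input = "x' OR '1'='1"  # hostile input

    # wrong: string interpolation lets the input rewrite the query
    # conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'")

    # right: parameter binding treats the input as data, never as SQL
    rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
    print(rows)  # [] -- the injection attempt matches nothing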


I worked for an ISP in the late 90s/early 2000s for about a year before my final year at uni; when I started it had 2,000 customers, and by the time I left we'd got to 750k customers.

It was the best training I could have had for writing software for the Internet. Just a couple of months ago we noticed one of our core apps was not scaling, no matter how many Docker containers we spun up in Mesos. We spent a week breaking it apart, using that experience, being able to make changes directly to the code base, and being able to talk to devops in a language they understood - and in under 5 days we managed to identify and fix 7-8 different issues.

I would value a dev with sysadmin experience _far_ higher than one without in a tech business with a headcount under 500: it's going to lead to fewer problems and issues in the short and medium term.


Anecdotal as it is, I started my career working as a sysadmin while studying CS. When I was younger I was quite interested in security (e.g. I followed defcon and CVE lists and read a lot of manuals and source code). I built server software, broke applications, and did a lot of reverse engineering. In that time I was forced effectively to learn about how the OS worked and how to utilize it, both Linux and Windows, from C APIs to scripting, package management, and everything else needed to effectively work in those environments, primarily around reverse engineering and vulnerability research.

That experience naturally led to me becoming a sysadmin during my studies at university. It was a fairly straightforward application of what I had already learned, with a much larger scale of management. The primary thing I gained from it was a drive to automate everything. When I started that job most of the sysadmin work was manual, but a few of us spent a huge amount of time focusing heavily on automation, and when I left, most of the work was automated and we were just doing meaningful firefighting and supporting development.

As an engineer, the main benefits have been understanding how my software is going to run in an actual software/hardware stack, easily jumping into a production environment and debugging complicated issues, being able to quickly have my OS do what I want, and that drive to automate everything. A lot of that informs how I build software and in general it feels like it makes me a lot more productive.


I have a very similar story as a background. To add to what you said: while maintaining and debugging software installs, I learned a lot about how and when things break. Especially, how important it is to keep things simple. This turned out to be invaluable when I began working as a software developer after graduating.


I would change it from "should have sysadmin experience" to "should have operating system knowledge, or ask the right questions".

The example in the article isn't what I tend to see in real life. What I do see are things like:

- Not knowing about various limits (number of open sockets, or listen queue depth for example), how to know you've hit one, how to deal with it.

- Not handling various error situations correctly (can't open file due to permissions for example)

- Security issues (a socket listening on 0.0.0.0 when localhost would work, for example)

- Making assumptions about things like "current working directory" or "certain environment variables will be already set for me"

For many of these, including a sysadmin or system-knowledgeable architect in the right discussions would suffice. (A couple of the items above are sketched in code below.)
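
To make a couple of those concrete, here's a minimal Python sketch (Unix-only, since it uses the resource module; the port and log path are placeholders, not anything from the comment above):

    import resource
    import socket

    # Know the per-process file descriptor limit up front, instead of
    # discovering it later as a mysterious "Too many open files" error.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("fd limit: soft=%d hard=%d" % (soft, hard))

    # Bind to loopback unless the service genuinely needs to be reachable
    # from other hosts; 0.0.0.0 exposes it on every interface.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 8080))
    srv.listen(128)  # the effective backlog is also capped by the OS

    # Handle permission failures explicitly rather than dying on a traceback.
    try:
        with open("/var/log/myapp.log", "a") as f:
            f.write("started\n")
    except PermissionError as err:
        print("cannot write log, check ownership/permissions: %s" % err)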


This is a short little article that just barely touches on a much deeper, often hidden issue: the state of system administration in business is abysmal.

It's not the developers' fault though, at least not as much as devops types would like you to believe. I think the author has a good point, in that it's good to get devs thinking about real-world environments at deploy time, but the real world is much more complex than concurrency of servers.

All that being said though, very rarely have I as a sysadmin of 10+ years seen problems so easily attributable to devs. Of course I haven't lived in hn/sv startup land either, so take that into account, but failures in systems I have seen have almost always been a failure of management, up to and beyond C level.

I could go into detail, but I'll save it for another time. Suffice it to say, what businesses need to be doing is getting better CTOs and CIOs who can bridge the gaping chasm between sysadmins and management.

Devs, you keep being Devs, and let the sysadmins be sysadmins. Cross train and communicate when you can, but don't fool yourselves, it is management that bears the responsibility and burden of you both. Management just doesn't like to admit that to themselves or anyone else, so don't play into this Devs vs sysadmins dialectic too much, lest ye find yourself the scapegoat next go round.


I agree that the dialectic shouldn't be played...but not that it's "cross-training." Rather, it's the same skillsets being applied in slightly different ways. There was a day in which "sysadmin" very often meant "shit-hot Perl slinger." It was before my day, but I know some of the graybeards who can still lay claim to it.

The sysadmin who can't write good, maintainable code is going to rapidly see their positions reduced to sinecures in large and slow companies. That may be enough to finish out a career, but I wouldn't bet on it if I was under 50 (and I don't bet on it; I have always, as said elsewhere in this thread, framed myself as "a developer whose output is configured systems" rather than "a sysadmin" for this reason). Similarly, in an age where infrastructure-as-code is becoming the norm, you had better be able to work with it or, as a developer, you are very limited in what you can do without being blessed by somebody else--and the set of environments where that's gonna fly is shrinking, too.

This is emphatically not to take any weight off of the shoulders of management, to be sure. But rather that I feel very strongly that what divide existed between these "disciplines" no longer exists, and both developers and traditional sysadmins need to move to catch up.


> The sysadmin who can't write good, maintainable code is going to rapidly see their positions reduced to sinecures in large and slow companies.

I completely disagree, but respectfully. What I think we are lacking in this discussion is a differentiation between what we mean by sysadmin as a product and as a job description. Systems administration is the aggregate management of the technical infrastructure.

In super small technical infrastructures, such as webdev startups, there is very little active sysadmining to do if things are done properly, but as complexity grows you need multiple people to perform all the various duties needed to maintain a system. In the performance of those duties there are various job descriptions with varying levels of requirements.

What I postulate is that traditional businesses, seeing the agility of dotcom and startup culture, have been attempting to cut costs in the technical infrastructure; but because the managers of that infrastructure (the sysadmins) haven't had anyone to advocate on their behalf, this has resulted in bottlenecks that hurt business performance.

I would be curious to hear what others have to say, but in the "shit-hot Perl slinger" days, usually the senior sysadmin was the slinger, and he reported directly to the CFO and CEO, if not ownership.

What I argue is that systems have gained in complexity, from a computing infrastructure standpoint, to the point that this old structure no longer worked well; hence the creation of the CTO/CIO class (the two roles are very similar). The problem is, in my opinion they aren't doing a good job.

Therefore, in the current business climate, my argument is that we already have good code-slinging sysadmins on hand (or can train them); what businesses need, and what the industry is lacking, is business- and politics-aware senior sysadmins to make up for that failing of the C's. Hence I disagree that the days of non-code-slinging sysadmins are numbered. Indeed, some of the best senior sysadmins I know live in meetings, but if they are performing well in that role, making big-picture, high-level decisions and then monitoring progress rather than slinging code, I don't think there is anything wrong with that.

Now, if we got CIOs/CTOs back on the right track, the demand for the political/business game could fall and those sysadmins could get back to hands-on work (which sometimes happens to involve code-slinging to solve problems). Devs are an entirely separate entity, but still part of the process, in anything other than pure web-startup-type businesses with little to no real infrastructure (e.g. work-from-home contractors).

Of course I want to qualify this: it's anecdotal, and I admit I haven't seen every environment, but I have seen many (as a contractor you see more insides than the guy going for retirement), from Fortune 500s to 2-man lawyer shops.

Any business that can see this issue coming and address it head on will be far ahead of the game. Those that don't will one day have a very rude wakeup call as the complexity increases exponentially and they don't have the structures in place to handle the demand, mostly due to lack of foresight/vision.


To me, you're just describing another layer of management. Which is fine--management is important!--but it's not the same thing as the actual implementor class that "sysadmin" usually refers to. And those people are inexorably going to be doing their administration via code. There isn't enough time in the day to waste on "pet" servers. They're cattle. Infrastructure as code is here, and it's just going to grow more extensive. If you can't write code effectively--both in terms of "hey, this code can be read by other people" and "this code can be automatically tested like any other code"--you're in trouble. One of my clients is a top-10 commercial bank in the United States that has moved entirely into the cloud (save for some legacy mainframe stuff they're working on); every team is required to provision and operate strictly with automated tools (both instance- and cloud-level provisioners), and their non-database servers are killed automatically every few months to ensure that they rotate successfully and without human involvement.

I'm pretty confident that I've seen where we're going with this. The management piece of the puzzle is totally important...but the practice is going to continue converging with every other bit of software development.


Software developers should also have customer-facing support experience (supporting other people's code) so that they realise, the hard way, the importance of having extensive logging and debugging capabilities accessible to those unfamiliar with the codebase.


Having someone who is cross trained is always better than not, but they will be harder to find and more expensive.

A frontend UI person who can write their own backends is great. A backend developer who knows javascript and can build their own frontend is great.

A coder who manages their own cluster is great too, as is a sysadmin who can write code.

And a developer who can do customer service, and therefore can fix a problem for every customer at once through modification of the application, is better than just someone who can answer the phone and make the customer happy.

All this is to say that someone cross trained will always add more value and will also cost a commensurate amount.


> All this is to say that someone cross trained will always add more value and will also cost a commensurate amount.

Man...I wish the latter were the case. I ended up going into consulting precisely because of the way salaried employees are valued in tech right now. I literally am all of those things--I am comfortable with, and have delivered at a high level on, mobile, web, backend, and infra projects--but the "market" loves valuing those roles individually, not the synergistic capability of being able to do all of them.

Consulting is fun, but finding a place where those talents actually are valued would be rad. (Anybody out there: have an opening for somebody who can deliver value at literally every level of your engineering organization while always being game to help bring your other developers up? Email's in my profile. ;) )


> but the "market" loves valuing those roles individually, and not the synergistic capability of being able to do all of them.

Find a company that values "founder's mentality" or "ownership", and that organizes projects in a way that can take advantage of it. Where I currently work, there is a lot of opportunity to "start up" new ideas, and driving them successfully benefits substantially from the founder-like ability and desire to wear many hats. Part of this focus on ownership is the principle that every product team owns their vision, roadmap, technology choices, system stack, etc. from top to bottom. The larger platform is organized as services layered on top of services, which one team provides to another with a high degree of isolation and autonomy. There is no central "operations team" to punt to, and while there might be central customer service teams, they primarily handle routine requests.

I bet you will also find this in places that care a lot about customers (are "customer obsessed"), and spend a lot of time thinking about and examining what their experience is. A relentless focus on improving the customer experience is one principle that can lead you to look anywhere and work on any problem.

P.S. A team I work with is starting up a new operations excellence group to drive the next level of improvement to our operations/availability/SLAs/efficiency, etc. in our division. They are looking for talented engineers comfortable wearing multiple hats, who can work both at the architecture level one moment and switch gears to get deep into code the next. If anyone's interested in driving engineering excellence cross-functionally across many teams in that sort of way, see my profile and hit me up!


If you claim to be an expert at more than two, then chances are you're either 1) not really that good at any of them or 2) would make a great founder!

If you're good in multiple areas the only way to really get paid what you're "worth" is to start a company and use your skills to create an amazing product.

Because you're right: the market doesn't value a true polyglot.


I don't claim to be an expert at anything so wide as a full category of software. Hell, I don't even claim to be an expert at a language. (I learned yesterday about Ruby flip-flops, ferchrissake.) Merely (well, "merely") that I can deliver a solid product at any of those levels. =)

As it happens that's the other side benefit of consulting--being able to build a product with downtime. And I am! But it's still a lottery, and risk mitigation is a thing. So I like hailing for interesting offers to come my way.


>Anybody out there: have an opening for somebody who can deliver value at literally every level of your engineering organization while always being game to help bring your other developers up?

Sure, just do our HackerRank coding challenge and implement some algorithm that will be completely irrelevant to your job, in a web browser, under time pressure.


While I agree with the general sentiment, the examples feel fairly contrived.

What I think is more practical is understanding when certain problems fall outside of your own scope or capabilities and when to engage someone else for their advice.


Sysadmins or devops are directly responsible for uptime, and ultimately for keeping services performant. It's a mindset and temperament that may help developers see solutions or foresee pitfalls while designing and implementing code: a zero-downtime attitude. Making and testing deployments, upgrades, and migrations to be as low-impact as possible, as a core function, is probably the biggest effect of putting a developer in a sysadmin's shoes.


What he's talking about is systems architecture, not sysadmin experience.

Of course developers have to write code for a logical model of a machine that isn't necessarily the same as the laptop they're working on. But I don't think anyone would ever dispute that.

Storing pictures on a local disk when the software is supposed to run on a cluster has absolutely nothing to do with sysadmin experience.


I came here to say this.

If a developer is building an app for 100K users the same way they'd build an app for 100 people, the systems architect dropped the ball- not the developer.


The post has a valid point even if the example is more about system architecture than system administration.

In my experience a developer doing one year in an Operations department gains some insight into what is going to be important in the longest phase of the software life cycle, that is, when it's exposed to customers and the company has to react quickly to problems. Proper logging, anything that can help pinpoint the cause of problems and even hot patch them at 3 AM.
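
To sketch what "proper logging" can mean in practice (a minimal Python example; the logger name and the suggested file path are invented for illustration):

    import logging

    # Timestamps, severity, and origin on every line: the things you want
    # at 3 AM, rather than bare print() statements.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        # in production you'd point this at a file or syslog, e.g.
        # filename="/var/log/myapp/app.log" (hypothetical path)
    )
    log = logging.getLogger("orders")

    def risky_operation():
        raise ValueError("bad input on row 42")  # stand-in failure

    try:
        risky_operation()
    except Exception:
        log.exception("order import failed")  # logs the full traceback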

The usual lean startup might give little thought to that (is it even going to have customers?) but established businesses do.

If it's 6 months to develop and it runs for 10 years (with maintenance and new features), which phase is more important to a company, development or production? I'm a developer who did a couple of years of operations and I have little doubt about it. It's where the company makes its money.


With the rise of devops (docker, puppet, ansible, etc...) , and cloud VMs, how often are sysadmins different from software devs?


With the rise of devops system engineering knowledge is being lost.

I often see developers trying to replace it with bloated, complex piles of hipster software. While a devops team spends weeks attempting to build a toy Google-like infrastructure on 50 nodes, another company is deploying code just fine using a 15-year-old setup with netboot + a disk imaging system + OS packages.


I don't think this is true at all. Which is not to defend or even endorse the Kubernetes wank, "devops" is also "hey, we have codified what our systems look like in code (with Chef/CloudFormation/etc.)." Which is a hell of a step up from what existed prior. That I occasionally have to then go debug why AWS's Xen network drivers are spitting the bit or why this application's disk access patterns are making EBS sad certainly keeps me very firmly in the details of the systems in question, too.

(As an aside: I mentally replace the words "hipster software" whenever I see it with "things I don't understand" and the sentences never seem to change their meaning.)


> Which is a hell of a step up from what existed prior.

Not accurate. A lot of new-stuff is just old-stuff we've had forever with a new name and branding. Worse, a lot of "new" stuff is too low quality to be reliable and/or gonna die/be obsoleted very soon.

Old shops who took development seriously have developed internal tooling, customized to their internal needs. It can be surprising how well it can hold the comparison against new hype software.


"A lot of" old stuff is low-quality--but it's what we've got and the creator works here and he likes it, so we use it. See? I can generalize too.

And, friendly advice: I assure you that you meant to write "I disagree", not "not accurate". Don't be That Guy. Never be That Guy.


If old and new both suck, there is no gain in migrating to the newer ;)


But they don't both suck. Newer tools reflect the needs of newer problems. Like, I can point to exactly what I get from my Chef/Cfer stack--I can wrangle hundreds of concurrent machines that need to update to a policy while also getting, in a standardized and predictable format, a full itemization of all changes applied to all systems. I can leverage the community, too, to help me bootstrap new features and functionality because we're speaking a shared language.

They might be new tools to you, and being skeptical of new stuff is fine (even healthy), but there's a point at which skepticism turns into Ludditism and it's well before the point where one starts to bloviate about "hipster software".


It's funny you'd quote Chef (or puppet), because they've been replaced by Ansible/Salt.

I spend a lot of time evaluating new tools, possibly more so than anyone else in this thread. There are very few novelties which solve a problem notably better than what existed before AND are reliable enough for serious usage AND don't come with so many drawbacks that they nullify their benefits.


Chef hasn't been replaced by Ansible or Salt. Each tool has a different value prop that makes sense in different contexts. I might use Ansible in some environments where I wouldn't use Chef--there aren't a ton of these environments, IME, but I've chosen to use Ansible in the past despite it not being my favorite for reasons like this. I would use Chef in other contexts where Ansible's ecosystem or tooling is lacking--I find that chef-zero and berkshelf are a better way to self-bootstrap on AWS when using CloudFormation, for example--or where I desire the greater flexibility and expressiveness of a Ruby-based DSL.

The black-and-white thing you're painting is, if I'm gonna be frank...a little messed up. I mean this in the best of ways: please enhance your chill. It's not that bad out here.


There are a lot of parts of ops that handle things that aren't easily scriptable. What do usage patterns look like, so you can better manage when to bring servers up/down to handle normal usage flow and have servers warmed up when they're needed? Are you using the best build options for your VMs? Are you using the best VMs for your application? Are you at the best datacenter for your application? Even with certain build options having better automation tools than before, there are still a lot of places where the world is far from a few scripts.


That question doesn't make any sense at all. That's exactly the same as saying: with the rise of do it yourself home repair, how often are plumbers different from carpenters?


The question is great. DevOps is a field that I really don't want to touch as a developer. I build frontend and backend as well, design databases and focus on UX and configure a LAMP stack, but that's where sysadmin ends for me.

What I don't want to mess with is the "DevOps" tools, like Docker and rest of cloud orchestration stuff. It's a very complex field and I couldn't care less. If I work on a project like that there's always somebody who can do that kind of setup.


DevOps, at its core, is just treating your configuration files and artifacts (IP/DNS, firewall rules, mail config, service and web app artifacts, libs and other stock software, SQL/NoSQL deployment descriptors, queue configs, users/roles/certificates) as development artifacts, and having an automated, reproducible deployment procedure. Going to a box and manually configuring something in an ad-hoc fashion using fancy GUIs is considered a no-go. Rather, you're supposed to craft/test your configuration artifacts and then check them in to e.g. git. If anything, this should be a workflow familiar to developers, hence "DevOps".
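
As a toy illustration of that workflow (hand-rolled Python purely for the sake of example; real shops would reach for Ansible, Chef, etc., and every path and service name here is hypothetical):

    import filecmp
    import os
    import shutil
    import subprocess

    SRC = "deploy/myapp.conf"      # lives in git next to the code
    DST = "/etc/myapp/myapp.conf"  # live location on the target box

    def apply_config():
        # Idempotent apply: copy only if the file differs, then reload.
        if os.path.exists(DST) and filecmp.cmp(SRC, DST, shallow=False):
            print("config unchanged, nothing to do")
            return
        shutil.copyfile(SRC, DST)
        # Assumes the service is managed by systemd on this host.
        subprocess.run(["systemctl", "reload", "myapp"], check=True)
        print("config applied and service reloaded")

    if __name__ == "__main__":
        apply_config()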

Docker and other Docker-like containers such as Rocket don't have anything to do with DevOps per se, but they facilitate easier automated deployment of said configuration artifacts. Technically, Docker & Co. are just remedies to classic DLL-hell situations, and have mostly been used for development and software evaluation purposes. Though they technically don't provide much more isolation than simple chroot jails, increasingly they're also used in place of VMs for density reasons, e.g. because you can run a whole bunch of services on a single machine with less footprint than VMs (on Linux, that is).

The frenzy with Docker and the other cloud orchestration stuff, as you say, is IMHO mostly because of Sillycon startups with insane amounts of venture capital buying their way into people's minds.

Edit: ok, the last remark was a bit snarky; sure, there's a need for mass-digestible tools for collaboratively editing DevOps artifacts and deployment plans, but I don't know any I like; feel free to point me to one (preferably without JSON and/or YAML configuration files)


A plumber who is also a carpenter is usually known as a "handyman". There's a reason apartment complexes big enough to have maintenance staff tend to hire that sort of person over a dedicated specialist.

If you're a large business you should be automating your sysadmins with devops/SRE techniques. If you're a small business you should be outsourcing that work to companies that have automated it for you.

That's why cloud providers and associated services have seen enormous growth.


I dunno, I think the question makes perfect sense. I am a software developer. I very often write Chef and a metric buttload of glue code in Ruby and Bash and occasionally PowerShell. That I can also write a Rails app and a React-Native app and a Dropwizard app (all things that I've touched this week) in addition to being able to set up Kafka in a fault-tolerant and scalable way is because it's all code.

Chef etc. aren't "home repair." (Well, maybe Docker, and it shows.) They're a way of expressing subject matter knowledge. Still gotta be a developer to do that expressing in a way that isn't going to kill you, either directly or because you write code that your fellows burn you at the stake for.


There's a reason we have these classes of people with certain labels. Until the 18th century, buildings were designed and built by "artisan craftsmen", people who had no formal training or education in building things (or in any capacity) but said, sure, I can make you a house. Then it would fall down and kill a family of five. All around the world today, architects and engineers are required to be licensed and to prove they know what they're talking about so this doesn't happen. As a result of this process, the whole profession developed and advanced the state of how we build things.

We're lucky that we work in an industry where our work is most likely not going to kill people. We don't have to be licensed to make a claim of who we are or what we can do. But it would be very disingenuous for me to call myself a professional software engineer just because I've written hundreds of thousands of lines of code, and for you to call yourself a sysadmin because you've set up some software. The mark of a professional is understanding, at a deep level, all of the hidden detail that goes into the work, and if we hold people to a standard, the titles will reflect that.

So, how often are software developers different from sysadmins? They're always different. They're completely different practices. As a person, you can indeed have two separate successful careers and become both, but that doesn't have anything to do with devops, which is merely the practice of dev and ops collaborating together.


I've read your post two or three times now, and I straight-up don't get what you're driving at. Yes, there are levels of subject matter expertise that you need to know--or at least be conversant enough with to deep-dive when necessary--for system administration. That's also the case for database programming or bioinformatics or graphics programming or whatever. But it is still expressed through the writing of code, in 2016--and that code, largely today and with increasing focus going forward, needs to be well-structured, testable, so on and so forth. It is still the development of software. It's a specialization within the same field. It may not have been twenty years ago; if it isn't generally today, it will be before I retire, and probably a lot sooner than that. And to that end, the question that was originally asked makes a lot of sense, to which the answer is "a sysadmin is-or-had-better-be a software developer; a software developer may or may not be a sysadmin."

"Set up some software," though? The condescension is both unnecessary and built upon some pretty shaky projections on your part. Can we not dance that dance, please? (ETA: Not least because I think your posts, for the most part, are really good and I have a lot of respect for you, especially around security topics.)


I would be horrified if you had to be a software developer to be a sysadmin. It would be like requiring an auto mechanic to be an engineer, or a cook to be a chemist or something.

In actual fact, the complexity of software design is antithetical to the job of system administration. If at all possible you should use the least code possible to do a given job, and rely on the reuse of tools to serve various functions. This is not coding, just like auto repair is not engineering. If you're writing code you are not adminning, you are developing.

There is a time and a place for software development in Ops, but it should be based around projects led by teams to serve specific functions that existing tools won't solve, again similar to auto repair because some tools are needed simply to repair things more efficiently, but then you use those tools and don't keep engineering new ones. The distinction is small but highly important, because code is often a source of headaches in Ops. This is coming from a guy who is usually hired to write code for Ops departments.

A software developer can of course perform some of the functions of a sysadmin, but again the complexity of the whole becomes too much to understand just by doing single tasks. Either job is too complex to learn without a lot of study.

I just find comparing the two in the context of one becoming the other invites oversimplification of either role, and devops doesn't help that comparison.


> If at all possible you should use the least code possible to do a given job, and rely on the reuse of tools to serve various functions.

I agree, on both counts--and you should write the least code possible and reuse the most tools possible when you're building a web app, too! And when I am managing systems and infrastructure, I am reusing tools that are configured and operated through the expression of domain-specific code in Ruby. When I'm solving a problem, I'm writing code in Chef or in Cfer/CloudFormation (if I'm touching a machine by hand that's outside of my burn-it-down test environment, something's going really, really wrong). I'm consciously writing that code in such a way that I can put it into an automated test environment based on preconditions and postconditions that map to the business logic of the system to ensure correctness when that code is then changed. If I need to solve a related problem later, I'll generalize the code for least-effort reuse at that point.

This is literally software development, to me. That the output involves a configured system instead of a web app or whatever is orthogonal to being software development. It still feels like you are drawing a distinction without a difference.


Not the same at all. Let me rephrase the question.

Reading the article, I take the sysadmin to be the person who constructs and maintains the environment the code runs in, including the physical machines, but also things like dependency management (I infer).

Given that just about everything except the physical machine can be handled through devops automation (from experience), and those things don't require "special" skills other than docker/ansible knowledge, why are sysadmins still considered separate from software devs?

On a more opinionated note, I think that using people, and not code, to set up the operating environment of an application is pretty inefficient nowadays.


I started as a dev many years ago, then moved into sysadmin / devops stuff for over a decade, and then moved back to more hands on dev stuff again a few years ago. I think your statement that "those things don't require "special" skills other than docker/ansible knowledge" is over-simplifying things a bit.

Writing a production quality Dockerfile or Ansible Playbook requires more than just a bunch of "apt-get install" statements. You need a bunch of knowledge about how each bit of software should be configured and tuned for each individual use-case.

Once it's been written, then the devs can happily automate all the things as much as they like. Writing it in the first place is where a lot of what was traditionally sysadmin knowledge comes into play.

Anyone with a bit of patience and practice can follow a recipe in a cook-book. Not everyone can write one.


I'll agree that I definitely oversimplified it. I think what I was getting at was that it was a similar set of skills to other types of software development.


Thinking like this is what leads to tens of thousands of Mongo databases being exposed online.

Anyone can follow a "how to install X" guide and install a few packages and get a working config to get going and then automate it with their tool of choice but that doesn't come anywhere close to a production ready system.

When your job focus is shipping product that's all you care about, get it out the door as fast as possible so why learn anything more than the bare minimum?


Getting it out the door will make you learn faster. The choice to stop learning is a personal one IMO.


Software is complicated and error prone and it takes a good amount of skill and time to write it well. Personally, I'm not impressed at all with most software made in an Ops environment. I think it usually sucks and that sucky quality, along with the complexity of debugging and maintaining it, makes it a net deficit to productivity. I have been writing software in Ops for a long time and it is just annoying how little work is done correctly in this space. I think most Ops automation is so error prone that you might as well hire an intern to stare at your network and you'll get as good if not better system stability.

In fact, this is how Ops used to work. You'd use humans to oil the machinery of the basic automation you put in place, and they were damn good at keeping uptime high and catching problems nobody thought of in their automation. Now people just automate and pray, it seems, and then play catch-up indefinitely, or waste time redesigning.

We totally automate the physical machines too. But wherever possible by using tools, not writing software. Honestly I wouldn't want to insult the practice of software engineering by comparing my work to theirs.


> Software is complicated and error prone and it takes a good amount of skill and time to write it well.

This we can agree on.

> Personally, I'm not impressed at all with most software made in an Ops environment. I think it usually sucks and that sucky quality,

Not sure what an ops environment is, although I'll agree with the general statement that most software I see sucks and is of sucky quality, but I think that relates back to your first point.

> In fact, this is how Ops used to work. You'd use humans to oil the machinery of the basic automation you put in place, and they were damn good at keeping uptime high and catching problems nobody thought of in their automation. Now people just automate and pray, it seems, and then play catch-up indefinitely, or waste time redesigning.

I'm not proposing that the rise of devops precludes the need for people specialized in its use and implications, but rather that sysadmin work is essentially becoming a branch of software engineering. When setting up a system IS writing code, why is the "software engineer" different from the "sysadmin"?


With the rise of cargo-cult-style management, how often are managers different from a script that purchases items from Amazon randomly and bulk emails randomly generated business plans?


They're not really and they shouldn't be (although systems knowledge here becomes increasingly important). This is the point of DevOps, that operations can be treated as a software problem and managed in an automated fashion. The role of "SysAdmin" as presented in this article is fading quickly.


I believe UC Berkeley's approach to teaching computer science is the correct one.

1. In CS150 one has to build a hardware computer.

2. In another class one has to build an OS.

3. In another class one has to build a database.

4. In another class one has to build a network.

Education is key. Understanding the TLB inside the CPU gives one an appreciation for context switching between the only two privilege modes the CPU has hardwired into it: kernel and user.

Being a sysadmin is insufficient as education. Every developer should know how to build a hardware computer from the ground up.

Computer science at Berkeley doesn't have a "Java" class. The language one programs in changes depending on the Professor and topic. That way one doesn't get too attached to a language.

Most developers do not understand the hardware implementation of a thread, or nowadays of a Docker container. As a result, both get used wildly incorrectly.

Threads are useful when blocking on IO, to keep the CPU from starving. However, switching threads simply to swap between CPU-bound jobs is inefficient. Just adding a thread doesn't guarantee equal service to users, just as adding processes does not.
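
In Python terms (where the GIL makes the contrast even starker), a rough sketch of the distinction; the URL and work sizes are placeholders:

    import concurrent.futures
    import urllib.request

    def fetch(url):
        # IO-bound: the thread mostly blocks on the network, so other
        # threads can run meanwhile. This is where threads earn their keep.
        with urllib.request.urlopen(url, timeout=10) as resp:
            return len(resp.read())

    def burn(n):
        # CPU-bound: extra threads on one core just add switching overhead.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        urls = ["https://example.org"] * 8  # placeholder work items
        with concurrent.futures.ThreadPoolExecutor(max_workers=8) as pool:
            print(sum(pool.map(fetch, urls)), "bytes fetched")
        # CPU-bound work belongs on processes (real cores), not threads.
        with concurrent.futures.ProcessPoolExecutor() as pool:
            print(sum(pool.map(burn, [2_000_000] * 4)))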

Developers are using Docker today with no clue why they are doing so... it's just that everyone is doing it and they do not want to appear stupid.

Experience as a sysadmin is no substitution for education.


The unspoken elephant in this thread is that sysadmins and developers have broken relationships surrounded by organizational dysfunction.

Personally, I feel that system administration experience of some sort is essential to development skills, and development experience of some sort is essential to system administration.

Communication and leadership skills are essential to both. It _used_ to be OK to be the jerk at the office if you were good enough at your job (sysadmin or developer). That's no longer the case, because we've progressed well past the point where a single developer or sysadmin can be enough of an asset to look past their inability to work well with others.

Operations needs stability. Developers need to hit moving targets. Sysadmins need to keep things manageable, compatible, and secure. These aren't incompatible goals, and usually when you get the techies talking they are sympathetic towards each other's challenges and helpful to one another. With as many project managers, engineering managers, manager managers, directors, and executives as there are in the mix, I wonder whether the problems in all these environments aren't more of a historic issue with leadership failings.


You do not know pain until you've had to deal with government sysadmins as a contracted developer. And the govt has a strong preference for bringing sysadmins into the civil service rather than contracting that work out (like they used to) these days.

Which means the ones with actual, marketable skills aren't working for the government.

Spending months waiting for a sysadmin to perform a task you know takes about 3 minutes is great fun.


I think the problem here is that decent admin jobs are getting harder to come by these days. Used to be that any reasonable organization needed an admin just to keep systems straight; now they all rely on less-skilled help desk types and make up the difference with outsourcing companies, which in turn are shedding personnel and relying on big vendor contracts rather than individual skills.

I predict sysadmin culture and skills will be the next Fortran.


The underpinning logic seems false.

To extend the analogy of the house on a slope, if you don't tell the developer you need a house designed to be built on a slope, that is a problem with the spec provided to the developer.

Although it can at times be more efficient for a dev to have sysadmin experience, making blanket statements requiring all developers to come with such a background is just sensationalism.


Wholeheartedly agree.

I think the lesson the author is really trying to convey is the importance of learning about distributed computing [1] concepts and not necessarily system administration.

[1] https://en.wikipedia.org/wiki/Distributed_computing


I'm a front-end developer with very little knowledge of sysadmin stuff, but I can't agree more: the best developers I've ever seen had massive sysadmin experience.

Can you point some good resources about sysadmin I could use to improve my knowledge on the subject? Note I'm a noob when it comes to all this server-side magic.


When I was getting into ops 10 years ago the book "The Practice of System and Network Administration" by Thomas Limoncelli was a great overview (very high level and not specific) that we all read.

The Phoenix Project by Gene Kim is kind of a more modern take on it, but both can be read easily by people outside of the field, and I doubt the first one will ever stop being relevant.

Both are more about establishing proper ops methodologies. If you're looking for something more low-level, like a programmer's guide to sysadmin work, I can't think of anything off the top of my head.


Limoncelli's newer book _The Practice of Cloud System Administration_ is awesome. It's really more about how to build distributed systems to be performant and operational.


I started writing a book about cloud system administration before I read this one, FWIW. It's worth your time.


Oh wow, I need to check that out. Wasn't aware he'd written a cloud-related one.


Yes, yes. We get it: we should know all things, keep up with all the latest trends, work 80+ hours every week, and do our company's IT support on the side for free while we code their software. Or companies could compensate us for our time and hire more god damn people if they need it. You know, just saying.


I've worked at startups where I was one of the initial developers (and the most experienced on the team), so I've been forced to learn a lot about sysadmin work and design (and a lot of surrounding dev tooling, for instance). All in all, knowing more about sysadmin work has changed my dev experience: I think a lot more about how my code is deployed, how it works in prod, how to get data to improve it, and so on.

I do feel all devs at one point should have some sysadmin experience - if not professionally, then informally.

That said, sysadmin tooling is strongly divided into two types - configuration and code.

I love tools that can be coded (Gulp, for example) over tools that are configured (like Grunt), as the former generally have a lot fewer undocumented "gotchas" and getting started is generally a lot simpler. (I'm looking at you, Webpack!!!)


Wherever I've gone, I have seen the war between developers and sysadmins being fought. In the enterprise the sysadmins have the upper hand; in frontend, mobile, and SMB shops, the developers.

Developers don't have the burden of maintaining and upgrading multiple applications on the same server, with conflicting dependencies, 24/7. Sysadmins don't feel the pressure from end-users to deliver new features yesterday.

At the moment developers are winning because of Docker. But when they grab that responsibility they will be the ones called when their 50,000 containers are being hacked because of vulnerabilities. And they will be under fire for the system being down and costing the company a lot of money, with everybody breathing down their necks.


A great reason to have cross-experience at the management level is to understand the different motivations of the two roles:

- devs are incented (paid) to introduce changes to the system... mostly in the form of new features.

- sysadmins are incented to stabilize the system and make it scale.

In organizations with uncoordinated or siloed management, this leads to infighting... sysadmins try to stop "troublesome" devs from making changes at all and actively slow down changes in the name of stability and developers at the other extreme try to get changes in without caring about stability and blame sysadmins for that.

Great management with experience in both will find ways to incent the teams to introduce regular change while maintaining stability.


I think it's more accurate to say that good developers need to have some level of systems expertise which probably means OS & networking fundamentals + practical Linux experience. This is because you need to have some understanding of how your program will execute on (or across) machine(s), and some know-how to get your program up and running on a real machine available to users on a real network. Whether you need to be an expert in the latest best practices for managing a fleet of servers compliant with whatever regulations, that's more for the sys admin (though it never hurts to learn more, there's just so many things we developers could spend our time learning).


I get the sentiment. Personally, I could say the same thing about control theory, systems theory, economics, operations research, and statistics. And I could also share dozens of anecdotes of bugs and suboptimal software that was produced as a result of this lack of experience.

But at some point, you have to acknowledge that developers can't know everything. Go deep or go broad, but accept that both paths have their limitations and benefits. Maybe in some cases you'll run into a bug you created due to your lack of sysadmin experience, but don't feel bad about it...your experience has led you to other types of expertise. Fix it, learn from it, and move on.


Actually, I would rather language and tool designers have a go at a true System Admin job. There is a reason PHP gets installed by default and Ruby on Rails does not.

Frankly, at the end of the day, most businesses don't want something that was "working" breaking. All the conflict comes from that one desire. There is a reason System Admins have to do the "Patch Tuesday" dance and despise any software that just updates by itself. Getting yelled at because Chrome updated to a version not supported by your ISV or your local developers is an amazingly fun experience.

Yep, System Admins have to be the company nanny. It sucks, but the apps need to keep working.


True for the most part, however PHP is a language while Ruby on Rails is a framework.


PHP comes with all the stuff to make a website. Ruby needs a framework, and Ruby on Rails is the most popular. I think you proved my point for me.


It's hard for me to "get into" developers who don't know how the system works. I don't mean they have to be full-fledged sysadmins, but it just seems natural to me that they should have a decent understanding of the electronics, how to assemble the hardware, and what's involved in tuning it and keeping it all running.

It may have something to do with my education being EE, but that's not the heart of it. I never practiced EE. I taught myself (more or less in order) building circuits, programming, building PCs from pieces, and elementary system administration. I know there are lots who never "get" all three, but the best I knew did.


In the case described by the author, it seems like most of the trouble would be resolved by the developer understanding the concept of "distributed file storage is hard" and coordinating with ops early - the dev doesn't necessarily need to know the details of setting that up, but they need enough understanding of the domain to realize there's work to be done on the issue.

For that matter, I suspect most sysadmin / ops teams would be happier if developers could only strip "just" out of their vocabulary (and vice versa) - nothing more annoying than "could you JUST do {thing asker doesn't understand and thinks is trivial}"


Just like you typically distinguish front-end and back-end developers, why wouldn't you distinguish dev from devops?

Sure, it helps if an individual is not entirely myopic, and would be great if that person can wear multiple hats with skillset spanning all those disciplines.

However, those different labels exist primarily because they align to distinct mindsets and approaches to a problem, in a mutually exclusive manner.

Do you want a jack of all trades, but master of none?

Seriously, how many individuals out there do you think have the capacity to develop their careers so that they have deep capabilities in all of those disciplines?


I think that there's good advice in this article, but a couple counter-points:

- Plenty of software developers work for small- to mid-size companies where the Node or Rails server running on your machine is nearly identical to the one running in production.

- Plenty of software developers work exclusively on the front-end, which means that the computer your code runs on in production is very very similar to the one you develop on (although then you have the added concern of testing on lower-end mobile devices, other browsers, etc)


I think everyone would benefit if they learned just a small amount about what their team members have to deal with. For example, project managers should learn a little about what programming involves while managing programmers and graphics+UX designers should learn a little about coding to make their designs more practical (i.e. identifying cases where a slightly better design requires a massive amount more coding effort and probably isn't worth it).


I think many professions should have sysadmin knowledge. In my work (biology) there is a lot of unnecessary clicking, not only in Excel (where Python/R would be much more efficient) but also in web interfaces to transfer data daily. The latter is often completely automatable using cron and rsync over ssh. Many repetitive actions are a shell script or command line pipe away. But people don't know.
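
For instance, a daily transfer like that boils down to a few lines (the paths and host are invented for illustration, and ssh keys are assumed to be set up already):

    import subprocess

    # Push yesterday's exports to the archive host. Unattended once ssh
    # keys are in place; schedule with a crontab entry such as:
    #   0 2 * * * /usr/bin/python3 /home/lab/sync_exports.py
    subprocess.run(
        ["rsync", "-az", "--delete",
         "/data/instrument/exports/",
         "lab@archive.example.org:/srv/incoming/exports/"],
        check=True,
    )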


> automatable using cron and rsync over ssh ... shell script or command line pipe ...

Where I come from, that's called "knowing how to use a computer".

Or at least it was, until Windows et al arrived.

(Interesting how the user-interface technology advance called the GUI decreases the efficiency of human/computer interaction.)

Except ... for those who still regularly use a Unix-like OS, that knowledge still is quite basic and, thus, surprisingly difficult to trumpet on a resume.


I understand the point here; it can be useful to understand the abstractions on top of which you are working.

But given that most engineering projects are CRUD apps without scaling problems, and PaaS companies like Heroku totally abstract away the system administration piece for those projects... I certainly wouldn't recommend a new engineer spend any time learning system administration.


I'm moving from dev to devops for a while, because of similar reasons. Excited to learn all these new things and peer beneath all the tools and abstractions I rely on! I have no doubt in my mind that it will make me a better developer - most abstractions have a tendency to leak, eventually.


This seems obvious to me. You'll never see your code again with the same attitude after being on call.


As a developer with practically no sysadmin skills, how would I go about improving these skills?


Small/micro AWS/Azure instances. Set up your own CI/CD pipeline. Read sysadmin books. Get comfortable with the environment shell (Powershell/bash/etc.) and write some scripts.

Bonus points if you make an actual product that solves a problem and also practice all of these skills.
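
A typical first script along those lines might be a disk-space check (a minimal sketch; the mount points are just examples):

    import os
    import shutil

    # Warn when any filesystem of interest is nearly full -- the classic
    # starter monitoring script.
    for path in ("/", "/var", "/home"):
        if not os.path.exists(path):
            continue
        usage = shutil.disk_usage(path)
        pct = usage.used / usage.total * 100
        flag = "  <-- getting full!" if pct > 90 else ""
        print("%s: %.1f%% used%s" % (path, pct, flag))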


I have been at this since the early nineties and can't comprehend that someone can be a developer without any experience of how to set up and administer the environment. I guess it's a normal thing now?


Just a small nitpick with the article: it probably meant "backend software developers". Because a graphics developer who mostly writes shader code would surely benefit more from artist experience.


It's incredible how apparently 99% of HN are web developers who don't even bother considering that other things exist. It's not at all representative of the larger industry.


Very much this. I don't pretend to be a dev, but I listen to and work with them to accomplish business goals. Sysadmin work is much more business-management oriented in the real world, and seeing so many devs say "I know how to admin my software in prod, therefore I am a sysadmin too" makes me realize just how little the HN/SV web-startup dev knows about real-world business systems administration for established businesses.

Once again, like I said in my other comment, even in web startup land, this is a failure of management, in particular CIOs and CTOs, and the management that selects them.

The problem is that companies can usually manage to survive for quite a while ignoring these problems. There is little short-term feedback for duct-taped systems... until that day when shit starts dying and no one can fix it. Or a crypto-ransomware variant hits the file server. Etc.


On top of the 20 years of experience we already require of junior developers?


Probably preferable to calling someone a developer after a 12 week bootcamp.


Just make sure they're 22, 23 tops. Can't have any of those 30+ year olds with their early onset dementia.


Careful with that should, or you might take someone's eye out.


As developers, the way we resolve this is by including the sysops department as a stakeholder that produces requirements for us to negotiate and implement.


The general trend in software development appears to be toward generalization over specialization. I'm not sure this is a good thing.


Funny, I always see this from the other side. I always find myself thinking "I wish sysadmins had more programming background."


Software developers should have experience in the things they need to get their job done. This is not a revelation.


Yes, sysadmin experience will give me vital insight into my work writing embedded code for dishwashers.

Snark aside, if the post was headed "Software developers need to understand the environment where their code will be running or they may not build it properly", which is the point being badly made by conflating "environment" with "some kind of commodity x86 with a standard OS", it would be a lot more applicable.


sysadmin really means "technical debt manager" most of the time these days.

I really spend the majority of my time trying to slot something into place in a way that no one will notice and that doesn't disturb existing stuff... often that means duplicating bad behavior because it's become a dependency.


Maybe it's only because I've only ever worked by myself, but it's hard to imagine a software developer who isn't also a sysadmin by necessity. How else do you debug things?

In what city or company is the barrier to entry for software developers so low?

I just can't imagine paying above median income for a software developer who can't install, to use an example given in this thread, Visual Studio.


Managers should have management experience and less nepotism experience. One step at a time, guys.


One thing I like about DevOps is never having to hear "it works on my machine!"


I always take "it works on my machine" as a synonym for "you need to explain your problem better". In my experience most problems that "work for me" are caused by somebody installing an unstable build and patching it with random crap they thought of.


I think the reverse is much more true, but maybe nobody argues about that anyway.


This needs to be mandatory for developers working on printer software.


Ironically I can't see the images at the bottom of that page.


Shameless plug perhaps, but this is free and will help here: https://www.cncf.io/event/webinar-cloudnativenetworking


I worked in the home office of WalMart Stores, the retailer, from the mid 90s until 2009 as a hybrid network engineer/developer. That is, my team worked in Network Engineering, but we wrote tons of code, because when there are millions of different IP addressable nodes on a centrally managed network, there will be code to manage it. (:

In that era, I think WalMart handled this apparent developer/sysadmin conflict quite well. Believe it or not, even though we (the whole company) had exceptional uptime, we were also extremely agile. LOTS of code was written and rolled out across thousands of sites, multiple times a day.

There were a number of keys to this, and I'll enumerate a few.

1. Most new hires to development positions (those with minimal previous experience) had to spend their first six months working at one of the four or so major help desks. Note: they were paid their target salary, but in an hourly fashion.

2. The various operations teams had complete and final say over what went out, and how incidents were handled. A couple of the big operational areas were Unix Operations, Network Operations, Windows Operations and Mainframe Operations. I called them 'teams' but each had a number of teams, and their own help desk.

3. Any time there was an impacting problem associated with a program developed by team X, a war room was called by the relevant operational area. A member of team X (a developer) had to stay in that war room (switching off between team members if it ran very long) until the production problem was not only fixed, but also deemed unlikely to recur, except in cases where that depth of fix would take a long time.

4. Every development team had a pager rotation, with rigorous expectations about responding to such pages. This was primarily to support the previous point.

5. Because of the enormous operational scale, all of the major operational areas had dedicated teams focused on automation. My team was that team for the networking area. Furthermore, most of the rest of the operational folk read/wrote code to some extent.

In short, incentives were aligned. Teams that wrote externally facing code felt pain if the stuff they wrote and released caused problems. Operational folks wrote/managed/interacted with tons of their own code in order to manage the enormous infrastructure. Also, ops folk were far more willing to let things move with velocity knowing that the people who actually wrote the code would be required to support it, globally, any time of the day or night.

Another, perhaps even more important reason we were so successful during those years (and the years before) was a strong and vibrant esprit de corps. The entirety of Information Systems was, at the time, around 2,000 people, and we were facilitating double-digit year-over-year growth of a 150 billion dollar company. We had over 5,000 remote sites in 15+ countries, with a diversity of software and infrastructure that was honestly pretty astounding. Each of those sites had quite a surprising amount of infrastructure.

We worked hard, and we produced huge velocity with fantastic uptime. For example, the network achieved six 9s of availability in a couple of quarters.

In the end, while things were sometimes contentious, we trusted each other, and only a minority of teams were forced to work bad hours for any length of time.


Agreed.


As software developers, we should notice and keep in mind that all these experiences are very important. Thanks for sharing an important article.


That’s interesting.


Sysadmin work is getting automated. There is no reason to learn it.


I have ten years of experience in systems and have written tooling in Python and JavaScript for Netflix, Stanford University, and Google (current). I have never been a software engineer officially, but I want to be. I am a PC gamer, anime lover, and a proud SJW. Any takers?


What are you on about?


I am interested in finding positions. Sorry if my post was ambiguous, I know it is a bit out of place here.


No. Developers do not necessarily need any experience as a sysadmin and I would advise against moving productive developers into this role for the purpose of misguided training.

The argument about understanding how code scales beyond one computer is fallacious. In fact it's quite easy to learn and practice all kinds of distributed high scale architectures without getting up from a desk.
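To make that concrete, here's a rough sketch of the kind of desk-bound practice I mean, using only the Python standard library (ports and topology invented for illustration): a few local "nodes" behind a naive round-robin dispatcher.

    import itertools
    import threading
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def make_node(port: int) -> HTTPServer:
        """Create one local 'node': an HTTP server that names itself."""
        class Handler(BaseHTTPRequestHandler):
            def do_GET(self):
                self.send_response(200)
                self.end_headers()
                self.wfile.write(f"response from node on port {port}\n".encode())
            def log_message(self, *args):
                pass  # silence per-request logging for the demo
        return HTTPServer(("127.0.0.1", port), Handler)

    ports = [8001, 8002, 8003]
    for server in map(make_node, ports):
        threading.Thread(target=server.serve_forever, daemon=True).start()

    # Naive client-side round-robin "load balancing" across the nodes.
    backend = itertools.cycle(ports)
    for _ in range(6):
        port = next(backend)
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
            print(resp.read().decode().strip())

Swap the dispatcher for consistent hashing, drop a node, add retries; all of it runs on one laptop.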

I will agree it's important to understand the people and work connected to your job. It's probably a good idea to eat lunch with the sysadmin, or QA, or UX person.

I would also agree it's important to have respect, empathy, and collaboration with others, but that doesn't mean you need to switch jobs.


I strongly disagree with this, from experience. There is no difference between a "developer" and a "sysadmin" in a healthy shop in 2016. It's all the same job. Until you actually put something into practice in a hostile environment you have no idea what rakes are going to be in that yard. I would be a bad developer if I did not understand how the systems my code runs on worked. I would be a bad sysadmin if I couldn't marshal my systems through code and fully and completely understand the software that would be run on them. (As it happens, I'm not too shabby at either, and it's additive.)

The mindset that you are describing is why that "devops engineer" that gets hired so very often ends up being burnt at both ends being the savior of the team because, unlike the developers, he or she does understand both sides of the equation and is able to solve the problems that the incurious developer cannot. I would strongly urge that you reconsider.

And, for your own sake, please excise the word "fallacious" from your vocabulary. It will help you, I promise.


>There is no difference between a "developer" and a "sysadmin"

Ridiculous. Many developers have specialties that take years to gain proficiency. If there were no difference we could just replace a math PhD doing computer graphics with a sysadmin. Sometimes it's possible, but the blanket statement does not hold up.

>I would be a bad developer if I did not understand how the systems my code runs on work

I agree, but so what? Becoming a sysadmin is not the only way to do that. You shouldn't confuse what you've seen work, with being the only way something can work.

>[your] mindset is why "devops engineers" get hired and very often end up being burnt at both ends

You’re reading way too much into this. I never said a sysadmin rotation couldn't be helpful for some. I said devs don't necessarily have to do it to be great at their jobs. I wouldn't pull a dev who was in the zone, so to speak, and being highly productive into a sysadmin role while misguidedly thinking it would have no impact on delivery.

>[sysadmins are] able to solve the problems that the incurious developer cannot

I never implied I wanted “incurious” devs who couldn't solve problems to the extent they needed a “savior” as you put it. Maybe you are projecting your past experiences onto others based on overzealous inference? I didn't say what you suggested, and I don't believe it.

> please excise the word "fallacious" from your vocabulary

What are you talking about? The definition is “based on a mistaken belief”. I claim the argument that devs cannot understand scale out architecture without being a sysadmin is based on a mistaken belief. Why are you so against that word?


> Why are you so against that word?

Because, to be frank: it makes you sound like a tool. That's what 'greglindahl was referring to in his post, 'cause I said exactly that before I edited it to assume good faith and be nice about it. But leading off your reply with "ridiculous" confirmed that you're okay with that. I'm not gonna play this game with you; I'm having way more interesting and way better conversations in this thread with people who don't wanna dance that dance. Haveaniceday.


That's unfortunate it came across that way. I did not intend to do anything other than make a good faith (and maybe passionate) argument.


Thank you for re-wording the 3rd paragraph before I could complain about it :-)


I maintain that the sentiment was not wrong.

...also, man, don't tell me you're on Hacker News while you're camping...


There is value in having breadth of experience: it lets you produce solutions that either have less impact on metrics you might otherwise ignore, or better fulfill requirements that are hard to observe without having been in the field.

Should *everyone* be in other roles full time? Depends.

Should *everyone* have had exposure to expand their knowledge? Yes, absolutely. It should also be refreshed so they don't forget.

Most adults from time to time are exposed to this for 'common day to day tasks'. Either in the form of being an assistant to someone else doing a task or in directly doing it themselves.


If you just talk to a Formula 1 driver but never drive yourself, does that make you a good driver? I don't think so.


Nope. Software developers should not be using databases directly; they should use APIs: REST, microservices, e.g. Firebase. Software developers should not be running servers; they should go serverless and deploy static HTML to a CDN,

etc. If you are still using Docker, Puppet, or worrying about scale/security... you are living in the past. (One example: https://github.com/struts3/what-demos )
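To illustrate the pattern (the project URL below is hypothetical), the client reads and writes through a hosted JSON API, such as Firebase's Realtime Database REST endpoints, rather than connecting to a database directly:

    import json
    import urllib.request

    BASE = "https://example-project.firebaseio.com"  # hypothetical project

    def put_user(user_id: str, data: dict) -> None:
        # The Realtime Database accepts plain JSON over HTTPS:
        # PUT /<path>.json writes the value at that path.
        req = urllib.request.Request(
            f"{BASE}/users/{user_id}.json",
            data=json.dumps(data).encode(),
            method="PUT",
            headers={"Content-Type": "application/json"},
        )
        urllib.request.urlopen(req)

    def get_user(user_id: str) -> dict:
        # GET /<path>.json reads the value back.
        with urllib.request.urlopen(f"{BASE}/users/{user_id}.json") as resp:
            return json.load(resp)

    put_user("alice", {"plan": "free"})
    print(get_user("alice"))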


You're right. But like any big change it will take time. Say 5-10 years until most new stuff is serverless.



