Hacker News
“It's The Future” (circleci.com)
1323 points by knowbody on Aug 17, 2016 | 521 comments

The article perfectly summarizes my frustration and sentiment. These days I hear these buzzwords all the time. I work as a consultant for an enterprise product, and most of the people I meet somehow catch these buzzwords and blurt them out in front of everyone during meetings and discussions, either to show off that they know technology and whatever is on the market these days (also the latest iPhone, Apple news, Tesla, space exploration, and what not), or, I feel, to hide their insecurities.

Anyway, long story short, most of these people do not really understand why they need all this rocket science to manage < 500 internal users. One of the new buzzwords I am hearing these days is mostly related to big data and machine learning. One of my managers came to me and asked why we don't integrate our product with Hadoop: it will solve the performance problems, since it can handle a lot of data.

I am frustrated by the industry as a whole. I feel the industry is simply following marketing trends. Imagine the number of man-hours put into investigating technologies, and the projects dropped midway upon realizing the technology stack is still immature or not suitable at all.

This is how we managed this problem back in the times when Visual Basic was king and we used Visual FoxPro instead.

People wanted their apps to be made with Visual Studio (BTW, FoxPro was part of the package).

So they would ask: "What is the app made in?"

"In Visual, Sir."

Done. End of story (most of the time, obviously; sometimes people are more dangerous and press the issue ;) ).


The point is not to focus on the exact word, but on what people believe the word will give them.

So, for example, "Big Data". The meaning to us matters zero. The meaning to some customer is that he has a largish Excel file that, with his current methods and tools, takes too long to get results from.

So. Do you use "Big Data Tools"?

"Yes Sir."

And what about using Hadoop?

"We use the parts of big data tech necessary to solve this. Whether we need Hadoop or other similar tools that fit better with your industry and use the same principles will depend on our evaluation. Don't worry, we know this."

Or something like that ;). Knowing the worry behind people's words has helped me a lot, even with people with WORSE tech skills (damn, I have built apps for almost illiterate people with big pockets whose only reference for tech was their cellphones!).

And the anecdote about the largish Excel file that was too big and took too long? Yep, true. And it was for one of the largest companies in my country ;)

You're exactly right. In the end, the customer is worried about solving their problem and they're asking if you're aware of <latest thing they've heard>. It's like when you go to the doctor and say "I've read of an experimental new treatment for X, can't we do that?". The doctor has probably already heard about it.

Yup. An experimental treatment is probably not even available to be prescribed; besides, it's ethically questionable to use a treatment when the risks (or benefits) haven't been clearly delineated. Even if it could be used, a responsible prescriber would want to try all "standard" remedies before attempting an experimental one.

That's called practicing conservatively, minimizing chances of bad outcomes. It's a matter of astute clinical judgement to glean optimum risk/benefit ratio in a particular case. Since no two cases are ever exactly the same, good judgement is a constant necessity.

I see that the process of developing software has many parallels, and it's not surprising that everyone experiences so much brokenness. When people complain to me about some mysterious program misbehavior (stuff I had nothing to do with), I empathize with them and try to help them think logically about the problem they're having.

Only rarely can I offer any real insight, but given the insane proliferation of the alphabet soup of identifiers attached to all the "new things" out there, no one I know in the industry feels they have a handle on what's happening.

Seems like the pace of "innovation" will lead to even greater levels of incomplete and dysfunctional systems, and sooner or later to truly catastrophic failures.

I think the buzzword abuse exists because of people who don't want to take the time to learn real skills, and just want shortcuts to sound smart and relevant.

I am very skeptical of people who are "BizDev" or "Project Managers" or "Managers" or "Scrum Masters"; they generally don't know what they're talking about and rely on buzzwords.

Not necessarily. I suspect that people whose job positions overlap, but who still need to use a "shared vocabulary", will diverge in their understanding of each word, not out of laziness but simply because it's a different job.

For example, if a DBA and a JS developer say "We need to use a scalable database", they probably don't have the same thing in mind about what "scalable" or "database" exactly is; however, both are concerned with providing data performantly.

So, if a naive web developer wants "a scalable document store!", you can just give them Postgres and presto! ::troll:: ;)
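For what it's worth, the troll answer more or less works: a relational table with one JSON column behaves like a document store. A toy sketch, using SQLite's built-in JSON functions as a self-contained stand-in for Postgres jsonb (table and field names are made up for illustration):

```python
import json
import sqlite3

# A relational table doubling as a "scalable document store":
# one TEXT column holding JSON, queried with built-in JSON functions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")

docs = [{"user": "alice", "tags": ["admin"]}, {"user": "bob", "tags": []}]
conn.executemany("INSERT INTO docs (body) VALUES (?)",
                 [(json.dumps(d),) for d in docs])

# A "document" query via json_extract, no NoSQL database required.
row = conn.execute(
    "SELECT json_extract(body, '$.user') FROM docs "
    "WHERE json_extract(body, '$.user') = 'alice'"
).fetchone()
print(row[0])  # alice
```

In Postgres the equivalent would be a jsonb column with the `->>` operator, plus a GIN index once queries get serious.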

I'm wondering why you have managers in scare quotes. Are you suggesting that they aren't really managers, or that manager as a position is just some sort of fraud?

Also, Project Manager and Scrum Master are just positions that describe roles and responsibilities in an organization / on a team. The people filling those roles needn't be clueless.

I feel people in those positions are just people who want a job in tech / a startup and have gotten there not because they deserve those positions but because they've given up on learning the skills and resorted to becoming a "manager". At least this is what I've seen from my experience. They generally don't add any value because they don't have any skills and don't know how to do anything.

The positions I mentioned above are usually the ones that people who failed to pick up any valuable skill seem to resort to.

Elegant summary and recommendation. If NumPeopleWhoUnderstandBuzzwords < NumPeopleWhoUseBuzzwords, then don't assume your answer has to accurately capture the nuances to satisfy them.

If they don't accept your answer and ask a follow-up, then they're probably a person worth actually having a conversation about the pros and cons with.

I totally agree with this and have done it myself. I've also been on the receiving end of this when I actually do know the market / tools better than the consultant and I actually do know exactly what I want, and the consultant has tried to brush me off. That does not end well for them. Just a word of caution to take extra care to know the background of the person you're talking to before starting down this path :)

> I am frustrated by the industry as a whole. I feel industry is simply following marketing trends.

My work lands me in a number of different conferences in non-software industries. This is true for all industries. It's just that ours has a faster revolving door. That, in addition to a low barrier to entry (anyone can claim they're a web developer), leads to a higher degree of this madness. It's just part of human behavior to seek out, and parrot, social signals that let others know you, too, are an insider.

Personally, I have to avoid a great number of those gatherings, since the lot of them are just a circlejerk of low-density information. If I pay too much attention to those events, I catch myself looking down my nose, and since that isn't productive/healthy behavior, I avoid landing myself in a place where guys with Buddy Holly glasses and <obscure-craft-beer> argue about which WordPress plugin is the best.

Another way to signal others that you, too, are an insider is by calling a current trend a hype.

To clarify, I was not calling Docker hype, nor specifically remarking on any particular item. I use Docker religiously. I even use dokku for 30+ toy projects.

My remark was to highlight that buzzwords are often used for "me-too" ankle-deep conversations/articles. Whether someone calls it Devops, or Systems Engineering, makes no difference to me. However, I favor pragmatic conversations about the topic, rather than buzzword bingo.

Examples include: "MongoDB sucks.", "Everyone should use Docker", and "What? You mean you're not using Kubernetes for your CRUD app?"

Basically, blanket statements that accomplish nothing more than to send social signals.

From 20+ years of experience and having seen tons of trends just die, most are just that, hype.

But calling everything that's new "hype" and pointing to the past as the only things that are "real" or "solid" is also a form of groupthink.

Yeah, but I think it's the inverse kind of groupthink than what the industry suffers from.

So it's healthy to embrace it as counter-balance to the constant hype.

(Besides, whether something is "real" or "solid" I think can mostly be answered in hindsight -- when it's mature enough and tested enough. In which case calling only things in the past solid is prudent).

I agree, and am guilty as charged.

"I am frustrated by the industry as a whole"

Unfortunately, I have to agree as a developer. My job is to make a fast, reliable, stable product, but at the same time I'm questioned about the tools I use by people who don't have any knowledge but have heard of the latest trend.

But sometimes it's also very easy to please people. Big data: just insert 10M records into a database and suddenly everyone is happy because they now have big data :|

These discussions are perfect examples of why building good social skills can be more important than learning the next greatest programming language 5.0.

I love you for saying this because it needs to be said.

Thank you.

> But sometimes it's also very easy to please people. Big data: just insert 10M records in a database and suddenly everyone is happy because they now have big data :|

Since when are 10M records considered big data?

My go-to gauge for big data is that it can't fit in memory on a single machine. And since that means multiple TB [1] these days, most people don't really have big data.

[1]: Heck, you can even rent ~2TB for $14/hour! https://aws.amazon.com/ec2/instance-types/x1/

I get your point, but 10M records can be big data depending on what you're doing with it. Not big on disk, but extremely unwieldy depending on how it's structured and how you need to query/manipulate it. I led internal product engineering at a large multinational for a long time, and we accrued so much technical debt as a result of having to handle the stupidest of edge cases, where queries against just a few million (or even thousands of) records took multiple seconds (in the worst cases, we had to schedule job execution because they took minutes) because of ludicrous joins spanning hundreds of tables and the imposition of convoluted business logic.

Almost all of that is overall poor architecture, and most companies don't hire particularly good developers or DBAs (and most web developers aren't actually very good at manipulating data, relational or not), but it's the state of the union. That's "enterprise IT". That's why consultancies make billions fighting fires and fixing things that shouldn't be problems in the first place.

I think that is why he had the :| face at the end.

Oh haha. I thought that was a typo!

> big data is that it can't fit in memory on a single machine

A Lucene index can be much larger than your current RAM. It can be 100x that. The data will still be queryable; Lucene reads into memory the data it needs in order to produce a sane result. Lucene is pretty close to being the industry standard for information retrieval.

My definition is instead "when your data is not queryable using standard measures".

I literally heard that "Big Data is something that is too large to fit into an Excel spreadsheet". The speaker was serious.

I unsubscribed from that (non-tech) podcast.

I would have said it's when a single file is bigger than the maximum size of a disk, so say 4TB. It's why we used MapReduce back in the 80s at British Telecom for billing systems: the combined logs would have been too big to fit on a single disk.

I'd say that if it can't fit in RAM but can still fit on a single SSD, it doesn't count as big data either.

Not sure how accurate that is, since you can buy 60TB SSDs these days.

ergo, not big data.

Yeah that's everybody's gauge if they actually work with it, which was the point.

On a positive note, it sounds like proposals to use newer technology are welcome. I keep seeing the opposite: "No, this is too different, it could break stuff."

IME that comes with: sure, the new tool looks cool, but is it battle-tested? How many tools end up being relied on heavily while they're still in beta? And does it solve any of our current problems, or is it just neat?

As a grumpy SA, I see way too many people push for new tools because they "seem cool", instead of asking "Do they solve a problem we have?"

Personally I prefer to wait until technology is battle tested before adopting. New technologies are for side projects imo. If I had to categorize myself I would say early-late majority on this graph (https://en.wikipedia.org/wiki/Diffusion_of_innovations#/medi...)

Things we consider industry standard, though, why should you need to fight for? An example I can think of: dependency injection. Ideally you can test your software better and release more reliable builds. Believe it or not, I do come across companies that still are not aware of these concepts. Introducing it would be possible without breaking anything, because you can continue instantiating services the old-fashioned way.
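To illustrate the point about introducing injection without breaking existing call sites, here's a minimal constructor-injection sketch (all class and method names are made up for illustration):

```python
class RealDb:
    """Stands in for a real database client."""
    def fetch_orders(self):
        return ["order-1", "order-2"]  # imagine a real query here

class Report:
    def __init__(self, db=None):
        # The default keeps the "old fashioned" instantiation working,
        # so existing callers of Report() don't break.
        self.db = db if db is not None else RealDb()

    def count(self):
        return len(self.db.fetch_orders())

class FakeDb:
    """Deterministic test double; no database needed in tests."""
    def fetch_orders(self):
        return []

print(Report().count())          # 2: old style, still works
print(Report(FakeDb()).count())  # 0: dependency injected for testing
```

The migration is incremental: add the optional parameter, then move tests over to doubles one by one.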

With newish stuff that's still changing, if it won't impact production (i.e., tooling) I'm up for adopting it earlier than usual.

One example I can think of is JavaScript bundling and packaging. This would not impact production, but will have a pretty big impact on feature integration between team members and rate of completion. In MVC you need to hand-type the path of all your JS files and stick them into bundles. Not bad, not great either. Instead you could take your flavor of package management and have it bundle and minify your JS files for you automatically.
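As a toy illustration of what the bundler automates (real tools like webpack also handle dependency order, minification, source maps, and more), the core concatenation step can be sketched like this, with made-up file names:

```python
import pathlib
import tempfile

# Set up two tiny "source" scripts in a scratch directory.
src = pathlib.Path(tempfile.mkdtemp())
(src / "a.js").write_text("var a = 1;\n")
(src / "b.js").write_text("var b = 2;\n")

# Instead of hand-typing every script path into the page, glob the
# directory and concatenate everything into one bundle file.
bundle = "".join(p.read_text() for p in sorted(src.glob("*.js")))
(src / "bundle.js").write_text(bundle)
print(bundle.count(";"))  # 2 statements made it into the bundle
```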

I've been around government contracting and when you see problems that come up a lot, that we have industry standard solutions too, it's hard not to feel frustrated. I get where you're coming from though, just sharing my experience :)

It took me years to realize the reason programmers do this is because the tools that "seem cool" make their lives easier at the expense of everything else. This is where the popular traits of "laziness and hubris" become a liability instead of an asset.

More programmers need to embrace the suck.

> tools that "seem cool" make their lives easier at the expense of everything else.

I'd argue the opposite. Instead of spending time reflecting on how cool and useful their code is, or hardening it up, devs spend too much time reinventing the wheel. All this work to learn the next new fad is killing productivity.

Easier might not be the right word; 'tools that allow them to be lazier' might be more accurate. Gluing together pieces somebody else wrote, trying to get them all to work with as little effort as possible, and being surprised when it doesn't work well.

> devs spend too much time reinventing the wheel

I'd argue the opposite. They spend too much time not reinventing the wheel. They strap factory made bicycle wheels onto a car and are surprised when the wheels break. They could benefit from spending more time trying to make a better wheel.

Or learn about better wheels designed by smart people back in the 60s and 70s, when no one had the capability to just keep sticking wheels onto cars to see what works - so they had to rely on thinking and solid engineering practices instead.

Precisely why I've started buying technical books from ages past. I'm working my way through Algorithms + Data Structures = Programs by Niklaus Wirth, Constructing user interfaces with statecharts by Ian Horrocks and Practical UML Statecharts in C/C++, Event-Driven Programming for Embedded Systems. The last one has been especially enlightening.

Do you have any suggestions for which 'better wheels' people should be looking at?

SICP is a classic I can highly recommend. It made me aware of just how much the "new, smart" approaches to organizing code that people like to attribute to their favourite programming model (like "OOP is best because classes and inheritance mean modularity") are actually rehashings of obvious and general ideas known very well in the past.

I generally like reading on anything Lisp-related, as this family of languages is still pretty much direct heritage of the golden ages.

The stuff done by Alan Kay, et al. over at PARC is also quite insightful.

If it ain't broken, why fix it?

In my case, usually something is broken or breaking in production frequently enough to warrant some changes. Plus, there are other reasons to make a change even though it's not broken.

Sometimes it can make you more productive. Or, even though your site is still responding to current customer demands in a timely fashion, you know that the mobile experience could be significantly improved now that browsing via cell phone is on the rise.

Another thing to consider is employability both from a company and individual perspective. If you can keep up with moderately current (not the latest and greatest) trends, you'll attract people who want to grow in their careers. I wouldn't want to work on C# 2.0 using Visual Source Safe. It's hard to convince a company that you can learn git on the job.

In general I like to move without introducing breaking changes. I'm not a cowboy coder, it's really exhausting working with one. I do think there's merit in realizing when it's time to change though.

As long as the database isn't relational, I guess.

10 million rows in a relational database doesn't need to be bad nor is it big data.

Rows are a bad measure of "big" when it comes to data. A measurement of bytes, and more specifically bytes per field and how many fields the records have, gives a better indication of how the data will be written and potentially searched.

10 million rows of 5 integer values is a pittance for any relational database worth using in production. 10 million rows of 250 text columns would be horrendous for a relational database.
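Back-of-the-envelope, with assumed field sizes (4-byte integers, ~100-byte text values; neither figure is from the thread), the two cases differ by three orders of magnitude:

```python
rows = 10_000_000

narrow = rows * 5 * 4    # 5 four-byte integer columns per row
wide = rows * 250 * 100  # 250 text columns, ~100 bytes each

print(f"{narrow / 2**30:.1f} GiB")  # ~0.2 GiB: trivial for any RDBMS
print(f"{wide / 2**30:.1f} GiB")    # ~232.8 GiB: painful to scan or index
```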

Someone once suggested to me that 'big data' begins when it doesn't fit in RAM in a single rack any more.

Yup, that's essentially what looking at byte size means. However, just because it doesn't fit in memory doesn't make it big data if it's just poorly engineered.

But many times this happens because of wasted or bloated indexes that aren't useful, or when data types are picked incorrectly.

For example, I once worked on a database where the original developer used Decimal(23, 0) as a primary key. This was on MySQL, and it ended up taking 11 bytes per row, versus a BIGINT, which would have been just 8. In one table, maybe not so bad, but when you start putting those primary keys into foreign key relationships... we ended up with a 1-billion-row table in MySQL that had 4 of these columns in it. That might make it "big data" by that definition, but it's also just bad design.
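Rough arithmetic for the overhead described above (raw key storage only; the matching index entries would roughly double the waste):

```python
# DECIMAL(23,0) stores in 11 bytes on MySQL; a BIGINT takes 8.
rows = 1_000_000_000  # the 1-billion-row table from the anecdote
cols = 4              # four such key columns per row

decimal_bytes = rows * cols * 11
bigint_bytes = rows * cols * 8
wasted_gib = (decimal_bytes - bigint_bytes) / 2**30
print(f"{wasted_gib:.1f} GiB wasted")  # ~11.2 GiB just in key storage
```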

Another example in that same database was using text fields in MySQL for storing JSON. Since text fields in MySQL are stored as separate files, every table that had one (and we had several tables that housed multiple) ran into large IO and disk-access issues.

"Big" data is probably a bad term to use these days because of how easy it is to accidentally create a large volume of data without needing a big data solution: it's not the business that needs it, it's the poorly implemented system that does.

But the real reason we talk about fitting in memory comes from the core of the issue: IO. Even a dataset fully in memory could end up being slow if it's Postgres with a single-threaded reader scanning a 500 GB index. AWS offers up to 60 GB/s of memory bandwidth, and we'd need it for this index, since even at that rate it would take almost 10 seconds just to warm up the index in the first place.
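The timing claim above is simple arithmetic: bytes to scan divided by bandwidth.

```python
index_gb = 500       # size of the index being scanned
bandwidth_gb_s = 60  # peak memory bandwidth, per the comment

print(round(index_gb / bandwidth_gb_s, 1), "seconds")  # 8.3 seconds
```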

>Since text fields in mysql are stored as separate files

Bwuh? Over in MS SQL you just go for an NVARCHAR and forget about it. What is the right way to store this data (if you really do need to store the JSON rather than just serializing it again when you get it out of the DB)?

A varchar is different from a text field in MySQL: http://dev.mysql.com/doc/refman/5.7/en/blob.html

It stores text fields as blobs.

I suppose now the right way would be the JSON data type. It didn't exist when I was working with these servers, though (or they were on a much older version of MySQL): https://dev.mysql.com/doc/refman/5.7/en/json.html

That's soon going to be on the order of 100 terabytes, so there will be only a handful of companies doing big data ;-)

I'm only aware of servers up to 12TB. Care to elaborate?

He/she said a whole rack of servers. I actually took 30 servers of 2TB each and rounded up to 100. With 12TB per server it will already be over that.

10M rows in a relational database is a very low number (depending on the size of the row of course).

I know, it was a sarcastic follow up to the "now they have big data" part of the original comment.

"SQL doesn't scale." It needs to be in Mongo or whatever NoSQL database is in right now. I have heard all sorts of nonsense regarding "big data" in the last few years.

Ahaha, I didn't read the sarcasm that time, sorry for replying with TMI.

I never thought I would appreciate Java...but the industry has really made me.

Take your .war file, drop it onto JBoss. It deploys across the cluster in a zero-downtime manner; isolates configuration; provides consistent log structure, cert management, and deployment. You can deploy dozens of small .war files to the same server and they can talk to each other. Load balance across the cluster automatically based on actual load. Run scheduled jobs and load balance the scheduled jobs themselves. Allow them to be isolated and unique within the cluster.

I may not like Java as a language, but from an infrastructure standpoint Java was basically Heroku long before Heroku was Heroku. The infrastructure is just...solid. The downside was that the XML config stuff was just messy.

Can you do similar stuff with Clojure or Scala? Maybe there's a way to avoid the bad parts.

Actually for many of us it isn't bad.

I have come to the point where I only look at other languages once in a while and it serves me well.

A few years ago when I was still in farming we had the ostrich craze: ostriches were crazy profitable (or so the ostrich sellers said) and every farm needed to consider it.

Eggs were $300 apiece, etc. etc.

Of course, the first to get one made great money by selling eggs, chicks, and consulting hours to all the rest.

The rest were not so lucky, and today I don't know of a single ostrich farm.

Same goes for latest tech: if you want to you can try to be first and make a good living on the hype stream.

These days there are workarounds to avoid a lot of XML configuration in the Java ecosystem (although you end up with a decent amount of annotations). Spring Boot is a great example of this.

> These days there are workarounds to avoiding a lot of xml configuration in the java ecosystem

As is, believe it or not, Java EE.

That's the nice thing about JVM languages. They get all that infrastructure without having to use Java straight up.

Yeah. And Kotlin.

Oh, I feel your pain! There's too much fashion going on in this industry, and the high salaries and the youth rediscovering the same concepts from 40 years ago doesn't help..

I mean, it's great to have this new tech and all, but when you're trying to build something to last some years, sometimes it's hard to filter the crap from all the buzzwords. It just reinforces the thought that smart people should just leave this field entirely, or search for other fields of knowledge (or business) where our knowledge of programming can be put to use.

I'm 35 now, and I'm starting to realize that I will not have the patience to keep up with all the crap just to be employable. There are some areas where being old and experienced is valuable. Philosophy, science, psychology, teaching, etc. are maybe some of them, but this industry is definitely not one of those areas. It makes me think that what I'm building now will some day be completely wiped out of existence.


"All my work will be obsolete by 2005" -Steve Jobs

If you aren't willing to accept that obsolescence is part of life, then you are either building something you aren't passionate about or confused about the cruelty of time.

As inspirational as Steve Jobs was, he wasn't correct about everything, nor was he a brilliant engineer.

Nor did anyone (in this thread) say he was.

The quote has bounded context. And in that context, seems generally valid and applicable.

I'd say it goes beyond building physical or virtual things. I'm talking about something that resembles compound interest, where you build knowledge or experience on top of previous knowledge and experience, such that in some years you are way ahead of where you are now.

I'm basically saying that the high churn we have now does not give you enough time to build significant experience that you can use later on in life, and, as such, it's the opposite of a good investment in the future. It is almost as if we are living only for the present, forgetting that in the future we will have less patience and energy to have to "re-learn" almost the same things.

I think a lot of this churn can be sidestepped if you avoid startups and instead work for an "enterprise" organisation. Everything is much less cutting edge. In fact, technology progression is pretty much glacial.

If you look at what technology was popular 10-15 years ago then that's what will be in use in Enterprises now. Java web services is currently the big thing at my company.

All the late-90s business apps which were in Visual Basic, Oracle Forms, and Access are being rewritten as Java web services at the moment by an army of contractors. In another 10-15 years they will be rewritten again in the language du jour of the day, probably Go. It's an endless cycle.

My manager and CEO are going wild with "blockchains", and how we should use it for everything.

"We could store gigabytes of data on the clients without having to pay for servers"

Well, maybe... if you had any clients left, or the infrastructure to support such a thing.

This is hysterically funny! Can you tell us who this CEO is?

I too consult for an enterprise product and hear the same things. People don't know what the buzzwords mean, but they know they must have it. That, and the assumption that ML-based products are all just magic and require no effort. "You mean it doesn't just figure that out for me?" Thanks, marketing department.

> ...the assumption that ML based products are all just magic and require no effort.

Yep, same experience here with both "Big Data" and the ML space. The decision makers need to see the sheer amount of Java, Scala and/or Python code you need to actually implement to do anything useful.

Nope...not magic.

It seems like it's been going on in our industry for a long time. I'm reminded of it being referenced in comics from the past two decades.


Micro-services are the current-year deity of the cargo-cult that is Silicon Valley.

Unlike the natives, however, who simply wasted some time building extraneous fake runways, in the Valley people are royally screwing up their own core architecture.

I'm old enough to find this more humorous than frustrating.

It should give anyone with a better grip on core technologies a competitive edge.

The Valley is ripe for disruption. ;)

I think micro services only came into existence because SOA was such a disaster, everyone confused SOA architectures with web services. These ended up being n-tier apps with a web service RPC (or several) in the middle just to add some unnecessary serialization and network transfer bottlenecks.

So far I've seen micro services repeat this trend almost exactly.

Microservices take all the SOA problems and turn them up to eleven, as far as I can tell. It intertwines decomposition of your system (sometimes good) with network communication (rarely good) and additional ops management (never good).

Yanking the major chunks of independent functionality out into separate deployable services makes sense at a large enough scale and for large enough, independent enough components. But you would only do so out of necessity, not as an initial architecture.

And yet here we are.

I remember Larry Ellison once saying that the only industry more driven by fashion than fashion itself was IT. That was when the "cloud" thing was starting to take off. He refused to have Oracle use those new buzzwords, but in the end he was forced to.

Perhaps because in both fashion and IT, the surface appearance conceals a tremendous amount of underlying complexity.

Fashion signals, well, virtually everything about social interactions. A tremendously complex world. Including, for that matter, whether or not you care about fashion trends, and quite possibly, why you might or might not (you're not in the game, you've quit the game, you're so fabulously successful you don't need to play the game, you couldn't play the game if you wanted to, ...)

In IT, TLAs, ETLAs, buzzwords, slogans, brands, companies, tool names, etc., all speak to what you know, or very often, don't know. It's not possible to transmit deep understanding instantaneously, so we're left with other means of trying to impart significance.

Crucially, the fact that clothing and IT fashion are so superficial (of necessity) means they can be gamed, and that those who are good at following just the surface messages can dive in. Some quite effectively. But they're not communicating the originally intended meaning.

Totally agreed. As a freelancer, it's really scary to invest time into a full stack of technologies. I should adopt a discipline of picking tools and not looking back before n years have gone by. Maybe n = 2 or 3? (Right now, I'm on Objective-C, not even Swift, for native iOS; Ember for the client; Rails for the API/back office; and Heroku for deployment.)

Honest question: How's that tech stack working out for you? What if you want to dev an Android mobile app?

Have you looked at React Native at all?


React native was not mature enough the last time I had to solve this problem. After writing and maintaining native clients in both Android and iOS for years, I decided to try something different. SPA app + Cordova + writing custom, native plugins for performance has worked out pretty well. Some things in the UI are not as fast as I would like, but develop/test/release cycle is so much faster (web, ios, android released nearly at the same time). It does help that I can write native Android or native iOS (obj-c/swift) to handle the plugins where needed. Cordova can also be a bit of a mess to deal with sometimes, but it is improving.

I also gave up on frameworks like Cordova because (1) who knows if they'll still be maintained in a couple of years, and (2) how quickly can they offer access to new features from the native iOS and Android SDKs? I feel like pretty much anything Cordova is really good at, you can do with a webapp.

Another question... how do you like Ember for your web apps? :)

I still have a love/hate relationship with Ember (and other JS frameworks). They are simultaneously very powerful and quite restrictive. My most recent example: there's still no common, fast, and easy way to integrate Google Analytics in a project. It takes some (reasonable) effort, like an hour of research and implementation, when it takes 10 minutes with good old server-generated HTML/JS.

I sort of gave up on Android development by now. My mindset is iOS first if I need an app. And then consider a good webapp (with Ember then) if I want to extend to all smartphone users. If I needed a native Android development, I would try to find a partner able to code it, I wouldn't do it myself.

Swift should be mature enough by now for development. Or start with version 3 when it comes out.

I have seen excel, SQL, even MS office used as a buzzword. Of course there's also agile, scrum, etc.

Big data and machine learning are also hot words. But they are clearly modern engineering. Consultants exist to explain the best way to achieve modern best practices to people without the appropriate background. If someone asks "Why no Hadoopz plx?", either explain the other technology used instead (maybe Spark, Storm?) or explain that the scale is small enough for Access to handle. That's a consultant's job.

> I feel industry is simply following marketing trends.

'twas ever thus.


This is not new. This industry has been about trends since the first dot-com boom - remember Java Beans, J2EE, 3-tier architectures, blade servers, virtualization, Rails, full stack, NoSQL and big data, ad infinitum? It's an industry like any other, with its own signal-to-noise ratio. You can get frustrated about it, or accept it and accept that it's also what makes this an interesting industry to work in.

Some things never change. It's been this way for as long as I can remember. The sad irony is, satisfaction remains the same (i.e., low) as well. We keep chasing our tails and products remain sub-par and users still frustrated. It's a shame SOS isn't a loved buzzword.

Our industry is a pop culture. A fad.

Computer science is not a real field.

Computer science is a real field. It's just not "developing site/app for BigCo". But - then again - it never has been. :-)

This comment makes me believe that you've never been exposed to any computer science.

Of course not. I am in undergrad.

But I think Alan Kay has been "exposed" to computer science, and I follow his logic, based on my limited scope of knowledge.


Yea, but does it scale?

What other parts of getting old suck? :)

Having to work with people who see caution and wisdom as "getting old".

Virtues are generally disregarded by those who seek instant gratification.

Most of the times that I bring up the concept of virtue to peers my own age, they seem either confused by the concept or contemptuous of it. They behave as if virtue were a purely religious thing, yet caution in the face of possible danger is a very basic survival skill.

That's probably because you're talking about 'virtue' as an abstract quality, rather than good decision making as a practical framework.

How does the concept of morality have anything to do with enthusiasm for new technology?

Honest advice: Stop working for/with stupid companies/people and start working for smart ones.

Not really practical advice without some hint as to how one is supposed to spot those smart companies.

The thing is that for a lot of people, work is a balancing act - between how much you like doing something, and how much you like money. If a "bad" company pays you a tonne of money, you might still work for them because you like having shiny things. However, for each person the line is someplace else - I know some people who will take shit pay just to do what they enjoy, and I know people who don't have a problem working in the most frustrating environments 80 hours a week because they like being paid big $$$. It's all relative to what's important for you.

The smart companies already have more qualified applicants than slots, and often arbitrary hiring processes to boot. Not everyone has the luxury of working for one.

"-No, look into microservices. It’s the future. It’s how we do everything now. You take your monolithic app and you split it into like 12 services. One for each job you do.

That seems excessive"

A hundred times yes. We tried to split our monolithic Rails app into microservices built in Go. Two years and many fires later, we decided to abandon the project. It was mostly because monitoring and alerting were now split into many different pieces. Also, the team spent too much time debating standards, etc. I think microservices can be valuable, but we definitely didn't do it right, and I think a lot of companies get it wrong. Any positive experiences with microservices here?

I think that splitting into microservices is valuable if and only if you reach a scale where it makes sense to split into microservices. By scale, I mean the number of people on the team (if you have a lot of people, it can make sense to split into microservices to limit communication bottlenecks between developers) or in terms of traffic, in which case microservices can be very useful for optimizing the system piece by piece.

A small team starting a new project should not waste a single second considering microservices unless there's something that is so completely obviously decoupled in a way that not splitting it into a microservice will lead to extra work. It's also way easier to split into microservices after the fact than when you're developing a new app and you don't have a clue what it will look like or what the overall structure of the app will be in a year (the most common case for startups).

In practice, microservices mean that you turn a function or method call into a network request. This doesn't really limit communication bottlenecks. It is often more difficult to agree on a network interface than on a simple function or object interface. It's also more difficult to change. You introduce a whole new set of failure modes due to going over the network. Debugging is more difficult since you now can no longer step through your program in a debugger but rather have an opaque network request that you can't step into. You can no longer use editor/IDE features like go to definition. It becomes harder to do integration tests. Version control becomes harder if the different services are in different repositories. A network request is much slower than a function call. You no longer have the advantage of a garbage collector for logical values that now cross network boundaries, and instead need to free them manually. Deployment is more difficult. The list is much longer than this, but I'd be interested in the counter-list: what are the advantages of microservices?
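To make the first point concrete, here is a minimal Python sketch (the pricing service, its endpoint and the error messages are invented for illustration) of what happens when a plain function call becomes a network request: the caller suddenly owns timeouts, connection errors and remote failures that simply don't exist inside a monolith.

```python
import urllib.request
import urllib.error

PRICES = {1: 9.99}  # toy in-process data for the monolith case

def get_price(item_id: int) -> float:
    # Inside the monolith: a plain lookup. The only failure mode is a bug.
    return PRICES[item_id]

def get_price_remote(item_id: int, base_url: str, timeout: float = 2.0) -> float:
    # The same lookup as a microservice call (hypothetical endpoint).
    url = f"{base_url}/prices/{item_id}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return float(resp.read())
    except urllib.error.HTTPError:
        raise RuntimeError("service is up but returned an error (4xx/5xx)")
    except urllib.error.URLError:
        raise RuntimeError("DNS failure, connection refused, network down...")
    except TimeoutError:
        raise RuntimeError("slow network or overloaded service")
```

And this is before retries, backoff and circuit breakers, which are yet more code the monolith never needed.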

> You introduce a whole new set of failure modes due to going over the network.

A thousand times yes. Distributed systems are hard.

> Debugging is more difficult since you now can no longer step through your program in a debugger but rather have an opaque network request that you can't step into.

Yes. Folks underestimate how difficult this can be.

In theory it should be possible to have tooling to fix this, but I've not seen it in practice.

> You can no longer use editor/IDE features like go to definition.

Not a problem with a good editor.

> Version control becomes harder if the different services are in different repositories.

No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

I would modify this slightly. Larger organizations with independent teams may want to run on per-team repos. Conway's law is an observation about code structure but it sometimes also makes good practice for code organization. And of course, sometimes the smell is "this company is organized pathologically".

Another problem is that large monolithic repositories can be difficult to manage with currently available software. Git is no panacea and Perforce isn't either.

> No organisation should have more than one regular-use repo

Flat out wrong for any organization with multiple products. Which, let's be honest, is most of them.

I guess Facebook, Twitter, and Google are doing things "flat out wrong", then. Yes, that's a weak argument (argument from authority) but it is true that monolithic repositories have major advantages even for organizations with multiple products. Common libraries and infrastructure are much easier to work with in monolithic repositories.

My personal take on it, at this point, is that much of our knowledge of how to manage projects (things like individual project repos, semantic versioning, et cetera) is centered on the open-source world of a million mostly-independent programmers. Things change when you work in larger organizations with multiple projects. You even start to revisit basic ideas like semantic versioning in favor of other techniques like using CI across your entire codebase.

Those are huge organizations with commensurately large developer resources, and they simply work at a different scale than most people on HN. "It works for Google" is not an argument for anything.

Monorepos come with their own challenges. For example, if any of your code is open source (which means it must be hosted separately, e.g. on Github), you have to sync the open-source version with your private monorepo version.

Monorepos are large. Having to pull and rebase against unrelated changes on every sync puts an onerous burden on devs. When you're remote and on the road, bandwidth can block your ability to even pull.

And if you're going to do it like Google, you'll vendor everything -- absolutely everything (Go packages, Java libraries, NPM modules, C++ libraries) -- which requires a whole tool chain to be built to handle syncing with upstream, as well as a rigid workflow to prevent your private, vendored fork from drifting away from upstream.

There are benefits to both approaches. There is no "one right way".

It seems we agree, we are both claiming that "there is no one right way".

I love Git, and I used submodules for years in personal projects. It started with a few support libraries shared between projects, or common scripts for deployment, but it quickly ballooned into a mess. I'm in the process of moving related personal projects to a monolithic repository, and in the process I'm giving up the ability to tag versions of individual projects or provide simple GitHub links to share my code.

Based on these experiences, I honestly think that the only major problem with monolithic repositories is that the software isn't good at handling it, and this problem could be solved with better software. If the problem is solved at some point in the future, I don't think the answer will look much like any of the existing VCSs.

Based on experiences in industry, my observation is that the choice of monolithic repository versus separate repository is highly specific to the organization.

> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

Mind elaborating on this?

> > You can no longer use editor/IDE features like go to definition.
> Not a problem with a good editor.

What editor are you thinking of that can jump from HTTP client API calls to the corresponding handler on the server?

> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

Totally agree with everything else, but gotta completely disagree on this last point. Monorepos are a huge smell. If there's multiple parts of a repo that are deployed independently, they should be isolated from each other.

Why? Because you're fighting human nature, otherwise. It's totally reasonable to think that once you excise some code from a repo that it's no longer there, but when you have multiple projects all in one repo, different services will be on different versions of that repo, and your change may have changed semantics enough that interaction bugs across systems may occur.

You may think that you caught all of the services using the code you refactored in that shared library, but perhaps an intermediate dependency switched from using that shared library to not using it, and the service using that intermediate library hasn't been upgraded, yet?

When separately-deployable components are in separate repositories, and libraries are actual versioned libraries in separate repositories these relationships are explicit instead of implicit. Explicit can be `grep`ed, implicit cannot, so with the multi-repo approach you can write tools to verify that all services currently in production are no longer using an older, insecure shared library, or find out exactly which services are talking to which services by the IDLs they list as dependencies.
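To illustrate the "explicit can be `grep`ed" point, here is a toy Python scan over per-service repos. It assumes a hypothetical layout: one directory per service repo, each pinning its dependencies in a `requirements.txt`; the library name and version scheme are made up for the example.

```python
import re
from pathlib import Path

def services_using(library: str, max_safe: tuple, repos_root: str) -> list:
    """Return services still pinning `library` below `max_safe` (major, minor)."""
    vulnerable = []
    # Match explicit pins like "sharedlib==1.2" at the start of a line.
    pattern = re.compile(rf"^{re.escape(library)}==(\d+)\.(\d+)", re.M)
    for req in Path(repos_root).glob("*/requirements.txt"):
        m = pattern.search(req.read_text())
        if m and (int(m.group(1)), int(m.group(2))) < max_safe:
            vulnerable.append(req.parent.name)
    return sorted(vulnerable)
```

The same question ("who still links the insecure version?") has no greppable answer when the dependency is an implicit path inside one big tree.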

While with the monorepo approach you can get "fun" things like service A inspecting the source code of service B to determine if cache should be rebuilt (because who would forget to deploy service A and service B at the same time, anyways...), as an example I have personally experienced.

My personal belief is that the monorepo approach was a solution back when DVCSs were all terrible and most people were still on centralized VCSs like Subversion that couldn't deal with branches and cross-repo dependencies well, and that's just what you had to do, while Git and Mercurial, along with the nice language-level package managers, make this a non-issue.

Finally, there's an institutional bias to not rock the boat (which I totally agree with) and change things that are already working fine, along with a "nobody got fired buying IBM" kind of thing with Google and Facebook being two prominent companies using monorepos (which they can get away with by having over a thousand engineers each to manage the infrastructure and build/rebuild their own VCSs to deal with the problems inherent to monorepos that most companies don't have the resources and/or skills to replicate).

EDIT: Oh, I forgot, I'm not advocating a service-oriented architecture as the only way to do things, I'm just advocating that whatever your architecture, you should isolate the deployables from each other and make all dependencies between them explicit, so you can more easily write tooling to automatically catch bad deploy states, and more easily train new hires on what talks to/uses what, since it's explicitly (and required to be) documented.

If that still means a monorepo for your company's single service and a couple of tiny repos for small libraries you open source, that's fine. If it means 1000 repos for each microservice you deploy multiple times a day, that's also fine (good luck!).

Most likely it means something like 3-10 repos for most companies, which seems like the right range for Miller's Law (https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...) and therefore good for organizing code for human consumption.

> It's totally reasonable to think that once you excise some code from a repo that it's no longer there, but when you have multiple projects all in one repo, different services will be on different versions of that repo, and your change may have changed semantics enough that interaction bugs across systems may occur.

But having multiple repos doesn't prevent the equivalent situation from happening (and, I think, actually makes it much likelier): no matter what, you have to have the right processes in place to catch that sort of issue.

> You may think that you caught all of the services using the code you refactored in that shared library, but perhaps an intermediate dependency switched from using that shared library to not using it, and the service using that intermediate library hasn't been upgraded, yet?

That's the sort of problem which happens with multiple repos, but not (as often) with a single repo.

> Explicit can be `grep`ed, implicit cannot, so with the multi-repo approach you can write tools to verify that all services currently in production are no longer using an older, insecure shared library, or find out exactly which services are talking to which services by the IDLs they list as dependencies.

A monorepo is explicit, too, even more explicit than multiple repos: WYSIWYG. And you can always see if your services are using the same API by compiling them (with a statically-typed language, anyway).

The beautiful thing about a monorepo is it forces one to confront incompatibilities when they happen, not at some unknown point down the road, when no one knows what changed and why.

I think that your comment is actually a pretty good test for when not to spin out a micro service.

If you expect to need to step into a function call when debugging, then it's too tightly coupled to spin out. You should be able to look at the arguments to the call and the response and determine if it's correct (and if not, now you have isolated a test case to take to the other service and continue debugging there).

If the interface will change so often that you expect it will be a problem that it's in a separate repository, if you expect that you will always need to deploy in tandem, then it's too tightly coupled to spin out.

The advantage of microservices is the separation in fact of things that are separate in logic. The complexity of systems grows super-linearly, so it's easier to reason about and test several smaller systems with clear (narrow) interfaces between them than one big one. It's easier to isolate faults. It's harder to accidentally introduce bugs in a different part of the system when the system doesn't have a different part. If done right, scaling can be made easier. But these are hard architectural questions; there's no clear-cut rule for when you should spin off a new service and when you should keep things together.

Someone else mentioned separating the shopping app from the payment system for an ecommerce business, which even has security benefits. I think that's an excellent example.

Edit: Another clear benefit is that you can choose different languages, libraries, frameworks and paradigms for different parts of the code. You can write your boring CRUD backend admin app in Ruby on Rails, your high-performance calculation engine in Rust and your user-facing app in Node.js (so the front- and backend can share JavaScript validation code).

I just want to add one disadvantage before I give some advantages. There's a lot of operational complexity involved in routing, monitoring, and keeping every instance of every microservice running. That complexity also makes debugging in production much more difficult, as one must track a relay of network requests through many separate layers to find the point where it actually got stuck.

As for advantages, microservices tend to keep code relatively simple and free from complex inheritance schemes. There's rarely a massive tangled-up engine full of special cases in the mix, as there often is in monolithic apps. This substantially decreases technical debt and learning curve, and can make it simple to understand the function an isolated microservice performs.

There is the obvious advantage that if you have disparate applications executing nearly-identical logic to read or write data to the same location, and the application platforms can't execute the same library code, you can centralize that logic into an HTTP API, which reduces maintenance burden and prevents potentially major bugs.
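As a sketch of that idea in Python's standard library (the order fields and endpoint behavior are invented for the example): the validation and normalization logic lives in exactly one place, behind an HTTP endpoint, instead of being re-implemented per platform.

```python
import json
from http.server import BaseHTTPRequestHandler

def normalize_order(raw: dict) -> dict:
    # The shared logic worth centralizing: validate and normalize once,
    # instead of once per client platform.
    if int(raw.get("quantity", 0)) <= 0:
        raise ValueError("quantity must be positive")
    return {"sku": str(raw["sku"]).upper(), "quantity": int(raw["quantity"])}

class OrderHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            order = normalize_order(json.loads(self.rfile.read(length)))
        except (ValueError, KeyError):
            self.send_response(400)
            self.end_headers()
            return
        # ...persist `order` to the shared datastore here...
        self.send_response(201)
        self.end_headers()
        self.wfile.write(json.dumps(order).encode())
```

Every client, regardless of platform, now gets the same validation for free, and a bug fix ships in one deploy instead of N.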

My opinion is that adopting microservices as a paradigm leads to a slow, difficult-to-debug application, primarily because people take the "micro" in microservices too seriously. One shouldn't be afraid to split functionality out into an ordinary service after it's been shown to be reasonable to do so.

Yes, but there's another dimension here. If another team breaks your build in a monolithic repo, you may or may not be able to resolve this quickly. You're in a contract with them about the state of the repo and thus your service.

With microservices, the production version of their service would conceivably be stable. It moves the contract from the repo to the state of production services.

> If another team breaks your build in a monolithic repo, you may or may not be able to resolve this quickly.

With a monolithic repo done right, the other teams broke their build of their branch, and it's up to them to resolve it. You, meanwhile, are perfectly happy working on your branch. When their changes are mergeable into trunk, then they may merge them, not before — and likewise for you.

With multiple repos, they break your build, but don't know it. You don't know it either, until you update your copies of their repos — and now you have to figure out what they did, and why, and how to update your logic to handle their new control flow, and then you update again and get to do it again, until finally you ragequit and go live in a log cabin with neither electricity nor running water.

> With multiple repos, they break your build, but don't know it. You don't know it either, until you update your copies of their repos — and now you have to figure out what they did, and why, and how to update your logic to handle their new control flow, and then you update again and get to do it again, until finally you ragequit and go live in a log cabin with neither electricity nor running water.

I don't see how this is a problem if you are pushing frequently and have a CI system. You know within minutes if the build is broken. If it broke, don't pull the project with the breaking changes.

My point is, I don't think one approach is inherently better than the other. Both require effort on the part of the teams to manage changes (or a CM team), and both require defined processes.

> If it broke, don't pull the project with the breaking changes.

I agree with the overall sentiment of your comment, but the quoted part is where I've seen trouble brew. The tendency is to be conservative about pulling updates to dependencies, which can easily get you into a very awkward state when a critical update eventually sits on top of a bunch of updates you didn't take because they broke you. It is usually better to be forced to handle the breakage immediately, one way or another.

> With a monolithic repo done right.

Yes, that's the contract that you need to have with other teams. And it's the contract that is automatically enforced with microservices.

That's true, but...

You don't debug distributed systems by tracing into remote calls and jumping into remote code. You debug them by comparing requests and responses (you use discrete operations, right?) with the specified requests and responses, and then opening the code that has the problem¹.

It calls for completely different tooling, not for a "better debugger".

1 - Or the specs, because yes, now that your system is distributed you also have to debug the specs. Why would somebody decide to do that for no reason at all? Yet lots of people do.
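One way to picture that tooling, as a hedged Python sketch (the spec format and the user endpoint's fields are invented here): diff the observed response against the specified shape, rather than trying to step into remote code.

```python
# Spec for a hypothetical /user endpoint: field name -> expected type.
USER_SPEC = {"id": int, "name": str, "active": bool}

def violations(response: dict, spec: dict) -> list:
    """Compare an observed response against its spec; return human-readable diffs."""
    problems = []
    for field, expected in spec.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected):
            problems.append(
                f"{field}: expected {expected.__name__}, "
                f"got {type(response[field]).__name__}"
            )
    for field in response.keys() - spec.keys():
        problems.append(f"unexpected field: {field}")
    return problems
```

An empty result means the fault is on the caller's side of the wire; a non-empty one tells you which service (or which spec) to open next.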

I think it's more about need than jumping straight in just because you have enough people on a team. For instance, if some of your processing or request handling could perform better on a different framework or programming language than the one it's currently built on, then you should definitely consider a microservice approach, decoupling that specific service/functionality from your current stack.

Careful with this one too. Usually adding new features to adapt to a changing marketplace can have new requirements across many of your services that need to be finished quickly. If those services are each in a different language, it can slow everything down by weeks or months.

Multiple platforms are not a problem, and generally a good thing, as long as it's not excessive. You don't want to end up with as many different platforms as developers, or anything like that. I'm guessing there is a rule of thumb here, but I'm not sure what it would be. Max one platform per five developers? Something like that.

>decoupling that specific service/functionality from your current stack.

I do wish people would stop conflating "running in a different service" and "loose coupling". They are completely orthogonal.

I've worked on some horrendously tightly coupled microservices.

OSGi makes it easy to end up with a cornucopia of tightly coupled nanoservices all running in the same JVM.

Unless you can coax dOSGi into working (which is tons of fun), then you can have services tightly coupled to other services running on entirely different machines causing frequent (and hilarious) cascades of bundle failures whenever the network hiccups.

OSGi is a trigger word for me now. I've worked on two large OSGi projects (previous job and current job) and it's always the same. Sh*t is always broken (and my lead still insists that OSGi is the one true way to modular bliss). And the OSGi fanboys always say "Your team is using it wrong!" Which very well might be true, but I no longer care. Apparently it's just too damn hard to get a team of code monkeys to respect service boundaries when OSGi makes it so damn easy to ignore them.

If I'm ever in a position of getting to design a new software architecture (hasn't happened in 10 years, but hey I can dream), I'll punch anyone who suggests "OSGi" to me right in the face.

Wholeheartedly agree on OSGi. Disastrous implementations. I worked with servicemix and still have nightmares around class path issues and crazy bundle scoping rules. A plain old maven built jar with shading works much better in practice, but shading itself is shady :)

Well, I consider that by definition tightly coupled microservices should never be done. If it's not possible to decouple that function then it should not be in a micro service.

> A small team starting a new project should not waste a single second considering microservices unless there's something that is so completely obviously decoupled in a way that not splitting it into a microservice will lead to extra work.

That's a good point. I think this thought extrapolates to other parts of software engineering as well. Sometimes writing very modular and decoupled software from the beginning is very hard for a small team, and we can't see well if this is the best approach since it's also hard to grasp the big picture.

I'm currently facing this issue. I'm trying to write very modular and reusable applications, but now I'm paralyzed trying to picture the best patterns to use, where should I use a facade, a decorator, etc. I think I'll adopt this strategy for myself--only focus on modularizing from the beginning if it'd lead to extra work otherwise.

I'd also add that microservices have increased value if you begin with such an architecture in the first place. It's much more difficult to "gracefully" rip an existing monolith into modular pieces than to build modularly from the start.

I don't like correcting with "well, actually", but I have to say that the author of the book "Building Microservices", in his first few chapters (in particular Chapter 3: Premature Decomposition), warns against using microservices for new apps, especially if you are new to the domain. He claims they are actually easier to adopt when refactoring a large monolith, and that normally you shouldn't start with microservices unless you know what you are doing. Hence my criticism of the article, which starts with a premature optimization (splitting one service into 12), a common yet arguable practice.

This has not been my experience. I've seen a few projects where microservices had been added from the start because it's the thing to do and, in all cases, it didn't work well. It's extremely difficult to split in microservices if you do not have a clear big picture of your projects functions and coupling. And, in most cases, in new projects, you don't have that big picture.

Microservices also make it much harder to refactor the code which you often need to do in the early stage of a project.

Yeah, we do microservices, the "real" kind. Not the "SOA with a new name" kind, but the "some services are literally a few dozen lines of code and we have 100x the number of services as we do devs" kind.

The thing is, you need a massive investment in infrastructure to make it happen. But once you do, it's great. You can create and deploy a new service in a few seconds. You can rewrite any individual service to be latest and greatest in an afternoon. Different teams don't have to agree on coding standards (so you don't argue about it).

But the infrastructure cost is really high: a big chunk of what you save in development you pay in devops, and it's harder to be "eventually consistent" (e.g. an upgrade of your stack across the board can take 10x longer, because there's no big push that HAS to happen for a tiny piece to get the benefits).

Monolithic apps have their advantages too, and many forget it: less devops cost, easier refactoring (especially in statically typed languages: a right click -> rename will propagate through the entire app), and while it's harder to upgrade the stack, once it's done your entire stack is up to date, not just parts of it. Code reuse is significantly easier, too.

Yeah, add in things like MORE THAN ONE PRODUCTION ENVIRONMENT and LETTING YOUR CUSTOMER HOST AN INSTANCE OF YOUR MICROSERVICES and you have guaranteed your own suffering.

It is a matter of tooling. One data center or ten, it does not matter much with proper tooling. We deploy to seven data centers with the click of a button, with rollback, staggered deployment, etc. Centralized logging using ELK gives us great visibility into each DC, without worrying about individual microservice instances.

Easy until you realize you need to somehow manage + configure hundreds of services to run your dev environment...

Beyond just the Docker environment, you only need to be able to run the service you're working on locally. Anything you don't run locally should hit some shared dev/QA infrastructure (which shares a db with local). Whatever you use to develop should be able to detect what you have running locally and prefer those services when available.

Anything you're not running locally just hits the shared infra.
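That routing rule can be sketched in a few lines of Python. Everything here is an assumption for illustration: the port-per-service convention, the shared dev host name, and the probe that checks whether a local instance is listening.

```python
import socket

LOCAL_PORTS = {"users": 8001, "billing": 8002}   # assumed local convention
SHARED_DEV = "https://dev.internal.example.com"  # hypothetical shared cluster

def _port_open(port: int, timeout: float = 0.2) -> bool:
    # Is anything listening on this local port?
    try:
        with socket.create_connection(("localhost", port), timeout=timeout):
            return True
    except OSError:
        return False

def base_url(service: str, probe=None) -> str:
    # Prefer a locally running instance; otherwise hit the shared dev infra.
    probe = probe or _port_open
    port = LOCAL_PORTS.get(service)
    if port and probe(port):
        return f"http://localhost:{port}"
    return f"{SHARED_DEV}/{service}"
```

Injecting `probe` keeps the resolution logic testable without any services actually running.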

Dockerized apps make it simple to run in a dev environment, which is what we do. Of course, one cannot run everything on a laptop; we have a dev cluster.

"Different teams don't have to agree on coding standards (so you don't argue about it)."

Unsure if sarcastic.

Well, you use the standard you like, and the other team can use the standard they like. Then you write a micro service to convert from one standard to the other.

Not at all sarcastic. I've seen endless wheel warring over spaces/tabs, level of indents. Mostly I ignore it.

Seems like a terrible idea if "different teams" actually means "random assortment of developers for this specific project." If it actually means "different teams," e.g. you rarely if ever would move from one team to another, I don't see the issue if one team uses tabs and one uses spaces, or you have different naming conventions or whatever.

I wonder what language I would pick for that. I usually use Scala, but it seems a bit silly when the footprint of the platform would massively outweigh the actual service. I don't like Go. I like Python, but I prefer static typing. Rust seems a bit too low-level (although I'd like to try it for embedded). I don't see any point in learning Ruby when I already know Python well.

Maybe Swift? Scala Native in a year or two? I've done a little Erlang before, so maybe Elixir?

So that sounds pretty much like a function call in a monolithic app. How do you store state? I assume you need to between all those microservices.

>The thing is, you need a massive investment in infrastructure to make it happen.

I thought that one of the selling points of microservice architectures was the minimal infrastructure. I am really struggling to see an advantage in this way of doing things. You are just pushing the complexity to a devops layer rather than the application layer - even further from the data.

Building microservices needs discipline, an eye for finding reusable components and extracting them, and, as you said, investment in infrastructure.

Monoliths invariably tend to become spaghetti over time, making any non-trivial refactoring practically impossible. With microservices, interfaces between modules are stable and spaghetti is localized.

Can you expand on how you do logging/debugging/monitoring?

What are the big infrastructure costs?

Deployment has to be easy. Creating a new service from scratch, including monitoring, logging, instrumentation, authentication/security, etc., and deploying it to QA/production with tests has to take minutes from the moment you decide "Hey, I need a service to do this" until it's in prod.

Because individuals may be jumping through dozens of services a day, moving, refactoring, deploying, reverting (when something goes wrong), etc. It has to be friction-free, else you're just wasting your time.

e.g.: a CLI to create the initial boilerplate, a system that automatically builds a deployable on commit, and something to deploy said deployable nearly instantly (if tests pass). The services are small, so builds/tests should be very quick (if you push above 1-5 minutes for an average service, it's too slow to be productive).

Anyone should be able to run your service locally just by cloning the repo and running a command that's standard across all services. Otherwise, having to learn something new every time you need to change something will slow you down.

That infrastructure is expensive to build and have it all working together.

I do have a positive micro-service experience, and although we are still in the process of breaking down our monolithic SOA-based app, we have already seen the benefits.

The most dramatic effect was on a particular set of endpoints with relatively high traffic (peaking at 1000 req/s) that was killing the app, upsetting our relational database (with frequent deadlocks) and driving our Elasticsearch cluster crazy.

We did more than just split the endpoints into microservices. We also designed the new system to be more resilient. We changed our persistence strategy to make it better suited to our traffic, using a distributed key-value database, and designed documents accordingly.

The result was very dramatic, like entering into a loud club and suddenly everything goes silent. No more outages, very consistent response times, the instances scaled with traffic increase very smoothly and in overall a more robust system.

The moral of this experience (at least for me) is that breaking a monolithic app into pieces has to have a purpose, and it implies more than just moving the code into several services while keeping the same strategy (that's actually slower, more time-consuming and harder to monitor).

Do you think the result could also have been a dramatic improvement if you had kept the old system and done those other things, except splitting into microservices?

I can't get my head around how people introduce changes to their system if they have to update 12 different microservices at once. It must be horrible.

Often you hear stories about how people are converting a monolithic app to microservices - but this is easy. Rewriting code is easy, and it's fair to say it always yields better code (with or without splitting into microservices - it doesn't matter).

What I'd like to hear is something about companies doing active development in microservice world. How do they handle things like schema changes in postgres where 7 microservices are backed by the same db? What are the benefits compared to monolithic app in those cases?

It seems to me that microservices can easily violate DRY because they "materialise" communication interfaces and changes need to be propagated at every api "barrier", no?

Multiple microservices are supposed to have different data backends, so that they are completely independent. Splitting your data up this way isn't all roses, but ideally the services are isolated so an update to one doesn't affect the others.

>Do you think the result could also have been a dramatic improvement if you had kept the old system and done those other things, except splitting into microservices?

As I said in another thread, the separation in different components was key for resiliency. That allowed independence between the higher volume update and the business critical user facing component.

>I can't get my head around how people introduce changes to their system if they have to update 12 different microservices at once. It must be horrible.

The thing is, if you design the microservices properly it is very rare to introduce a change across so many deployments at once. Most of the time it's just 1 or 2 services at a time.

>What I'd like to hear is something about companies doing active development in microservice world. How do they handle things like schema changes in postgres where 7 microservices are backed by the same db? What are the benefits compared to monolithic app in those cases?

We don't introduce new features in our monolith service anymore. So, from that perspective we do all active development in microservices.

>"How do they handle things like schema changes in postgres where 7 microservices are backed by the same db?

The trick is, you want to avoid sharing relational data between microservices. I don't know if it is just us, but we have been able to split our data model so far, and in most cases we don't even need a relational database anymore, so a schemaless key/value store makes things easy too.

> What are the benefits compared to monolithic app in those cases?"

There are several advantages, but the critical one for me is being able to have a resilient platform that can still operate even if a subsystem is down. With our monolithic app it's all or nothing. Another advantage is splitting the risk of new releases.

>It seems to me that microservices can easily violate DRY because they "materialise" communication interfaces and changes need to be propagated at every api "barrier", no?

Not necessarily. YMMV but you can have separation of concerns and avoid sharing data models. When you do have shared dependencies (like logging strategy or data connections) you can always have modules/libraries.

Which one of the four major improvements do you attribute the success to though? Could you have done the work on making it more resilient, persistence, sensible, redesign the docs without breaking into micro-services and still have seen the positive results?

I don't think the level of success comes from one dimension, but I don't think either that we could have achieved the resiliency without breaking it into micro-services (or just services that happened to be small, if you will).

One key factor was decoupling the high volume updates from the users requests so one didn't affect the other one.

> Any positive experiences with micro-services here?

In my experience, any monolith that can be broken up into a queue-based system will benefit enormously. This cleans up the pipelines, and adds monitoring and scaling points (the queues). Queues remove run-time dependencies on the other services. It requires that these services are _actually_ independent, of course.

I do, however, avoid RPC-based micro-services like the plague. RPC adds run-time dependencies between services. If possible, I limit RPC to other (micro)services to launch/startup/initialization/bootstrap, not run-time. In many cases, though, the RPC can be avoided entirely.
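The queue-based decoupling described above can be sketched with an in-process queue standing in for a real broker such as RabbitMQ, SQS, or NATS (a toy example; the service names are made up):

```python
import queue
import threading

# Stand-in for a message broker; in production this would be
# RabbitMQ, SQS, NATS, or similar.
orders = queue.Queue()
processed = []

def order_service():
    # Producer: publish and return immediately -- no run-time
    # dependency on whichever service consumes the message.
    orders.put({"order_id": 1, "sku": "widget"})

def fulfillment_service():
    # Consumer: drains the queue at its own pace; if it were down,
    # messages would simply wait instead of failing an RPC call.
    while not orders.empty():
        processed.append(orders.get()["order_id"])

order_service()
worker = threading.Thread(target=fulfillment_service)
worker.start()
worker.join()
```

The point is that the producer never blocks on, or even knows about, the consumer; the queue itself is the monitoring and scaling point.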

> Any positive experiences with micro-services here?

Yep. We already had a feature flag system, a minimal monitoring system, and a robust alerting system in place. Microservices make our deployments much more granular. No longer do we have to roll back perfectly good changes because of bugs in unrelated parts of the codebase. Before, we had to have involved conversations about deployments, and there were many things we just didn't do because the change was too big.

We can now incrementally upgrade library versions, upgrade language versions, and even change languages now, which is a huge win from the cleaning up technical debt perspective.

How granular are your services? I've heard a lot of talk about microservices without much talk about how micro they are. As someone who's happy with the approach, would you mind giving a little context in terms of what sort of degree you've split the system down?

They are of varying sizes; we have maybe 3-4 per developer. The smallest are maybe 5-6 Python classes.

To be honest, we still have a monolithic application at the heart of our system that we've been slow to decompose, though we're working on it. We deploy it on a regular cadence and use feature flags heavily to make it play nice with everything else.

How many instances do you deploy of services that are essential but low usage?

That sounds more like your team doesn't know how to use git beyond nothing more than an SVN replacement.

"We rolled out this update with 220 changes. There's a breaking bug. Where is it? We need to find out in the next 5 minutes, revert, and deploy. Otherwise we have to revert the whole thing- we're losing money."

Git doesn't really help with that. More granular deployments do, and if microservices help with more granular deployments, go for it.

> We rolled out this update with 220 changes.

That's your problem right here

Git-bisect does, doesn't it?

Only if you have a test that catches the bug, and it still needs time to run. You'll also need time to write a fix, validate it and deploy it, plus any extra time your organization needs between code and deployment.

It helps with the find part. The revert and deploy, not so much - especially if it's the middle commit of 200 and you'd still like to deploy all the commits before and after it.

If you had a test for the issue, you probably wouldn't have deployed the software in the first place.

My experience is that most devs don't even know how to use SVN correctly. I just had a conversation with someone waiting for me to finish something before they could branch. The idea that I could merge my change into their branch afterwards didn't occur to them.

>Any positive experiences with micro-services here?

It makes sense for some things. We run a webshop, but have a separate service that handles everything regarding payments. It has worked out really well, because it allows us to fiddle around with pretty much everything else and not worry about breaking the payment part.

It helps that it's system where we can have just one test deployment and everyone just uses that during testing of other systems.

I've also work at a company where we had to run 12 different systems in their own VMs to have a full development environment. That sucked beyond belief.

The idea of micro-services is enticing, but if you need to spin up and configure more than a couple to do your work, it starts hurting productivity.

> have a separate service that handles everything regarding payments. It has worked out really well, because it allows us to fiddle around with pretty much everything else and not worry about breaking the payment part.

Is the payments service a single service that manages the whole transaction, or did you go for multiple services handling each part? If so, how did you manage failure with a distributed transaction?

It's a single service. It just sits between us and our PSPs. That way no other system needs to worry about integrating directly with the PSPs.

Not sure if it's the case here, but what works really well for us is queues with an at-least-once guarantee. (For payment services you might need an additional check to guarantee exactly-once execution.) I think you can find such a queue offered by most providers.
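The additional check usually amounts to deduplicating on an idempotency key, so a message redelivered under at-least-once semantics is dropped instead of charging twice. A toy sketch (a real system would persist the seen keys, e.g. behind a unique database index):

```python
# At-least-once delivery means the same payment message can arrive
# twice; dedupe on an idempotency key before charging.
seen = set()       # in production: a persistent store with a unique index
charges = []

def handle_payment(msg):
    key = msg["idempotency_key"]
    if key in seen:
        return "duplicate"            # already processed: safe to ack and drop
    seen.add(key)
    charges.append(msg["amount"])     # the actual charge would happen here
    return "charged"

# Simulate the broker redelivering the same message after a timeout:
msg = {"idempotency_key": "pay-42", "amount": 999}
first = handle_payment(msg)
second = handle_payment(msg)
```

The key must be chosen by the producer (one per logical payment), not generated per delivery, or the dedupe does nothing.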

Totally agree.

We had almost the same story with payments. Except that we jumped to a payment-processing SaaS, got dissatisfied (none of the SaaSes I saw work with PayPal EC without so-called "reference transactions" enabled), decided that wasn't a good idea, and had to jump back to an in-house implementation.

I didn't want to re-integrate the payments code back into the monolith - I thought it would take more time and make the code messier. So I wrote a service (it's small, but to heck with the "micro" prefix) that resembled that SaaS's API (the parts we used). It has surely evolved and isn't compatible anymore, but it doesn't matter as we're not going back anyway.

Works nicely, and now I feel more relaxed - touching the monolith won't break payments.

On the other hand, I see how too many services may easily lead to fatigue. Automated management tooling (stuff like docker-compose) may remedy this, but also may bring their own headaches.

I don't think having a handful of services, each handling a specific, atomic section of the app, really qualifies as "micro-services"; it's just smart separation of concerns.

We have specific services that process different types of documents, or communicate and package data from different third parties, or process certain types of business rules, that multiple apps hook into, but it's literally like 20 services total for our department, some that are used in some apps and not others.

When I hear 'micro-services' I'm picturing something more akin to like node modules, where everything is broken up to the point where they do only one tiny thing and that's it. Like your payment service would be broken into 20 or 30 services.

But maybe I'm mistaken in my terms. I haven't done too much with containers professionally, so I'm not too hip with "the future".

I'm building a podcast discovery app and I find myself being de-facto pulled towards modularity. It's because my feed checker is in Elixir, my site is WordPress-based, and I communicate between them using the WP API, and I'm using Google Cloud SQL, and Elasticsearch on its own virtual machine...

The thing is, though, the Elixir feed checker has its own database table that tracks whether it's seen an episode in a feed, and when there's a new episode it sends an API call to WP to insert the new post. The problem is that sometimes the API calls fail! Now what? I'll need to build logging, retries, etc. So I'm thinking of making the feed checker "stateless" and using only WP (with a lot of query caching) as the holder of "state" about whether an episode has been seen before.
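The retry part of that doesn't have to be much code. A sketch of a generic retry-with-backoff wrapper (the wrapped call would be whatever makes the WP API request; `attempts` and `base_delay` are made-up defaults):

```python
import time

def with_retries(call, attempts=3, base_delay=0.5):
    """Retry a flaky call with exponential backoff, re-raising the
    last error so the caller can log it and queue the item for replay."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise        # out of retries: surface the error for logging
            time.sleep(base_delay * 2 ** attempt)
```

The re-raise at the end is where the logging and "replay later" handling the comment mentions would hook in.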

To sum up my experience so far, there's something nice about being able to use the right tech for each task, and separating resources for each service, but the complexity--keeping track of whether a task completed properly--definitely increases.

I think your problem might be that you're not extending WordPress. PHP will gladly do the above through WordPress plugins and cron jobs. I built something similar and ended up going that way. The system is still running to this day. Sure, it's not hip, but it gets the job done with minimal fuss and makes money.

You are right that PHP can do the feed-checking part, but I wanted something with easy async/concurrency out of the box (and I wanted to learn Elixir instead of using Node).

One hard tech limit is that with 50k podcasts and 4 million+ episodes, search definitely doesn't work well - not just in WP, but in SQL itself. Hence Elasticsearch. I also plan to work on recommendations, etc., so I'll probably need to export SQL data into other systems anyway to build the "people who liked this also liked this" kind of things.

Also, I kinda lied about using the WP API - that's how I built the system initially (and will switch back to it moving forward), but to import the first few million posts from the content of the feeds, I just used wp_insert_post against the DB of new entries that Elixir fetched (I posted the code I used here: http://wordpress.stackexchange.com/a/233786/30906).

I also plan to write the whole front-end in React (including server-side rendering), so I will have to figure out how to get that done. I'd probably use the WP API with a Node.js app in front of it; I'll look into Hypernova from Airbnb. So, probably more usage of the WP API accessed by another service...

I hope you are not doing all of this alone. I'd try and keep things as simple as possible within a monolith and then improve as needs increase. Good luck :)

I generally write what I consider monolithic Django apps. I would add in Haystack (a search module for Django) and configure it to use Elasticsearch to overcome the problems you describe.

It doesn't sound like microservices are needed, just adding in the appropriate tech for the job.

cron jobs

Once these are doing anything other than rotating log files, can the system really be considered monolithic?

How do you define a monolith? Please establish that before we discuss further. :)

That's an incisive question. My impression, which may be mistaken, is that a cronjob would be used to move data (pages compiled from templates, chart images, etc.) into the PHP host on a "batch" basis. To me, that implied the existence of other systems that handle the data in their own way, but I guess in this thread the salient difference between micro and mono is that the former connects components via a web stack. Are there more agile interfaces available for cronjobs? If instead we're only considering transformations of data already resident on the host (as what, flat files?), I don't imagine that cronjobs are the best solution available.

Care sharing your progress with the podcast discovery app? I'm a cofounder of Podigee which is a podcast hosting service. Maybe we can exchange know-how, find some synergies or even join forces on certain topics. Feel free to drop me a line at mati@podigee.com

Sure, emailed. Also added a screenshot/twitter to my profile in case anyone else is interested.

The problem with microservices is that your state is spread over multiple systems. You completely lose the concept of transactional integrity, so you will have to work around that from the start.

The advantage though is that APIs (system boundaries) are usually better defined.

Perhaps one should use the best of both worlds: run microservices on a common database and somehow allow transactions to be passed between services (so multiple services can act within the same transaction).

One of the big advantages of microservices is scaling/migrating the databases behind each service independently. If you need transactions across multiple services then one could argue that either your API endpoint is doing too much, or your services are doing too little. It's not perfect, and certainly not always convenient, but it's a balance. Microservices with a common DB is asking for trouble. The monolith is a better option in that case IMO.

Idempotency, event sourcing and sagas with compensations are ways to solve your problem.

A shared database is an anti-pattern in distributed systems.

Similarly, distributed transactions (ala. DTC) is an anti-pattern.

Distributed systems aren't hard. They're just different.

Say you sell a widget. You want to update both your cash account and your inventory, and never one without the other. Which is easier to understand and more reliable: doing them atomically, or making sure you have designed in 2^n intermediate states and all the code required to complete work that should happen but hasn't yet?

The problem with microservices is that your state is spread over multiple systems.

Then again, sometimes it's advantageous to identify parts of your system where aspects of state can be safely decoupled. And in which having them reside in disparate systems (and yes, sometimes be inconsistent or differently available) might actually be a better overall fit.

You completely lose the concept of transactional integrity, so you will have to work around that from the start.

Then again, sometimes your state changes not only don't need to be transactional; it can be disadvantageous to think of them that way.

Depends, depends, depends.

> Then again, sometimes your state changes not only don't need to be transactional; it can be disadvantageous to think of them that way.

I'm curious; in what kinds of situation would this apply?

> Depends, depends, depends.

Flexibility is usually an important requirement. Often you cannot freeze your architecture and be done with it. I think a transactional approach could better fit with this.

I'm curious; in what kinds of situation would this apply?

Any situation where the business value of having your state be 100% consistent does not outweigh the performance or implementation cost of making it so.

> You take your monolithic app and you split it into like 12 services.

The non-web world has been doing this with message queueing for about 15 years. Maybe more.

Probably more. I'd say, like, at least 30-40 years.

I mean the infamous "UNIX way" of "do one thing and do it well" (something we nearly lost with the popularity of the "do everything in a manner incompatible with how others do it" approach in too many modern systems), where complex behavior was frequently achieved through the modularity of smaller programs communicating through well-defined interfaces.

Heck, microkernels are all about this, and their ideas didn't grow out of nowhere. And HURD (even though it was never finished) is already a quarter of a century old.

You do know the author is taking the piss out of the practice, right?

You know nothing in the comment you're replying to indicates I wouldn't, right?

Don't be so quick to assume microservices involve message queuing. For most it seems to just be an elaborate RPC mechanism (unfortunately).

Oh yeah, there's certainly other ways of doing it. My own experience is just that message queuing seems to be the default loosely-coupled RPC mechanism for larger orgs (from before the term 'micro services' was popular).

Yes. A lot of success. And with only one person on the backend full time.

That said, in places where it doesn't make sense we didn't try to force it. Our main game API is somewhat monolithic, but behind it we have almost 10 other services. Here's a quick breakdown:

  - Turn based API service (largest, "monolithic")
  - Real-time API service (about 50% the size of turn-based)
  - config service (serves configuration settings to clients for game balancing)
  - ad waterfall service (dynamic waterfall, no actual ads)
  - push notification service 
  - analytics collection service (mostly a fast collector that dumps into Big Query)
  - Open graph service (for rich sharing)
  - push maintenance service (executes token management based on GCM/APNS feedback)
  - help desk form service (simple front-end to help desk)
  - service update service (monitors CI for new binaries, updates services on the fly - made easy by Go binary deployment from CI to S3)
  - service ping service (monitors all service health, responds to ELB pings)
  - Facebook web front-end service (just serves WebGL version of our game binary for play on Facebook)
  - NATS.io for all IPC between services
...and a few more in the works. Some of these might push the line of "micro" in that they almost all do more than a single function's worth of work, but that level of granularity isn't practical.

But don't get too caught up on the "micro" part. Split services where domain lines naturally form, and don't constrain service size by arbitrary definitions. You know, right tool for the job and whatnot.

I only use a microservice if its something that can operate by itself. Things like a file data store, reports generation, etc. But all business logic goes in the monolith.

Oh yes. We're splitting up a large monolith into a bunch of different services. Completely amazing, though there's a ton of tools (like Netflix's Hystrix, etc.) that make it much, much easier to do.

I wouldn't, however, just "do microservices" from day one on a young app. A young app usually has no idea what the true business value is, i.e., you have no idea what downtime of certain parts of your services really means to the business. That's the #1 pain point we're solving: keeping mission-critical things up 100%, while rapidly iterating on new, less stable feature designs in separate services.

You should, however, keep an eye on how "splittable" everything is, i.e., does everything need to be in the same DB schema? Most languages have package concepts, which typically align (somehow) with "service" concepts. Do you know their dependencies? That sort of thing. Then, the later process of "refactor -> split out service" is pretty straightforward and easy to plan.
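One cheap way to keep an eye on how splittable a Python codebase is: list the imports a package pulls in from outside itself; each one is a dependency you'd have to cut, or turn into an API call, before splitting the package out as a service. A rough sketch (a real check would walk the whole source tree):

```python
import ast

def cross_package_imports(source, own_package):
    """Return module names imported from outside `own_package` in the
    given source -- candidate seams (or obstacles) for splitting it out."""
    external = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if not alias.name.startswith(own_package):
                    external.add(alias.name)
        elif isinstance(node, ast.ImportFrom) and node.module:
            if not node.module.startswith(own_package):
                external.add(node.module)
    return sorted(external)
```

Running it over a package's files gives a quick picture of which "refactor -> split out service" moves are straightforward and which drag half the codebase along.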

I saw it done properly once, with good separation of services etc., but it still required a truckload of commitment to make it work: one thing is interface changes, where the code just doesn't work when you merge; another is figuring out the publish and restart order of each service when you have to add an operation, so you don't knock out the whole system at every upgrade.

I don't really like that model applied to everything, but eh, now you are kind of forced into a hybrid approach - say, your macro vertical plus whatever payment gateway service, Intercom or equivalent customer-interaction services, metrics services, retargeting services; there are a lot of heterogeneous pieces going into your average startup.

but back on topic, what Docker really needs now is a whack on the head for whoever thought up swarms/overlays, and a proper, sane way to handle discovery and fail-over - instead we got a key-value service deployment to handle, which cannot be both in Docker and highly available unless you like infinite recursion.

I don't know if the community will consider this example to be microservices as currently defined, but many years ago I wrote client-server systems using a PC-based fat client application, and transactions running on a CICS server. I think this was pretty similar to what people currently think of as a microservices architecture, although we didn't have to worry about running/monitoring multiple servers (all the transactions/services ran on a single mainframe server), and the transaction monitor managed things like start-up and shutdown pretty simply. This approach worked really well for us, and we built several robust, scalable applications using this approach. To be clear, we numbered our users in hundreds, not thousands or millions. I can well understand how scaling this approach across many servers could be very challenging.

> Any positive experiences with micro-services here?

I'm currently working on a large refactoring effort along these lines. The end goal is to create a modular, potentially distributed system that can be deployed in a variety of configurations, updated piecemeal for different customers, and integrated by our customers with the third-party or in-house code of their choice using defined APIs. We aren't typical of the other examples, though, in that we do literally ship our software to our customers and they run it on their own clusters.

positive experience with microservices: identify discrete functions that can be considered "stateless" (i.e. no side effects, deterministic output for a given input) and factor those out into stand-alone microservices.

a good example of this that I've used in production at my current $dayjob: dynamic PDF generation. user makes request from our website, request data is used to fill out a pdf template context which is then sent over to our PDFgen microservice which does its thing and streams a response back to the user.

Does it connect to the database to fill in some values in the template? Does it keep a connection pool of, say, 5 connections always open (as libraries like to do)? Does it have authentication? Is it a public or private API? Who is managing security? Is it running behind its own nginx or other proxy? Does it have DoS protection (PDF generation can be CPU-intensive)? What about the schema for requests? How do you manage changes to the schema? They need to be deployed together with changes in other services, right? What about changes to the database schema - you need to remember to update that service as well, and redeploy it at the right time too - just after successful DB migrations - which live in another project.

All of that and much more needs to be replicated for each microservice, right?

Why not just have a module in your monolithic app that does it? The logic will still be separate. In most languages/frameworks you can spawn a PDF-generation task. Any changes are easier to introduce as well. There's no artificially materialised interface. Updates are naturally introduced. All the auth logic is already there, you don't need to worry about deploying yet another service, and the same goes for logging etc.

> Does it connect to the database to fill in some values in the template?

the template has values that are related to database models. the main app (still mostly monolithic) fills out the template context. the context itself is what's passed to the microservice. the microservice does not connect to a database at all.

> Does it keep connection pool of let's say 5 connection always open (as libraries like to do)?

no. the service probably handles a few hundred requests per day; it is not in constant use. communication is over HTTPS. it opens a new connection on each request. this does impact throughput, but it's a low-throughput use case, and pdf rendering itself is much slower and that time totally dominates the overhead of opening and closing connections anyway.

> Does it have authentication?

yes, it auths with a bearer token that is borne only by our own internal server. this is backend technology so we don't have to auth an arbitrary user. we know in advance which users are authorized.

> Is it public or private API?

private - it's internal only; outside requests can't reach it.
> Who is managing security?

we are, with a lot of assistance from the built-in security model of AWS.

> Is it running behind it's own nginx or other proxy?

the main app is behind nginx. the microservice is running in a docker container that exposes itself over a dedicated port. there's no proxy for the microservice, again, because of the low throughput/low load on the service. no need to have a load balancer for this so the most obvious benefit of a proxy wasn't applicable.

> Does it have DoS protection (PDF generation can be CPU intense)?

yes, it's an internal service and our entire infrastructure is deployed behind a gatekeeper server and firewall. the service is inaccessible by outside requests. the internal requests are queue'd up and processed 1 at a time.

> What about the schema for request?

request payload validation is handled on both ends. the user input is validated by the main app to form a valid template context. the pdf generator also validates the template context before attempting to generate one. it's possible to have a valid schema with data that can't be handled correctly, though. errors are just returned as a 500 response. happens infrequently.

> They need to be deployed together with changes in other services, right?

nope. the microservice is fully stand alone.

> What about changes to database schema - you need to remember to update that service as well and redeploy it at the right time as well - just after successful db migrations - which live in another project.

the microservice doesn't interact with a database at all. schema changes in the main app database could potentially influence the pdf template context generation, but there are unit tests for that, so if it does happen we'll get visibility in a test failure and update the template context generation code as needed. none of this impacts the microservice itself though. it is fully stand alone. that's the point.

> All of that and much more needs to be replicated for each microservice, right?

in principle yes, and these are good guidelines for determining what is or is not suitable to be a microservice. if it would need to auth an arbitrary user, or have direct database access, or be exposed to public requests, it might not be a good candidate for a microservice. things that can stand alone and have limited functional dependencies are much better candidates.

> Why not just have a module in your monolithic app that does it.

because the monolithic app is Python/django and the PDF generation tool is Java. one of the main advantages of microservices architecture is much greater flexibility in technology selection. A previous solution used Python subprocesses to call out to PDF generation software. It's actually easier and cleaner for us to use a microservice instead.
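A sketch of what the Django-side call might look like, using only the standard library; the endpoint URL and payload shape are invented for illustration, since the thread doesn't specify them:

```python
import json
import urllib.request

# Hypothetical internal endpoint; the real service name and port aren't given.
PDF_SERVICE_URL = "http://pdf-service.internal:9090/render"

def build_payload(template_name, context):
    """Serialize the template context into the (assumed) JSON shape the Java service expects."""
    return json.dumps({"template": template_name, "context": context}).encode("utf-8")

def render_pdf(template_name, context, timeout=30):
    """POST the template context to the PDF microservice and return raw PDF bytes."""
    req = urllib.request.Request(
        PDF_SERVICE_URL,
        data=build_payload(template_name, context),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

The HTTP boundary is what buys the language independence: the Django app never needs to know the service is written in Java.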

<sarcastic mode on> Maybe so. But didn't the move create a cool Software Architect job position out of nowhere -wink-wink- ? <sarcastic mode off>

> Also, the team spent too much time debating standards etc.

Ah yes, the 'let's have decentralised microservices with centralised standards!' anti-pattern. It results in lots of full-fledged, heavyweight, slow-to-update services, which also have all the problems of a distributed system. It's the worst of both worlds.

was it a complete rewrite? I don't think that's the right way to transition. Why didn't you try to separate the features one by one from the monolith? That would give more immediate feedback and real problems to work on, instead of getting stuck in the holy architecture debate.

No it wasn't a complete rewrite. We started by separating out the most mission critical components. Maybe that's where we went wrong; the most mission critical components were quite large and unwieldy to split out all at once. There was also the overhead of keeping the newly separated out component and the monolithic app in sync.

Microservices are a technology. IMO, like any other tech, they should be used when there are clear benefits expected in the near future, not as a blanket "always microservices" policy.

Although I personally had to deal with some monolithic monsters that I wished were split into smaller services.

What was the benefit you envisioned?

We were looking to break up our monolithic Rails app into microservices so devs could iterate and develop faster. We also thought that the application as a whole would become more failure resistant. Unfortunately, interdependencies among the services themselves meant that the failure resistance didn't pan out as we thought.

You split off specific parts of your app into microservices because you want to scale those parts independently from the rest. It's not just a blind decomposition of a monolith for the sake of decomposition.

Wait a minute, this sounds familiar. *looks at username* Oh.

'mrhektor is a green account.

I should've been more clear. The story sounded familiar, because I worked with them and recognized them based on the username.

Hi Stepan!

> Also, the team spent too much time debating standards etc.

IMHO. You need a lead with a clear vision that drives the effort. Too many leads will create chaos.

We did have a lead with a vision, and part of the vision was standards for each service (for example, file structure in Go). I can see the rationale behind it; a new dev can onboard very quickly on to a new service. But in hindsight, maybe it wasn't thought out enough.

Heh, since the day I heard of "microservices" the only thing I could think was "have fun maintaining that".

> It was mostly because the monitoring and alerting were now split into many different pieces

Well, there's your problem - you need a monitoring microservice and an alerting microservice! Well, those may be too coarse by themselves, but once you break them down into 5 or 6 microservices each, you'll be ready for production.

Author here.

To answer some questions: yes this is obviously poking fun at Docker, but I also do really believe in Docker. See the follow-up for more on that: https://circleci.com/blog/it-really-is-the-future/

In a self-indulgent moment I made a "making of" podcast about this blog post, which is kinda interesting (more about business than tech): http://www.heavybit.com/library/podcasts/to-be-continuous/ep...

And if you like this post you'll probably like the rest of the podcast: http://www.heavybit.com/library/podcasts/to-be-continuous/

It's a good post. It captures both the innate complexity of the problems some of us are working on, and the incredible WTF moments involved. Both sides of your theoretical conversation have their own pitfalls. Everyone in the comments here seems stuck on the "I don't really need microservices" train (and there is such a thing as overdoing it … but I've never seen it), but I can't help but think that Italics nailed it here:

> -It means they’re shit. Like Mongo.

> I thought Mongo was web scale?

> -No one else did.

It's so incredibly true, and I laugh (and cry, b/c we use Mongo) at this section each time I read it. Also, this gets me every time:

> And he wrote that Katy Perry song?

I have to admit, I was startled to get to the "we sell these services" blurb at the end, because that was such a well-done unsell.

I like the humility on display here. And I'm not trolling or being sarcastic.

There was a time when Heroku seemed just as foreign to me as Docker does in this article.

- So shared webhosting is dead, apparently Heroku is the future?

- Why Ruby, why not just PHP?

- Wait, what's Rails? Is that different from Ruby?

- What's MVC, why do I need that for my simple website?

- Ok, so I need to install RubyGems? What's a Gemfile.lock? None of these commands work on Windows.

- I don't like this new text editor. Why can't I just use Dreamweaver?

- You keep talking about Git. Do I need that even if I'm working alone?

- I have to use command line to update my site? Why can't I just use FTP?

- So Github is separate from Git? And my code is stored on Github, not Heroku?

- Wait, I need to install both PGSql and SQLite? Why is this better than MySQL?

- Migrations? Huh?

By the time you heard about Rails, it sounds like it was mature enough for you to start using it, because it was a serious improvement over PHP; I took a similar route. Docker/virtualization is still early and people are figuring out how the pieces fit together, and what the best pieces are. So it's best to wait until then (IMO).

To be fair, a lot of these questions are valid. You arguably /don't/ need MVC for your simple website, if Dreamweaver floats your boat you might as well use it, it's unclear why a small website needs an elaborate deployment infrastructure, using a VCS for personal code can be overkill, PGSql and SQLite aren't better than MySQL for every use case -- and so on.

Frameworks, orchestrations, even just new technologies -- these are great if they actually make your job easier or if they make your product better. Unfortunately, they often do exactly the opposite.

Agree until

> using a VCS for personal code can be overkill

I've been burned before, have you? If you're using something like Google Drive, you should use Dropbox instead, since it seems less likely to lose your work.

Obligatory link to "The S stands for simple", a SOAP-bashing classic: http://harmful.cat-v.org/software/xml/soap/simple

"Let me tell you about UDDI"

Nooooooooooooooooo. Every time someone says "service discovery" a kitten dies (except for consul, that's the biz).

Everything must be in XML. Except the SoapAction header. Which has no defined standard. Yeah I remember all that madness.

Remember? Thomson Reuters' on-demand APIs are still largely SOAP-based.

I'm working with a very well-known American company with over $4b annual revenue that shall remain nameless and is currently developing a new SOAP API to replace the existing "dump a CSV on an FTP server" integration.

never touch a running system...

fair enough, but I posit that X users started consuming this API when SOAP was prevalent, Y users started when ReST was prevalent, and Y >> X. Furthermore, SOAP is hard to maintain these days because it's so ancient, i.e. the libraries are not new and/or actively maintained.

As such, I maintain SOAP should be gone for the good of the running system.

In Python you simply don't have good SOAP libraries. They were all started at the tail end of its popularity and then all died quiet deaths when attention shifted to ReST before they were actually production ready, and if you now want to talk to a SOAP service… well, better don't do it in Python. 2, that is. Forget about 3.

Have you seen Zeep?

It's literally billed as "A fast and modern Python SOAP client". Python 2 and 3 compatible. Last commit was two weeks ago.


Nope. We needed one last September, zeep didn't yet exist back then.

And going by the bugtracker, it's running into quite a few problems with almost-but-not-quite compliant servers/WSDL files, which is a real issue when you're trying to interface ass-old legacy APIs (we're talking "not upgraded since 2006"-old) made by $BigEnterprise. Maybe this time the project won't die before they work out all the little kinks.

If that was ever true it certainly doesn't seem to be true now. All the tools support WSDL-first. All the tools are compatible with each other. Fill in the URL, let it autogenerate the interface, write your code and it all just works.

Because it was the latest trend 10+ years ago, and now people have made it just work because their applications are all built on it and people need actually good tools to use these architectures. It's always about tools; it's not like TCP is the best protocol or anything, it just has the best tooling, ditto for C, POSIX, etc. Anything can be a good standard after 15+ years of work on it. Containers will be like that in a couple of years. It's all just cycles, man.

Man, I don't think this is the future at all. OK, Docker is good and has its purpose, and it's very good at what it does: "Run only one process in one brand new kernel". But beyond that, it's just a daemon that uses and abuses Linux containers. You can scale easily, but it's a pain in the ass to upgrade apps, and you're expected to run only one process per container. It doesn't look like the future to me to have 30 different Linux containers each running only one process. Dude, you have a kernel in your hand, why the hell you will run only one process on it? (What the heck, you can protect yourself and scale without being the bitch of a daemon; you just need to know your best friend the kernel.) You don't need to make microservices for everything. They're good, OK, but they're not the solution for everything like people are saying...

I really don't have any idea why people are so excited about "docker all the things".

It's all about simplifying deployment. That's it, that's what's so good about using containers.

I don't know if you understand what Docker really is when you say something like "Run only one process in one brand new kernel". The kernel is shared between containers; that's the whole idea. You package the things your application needs and you're done with it.

The current problem with containerization is that there are no really good or well-understood best practices; people are still experimenting, and that's why it's a big moving target and, consequently, a pain in the ass if you need to support a more enterprise-y environment. You will need to be able to change and re-architect things if the state of the art changes tomorrow.

I agree with your sentiment about going overboard on "docker all the things"; that's dumb, and some people do it more because of the hype than from understanding their needs and choosing a good solution for them. But I think you are criticising something you don't really grasp, as these two statements suggest:

> "Run only one process in one brand new kernel"

> you have a kernel in your hand, why the hell you will run only one process on it?

I'm not trying to be snarky; I really recommend doing a bit more research on Docker to understand how it works. Also, Docker doesn't make it a pain in the ass to upgrade apps — quite the contrary, if you do it properly.

Doesn't statically compiling programs solve the deployment issue better? I mean, as far as I can tell Docker only exists because it's impossible to link to glibc statically, so it's virtually impossible to make Linux binaries that are even vaguely portable.

Except now Go and Rust make it very easy to compile static Linux binaries that don't depend on glibc, and even cross-compile them easily.

Hell I think it's actually not even that hard to do with C/C++: https://www.musl-libc.org/how.html
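For example, assuming the Go toolchain and a musl cross-compiler are installed (commands are illustrative):

```shell
# Go: a fully static Linux binary with no glibc dependency (pure-Go code).
CGO_ENABLED=0 GOOS=linux go build -o myapp .

# C with musl instead of glibc: static linking works as advertised.
musl-gcc -static -O2 -o hello hello.c
```

Either binary can be copied to a bare Linux box (or dropped into a `FROM scratch` image) and run as-is.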

If I have a binary built by Go, what problems does Docker solve that just copying that binary to a normal machine doesn't?

Binaries are one thing, but there are the other abstractions that containers bring in regard to networking and storage.

You expose your app's network APIs (e.g. open ports), filesystem mounts, environment variables (12-factor style), etc.

Your application becomes a block that you can assemble for a particular deployment; add some environment variables, connect a volume with a particular driver to a different storage backend, connect with an overlay to be able to talk to other containers privately across different servers or even DCs, etc.

It's really all about layers of abstraction for operating an application and deploying it to different environments.

With the latest container orchestration tools, you can have a catalog of application templates defined simply in YAML, and it's very easy to make it run anywhere. Add some autoscaling and rolling upgrades and it becomes magic for ops (not perfect yet, but check out the latest Kubernetes to see new advancements in this space).

With the proper tools and processes, this removes a lot of complexity.
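As a sketch of that declarative style, a hypothetical docker-compose.yml (all names and values invented) ties ports, env vars, volumes, and private networks together:

```yaml
# Illustrative Compose file: the app as an assemblable block.
version: "2"
services:
  web:
    image: myorg/web:1.4.2                    # immutable, versioned artifact
    ports:
      - "8080:8080"                           # exposed network API
    environment:
      - DATABASE_URL=postgres://db:5432/app   # 12-factor config via env vars
    volumes:
      - appdata:/var/lib/app                  # storage via a named volume/driver
    networks:
      - backend                               # private network to other containers
  db:
    image: postgres:9.5
    networks:
      - backend
volumes:
  appdata:
networks:
  backend:
```

Swapping the volume driver or the network backend changes the deployment target without touching the app definition.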

> add some environment variables, connect a volume with a particular driver to a different storage backend, connect with an overlay to be able to talk to other containers privately across different servers or even DCs, etc.

But environment variables already exist without Docker. Volumes already exist, a.k.a. partitions. "Overlay networks" already exist, a.k.a. unix sockets or plain TCP/UDP/etc. over the loopback interface.

I'm not trying to be a dick here; it's just that the points you brought up don't really bring anything new to the table. How is this different from just having a couple of bare-metal or virtual machines behind a proxy?

There are some aspects of containerization that are worthwhile, but only at certain scales, and the points you brought up make me question whether you perhaps might be over-engineering things a bit.

Those things exist, but you need the "setup" bit to achieve the level of isolation that you want.

For example, volumes: With Kubernetes (on Docker), the lifetime of the volume mount is handled for you. No other containers have access to the mount. Container dies, mount dies. Whereas on plain Linux, mounts stay. You need cleanup, or you need to statically bind apps to their machines, which will seriously limit your ability to launch new machines -- there will be a lot of state associated with the bootstrapping of each node. Statefulness is the enemy of deployment, so really what you want is some networked block storage (EBS on AWS, for example) plus an automatic mount/unmount controller, thereby decoupling the app from the machine and allowing the app to run anywhere.

Environment vars are inherited and follow the process tree, so those are solved by Linux itself.

Process trees also handle "nesting": Parent dies, children die. But you will end up in a situation where a child process might spawn a child process that detaches. This is particularly hard to fix when a parent terminates, because the child doesn't want to be killed. Now you have orphaned process trees. The Linux solution is called cgroups, which allows you to associate process trees with groups, which children cannot escape from. So you use cgroups, and write state management code to clean up an app's processes.

I could go on, but in short: You want the things that containerization gives you. It might not be Docker, although any attempt to fulfill the principles of containerization will eventually resemble Docker.

It's about the automation of these things.

You now have generic interfaces (Dockerfile, docker-compose, Kubernetes/Rancher templates, etc.) to define your app and how to tie it together with the infrastructure.

Having these declarative definitions make it easy to link your app with different SDN or SDS solutions.

For example, RexRay for the storage backend abstraction of your container:


You can have the same app connected to either ScaleIO in your enterprise or EBS as storage.

We are closer than ever to true hybrid cloud apps, and it's now much easier to streamline the development process from your workstation to production.

I think it's pretty exciting :)

> We are closer than ever to true hybrid cloud apps, and it's now much easier to streamline the development process from your workstation to production.

This sounds exactly like the "It's the future!" guy in the original post...

Have to admit, as a fellow Go dev, with single-binary static compiles, I don't really GET why I need Docker... all it seems to offer is an increased workload and a complicated build process.

You don't, really, but tools like kubernetes, which are really useful if you're deploying a number of heterogeneous apps, expect a container format as they aim at a market wider than just golang. The overhead of putting the service inside docker and following 12 factor is minimal and largely worth it, but if you're only running a single go binary, you could legitimately go other ways.

Something like kubernetes also lets you abstract away the lock-in of your cloud infrastructure, so whilst it adds another layer and a bit of complexity, it again is arguably worth the effort if you're worried about needing to migrate away from your current target for some reason in the future.

As a framework it abstracts apps from infrastructure quite well. It's super easy for me to replace my log shipping container in kubernetes and have most things continue to work, as all the apps have a uniform interface.

Nobody's saying you can't build these things without kubernetes, but it definitely gives me more of them than configuration management systems currently do. Personally, I'd rather aim at the framework that handles more of what I need it to do.

Finally, bootstrapping a kubernetes cluster is actually quite trivial and you can get one off the shelf in GKE, so I'm not really sure why I'd personally want to go another route.

In my humble case, Docker solves the problems I have to manage the systems on which my application runs (and that's mainly it). A single dockerfile of 20-30 lines describes a whole system (operating system, versions, packages, libraries, etc), and cherry on the cake, I can version it in my git repository.
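For a Python site like the one described, such a file might look roughly like this (a generic sketch with invented package names, not the poster's actual file):

```dockerfile
# Hypothetical ~20-line system description, versioned in git with the app.
FROM debian:jessie

# OS-level packages; versions are pinned by the base image tag above.
RUN apt-get update && apt-get install -y --no-install-recommends \
        python2.7 python-pip libpq5 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Library versions pinned in requirements.txt, also under version control.
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
EXPOSE 8000
CMD ["python2.7", "app.py"]
```

Everything needed to reproduce the server lives in those lines plus the repo itself.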

This is not revolutionary in itself, but having the creation and deployment of a server be 100% replicable (+ fast and easy!) on dev, preproduction, and production environments, plus having it managed with my usual versioning tool, is something I appreciate very much.

Sure, there are other tools to do the same, but docker does the job just fine.

> having the creation and deployment of a server being 100% replicable

The problem of ensuring that upstream dependencies can be reproducibly installed and/or built is, of course, left as an exercise for the reader.

Isolation is a strong argument. You don't want one process to starve another. You can get isolation via one-host-per-service or you can get it using cgroups. Docker sort of gives you both, without the waste of one-per-host and with a manageable set of tooling around cgroups.

systemd runs services in their own cgroup by default and gives you control over the resources alloted to those cgroups.
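For example, a unit-file fragment (the directive names are real systemd resource-control options; the service itself is hypothetical):

```ini
# myapp.service (fragment): systemd places the service in its own cgroup
# and enforces these limits on that cgroup.
[Service]
ExecStart=/usr/local/bin/myapp
CPUQuota=50%
MemoryLimit=512M
```

That gets you per-service isolation without any container runtime in the picture.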

Yes yes yes. We're nearly 100% Go on the backend and deployment is a breeze. We don't use Docker because it wouldn't give us anything beyond more things to configure. Our CI deploys binaries to S3 and the rest is just as easy.

Namespaced filesystem and networking, just for one. You seem very eager to dismiss a technology you only barely understand.

Namespaced filesystem shouldn't even be a special requirement - your program should use relative or at least configurable paths. I mean, directories are namespaced filesystems.

What networking problems does Docker solve?

Namespaced FS as in chroot.

Your program doesn't see what else is running on the system. It also means possible conflicts over shared libraries and other system-wide dependencies go away.

This kind of isolation is not only good for app bundling as a developer, but even more important as an operator in a multi-tenant scenario. You throw in containers and they don't step on each other's toes. Plus, the system stays clean and it's easy to move things around.

Network namespace as in linux network namespace (http://man7.org/linux/man-pages/man8/ip-netns.8.html).

Each container has its own IP stack.

Containers provide proper abstractions so you can then assemble all of this, pretty much like you use pipes on a unix shell.

It seems to me that you're confusing configuration management and containers...

Deployments, installations etc. are pretty easy, it's not something containers are actually good at solving. At best you containerize the configuration management itself, which simply makes it harder to work with.

I've been working with configuration management for some years now, apart from also working as a developer, so I don't believe I'm confusing them so much as admitting that containers make configuration management and deployment easier. I might not have been so eloquent on that point, but it's my feeling from using Docker for the past 2 years.

Nowadays all that I do is setup a barebones CoreOS instance and fire away containers at it, be it with kubernetes (and then my config management is a bit more robust so to setup k8s in CoreOS) or just use CoreOS's own fleet if it suffices.

Then I get the goodies of containerization such as process isolation, resource-quotas, etc.

Like I said: it isn't painless, sometimes much the opposite, but it's worked much better for the lifecycle of most of the products and services I've been working on the past couple years.

Even before with automated deployments it wasn't so easy when configuration begins to get hairy. And yes, you can argue that this might be a smell of something else but that's what I've seen happening over and over.

Docker containers don't contain a kernel. A container isn't anything special -- it's "just" a namespaced set of processes that are isolated from the host system. If you run "ps" on the host, you will see all the containers' processes.

One process per container is perfectly fine. In fact, that's the common use case. There is absolutely nothing wrong with it, and there is practically zero overhead in doing it.

What you gain is isolation. I can bring up a container and know that when it dies, it leaves no cruft behind. I can start a temporary Ubuntu container, install stuff in it, compile code in it, export the compilation outputs, terminate the container and know that everything is gone. We do this with Drone, a CI/build system that launches temporary containers to build code. This way, we avoid putting compilers in the final container images; only the compiled program ends up there.

Similarly, Drone allows us to start temporary "sidecar" containers while running tests. For example, if the app's test suite needs PostgreSQL and Memcached and Elasticsearch, our Drone config starts those three for the duration of the test run. When the test completes, they're gone.

This encapsulation concept changes how you think about deployment and about hardware. Apps become redundant, expendable, ephemeral things. Hardware, now, is just a substrate that an app lives on, temporarily. We shuffle things around, and apps are scheduled on the hardware that has enough space. No need to name your boxes (they're all interchangeable and differ only in specs and location), and there's no longer any fixed relationship between app and machine, or even between app and routing. For example, I can start another copy of my app from an experimental branch, that runs concurrently with the current version. All the visitors are routed to the current version, and I can privately test my experimental version without impacting the production setup. I can even route some of the public traffic to the new version, to see that it holds up. When I am ready to put my new version into production, I deploy it properly, and the system will start routing traffic to it.

Yes, it very much is the future.

It comes down to not wanting different applications (not equivalent to processes) to share a single filesystem and all that implies like shared dependencies.

Docker contains the effects of sucky programming to a single container. If your programs follow best practices, systemd is just as good.

Please also read the followup (https://circleci.com/blog/it-really-is-the-future/).

I read both articles a year ago and it really helped me grasp the whole container movement.

"Why don’t I just use Google’s thing?

"-You think that’s going to be around in 6 months?"

Isn't reputation a thing of beauty?

So true. Why people often use "it's backed by Google" as an argument in favor of anything is beyond me.

"it's backed by Google" is the reason I avoided Go for years and is still the only reason I'm nervous using it.

If it's free software it won't just disappear. If it's proprietary and hosted by an org that isn't making real money from it, that's a different story....

Is the development open though (serious question, I don't know how they do it with go)?

Look at Android: it's a closed project that they occasionally release some source code for. If Google decided to drop Android it would be a critical blow, because there isn't much of a community around it.

It depends what you mean by "open." Anyone can subscribe to the go-codereview mailing list. Anyone can submit a patch, but you need to a) use Gerrit and b) sign a (digital) contributors license agreement with Google.

It's "open", but it would almost certainly collapse entirely if Google decided to drop support for it. (There's no sign that they'll do that, but it's not a risk you take with a language like C.)

Even by the most lax standard you couldn't call the app store open; and that's kind of what's being shuttered here. The technology wasn't ever particularly interesting (not a bad thing), but the distribution model was. And that's entirely closed: they're not about to give anyone else permission to run even parts of the app store, and that's that.

Hasn't App Engine been around for more than 4 years by now?

Seriously though, this is why I'm afraid to use something like Firebase now.

Funnily enough, I don't think this is a real concern. It's the stereotype, which is why I said it, but Google believes it will make as much money off cloud as from ads, so I wouldn't expect anything to get shut down.

I am going to ramble. Just move on if you don't care to hear the ramblings of a 62 year old development manager.

I'm pretty docker ignorant. I think I get it in concept. I manage >150 web sites (~15,000 pages total) that are php based with eXist-db and oracle (overkill but forced to use it) for database backends. My team develops on mac os x and pushes code to RHEL. We have never had a compatibility problem between os x and RHEL except for some mgmt scripts in bash that were easily coded around.

Big data to me is a 400 MB apache log file.

I go home grateful I don't have to be in the buzz word mix.

I do read a lot about technology and over time that informs some changes like using apache camel for middleware, splunk for log file analysis yada dada...

I have had bosses that brought me buzz word solutions that don't ever match the problems we have. I hate that but right now I am not in that position. My boss leaves technology decisions to us.

Lest you think we are not modern at all, we do use a CDN, git and more.

Some days I get anxiety from reading HN, feeling stupid. Some days I get a lift from HN from reading articles like this one and the comments.

I am so glad I'm not in the business of chasing technology.

Sometimes it seems the webdev world is unaware of the complexity its creating simply to execute instructions....

What do you mean? I'm just running bytecode on a virtual machine on top of a virtualized container on top of virtualized hardware on a CPU where the instruction set is virtualized in microcode...


I think every developer goes through that phase at some point.

I think the problem is that it's an entire industry. And they don't even seem to see the problem.

There are 2 major issues with this:

1) Small teams (~1-5 people) trying to seem "big" by working at Google's scale.

2) Heroku's prices. We are currently (successfully so far) migrating a small Django project from bare Amazon EC2 instances to ECS with Docker. Even using 3 EC2 micro instances (1 vCPU, 1 GB RAM) for the Docker cluster we would spend ~8 USD/month/instance. With Heroku the minimum would be 25 USD/month/dyno. That's a 3x increase in expenses.

It's very possible to take advantage of technologies like containers without getting too caught in the hype.

wait. you're comparing $25 with $75. it is 3x but it's still accounting noise by any standard imaginable unless you're running a charity server for an open source project.

What about the standard of "I'm young and this is a side project I'm doing in a couple of hours at the weekends"? Of course once you have a real company with more than two customers $75 is nothing. But version 0.1 is often a tool that's only useful to you.

Heroku has free dynos for side projects, and hobby dynos ($7/dyno/month) for slightly-less-side projects. So that original $75/m quote isn't quite right for that situation.

Yes it is: our EC2 instances have 1 GB of RAM; the $7 dyno has 512 MB.

Young people doing side projects on the cheap... are they Heroku's bread and butter?

> What about the standard of "I'm young and this is a side project I'm doing in a couple of hours at the weekends"?

Well, even someone who's young (for values of 'young' older than high-school age) is probably spending more than that every month on beer, food & entertainment …

i'd roll that into 'charity/hobby' part.

or you just host it on an old laptop hidden in your closet.

To answer the concerns raised in the comments: we are a real company and it took 2 weeks (while working on other features and bugfixes) to migrate to Docker. The plus is that now we have experience with the platform and we can streamline the process. Again: we are not using microservices or anything like that, simply Docker containers instead of EC2 instances, which makes life pretty damn easier (and cheaper).

And $25 and $75 are bogus numbers; what if we start running 10 instances?

And what is the cost of the working hours spent to migrate to Docker compared to just doing a git push to Heroku?

A few hours a week dedicated to rebuilding our deployment process (which was a pain, since everything had to be provisioned manually for each new project). Not saying it was the best approach, but it sure was an improvement and worth the (relatively little) time.

Once again keep in mind that for new projects the process is so streamlined it will take a fraction of the time to set them up.

Indeed, the goal is to solve your business problem with technology, not use Docker for everything that you can find in your infra. Many people are mixing up the two. Docker can be replaced with anything that is hyped at this level.

Exactly. Docker provides a set of features that are nice to standardized development environments and deployments across projects. Anything else that accomplishes that works as well.

Your micro instance has CPU credits while your Docker cluster does not? So a price increase is expected...

Previous discussion from 434 days ago: https://news.ycombinator.com/item?id=9688383

I think I will be fine, thanks. I'll stick to my shell scripts; so far they've outlived any other devops fad.

I worry when I can't tell if a comment like this is based on fact or just trying to be funny.

Because I've seen my share of nasty "legacy" automation but, surprisingly, I still think a good set of well thought-out shell scripts written by someone that understands what's being automated still beat modern tools, even when the person doing the automation is the same.

I don't quite know why this is, but there's something timeless about shell scripts. I've also seen shell script automation survive for a long time unattended and with zero issues. Not so with some of the modern tools that are supposed to be all unicorns and rainbows.

My take is that the shell script does not have any unspoken assumption or magic that is performed by the tool.

It all has to be in the script, built up strictly from well-understood and long-stable basic bricks (and in the few places where it isn't, it's even worse with devops tools).

Any issue, any question can be answered by reading the damn shell script and you're never dependent on a cookbook/recipe/playbook/component that you got off of some github repo that you need 5% of to do X.

An un-researched opinion from someone who replaced chef/ansible/puppet with his own shell-script config-management system:

I don't have to rewrite my shell-scripts every 6 months when a new version comes out. New updates usually only happen when security issues arise.

Shell-scripts tend to be simple. There's not a lot of magic hand-holding going on, which means not a lot of complexity to break things.

It keeps you from getting too abstract. You're writing pretty close and specific to what you want it to do, not "how it should be".

They are typically standalone. It's really easy to have 1 script that solves one problem, and another script that solves another. You don't need a giant code-infrastructure to keep things going.

I think config-mgmt tools can be extremely useful if you're running a widely-ranged environment. But you probably shouldn't be running a widely-ranged environment. If you keep things simple, and run as homogeneous as possible, you probably don't need all the added complexity.

> I think config-mgmt tools can be extremely useful if you're running a widely-ranged environment. But you probably shouldn't be running a widely-ranged environment. If you keep things simple, and run as homogeneous as possible, you probably don't need all the added complexity.

This applies to small environments. If the environment is large the situation almost reverses.

Deploying automation throughout a large homogenous environment is where config-management tools really shine. They make it easy to ensure homogeneity is maintained (even if that just means ensuring all machines have the same set of shell scripts) and allow grouping for staggered updates.

If the environment is widely-ranged and large, the utopia starts to break down. Their configuration explodes in complexity and (if you're not careful) you end up with mostly the same amount of work as if they were managed as small independent environments. With the added risk that there is now a single place from where you can break everything at once.

And this happens... Usually from wrong assumptions of what's common between all machines in the environment. In homogeneous environments almost everything is common, but in widely-ranged environments you sometimes add some configuration that wasn't there before and you think applies to the whole set and all hell breaks loose. If you're lucky this will happen suddenly, if you're not, breakage will spread slowly and you'll spend quite a lot of time scratching your head on why.

Well, it depends ;)

I don't think large/small is a good deciding factor. You can be large and homogeneous, or small and diverse. I think similar/dissimilar is a better decider for config-mgmt vs shell-scripts.

I'd argue that config-mgmt usually does a better job if your setup is large and complex. No need to write a script that checks if it needs to install a .deb, .rpm, or whatever, if your config-mgmt tools have already done that work.

Also, if you build your shell-scripts right, they can ensure that your system is kept the same.

Why not use Python instead? Shellscript is so... chaotic.

I pine for the days of yore when the Unix Philosophy was strong and pure, and every program did one thing well, and only one.

Like the way the shell would fork off an "expr" sub-process to parse a mathematical expression to add two numbers, then write the result to a pipe via stdout, then terminate the process, clean up all its resources, and switch context back to the shell, which then read the serialized sum back in from the other end of the pipe, and went about its business, regardless of the fact that the CPU running the shell already had its own built-in "add" instruction in hardware.
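The round-trip being mocked is easy to reproduce; a toy sketch in Python, using the POSIX `expr` utility:

```python
import subprocess

# Fork a whole child process, pass "2 + 2" as argv strings, and read
# the serialized sum back over a pipe -- all instead of one ADD instruction.
result = subprocess.run(["expr", "2", "+", "2"],
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())  # → 4
```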

> I pine for the days of yore when the Unix Philosophy was strong and pure, and every program did one thing well, and only one.

Unless you are over fifty years old, you never experienced this.

Rob Pike said it best: "Those days are dead and gone and the eulogy was delivered by Perl."

Perl being a thing in 1995...

Now "expr" has finally been rewritten as a jQuery plug-in.

[1] https://github.com/baphomet-berlin/jQuery-basic-arithmetic-p...

This is the first coherent refutation of the "do one thing well" ethos I have ever read. Thanks for putting into words what I haven't been able to express myself.

Shell/batch scripting can often be useful in the devops world, where you have no guarantee that any additional tools (python, ruby, perl, powershell, whatever) will be available.

Shell scripts are guaranteed to be runnable on all machines.

Unfortunately the shell "language" sucks, but still...

So you're running 'sh' scripts, right? None of that new-fangled bash stuff...

Oh, and be very careful of the commands your script invokes!

shell scripts have no guarantee of portability (often less than Python, which has a rich standard library available on all platforms).

Shell languages are great at doing interactive programming. Few non-shell languages can match the convenience, flexibility, and expressiveness in that domain. Of course, in any other domain (scripts) they are awful.

> Unfortunately the shell "language" sucks, but still...

Why do you say that? genuinely curious

Python + Fabric is a lovely devops/sysadmin toolbox.

Still no Python 3 support though

How do you write "foo | bar" in Python?

    from subprocess import Popen, PIPE
    p1 = Popen(["foo"], stdout=PIPE)
    p2 = Popen(["bar"], stdin=p1.stdout, stdout=PIPE)
    p1.stdout.close()  # Allow p1 to receive a SIGPIPE if p2 exits.
    output, _ = p2.communicate()

Thanks. Maybe I'll make a public Gist that's a kind of "foo | bar" cookbook for different languages...

What you actually do in Python is use https://github.com/kennethreitz/envoy

That seems to recommend just calling the shell to do pipelines? Makes sense... Or maybe it parses that syntax itself?

I believe it parses it itself.

Looks like it's not maintained.


When you're not writing shell, just use the tools the language gives you.

For that matter, I think a shell script is cleaner than a python script for devops; but I don't think the composability of unix tools is that much of an advantage compared to the amount of python libraries out there.

When I use shell it's often exactly because I want to construct pipelines of processes and FIFOs and do all the other things that shell does very well and has done well for decades.

I'm likely to be using Python programs and other programs in those shell scripts. The beauty of shell is that it makes it so easy to compose programs written in different languages.

Shell does things well provided all the intermediate states are naturally expressible as streams of bytes. Otherwise not so much.

I think the advantages of using a single language for everything outweigh the disadvantages - see e.g. http://www.teamten.com/lawrence/writings/java-for-everything... (though actually my single language is Scala)

Everything in a computer is a stream of bytes... My shell scripts often use tools like jq and jshon to deal with JSON structures, etc. File hierarchies can also be very pleasant data structures.

The kinds of scripts I write would be awkward to have as compiled JVM programs, I think. Shell is just way more ergonomic for me for many tasks.

> Everything in a computer is a stream of bytes

Data can be meaningfully separated from control and structure in many cases, and failure to do that is a major (perhaps the major) source of security bugs.

Everything in a computer can be interpreted as a stream of bytes. For most things an object is a better interpretation.

You could then also criticize for example HTTP or even TCP for making you turn everything into "bytes".

Shell doesn't enforce any particular interpretation of data. Pipelines simply connect one program's output to another's input. Interpretation is up to the programs.

>You could then also criticize for example HTTP or even TCP for making you turn everything into "bytes".

If these were the only standard protocols that existed, and people were trying to tell me this was great because it's easy to compose different network applications, that criticism would be completely valid.

>Interpretation is up to the programs.

But because there are no standards beyond "stream of bytes", the chance that two independently written programs working with non-stream-like data can communicate directly is extremely low.

Lots of programs can communicate with JSON, XML, standard formats like that. If some legacy program outputs a non-standardized kind of output, that's a problem to be solved, not an inherent failure of shell scripting. There is no overarching successful solution to the problem of different programs using different data representations, but I don't blame this on shell; I work happily with shell scripts as do many many others. The same problem shows up the minute you want to use a Ruby module from Python, and rewriting everything in every language is not an economically viable solution.

> Lots of programs can communicate with JSON, XML, standard formats like that. If some legacy program outputs a non-standardized kind of output, that's a problem to be solved, not an inherent failure of shell scripting.

But the shell language itself is one of these legacy non-standardized formats. Arcane escaping rules, multiple incompatible implementations, surprising ways things get interpreted as code (e.g. the recent bash CGI bug),...

How often do you actually need to do that? 99% of `foo | bar` commands could easily be `foo > a && bar < a`, which is pretty trivial to do in Python.

Well, how do you write that in Python?

I'm curious because these simple things that are delightfully easy in bash often turn out to be surprisingly tedious in other languages.

Of course, some things are tedious in bash too. But a basic principle of shell scripting is that you call other programs to do the stuff you don't want to do in shell.

Like this: http://stackoverflow.com/a/1996540
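For what it's worth, `foo > a && bar < a` can be sketched with just the standard library; here `printf` and `sort` stand in for the placeholder `foo` and `bar`:

```python
import subprocess

# foo > a : run the first command with stdout redirected to a file
with open("a", "wb") as out:
    subprocess.run(["printf", "b\na\n"], stdout=out, check=True)

# && bar < a : the second command only runs if the first succeeded
# (check=True raises otherwise), reading the file as its stdin
with open("a", "rb") as inp:
    result = subprocess.run(["sort"], stdin=inp, capture_output=True, check=True)

print(result.stdout.decode())  # prints "a" then "b"
```

Noticeably more ceremony than the one-line shell version, which is rather the point of this subthread.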

I agree it is tedious, but to be honest, reading and writing to stdin/out isn't something that would commonly need to be done in a robust system. If the world were perfect you would use library functions.

I definitely think there is scope for a language that works well as an interactive shell, and as a general purpose language. They have somewhat conflicting constraints but I'm sure we can do better than Bash. Have you seen how [ is implemented?

No stdout or stdin for robust systems? I disagree.

Yeah, a nicer shell-like language would be cool. I've been thinking about it for a while.

Bash is quirky but it gets a lot of stuff right and once you understand it it can be extremely ergonomic and productive.

And not depending on language run times other than shell can be really glorious in some situations, too...

Now you have to clean up 'a'. And decide whether /tmp or /var/tmp or a dir on some other filesystem has enough space to hold all of 'a' until 'bar' is finished. Is it a security problem that other processes could snoop the contents of 'a' or even tamper with it?

Try python "sh"


It's pretty elegant.

import sh


cos they run without needing python.

It's much easier to not need shell scripts, than to not need Python scripts.

Pffft, shell scripts. In Lisp I can emulate shell scripts and all of the technologies mentioned in the article with three to five macros.

Or just put your app(s) into containers and run them through docker compose on a single VPS. That bypasses about 99% of the things listed in this article.

You can still easily set things up so it's a git based deploy which is hands free after the initial push.

Now you have a single $5-10/month server that runs your app's stack without a big fuss. Of course it's not "web scale" with massive resiliency but when you're just starting out, 1 server instance is totally fine and exactly what you want.

I've run many projects for years on 1 server that did "business mission critical" tasks like accepting payments, etc.
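For concreteness, that kind of single-VPS stack might be captured in a minimal `docker-compose.yml`; everything here (service names, ports, the Postgres image) is a hypothetical sketch:

```yaml
version: "2"
services:
  web:
    build: .            # your app's Dockerfile
    ports:
      - "80:8000"
    depends_on:
      - db
    restart: always     # containers come back up after a reboot
  db:
    image: postgres:9.5
    volumes:
      - pgdata:/var/lib/postgresql/data   # data survives container rebuilds
    restart: always
volumes:
  pgdata:
```

A `docker-compose up -d` on the VPS brings the whole stack up, and a git post-receive hook that rebuilds and re-runs it gives you the hands-free git-based deploy.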

I wonder if the original HN "Heroku is Dead..." title will cost Heroku money / market share in indirect ways.

When I see titles like that (despite the fact that it was intended as sarcasm), I think to myself, e.g., "I bet at least hundreds of people who scrolled past it thought it was sincere, and now they will have this subconscious 'Heroku is Dead... Docker...' thought at times when deploying projects. Maybe they'll even check out Docker. Maybe these hundreds of people will represent a tipping point of sorts for Heroku->Docker migrations, because one of them will write a really great blog post about it, and it will receive thousands of views..." (alternate endings of the same thought continue to be brute-forced for a few moments).

Along the same vein of thinking, back in 2008 I had this "realization" that Google could control the world by simply showing results based on headline titles (e.g., a search for "Obama" during the election could have resulted in articles / results whose titles have words/phrases whose presences are positively correlated to lower or higher stress levels, assumptions, other emotions, etc., resulting in a net positive or negative sentiment, respectively, about the subject of the search query, all while simply scanning the results to determine which one to click).

> I wonder if the original HN "Heroku is Dead..." title will cost Heroku money / market share in indirect ways.

This would be true for an average BuzzFeed-consuming crowd, which, to my knowledge, isn't the case here.

This pretty much sums it up. New stuff, stacked on existing stuff, with the theory that older stuff was hard, and new stuff is better, but you still need the old stuff, so you basically end up doing exactly what you did, but now with more stuff in between, adding cost, complexity and additional failure modes while solving nothing.

Any of the proposed problems that containerization was supposed to fix are already fixed by using proper configuration management. In almost all cases so far, people yammering on about docker and containers (and CoreOS), it ended up being their idea of configuration management, because they didn't have any in the first place.

Say you want to fix your 'problems' with setting up servers: how about doing it the right way? You will need deployment services, regardless of containers, VMs or bare metal. You will also need configuration management services, and monitoring. Containers and special distributions solve none of it; the knowledge to run systems is still required, and layering stuff on top of your problems instead of actually fixing them doesn't help.

Get something like SaltStack or Chef, and configure the hell out of everything. It doesn't care what you're running on, and actually solves the problems that need fixing.

I've spun up a lot of kubernetes clusters to test it out. A few months ago I also tested out Flynn, Deis, Deis Workflow, Openstack, and a lot of other options. I still haven't found a simple bootstrap script that gets everything set up on AWS and lets me simply deploy my application. And it's true that storage still seems to be an unsolved problem with kubernetes.

Heroku is great, and free for small services. On the other hand, a highly-available kubernetes cluster is going to set you back at least $100 per month, which is just too much for small startups and side projects before they take off.

I think I'm going to forget everything and head towards http://serverless.com/. No Heroku, no Docker, no micro-services, no servers. Just everything running on AWS Lambda and DynamoDB. And everything static in S3 behind Cloudfront.

Or maybe just Firebase. But I really am tired of managing servers.

So, why not ditch AWS and use GCP? Kubernetes cluster setup in one command, or use AppEngine.

Maybe the problem is AWS.

> So, why not ditch AWS and use GCP?

It's a Google product offering, which means it could be EOLed tomorrow. Or this afternoon. Or maybe it already has — better go check their blog.

Amazon and Google have almost the exact same language in their ToS for deprecation policies. Also, k8s is open source, so you can at least go somewhere else if you really must.

GCP isn't "Google Labs".

Take a look at Convox for setting up AWS then getting out of the way. It's open source and free aside from the base AWS costs.

Disclaimer: I work at Convox.

This looks REALLY good, thank you! I will definitely be trying this out.

Rule of thumb: divide the number of backend engineers by 5 and you get the optimal number of micro-services.

I got 0.2

Should I round up or down?

Congratulations! It's serverless!


I got a genuine laugh out of this quote "Yeah, BDSM. It’s San Francisco. Everyone’s into distributed systems and BDSM.".

Thought it was a joke, then quickly googled Aphyr's blog...

The OP should have snuck a joke in there about The Armory :)

Making sort of funny articles about Docker doesn't change the fact that circleci's support for docker is horrible. I don't use it on prod (and I probably won't), but since circleci forces me to use ubuntu 12.04, which is not something I use on prod, I want a docker container which looks like my production host. Having said that - circle really tries to support docker, but it doesn't.

I have to use ECS for caching (I am not happy about it)

Builds might fail due to the custom docker version/compilation

You can mock docker, but people are using it in one way or another and you should support it properly.

This needs a "(2015)" adding to the title.

I can always count on at least one comment per tech article with exactly this text.

Having read the article back then (and reread it now) it seems like it's still relevant. Maybe we'll have to add the year qualifier after a while when AWS lambda becomes "the way".

Because 2015 was, like, 100 years ago...

Containers are obsolete. AWS Lamba is the new hot thing.

On the timescale of the web technologies, that probably reads as "ancient".

We should really have the year of the original post in the title already! It's the current year!

it was so last year

Agreed, was reading through this and before I checked the date - all I could think was "I know I've read this before - did someone just republish this as their own?"

Most commenters don't seem to realise that this is satirical, and that the author actually thinks Docker is 'the future':


But there is, as the author notes, truth in the satire.

I assume this entire article is one piece of sarcasm. Because after reading it, how could any sane person not prefer Heroku?

From your comment I see you didn't read the post.

Read it, it's a lovely 5 minutes piece of writing.

It looks like I was not the only one to be confused: https://circleci.com/blog/it-really-is-the-future/

Yes. You are right.

At least you understand the author's intention. I would be worried if some non-technical people took the title literally...

Read it and passed it around to the guys at the office and I can tell you that it goes beyond the Heroku v Docker debate. For us, here at some kind of bank development shop, it is about languages and frameworks. I would advise that you read the whole thing. Pretty funny I must add.

show me an easy way to push a rails application to aws (with docker) that uses RDS ?

is there ANY way i can spin up a server, add the ssh keys to some configuration file somewhere and just "docker-magic push" and have my rails application running ?

or do "docker-magic bundle exec db:migrate" and have that command run on the server.

Or push a Procfile with worker definitions and have the PAAS automatically pick it up, add it to supervisord/systemd and run it ?

Cloud Foundry is probably what you want. To the point that the Ruby buildpack code is a soft fork of Heroku's (source: I work on the Cloud Foundry buildpacks team). It'll run on AWS, vSphere, OpenStack, Azure, GCP is coming and others to follow.

There is, however, still a hump to get over in installation -- you need to learn what BOSH is, install BOSH, then install Cloud Foundry with BOSH. In the long run, for a production deployment, this is what you want. But it certainly doesn't feel that way when you just want to kick some damn tires.

If you just want to tinker, you can try PCFDev[0]. It's a fully-functional Cloud Foundry installation in a single VM.

Disclosure: I work for Pivotal, we donate the majority of engineering on Cloud Foundry.

[0] https://pivotal.io/platform/pcf-tutorials/getting-started-wi...

Edit: yes, I know we ask you to signup during the PCFDev install. I hate it too. We have to for export compliance, it can't be avoided.

thanks for this ! how do you look at deis.io vs Cloud Foundry. Do you see yourself subscribing to one of the popular camps out there... or will you stick to Cloud Foundry in the long term ?

I'm also with Pivotal... CF is a long term deal. Deis is a young startup that began on Dokku before pivoting to K8S; CF is in its third-generation runtime, and its use of containers predates Docker, going back to 2012.

CF is already a relatively successful business with hundreds of millions of dollars in annual revenue across a pile of companies. Kubernetes and Docker are small in comparison as "businesses" but of course the momentum there is surging in terms of both pure open source adoption and contribution. It's likely going to be a big market with a lot of choice like Deis, or plain k8s, or RedHat OpenShift, or IBM Bluemix, Pivotal, Docker Datacenter, or Mesos/Marathon, etc. It's a bit of a market war brewing and that competition will make for better solutions.

What is great about open source vs. past tech "gold rushes" is that these experiments and feedback loops exist across communities that are otherwise competing and overlapping. Mesos adopted the Docker image format independently of the Docker runtime; Kubernetes introduced pods independently but also reused parts of Docker. Docker container networking and volumes are being used compatibly in the latest incubated CF releases. RedHat submitted a way to get CF style buildpacks working on K8S. Someone found a way to make CF run on Mesos; I could see a similar attempt on Kubernetes some day. It's a confusing and busy time but also an explosion of activity. And even if there is competition for dollars in the end among all these players that will lead to tension, the work is out in the open mainly.

I'm obviously biased.

Most of Cloud Foundry is built the way I like software to be built: pair programming, TDD, small balanced teams, prioritising for user value.

That style of development is actually baked into the Cloud Foundry Foundation rules. Companies who join the Foundation are expected to send engineers to ramp up on developing in this style. And voting rights are based on the number of full-time engineers you have assigned to the effort.

The reason I mention all this is that I trust the way we build Cloud Foundry. We still get production bugs and oversights and mistakes. It's around 4 million lines of code that turns into a distributed system of ~50 different interacting processes. We built a fully-featured, robust PaaS, using containers, in about 3 years, starting from scratch.

Nobody outside Google had built a container platform of this level before. Nobody but Heroku had built a fully-featured PaaS of this level before. We are, to my knowledge, the first system to do both of these things. Certainly the first opensource one.

The reason you never hear about Cloud Foundry is because we've already built all the components other folks are trying to roll up into full PaaSes. "It just works, already" is a boring story.

But again, quite seriously: I am obviously very biased.

This is awesome! So you're telling me that not telling the world that you have a high quality, tested, opensource, fully functional, container based Heroku-clone is deliberate ?

Please excuse me while I take an hour to digest that.

You're choralling to the preacher, mate.

Right now the companies who provide the most engineering -- Pivotal and IBM -- are laser-focused on capturing enterprise dollars.

Which from a business perspective makes perfect sense. Pivotal's commercial distribution of Cloud Foundry (PivotalCF) holds the record for fastest-growing sales of any opensource product. Ever.

But there's not much effort on promoting to devs and startups (IBM are starting to do this more with BlueMix, which is their CF distribution). But it's early days.

So basically you'll usually find that I and a sprinkling of my colleagues show up in threads like these out of the goodness of our hearts and fondness for our work.

Obviously I have a financial interest. Pivotal makes money from PivotalCF, I work for Pivotal, and I'm on an options plan as well. So YMMV.

But I think Cloud Foundry is just ... way ahead, in terms of actually getting work done.

Edit: and since I'm musing aloud about business-y things, I should emphasise again that nothing I say is in an official capacity, consult your lawyer, financial planner and astrologer, etc etc.

I've been looking for the same thing. Tried dokku and it was painfully slow. Tried rolling my own solution via git hooks but other devs had problems with execution. I would love to hear more about people running dokku on Digital Ocean (I was on a beefy azure server)

In a very specific case, Heroku is the best solution for my problem. Sounds like it is for you too.

I would say for most use cases, unless ROI becomes an issue. There is a point in time where it becomes cost effective to run your own S3.

Once you finish taking my Scaling Docker on AWS course, you'll have access to a magic command that deploys your app and you're free to optionally run migrations.


It covers using RDS, ElastiCache and also handles load balancing your app + much more.

And for a limited time, you get a free set of Japanese cooking knives with every purchase.

You should mention it costs $20 for information that can be found for free anywhere on the web. While other users answer the question, you simply refer to your piggy bank, shameless.

> You should mention it costs $20 for information that can be found for free anywhere on the web. While other users answer the question, you simply refer to your piggy bank, shameless.

Yep, it costs $20. Basically the cost of chinese food for 2, to ensure guaranteed victory in learning the essentials of AWS' platform while having a guided tour on how to deploy a fault tolerant web app with Amazon ECS from start to finish.

You can definitely learn everything for free, but the value in a course is that you're getting a cohesive learning path that was carefully planned and tested. You get a system that you can apply to your own projects and plenty of source code to reference.

You're paying the $20 so you can avoid spending 6 months trying to figure out everything on your own while stringing together a bunch of half-assed blog posts and tutorials.

You pay the small fee for certainty and it's well worth it because your time (and sanity) is not infinite.

I love the comparison to chinese food for 2. As if everyone lives in the US with the same economic situation. Your posts are truly despicable.

I'm sorry you feel that way. Unfortunately time is something we all share, and it brings me a lot of joy to have students say that my courses have saved them a ton of time and helped them meet their goals.

I see that you recently left Amazon after making 200k/year there (a comment you made 5 days ago). I can see why you don't like people promoting Amazon products, I fully understand.

Condescension abounds! :-)

i have grappled with Amazon ECS and i'm in no mood to go through that again. Convox (YC S15) is a pretty good alternative - but it will take some time for them to tie the "magic" together.

Deis will do exactly that.


hey - not bad. The documentation looks very similar to heroku.. i cant find if they do stuff to handle workers in Procfiles (which need to be added to the job manager like supervisord/systemd). But interesting..

Glad to know there's the same kind of hipsterism going on in the backend world as much as in the frontend world.

You could basically substitute all these backend buzzwords with "Webpack", "Grunt", "Gulp", "Requirejs", "React", "Angular", "Ember", "Backbone", etc. and it would have same effect on the readers--they think you're an annoying hipster.

https://www.gitignore.io runs on Heroku and it gets 40k+ visitors a month on a free Dyno. I use Heroku because I don't want an IAAS solution, I want a PAAS solution. If I wasn't using Heroku, I would probably find another PAAS, before switching to Docker (even though I do love Docker).

That's 1 request per minute, you could slap that into a raspberry pi on a 3G connection, and do the same for your next 400 apps that has 1 request per minute.

People seem to underestimate just how powerful modern machines really are. And I don't get why people seem to think it's hard to deploy simple web applications. Just write a 4-line shell script that rsync's, runs whatever DB migrations you may have and restarts the thing.

40k visitors per month isn't one per minute. It's anywhere from one per minute to 40k per second. Division of requests by time is the worst possible mistake in calculating load.

Plus visitors != requests.

I have 200k visitors per month generating 8m page views and about 50m hits on the servers (with CDNs taking another few hundred million hits).

These all peak during the UK weekdays and wind down at nights and weekends.

Divisions over time aren't going to work, but neither is translating visitors into requests, especially as it's only the page views that have a beyond trivial computation cost.

Yeah, I just checked Cloudflare and here are the requests

  Total Requests
  Last Month

While your overall point is correct, it's still likely not going to be a lot. It's highly unlikely that it's 40k per second.

Thank you for the correction, but that was not the point I was trying to make. I was referring to his statement:

> [...] before switching to Docker (even though I do love Docker).

This seems like a very drastic solution to problems he does not yet have. I've been responsible for similar thoughts before, but shortly after realized just how damn stupid I was.

Well technically it's:

  622482 / 43800
  = 14.2119178082 requests per minute
Here are my requests over the last month - http://jmp.sh/9EAUVrv

I definitely understand that modern machines are powerful, but I used to STIG RHEL 4 and RHEL 5 boxes for 4 years as my primary day job. I've done everything from creating kickstart files to manually locking down whole Linux instances to creating RPM files. At this point in my career I just don't care about the extra cycles I get by using <insert your infrastructure tech>.

If my product needed the extra performance, trust me I would switch.

Thanks for providing that - I just noticed that your shell instructions don't include completion. I use the following in zsh: https://gist.github.com/lorenzhs/ad6c009f5748d333b73376e07ae... With that, I can do "gi <tab>" to get a list of all possibilities.

I'm literally in the process of switching to zshell over the next few weeks. Since it's more than a 1-liner, could I just add that to the Advanced CLI instructions[1] section and credit you?

[1] - https://github.com/joeblau/gitignore.io/wiki/Advanced-Comman...

I didn't come up with that, but I don't remember where I found it. So don't credit me for it :-)

Turns out I copied it from oh-my-zsh: https://github.com/robbyrussell/oh-my-zsh/blob/master/plugin... - I edited your wiki page accordingly.
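For anyone who doesn't want to dig through the links, the plugin boils down to roughly this (paraphrased; the exact oh-my-zsh code may differ slightly):

```shell
# zsh-only config sketch, roughly what the oh-my-zsh gitignore plugin does.
# 'gi' joins its arguments with commas and queries the gitignore.io API.
function gi() { curl -fL https://www.gitignore.io/api/${(j:,:)@} }

# Completion: fetch the list of known templates and offer them for 'gi <tab>'.
_gitignoreio () {
  compset -P '*,'
  compadd -S '' $(curl -sfL https://www.gitignore.io/api/list | tr "," "\n")
}
compdef _gitignoreio gi
```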

One thing that I liked about Heroku was the ease of deployment for Java applications (or any language for that matter as long as you had the right build pack)

Since they upped the cost of their small tier, I moved to Digital Ocean and installed Dokku, which gives me that Heroku-like deployment experience so managing my (admittedly very small) website isn't that much of a hassle.

We at Boxfuse (https://boxfuse.com) provide the same ease of deployment for running Java apps (as well as Go and Node.js ones) on AWS. All you need to type is literally: boxfuse run myapp-1.0.jar -env=prod

And you automatically get things like auto-scaling, database auto-provisioning, easy debugging and more.

Disclaimer: I'm Boxfuse's founder and CEO

Hey, it looks like your "center-iframe" is pointing to an invalid link. It threw me off while I was looking around. Hope this helps

In an earlier age, all we had to hate were frameworks.


(Factory factory factory factory.)

Off topic, but man, is CoreOS not ready for prime time. I've been using it on the stable channel and ended up having to turn off automated updates because they would break Docker (Docker would just hang with the last 4 updates)... Not a very convincing test of CoreOS :)

Kubernetes/Docker will become increasingly accessible to developers and it will loosen the reliance on lock-in PaaS like Heroku - This is the future; I'm betting everything on it.

With tools like Rancher http://rancher.com, you can already see things moving in that direction. Next step is rancher-as-a-service.

When it comes to developers, I think open systems will always prevail in the end (it's just more flexible).

I thought it was more important WHAT you run inside containers than the containers themselves.

I got really upset about this rancher tool because it doesn't design my database schema.

Shit, future was so close.

You still have to write code, define your database schema and declare config files to specify how containers should be orchestrated but soon open source developers will start creating frameworks/boilerplates which capture some of these requirements and which can automatically run and scale on Kubernetes/Swarm/Mesos and this will greatly speed up application development and deployment.

Right now, we think of frameworks as being components (part of) larger software systems - But in the future, frameworks will provide the foundations for entire software systems - They will be responsible for declaring their own network topologies and resource requirements and they will be capable of scaling automatically to any number of machines.

Developers will extend and customize the framework with their own logic but the framework itself will handle all the difficult stuff related to its own operations.


How is rancher related?

Don't listen to hype. Look at the problems you need to solve. If you need a big, complex, distributed system, then maybe microservices are a good idea. If you're building a simple webapp... not so much. Things that work in the large don't necessarily work in the small.

I write stuff in Scheme. I'm a hobbyist, there's no reason for me not to, and I love the language. The apps I write are sometimes single-threaded (or coroutine-based) monoliths. But I only have one machine available for me, and the things I'm writing are fairly simple. It's good ENOUGH. And Worse really is Better[1].

[1] And I truly mean that in the Gabriel sense. As in the New Jersey model. Not any other way.

For people who do not know what "Worse is Better" means, here's the link: https://www.jwz.org/doc/worse-is-better.html

WTF is at that link? It's NSFW...

This definitely reflects some of my experiences when trying out Docker and some of the other stuff associated with it.

Serious question though: I would absolutely love to have an introduction on how to use Docker to deploy one or two web applications that use a typical amount of backend services, say some sort of database and a redis server. All of this would probably run on a single VM (whether Amazon, DigitalOcean, Linode, ...) and you mainly use Docker to isolate the applications from each other in terms of the environment/dependencies that they need.

How do I do this with Docker in a way that gets me an easy deploy process? (Or maybe the question is actually, should I even do this with Docker?)
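To make the question concrete: is something like a single docker-compose file the right shape for that setup? (All names, versions and ports below are placeholders.)

```yaml
# Hypothetical docker-compose.yml for one VM running an app, Postgres and Redis.
version: "2"
services:
  web:
    build: ./myapp            # app's own Dockerfile, isolated dependencies
    ports:
      - "8000:8000"
    depends_on:
      - db
      - redis
  db:
    image: postgres:9.5
    volumes:
      - pgdata:/var/lib/postgresql/data   # keep data outside the container
  redis:
    image: redis:3
volumes:
  pgdata:
```

Then deploying would just be `docker-compose up -d` on the VM, which may already be "easy enough" without any orchestration on top.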

My personal blog is basically an example of deploying a simple app using docker. https://github.com/pbecotte/devblog ... The makefile wraps up the actual commands.
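The targets boil down to something like this (simplified sketch, not the literal file; image and host names are made up):

```make
# Simplified sketch of a docker-based blog deploy Makefile.
IMAGE = pbecotte/devblog
HOST  = deploy@blog.example.com

build:
	docker build -t $(IMAGE) .

push: build
	docker push $(IMAGE)

deploy: push
	ssh $(HOST) "docker pull $(IMAGE) \
		&& (docker rm -f blog || true) \
		&& docker run -d --name blog -p 80:8000 $(IMAGE)"
```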

> No, look into microservices. It’s the future. It’s how we do everything now. You take your monolithic app and you split it into like 12 services. One for each job you do.

reader implements and gets massive bill for personal blog hosting

"Am I doing this right?"

I guess this is meant to be hyperbolic but honestly it's true.

> So I just need to split my simple CRUD app into 12 microservices, each with their own APIs which call each others’ APIs but handle failure resiliently, put them into Docker containers, launch a fleet of 8 machines which are Docker hosts running CoreOS, “orchestrate” them using a small Kubernetes cluster running etcd, figure out the “open questions” of networking and storage, and then I continuously deliver multiple redundant copies of each microservice to my fleet. Is that it?

> -Yes! Isn’t it glorious?

> I’m going back to Heroku.

I actually learned a couple of things from this article even though it's satire. I struggled through the de-monolithing of an application into microservices and then into Docker containers. Do people actually do this without pressing issues that have led them down this path? Is there any reason to spread into microservices unless the monolith is not keeping up with requests?

I would have never considered Docker containers unless artifact preservation/isolation and deployment issues hadn't forced me to look toward a solution.

What a nice clickbait title. A is better than B, even though A is an apple while B is an orange, and we use them for entirely different purposes. Some functionality provided by Heroku can be replaced with Docker, and some missing features of Heroku are in the Docker infra, that can be added to Heroku using software (like service discovery: https://blog.heroku.com/managing_your_microservices_on_herok...)

It's a humorous lament on the fragmented state of devops, not to be taken too seriously.

As someone who recently drove down the path of learning all the new "hotness" in the DevOps world, I can safely say I came out of it with two learnings:

1. I have a much better understanding of what's happening behind-the-scenes

2. For most small startups, you should seriously consider the time (and therefore, cost) of investing in your own infrastructure.

For point #1, I think understanding your options and how they benefit your company is essential as you transition from a small -> medium -> large size company. The paradigms you learn by virtue of researching the new technologies might end up being applicable in other parts of your development process.

On point #2, I partially regret not deploying to Heroku, seeing where our system became stressed, and optimizing. Attempting to scale for things you don't know about yet is tough, and can lead you down a path of wasted time and money.

Look, if you're a one man shop doing a small project — do LAMP. Do perl. Do cgi. Do whatever you're comfortable with; if you try to switch to the latest silver bullet tech, you'll just get disappointed.

But if you're a CTO with a startup with 10+ server-side developers and plan to hire at least as much in near future, suddenly all these dockers and microservices actually make sense.

So, unless you start conversations with _who_ you are and _what problem_ you are trying to solve, of course the other side will seem stupid.

> if you're a CTO with a startup with 10+ server-side developers and plan to hire at least as much in near future, suddenly all these dockers and microservices actually make sense

As a consultant, I often get asked those kinds of questions: "Should we use X?"

Whether it's programming languages, databases, operating systems, whether it's Chef vs Puppet vs Ansible vs Docker vs Whatever, it's a question that comes up a lot.

I generally answer it with "What are your team good at? What have they used, what do they know well?"

There are always exceptions to the rule, but in general I encourage people to play to the strengths of their team, rather than recommending Technology X because it's shiny and bang on-trend.

Exactly my point. I used two broad categories just to paint the overall picture; the fact that they "make sense" doesn't mean that they're the silver bullet.

I'm not a web developer but I sometimes put some programs on the web for me and my friends. I use FreeBSD Jails for that.

Can someone explain to me the advantages of Docker compared to Jails?

In your use case: pretty much no advantages. Stick with jails and be happy.

Really spot on - recommended! "You are a bunch of Node.js hipsters that just HAVE to install everything you read on Hacker News!"

By the way, why all the downvotes to the parent?

Since we are seeing so many "ads" here in this discussion: does the average HN reader prefer this type of advertising to regular adsense ads? There is no violation of privacy, no autoplaying video, they are actually relevant to the discussion and it's obvious that they are trying to sell you something. Oh, and probably the most important thing is that you can downvote them.

I dislike it because they just seem to have read the title and not the actual post.

He slowly caressed his docker image while tenderly inserting it in the continuous delivery pipeline. The slow throbbing of the Jenkins service increased in intensity as the automated tests started firing off in a crescendo of etcd writes leaving a quivering micro-service that lay panting in its pod. That was a memorable deployment. The first of many that night...

See .... microservices are where you take a process using IPC and force it to go over a network and use TCP ...then you pretend it will run better and scale more but avoid the whole 10x infrastructure growth and the fact the devops team has tripled in size to manage the frankenstein ... dont forget to add in the whole SDN layer the network guys absolutely love

I'm not sure I get the message.

heroku is dead, docker is awesome, gluten free food does not have gluten ;-)

I'm sure I don't get that message.

It's all sarcastic. It's poking fun at how trendy advances might not necessarily be better than what existed before, as well as how much complexity is involved in them. It makes the most sense if you're already familiar with the concepts in the article (Docker, Heroku, CoreOS, etcd, and so on), and happenings in the technology industry related to them.

As usual, if a joke needs to be explained (to a person) then it's not funny (to them). I found this amusing since it aligned with my experiences:

> -Since no-one understands Paxos, this guy Diego…

> Oh, you know him?

> -No, he works at CoreOS. Anyway, Diego built Raft for his PhD thesis cause Paxos was too hard. Wicked smart dude. And then he wrote etcd as an implementation, and Aphyr said it wasn’t shit.

> What’s Aphyr?

> -Aphyr is that guy who wrote, ‘Call Me Maybe.’ You know, the distributed systems and BDSM guy?

> What? Did you say BDSM?

There are several jokes here based on cultural references related to the aforementioned topics.

I learned that <x> will laugh at a joke three times:

* First when you tell it,

* then when you explain it,

* then finally when they get it.

(x used to be English people when I was a kid.)

A gentle reminder that it's against the Hacker News guidelines to leave unclosed xml tags in your comments.

Surely that's BNF rather than XML?

At least your level of certainty has increased.

I call the process it's satirizing 'fashion driven development'. It doesn't just happen in the devops world.

Hype is as bad as dismissing a promising technology as just hype without trying to understand the problems it might solve.

This is quite frustrating for both people who are aware of those issues and trying to fix them as well as the people missing out on the real advantages of such technologies.

This reminds me of similar sentiment around virtualization and cloud computing later in my peer group:

Some sold VMs as a security feature and people focused their criticism on that, without understanding other advantages like quick/self-service provisioning of systems. Later on, cloud computing was trivialized as "it's now just somebody else's computer", which completely ignored advantages like no ramp-up costs and the ability to programmatically manage your systems' life cycle.

PS: Considering every new thing a fad probably also makes you consider 'hadoop' the latest shit in big data processing and assume today's tech-company hipsters are fighting over WordPress plugins. (Like, really?)

Well, I think that people, and developers in particular, need to start realising that companies like Docker have gotten really good at marketing themselves to them. Most of the stuff you hear about a technology is just marketing material that makes you sound smart when telling your peers about a new technology.

> So I just need to split my simple CRUD app into 12 microservices, each with their own APIs which call each others’ APIs but handle failure resiliently, put them into Docker containers, launch a fleet of 8 machines which are Docker hosts running CoreOS, “orchestrate” them using a small Kubernetes cluster running etcd, figure out the “open questions” of networking and storage, and then I continuously deliver multiple redundant copies of each microservice to my fleet. Is that it?

exactly. I mean look, if you have a lifestyle business that's only going to support 5-10 people, it's totally a waste of time. if you have some hope of scaling this is the way to go. I get it, just use Heroku. It's easy and convenient. If you're planning on a billion dollar exit, this way is way better.

> I need to decide if i believe my own hype?

yeah. sorry.

Self admittedly, I am one of the P2P psychotic pundits. This is a good reminder that we need to tone our language down.

That said, if you can get your system to work with a single Heroku box, you really truly can simplify your life. That is what we're trying to do with http://gun.js.org/ , be able to start with a single machine and no configuration/setup/complexity. Then grow out.

We just had a discussion on the WebPlatform Podcast about all of this P2P stuff (https://www.youtube.com/watch?v=NYiArgkAklE) although, like I said, I probably got too jargony.

But props to circleci for calling out the elephant in the room. Great marketing actually.

You know what's better than microservices? Nodes. Nodes running open source software that can power a distributed network.

Microservices often hit the same database. You want to be able to split up the database. Not just into shards, but into distributed nodes.

And by doing this, you split up the whole stack.

I believe 'microservice' is the wrong term for SOA. The word 'micro' makes it look simple and applicable to tiny apps of 10-50K SLOC. I believe you should start chopping up your monolithic app only when it reaches > 100K SLOC. Even then, you can split it into well-defined modules with clear interfaces, without necessarily using SOA, if it is running on the same box.

Having monolithic app does not make it bad. What makes it bad is not having proper modules with proper interfaces.

SOA comes in handy when you want to distribute your workload: now we have proper modules, but those modules need more computing power, so you split them up onto separate boxes and accept the pain of managing that, because you have no other option.
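A minimal sketch of that idea: the boundary is a clear interface first, and only becomes a networked service when the workload demands it (all names here are illustrative):

```python
# Illustrative sketch: a module boundary that is a function call today and
# could become a network call later without the callers changing.
class BillingService:
    """Interface the rest of the monolith programs against."""
    def invoice_total(self, line_items):
        raise NotImplementedError

class InProcessBilling(BillingService):
    """Today: a plain module living inside the monolith."""
    def invoice_total(self, line_items):
        return sum(qty * price for qty, price in line_items)

# Later, only the implementation behind the interface changes, e.g.:
# class RemoteBilling(BillingService):
#     def invoice_total(self, line_items):
#         return http_post("http://billing-svc/invoice", line_items)  # hypothetical

billing = InProcessBilling()
print(billing.invoice_total([(2, 10), (1, 5)]))  # 25
```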

As the author of that "Docker is the Heroku Killer" post that was popular a couple of years ago I have to say that I agree with this.

When I wrote that article it was largely focused on the potential for Docker to create a bunch of Heroku competitors as well as a simplified development experience across multiple languages.

The businesses aren't there yet although a ton are trying. The local dev experience has not materialized yet either outside of native Linux, due to performance issues with volumes that only a 3rd party rsync plugin has come close to fixing.

I still use and advocate for Heroku pretty heavily for just about any non-enterprise environment.

I'm working on one of those Docker-powered Heroku "competitors". This post and your post both rang true for me.

It's a constant balancing act. Too flexible, it becomes overwhelming. Too constrained and you sacrifice a bunch of the perks of using Docker.

The conclusion I've come to is that the only way to do it is to be unashamedly opinionated about keeping things simple for the average user. Otherwise you end up having that exact conversation.

Yea. The most realistic way to get things working IMO is Dockerfiles with 80% use case defaults for parts of the stack. From there if people want/need to dive in and tweak them they'd have the ability to do so but ideally the average user doesn't need to know they are touching Docker much beyond knowing that it's there if they need it.

> The local dev experience has not materialized yet either

Have a look at PCFDev.

Disclosure: I sit next to the PCFDev team and use it in my dayjob.

When it comes to microservices and containers, we are missing fairly detailed descriptions of real-world architectures from successful projects. And actually that applies to many other technologies as well.

When it comes to microservices, it would be interesting to know simple things like what kinds of services were created, how large they are, how communication is handled, how large the team(s) behind each service are, etc.

For some companies these are of course trade secrets, but sometimes opening things up might be good marketing. An example is Backblaze with their very detailed descriptions of their storage pods.

Would you find this presentation useful?


Is there a computer-voice teddy-bear version of this?

I find them much better than walls of text.

Not exact but somewhat similar feelings expressed here: Let Me Just Code: http://www.drdobbs.com/tools/just-let-me-code/240168735

Getting Back To Coding http://www.drdobbs.com/architecture-and-design/getting-back-...

Clickbait titles like that don't make me want to do business with circleci , it feels like a business full of immature kids yelling at each others. Not even going to click on the link.

I think words are very powerful, in particular "microservices" vs "monolith". By accepting those words, we imply the conclusion: microservices sound sexy and lean and elegant - who can argue with separation of concerns? And a monolith is a big unchangeable rock.

I think we need a better word for apps that are single tight self-contained systems than "monolith". You can design elegant interfaces, and avoid creating a sloppy mess, with function calls or objects too.

What containers do for you is two and only two coarse-grained things. 1. Make you more productive rolling out validated infra 2. Better utilize your hardware. Both of these things imply that you were using VM/AMIs before and you were hand-crafting your entire stack using things like Chef and Ansible. If you weren't doing that before (like, if you were using Google App Engine or Elastic Beanstalk or Lambda), Docker will make you less productive than you are today.

It's amazing that an article from a year ago can still create a buzz on HN like this. Goes to show there are still a lot of people talking about this.

Where do Beowulf Clusters fit into all of this?

I'm actually looking into using Docker as an easier way to deploy my software to a Beowulf cluster..

"Hi, my name is dokku, and I have no idea what you're talking about" :)

This rant sounds just like any rant from an old dev mocking new tech. "This is less efficient, this is too complicated, this can't be taken seriously, this won't last."

Creating a character obsessed with "this is dead" hardly disguises the obsession with "this won't work". Do whatever you please, we don't care. But don't mock others for doing what pleases them.

Passing through that, let's address the critics.

Microservices and docker are not necessarily tied. I write only monolithic apps, and use them with docker through dokku.

Etcd is a microservice problem, not a docker one.

You don't need coreos or kubernetes to use docker in production. You need them if you want massively scaled applications, just like you would have many servers running the same app with replication without docker. Most of us don't need that (and those who need it probably won't find it more complicated than what is needed to do that without docker).

If you don't want to manage servers, well, don't manage them. That's what cloud services are made for. But please tolerate that some people love devops and prefer not to spend much money directly on infrastructure.

Parody is just a light-hearted way of criticizing. You shouldn't take it personally (unless specifically targeted).

In any case, the author of the post actually agrees with you: https://circleci.com/blog/it-really-is-the-future/

Yeah, I tend more and more to be annoyed by some devs' negativity every time someone is trying something new. I may take that personally, yes, not about docker and microservices, but because I'm always told to stop creating and use what already exists each time I try to develop a new concept in side projects. I have to let that go, I guess.

Thanks for the article, a lot of interesting things in it.

It's funny how it's focused on the scaling problem. It may depend on which circles we're in, but it seems to me that what people found most interesting about Heroku was the ease of deployment more than the scalability. It probably depends on the size of your usual projects.

Yet, what is interesting in docker is not just scalability. I find it way easier to code system dependencies with docker than with chef; that's already a big win. Also, I've stayed away from heroku for my own projects, mainly because of the cost. Docker, with dokku, allows me to have the same comfort one has on heroku, but with an 80€/mo server (handling about 15 small apps, and still having one third of its memory available). And having several applications using several versions of ruby or postgres on the same server is not a problem anymore.

In that regard, docker is not only interesting for people who have massive infrastructure to manage, but also to people who are used to self hosting and want an easier way to deliver.

>This rant sounds just like any rant from old dev mocking a new tech.

Probably because we have seen it all before, and there isn't much "new" most of the time.

I've been a professional developer for ten years myself; I've seen my share of hype. But I've seen way more people not even trying new things (and trying does not mean sticking with it on your main project), probably because they could not stand the idea that the experience they are so proud of could be invalidated in any way.

As an old dev that mostly (not completely) likes new tech, the novelty is often subtle, and almost always has a point to it.

What is mock worthy are the attempts to make everyone's pet technology a floor wax AND dessert topping. But that's mostly VC funding at work

You mean B.S. written to impress investors and dominate a market segment for 2 or 3 years until the "founders" have a good exit.

I must've missed a tech cycle (or two) - I had heard of Heroku but didn't know what it did. I've used lxc and Docker and read bits about CoreOS/rkt/appc/kubernetes/etcd.

I know it's tongue-in-cheek but few if any of these new fangled things are critically dependent on one another.

This comment is oversimplified and reads like it was written by someone who doesn't know what they're talking about. All of these technologies have their place, but they should be adopted incrementally and where it makes sense. Posting a frenetic conversation benefits no one.

There's a theory in economics about the optimal size of a firm. How big should a company be? The optimal size could be infinite, where all of society acts as one firm. Think Communist-style command economy. Or it could be one, where everyone acts as networks of individual contractors or single-owner businesses. Think anarcho-capitalism. But in reality it's neither extreme and falls somewhere in the middle; why?

It turns out that the optimal size depends on the balance between the overhead costs associated with allocating resources within one firm and the transaction costs associated with two firms doing business with each other. The overhead costs are higher with large firms because there's more internal resources, including people, to allocate. On the other hand, transaction costs are higher with small firms because each firm does less themselves so they need to transact more with others to accomplish their goals.

As the relative costs vary over time, the optimal size varies too, and firms in an industry will grow and shrink. If it increases, then you'll see mergers and acquisitions produce larger firms. If it decreases then you'll see firms start splitting or small startups disrupting their lumbering competition.

I suspect a similar thing happens in software, where there's an optimal service size. It could be infinite, where it makes sense to build large monoliths to reduce the cost of two systems communicating. Or it could be one, where it's optimal to break the system at as fine a granularity as possible (function level?).

The optimal size depends on the balance of costs. All else being equal, by drawing a service boundary between two bits of functionality you shrink the services on either side but you increase the number of services and add communication costs for them to exchange data and commands.

How these costs balance out depends on the technology, and there are competing forces at work. As languages, libraries and frameworks improve, we can manage larger systems at lower costs. That tends to increase the optimal service size. As platforms, protocols and infrastructure tools improve, the costs to run large numbers of services decreases. That tends to decrease the optimal service size.

The microservices movement, and to an extent the serverless movement, assume that in the medium- and long-term the technological improvements are going to tip the scales sharply in favour of small services. I agree that's likely the case. But we're not there yet, except in some specialized cases such as large distributed organizations (Conway's law). But it's going to be at least a few years before it's worthwhile to build most software systems in a microservice architecture.
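As a toy numerical illustration of that balance (the cost functions are invented; only their shapes matter):

```python
# Toy model: total cost = internal overhead (grows superlinearly with service
# size) + communication cost (grows with the number of pairwise links).
def total_cost(num_services, total_modules=120):
    size = total_modules / num_services
    internal_overhead = num_services * size ** 1.5
    communication = num_services * (num_services - 1) * 0.5
    return internal_overhead + communication

costs = {n: total_cost(n) for n in (1, 4, 12, 40, 120)}
print(min(costs, key=costs.get))  # 12: neither one monolith nor one-per-module
```

Shift the relative weight of either cost term and the optimum moves, which is exactly the claim: tooling improvements change where the sweet spot sits.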

Well, repeating buzzwords and pushing for early technology adoption for the sake of early technology adoption seems to be dumb, as the article implies.

But new technology is necessary and early adopters are necessary. Iteration is necessary. Don't punish it.

This is structured like one of those Xtranormal do-it-yourself cartoons from a few years back.

I'm sure I could go back 5 years and see an almost identical complaint about Heroku.

I love everything about this.

The post got one critical detail wrong. The microservices won't have their own API. APIs are dead. They'll use Kafka for messaging. That's the future.

This reminds me of React state. Sagas, selectors, constants, components, containers, routes, reducers, actions, connectors, shoot me.

Check back next year for Docker is dead article.

Yet another comparison of hot and soft.


Is there really something called "Ew"? It's hard to Google something that's just two chars.

The search term is "Jimmy Fallon ew".


Is there an advantage to using docker when it takes 3 hours to rebuild our relatively small database?

Did you read it?

Sorry, I was lazy. I did not read it.

this old gem is back trending now!

Really hilarious, and really true. Lots of noise about Docker, Kubernetes, microservices, etc.

Doesn't anyone think about performance anymore? I mean, just add more VMs or increase their specs; no one talks about efficient resource utilization, AFAIK. And there might be times when containerizing everything is overkill.

Funnily enough, people do. In fact, containerization came from Google caring a lot about performance! A good intro to how containers help, and to the probable future of computation: http://m.youtube.com/watch?v=7MwxA4Fj2l4

Wow, what a strawman. Seems like someone was blowing off some steam.

Depends on how you choose to read it. I love microservices and all the new tools we have to scale gracefully, but I read this as satire. It's more hyperbole on how some people always seem to be over-engineering and pretending that's the conservative way to write software without considering the time-sink required to be able to dynamically scale with all of those tools.

Calling this satire insults the skill and wit of competent satirists the world over.

It's not clever or funny. It's lazy. It's exaggerated to the point of questionable emotional stability on the part of the writer.

Actually it is funny, but only in an ironic way.

"-Well, Amazon has ECS, but you gotta write XML or some shit."

"I thought Mongo was web scale?"

This joke will never get old to me

Is it a blank page ? I don't see any text.

Your browser is dead. Use servo instead. Works on servo.

Seriously, the text colour is so light that it is very hard to read.

No, it's not.

Even renders on console browsers.

Did you read this on an Xbox?

Lovin' it!

Solution: .NET

Did you just tell me to go containerize myself?


That's why we started https://baasil.io/.

The idea is you can deploy any app to any infrastructure of your choice (inside Docker containers). This means that you are not locked into Heroku and it gives you much more flexibility.

It's basically a hosted Rancher http://rancher.com/ service with a focus on a specific stack.

I think in the future, there will be a lot of services like Baasil.io (specializing in various stacks/frameworks) and managed by various open source communities.

Docker and Kubernetes WILL become more accessible to developers - I would bet my life on it.

I'm currently building a CLI tool to allow deploying in a single command - So you can get the simplicity of Heroku while not losing any flexibility/control over your architecture.

But all you've done is shift the problem to, er, you. If someone is uncomfortable taking out a dependency on say Azure or AWS - the two leading Docker hosting platforms, then they sure as hell aren't going to take out a dependency on "baasil.io" are they?

Um no, Rancher is open source and can run on and manage ANY infrastructure (including Amazon EC2) - You can run it in your own datacenter or even on your own local machine.

Also Baasil.io is essentially just a control panel/dashboard (Rancher-as-a-service), you can quit Baasil.io at any time and switch to your own hosted Rancher instance and you don't have to change any of your application code or change infrastructure providers.

The main benefit of Baasil.io is that it was built by an open source community using open source software, so we can offer the best possible support for apps built on top of our own open source stacks.

I took one look at "baasil.io" and saw the typical landing page with a "Plans" link at the top and questioned why anyone would want to take a dependency on this. If there is some OSS project behind the plan-based charade then that's fine.

"Also Baasil.io is essentially just a control panel/dashboard (Rancher-as-a-service), you can quit Baasil.io at any time and switch to your own hosted Rancher instance and you don't have to change any of your application code or change infrastructure providers."

But so can you with AWS and Azure, especially with their Docker offerings. So I'm not sure what problem baasil.io actually solves? If anything, it just adds to the list of dependencies and points of failure.

The main benefit of Baasil.io is that it offers developers a boilerplate/framework which they can extend with their own code (to build scalable realtime apps and services). Any app/service built on top of this boilerplate can be automatically scaled to 1000 hosts/nodes using a single command.

Users don't have to use Baasil.io, but if they do, they will get the best possible support. For example, a customer can give us access to their Rancher control panel, which would allow us to SSH into their machines and help resolve any problems hands-on.

It's probably more accurate to describe it as "DevOps as a service, with a focus on realtime apps/services". The value proposition is probably closest to Cloud 66 http://www.cloud66.com/ except more focused on realtime apps.

Another similar service is Zeit.co https://zeit.co/ except Zeit.co only runs Node.js, whereas Baasil.io can be extended with components written in any language.

xkcd text: we saw 15 competing standards, so we decided to invent our own standard that encompasses all the previous ones!

Then there are 16 competing standards.
