“It's The Future” (circleci.com)
1323 points by knowbody on Aug 17, 2016 | 521 comments



The article perfectly summarizes my frustration and sentiment. These days I hear these buzzwords all the time. I work as a consultant for an enterprise product, and most people I meet somehow catch these buzzwords and blurt them out in front of everyone during meetings and discussions, either to show off that they know technology and the things on the market these days (also the latest iPhone, Apple news, Tesla, space exploration and what not), or, I feel, to hide their insecurities.

Anyway, long story short, most of these people do not really understand why they need all this rocket science to manage < 500 internal users. The new buzzwords I am hearing these days are mostly related to big data and machine learning. One of my managers came to me and asked why we don't integrate our product with Hadoop, since it would solve the performance problems because it can handle a lot of data.

I am frustrated by the industry as a whole. I feel the industry is simply following marketing trends. Imagine the number of man-hours put into investigating technologies, and the projects dropped midway upon realizing the technology stack is still immature or not suitable at all.


This is how we managed this problem back in the times when Visual Basic was king and we were using Visual FoxPro instead.

People wanted their apps to be made with Visual Studio (BTW, FoxPro was part of the package).

So they would ask: "What is the app made in?"

"In Visual, Sir."

Done. End of story (most of the time, obviously; sometimes people are more dangerous and press the point ;) ).

----

The point is not to focus on the exact word but on what people believe the word will give them.

So, for example, "Big Data". The meaning for us matters zero. The meaning for some customer is that he has a largish Excel file that, with his current methods and tools, takes too long to give results.

So. Do you use "Big Data Tools"?

"Yes Sir."

And what about using Hadoop?

"We use the parts of big data tech necessary to solve this. Whether we need Hadoop or other similar tools that fit better with your industry, and that use the same principles, will depend on our evaluation. Don't worry, we know this."

Or something like that ;). Knowing the worry behind people's words has helped me a lot, even with people with WORSE tech skills (damn, I have built apps for almost illiterate people with big pockets whose only reference for tech was their cellphone!)

And the anecdote about the largish Excel file that was too big and took too long? Yep, true. And it was for one of the largest companies in my country ;)


You're exactly right. In the end, the customer is worried about solving their problem and they're asking if you're aware of <latest thing they've heard>. It's like when you go to the doctor and say "I've read of an experimental new treatment for X, can't we do that?". The doctor has probably already heard about it.


Yup. An experimental treatment is probably not even available to be prescribed, besides it's ethically questionable to use a treatment when the risks (or benefits) haven't been clearly delineated. Even if it could be used, a responsible prescriber would want to try all "standard" remedies before attempting an experimental method.

That's called practicing conservatively, minimizing chances of bad outcomes. It's a matter of astute clinical judgement to glean optimum risk/benefit ratio in a particular case. Since no two cases are ever exactly the same, good judgement is a constant necessity.

I see that the process of developing software has many parallels, and it's not surprising that everyone experiences so much brokenness. When people complain to me about some mysterious program misbehavior (stuff I had nothing to do with), I empathize with them and try to help them think logically about the problem they're having.

Only rarely can I offer any real insight, but given the insane proliferation of the alphabet soup of identifiers attached to all the "new things" out there, no one I know in the industry feels they have a handle on what's happening.

Seems like the pace of "innovation" will lead to even greater levels of incomplete and dysfunctional systems, and can only end, sooner or later, in truly catastrophic failures.


I think the buzzword abuse exists because of people who don't want to take the time to learn a real skill, and just want shortcuts to sound smart and relevant.

I am very skeptical of people who are "BizDev" or "Project Managers" or "Managers" or "Scrum Masters"; they generally don't know what they're talking about and rely on buzzwords.


Not necessarily. I suspect that people whose job positions overlap but who still need to use a "shared vocabulary" will diverge in their understanding of each word, not out of laziness but simply because it's a different job.

For example, if a DBA and a JS developer say "We need to use a scalable database", they probably don't have the same thing in mind about what "scalable" or "database" exactly means; however, both are concerned about providing data performantly.

So, if a naive web developer wants "a scalable document store!", you can just give them Postgres and presto! ::troll:: ;)


I'm wondering why you have managers in scare quotes. Are you suggesting that they aren't really managers, or that manager as a position is just some sort of fraud?

Also, Project Manager and Scrum Master are just positions that describe roles and responsibilities in an organization / on a team. The people filling those roles needn't be clueless.


I feel people in those positions are just people who want a job in tech / startups and have gotten there not because they deserve those positions, but because they've given up on learning the skills and resorted to becoming a "manager". At least that's what I've seen in my experience. They generally don't add any value because they don't have any skills and don't know how to do anything.

The positions I mentioned above are usually the ones that people who failed to pick up any valuable skill seem to resort to.


Elegant summary and recommendation. If NumPeopleWhoUnderstandBuzzwords < NumPeopleWhoUseBuzzwords, then don't assume your answer has to accurately capture the nuances to satisfy them.

If they don't accept your answer and ask a follow-up, then they're probably a person worth actually having a conversation about the pros and cons with.


I totally agree with this and have done it myself. I've also been on the receiving end of this when I actually do know the market / tools better than the consultant and I actually do know exactly what I want, and the consultant has tried to brush me off. That does not end well for them. Just a word of caution to take extra care to know the background of the person you're talking to before starting down this path :)


> I am frustrated by the industry as a whole. I feel the industry is simply following marketing trends.

My work lands me in a number of different conferences in non-software industries. This is true for all industries; it's just that ours has a faster revolving door. That, in addition to a low barrier to entry (anyone can claim they're a web developer), leads to a higher degree of this madness. It's just part of human behavior to seek out, and parrot, social signals that let others know you, too, are an insider.

Personally, I have to avoid a great number of those gatherings, since the lot of them are just a circlejerk of low-density information. If I pay too much attention to those events, I catch myself looking down my nose, and since that isn't productive/healthy behavior, I avoid landing myself in a place where guys with Buddy Holly glasses and <obscure-craft-beer> argue over which Wordpress plugin is best.


Another way to signal others that you, too, are an insider is by calling a current trend a hype.


To clarify, I was not calling Docker hype, nor specifically remarking on any particular item. I use Docker religiously. I even use dokku for 30+ toy projects.

My remark was to highlight that buzzwords are often used for "me-too" ankle-deep conversations/articles. Whether someone calls it Devops, or Systems Engineering, makes no difference to me. However, I favor pragmatic conversations about the topic, rather than buzzword bingo.

Examples include: "MongoDB sucks.", "Everyone should use Docker", and "What? You mean you're not using Kubernetes for your CRUD app?"

Basically, blanket statements that accomplish nothing more than to send social signals.


From 20+ years of experience and having seen tons of trends just die, most are just that: hype.


But calling everything that's new "hype" and pointing to the past as the only things that are "real" or "solid" is also a form of groupthink.


Yeah, but I think it's the inverse of the kind of groupthink the industry suffers from.

So it's healthy to embrace it as counter-balance to the constant hype.

(Besides, whether something is "real" or "solid" I think can mostly be answered in hindsight -- when it's mature enough and tested enough. In which case calling only things in the past solid is prudent).


I agree, and am guilty as charged.


"I am frustrated by the industry as a whole"

Unfortunately I have to agree, as a developer. My job is to make a fast, reliable, stable product, but at the same time the tools I use are questioned by people who have no knowledge but have heard of the latest trend.

But sometimes it's also very easy to please people. Big data: just insert 10M records in a database and suddenly everyone is happy because they now have big data :|


These discussions are perfect examples of why building good social skills can be more important than learning the next greatest programming language 5.0.


I love you for saying this because it needs to be said.

Thank you.


> But sometimes it's also very easy to please people. Big data: just insert 10M records in a database and suddenly everyone is happy because they now have big data :|

Since when are 10M records considered big data?

My go-to gauge for big data is that it can't fit in memory on a single machine. And since that means multiple TB[1] these days, most people don't really have big data.

[1]: Heck, you can even rent ~2TB for $14/hour! https://aws.amazon.com/ec2/instance-types/x1/


I get your point, but 10M records is big data depending on what you're doing with it. Not big on disk, but extremely unwieldy depending on how it's structured and how you need to query/manipulate it. I led internal product engineering at a large multinational for a long time, and we accrued so much technical debt as a result of having to handle the stupidest of edge cases, where queries against just a few million (or even thousands) of records took multiple seconds -- in the worst cases, we had to schedule job execution because they took minutes -- because of ludicrous joins spanning hundreds of tables, and the imposition of convoluted business logic.

Almost all of that comes down to poor overall architecture, and most companies don't hire particularly good developers or DBAs (and most web developers aren't actually very good at manipulating data, relational or not), but it's the state of the union. That's "enterprise IT". That's why consultancies make billions fighting fires and fixing things that shouldn't be problems in the first place.


I think that is why he had the :| face at the end.


Oh haha. I thought that was a typo!


> big data is that it can't fit in memory on a single machine

A Lucene index can be much larger than your current RAM. It can be 100x that. The data will still be queryable. Lucene reads into memory only the data it needs in order to produce a sane result. Lucene is pretty close to being the industry standard for information retrieval.

My definition is instead "when your data is not queryable using standard measures".
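
To illustrate the Lucene point, a minimal sketch of querying an on-disk index with Lucene's standard API (the index path and field name here are made up; a real index would have been built beforehand with an IndexWriter):

    import java.nio.file.Paths;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.FSDirectory;

    public class BigIndexQuery {
        public static void main(String[] args) throws Exception {
            // The on-disk index can be far larger than RAM; Lucene only pages
            // in the segments it needs to answer the query.
            try (DirectoryReader reader = DirectoryReader.open(
                    FSDirectory.open(Paths.get("/data/lucene-index")))) {
                IndexSearcher searcher = new IndexSearcher(reader);
                TopDocs top = searcher.search(
                        new TermQuery(new Term("body", "hadoop")), 10);
                System.out.println("hits: " + top.totalHits);
            }
        }
    }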


I literally heard that "Big Data is something that is too large to fit into an Excel spreadsheet". The speaker was serious.

I unsubscribed from that (non-tech) podcast.


I would have said it's when a single file is bigger than the maximum size of a disk, so say 4TB. It's why we used map-reduce back in the 80's at British Telecom for billing systems: the combined logs would have been too big to fit on a single disk.


I'd say if it can't fit in RAM but can still fit on a single SSD, it doesn't count as big data either.


Not sure how accurate that is, since you can buy 60TB SSDs these days.


ergo, not big data.


Yeah that's everybody's gauge if they actually work with it, which was the point.


On a positive note, it sounds like proposals to use newer technology are welcome. I keep seeing the opposite: "No, this is too different, it could break stuff."


IME that comes with: Sure, the new tool looks cool, but is it battle-tested? How many tools end up being relied on heavily while they're still in beta? And does it solve any of our current problems, or is it just neat?

As a grumpy SA, I see way too many people push for new tools because they "seem cool", instead of asking "Do they solve a problem we have?"


Personally I prefer to wait until technology is battle tested before adopting. New technologies are for side projects imo. If I had to categorize myself I would say early-late majority on this graph (https://en.wikipedia.org/wiki/Diffusion_of_innovations#/medi...)

Things we consider industry standard, though: why should you need to fight for those? One example I can think of is dependency injection (see the sketch below). Ideally you can test your software better and release more reliable builds. Believe it or not, I still come across companies that are not aware of these concepts. Introducing it would be possible without breaking anything, because you can continue instantiating services the old-fashioned way.
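
A minimal sketch of the idea (all names invented for illustration): depend on an interface and inject it through the constructor, so tests can hand in a stub while legacy code keeps instantiating things the old-fashioned way.

    // The dependency is declared as an interface...
    interface PaymentGateway {
        boolean charge(String customerId, long cents);
    }

    // ...and injected through the constructor instead of constructed inside.
    class InvoiceService {
        private final PaymentGateway gateway;

        InvoiceService(PaymentGateway gateway) {
            this.gateway = gateway;
        }

        boolean settle(String customerId, long cents) {
            return gateway.charge(customerId, cents);
        }
    }

    // Production wires it up explicitly (or via a DI container):
    //     new InvoiceService(new RealGateway());
    // A unit test injects a stub without touching any real payment system:
    //     new InvoiceService((customer, cents) -> true);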

With newish stuff that's still changing, if it won't impact production (i.e., tooling) I'm up for adopting it earlier than usual.

One example I can think of is JavaScript bundling and packaging. This would not impact production, but it has a pretty big impact on feature integration between team members and rate of completion. In MVC you need to hand-type the path of every JS file and stick them into bundles. Not bad, not great either. Instead you could take your flavor of package management and have it bundle and minify your JS files for you automatically.

I've been around government contracting and when you see problems that come up a lot, that we have industry standard solutions too, it's hard not to feel frustrated. I get where you're coming from though, just sharing my experience :)


It took me years to realize the reason programmers do this is that the tools that "seem cool" make their lives easier at the expense of everything else. This is where the popular traits of "laziness and hubris" become a liability instead of an asset.

More programmers need to embrace the suck.


> tools that "seem cool" make their lives easier at the expense of everything else.

I'd argue the opposite. Instead of spending time reflecting on how cool and useful their code is, or hardening it up, devs spend too much time reinventing the wheel. All this work to learn the next new fad is killing productivity.


Easier might not be the right word; 'tools that allow them to be lazier' might be more accurate. They glue together pieces somebody else wrote, try to get them all to work with as little effort as possible, and are surprised when it doesn't work well.

> devs spend too much time reinventing the wheel

I'd argue the opposite. They spend too much time not reinventing the wheel. They strap factory made bicycle wheels onto a car and are surprised when the wheels break. They could benefit from spending more time trying to make a better wheel.


Or learn about better wheels designed by smart people back in the 60s and 70s, when no one had the capability to just keep sticking wheels onto cars to see what works - so they had to rely on thinking and solid engineering practices instead.


Precisely why I've started buying technical books from ages past. I'm working my way through Algorithms + Data Structures = Programs by Niklaus Wirth, Constructing user interfaces with statecharts by Ian Horrocks and Practical UML Statecharts in C/C++, Event-Driven Programming for Embedded Systems. The last one has been especially enlightening.

Do you have any suggestions for which 'better wheels' people should be looking at?


SICP is a classic I can highly recommend. It made me aware of just how much the "new, smart" approaches to organizing code that people like to attribute to their favourite programming model (like "OOP is best because classes and inheritance mean modularity") are actually rehashes of obvious and general ideas known very well in the past.

I generally like reading on anything Lisp-related, as this family of languages is still pretty much direct heritage of the golden ages.

The stuff done by Alan Kay, et al. over at PARC is also quite insightful.


If it ain't broken, why fix it?


In my case, usually something is broken or breaking in production frequently enough to warrant some changes. Plus, there are other reasons to make a change even though it's not broken.

Sometimes it can make you more productive. Or, even though your site is still responding to current customer demands in a timely fashion, you know that the mobile experience could be significantly improved now that browsing via cell phone is on the rise.

Another thing to consider is employability both from a company and individual perspective. If you can keep up with moderately current (not the latest and greatest) trends, you'll attract people who want to grow in their careers. I wouldn't want to work on C# 2.0 using Visual Source Safe. It's hard to convince a company that you can learn git on the job.

In general I like to move without introducing breaking changes. I'm not a cowboy coder, it's really exhausting working with one. I do think there's merit in realizing when it's time to change though.


As long as the database isn't relational, I guess.


10 million rows in a relational database doesn't need to be bad, nor is it big data.

Rows are a bad measure of "big" when it comes to data. A measurement of bytes, and more specifically bytes per field and how many fields the records have, gives a better indication of how the data will be written and potentially searched.

10 million rows of 5 integer values is a pittance for any relational database worth using in production. 10 million rows of 250 text columns would be horrendous for a relational database.
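
Rough back-of-the-envelope numbers, assuming 4-byte integers and a made-up average of ~100 bytes per text column:

    10M rows x   5 INT columns  x   4 bytes ≈ 200 MB
    10M rows x 250 TEXT columns x 100 bytes ≈ 250 GB

Same row count, three orders of magnitude apart.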


Someone once suggested to me that 'big data' begins when it doesn't fit in RAM in a single rack any more.


Yup, that's essentially what looking at byte size means. However, just because it doesn't fit in memory might not make it big data if it's just poorly engineered.

But many times this happens because of wasted or bloated indexes that aren't useful. Or it happens when data types are picked incorrectly.

For example, I once worked on a database where the original developer used DECIMAL(23, 0) as a primary key. This was on MySQL, and that ended up taking 11 bytes per row, versus a BIGINT which would have been just 8. In one table, maybe not so bad, but when you start putting those primary keys into foreign key relationships... we ended up with a 1-billion-row table in MySQL that had 4 of these columns in it. That might make it "big data" by that definition, but it's also just bad design.
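
The arithmetic, going by MySQL's documented DECIMAL storage rules (9 decimal digits pack into 4 bytes; a 5-digit remainder takes 3 bytes):

    DECIMAL(23,0) = 9 + 9 + 5 digits -> 4 + 4 + 3 = 11 bytes
    BIGINT                                        =  8 bytes
    4 columns x 3 extra bytes x 1B rows ≈ 12 GB of pure overhead,
    before counting the same columns repeated in every index.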

Another example in that same database was using text fields in mysql for storing JSON. Since text fields in mysql are stored as separate files, this meant that every table that had one (and we had several tables that housed multiple) ran into large IO and disk access issues.

"Big" data is probably a bad term to use these days, because of how easy it is to accidentally create a large volume of data and yet not need a big data solution: it's not the business that needs it, it's the poorly implemented system that does.

But the real reason we talk about fitting in memory comes down to the core of the issue: IO. Even a super large memory set could end up being slow if it's Postgres and a single-threaded reader scanning a 500 GB index. AWS offers up to 60 GB/s memory bandwidth, and we'd need all of it for this index, since even at that rate it would take almost 10 seconds (500 GB / 60 GB/s ≈ 8.3 s) just to warm the index up in the first place.


>Since text fields in mysql are stored as separate files

Bwuh? Over in MS SQL you just go for an NVARCHAR and forget about it. What is the right way to store this data (if you really do need to store the JSON rather than just serializing it again when you get it out of the DB)?


VARCHAR is different from a TEXT field in MySQL: http://dev.mysql.com/doc/refman/5.7/en/blob.html

It stores text fields as blobs.

I suppose now the right way would be the JSON data type. It didn't exist when I was working with these servers, though (or they were on a much older version of MySQL): https://dev.mysql.com/doc/refman/5.7/en/json.html


That's soon going to be on the order of 100 terabytes, so there will be only a handful of companies doing big data ;-)


I'm only aware of servers up to 12TB. Care to elaborate?


He/she said a whole rack of servers. I actually took 30 servers of 2TB each and rounded up to 100 TB. With 12TB per server it will already be over that.


10M rows in a relational database is a very low number (depending on the size of the row of course).


I know, it was a sarcastic follow-up to the "now they have big data" part of the original comment.

"SQL doesn't scale." It needs to be in Mongo or whatever NoSQL database is in right now. I have heard all sorts of nonsense regarding "big data" in the last few years.


Ahaha, I didn't read the sarcasm that time, sorry for replying with TMI.


I never thought I would appreciate Java...but the industry has really made me.

Take your .war file, drop it onto JBoss. It deploys across the cluster in a zero-downtime manner; isolates configuration; provides consistent log structure, cert management, and deployment. You can deploy dozens of small wars to the same server and they can talk to each other. Load balance across the cluster automatically based on actual load. Run scheduled jobs and load balance the scheduled jobs themselves. Allow them to be isolated and unique within the cluster.

I may not like Java as a language, but from an infrastructure standpoint Java was basically Heroku long before Heroku was Heroku. The infrastructure is just...solid. The downside was that the XML config stuff was just messy.


Can you do similar stuff with Clojure or Scala? Maybe there's a way to avoid the bad parts.


Actually for many of us it isn't bad.

I have come to the point where I only look at other languages once in a while and it serves me well.

A few years ago when I was still in farming we had the ostrich craze: ostriches were crazy profitable (or so the ostrich sellers said) and every farm needed to consider it.

Eggs were $300 apiece, etc. etc.

Of course, the first to get one made great money by selling eggs, chicks and consulting hours to all the rest.

The rest were not so lucky, and today I don't know of a single ostrich farm.

Same goes for the latest tech: if you want to, you can try to be first and make a good living on the hype stream.


These days there are workarounds for avoiding a lot of XML configuration in the Java ecosystem (although you end up with a decent number of annotations). Spring Boot is a great example of this.
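
For instance, a complete Spring Boot app with zero XML looks roughly like this (the class and endpoint names are made up):

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RestController;

    // @SpringBootApplication replaces the old pile of applicationContext.xml:
    // component scanning, auto-configuration, an embedded servlet container.
    @SpringBootApplication
    @RestController
    public class DemoApplication {

        @GetMapping("/hello")
        public String hello() {
            return "no XML required";
        }

        public static void main(String[] args) {
            SpringApplication.run(DemoApplication.class, args);
        }
    }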


> These days there are workarounds for avoiding a lot of XML configuration in the Java ecosystem

As is, believe it or not, Java EE.


That's the nice thing about JVM languages. They get all that infrastructure without having to use Java straight up.


Yeah. And Kotlin.


Oh, I feel your pain! There's too much fashion going on in this industry, and the high salaries and the youth rediscovering the same concepts from 40 years ago don't help...

I mean, it's great to have this new tech and all, but when you're trying to build something to last some years, sometimes it's hard to filter the crap from all the buzzwords. It just reinforces the thought that smart people should leave this field entirely, or search for other fields of knowledge (or business) where our knowledge of programming can be put to use.

I'm 35 now, but I'm starting to realize that I will not have the patience to keep up with all the crap just to stay employable. There are some areas where being old and experienced is valuable (philosophy, science, psychology, teaching, etc., are maybe some of them), but this industry is definitely not one of those areas. It makes me think that what I'm building now will some day be completely wiped out of existence.


https://youtu.be/zut2NLMVL_k

"All my work will be obsolete by 2005" -Steve Jobs

If you aren't willing to accept that obsolescence is part of life, then you are either building something you aren't passionate about or confused about the cruelty of time.


As inspirational as Steve Jobs was, he wasn't correct about everything, nor was he a brilliant engineer.


Nor did anyone (in this thread) say he was.

The quote has bounded context. And in that context, seems generally valid and applicable.


I'd say it goes beyond building physical or virtual things. I'm talking about something that resembles compound interest, where you build knowledge or experience on top of previous knowledge and experience, such that in some years you are way ahead of where you are now.

I'm basically saying that the high churn we have now does not give you enough time to build significant experience that you can use later on in life, and, as such, it's the opposite of a good investment in the future. It is almost as if we are living only for the present, forgetting that in the future we will have less patience and energy to "re-learn" almost the same things.


I think a lot of this churn can be sidestepped if you avoid startups and instead work for an "Enterprise" organisation. Everything is much less cutting edge; in fact, technology progression is pretty much glacial.

If you look at what technology was popular 10-15 years ago then that's what will be in use in Enterprises now. Java web services is currently the big thing at my company.

All the late-90's business apps which were in Visual Basic, Oracle Forms and Access are being rewritten as Java web services at the moment by an army of contractors. In another 10-15 years they will be rewritten again in the language du jour of that day, probably Go. It's an endless cycle.


My manager and CEO are going wild with "blockchains", and how we should use it for everything.

"We could store gigabytes of data on the clients without having to pay for servers"


Well, maybe... if you had any clients left, or the infrastructure to support such a thing.


This is hysterically funny! Can you tell us who this CEO is?


I too consult for an enterprise product and hear the same things. People don't know what the buzzwords mean, but they know they must have it. That, and then the assumption that ML based products are all just magic and require no effort. "You mean it doesn't just figure that out for me?" Thanks, marketing department.


> ...the assumption that ML based products are all just magic and require no effort.

Yep, same experience here with both "Big Data" and the ML space. The decision makers need to see the sheer amount of Java, Scala and/or Python code you need to actually implement to do anything useful.

Nope...not magic.


It seems like it's been going on in our industry for a long time. I'm reminded of it being referenced in comics from the past two decades.

http://www.hackles.org/cgi-bin/archives.pl?request=52


Micro-services are the current-year deity of the cargo-cult that is Silicon Valley.

Unlike the natives, however, who simply wasted some time building extraneous fake runways, in the Valley people are royally screwing up their own core architecture.

I'm old enough to find this more humorous than frustrating.


It should give anyone with a better grip on core technologies a competitive edge.

The Valley is ripe for disruption. ;)


I think microservices only came into existence because SOA was such a disaster: everyone confused SOA architectures with web services. These ended up being n-tier apps with a web-service RPC (or several) in the middle, just to add some unnecessary serialization and network-transfer bottlenecks.

So far I've seen micro services repeat this trend almost exactly.


Microservices take all the SOA problems and turn them up to eleven, as far as I can tell. They intertwine decomposition of your system (sometimes good) with network communication (rarely good) and additional ops management (never good).

Yanking out the major chunks of independent functionality into separate deployable services makes sense at a large enough scale and for large enough, independent enough components. But you would only do so out of necessity, not as an initial architecture.

And yet here we are.


I remember Larry Ellison saying once that the only industry more driven by fashion than fashion itself was IT. That was when the «cloud» thing was starting to take off. He was refusing to have Oracle use those new buzzwords, but in the end he was forced to.


Perhaps because in both fashion and IT, the surface appearance conceals a tremendous amount of underlying complexity.

Fashion signals, well, virtually everything about social interactions. A tremendously complex world. Including, for that matter, whether or not you care about fashion trends, and quite possibly, why you might or might not (you're not in the game, you've quit the game, you're so fabulously successful you don't need to play the game, you couldn't play the game if you wanted to, ...)

In IT, TLAs, ETLAs, buzzwords, slogans, brands, companies, tool names, etc., all speak to what you know, or very often, don't know. It's not possible to transmit deep understanding instantaneously, so we're left with other means of trying to impart significance.

Crucially, the fact that clothing and IT fashion are so superficial (of necessity) means they can be gamed, and those who are good at following just the surface messages can dive in. Some quite effectively. But they're not communicating the originally intended meaning.


Totally agreed. As a freelancer, it's really scary to invest time into a full stack of technologies. I should adopt a discipline of picking tools and not looking back until n years have gone by. Maybe n = 2 or 3? (Right now, I'm on Objective-C - not even Swift - for native iOS, Ember for the client, Rails for the API/back-office, and Heroku for deployment.)


Honest question: How's that tech stack working out for you? What if you want to dev an Android mobile app?

Have you looked at React Native at all?

Thanks.


React Native was not mature enough the last time I had to solve this problem. After writing and maintaining native clients on both Android and iOS for years, I decided to try something different. An SPA + Cordova + writing custom native plugins for performance has worked out pretty well. Some things in the UI are not as fast as I would like, but the develop/test/release cycle is so much faster (web, iOS, Android released nearly at the same time). It does help that I can write native Android or native iOS (Obj-C/Swift) to handle the plugins where needed. Cordova can also be a bit of a mess to deal with sometimes, but it is improving.


I also gave up on frameworks like Cordova because (1) who knows if they'll still be maintained in a couple of years, and (2) how reactive/efficient can they be in offering access to new features from the native iOS and Android SDKs? I feel like pretty much anything Cordova is really good at, you can do with a webapp.


Another question... how do you like Ember for your web apps? :)


I still have a love/hate relationship with Ember (and other JS frameworks). They are simultaneously very powerful and quite restrictive. My most recent example: there's still no common, fast & easy way to integrate Google Analytics into a project. It takes some (reasonable) effort, like an hour of research and implementation, when it takes 10 minutes on good old server-generated HTML/JS.


I sort of gave up on Android development by now. My mindset is iOS first if I need an app, and then consider a good webapp (with Ember, then) if I want to extend to all smartphone users. If I needed native Android development, I would try to find a partner able to code it; I wouldn't do it myself.


Swift should be mature enough by now for development. Or start with version 3 when it comes out.


I have seen Excel, SQL, even MS Office used as buzzwords. Of course there's also agile, scrum, etc.

Big data and machine learning are also hot words, but they are clearly modern engineering. Consultants exist to explain the best way to achieve modern best practices to people without the appropriate background. If someone asks "Why no Hadoopz plx?", either explain the other technology used instead (maybe Spark, Storm?) or explain that the scale is small enough for Access to handle. That's a consultant's job.


> I feel the industry is simply following marketing trends.

'twas ever thus.

http://dilbert.com/strip/1995-11-17


This is not new. This industry has been about trends since the first dot-com boom. Remember Java Beans? J2EE, 3-tier architectures, blade servers, virtualization, Rails, full stack, NoSQL and big data, ad infinitum. It's an industry like any other, with its own signal-to-noise ratio. You can get frustrated about it, or accept it and accept that it's also what makes this an interesting industry to work in.


Some things never change. It's been this way for as long as I can remember. The sad irony is, satisfaction remains the same (i.e., low) as well. We keep chasing our tails and products remain sub-par and users still frustrated. It's a shame SOS isn't a loved buzzword.


Our industry is a pop culture. A fad.

Computer science is not a real field.


Computer science is a real field. It's just not "developing site/app for BigCo". But - then again - it never has been. :-)


This comment makes me believe that you've never been exposed to any computer science.


Of course not. I am in undergrad.

But I think Alan Kay has been "exposed" to computer science, and I follow his logic, based on my limited scope of knowledge.

https://www.youtube.com/watch?v=FvmTSpJU-Xc


Yea, but does it scale?


What other parts of getting old suck? :)


Having to work with people who see caution and wisdom as "getting old".


Virtues are generally disregarded by those who seek instant gratification.

Most of the time when I bring up the concept of virtue to my peers in age, they seem either confused by the concept or contemptuous of it. They behave like virtue is a purely religious thing, yet caution in the face of possible danger is a very basic survival skill.


That's probably because you're talking about 'virtue' as an abstract quality, rather than good decision making as a practical framework.


How does the concept of morality have anything to do with enthusiasm for new technology?


Honest advice: Stop working for/with stupid companies/people and start working for smart ones.


Not really practical advice without some hint as to how one is supposed to spot those smart companies.


The thing is that for a lot of people, work is a balancing act - between how much you like doing something, and how much you like money. If a "bad" company pays you a tonne of money, you might still work for them because you like having shiny things. However, for each person the line is someplace else - I know some people who will take shit pay just to do what they enjoy, and I know people who don't have a problem working in the most frustrating environments 80 hours a week because they like being paid big $$$. It's all relative to what's important for you.


The smart companies already have more qualified applicants than slots, and often arbitrary hiring processes to boot. Not everyone has the luxury of working for one.


"-No, look into microservices. It’s the future. It’s how we do everything now. You take your monolithic app and you split it into like 12 services. One for each job you do.

That seems excessive"

A hundred times yes. We tried to split our monolithic Rails app into microservices built in Go. Two years and many fires later, we decided to abandon the project. It was mostly because the monitoring and alerting were now split into many different pieces. Also, the team spent too much time debating standards etc. I think microservices can be valuable, but we definitely didn't do it right, and I think a lot of companies get it wrong. Any positive experiences with microservices here?


I think that splitting into microservices is valuable if and only if you reach a scale where it makes sense to do so. By scale, I mean the number of people on the team (if you have a lot of people, it can make sense to split into microservices to limit communication bottlenecks between developers) or in terms of traffic, in which case microservices can be very useful for optimizing the system piece by piece.

A small team starting a new project should not waste a single second considering microservices unless there's something that is so completely obviously decoupled in a way that not splitting it into a microservice will lead to extra work. It's also way easier to split into microservices after the fact than when you're developing a new app and you don't have a clue what it will look like or what the overall structure of the app will be in a year (the most common case for startups).


In practice, microservices mean that you turn a function or method call into a network request (sketched below). This doesn't really limit communication bottlenecks. It is often more difficult to agree on a network interface than on a simple function or object interface. It's also more difficult to change. You introduce a whole new set of failure modes due to going over the network. Debugging is more difficult since you now can no longer step through your program in a debugger but rather have an opaque network request that you can't step into. You can no longer use editor/IDE features like go to definition. It becomes harder to do integration tests. Version control becomes harder if the different services are in different repositories. A network request is much slower than a function call. You no longer have the advantage of a garbage collector for logical values that now cross network boundaries, and rather need to manually free them. Deployment is more difficult. The list is much longer than this, but I'd be interested in the counter-list: what are the advantages of micro-services?
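
To make the first point concrete, a toy sketch (the service name and URL are invented) of the same lookup as an in-process call versus a network request, using only the JDK's built-in HttpClient (Java 11+):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PriceLookup {
        // Monolith: a plain method call. Nanoseconds, cannot fail halfway,
        // and the debugger steps straight into it.
        static long priceInCents(String sku) {
            return 4999;
        }

        // Microservice: the same logic behind a network hop. Now you own
        // timeouts, retries, serialization, and a second deployable.
        static long priceInCentsRemote(HttpClient client, String sku)
                throws Exception {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://pricing-service/prices/" + sku))
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            return Long.parseLong(response.body());
        }
    }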


> You introduce a whole new set of failure modes due to going over the network.

A thousand times yes. Distributed systems are hard.

> Debugging is more difficult since you now can no longer step through your program in a debugger but rather have an opaque network request that you can't step into.

Yes. Folks underestimate how difficult this can be.

In theory it should be possible to have tooling to fix this, but I've not seen it in practice.

> You can no longer use editor/IDE features like go to definition.

Not a problem with a good editor.

> Version control becomes harder if the different services are in different repositories.

No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.


> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

I would modify this slightly. Larger organizations with independent teams may want to run on per-team repos. Conway's law is an observation about code structure but it sometimes also makes good practice for code organization. And of course, sometimes the smell is "this company is organized pathologically".

Another problem is that large monolithic repositories can be difficult to manage with currently available software. Git is no panacea and Perforce isn't either.


> No organisation should have more than one regular-use repo

Flat out wrong for any organization with multiple products. Which, let's be honest, is most of them.


I guess Facebook, Twitter, and Google are doing things "flat out wrong", then. Yes, that's a weak argument (argument from authority) but it is true that monolithic repositories have major advantages even for organizations with multiple products. Common libraries and infrastructure are much easier to work with in monolithic repositories.

My personal take on it, at this point, is that much of our knowledge of how to manage projects (things like individual project repos, semantic versioning, et cetera) is centered on the open-source world of a million mostly-independent programmers. Things change when you work in larger organizations with multiple projects. You even start to revisit basic ideas like semantic versioning in favor of other techniques like using CI across your entire codebase.


Those are huge organizations with commensurately large developer resources, and they simply work at a different scale than most people on HN. "It works for Google" is not an argument for anything.

Monorepos come with their own challenges. For example, if any of your code is open source (which means it must be hosted separately, e.g. on GitHub), you have to sync the open-source version with your private monorepo version.

Monorepos are large. Having to pull and rebase against unrelated changes on every sync puts an onerous burden on devs. When you're remote and on the road, bandwidth can block your ability to even pull.

And if you're going to do it like Google, you'll vendor everything -- absolutely everything (Go packages, Java libraries, NPM modules, C++ libraries) -- which requires a whole tool chain to be built to handle syncing with upstream, as well as a rigid workflow to prevent your private, vendored fork from drifting away from upstream.

There are benefits to both approaches. There is no "one right way".


It seems we agree, we are both claiming that "there is no one right way".

I love Git, and I used submodules for years in personal projects. It started with a few support libraries shared between projects, or common scripts for deployment, but it quickly ballooned into a mess. I'm in the process of moving related personal projects to a monolithic repository, and in the process I'm giving up the ability to tag versions of individual projects or provide simple GitHub links to share my code.

Based on these experiences, I honestly think that the only major problem with monolithic repositories is that the software isn't good at handling it, and this problem could be solved with better software. If the problem is solved at some point in the future, I don't think the answer will look much like any of the existing VCSs.

Based on experiences in industry, my observation is that the choice of monolithic repository versus separate repository is highly specific to the organization.


> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

Mind elaborating on this?


> > You can no longer use editor/IDE features like go to definition.

> Not a problem with a good editor.

What editor are you thinking of that can jump from HTTP client API calls to the corresponding handler on the server?


> No organisation should have more than one regular-use repo (special-use repos, of course, are special). Multiple repos are a smell.

Totally agree with everything else, but gotta completely disagree on this last point. Monorepos are a huge smell. If there are multiple parts of a repo that are deployed independently, they should be isolated from each other.

Why? Because you're fighting human nature otherwise. It's totally reasonable to think that once you excise some code from a repo that it's no longer there, but when you have multiple projects all in one repo, different services will be on different versions of that repo, and your change may have changed semantics enough that interaction bugs across systems may occur.

You may think that you caught all of the services using the code you refactored in that shared library, but perhaps an intermediate dependency switched from using that shared library to not using it, and the service using that intermediate library hasn't been upgraded, yet?

When separately-deployable components are in separate repositories, and libraries are actual versioned libraries in separate repositories, these relationships are explicit instead of implicit. Explicit can be `grep`ed, implicit cannot, so with the multi-repo approach you can write tools to verify that all services currently in production are no longer using an older, insecure shared library, or find out exactly which services are talking to which services by the IDLs they list as dependencies.

With the monorepo approach, meanwhile, you can get "fun" things like service A inspecting the source code of service B to determine whether a cache should be rebuilt (because who would forget to deploy service A and service B at the same time, anyway...), as an example I have personally experienced.

My personal belief is that the monorepo approach was a solution back when DVCSs were all terrible and most people were still on centralized VCSs like Subversion that couldn't deal with branches and cross-repo dependencies well, and that's just what you had to do, while Git and Mercurial, along with the nice language-level package managers, make this a non-issue.

Finally, there's an institutional bias to not rock the boat (which I totally agree with) and change things that are already working fine, along with a "nobody got fired buying IBM" kind of thing with Google and Facebook being two prominent companies using monorepos (which they can get away with by having over a thousand engineers each to manage the infrastructure and build/rebuild their own VCSs to deal with the problems inherent to monorepos that most companies don't have the resources and/or skills to replicate).

EDIT: Oh, I forgot, I'm not advocating a service-oriented architecture as the only way to do things, I'm just advocating that whatever your architecture, you should isolate the deployables from each other and make all dependencies between them explicit, so you can more easily write tooling to automatically catch bad deploy states, and more easily train new hires on what talks to/uses what, since it's explicitly (and required to be) documented.

If that still means a monorepo for your company's single service and a couple of tiny repos for small libraries you open source, that's fine. If it means 1000 repos for each microservice you deploy multiple times a day, that's also fine (good luck!).

Most likely it means something like 3-10 repos for most companies, which seems like the right range per Miller's Law (https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus...) and therefore good for organizing code for human consumption.


> It's totally reasonable to think that once you excise some code from a repo that it's no longer there, but when you have multiple projects all in one repo, different services will be on different versions of that repo, and your change may have changed semantics enough that interaction bugs across systems may occur.

But having multiple repos doesn't prevent the equivalent situation from happening (and, I think, actually makes it much likelier): no matter what, you have to have the right processes in place to catch that sort of issue.

> You may think that you caught all of the services using the code you refactored in that shared library, but perhaps an intermediate dependency switched from using that shared library to not using it, and the service using that intermediate library hasn't been upgraded, yet?

That's the sort of problem which happens with multiple repos, but not (as often) with a single repo.

> Explicit can be `grep`ed, implicit cannot, so with the multi-repo approach you can write tools to verify that all services currently in production are no longer using an older, insecure shared library, or find out exactly which services are talking to which services by the IDLs they list as dependencies.

A monorepo is explicit, too, even more explicit than multiple repos: WYSIWYG. And you can always see if your services are using the same API by compiling them (with a statically-typed language, anyway).

The beautiful thing about a monorepo is that it forces one to confront incompatibilities when they happen, not at some unknown point down the road when no one knows what changed and why.


I think that your comment is actually a pretty good test for when not to spin out a micro service.

If you expect to need to step into a function call when debugging, then it's too tightly coupled to spin out. You should be able to look at the arguments to the call and the response and determine if it's correct (and if not, now you have isolated a test case to take to the other service and continue debugging there).

If the interface will change so often that you expect it will be a problem that it's in a separate repository, if you expect that you will always need to deploy in tandem, then it's too tightly coupled to spin out.

The advantage of microservices is the separation in fact of things that are separate in logic. The complexity of systems grows super-linearly, so it's easier to reason about and test several smaller systems with clear (narrow) interfaces between them than one big one. It's easier to isolate faults. It's harder to accidentally introduce bugs in a different part of the system when the system doesn't have a different part. If done right, scaling can be made easier. But these are hard architectural questions; there's no clear-cut rule for when you should spin off a new service and when you should keep things together.

Someone else mentioned separating the shopping app from the payment system for an ecommerce business, which even has security benefits. I think that's an excellent example.

Edit: Another clear benefit is that you can choose different languages, libraries, frameworks and paradigms for different parts of the code. You can write your boring CRUD backend admin app in Ruby on Rails, your high-performance calculation engine in Rust, and your user-facing app in Node.js (so the front- and backend can share JavaScript validation code).


I just want to add one disadvantage before I give some advantages. There's a lot of operational complexity involved in routing, monitoring, and keeping every instance of every microservice running. That complexity also makes debugging in production much more difficult, as one must track a relay of network requests through many separate layers to find the point where it actually got stuck.

As for advantages, microservices tend to keep code relatively simple and free from complex inheritance schemes. There's rarely a massive tangled-up engine full of special cases in the mix, as there often is in monolithic apps. This substantially decreases technical debt and learning curve, and can make it simple to understand the function an isolated microservice performs.

There is the obvious advantage that if you have disparate applications executing nearly-identical logic to read or write data to the same location, and the application platforms can't execute the same library code, you can centralize that logic into an HTTP API, which reduces maintenance burden and prevents potentially major bugs.

My opinion is that adopting microservices as a paradigm leads to a slow, difficult-to-debug application, primarily because people take the "micro" in microservices too seriously. One shouldn't be afraid to split functionality out into an ordinary service after it's been shown to be reasonable to do so.


Yes, but there's another dimension here. If another team breaks your build in a monolithic repo, you may or may not be able to resolve this quickly. You're in a contract with them about the state of the repo and thus your service.

With microservices, the production version of their service would conceivably be stable. It moves the contract from the repo to the state of production services.


> If another team breaks your build in a monolithic repo, you may or may not be able to resolve this quickly.

With a monolithic repo done right, the other team broke the build on their branch, and it's up to them to resolve it. You, meanwhile, are perfectly happy working on your branch. When their changes are mergeable into trunk, then they may merge them, not before; and likewise for you.

With multiple repos, they break your build, but don't know it. You don't know it either, until you update your copies of their repos — and now you have to figure out what they did, and why, and how to update your logic to handle their new control flow, and then you update again and get to do it again, until finally you ragequit and go live in a log cabin with neither electricity nor running water.


> With multiple repos, they break your build, but don't know it. You don't know it either, until you update your copies of their repos — and now you have to figure out what they did, and why, and how to update your logic to handle their new control flow, and then you update again and get to do it again, until finally you ragequit and go live in a log cabin with neither electricity nor running water.

I don't see how this is a problem if you are pushing frequently and have a CI system. You know within minutes if the build is broken. If it broke, don't pull the project with the breaking changes.

My point is, I don't think one approach is inherently better than the other. Both require effort on the part of the teams to manage changes (or a CM team), and both require defined processes.


> If it broke, don't pull the project with the breaking changes.

I agree with the overall sentiment of your comment, but the quoted part is where I've seen trouble brew. The tendency is to be conservative about pulling updates to dependencies, which can easily get you into a very awkward state when a critical update eventually sits on top of a bunch of updates you didn't take because they broke you. It is usually better to be forced to handle the breakage immediately, one way or another.


> With a monolithic repo done right.

Yes, that's the contract that you need to have with other teams. And it's the contract that is automatically enforced with microservices.


That's true, but...

You don't debug distributed systems by tracing into remote calls and jumping into remote code. You debug them by comparing requests and responses (you use discrete operations, right?) with the specified requests and responses, and then opening the code that has a problem¹.

It calls for completely different tooling, not for a "better debugger".

1 - Or the specs, because yes, now that your system is distributed you also have to debug the specs. Why would somebody decide to do that for no reason at all? Yet lots of people do.


I think it has more to do with need than with going that route just because you have enough people on a team. For instance, if you find that some of your processing or specific request handling could perform better using a different framework or programming language than the one it's currently developed on, then you should definitely consider a microservice approach, decoupling that specific service/functionality from your current stack.


Careful with this one too. Usually adding new features to adapt to a changing marketplace can have new requirements across many of your services that need to be finished quickly. If those services are each in a different language, it can slow everything down by weeks or months.

Multiple platforms are not a problem and are generally a good thing, as long as it's not excessive. You don't want to be in a case where you have the same number of different platforms as developers or anything like that. I'm guessing there is a rule of thumb here, but I'm not sure what it would be. Max 1 different platform per 5 developers? Something like that.


>decoupling that specific service/functionality from your current stack.

I do wish people would stop conflating "running in a different service" and "loose coupling". They are completely orthogonal.

I've worked on some horrendously tightly coupled microservices.


OSGi makes it easy to end up with a cornucopia of tightly coupled nanoservices all running in the same JVM.

That is, unless you can coax dOSGi into working (which is tons of fun), in which case you can have services tightly coupled to other services running on entirely different machines, causing frequent (and hilarious) cascades of bundle failures whenever the network hiccups.

OSGi is a trigger word for me now. I've worked on two large OSGi projects (previous job and current job) and it's always the same. Sh*t is always broken (and my lead still insists that OSGi is the one true way to modular bliss). And the OSGi fanboys always say "Your team is using it wrong!" Which very well might be true, but I no longer care. Apparently it's just too damn hard to get a team of code monkeys to respect service boundaries when OSGi makes it so damn easy to ignore them.

If I'm ever in a position of getting to design a new software architecture (hasn't happened in 10 years, but hey I can dream), I'll punch anyone who suggests "OSGi" to me right in the face.


Wholeheartedly agree on OSGi. Disastrous implementations. I worked with ServiceMix and still have nightmares about classpath issues and crazy bundle scoping rules. A plain old Maven-built jar with shading works much better in practice, but shading itself is shady :)


Well, I consider that by definition tightly coupled microservices should never be done. If it's not possible to decouple that function then it should not be in a micro service.


> A small team starting a new project should not waste a single second considering microservices unless there's something that is so completely obviously decoupled in a way that not splitting it into a microservice will lead to extra work.

That's a good point. I think this thought extrapolates to other parts of software engineering as well. Sometimes writing very modular and decoupled software from the beginning is very hard for a small team, and it's hard to tell whether it's the best approach, since it's also hard to grasp the big picture.

I'm currently facing this issue. I'm trying to write very modular and reusable applications, but now I'm paralyzed trying to picture the best patterns to use, where I should use a facade, a decorator, etc. I think I'll adopt this strategy for myself--only focus on modularizing from the beginning if skipping it would clearly lead to extra work later.


I'd also add that microservices have increased value if you begin with such an architecture in the first place. It's much more difficult to "gracefully" rip an existing monolith into modular pieces than to build modularly from the start.


I don't like correcting with "well, actually", but I have to say that the author of the book "Building Microservices", in his first few chapters (in particular Chapter 3: Premature Decomposition), warns against using microservices for new apps, especially if you are new to the domain. He claims that they are actually easier to adopt when you have to refactor a large monolith, and that normally you shouldn't start with microservices unless you know what you are doing. Hence my criticism of the article, which starts with a premature optimization (splitting one service into 12) - a common, yet arguable, practice.


This has not been my experience. I've seen a few projects where microservices were adopted from the start because it's the thing to do and, in all cases, it didn't work well. It's extremely difficult to split into microservices if you do not have a clear big picture of your project's functions and coupling. And in most cases, in new projects, you don't have that big picture.

Microservices also make it much harder to refactor the code, which you often need to do in the early stages of a project.


Yeah, we do micro services, the "real" kind. Not the "SOA with a new name" kind, but the "some services are literally a few dozen lines of code and we have 100x as many services as we have devs" kind.

The thing is, you need a massive investment in infrastructure to make it happen. But once you do, it's great. You can create and deploy a new service in a few seconds. You can rewrite any individual service to be latest and greatest in an afternoon. Different teams don't have to agree on coding standards (so you don't argue about it).

But the infrastructure cost is really high, a big chunk of what you save in development you pay in devops, and it's harder to be "eventually consistent" (eg: an upgrade of your stack across the board can take 10x longer, because there's no big push that HAS to happen for a tiny piece to get the benefits).

Monolithic apps have their advantages too, and many forget it: less devops cost, easier refactoring (especially in statically typed languages: a right click -> rename will propagate through the entire app), and while it's harder to upgrade the stack, once it's done your entire stack is up to date, not just parts of it. Code reuse is significantly easier, too.


Yeah, add in things like MORE THAN ONE PRODUCTION ENVIRONMENT and LETTING YOUR CUSTOMER HOST AN INSTANCE OF YOUR MICROSERVICES and you have guaranteed your own suffering.


It is a matter of tooling. One data center or ten, it does not matter much with proper tooling. We deploy to seven data centers with a click of a button, with rollback, staggered deployment etc. Centralized logging using ELK gives us great visibility into each DC, without worrying about individual microservice instances.


Easy until you realize you need to somehow manage + configure hundreds of services to run your dev environment...


Beyond just the Docker environment, you only need to be able to run the service you're working on locally. Anything you don't run locally should hit some shared dev/QA infrastructure (which shares a db with local). Whatever you use to develop should be able to detect what you have running locally and prefer those services when available, as in the sketch below.

Anything you're not running locally just hits the shared infra.
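
A hedged sketch of that "prefer local when available" resolution (Go; the port convention and the shared-infra hostname are invented for illustration):

  package discovery

  import (
    "net"
    "time"
  )

  // Endpoint returns the address to use for a dependency: the local
  // instance if something is listening on its well-known port, otherwise
  // the shared dev/QA deployment. The *.dev.internal.example.com naming
  // is a placeholder, not a real convention.
  func Endpoint(service, port string) string {
    local := net.JoinHostPort("localhost", port)
    conn, err := net.DialTimeout("tcp", local, 100*time.Millisecond)
    if err == nil {
      conn.Close()
      return local // you're running this one locally; prefer it
    }
    return service + ".dev.internal.example.com:" + port // shared infra
  }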


Dockerized apps make it simple to run on a dev environment, which is what we do. Of course, one cannot run everything on a laptop; we have a dev cluster.


"Different teams don't have to agree on coding standards (so you don't argue about it)."

Unsure if sarcastic.


Not at all sarcastic. I've seen endless wheel warring over spaces/tabs, level of indents. Mostly I ignore it.


Seems like a terrible idea if "different teams" actually means "random assortment of developers for this specific project." If it actually means "different teams," e.g. you rarely if ever would move from one team to another, I don't see the issue if one team uses tabs and one uses spaces, or you have different naming conventions or whatever.


Well, you use the standard you like, and the other team can use the standard they like. Then you write a micro service to convert from one standard to the other.


I wonder what language I would pick for that. I usually use Scala, but it seems a bit silly when the footprint of the platform would massively outweigh the actual service. I don't like Go. I like Python, but I prefer static typing. Rust seems a bit too low level (although I'd like to try it for embedded). I don't see any point in learning Ruby when i already know Python well.

Maybe Swift? Scala Native in a year or two? I've done a little Erlang before, so maybe Elixir?


So that sounds pretty much like a function call in a monolithic app. How do you store state? I assume you need to between all those microservices.

>The thing is, you need a massive investment in infrastructure to make it happen.

I thought that one of the selling points of microservice architectures was the minimal infrastructure. I am really struggling to see an advantage in this way of doing things. You are just pushing the complexity to a devops layer rather than the application layer - even further from the data.


Building microservices needs discipline, an eye for finding reusable components and extracting them, and, as you said, investment in infrastructure.

Monoliths invariably tend to become spaghetti over time, making any non-trivial refactoring completely impossible. With microservices, interfaces between modules are stable and spaghetti is localized.


Can you expand on how you do logging/debugging/monitoring?


what are the big infrastructure costs?


Deployment has to be easy. Creating a new service from scratch (including monitoring, logging, instrumentation, authentication/security, etc) and deploying it to QA/Production with tests has to take minutes from the moment you decide "Hey, I need a service to do this" until it's in prod.

Because individuals may be jumping through dozens of services a day, moving, refactoring, deploying, reverting (when something goes wrong), etc. It has to be friction-free, else you're just wasting your time.

eg: a CLI to create the initial boilerplate, a system that automatically builds a deployable on commit, and something to deploy said deployable nearly instantly (if tests passed). The services are small, so build/tests should be very quick (if you push above 1-5 minutes for an average service, it's too slow to be productive).

Anyone should be able to run your service locally by just cloning the repo and running a command standard across all services. Else having to learn something every time you need to change something will slow you down.

That infrastructure is expensive to build and have it all working together.
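
To give a flavor, here's a minimal sketch of the kind of skeleton such a scaffolding CLI might generate (Go; the /healthz route and PORT convention are invented for illustration - the point is that every service answers the same standard pings):

  package main

  import (
    "fmt"
    "log"
    "net/http"
    "os"
  )

  func main() {
    // Standard health endpoint every service answers, for load-balancer
    // pings and the monitoring system.
    http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
      fmt.Fprintln(w, "ok")
    })
    // The service's actual work hangs off here.
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
      fmt.Fprintln(w, "hello from a freshly scaffolded service")
    })

    port := os.Getenv("PORT")
    if port == "" {
      port = "8080"
    }
    log.Fatal(http.ListenAndServe(":"+port, nil))
  }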


I do have a positive micro-service experience, and although we are still in the process of breaking down our monolithic SOA-based app, we have seen the benefits already.

The most dramatic effect was on a particular set of endpoints with relatively high traffic (it peaks at 1000 req/s) that was killing the app, upsetting our relational database (with frequent deadlocks) and driving our Elasticsearch cluster crazy.

We did more than just split the endpoints into microservices. We also designed the new system to be more resilient. We changed our persistence strategy to better fit our traffic, using a distributed key-value database and designing documents accordingly.

The result was very dramatic, like entering a loud club and suddenly everything goes silent. No more outages, very consistent response times, the instances scaled very smoothly with traffic increases, and overall a more robust system.

The moral of this experience (at least for me) is that breaking a monolithic app into pieces has to have a purpose, and it involves more than just moving the code into several services while keeping the same strategy (that's actually slower, more time-consuming and harder to monitor).


Do you think the result could also have been a dramatic improvement if you had kept the old system and done those other things, without splitting into microservices?

I can't get my head around how people introduce changes to their system if they have to update 12 different microservices at once. It must be horrible.

Often you hear stories about how people are converting a monolithic app to microservices - but this is easy. Rewriting code is easy, and it's fair to say it always yields better code (with or without splitting into microservices - it doesn't matter).

What I'd like to hear is something about companies doing active development in microservice world. How do they handle things like schema changes in postgres where 7 microservices are backed by the same db? What are the benefits compared to monolithic app in those cases?

It seems to me that microservices can easily violate DRY because they "materialise" communication interfaces and changes need to be propagated at every api "barrier", no?


Multiple microservices are supposed to have different data backends, so that they are completely independent. Splitting your data up this way isn't all roses, but ideally the services are isolated so an update to one doesn't affect the others.


>Do you think the result could also be a dramatic improvement if you kept old system and do those other things except splitting into microservices?

As I said in another thread, the separation into different components was key for resiliency. That allowed independence between the higher-volume updates and the business-critical user-facing component.

>I can't get my head around how people introduce changes to their system if they have to update 12 different microservices at once? It must be horrible.

The thing is, if you design the microservices properly it is very rare to introduce a change across so many deployments at once. Most of the time it's just 1 or 2 services at a time.

>What I'd like to hear is something about companies doing active development in microservice world. How do they handle things like schema changes in postgres where 7 microservices are backed by the same db? What are the benefits compared to monolithic app in those cases?

We don't introduce new features in our monolith service anymore. So, from that perspective we do all active development in microservices.

>"How do they handle things like schema changes in postgres where 7 microservices are backed by the same db?

The trick is, you want to avoid sharing relational data between microservices. I don't know if it is just us, but we have been able to split our data model so far, and in most cases we don't even need a relational database anymore, so having a schemaless key/value store seems easy too.

> What are the benefits compared to monolithic app in those cases?

There are several advantages, but the critical one for me is having a resilient platform that can still operate even if a subsystem is down. With our monolithic app it's an all-or-nothing thing. Another advantage is splitting the risk of new releases.

>It seems to me that microservices can easily violate DRY because they "materialise" communication interfaces and changes need to be propagated at every api "barrier", no?

Not necessarily. YMMV but you can have separation of concerns and avoid sharing data models. When you do have shared dependencies (like logging strategy or data connections) you can always have modules/libraries.


Which one of the four major improvements do you attribute the success to, though? Could you have done the work on resiliency, persistence and document redesign without breaking into micro-services and still have seen the positive results?


I don't think the level of success comes from one dimension, but I don't think either that we could have achieved the resiliency without breaking it into micro-services (or just services that happened to be small, if you will).

One key factor was decoupling the high-volume updates from the user requests so one didn't affect the other.


> Any positive experiences with micro-services here?

In my experience, any monolith that can be broken up into a queue-based system will benefit enormously. This cleans up the pipelines, and adds monitoring and scaling points (the queues). Queues remove run-time dependencies on the other services. It requires that these services are _actually_ independent, of course.

I do, however, avoid RPC-based micro-services like the plague. RPC adds run-time dependencies between services. If possible, I limit RPC to other (micro)services to launch/startup/initialization/bootstrap time, not run-time. In many cases, though, the RPC can be avoided entirely.


> Any positive experiences with micro-services here?

Yep. We already had a feature flag system, a minimal monitoring system, and a robust alerting system in place. Microservices make our deployments much more granular. No longer do we have to roll back perfectly good changes because of bugs in unrelated parts of the codebase. Before, we had to have involved conversations about deployments, and there were many things we just didn't do because the change was too big.

We can now incrementally upgrade library versions, upgrade language versions, and even change languages now, which is a huge win from the cleaning up technical debt perspective.
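
For anyone unfamiliar, the call-site pattern for feature flags is tiny; a hedged sketch (env-var backed, purely illustrative - a real system backs this with a config service and per-user rollout):

  package flags

  import "os"

  // Enabled is the smallest possible flag check: on if FLAG_<NAME>=1 is
  // set in the environment. Call sites guard new code paths, so a deploy
  // can ship dark and the flag can be flipped (or reverted) independently
  // of the rollout:
  //
  //   if flags.Enabled("NEW_CHECKOUT") {
  //     newCheckout(w, r)
  //   } else {
  //     oldCheckout(w, r)
  //   }
  func Enabled(name string) bool {
    return os.Getenv("FLAG_"+name) == "1"
  }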


How granular are your services? I've heard a lot of talk about microservices without much talk about how micro they are. As someone who's happy with the approach, would you mind giving a little context on the degree to which you've split the system down?


They are of varying sizes; we have maybe 3-4 per developer. The smallest are maybe 5-6 Python classes.

To be honest, we still have a monolithic application at the heart of our system that we've been slow to decompose, though we're working on it. We deploy it on a regular cadence and use feature flags heavily to make it play nice with everything else.


How many instances do you deploy of services that are essential but low usage?


That sounds more like your team doesn't know how to use git as anything more than an SVN replacement.


"We rolled out this update with 220 changes. There's a breaking bug. Where is it? We need to find out in the next 5 minutes, revert, and deploy. Otherwise we have to revert the whole thing- we're losing money."

Git doesn't really help with that. More granular deployments do, and if microservices help with more granular deployments, go for it.


> We rolled out this update with 220 changes.

That's your problem right here


Git-bisect does, doesn't it?


Only if you have a test that catches the bug, and it still needs time to run. You'll also need time to write a fix, validate it and deploy it, plus any extra time your organization needs between code and deployment.


It helps with the find part. The revert and deploy, not so much, especially if it's the middle commit of 200 and you'd still like to deploy all the commits before and after.


If you had a test for the issue, you probably wouldn't have deployed the software in the first place.


My experience is that most devs don't even know how to use SVN correctly. I just had a conversation with someone waiting for me to finish something before they could branch. The idea that I could merge my change into their branch afterwards didn't occur to them.


>Any positive experiences with micro-services here?

It makes sense for some things. We run a webshop, but have a separate service that handles everything regarding payments. It has worked out really well, because it allows us to fiddle around with pretty much everything else and not worry about breaking the payment part.

It helps that it's a system where we can have just one test deployment and everyone just uses that during testing of other systems.

I've also worked at a company where we had to run 12 different systems in their own VMs to have a full development environment. That sucked beyond belief.

The idea of micro-services is enticing, but if you need to spin up and configure more than a couple to do your work, it starts hurting productivity.


> have a separate service that handles everything regarding payments. It has worked out really well, because it allows us to fiddle around with pretty much everything else and not worry about breaking the payment part.

Is the payments service a single service that manages the whole transaction, or have you go for multiple services handling each part and, if so, how did you manage failure with a distributed transaction?


It's a single service. It just sits between us and our PSPs. That way no other system needs to worry about integrating directly with the PSPs.


Not sure if it's the case here, but what works really well for us is queues with an at-least-once guarantee. (For payment services you might need an additional check to guarantee exactly-once execution.) I think you can find such a queue offered by most providers.
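
To make that "additional check" concrete, a hedged sketch of deduplicating at-least-once deliveries by message ID (Go; in-memory for brevity - for payments you'd persist the seen-set, e.g. behind a unique constraint, in the same transaction as the side effect):

  package consumer

  import "sync"

  // Message is whatever the queue delivers; ID is what makes redelivery safe.
  type Message struct {
    ID   string
    Body []byte
  }

  // Dedup drops messages whose IDs have already been processed, turning
  // at-least-once delivery into effectively-once processing.
  type Dedup struct {
    mu   sync.Mutex
    seen map[string]bool
  }

  func NewDedup() *Dedup {
    return &Dedup{seen: make(map[string]bool)}
  }

  func (d *Dedup) Handle(m Message, process func(Message) error) error {
    d.mu.Lock()
    done := d.seen[m.ID]
    d.mu.Unlock()
    if done {
      return nil // duplicate redelivery: ack and drop
    }
    if err := process(m); err != nil {
      return err // nack: the queue will redeliver
    }
    // A persistent store with a unique constraint closes the remaining
    // race between concurrent duplicates.
    d.mu.Lock()
    d.seen[m.ID] = true
    d.mu.Unlock()
    return nil
  }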


Totally agree.

We had almost the same story with payments, except we jumped to a payment-processing SaaS and got dissatisfied (all the SaaSes I saw don't work with PayPal EC without so-called "reference transactions" enabled), so we decided that wasn't a good idea and had to jump back to an in-house implementation.

I didn't want to re-integrate the payments code back into the monolith - I thought it would take more time and make the code messier. So I wrote a service (it's small, but to heck with the "micro" prefix) that resembled that SaaS' API (the parts we'd used). It has surely evolved and isn't compatible anymore, but that doesn't matter as we're not going back anyway.

Works nicely and now I feel more relaxed - touching the monolith won't break payments.

On the other hand, I see how too many services may easily lead to fatigue. Automated management tooling (stuff like docker-compose) may remedy this, but also may bring their own headaches.


I don't think having a handful of services, each handling a specific, atomic section of the app, really qualifies as 'micro-services'; it's just smart separation of concerns.

We have specific services that process different types of documents, or communicate and package data from different third parties, or process certain types of business rules, that multiple apps hook into, but it's literally like 20 services total for our department, some that are used in some apps and not others.

When I hear 'micro-services' I'm picturing something more akin to like node modules, where everything is broken up to the point where they do only one tiny thing and that's it. Like your payment service would be broken into 20 or 30 services.

But maybe I'm mistaken in my terms. I haven't done too much with containers professionally, so I'm not too hip with "the future".


I'm building a podcast discovery app and I find myself being de-facto pulled towards modularity. It's because my feed checker is in Elixir, my site is WordPress-based, and I communicate between them using the WP API, and I'm using Google Cloud SQL, and Elasticsearch on its own virtual machine...

The thing is though, the Elixir feed checker has its own database table that tracks whether it's seen an episode in a feed. And when there's a new episode it sends an API call to WP to insert the new post. The problem is that sometimes the API calls fail! Now what? I'll need to build logging, re-try etc. So I'm thinking of making the feed checker 'stateless' and only using WP with a lot of query caching as the holder of 'state' information about whether an episode has been seen before.

To sum up my experience so far, there's something nice about being able to use the right tech for each task, and separating resources for each service, but the complexity--keeping track of whether a task completed properly--definitely increases.
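
For what it's worth, the "logging, re-try etc." plumbing mentioned above tends to look like this (sketched in Go for illustration, though the poster's checker is Elixir; the URL handling and limits are placeholders):

  package feed

  import (
    "bytes"
    "fmt"
    "net/http"
    "time"
  )

  // postWithRetry retries a cross-service call with exponential backoff
  // and gives up after a few attempts, leaving the caller to log or
  // queue the failure.
  func postWithRetry(url string, body []byte, attempts int) error {
    backoff := time.Second
    var lastErr error
    for i := 0; i < attempts; i++ {
      resp, err := http.Post(url, "application/json", bytes.NewReader(body))
      if err == nil {
        resp.Body.Close()
        if resp.StatusCode < 300 {
          return nil // success
        }
        lastErr = fmt.Errorf("status %d", resp.StatusCode)
      } else {
        lastErr = err
      }
      time.Sleep(backoff)
      backoff *= 2
    }
    return fmt.Errorf("giving up after %d attempts: %w", attempts, lastErr)
  }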


I think your problem might be not extending WordPress. PHP will gladly do the above through WordPress plugins and cron jobs. I built something similar and ended up going that way. The system is still running to this day. Sure, it's not hip, but it gets the job done with minimal fuss and makes money.


You are right that PHP can do the feed checking part but I wanted to use something with easy async/concurrency out of the box (and wanted to learn Elixir instead of using Node.)

One hard tech limit is that with 50k podcasts and 4 million+ episodes, search definitely doesn't work well. Not just WP, but SQL itself. Hence Elasticsearch. I also plan to work on recommendations, etc., so I will probably need to export SQL data into other systems anyway for making the "people who liked this also liked this" kind of things.

Also I kinda lied about using the WP API--that's how I built the system initially (and will switch to it moving forward), but to import the first few million posts from the content of the feeds, I just used wp_insert_post against the DB of new entries that Elixir fetched (I posted the code I used here: http://wordpress.stackexchange.com/a/233786/30906).

I also plan to write the whole front-end in React (including server side rendering) so will have to figure out how to get that done. Would probably use the WP-API with a Node.js app in front of it, will look into hypernova from AirBNB. So probably more usage of WP API accessed by another service...


I hope you are not doing all of this alone. I'd try and keep things as simple as possible within a monolith and then improve as needs increase. Good luck :)


I generally write what I consider monolithic Django apps. I would add in Haystack (a search module for Django) and configure it to use Elasticsearch to overcome the problems you describe.

It doesn't sound like microservices are needed, just adding in the appropriate tech for the job.


cron jobs

Once these are doing anything other than rotating log files, can the system really be considered monolithic?


How do you define a monolith? Please establish that before we discuss further. :)


That's an incisive question. My impression, which may be mistaken, is that a cronjob would be used to move data (pages compiled from templates, chart images, etc.) into the PHP host on a "batch" basis. To me, that implied the existence of other systems that handle the data in their own way, but I guess in this thread the salient difference between micro and mono is that the former connects components via a web stack. Are there more agile interfaces available for cronjobs? If instead we're only considering transformations of data already resident on the host (as what, flat files?), I don't imagine that cronjobs are the best solution available.


Care sharing your progress with the podcast discovery app? I'm a cofounder of Podigee which is a podcast hosting service. Maybe we can exchange know-how, find some synergies or even join forces on certain topics. Feel free to drop me a line at mati@podigee.com


Sure, emailed. Also added a screenshot/twitter to my profile in case anyone else is interested.


The problem with microservices is that your state is spread over multiple systems. You completely lose the concept of transactional integrity, so you will have to work around that from the start.

The advantage though is that APIs (system boundaries) are usually better defined.

Perhaps one should use the best of both worlds: run microservices on a common database, and somehow allow transactions to be passed between services (so multiple services can act within the same transaction).


One of the big advantages of microservices is scaling/migrating the databases behind each service independently. If you need transactions across multiple services then one could argue that either your API endpoint is doing too much, or your services are doing too little. It's not perfect, and certainly not always convenient, but it's a balance. Microservices with a common DB is asking for trouble. The monolith is a better option in that case IMO.


Idempotency, event sourcing and sagas with compensations are ways to solve your problem.

A shared database is an anti-pattern in distributed systems.

Similarly, distributed transactions (a la DTC) are an anti-pattern.

Distributed systems aren't hard. They're just different.
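
For anyone who hasn't met sagas: a minimal sketch of the pattern (Go; names invented). Each step commits locally; if a later step fails, the completed steps' compensations run in reverse:

  package saga

  // Step does some local, committed work; Compensate undoes its business
  // effect if a later step fails.
  type Step struct {
    Name       string
    Do         func() error
    Compensate func() error
  }

  // Run executes steps in order. On failure it runs the compensations of
  // every completed step in reverse, then returns the original error.
  func Run(steps []Step) error {
    var done []Step
    for _, s := range steps {
      if err := s.Do(); err != nil {
        for i := len(done) - 1; i >= 0; i-- {
          _ = done[i].Compensate() // best effort; real systems log and retry
        }
        return err
      }
      done = append(done, s)
    }
    return nil
  }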


Say you sell a widget. You want to update both your cash account and your inventory, and never one without the other. Which is easier to understand and more reliable: doing them atomically, or making sure you have designed in 2^n intermediate states and all the code required to complete work that should happen but hasn't yet?
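
For contrast, the atomic version of that widget sale is a few lines in any relational store (a sketch; table and column names are invented for the example):

  package sale

  import "database/sql"

  // SellWidget updates the cash account and the inventory in one local
  // transaction: both happen or neither does.
  func SellWidget(db *sql.DB, widgetID string, priceCents int) error {
    tx, err := db.Begin()
    if err != nil {
      return err
    }
    defer tx.Rollback() // harmless no-op after a successful Commit

    if _, err := tx.Exec(
      `UPDATE accounts SET cents = cents + $1 WHERE name = 'cash'`, priceCents); err != nil {
      return err
    }
    if _, err := tx.Exec(
      `UPDATE inventory SET qty = qty - 1 WHERE widget_id = $1`, widgetID); err != nil {
      return err
    }
    return tx.Commit()
  }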


The problem with microservices is that your state is spread over multiple systems.

Then again, sometimes it's advantageous to identify parts of your system where aspects of state can be safely decoupled. And in which having them reside in disparate systems (and yes, sometimes be inconsistent or differently available) might actually be a better overall fit.

You completely lose the concept of transactional integrity, so you will have to work around that from the start.

Then again, sometimes your state changes not only don't need to be transactional; it can be disadvantageous to think of them that way.

Depends, depends, depends.


> Then again, sometimes your state changes not only don't need to be transactional; it can be disadvantageous to think of them that way.

I'm curious; in what kinds of situation would this apply?

> Depends, depends, depends.

Flexibility is usually an important requirement. Often you cannot freeze your architecture and be done with it. I think a transactional approach fits better with this.


I'm curious; in what kinds of situation would this apply?

Any situation where the business value of having your state be 100% consistent does not outweigh the performance or implementation cost of making it so.


> You take your monolithic app and you split it into like 12 services.

The non-web world has been doing this with message queueing for about 15 years. Maybe more.


Probably more. I'd say, like, at least 30-40 years.

I mean, the infamous "UNIX way" of "do one thing, do it well" (something we nearly lost with the popularity of the "do everything in a manner incompatible with how others do it" approach in too many modern systems), when complex behavior was frequently achieved through the modularity of smaller programs communicating through well-defined interfaces.

Heck, microkernels are all about this, and their ideas haven't grown out of nowhere. And HURD (even though it was never finished) is a quarter of a century old already.


You do know the author is taking the piss out of the practice, right?


You know nothing in the comment you're replying to indicates I wouldn't, right?


Don't be so quick to assume micro services involve message queuing. For most it seems to just be an elaborate RPC mechanism (unfortunately).


Oh yeah, there's certainly other ways of doing it. My own experience is just that message queuing seems to be the default loosely-coupled RPC mechanism for larger orgs (from before the term 'micro services' was popular).


Yes. A lot of success. And with only one person on the backend full time.

That said, in places where it doesn't make sense we didn't try to force it. Our main game API is somewhat monolithic, but behind it we have almost 10 other services. Here's a quick breakdown:

  - Turn based API service (largest, "monolithic")
  - Real-time API service (about 50% the size of turn-based)
  - config service (serves configuration settings to clients for game balancing)
  - ad waterfall service (dynamic waterfall, no actual ads)
  - push notification service 
  - analytics collection service (mostly a fast collector that dumps into Big Query)
  - Open graph service (for rich sharing)
  - push maintenance service (executes token management based on GCM/APNS feedback)
  - help desk form service (simple front-end to help desk)
  - service update service (monitors CI for new binaries, updates services on the fly - made easy by Go binary deployment from CI to S3)
  - service ping service (monitors all service health, responds to ELB pings)
  - Facebook web front-end service (just serves WebGL version of our game binary for play on Facebook)
  - NATS.io for all IPC between services
...and a few more in the works. Some of these might push the line of "micro" in that they almost all do more than a single function's worth of work, but that level of granularity isn't practical.

But don't get too caught up on the "micro" part. Split services where domain lines naturally form, and don't constrain service size by arbitrary definitions. You know, right tool for the job and whatnot.


I only use a microservice if it's something that can operate by itself. Things like a file data store, report generation, etc. But all business logic goes in the monolith.


Oh yes. We're splitting up a large monolith into a bunch of different services. Completely amazing, though there's a ton of tools (like Netflix's Hystrix, etc) that make it much, much easier to do.

I wouldn't, however, just "do microservices" from day one on a young app. But usually that young app has no idea what the true business value is, i.e., you have no idea what down time of certain parts of your services really means to the business. That's the #1 pain point we're solving: having mission critical things up 100%, and then rapidly iterating on new, less stable feature designs in separate services.

You should, however, keep an eye on how "splittable" everything is, i.e., does everything need to be in the same DB schema? Most languages have package concepts, which typically align (somehow) with "service" concepts. Do you know their dependencies? That sort of thing. Then, the later process of "refactor -> split out service" is pretty straightforward and easy to plan.


I saw it done properly once, with good separation of services etc, but it still required a truckload of commitment to make it work: interface changes where the code just doesn't work when you merge are one thing; figuring out the publish and restart order of each service when you have to add an operation, so you don't knock out the whole system at every upgrade, is another.

I don't really like that model applied to everything, but eh, now you are kind of forced into a hybrid approach - say, your macro vertical plus whatever payment gateway service, Intercom or equivalent customer interaction services, metrics services, retargeting services; there are a lot of heterogeneous pieces going into your average startup.

But back on topic: what Docker really needs now is a whack on the head for whoever thought up swarms/overlays, and a proper, sane way to handle discovery and fail-over - instead we got a key-value service deployment to handle, which cannot be both in Docker and highly available unless you like infinite recursion.


I don't know if the community will consider this example to be microservices as currently defined, but many years ago I wrote client-server systems using a PC-based fat client application, and transactions running on a CICS server. I think this was pretty similar to what people currently think of as a microservices architecture, although we didn't have to worry about running/monitoring multiple servers (all the transactions/services ran on a single mainframe server), and the transaction monitor managed things like start-up and shutdown pretty simply. This approach worked really well for us, and we built several robust, scalable applications using this approach. To be clear, we numbered our users in hundreds, not thousands or millions. I can well understand how scaling this approach across many servers could be very challenging.


> Any positive experiences with micro-services here?

I'm currently working on a large refactoring effort along these lines. The end goal is to create a modular, potentially distributed system that can be deployed in a variety of configurations, updated piecemeal for different customers, and integrated by our customers with the third-party or in-house code of their choice using defined APIs. We aren't typical of the other examples, though, in that we do literally ship our software to our customers and they run it on their own clusters.


positive experience with microservices: identify discrete functions that can be considered "stateless" (i.e. no side effects, deterministic output for a given input) and factor those out into stand-alone microservices.

a good example of this that I've used in production at my current $dayjob: dynamic PDF generation. user makes a request from our website; the request data is used to fill out a pdf template context, which is then sent over to our PDFgen microservice, which does its thing and streams a response back to the user.
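
the calling side of such a service can stay small; a hedged sketch (Go; the /render route and payload shape are invented for illustration, not the actual API):

  package pdfclient

  import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
  )

  // Render posts a filled-out template context to the PDF service and
  // streams the resulting document straight through to w.
  func Render(serviceURL string, ctx map[string]any, w io.Writer) error {
    payload, err := json.Marshal(ctx)
    if err != nil {
      return err
    }
    resp, err := http.Post(serviceURL+"/render", "application/json", bytes.NewReader(payload))
    if err != nil {
      return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
      return fmt.Errorf("pdf service returned %d", resp.StatusCode)
    }
    _, err = io.Copy(w, resp.Body) // stream the PDF to the user
    return err
  }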


Does it connect to the database to fill in some values in the template? Does it keep a connection pool of, let's say, 5 connections always open (as libraries like to do)? Does it have authentication? Is it a public or private API? Who is managing security? Is it running behind its own nginx or other proxy? Does it have DoS protection (PDF generation can be CPU-intense)? What about the schema for requests? How do you manage changes to the schema? They need to be deployed together with changes in other services, right? What about changes to the database schema - you need to remember to update that service as well and redeploy it at the right time, just after successful db migrations, which live in another project.

All of that and much more needs to be replicated for each microservice, right?

Why not just have a module in your monolithic app that does it? The logic will still be separate. In most languages/frameworks you can spawn a pdf generation task. Any changes are easier to introduce as well. There's no artificially materialised interface. Updates are naturally introduced. All auth logic is there already, you don't need to worry about deploying yet another service, same with logging etc.


> Does it connect to the database to fill in some values in the template?

the template has values that are related to database models. the main app (still mostly monolithic) fills out the template context. the context itself is what's passed to the microservice. the microservice does not connect to a database at all.

> Does it keep connection pool of let's say 5 connection always open (as libraries like to do)?

no. the service probably handles a few hundred requests per day, it is not in constant use. communication is over HTTPS. it opens a new connection on each request. this does impact throughput, but it's a low throughput use case, and pdf rendering itself is much slower, and that time totally dominates the overhead of opening and closing connections anyway.

> Does it have authentication?

yes, it auths with a bearer token that is borne only by our own internal server. this is backend technology so we don't have to auth an arbitrary user. we know in advance which users are authorized.
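
in case it helps, that kind of single-shared-secret check is only a few lines (a rough sketch in Go; the header scheme follows RFC 6750, everything else is invented):

  package auth

  import (
    "crypto/subtle"
    "net/http"
    "strings"
  )

  // RequireBearer rejects any request not carrying the one shared
  // internal secret, compared in constant time. No per-user auth needed.
  func RequireBearer(secret string, next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
      got := strings.TrimPrefix(r.Header.Get("Authorization"), "Bearer ")
      if subtle.ConstantTimeCompare([]byte(got), []byte(secret)) != 1 {
        http.Error(w, "unauthorized", http.StatusUnauthorized)
        return
      }
      next.ServeHTTP(w, r)
    })
  }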

> Is it public or private API?

private

> Who is managing security?

we are, with a lot of assistance from the built-in security model of AWS.

> Is it running behind it's own nginx or other proxy?

the main app is behind nginx. the microservice is running in a docker container that exposes itself over a dedicated port. there's no proxy for the microservice, again, because of the low throughput/low load on the service. no need to have a load balancer for this so the most obvious benefit of a proxy wasn't applicable.

> Does it have DoS protection (PDF generation can be CPU intense)?

yes, it's an internal service and our entire infrastructure is deployed behind a gatekeeper server and firewall. the service is inaccessible to outside requests. the internal requests are queued up and processed 1 at a time.

> What about the schema for request?

request payload validation is handled on both ends. the user input is validated by the main app to form a valid template context. the pdf generator also validates the template context before attempting to generate one. it's possible to have a valid schema with data that can't be handled correctly, though. errors are just returned as a 500 response. happens infrequently.

> They need to be deployed together with changes in other services, right?

nope. the microservice is fully stand alone.

> What about changes to database schema - you need to remember to update that service as well and redeploy it at the right time as well - just after successful db migrations - which live in another project.

the microservice doesn't interact with a database at all. schema changes in the main app database could potentially influence the pdf template context generation, but there are unit tests for that, so if it does happen we'll get visibility in a test failure and update the template context generation code as needed. none of this impacts the microservice itself though. it is fully stand alone. that's the point.

> All of that and much more needs to be replicated for each microservice, right?

in principle yes, and these are good guidelines for determining what is or is not suitable to be a microservice. if it would need to auth an arbitrary user, or have direct database access, or be exposed to public requests, it might not be a good candidate for a microservice. things that can stand alone and have limited functional dependencies are much better candidates.

> Why not just have a module in your monolithic app that does it.

because the monolithic app is Python/django and the PDF generation tool is Java. one of the main advantages of microservices architecture is much greater flexibility in technology selection. A previous solution used Python subprocesses to call out to PDF generation software. It's actually easier and cleaner for us to use a microservice instead.


<sarcastic mode on> Maybe so. But didn't the move create a cool Software Architect job position out of nowhere -wink-wink- ? <sarcastic mode off>


> Also, the team spent too much time debating standards etc.

Ah yes, the 'let's have decentralised microservices with centralised standards!' anti-pattern. It results in lots of full-fledged, heavyweight, slow-to-update services, which also have all the problems of a distributed system. It's the worst of both worlds.


Was it a complete rewrite? I don't think that's the right way to transition. Why didn't you try to separate the features one-by-one from the monolith? That would give more immediate feedback and real problems to work on, instead of the possibility of getting stuck in the holy architecture debate.


No it wasn't a complete rewrite. We started by separating out the most mission critical components. Maybe that's where we went wrong; the most mission critical components were quite large and unwieldy to split out all at once. There was also the overhead of keeping the newly separated out component and the monolithic app in sync.


Microservices is a technology. IMO, like any other tech, it should be used when there are clear benefits expected in the near future, not as a blanket "always microservices" policy.

Although I personally had to deal with some monolithic monsters that I wished were split into smaller services.


What was the benefit you envisioned?


We were looking to break up our monolithic Rails app into micro-services so devs could iterate and develop faster. We also thought that the application as a whole would become more failure-resistant. Unfortunately, inter-dependencies among the services themselves meant that the failure resistance didn't pan out as we thought.


You split off specific parts of your app into microservices because you want to scale those parts independently from the rest. It's not just a blind decomposition of a monolith for the sake of decomposition.


Wait a minute, this sounds familiar. *looks at username* Oh.


'mrhektor is a green account.


I should've been more clear. The story sounded familiar, because I worked with them and recognized them based on the username.


Hi Stepan!


> Also, the team spent too much time debating standards etc.

IMHO, you need a lead with a clear vision who drives the effort. Too many leads will create chaos.


We did have a lead with a vision, and part of the vision was standards for each service (for example, file structure in Go). I can see the rationale behind it; a new dev can onboard very quickly on to a new service. But in hindsight, maybe it wasn't thought out enough.


Heh, since the day I heard of "microservices" the only thing I could think was "have fun maintaining that".


> It was mostly because the monitoring and alerting were now split into many different pieces

Well, there's your problem - you need a monitoring microservice and an alerting microservice! Well, those may be too coarse by themselves, but once you break them down into 5 or 6 microservices each, you'll be ready for production.


Author here.

To answer some questions: yes this is obviously poking fun at Docker, but I also do really believe in Docker. See the follow-up for more on that: https://circleci.com/blog/it-really-is-the-future/

In a self-indulgent moment I made a "making of" podcast about this blog post, which is kinda interesting (more about business than tech): http://www.heavybit.com/library/podcasts/to-be-continuous/ep...

And if you like this post you'll probably like the rest of the podcast: http://www.heavybit.com/library/podcasts/to-be-continuous/


It's a good post. It captures both the innate complexity of the problems some of us are working on, and the incredible WTF moments involved. Both sides of your theoretical conversation have their own pitfalls. Everyone in the comments here seems stuck on the "I don't really need microservices" train (and there is such a thing as overdoing it … but I've never seen it), but I can't help but think that Italics nailed it here:

> -It means they’re shit. Like Mongo.

> I thought Mongo was web scale?

> -No one else did.

It's so incredibly true, and I laugh (and cry, b/c we use Mongo) at this section each time I read it. Also, this gets me every time:

> And he wrote that Katy Perry song?


I have to admit, I was startled to get to the "we sell these services" blurb at the end, because that was such a well-done unsell.


I like the humility on display here. And I'm not trolling or being sarcastic.


There was a time when Heroku seemed just as foreign to me as Docker does in this article.

- So shared webhosting is dead, apparently Heroku is the future?

- Why Ruby, why not just PHP?

- Wait, what's Rails? Is that different from Ruby?

- What's MVC, why do I need that for my simple website?

- Ok, so I need to install RubyGems? What's a Gemfile.lock? None of these commands work on Windows.

- I don't like this new text editor. Why can't I just use Dreamweaver?

- You keep talking about Git. Do I need that even if I'm working alone?

- I have to use command line to update my site? Why can't I just use FTP?

- So Github is separate from Git? And my code is stored on Github, not Heroku?

- Wait, I need to install both PGSql and SQLite? Why is this better than MySQL?

- Migrations? Huh?


When you heard about Rails, it sounds like it was already mature enough for you to start using it, because it was a serious improvement over PHP - I took a similar route. Docker/virtualization is still early, and people are figuring out how the pieces fit together and what the best pieces are. So it's best to wait until then (IMO).


To be fair, a lot of these questions are valid. You arguably /don't/ need MVC for your simple website; if Dreamweaver floats your boat you might as well use it; it's unclear why a small website needs an elaborate deployment infrastructure; using a VCS for personal code can be overkill; PGSql and SQLite aren't better than MySQL for every use case -- and so on.

Frameworks, orchestrations, even just new technologies -- these are great if they actually make your job easier or if they make your product better. Unfortunately, they often do exactly the opposite.


Agree until

> using a VCS for personal code can be overkill

I've been burned before, have you? If you're using something like Google Drive, you should use DropBox instead, since it seems less likely to lose your work.


Obligatory link to "The S stands for simple", a SOAP-bashing classic: http://harmful.cat-v.org/software/xml/soap/simple


"Let me tell you about UDDI"

Nooooooooooooooooo. Every time someone says "service discovery" a kitten dies (except for Consul, that's the biz).


Everything must be in XML. Except the SOAPAction header. Which has no defined standard. Yeah, I remember all that madness.


Remember? Thomson Reuters' on-demand APIs are still largely SOAP-based.


I'm working with a very well-known American company with over $4b annual revenue that shall remain nameless and is currently developing a new SOAP API to replace the existing "dump a CSV on an FTP server" integration.


never touch a running system...


Fair enough, but I posit that X users started consuming this API when SOAP was prevalent, Y users started when ReST was prevalent, and Y >> X. Furthermore, SOAP is hard to maintain these days because it's so ancient, i.e. the libraries are not new and/or actively maintained.

As such, I maintain SOAP should be gone for the good of the running system.


In Python you simply don't have good SOAP libraries. They were all started at the tail end of its popularity and then all died quiet deaths when attention shifted to ReST before they were actually production ready, and if you now want to talk to a SOAP service… well, better don't do it in Python. 2, that is. Forget about 3.


Have you seen Zeep?

It's literally billed as "A fast and modern Python SOAP client". Python 2 and 3 compatible. Last commit was two weeks ago.

http://docs.python-zeep.org/en/master/


Nope. We needed one last September, zeep didn't yet exist back then.

And going by the bugtracker, it's running into quite a few problems with almost-but-not-quite compliant servers/WSDL files, which is a real issue when you're trying to interface ass-old legacy APIs (we're talking "not upgraded since 2006"-old) made by $BigEnterprise. Maybe this time the project won't die before they work out all the little kinks.


If that was ever true it certainly doesn't seem to be true now. All the tools support WSDL-first. All the tools are compatible with each other. Fill in the URL, let it autogenerate the interface, write your code and it all just works.


Because it was the latest trend 10+ years ago, and now people have made it just work because their applications are all built on it and people need actually good tools to use these architectures. It's always about tools, it's not like TCP is the best protocol or anything, it just has the best tooling, ditto for C, POSIX, etc. anything can be a good standard after 15+ years of work on it. Containers will be like that in a couple of years. It's all just cycles man.


Man, I don't think this is the future at all. OK, Docker is good and has its purpose, and is very good at what it does: "Run only one process in one brand new kernel". But beyond that, it's just a daemon that uses and abuses Linux containers. You can easily scale, but it is a pain in the ass to upgrade apps, and you need to run only one process in each container. It does not look like the future to me to have 30 different Linux containers each running only one process. Dude, you have a kernel in your hand, why the hell would you run only one process on it? (What the heck, you can protect yourself and scale without being the bitch of a daemon, you just need to know your best friend the kernel.) You don't need to make microservices for everything. It's good, OK, but it's not the solution for everything like people are saying...

I really don't have any idea why people are so excited about "docker" all the things.


It's all about simplifying deployment. That's it, that's what's so good about using containers.

I don't know if you understand what Docker really is when you say something like "Run only one process in one brand new kernel": the kernel is shared between containers, that's the whole idea. You package the things your application needs and are done with it.

The current problem with containerization is that there are no really good or well-understood best practices; people are still experimenting, which is why it's a big moving target and, consequently, a pain in the ass if you need to support a more enterprise-y environment. You will need to be able to change and re-architect things if the state of the art changes tomorrow.

I agree with your sentiment about going overboard on "docker all the things"; that's dumb, and some people do it more because of the hype than from understanding their needs and picking a good solution for them. But I think you are criticising something you don't really grasp, based on these two statements:

> "Run only one process in one brand new kernel"

> you have a kernel in your hand, why the hell would you run only one process on it?

I'm not trying to be snarky; I really recommend you do a bit more research on Docker to understand how it works. Also, Docker doesn't make it a pain in the ass to upgrade apps - quite the contrary, if you do it properly.


Doesn't statically compiling programs solve the deployment issue better? I mean, as far as I can tell Docker only exists because it's impossible to link to glibc statically, so it's virtually impossible to make Linux binaries that are even vaguely portable.

Except now Go and Rust make it very easy to compile static Linux binaries that don't depend on glibc, and even cross-compile them easily.

Hell I think it's actually not even that hard to do with C/C++: https://www.musl-libc.org/how.html

If I have a binary built by Go, what problems does Docker solve that just copying that binary to a normal machine doesn't?


Binaries are one thing, but there are other abstractions that containers bring with regard to networking and storage.

You expose the network APIs of your apps (e.g. open ports), filesystem mounts, variables (12-factor), etc.

Your application becomes a block that you can assemble for a particular deployment; add some environment variables, connect a volume with a particular driver to a different storage backend, connect with an overlay to be able to talk to other containers privately across different servers or even DCs, etc.

It's really all about layers of abstraction for operating an application and deploying it to different environments.

With the latest container orchestration tools, you can have a catalog of application templates defined simply in Yaml and it's very easy to make it run anywhere. Add some autoscaling and rolling upgrades and it becomes magic for ops (not perfect yet, but checkout latest Kubernetes to see new advancements in this space).

With the proper tools and processes, this removes a lot of complexity.


> add some environment variables, connect a volume with a particular driver to a different storage backend, connect with an overlay to be able to talk to other containers privately across different servers or even DCs, etc.

But environment variables already exist without Docker. Volumes already exist, aka partitions. An "overlay network" already exists, aka unix sockets or plain TCP/UDP/etc over the loopback interface.

I'm not trying to be a dick here, it's just that the points you brought up don't really bring anything new to the table. How is this different from just having a couple of bare-metal or virtual machines behind a proxy?

There are some aspects of containerization that are very valuable, but only at certain scales, and the points you brought up make me question whether you perhaps might be over-engineering things a bit.


Those things exist, but you need the "setup" bit to achieve the level of isolation that you want.

For example, volumes: With Kubernetes (on Docker), the lifetime of the volume mount is handled for you. No other containers have access to the mount. Container dies, mount dies. Whereas on plain Linux, mounts stay. You need cleanup, or you need to statically bind apps to their machines, which will seriously limit your ability to launch new machines -- there will be a lot of state associated with the bootstrapping of each node. Statefulness is the enemy of deployment, so really what you want is some networked block storage (EBS on AWS, for example) plus an automatic mount/unmount controller, thereby decoupling the app from the machine and allowing the app to run anywhere.

Environment vars are inherited and follow the process tree, so those are solved by Linux itself.

Process trees also handle "nesting": Parent dies, children die. But you will end up in a situation where a child process might spawn a child process that detaches. This is particularly hard to fix when a parent terminates, because the child doesn't want to be killed. Now you have orphaned process trees. The Linux solution is called cgroups, which allows you to associate process trees with groups, which children cannot escape from. So you use cgroups, and write state management code to clean up an app's processes.

I could go on, but in short: You want the things that containerization gives you. It might not be Docker, although any attempt to fulfill the principles of containerization will eventually resemble Docker.


It's about the automation of these things.

You now have generic interfaces (Dockerfile, docker-compose, Kubernetes/Rancher templates, etc.) to define your app and how to tie it together with the infrastructure.

Having these declarative definitions makes it easy to link your app with different SDN or SDS solutions.

For example, RexRay for the storage backend abstraction of your container:

http://rexray.readthedocs.io/en/stable/

You can have the same app connected to either ScaleIO in your enterprise or EBS as storage.

We are closer than ever to true hybrid cloud apps, and it's now much easier to streamline the development process from your workstation to production.

I think it's pretty exciting :)


We are closer than ever to true hybrid cloud apps, and it's now much easier to streamline the development process from your workstation to production.

This sounds exactly like the "It's the future!" guy in the original post...


Have to admit, as a fellow Go dev with single-binary static compiles, I don't really GET why I need Docker... all it seems to offer is an increased workload and a complicated build process.


You don't, really, but tools like kubernetes, which are really useful if you're deploying a number of heterogeneous apps, expect a container format, as they aim at a market wider than just golang. The overhead of putting the service inside docker and following 12-factor is minimal and largely worth it, but if you're only running a single go binary, you could legitimately go other ways.

Something like kubernetes also lets you abstract away the lock-in of your cloud infrastructure, so while it adds another layer and a bit of complexity, it's again arguably worth the effort if you're worried about needing to migrate away from your current target for some reason in the future.

As a framework it abstracts apps from infrastructure quite well. It's super easy for me to replace my log shipping container in kubernetes and have most things continue to work, as all the apps have a uniform interface.

Nobody's saying you can't build these things without kubernetes, but it definitely gives me more of them than configuration management systems currently do. Personally, I'd rather aim at the framework that handles more of what I need it to do.

Finally, bootstrapping a kubernetes cluster is actually quite trivial and you can get one off the shelf in GKE, so I'm not really sure why I'd personally want to go another route.


In my humble case, Docker solves the problems I have managing the systems on which my application runs (and that's mainly it). A single Dockerfile of 20-30 lines describes a whole system (operating system, versions, packages, libraries, etc.), and, cherry on top, I can version it in my git repository.
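
Something in this spirit (a sketch; the base image, packages, and binary are placeholders):

    # Dockerfile -- the whole system, versioned alongside the code
    FROM debian:jessie                  # pin the OS release
    RUN apt-get update && apt-get install -y \
          ca-certificates \
          libpq5 \
        && rm -rf /var/lib/apt/lists/*
    COPY myapp /usr/local/bin/myapp     # hypothetical app binary
    COPY config/ /etc/myapp/
    EXPOSE 8080
    CMD ["myapp"]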

This is not revolutionary in itself, but having the creation and deployment of a server be 100% replicable (+ fast and easy!) across dev, preproduction, and production environments, all managed with my usual versioning tool, is something I appreciate very much.

Sure, there are other tools to do the same, but docker does the job just fine.


> having the creation and deployment of a server being 100% replicable

The problem of ensuring that upstream dependencies can be reproducibly installed and/or built is, of course, left as an exercise for the reader.


Isolation is a strong argument. You don't want one process to starve another. You can get isolation via one-host-per-service or you can get it using cgroups. Docker sort of gives you both, without the waste of one-host-per-service and with a manageable set of tooling around cgroups.


systemd runs services in their own cgroup by default and gives you control over the resources allotted to those cgroups.
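
For example (a sketch; the service name is made up, and directive availability depends on your systemd version):

    # /etc/systemd/system/myapp.service
    [Service]
    ExecStart=/usr/local/bin/myapp
    # resource limits enforced through the service's cgroup
    CPUQuota=50%
    MemoryLimit=512M
    TasksMax=100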


Yes yes yes. We're nearly 100% Go on the backend and deployment is a breeze. We don't use Docker because it wouldn't give us anything beyond more things to configure. Our CI deploys binaries to S3 and the rest is just as easy.


Namespaced filesystem and networking, just for one. You seem very eager to dismiss a technology you only barely understand.


Namespaced filesystem shouldn't even be a special requirement - your program should use relative or at least configurable paths. I mean, directories are namespaced filesystems.

What networking problems does Docker solve?


Namespaced FS as in chroot.

Your program doesn't see what else is running on the system. This also removes possible conflicts over shared libraries and other system-wide dependencies.

This kind of isolation is not only good for app bundling as a developer, but even more important as an operator in a multi-tenant scenario. You throw in containers and they don't step on each other's toes. Plus, the system stays clean and it's easy to move things around.

Network namespace as in linux network namespace (http://man7.org/linux/man-pages/man8/ip-netns.8.html).

Each container has its own IP stack.
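
You can poke at this directly with iproute2 ("demo" is an arbitrary namespace name):

    # a namespace gets its own interfaces, routes, and port space
    ip netns add demo
    ip netns exec demo ip link set lo up
    ip netns exec demo ip addr show   # sees only its own loopback
    ip netns delete demo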

Containers provide proper abstractions so you can then assemble all of this, pretty much like you use pipes on a unix shell.


It seems to me that you're confusing configuration management and containers...

Deployments, installations, etc. are pretty easy; they're not something containers are actually good at solving. At best you containerize the configuration management itself, which simply makes it harder to work with.


I've been working with configuration management for some years now, alongside working as a developer, so I don't believe I'm confusing them so much as admitting that containers make configuration management and deployment easier. I might not have been so eloquent on that point, but it's my feeling from using Docker for the past 2 years.

Nowadays all I do is set up a barebones CoreOS instance and fire containers at it, be it with kubernetes (in which case my config management is a bit more robust, so as to set up k8s on CoreOS) or just with CoreOS's own fleet if it suffices.

Then I get the goodies of containerization such as process isolation, resource-quotas, etc.

Like I said: it isn't painless, sometimes much the opposite, but it's worked much better for the lifecycle of most of the products and services I've been working on the past couple years.

Even before, with automated deployments, it wasn't so easy once configuration began to get hairy. And yes, you can argue that this might be a smell of something else, but that's what I've seen happening over and over.


Docker containers don't contain a kernel. A container isn't anything special -- it's "just" a namespaced set of processes that are isolated from the host system. If you run "ps" on the host, you will see all the containers' processes.

One process per container is perfectly fine. In fact, that's the common use case. There is absolutely nothing wrong with it, and there is practically zero overhead in doing it.

What you gain is isolation. I can bring up a container and know that when it dies, it leaves no cruft behind. I can start a temporary Ubuntu container, install stuff in it, compile code in it, export the compilation outputs, terminate the container and know that everything is gone. We do this with Drone, a CI/build system that launches temporary containers to build code. This way, we avoid putting compilers in the final container images; only the compiled program ends up there.

Similarly, Drone allows us to start temporary "sidecar" containers while running tests. For example, if the app's test suite needs PostgreSQL and Memcached and Elasticsearch, our Drone config starts those three for the duration of the test run. When the test completes, they're gone.
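
In Drone 0.5-style config that looks roughly like this (images and the test command are illustrative):

    # .drone.yml -- sidecars exist only for the duration of the run
    pipeline:
      test:
        image: golang:1.7
        commands:
          - go test ./...

    services:
      database:
        image: postgres:9.5
      cache:
        image: memcached:1.4
      search:
        image: elasticsearch:2.4

When the run finishes, the sidecars are torn down with everything else.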

This encapsulation concept changes how you think about deployment and about hardware. Apps become redundant, expendable, ephemeral things. Hardware, now, is just a substrate that an app lives on, temporarily. We shuffle things around, and apps are scheduled on the hardware that has enough space. No need to name your boxes (they're all interchangeable and differ only in specs and location), and there's no longer any fixed relationship between app and machine, or even between app and routing.

For example, I can start another copy of my app from an experimental branch, that runs concurrently with the current version. All the visitors are routed to the current version, and I can privately test my experimental version without impacting the production setup. I can even route some of the public traffic to the new version, to see that it holds up. When I am ready to put my new version into production, I deploy it properly, and the system will start routing traffic to it.

Yes, it very much is the future.


It comes down to not wanting different applications (not equivalent to processes) to share a single filesystem and all that implies like shared dependencies.


Docker contains the effects of sucky programming to a single container. If your programs follow best practices, systemd is just as good.
