How to avoid picking the wrong technology just because it's cool (bradfieldcs.com)
398 points by scarhill on June 7, 2017 | 158 comments

I used to work at Standard and Poors. They do media monitoring for lots of financial companies.

When I joined, I thought they were using state-of-the-art tools to monitor millions of sources.

Turns out they just had a team of thousands in India that manually visited certain websites to check for press releases, transcripts, earnings reports, etc., and pushed them to a database if they found them.

Related: I know a company that does analysis for soccer games. I thought it was state of the art too, until they told me they just hired a bunch of people from China who logged every move (distance run, kicks, etc.) from everyone and also put it in the database.

go on...

Don't know about the China angle but the life of a soccer analyst isn't all it's made out to be:

- https://www.pastemagazine.com/articles/2017/01/the-secret-so...

- https://www.pastemagazine.com/articles/2017/01/the-secret-so...

There's plenty of other blogs / articles out there now that people are trying to "Moneyball" soccer a bit more. Like anything that has a tinge of glamour and a lot of money around it, you have to take poor earnings to "pay your dues" in the trenches. Same as other aspects of pro sports, entertainment, etc. Hopefully something motivates you other than money.

I was curious about the logging system for that kind of data. Now that I know the name of the position, I did a search and found that most systems use people to input the data. Thank you.

Well, that makes sense, doesn't it? Back in "the day" when certain US companies were trying to convert all the Whitepages and Yellowpages nation-wide into CD-ROMs, they'd ship phonebooks from each zip-code to China to have them triple entered (error correction), and then that's where the nation-wide CD-ROM phone-books came from. This is the same deal, no?

this somehow made me laugh so hard. made my day~

laugh, and then cry

@AznHisoka I would like to hear more about this! This is exactly a problem we are trying to solve (http://evolution.ai)

One cannot even imagine how many people in the world have the boring job of looking at a Word document and copy-pasting some of the data into an Excel spreadsheet. The other half have a job of looking at some document and effectively making yes/no decisions all day.

Disclaimer: I am the CTO.

Sounds similar to Scale (https://www.scaleapi.com)

I'm sure they'd rather no job than a dull job... Oh wait.

I wonder if that could be crawled to save money.

Accuracy. Every solution I've seen that relies on automatic crawling will eventually hit a parsing error when someone changes the sentence structure of a press release.

It's not so obvious when you're looking at the breaking releases for a few stocks or companies, but historical records have at least 1 error per stock per year.

So split your stream:

1. Data matching expectations (you do have a definition of correct, right?)

2. Log for manual review -> manual inserts or corrections, placed back into the queue for (1)

Monitor (2). When inserts start trending up, it may be time to update your processing logic.
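A minimal sketch of that split, assuming made-up field names and a toy validity rule:

```python
from queue import Queue

def matches_expectations(record):
    # Hypothetical "definition of correct": the fields we expect, all populated.
    required = {"ticker", "date", "eps"}
    return required <= record.keys() and all(record[k] is not None for k in required)

accepted = Queue()  # stream (1): data matching expectations
review = Queue()    # stream (2): logged for manual review and correction

def route(record):
    if matches_expectations(record):
        accepted.put(record)
    else:
        review.put(record)  # a human fixes it, then re-queues it into (1)

for rec in [{"ticker": "ACME", "date": "2017-06-07", "eps": 1.25},
            {"ticker": "ACME", "date": "2017-06-08", "eps": None}]:
    route(rec)

# Watching review.qsize() trend upward is the signal to update the parser.
print(accepted.qsize(), review.qsize())
```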

I came up with a similar idea for a company several years ago where we had a team of people doing data entry from faxed documents. I wanted to build something that would do all the OCR it could and then display it to users to verify, which should have been a 10 times efficiency increase, not to mention speed and accuracy.

The idea was rejected; they wanted either a perfect solution or nothing. I don't know why, but for some reason the idea of computers removing humans was acceptable to management, but computers augmenting humans wasn't.

Right, it depends on what the reward for high precision vs. high recall is.

Would writing and maintaining the crawler be less expensive than paying the small army of third world employees to manually check pages all day?

The short answer is that it would depend on the value of any precision loss that occurred and whether or not the shift would disrupt any other systems, be they social or technical, within the company. But with that in mind, there's also the issue of whether or not the money saved would be worth it. It's very possible that S&P simply has enough money to play with that any savings from swapping to an automated solution for media monitoring aren't worth the potential disruption to the company workflow.

Or just set up Google Alerts.

financial companies usually use the worst formats to extract info from.

Woah there cowboy

Humans are the cheapest machines.

The author's acronym is silly, but it's a real problem. Soylent liked to blither about their "infrastructure", for a product that sells a few times per minute. They could be using CGI scripts on a low-end hosting service and it would work fine.

Wikipedia is some MySQL databases with read-only slaves front-ended by Nginx caches and load balancers. That seems to get the job done. Wikipedia is the fifth-busiest website in the world.

Netflix's web site (not the playout system) was originally a bunch of Python programs.

The article mentions a Postgres query that required a full table scan. If you're doing many queries that require a full table scan, you're doing something wrong. That's what indices are for.
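The before/after is easy to see with SQLite's EXPLAIN QUERY PLAN (table and index names are made up here, and the exact plan wording varies by SQLite version):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (ticker TEXT, day TEXT, close REAL)")

def plan(sql):
    # The last column of an EXPLAIN QUERY PLAN row describes the access strategy
    return con.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

q = "SELECT close FROM prices WHERE ticker = 'ACME'"
before = plan(q)  # a full table scan, e.g. "SCAN prices"
con.execute("CREATE INDEX idx_ticker ON prices (ticker)")
after = plan(q)   # e.g. "SEARCH prices USING INDEX idx_ticker (ticker=?)"
print(before)
print(after)
```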

In fairness to Soylent, their marketing revolves entirely around the perception that they're a tech company and not one of many meal replacement shakes. The problem they were solving might have been "Techcrunch hasn't written an article about us recently", which an over-engineered stack does actually fix.

Yeah, you can't buy that kind of publicity.


I remember reading a scaling-out article from some startup. Some of the things felt a little over-engineered, some were impressive, some seemed wrong. But then they get to the point where they brag about their scale, and the metric they used was that they can handle thousands of requests.... per day.

This is hilarious and sad at the same time. However, most of these write ups are aimed at attracting talent. Even more, some tech stacks are deliberately built to attract talent when the core domain is just too simple or boring. "We serve user subscriptions and recipe data from an SQL database using Rails" just doesn't sound as snappy as the infra-porn on the blog.

Isn't that kind of thing a real red flag for the kind of talent you'd want to attract? If someone told me they'd built a GPU compute cluster for their phpbb based social club forum, I'd think they were an idiot and not want to work with them.

I am sorry if I'm completely off base but I'm still thinking about the danluu page on options v cash which quotes this


Maybe startups don't want the absolute best of the best but rather the best of the gullible?

Edit: and/or poor

So you're thinking that kind of technical mark-missing is the startup equivalent of the typos and other glaring errors in email scams? They're to weed out the people smart enough to be a problem?

Absolutely. I'm running into this problem constantly in the front-end dev world. Most developers I talk to are way too eager to list off a million cool techs that they're familiar with, most of which are completely inappropriate for the job at hand.

If I'm interviewing a candidate, I'm not interested in which tools they know how to use. I'm interested in whether they understand the problem those tools solve and when to apply them. Over-engineering is problem #1 with the community at large in front-end dev at the moment and you'll go a long way as an engineering team if you make it your priority to defeat that mentality.

If I'm ever in a position where I need to attract talent, that would not be the kind of talent I want to attract.

I'd want grumpy, paranoid sysadmins who really don't want to be woken up late at night because something has gone wrong.


You'd think you'd want to attract good engineers regardless of skin color or gender but what do I know?

Eh, there are benefits to a heterogeneous environment that aren't always obvious. For example, not accidentally making racist software.[1] ;)

1: http://www.kotaku.co.uk/2017/01/12/how-we-accidentally-made-...

Oh my god. This checks off all the boxes for companies I would never work at.

Starting with this nugget:

> Our gateway is on the frontline of our infrastructure. It receives thousands of request per day.

LOL! Wat???!!! Thousands of requests per day. Thousands. Not per hour or per second. Per day.

Call out the big guns, people! We need cutting edge enterprise-scale architecture! A boring rails app or a flask app will never be able to handle this kind of scale!!

I don't even need to go farther down the checklist than that. You're putting all of that complexity and infrastructure and problems-waiting-to-happen behind an API that serves thousands of requests per day.

Yeah, I'd rather be homeless than work on that team. Jesus.

Shhh... 3 guys and a Flask app don't employ as many people as a large team drawing diagrams on whiteboards and working on "infrastructure".

Next you are going to be criticizing 3GB big data databases with dedicated sys admins.

Sometimes I wonder how much of this is some type of "jobs program". A transfer of money from the financial sector back to the grunt sector via VC.

3GB database might even still be appropriate SQLite territory, especially if you're not super write heavy. :)

Exactly what is wrong with encouraging more women and minorities into the industry and making them feel welcome at a company? I assume the ideal workplace for you is just a group of 20-30 year old white males.

And in your "expert" opinion, where did they go wrong here? What is the correct number of articles they should post on any particular subject in order to get the HN tick of approval?

What? I was just remarking that their engineering blog seems to be more recruiting focused than focused on the actual technical achievements of the company. I was not at all commenting on whether those posts were good or bad.

I'd wager the vast majority of company tech blogs are maintained with exactly that goal in mind. And in my opinion that's totally reasonable. Done well, it helps people understand more about the people, technical philosophy, and problems faced by that team.

I'm not the op, but I would charitably assume that they meant that those blog posts demonstrate that the blog's strongly geared toward attracting talent.

Your attitude frankly sucks.

How can it be a bad thing for developers to write about what they are working on? Not all developers are genius experts who suddenly woke up one day and instantly knew how to perfectly architect a system. People learn, people grow, people make mistakes, and most importantly people need feedback.

Blogs like this are fantastic (a) to learn what other people are doing and (b) for developers to have an opportunity to show off and be proud of what they are working on. Especially since the rest of the company in most cases couldn't care less.

It's great to write about your tech. But unless blogging is your business model, it's not great to pick your tech so you can write about it.

ergh, you weren't kidding. I hope that was a typo on their part; even low thousands rpm is mostly not a big deal on modern hardware.

   Our gateway is on the frontline of our infrastructure. It receives thousands 
   of request per day, and for that reason we chose Go when building it, 
   because of its performance, simplicity, and elegant solution to concurrency.

For all we know the blog is a work of fiction and they're running off a laptop...

Holy shit! I almost didn't believe you.

Then again, if they didn't have that beefy infrastructure, they could get DDOS'd from up to multiple request per minute! I mean you can't be too careful when you are in that league!!1!

Also interesting (in the same way as rubbernecking a car crash) is going to their homepage and viewing source.

'Dashboards around the office show how the system is performing at any given time.'

Probably was someone's full time job for a couple months.

When was this written? Why is there no year in the date?

Granted, HDMI can be infuriating, but months?

More seriously, you can set this up with KPIs in a day with Zabbix. A week, if you've never dealt with Zabbix, which is admittedly somewhat complex compared to some of the others.

Depends on what you're trying to accomplish.

But setting up a data warehouse that builds out the KPIs you're looking for along with a dashboard to display them (the easy part, for sure) can certainly take over a month.

...with only 10+ high-CPU large machines.

With that level of traffic, they could run their entire frontend from a smartphone...

Why did you have to type that. Now in my free time there's a 50/50 chance I am going to waste minutes or even hours looking into the viability of running a postgres/sinatra/nginx app on a phone.

Something about that appeals to me in a huge way. Showing people my "server room", etc etc.

Attach the phone to a small weather balloon, and you can have a real "cloud hosted" setup.

There's a bunch of httpd apps on the Play Store, many complete with database engines and all that.

Plot twist: Run it from a flip phone.

Phones are expensive. Raspberry Pi.

I am surprised they are still up, given the amount of attention from the HN community ;-)

I think their CDNs are saving their 10+ High CPU servers from the load.

I didn't want to share the original article, but you found it :) Then again, Google search doesn't turn up many other results bragging about thousands of requests per day.

Be honest, were they hosted on their founder's smartphone?

Great idea! I should try this for my next project ;-)

At the other end of the scale, an acquaintance of mine is a sysadmin for a government department and, to his frustration, can't get their web framework changed (politics...). He gave an example of the websites he has to look after, one being a boring legal website ("so you can imagine how low the traffic is already...") where the software could barely serve one request per second!

Hey, that was a major milestone for a RoR app.

> Wikipedia is some MySQL databases with read-only slaves front-ended by Nginx caches and load balancers.

You're not wrong, but they are slightly more complicated than that: https://meta.wikimedia.org/wiki/Wikimedia_servers

They have Kafka, ElasticSearch, some workflow engine thing, and they manage a distributed object store (Swift) deployment.

Apparently in Wikimedia's case they also have a stated company policy / philosophy around self-managed infrastructure that prevents them from using public cloud for a bunch of stuff (I've heard this from a former employee but can't find any further documentation about it).

>Netflix's web site (not the playout system) was originally a bunch of Python programs.

seems that the bunch has grown slightly since then:


(from https://medium.com/netflix-techblog/announcing-ribbon-tying-... )

Or this view https://image.slidesharecdn.com/devopsisraelpdf-141030000609...

(from https://blogs.mulesoft.com/dev/microservices-dev/api-approac... )

This happens all the time on StackOverflow/ServerFault. "What sort of server cluster do I need? I have thousands of hits a month!" Usually the answer is "an average smart watch could do it".

Yes, forget the acronym.

It's simply, "think for yourself." Do you ever talk to someone on a topic (in technology or politics) and it feels like you're talking to list of commonly accepted prepackaged talking points?

> Netflix's web site (not the playout system) was originally a bunch of Python programs

That is fascinating; I would love to read some articles about this.

Queries that require full table scans? What are we talking about? WooCommerce?

Hey! Those all have indexes now. :)

Good to know!

Working on a migration from a site that modified the WooCommerce core to a new ecommerce platform. Couldn't believe WooCommerce ever did that when I started.

Congrats on such a widely used platform though.

Wikimedia actually uses Kubernetes, but not on the serving path (yet?):


Kubernetes is essentially platform as a service. The parent didn't address the backend infrastructure, only the services serving the web content.

That's just for the Tool Labs (tools.wmflabs.org).

Nothing I've built has ever needed anything more than a $5 Digital Ocean droplet, and one of my services gets around a thousand requests a second at peak. Purely anecdotal, and I'm not doing anything CPU-intensive, but I really feel startups are overdoing their infrastructure.

Startups? I work for a finance firm, and while we certainly have a need for large farms of servers to store data, my current team keep talking about web request latencies as an important infrastructure concern when the literal maximum number of users is in the tens of thousands.

Which is, you know, 1 maybe 2 machines with nginx plus 1 for redundancy. Our internal services are slow because people cohost them with batch jobs, and no other reason.

It's so frustrating, because when it all boils down to it, the story is really quite simple:

1) What metrics reflect your customer experience? Monitor them, alarm on them. Target them as a priority for improvement. Check in on them weekly at a minimum.

2) What metrics make up the metrics that reflect the customer experience? Monitor them as well, and consider whether alarming on them is the right thing to do. Use these to direct what you target to solve 1.

3) What do you need for SPOF resilience? If you've got something critical, you need a minimum of two servers running it, and no more than, say, 40% total CPU usage so that you've got sufficient overhead to cope with a single host failure and any unexpected work increase.
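That 40% ceiling is simple arithmetic on what each survivor absorbs when a host dies; a quick sketch (assuming traffic redistributes evenly):

```python
def post_failure_load(per_host_load, hosts):
    # CPU load on each surviving host after one host fails,
    # assuming the total work redistributes evenly.
    assert hosts >= 2
    return per_host_load * hosts / (hosts - 1)

# Two hosts at 40% each: lose one and the survivor runs at 80%,
# leaving 20% headroom for an unexpected work increase.
print(post_failure_load(0.40, 2))  # 0.8

# At 55% each, a single failure demands 110% of one machine: an outage.
print(post_failure_load(0.55, 2))
```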

It isn't just startups. And I agree with another commenter: it seems like resume driven development.

There's also a weird peer pressure involved. I overheard a conversation a while back that summarizes it nicely: someone was talking about a scheduling system they've used for years, and mentioned it was written in Perl. Another participant guffawed, and after the requisite Perl-bashing, the original person allowed that, yes, even though it worked fine, they should rewrite it.

No idea what company that was, but I'd love to work in a place where that was the most pressing concern on my plate.

I've encountered this sort of thing too many times here in the last few years. A set of Python scripts to ETL a few hundred MB of data once daily is suddenly desperately in need of replacement with a real-time, fault-tolerant MQ backed server (I've seen more than one of these, from custom ZMQ proposals to Kafka). A set of two or three scripts run on a compute server that are kicked off by cron and that do some work in batch on a modest set of data (few hundred GB) suddenly must be replaced with an asynchronous task management service capable of managing a huge set of processes with complicated dependencies (Jenkins, Luigi). A bit of data one team produces (once daily) has another team asking for it: instead of exporting the data from the DB directly or standing up an endpoint on, say, a Flask server there's a drive to slap a Spring microservice architecture in place. And so on. It's really a sight to behold, the rationalizations people go through who, in other contexts, laugh at the inferior logical reasoning abilities of non-CS folks.

Well...there's a lot of context in "works fine," from "it is perfect and there's nothing to add to it" through "good enough" to "it sort of works within most expected inputs, but everyone is afraid to touch it, so we've built a bunch of hacks around it."

Were all of these people competent?

Honest question. I have only seen the "it's written in X, so therefore it must be re-written in something nicer even though it is working" thinking from incompetent people who were just trying to take ownership of something they didn't quite fully understand. I have never seen it in an appropriately functioning commercial setting; if management is competent, they'll immediately recognize the high costs with no concrete benefit and say no.

It's one thing to say "we have to re-write this because it uses Java applets, and Java applets are problematic because Oracle is dropping support for them, so our customers are going to be screwed soon if we don't do something." It's another thing to say "we have to re-write this because it's in Perl because Perl is something I don't like."

I've seen this situation multiple times, and yes the developers involved were competent. They were even well-meaning, and wanted to build something for the benefit of the company, not just their resumes.

I think the tendency to over-engineer and over-polish comes mostly from getting too invested in one particular project or task. The developers have "professional pride" - they want to deliver software that has good architecture, high test coverage, easy to understand and maintain code, reliable, scalable, etc.

This means competent developers are very tempted to continue working on a project as long as there are possible improvements to it, even if these improvements do not make business sense. Nobody wants to admit that "cron job that fails once per month" is a sufficient solution when they can see a better solution, and go work on the next hacky cron job instead.

To your last line -- I've got a PHP sub-system that I would love to have re-written to match our chosen stack language set, maybe simply because we could then have fewer required skills on the team (few of us are PHP guys). But for all its warts, it works. And that matters. I am thus hesitant to re-write for no other reason than 'we hate it'. It would be a lot of work for zero functional gain. (All benefits would be non-functional - and that stuff gets put on the back burner)

OK, but the other side of that is all of us looking at resumes and going "wow, that's cool that this person built that" instead of "that sounds overengineered and silly".


1. This is why you ask in the interview about it. 2. I don't know about you, but if I see certain buzzwords, I assume that the candidate merely chases the latest fads and subtract points accordingly until counter-evidence is produced.

Who is "us"? These days if I see things like that on a resume I'm more likely to be skeptical than awed.

The point is the resume building is done for the purpose of impressing the people reading the resume. If it didn't have that effect then nobody would bother resume building.

You're not using Rails, are you? What stack do you use to handle thousands of requests per second on a 512MB VPS?

I don't know about the specific overhead of Rails. But if you had a uniform distribution of 2000 requests a second, and each request took up to 50ms, then you'd only have 100 requests in flight at any given time. So each request could use up to 5MB, which is generous unless you're doing something explicitly fat.
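That back-of-the-envelope estimate is Little's law (L = λW); spelled out:

```python
def concurrent_requests(arrival_rate_per_s, avg_latency_s):
    # Little's law: requests in flight L = arrival rate lambda * latency W
    return arrival_rate_per_s * avg_latency_s

in_flight = concurrent_requests(2000, 0.050)  # 100 requests in flight
per_request_budget_mb = 512 / in_flight       # memory budget on a 512MB VPS
print(in_flight, per_request_budget_mb)       # 100.0 5.12
```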

5 ms isn't generous at all for a backend.

You mean 5 mb? I can't say I have a ton of experience. But of the three projects I've worked on, the fattest was a Vaadin CRUD app, which stores the user session/state on the server and it could be around 5mb. More context: This isn't counting any DB effort.

Working in the D.C. area has given me a high tolerance for acronyms and backronyms (seriously: P.R.O.T.E.C.T. Act stands for "Prosecutorial Remedies and Other Tools to end the Exploitation of Children Today").

U.N.P.H.A.T. does bring a smile to my face for trying, but if the author is reading: I'd suggest changing it to a prescriptive paragraph where the first word in each sentence becomes a letter in the acronym (e.g. B.A.M.C.I.S.).


Here's my best try:


Understand the problem.

Nominate multiple solutions.

Prepare by reading relevant research papers.

Heed the historical context.

Appraise advantages versus disadvantages.


Drat, it's too late to edit my comment and take it out, but "Consider candidate solution" wasn't meant to be included in the acronym. It was part of my brainstorming.

Ok we took that out for you.


Tupac was the master of this

GHETTO: Getting Higher Education To Teach Others

BAMCIS: Begin Planning Arrange Recon Make Recon etc...that BAMCIS?

Yep. That one.

You gave me a flashback dude!

A lot of this just sounds like Resume Driven Development, not people thinking they're Google or Amazon.

I was thinking the same thing. I wonder how much of it is due to the way we make devs work and the hours we make them keep.

For example, I'd be more than happy to use the same old, tried and true, boring tools to just get the job done, if it meant that I could then go play golf or otherwise not be in the office.

But if you insist that I be in my seat 8 hours a day regardless of workload, then goddamn let's take this shiny new tool for a spin!

Do I want my resume to show that I used the same tool for every job for the last ten years? Or do I want it to show some new hotness?

The industry and employers are as much to blame for this as the engineers, if not more. When you use middleman firms to find your employees, and all they understand is buzzwords, well then guess what game the devs are gonna play?

Fellow golfer and tinkerer - we are a product of the job markets. Hot employers typically want new and shiny on the resume in addition to fundamentals, seems like everyone is playing the same game...

I get the feeling that my skillset is becoming outdated.

Fact is, I know Django inside out, and plenty of Python libraries. It's very rare that I'll find something that requires me to learn a new language or tech (I will likely get things done a fair bit faster using the tech that I do know). Anything else feels like resume-driven development.

I recently worked at a company with a working legacy batch system and an ongoing update to Python and SQL with newer tools. It processed large data, but it fit on a big server. 3x the amount of people required.

They replaced it with Spark and Scala. Rewrote the whole thing and bumped into the same previously fixed problems for 8 months. It was not faster and it was a pain to extract the result data for the online website. Oh, and they had only 1 server running Spark. On top of the existing one. They also dumped IRC with very useful plugins for proprietary Slack channels.

Yeah, their resumes are very competitive now with the Big Data hype. Worst part is the higher ups liked it because they had more keywords in the bag for investors.

We are living in very strange times.

This to me is the real problem, especially in a larger company. I mentioned "Resume Driven Development" at work the other day and got the response from a peer of "that's how it should be."

Convincing others to not chase the shiny new toy is harder than picking the right technology. The other posts under this parent are right: we are a product of the job market.

I am fortunate, in that I got a lesson in not over-engineering things very early in my career.

My first programming job was a 3 month contract at the maintenance department of an international airport. They had a bunch of information in large, unwieldy ERP system and wanted to automatically generate job sheets for the different maintenance crews. So I did the simplest thing possible - I generated an excel file from the ERP system, then using that file as input, I outputted different excel worksheets for the different crews.

It was a very plain GUI app that had one or two buttons. I remember being a bit worried that it wasn't nearly fancy enough for 3 months' work, but everyone seemed pretty happy with it.

Later on I found out that - before me - they had hired an experienced software developer who had worked on the same problem for 6 months, and at the end of that 6 months had apparently not produced a solution. I had done the dumbest, simplest thing - not because I had any insight or wisdom, but because it was really the only thing I had the skills to do. But I delivered.

It was a brilliant, accidental first lesson in not over-engineering.

> I generated an excel file from the ERP system, then using that file as input, I outputted different excel worksheets for the different crews.

As a complete aside, you might be surprised how far you can go with Excel these days. Do you know it has a built-in in-memory columnar database now? You can have millions and millions of rows of data in there that you can use in tables and charts completely independently of the size of the grid. Pull back a huge chunk of data from the DB and slice and dice it to your heart's content locally.

I look at people buying expensive "business intelligence solutions" and I think, it's right there on your PC all along and you don't even know it...

The problem is people using Excel for everything that it shouldn't be used for.

"Throwaway" Python code winds up becoming part of real systems all the time, but we don't blame Python for that.

Same here, I've watched multiple devs come into my first job with agendas of extreme over-engineering and unnecessary rewrites. I think the company is actually pretty lucky that they haven't gone under from all the damage.

I've seen this in action. Using code generators to convert XML configuration to a few API endpoints. Or using a DSL/rules engine because you don't want to write code. Or having APIs that hit other APIs ad infinitum when the whole thing runs on one server because "microservices are the only right way". The result was we spent time gluing together what was already a monolith disguised as microservices, rather than adding features the customers wanted.

More recently I had to solve time drift on 1000s of devices. The problem was someone installed puppet to manage those devices which uses NTP. The devices are behind firewalls so if they block the puppet master or mess with SSL puppet doesn't even phone home. Or worse it gets incorrect time from NTP peers on the network. The solution was to throw out the shiny tool "puppet" and just call "date". Puppet and NTP are great in theory for getting time down to the millisecond but totally backfired when some devices were off by over 24 hours. For our purposes as long as all devices were within 5 minutes we were good. The irony was after disabling NTP puppet just started it again. And we couldn't use puppet to fix that since 50% of our users had it blocked. No other choice but to throw out puppet and start over from scratch. The guy who spent months setting up puppet was not happy.

The real issue is why the firewalls were randomly blocking the puppetmaster and/or NTP, and why the Puppet SSL stuff stopped working (apparently randomly?).

Everyone involved sounds like they need a lot more experience.

With all due respect, you don't know the real issue. Your response is the same thing the guy who installed Puppet said to me... just have them unblock it.

Our sales pitch is "these devices use plain HTTP and will work behind your corporate firewall". The blockage wasn't an issue that could be solved; it was our whole business model to work around the blocks by using simple HTTP instead of HTTPS, proxying everything through our IP, and things like that.

Even the puppet documentation says not to run a puppet master when you have devices that are behind firewalls or limited network. The guy who added puppet apparently didn't read that.

I wasn't the one who decided the business model, just the guy who fixed it to work as advertised while dealing with the pressure of everything crashing and burning. You're right, no one had experience, but that's not the point.

My point was that the fancier tools sometimes just add new issues without solving your real issue. Despite my lack of experience, I solved the time drift using the Linux built-in "date" to set the date and time. It doesn't account for network lag like NTP, and an NTP developer would probably laugh at my solution, but now all devices are accurate to within a few minutes, and that particular problem is solved. So don't always go for the most complex tool, is all I'm saying.

For what it's worth, I do plan to bring back puppet, but run it in "puppet agent" (offline) mode. We'll use custom scripts to copy in new puppet configs so puppet does not need to phone home.

It's worth knowing that NTP's round-trip delay compensation is not at all magic — it works like this (see RFC 5905, section 8):

1. Client records "origin" timestamp T1 when it sends a query to the server.

2. Server records "receive" timestamp T2 when it receives the query from the client.

3. Server records "transmit" timestamp T3 when it sends a response back to the client (the response includes T2 and T3).

4. Client records "destination" timestamp T4 when it receives the response from the server.

Round-trip delay = (T4 - T1) - (T3 - T2)

And that's it. You could do this with "date" too!
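To make the arithmetic concrete, here's a toy sketch in shell with made-up integer-second timestamps. (The offset formula is the companion to the delay formula in the same section of RFC 5905; it isn't quoted above.)

```shell
#!/bin/sh
# RFC 5905, section 8 arithmetic, with made-up timestamps:
#   delay  = (T4 - T1) - (T3 - T2)
#   offset = ((T2 - T1) + (T3 - T4)) / 2
ntp_delay()  { echo $(( ($4 - $1) - ($3 - $2) )); }
ntp_offset() { echo $(( (($2 - $1) + ($3 - $4)) / 2 )); }

echo "delay:  $(ntp_delay 0 6 7 3)"    # query spent 2 s on the wire
echo "offset: $(ntp_offset 0 6 7 3)"   # client clock is 5 s behind the server
```

With T1=0, T2=6, T3=7, T4=3, that's consistent with a client running 5 s slow and 1 s of latency in each direction.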

I would love to get into it more deeply as you continue to supply details! :)

distributing hardware outside of your network and using puppet in master/client mode is obviously a bad idea, just like having any dependency is difficult to manage (sometimes like NTP)

However, clocks will drift. Consider ntpdate in a cron or an easier-to-manage sntp client vs ntpd, which is a little nutty.
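For instance, something along the lines of this crontab entry steps the clock hourly without keeping a full ntpd running (the pool hostname is just a placeholder, and ntpdate is deprecated on some distros in favor of sntp or chrony):

```shell
# Example crontab line: step the clock once an hour with ntpdate.
0 * * * * /usr/sbin/ntpdate -u pool.ntp.org >/dev/null 2>&1
```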

So the point is that a tool like puppet, if properly configured, is probably a great asset for your use case of distributing hardware, as it can help keep things working as expected.

Yes, Puppet solved one problem: how do I add a cron job to all devices, and retry it if it failed, without adding it twice on devices where it worked? Puppet is amazing. It solved that problem.

But then it created a whole new world of problems since it violated our business model to have it phone home. Thanks for the suggestions on NTP. We'll likely add features that do require more accurate time in the future & your suggestions will probably come in handy!

You can use date to set the time, but you must use some other way to get the correct time to set. So you essentially had to re-implement NTP. Your solution was the more engineered one, not his; he used the standard tool for syncing clocks.

I'm not saying your option wasn't correct (if NTP didn't work, you can't use it), but I don't think it's a good example of the errors described in the article.

> you must use some other way to get the correct time

When the devices phone home to our IP (which is whitelisted in all their firewalls), I compare against the server time & set the device to the server's time if it is off by more than 5 minutes. Our server, in turn, uses NTP (which is not whitelisted on the devices' networks). The devices will still be off by 30 seconds or so if there is a network delay. On one hand we could have just told them to whitelist NTP; on the other hand we tried that & you get one department blaming another department, or worse, they just don't return our calls. Plus we originally told them they only needed to whitelist one IP.
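The decision part of that check can be sketched in a few lines of shell. This is just an illustration under the assumptions above (the 300-second threshold is the 5-minute window; the function name and the way the server's epoch time arrives are made up):

```shell
#!/bin/sh
# check_drift SERVER_EPOCH DEVICE_EPOCH
# Prints "reset" if the device clock is more than 5 minutes (300 s)
# away from the server's clock, else "ok".
check_drift() {
    drift=$(( $1 - $2 ))
    [ "$drift" -lt 0 ] && drift=$(( 0 - drift ))   # absolute value
    if [ "$drift" -gt 300 ]; then
        echo reset          # the caller would then run: date -s "@$1"
    else
        echo ok
    fi
}
```

A caller would fetch the server's Unix time over the already-whitelisted phone-home channel, then run `date -s "@$server_epoch"` only when `check_drift` says "reset".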

> you essentially had to re-implement NTP

No, I called "date -s". I didn't re-implement NTP. My solution doesn't attempt to handle many of the things NTP handles, like compensating for network delay. If I had originally written this, I would have just used NTP but proxied it through our domain. Instead I was called in when things were "on fire" & had to come up with a quick fix.

> Your solution was the more engineered one

That's your opinion. My solution took 15 minutes while he spent months setting up Puppet. His solution resulted in devices being off by multiple days, whereas my solution has proven to keep devices accurate to within +/- 5 minutes.

> he used the standard tool for syncing dates.

What "standard" says I need to use NTP? Surely the "date" command can be considered standard as well.

> don't think it's a good example of the errors described in the article.

It's exactly what's described in the article. Our problem was to do a job at a certain time. No one cared if it happened at 4:21 instead of 4:20. Big companies like Google need more accurate time & control their networks, so ntpd is a good fit for them. I don't need time that accurate & don't control my network, so ntpd was not a good fit. So just because Google does it doesn't mean you/I should, because sometimes you/I are solving a different problem than Google had to solve.

Just to be clear, I'm not in any way criticizing your decision. It seems clear to me that it was correct, from what you've told us. I just don't think it's a good example, that's all.

> Don’t even start considering solutions until you Understand the problem. Your goal should be to “solve” the problem mostly within the problem domain, not the solution domain.

I'd guess that this guideline alone, if followed, would stop 2/3 of JS SPA framework adoptions (and 4/5 of Angular adoptions!).

At least the SPA frameworks themselves have a reasonably common legitimate use. The tooling around them is the major problem. Most projects I've worked on, even complex ones, could comfortably trim from 100 dependencies down to 10 and have the developers working on them be an order of magnitude more productive.

People wilfully wrestle with thousands of functions worth of APIs every day and don't even notice the immense slowdown it's causing them. It's especially bad in React land, which is ironic seeing as Sebastian Markbage at Facebook has an excellent talk about reducing API surface area.

Rule 1: don't use any modern javascript framework.

I'm typing this impromptu, but this article seems to be a quick-and-dirty, informal incarnation of the Architecture Tradeoff Analysis Method (https://en.m.wikipedia.org/wiki/Architecture_tradeoff_analys...).

I suppose another point worth bringing up is that hardware has made some pretty strong advances in recent years, especially with SSDs being widely available. Stuff like that has raised the ceiling on what a single box can do compared to 2000 and earlier, when Google was first building MapReduce.

Pretty sure that's in the article is it not?

"Don't use tensorflow to predict everything"

Many people may not work at Google scale, but many would probably like to work at Google.

We temporarily replaced this article's baity title with the text's more accurate self-description.

If someone would care to suggest a good title—i.e. accurate, neutral, and preferably drawn from the language of the article itself—we can change it again.

FWIW, I think the original title was pretty good. I have had the unfortunate experience of screaming pretty much exactly those words (albeit replace "Google" with a different company from which one of my CEO's advisors came from).

EDIT: how about combine the two?

You Are Not Google: Another "Don't Cargo Cult" Article

I like your proposed title. "Another 'Don't Cargo Cult' article" on its own seems dismissive, when the content seems quite useful for many engineers (the acronym needs more work, though).

Ok, fair point. I've made up a title, even though we hate to do that, because I can't find any phrase in the article that neutrally summarizes it.

New title seems fair. (Also, welcome to the dark side of the editing force, etc etc)

"You are not Google" is a linkbait trope ("you") and also one very tired cliché.

It's tired because it apparently requires infinite repetition.

In other words, it's pointless as well.

What's predictable is intrinsically uninteresting. The OP partly redeems itself by offering concrete ideas, though, which gives the article a bit more substance than usual.

> What's predictable is intrinsically uninteresting.

Information theory aside, I don't agree. Observing repeated patterns of failure and error is very instructive, especially since restatements often help me see it in a better light.

If anything, HN trends heavily towards repetitions of survivor bias. Much more fun to read cool papers and imagine yourself implementing them. But the fellow on the chariot with the red-painted face still needs an attendant whispering "remember, thou art mortal" over and over.

"UNPHAT: one answer to cargo culting"

Regarding "UNPHAT": Is this... serious? Does the author genuinely hope that we will use this acronym as a means to help guide our technology choosing decisions? Is it not their creation and is just something I wasn't aware of yet?

Finally, do these forced acronyms ever help anybody else out there? I mean, seriously: the "N" stands for "eNumerate?" The "P" stands for "Paper," which barely correlates to the actual meaning, "consider a candidate solution."

Seems to me just saying "apply a principle of unfattening your technology decisions" would be a hell of a lot easier to remember.

Checklists and acronyms help some people, so more power to them. But I think the central observation of most of the anti-cargo-cult articles boils down to a simple maxim of "don't use a technology unless you understand what it's bad at".

There are very few technologies that can reasonably be thought of as strict upgrades, and the few that do exist (e.g. MySQL to Postgres) tend to be incremental enough that switching rarely justifies the migration costs. Instead, many solve one or two exceptionally dramatic problems (gargantuan datasets, huge write volumes, partition-tolerant master-master replication, etc.) and are willing to make equally dramatic tradeoffs to achieve it. Saying Technology X is good at Problem Y is only half the story.

In my own practice I've found that forcing myself to stop and explicitly enumerate both the pros and cons of a new technology is usually enough to get my professional intuition to kick in. And 99 times out of 100 it tells me to just use boring old SQL and move on.

Try DMAIC six sigma - Define/Measure/Analyze/Improve/Control

or OSEMN data science - Obtain/Scrub/Explore/Model/iNterpret http://www.dataists.com/2010/09/a-taxonomy-of-data-science/

I doubt it. I think it needs to be three or four letters. (E.g. Always Be Closing. Keep It Simple, Stupid.)


Trying my hand:

- understand the DOMAIN

- find the OPTIONS

- research a CANDIDATE

- know the HISTORY

- consider the ADVANTAGES

- apply deliberate THOUGHT


Hmm. That's a much better acronym, but it makes me come back to my original question - an acronym I guess is a way to make a process (dare I say algorithm) easier to remember, yea? So like, when you're switching to a new technology, the steps are as you listed, which are basically just

1. Understand what your problem is, and the potential solutions to that problem.

2. Use your brain.

3. Make an intelligent decision.

Or really just

1. Use your brain

I mean I just don't see the need here. Could be arrogance, I guess? Is this the very problem the author was trying to solve? Create a defined process for technology choosing?

You've just described the Feynman Algorithm.

The point of most processes is to help stupid people (or, just, people without large amounts of analytical talent) make smarter decisions than their own brains would generate. Processes "raise the waterline" of an organization's aggregate behavioral intelligence, by ensuring that the stupid-est decisions being made are no more stupid than the process.

Processes also frequently serve as checklists, to ensure that smart people aren't being temporarily stupid—"did I check that I have all my surgical tools before closing?" and such.

Maybe simpler:


Don't Over Engineer

(which more or less brings us back to KISS principle)

I like DOE because the inverse is E (Engineer). It illustrates an ongoing challenge in technology, where the first question isn't "What capabilities should our resulting systems have? And what constraints are there on our implementation?" (which would be engineering a solution). Instead we get the question "What other systems out there seem to solve this problem?", or worse, "What other systems have similar inputs and outputs to the ones we have and want?"

Also there is this other question (as I see it):

What CAN this (pre-chosen) something (insert here hardware or tool or programming language or library) do?

Let's use ALL (or most) these functionalities! (because we CAN)

Losing sight of the actual question which should be "What is actually needed"?


Can be confused with Do Over Engineer.

>>DOE >Can be confused with Do Over Engineer.

YSNOE (You Shalt Not Over Engineer) is more imperative but reads worse.

Maybe NOE (Never Over Engineer) would be acceptable.

It seems to me just a complication of YAGNI.

YAGNI is dismissive. That other approach with a comparably silly acronym is an actually useful checklist.


This is just a "my technology stack is better than yours" post like countless others we see daily. Sorry to dismiss it so abruptly but it gets tiring.

If you have the mindspace to be sorry, you have the mindspace to not do it.

And, since you clearly didn't actually read the article: he never says a word about his own technical preferences except "don't pick ones that don't fit your scale and scope".

I think people downvoting tend to ignore the fact that the proposed "optimal" solutions for non-Google companies were at some point novelties themselves. If the same logic were applied, we'd still be using CICS app servers and IMS, and buying terminals.

At some point things change, the new normal changes, etc. The shift we are seeing in some areas also contributes to finally accepting the realities of distributed systems.

The people are probably downvoting because you have missed the point of the article. Nowhere does it say that one shouldn't change things, or adopt new solutions.

Wouldn't that actually require the author to promote their own technology stack?
