Nice, I laughed out loud when I read these two paragraphs:
> Initially, we tried messing with some garbage collector parameters we didn’t really understand, but to our surprise that didn’t magically solve our problems so instead we disabled garbage collection altogether. This increased our memory usage, but our automatic on-demand scaler handled this for us, as the graph below shows
> Today we are making some of the code that we can afford to open source available on our GitHub page. It is useless by itself and is heavily tied to our infrastructure, but you can star it to make us seem more relevant.
Sensible? I guess when you dig yourself into a hole, wanting to move to a shallower hole seems sensible. It showed that they understood neither their requirements nor their chosen tools, not surprising given that their stack is Elixir, Go and Rust, languages that there aren't too many experts for, and such consistent choices are a clear display of inexperience. There's just no way that technical sensibility is a chief concern of a company that goes from Elixir to Go to Rust within a few years.
If they had started with a runtime with better GCs or chosen, say, C/C++ from the beginning, none of this would have been necessary. They clearly choose hype first and everything else second. If that's their technical evaluation process, they are likely to choose wrong the first time, the second time, the third time, and however many times they need to rethink their choices, and every time their rationale would seem sensible when it becomes clear that their previous, ill-advised, choice is wrong, and before their new, untested choice shows its own problems. Making a bad choice and then wishing to fix it by making another using the same process that's led to the first doesn't inspire confidence.
Yes, if they used C/C++ from the beginning, they wouldn't have run into GC issues, but they would run into another whole set of issues exclusive to C/C++ (use-after-free, buffer overflows, annoying threading bugs, etc.)
Just like they've run into issues with Elixir and are going to run into plenty of serious issues with Rust. The belief that your chosen language/runtime does not have some very serious issues is the mark of inexperience. The only thing you can do is know what they are and what you can live with. Choosing a hyped language makes the first part very hard; their history shows that they consistently fail on the second part, too. It looks like it's a very inexperienced team, which tech has plenty of, but turning their blunders into success stories is precisely the kind of amateurism dressed in bluster that this post is rightly ridiculing.
There's an argument that it's good to avoid premature optimization - build your thing with something easy like rails or elixir and then if you run into performance issues rewrite those bits in something faster.
It's not about premature optimization. They clearly didn't know enough about Go when they decided to use it -- partly because few people do, but they did know that -- and they clearly don't know enough about Rust, either. They're stumbling from one hype to another based on sales brochures and bragging about it.
Again, Tech is full of such low-experience teams writing this kind of unimportant software and rewriting it every couple of years, and I guess it's good that someone volunteers as a lab rat for new languages (even if they do it over and over). But they're so inexperienced that they can't even see they're being snooty about their ignorance of their own requirements and tech.
Elixir could be argued as almost-sensible if they made the typical chat product backend decision to bolt some features onto ejabberd and pretend like they're hot shit at designing distributed servers. Though I would still stick with vanilla erlang because if you use elixir without also knowing erlang, you'll find yourself up shit creek without a paddle eventually. But a halfhearted google doesn't even confirm that they built on ejabberd.
I work for Twitch and I've seen that article before, and it doesn't seem to fit the pattern of "we messed with GC and it didn't fix our problem". The solution in the article uses weird features and behaviors of the golang GC and how VMs allocate memory, but it did work very well for them.
See also Hotmail when they started migrating it to Microsoft tech.
The servers were stable for a day or two, so they cooked up a reaper process that would sequentially reboot every machine in the cluster, wait for it to start routing traffic, then reboot the next one. Every machine got at least one reboot a day from this.
The moral authority with which I used to share that anecdote has eroded quite a bit in the years of Continuous Deployment. Most of my endpoints reboot at least once a week, except for holidays where we occasionally discover longevity bugs. With autoscaling some instances will be up for hours at a time (and if we are not very careful we'll completely pollute our server statistics).
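For anyone who hasn't seen one, the reaper described above boils down to a loop like the following. This is only a rough sketch: `reboot` and `is_serving` are hypothetical callables standing in for whatever the real cluster tooling did, and the poll interval and timeout are made-up defaults.

```python
import time
from typing import Callable, Iterable, List


def rolling_reboot(
    hosts: Iterable[str],
    reboot: Callable[[str], None],       # hypothetical: triggers a reboot of one host
    is_serving: Callable[[str], bool],   # hypothetical: True once the host routes traffic
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Reboot hosts one at a time, waiting for each to serve traffic again."""
    restarted = []
    for host in hosts:
        reboot(host)
        deadline = time.monotonic() + timeout_seconds
        # Do not touch the next machine until this one is back in rotation.
        while not is_serving(host):
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{host} did not come back; stopping the rollout")
            sleep(poll_seconds)
        restarted.append(host)
    return restarted
```

Stopping on the first host that fails to return is a deliberate choice in this sketch, so that a bad reboot takes out one machine rather than the whole cluster.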
eBay did the scheduled reboot thing around 2000, but on a massive scale.
Their front-end servers were Windows ISAPI, and leaked memory like a sieve. So there was a documented process for rebooting groups of servers with a diagram dividing the front-end servers into rebootable clusters and the time between reboot intervals.
And yes, docker/k8s sure reminds me of those days when reviewing logs today and seeing some pods with unexplained non-zero restart counts.
Yes, but the Python garbage collector is only needed to handle cyclic references. Objects whose reference count drops to zero are still freed even with GC turned off. So AWS didn't make a mint from the experiment.
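If you want to see this for yourself, here is a minimal sketch, assuming CPython (other implementations manage memory differently). The `Node` class is just a made-up stand-in; weak references are used to observe whether an object is still alive.

```python
import gc
import weakref


class Node:
    """Plain object that can hold a reference to another Node."""

    def __init__(self):
        self.other = None


gc.disable()  # turns off the cyclic collector only; reference counting stays on

# Case 1: no cycle. Dropping the last reference frees the object immediately.
a = Node()
a_alive = weakref.ref(a)
del a
freed_without_cycle = a_alive() is None

# Case 2: a two-object cycle. The refcounts never reach zero on their own.
b, c = Node(), Node()
b.other, c.other = c, b
b_alive = weakref.ref(b)
del b, c
leaked_with_gc_off = b_alive() is not None

# An explicit collection pass is what finally reclaims the cycle.
gc.collect()
freed_after_collect = b_alive() is None

gc.enable()
print(freed_without_cycle, leaked_with_gc_off, freed_after_collect)
```

So with the collector off, only code that builds reference cycles leaks, which is why the memory growth depends heavily on how the application is written.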
I'm sure the cost of getting all of the devs to write Python that safely runs without GC is far less than 10% of your AWS bill for Python services... Maybe at Instagram it is, but it won't be at 99% of companies.
It isn't like most Python devs you interview are going to have any kind of a reasonable answer for "How would you write XYZ - with the GC disabled?" I'd bet the ones with a good answer cost >10% more on average.
So Swift using reference counting instead of a GC was BS marketing where the Python approach (using both where it makes sense) is a precedent and a superior solution?
FWIW, there's actually a CPython patch from Instagram [0] that proposes "Immortal Instances" (and I understand that YouTube had their own set of patches for "Eternal Refcounts") which aims to alleviate some of those issues and reduce Copy-on-Writes.
Sigh, it looks like the title got a bit mangled…oh well. As most of the commenters have gathered, this was mostly tongue-in-cheek; I kept a window with a few of these kinds of posts open as I wrote so I could refer back to them, so you might be able to make out some familiar companies or technologies in there. That being said, I love hearing about your employer's technical decision making process, so I don't want to discourage anyone from writing blog posts like these. Just try to keep in mind that your decisions may not be the right ones for everyone else. On the flip side, if you're looking to make technical decisions yourself, sometimes what $FAMOUS_COMPANY did is not the right choice for you :)
Nice one. I'll cast my vote for $AN_ENGINEER_TOOK_A_MYTHOLOGY_CLASS as "funniest bit."
Obviously at some point you're gonna have to do the companion/followup piece, "Why $FAMOUS_COMPANY Ditched $HYPED_TECHNOLOGY for $BORING_DEPENDABLE_TECHNOLOGY and What We Learned"
But honestly, I would really enjoy seeing more of the latter type of think-piece in all seriousness. I genuinely do believe that the field as a whole would benefit from a bias towards $BORING_DEPENDABLE_TECHNOLOGY and customer centricity.
Yeah I like those. Obviously it would be great if there were more of them during the early part of the hype cycle, when contrary viewpoints are hard to come by!
> Obviously at some point you're gonna have to do the companion/followup piece, "Why $FAMOUS_COMPANY Ditched $HYPED_TECHNOLOGY for $BORING_DEPENDABLE_TECHNOLOGY and What We Learned"
The posts I see tend to have $MORE_RECENTLY_HYPED_TECHNOLOGY in place of $BORING_DEPENDABLE_TECHNOLOGY.
Saagar - I haven't laughed this hard in months. Thank you for writing this - the bit about "Medium article we found that had something to do with multi-armed bandits" got me on the floor. Laughing during this time feels so so good, and I didn't realize how much I needed that!
You too! And wrt to $HYPED_TECHNOLOGY - we at $GIANT_CO originally created it to solve some challenges, as one of the primary authors I'm starting $PERCY_JACKSON_CHARACTER_CO that provides commercial support and some extra features (honestly - just RBAC with LDAP and some light UI).
We added SSO, please pay us $30k extra. Also, pay us another $30k to be hosted on our Enterprise SLA guarantees, which actually just runs on the same hardware as the rest of our customers, but hey, whatever, you've got money.
Lost it at $AN_ENGINEER_TOOK_A_MYTHOLOGY_CLASS. I guess it's not as bad as $A_FOUNDER_WHO_TOOK_AYAHUASCA or $THE_VC_WHO_SPEAKS_ONLY_IN_RAP_QUOTES_AND_METAPHORS.
I can't believe anyone is still considering $HYPED_TECHNOLOGY in $CURRENT_YEAR. We at $NO_PROFIT_STARTUP gave it our best shot, but the bugs they mention aren't as small as they seem -- they completely sank the project. At this point it's clear that $HYPED_TECHNOLOGY is falling behind on both adoption and ergonomics to $TRENDY_TECHNOLOGY, which handles these issues in a much cleaner and more scalable way. We'd strongly suggest anyone considering $HYPED_TECHNOLOGY to take a closer look before following the advice in this post.
Right, and those cowboys at $FAMOUS_COMPANY obviously haven't read the latest blog posts that clearly show $HYPED_TECHNOLOGY doesn't support $HIPSTER_PARADIGM in an async I/O functional stackless copy on write environment, meaning $FLASHY_LANGUAGE can't be used with dependent types unless you don't mind your static analyser giving false positives every time you try to multiplex coroutines in your template meta-programming frameworks; what were they thinking?
You all have it so wrong! We should go back to $REALLY_BASIC_TECH that we already had in $HAPPIER_TIMES and clearly solves your $OBVIOUSLY_SIMPLE_USECASE.
Look man, it's relevant because $ESOTERIC_LANGUAGE is much harder to hire for than $MUNDANE_LANGUAGE that is taught in most CS programs now. It's not like anyone with 4 years of post-High School education can learn anything. And we certainly wouldn't invest any time in teaching them anything. All of our graduates jump ship to $TRENDY_STARTUP after less than 2 years with us so they have no loyalty anyway.
If you've been following this industry long enough, you'd know that $REALLY_BASIC_TECH was introduced in $HAPPIER_TIMES to replace a solution that is conceptually very similar to $HYPED_TECHNOLOGY. If you wait long enough there will be a new spin on $REALLY_BASIC_TECH once the next cycle of computer hardware becomes ubiquitous.
Exactly! Everyone knows that $HYPED_TECH is just reinventing the wheel that $OLD_TECH already invented, and back in those days we were handling hundreds of $MULTIPLES of users a $TIME_UNIT on a single machine with only 64 ${SI_PREFIX}bytes of RAM over a dial-up connection.
Are we seeing the start of true meta-forums? Places where we don't talk about one subject, but just have generic conversations about a range of topics in one thread?
We hear you. We at $SOON_TO_BE_FAMOUS_COMPANY spent $N_YEARS developing our own version of $HYPED_TECHNOLOGY, because $HYPED_TECHNOLOGY didn't quite fit our use case in that we couldn't understand how to operate it well on our environment.
Did you consider $EVEN_WORSE_HYPED_TECHNOLOGY? Based on a keyword match with some academic paper I didn't understand and my poor understanding of virtualization, containers, and latency, it could solve all your problems.
> Every metric that matters to us has increased substantially from the rewrite, and we even identified some that were no longer relevant to us, such as number of bugs, user frustration, and maintenance cost.
That has happened more than I'd like to admit: massive overhaul effort, $new_thing is an improvement in many ways but has all of the reliability and maturity problems of a new system that hasn't gotten years of patches. But the team feels like they have to show ROI on reinventing the wheel, so they massage the metrics to look like an unambiguous success, when everyone involved at the ground level knows it wasn't.
> We hope that you internalize our company’s anecdote as some sort of ground truth and show it to your company’s CTO so they too can consider redesigning their architecture like we have done.
The sad part is how often this is the case. Not just about technology choices, but everything else as well, HR, marketing, and so forth.
Someone reads something, usually very short - that's semi correct, but lacks a ton of context, and then it gets accepted as some sort of unquestionable truth.
> I started 6 companies and they failed, here's what I learned. Obviously you should take my advice because they were "learning experiences" and it has nothing to do with the fact that I'm a retard
Oh, that one's just on a longer cycle. It's not just a HN thing; the entire Internet collectively rediscovers polyphasic sleep every five or ten years.
This page really should just support get params in the url! Do you know how many people I could trick into believing this with ?FLASHY_LANGUAGE=rust&HYPED_TECHNOLOGY=kubernetes ??
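Something like this would do it, as a rough sketch. The example title and URL are made up for illustration, and this runs server-side in Python rather than in whatever the page actually uses.

```python
from string import Template
from urllib.parse import parse_qs, urlparse


def fill_placeholders(text: str, url: str) -> str:
    """Substitute $NAME placeholders in text using the URL's query string."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; keep the first value given for each.
    values = {key: vals[0] for key, vals in query.items()}
    # safe_substitute leaves placeholders without a matching key untouched.
    return Template(text).safe_substitute(values)


# Hypothetical title and URL, not taken from the real page.
title = "Why $FAMOUS_COMPANY switched to $HYPED_TECHNOLOGY and $FLASHY_LANGUAGE"
url = "https://example.com/post?FLASHY_LANGUAGE=rust&HYPED_TECHNOLOGY=kubernetes"
print(fill_placeholders(title, url))
# prints: Why $FAMOUS_COMPANY switched to kubernetes and rust
```

Using `safe_substitute` rather than `substitute` means a partial query string still renders, with the unfilled variables left as `$PLACEHOLDERS`.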
We at $CONTENT_FARM fell in love with this envsubst-as-a-service paradigm, and built our own with A/B testing. We then discovered we can apply it not only to our content but also to our bash codebase!
Now looking for funding to add evolutionary algorithms, combined with the best ideas from TRAC and m4, to make this a joy to maintain.
Sums up the last 2 companies I've worked for and they weren't even large companies really.
Re-writing our entire platform in Go from .Net (at both places) solved zero of the problems we had as there is nothing in language syntax that could fundamentally fix: priority churn, lack of dev skills, no testing on the most critical platform components(!).
IMO it made it worse because 2 weeks of PluralSight doesn't turn a .NET (or any other) dev into a high performing Golang dev. They just write .NET apps using Golang syntax - a common Golang refrain I hear. So if you thought they weren't great in uncool-language-xyz then just wait until they 'learn' Go.
That excerpt about your experience almost felt like a post mortem of my last job. The only differences are that we went from Typescript > Go, and had absolutely no time to stop and learn the language.
“but as we seek to grow further it’s clear that a complete rewrite of our application is something which will somehow prevent us from losing two billion dollars a year on customer acquisition”
Used to work at a PaaS, which had a 17 year old codebase. It did something that none of our competitors could do, which was really important to our client base.
We were a second-tier player in the space, where the 1st-tier player would be exponentially more expensive to implement and of little benefit to any but the largest clients (think, SAP vs Mysql, for analogy's sake).
When courting new customers, we often encountered the question, "When are you going to re-write your product and get off [language platform]?"
We wrote a white paper to explain where we were, where we were going, and how [language platform] wasn't impeding our 1) security, 2) performance, 3) extensibility.
But we got that question every time. CEO asked tech team how long it would take to re-write the entire 17 year old code base, without losing any client facing features, in [any_new_platform].
They ended up selling to a big financial concern (for analogy's sake, think CA -- Computer Associates).
You know, I guess these kinds of articles are formulaic, but I enjoy seeing the rationale for switching. There definitely are reasons to sometimes migrate to another infrastructure or language (be it cost, ease of hiring, performance, etc), and I like hearing the rationale behind it, even if I don't agree with it.
Going onto the other end of the spectrum, I had a job in 2012 doing ColdFusion, ActionScript, and DOS FoxPro. ActionScript was still semi-relevant, but ColdFusion was barely supported anywhere anymore, and FoxPro (in any version AFAIK) hadn't been updated in at least six years.
Whenever I got annoyed by this, they had this strong attitude of "if it ain't broke, don't fix it", and told me that there's no reason to rewrite anything.
ColdFusion isn't the worst language I've used, but ColdFusion code can be extremely difficult to debug, especially before they introduced CFComponents, and after about six months, I quit. I haven't talked to anyone there in a number of years now, but my understanding is that they really haven't been able to replace anyone who left, since the number of people willing to do ActionScript and ColdFusion and FoxPro is shrinking every day.
Changing to $Hyped_technology has one really good side effect: it attracts talent. Is your site getting 2000 visitors a day really going to benefit from Kubernetes? Is your Android app really going to see an appreciable difference if you rewrite the server in Rust? Is your code really going to have fewer bugs if you write your new server in Haskell? The answer to all these questions is "maybe", but all of those technologies have the advantage of being sexy to people who absolutely love compsci, and being able to attract that kind of talent has value.
I don't know how much of a success story this really is, but I took a job at Jet.com, when that was still relevant, because it was using F#, and that seemed pretty neat. I met some of the most incredibly smart and talented engineers that I've ever worked with there, and I think that's in no small part because a lot of people who are interested in F# do programming for more than "just the money".
> Today we are making some of the code that we can afford to open source available on our GitHub page. It is useless by itself and is heavily tied to our infrastructure, but you can star it to make us seem more relevant.
Our internal studies showed that gaslighting users by showing them a completely new interface once in a while and then switching back to the old one the next time they loaded a page increases user engagement, so we made sure to implement such a system based on a Medium article we found that had something to do with multi-armed bandits.
Just letting out some pent-up frustration about $MESSAGING_APP that was running this kind of A/B tests in their TestFlight builds where I'd get a different UI at each launch (I know I opted in to the bleeding edge, but could you at least make it consistent?) $SEARCH_ENGINE keeps changing fonts and spacing on me too, which is less jarring but somehow more annoying because it instantly feels "off" to me.
"Every metric that matters to us has increased substantially from the rewrite, and we even identified some that were no longer relevant to us, such as number of bugs, user frustration, and maintenance cost. Today we are making some of the code that we can afford to open source available on our GitHub page. It is useless by itself and is heavily tied to our infrastructure, but you can star it to make us seem more relevant."
Most people here seem to agree that $Hyped_technology is a bad choice vs. $Old_and_tested_tech ... but those same people piss on tech like PHP and/or MySQL and instead recommend $Hyped_technology
company i am consulting for switched from a php stack to client side react, killing search traffic and botching every other metric they had, including revenue.
there is a german word for this fallacy: “Technologieglaeubigkeit”: the belief that with modern technology you automatically make the right decision.
I'm waiting for that $startup_name to try (ab)using this $Hyped_technology everywhere and subsequently writing a blogpost titled: '$startup_name: Our incredible journey'.
$Famous_company to $other_startups: We can't wait to see what you'll come up with!
Anxiously awaiting the “Why we are moving away from $Hyped_Technology” post replete with details on how $startup_name reinvented and reimplemented an older, more efficient wheel.
> We know you’ll ignore the fact that you’re not us and we have enough engineers and resources to do whatever we like, but the decision will ruin your startup so it’s not like we’ll see your blog posts about your experience with $HYPED_TECHNOLOGY anytime soon.
Can you either (a) let users specify some of these values to generate these blog posts about the (company, technology) pairs they want, or (b) randomly fill in those values on each refresh?
It's like this guy was reading my mind... he's likely using $THE_NEXT_BIG_TECH !!! I need to buy all the $TECH_PUBLISHER books, attend the $hot_location conferences and buy some $ONLINE_LEARNING courses. I also can't use my $old_free_ide , I need to buy a license for $new_ide (just a skin of the old one, actually, in the $old_tech, but that's not important).
$old_tech has a lot of free courses, blog posts, books, etc but all those ppl had it wrong - and it can't be that everyone selling $new_tech online courses, conferences, books, ides, tooling, etc is just trying to make money off my FOMTW[0], right?
Within the past two to three decades at least, I saw much of the tech world grow (for better or worse), benefiting from NIH inclined innovation + OSS bazaar model; some random evolution examples coming to mind include minix vs linux, java vs golang, memcached vs redis, apache vs nginx vs traefik, lxc and docker ... not that the development of the latter was motivated by lacking capabilities of the former, boring tech more or less has been always available but it takes guts and perseverance to embark on a green implementation. We probably have to admit that most of us do have NIH, to some degree.
$HYPED_LANGUAGE is simultaneously every language and no language, possessing generics while somehow still lacking them. It is shapeless and formless, yet intimately familiar in a curious sort of way. The legends say that some used it for something useful back when it was nascent, but all that's left is a desiccated husk, beaten daily by hecklers on forums across the Internet. A vessel for aspiring language theorists, it is but a placeholder now, a place where interesting languages go to die.
Kubernetes makes way more problems than what it solves in my company. We even migrated to another cloud provider, hoping for a better managed kubernetes. We ended up with different bugs and issues and a worse than useless support in a different timezone, in addition to a much more expensive cloud bill.
There's no good managed k8s anywhere. You can get bad managed k8s cheaply from Digital Ocean, or expensively from EKS or GKE, whose pricing and service levels are broadly similar. The major difference is that, if you do >$100M ARR and have EKS spend to match, you can expect to have several TAMs available, at least one of whom will usually understand what they're all taking turns apologizing for this time. Google, by contrast, will never give a shit about you no matter who you are, and charges slightly more for this service.
At least web assembly brings us something completely new that we never had before. Rust is just an attempt to improve things, which is fine! But, it needs a decades long proven track record to come close to competing with Java/C++/PHP, which takes time obviously.
As someone leading the charge moving a company from Rails on Heroku to Go on Kubernetes with Prometheus/Jaeger/Fluentd/$CNCF I’ll be self aware enough to say “all of the latter.”
$Hyped_technology is a great hiring mechanism for less well known or not popular companies. I'd take a job at $boring_company at a low rate if I could do interesting Rust projects.
Well, said company better hope $Hyped_technology becomes mainstream, otherwise they're screwed with a hard to maintain codebase. So it's kind of a gamble.
On point satire. I’m always a little bit wary of people who talk about aggressive caching and other aggressive things.
It’s as if you didn’t really get the solution you wanted, so you just hit it harder until you got a few more percent out of it and called it a day instead of figuring out an order-of-magnitude better solution.
Sadly it's a static site, so the best I can do is stick a contenteditable on it :( On the plus side, I'm immune from comments here on Hacker News about why my website isn't a static site.
This is absolutely brilliant, I can't find words to express the genius of this article.
I have often had similar feelings to this author, as I am a (chemical) engineering student and I have learned to appreciate traditional stable designs that are meant to last for decades and are there to just work, not to look good on reddit and Medium. But I never knew how to express myself. This post did it better than I ever could.
From a game theoretic/psychology perspective, doing this kind of thing makes sense. It punts the ball on so many issues, and gets people into the mindset that all of our problems will go away when the rewrite is "done." I wonder what analogs out of software engineering exist for this - probably a lot.
the saddest thing, once you work in software, is that at the majority of places people are not as smart and pragmatic as they say they are or seem. so the software world keeps going in depressing circles, e.g. the current state of docker, k8s, SPAs etc. maybe one day someone will document the amount of human hours wasted on these useless rewrites and on reinventing the wheel.
One of my last teams used gods from Greek mythology (e.g. Apollo [1], Ares [2]), and another nearby team used Pokemon names. Lots of folks started using similar naming conventions and eventually a clear naming manifesto was sent to all engineers to discourage this practice for new services.
It's much easier to discuss and reason about complex systems when the names are self-explanatory, especially when you get to 100s or 1000s of names.
I work at a very large tech company and my experience with names there leads me to disagree.
* It is often most critical to differentiate between things that have similar functions. The relevant distinctions will not be clear by the time of naming (i.e., this is the part that used to manage the whole X but now just responds to callers, this is the part that was separated out from the old X for scaling, this is the old X backend, this is the older backend that runs for the old platform Xes only...).
* The same name may seem to make sense to people for very, very different things. (Viz. how many technical meanings "migration" has) If you have to namespace your names to get over this, Org Product Name starts getting clunky enough it will be called something different in practice anyway.
In sum, I would say that it's much easier to discuss and reason about complex systems made up of parts with clear divisions of responsibility, and I'm sure that everything tends to be easier when those divisions have been static enough over time such that the names can still correspond to responsibilities. However, I will take every single Pokemon name in existence over the confusion caused by names that try to describe what services do and are inaccurate.
We used WWII-era warships. Keep in mind this was at a Navy contractor.
Total PITA by the way. App-Prod-DB-001, App-QA-007 or DB-Bak-003 tells me what the server does, what sort of tier it's in (Prod, Test/QA, Dev, misc.), and a couple of other details just by looking at it. Folks reading this post could probably make some guesses just by the names.
Meanwhile, what the hell do servers Leonidas and Hephaestus do? Both of em crashed -- how much should I be panicking about that?
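To make the point concrete, a structured name can even be picked apart by a script, which is more than can be said for Leonidas. A rough sketch follows; the set of tier labels and the "trailing number is an index" rule are my assumptions, since the scheme above isn't fully spelled out.

```python
# Assumed tier labels; a real naming scheme would define its own list.
TIERS = {"prod", "qa", "test", "dev"}


def describe(hostname: str) -> dict:
    """Guess tier, role labels, and index from a dash-separated hostname."""
    parts = hostname.split("-")
    # Treat a trailing all-digit segment as the instance number.
    number = parts[-1] if parts[-1].isdigit() else None
    labels = parts[:-1] if number else parts
    tier = next((p for p in labels if p.lower() in TIERS), None)
    roles = [p for p in labels if p.lower() not in TIERS]
    return {"tier": tier, "roles": roles, "number": number}


for name in ["App-Prod-DB-001", "App-QA-007", "DB-Bak-003", "Leonidas"]:
    print(name, describe(name))
```

A mythological name comes back with no tier and no number, so nothing tells you how much to panic when it crashes.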
I giggled at that, because I felt seen. I took an overview of western literature class in college which included Obie’s Metamorphosis, and it’s had a severe negative impact on my ability to name services.
Edit: I obviously meant Ovid’s Metamorphoses, but autocorrect got me. I’m leaving it, because that’s hilarious.
> Initially, we tried messing with some garbage collector parameters we didn’t really understand, but to our surprise that didn’t magically solve our problems so instead we disabled garbage collection altogether. This increased our memory usage, but our automatic on-demand scaler handled this for us, as the graph below shows
> Today we are making some of the code that we can afford to open source available on our GitHub page. It is useless by itself and is heavily tied to our infrastructure, but you can star it to make us seem more relevant.