I worked at a Fortune 10 company, where backing up our MySQL DB daily took 8 hours. We had a job that processed data that also took 8 hours. We had a small window of time during work hours to fetch the data; if you happened to query while either service was running, you'd probably not get any data back.
A few months in, both services reached 12 hours to run. We had to change the start time just so they'd happen back to back. I rolled up my sleeves and got to work on the DB.
A week later, the job's runtime was down to 2 hours. I was nearly promoted out of the job. I kept making improvements to the database, reorganizing the application, and optimizing. One day I manually ran the job, then went to get coffee. When I came back, the prompt was awaiting my next command. I thought it had silently failed. I ran it several times throughout the day and checked the responses. It ran in ~17 minutes. The backup was also reduced to less than 20 minutes. This was all on MySQL 5.7.
We literally gained 23 hours of availability. We had no idea what to do with it. I was fired shortly after.
Can you go into some high-level detail as to why it was slow and what you did to make it fast? That's always the most interesting part of a post like this.
The DB connections were poorly managed. Each query began by opening a new connection, checking whether that failed, sleeping, retrying, and then finally running. Several queries were called in loops, so the connection pool was always dry. The code spent most of its time sleeping. The queries themselves were also highly inefficient: bad joins, no indexes, etc. It was satisfying to fix the mess.
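To make the anti-pattern concrete: the fix is to open connections once and hand them out per query, instead of opening (and sleep-retrying) a fresh one for every query. A toy sketch in plain Ruby with made-up names, not the actual code from that job:

```ruby
# Toy connection pool: N connections are created once up front and
# checked out per query, instead of each query opening its own.
class TinyPool
  def initialize(size, &connect)
    @pool = Queue.new                 # thread-safe FIFO
    size.times { @pool << connect.call }
  end

  # Check a connection out, yield it, and always return it to the pool.
  def with_connection
    conn = @pool.pop                  # blocks if the pool is dry
    yield conn
  ensure
    @pool << conn if conn
  end
end

# Stand-in "connection": we only count how many times we connected.
CONNECTS = []
pool = TinyPool.new(2) { CONNECTS << :conn; Object.new }

# 100 queries in a loop reuse 2 connections instead of opening 100
# (and there is no connect-fail-sleep-retry dance on the hot path).
100.times { pool.with_connection { |conn| conn } }
puts CONNECTS.size
```

With a real driver the `connect.call` would be something like `Mysql2::Client.new(...)`, and any retry logic belongs in pool setup, not on every query.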
I had two similar experiences in past years: I add some database indices, the application becomes super fast, the rest of the team starts acting weird, I get shown the door.
At this point I'm pretty sure anything IT-related has become a bullshit job.
I've done something similar: you start out with something naive and simple like mysqldump that takes an age, and move on to more specialised tools like Percona XtraBackup that allow for incremental backups.
Depends on the environment. If you can do disk snapshots, that's the way to go (can be hard with disk striping). wal-g can store both base backups and WAL to various storage backends in parallel, and can be throttled with env variables.
or embarrassment and concern on the part of the people who created this system before him.
I've made similar improvements (40x speedup, 8x reduction in RAM footprint) and was shown the door. This was for software that was a bit over 30 years old, too, so it was a bit involved.
The team winning does not help a narcissist feel better. You'd be better off getting nothing done while stroking their ego daily.
Why do so many people on this forum have kittens whenever someone points to a success story about Rails and Ruby? It's like the hamburger people can't stand it when someone enjoys a plain hotdog with ketchup and mustard. It's tiring. Get a life, and just enjoy your bland, overpriced, gourmet, too-big-to-fit-in-your-mouth, Instagrammed burgers.
heh, I know right? I see all those comments shitting on ruby and rails, and it's like, yeah, I get it dude. They hate rails, but luckily nobody is forcing them to use it. The more choices the better. Some people hate dynamic types, some people hate static types. There's room for both, imo. It's not like it's a secret there are faster or more efficient stacks.
I guess they can't help themselves and need to make their position known. Personally, I love ruby, and rails is pretty good too.
Was running a Rails monolith with up to 300 concurrent users and a huge growing database on a $20 VPS and free Cloudflare.
Sure, it only had highly optimized queries (something Rails makes kinda easy, too), but that's what you're supposed to do anyway.
There wasn't ever a need to recode it to run on a $10 instance. Neither do I think 99% of all database-driven projects, including mine, are ever going to reach as many concurrent users.
IMO: talking about performance for ANY major coding stack is just premature optimization
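The "highly optimized queries" point above is, in large part, about avoiding N+1 round trips. Here's a toy simulation in plain Ruby (fake in-memory tables, made-up names) of what lazy loading does versus the single batched fetch you get from something like ActiveRecord's `includes`:

```ruby
# Fake tables plus a counter standing in for database round trips.
ORDERS  = [{id: 1, user_id: 10}, {id: 2, user_id: 11}, {id: 3, user_id: 10}]
USERS   = {10 => "alice", 11 => "bob"}
QUERIES = Hash.new(0)

# N+1 style: one lookup per order, which is what lazy loading does.
lazy_names = ORDERS.map do |order|
  QUERIES[:lazy] += 1               # each iteration is a round trip
  USERS[order[:user_id]]
end

# Batched style: one query for all the users, then an in-memory join.
QUERIES[:batched] += 1              # a single round trip
ids   = ORDERS.map { |o| o[:user_id] }.uniq
batch = USERS.select { |id, _| ids.include?(id) }
batched_names = ORDERS.map { |o| batch[o[:user_id]] }

puts QUERIES[:lazy]     # grows with the number of rows
puts QUERIES[:batched]  # stays at 1
```

Same results, but the batched version's round-trip count stays flat as the table grows, which is the whole trick.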
I'm sure there are many variants and definitions, but the company I'm at runs one.
The gist of it is that there's one codebase with multiple separate "modules". This codebase is packaged and linked as a library, and then we build different super-slim hosts that load different parts of the monolith in production containers. Usually it's just different environment variables or config.
But locally, we can run the whole thing in one process. We're using .NET, so `dotnet run` brings up the whole app. Whereas we might run parts of the app in different console hosts in prod, locally they are hosted as background services in-process.
From a debug perspective, this is super awesome since you can just launch and trace one codebase. If we broke it out into 3-4 separate services, we'd have to run 3 processes and 3 debuggers. 3 sets of configuration, 3 sets of CI/CD, 3 sets of testing. Terrible for productivity.
We have parts of the system connected to SQS for processing events, and if we need more throughput, we simply start more instances of the container, all running the same monolith.
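The pattern described above, one codebase where config decides which modules a given host boots, fits in a few lines. A sketch in Ruby for illustration (the module names and the `RUN_MODULES` variable are invented, not the actual .NET setup):

```ruby
# One codebase, several "modules"; an env var picks which ones this
# host actually runs. Locally, everything comes up in one process.
MODULES = {
  "web"    => -> { "web server up" },
  "worker" => -> { "queue worker up" },
  "cron"   => -> { "scheduler up" },
}

def boot(env)
  wanted = env.fetch("RUN_MODULES", "all")
  names  = wanted == "all" ? MODULES.keys : wanted.split(",")
  names.map { |name| MODULES.fetch(name).call }
end

p boot({})                              # local dev: whole app in-process
p boot({"RUN_MODULES" => "worker"})     # prod container: just the worker
```

Each production "partition" then just sets a different value of the flag in its container config, while `boot({})` is what local dev runs.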
I think GCP is probably one of the best platforms for building modular monoliths because of its tight orientation around HTTP push.
I implemented this in one of my past startup jobs. Basically a core banking system that implemented multiple roles of the system, including open banking. Depending on config, it would act as a bank, as an OB service provider, as an OB registry, as a merchant, etc. In addition, it was built in a way that instances, if allowed, could talk to each other in two ways:
1. As part of the same entity, so you could scale your operation.
2. As part of an ecosystem, so you could for example create an entire open banking network, or just a regular network with bank transfers and card payments using proper protocols such as ACH, ISO 8583 and ISO 20022.
And there is a separate concept of a configurable application which can completely reshape its component graph according to some high-level configuration flags (like database=prod|dummy); we call these "multi-modal applications".
I'm looking to break an old Java monolith application into something like that: modular code base, single deployment artifact, multiple configurable use cases.
I've yet to find a good existing tool to compose the application at build time, other than using a godawful lot of Maven profile combinations.
Hmm; at least in the more recent versions of .NET, Microsoft has really cleaned up the runtime host paradigm so that it's consistent across console (think background services, timer jobs, pull-oriented processing) and web (classic HTTP).
For us, it just becomes a matter of configuring the correct construction of the dependency injection container at host startup using some flag (usually environment variable) to pick the right bits and pieces to load into the container and which services to run from the monolith.
Then each of the host "partitions" gets its own Dockerfile.
Often this is viewed as counter to the microservices pattern. Microservice architecture assumes each service has its own data storage and is logically independent from other services.
Monolith on the other hand is a single application with all of the logic in one place.
A distributed monolith is a set of applications/services, like in the microservices pattern, but they can share a common data storage and depend on each other.
Data Storage and Compute should be separate orthogonal issues, it's not needed in this comparison.
Stateful vs Stateless.
Your monolith is a binary that gets distributed to hosts to perform some function. The binary has multiple entry points that can be invoked. Most calls are internal library calls.
Microservices (also stateless) have a different artifact for each component; services call other services via a private API (often gRPC/HTTP RPC).
blasphemy, this should immediately be moved to microservices and AWS and cost optimized and each component should be scaled separately (maybe some components only need 0.5 CPU, think of the cost saving t3.micro could bring and you have a highly available 3AZ potato to make sure it never goes down!!)
Then to handle any load we need to build autoscaling and spin up the toaster to medium potato and 20 instances (this costs 60k a month, but no worries we only pay for what we use so we will only run this for 27 minutes during our big sale).
Oh what wonderful world we live in and the pain we inflict on ourselves.
GJ Shopify for running a sane (tm) tech stack :)
(btw, of course Rails scales; it's a shared-nothing setup, so spin up infinite app servers as long as the DB can handle it. It's pretty expensive for compute though)
Maybe you haven't worked on a large production Rails app? Rails takes 3-4x the infrastructure hosting costs of other languages/frameworks. Not to mention the insane testing costs from Ruby test bloat.
This could be / definitely was true in the 2010s, when most of the world's developer wages hadn't risen evenly and infrastructure cost/performance was expensive.
But we continue to see cost/performance improving. The Rails framework and the Ruby VM together have likely gained a 2x speed improvement. We will get 96-core Graviton 3 or 128-core Zen 4 EPYC. Cost-per-core performance is coming down and will continue to do so at least until Zen 6.
Depending on your app, somewhere along the line the trade-off will surely lean towards Rails's flavour, assuming you value what Rails has to offer.
I am just waiting for fibers/async (or something similar) to be built into Rails and Active Record, along with an even faster Ruby JIT.
I worked at a very large Ruby shop where errors in production were very expensive. This meant that we spent many times more money on instances running the test suite for every build than we spent on all production servers combined.
I got burnt out on Rails after the third app in a row that I was responsible for upgrading. I appreciate Rails' contribution to web development. It took about a decade for the front-end framework/library ecosystem to figure out that MPAs are more effective than SPAs for most apps. Fortunately, Astrojs is part of the ecosystem. It's sort of like the full-stack JS version of Rails/Sinatra... without the ORM, which IMO adds incidental complexity.
If you have a thriving business already using Rails, it's difficult to justify moving off of it... now that painful upgrades seem to be mostly in the past. However, I do find isomorphic JS components & state management a pleasure to develop & maintain compared to developing an app in two languages.
Come check out Phoenix. You get a lot of the same batteries-included benefits of RoR, but the architecture is more modular. Runtime performance is closer to Go, and there's a really fantastic websocket and reactive frontend system built in. Erlang OTP takes it to a whole other level beyond that, too, if you're trying to scale up.
If you plan the updates from the beginning and stay up to date with what's coming next, you are good. Having the app dual-boot, running it in your CI to catch upcoming changes, is an easy way to avoid such headaches.
It's not easy to automate. As you know, GitHub now works directly off the Rails main branch. They have both core Ruby and core Rails maintainers on staff. They have to find, fix, approve, and merge introduced bugs every day, as well as deal with Rails bugs that emerge in the main branch itself. This is a huge overhead cost for keeping software up to date, and does not (and should not) apply to any other company.
That's absolutely not a huge overhead; quite the contrary, it has many benefits.
We do the same at Shopify, and running off the edge allows us to catch bugs much sooner and identify them much more easily.
It also very significantly cuts down on maintenance cost, because we no longer have to work around bugs; we can fix them upstream and do a small update.
As for you pointing at the Rails 5 migration taking a very long time: it's true that certain major migrations were a pain, this one in particular, but that's because a major API had been removed (attr_protected). We (both Shopify and GitHub) work with the edge also in part to make sure the community won't have to suffer this kind of harsh migration ever again.
All this to say, you are blowing things out of proportion. The Ruby and Rails teams at Shopify and GitHub are not overhead; they pay for themselves, and at any tech company as large as Shopify or GitHub you will find major contributors to the stack they use. E.g., there used to be a JVM team at Twitter (a company half the size of Shopify, even at its peak).
I'm not surprised you don't consider the team you work on as overhead. But it is just that. This is a Rails only problem. This is not a problem that applies to other engineering ecosystems, because upgrades are easy to trivial in other ecosystems. It is a problem that shouldn't exist. It makes sense that the only way Shopify has found to do efficient Rails upgrades and keep using Rails is to dedicate a team to it and subsidize Rails core development. But this isn't a model other companies should follow.
I don't see the JVM team as the same thing as a team dedicated to working off a project's core. The JVM team looks closer to the YJIT project. Making a JIT also calls into question Shopify's scalability: Rails was slow enough that they allowed a JIT to be built internally? That's quite the trade-off to make.
> Making a JIT also calls into question Shopify's scalability: Rails was slow enough that they allowed a JIT to be built internally?
You are conflating scaling ability with language speed. There is of course some relation between the two, but they're mostly orthogonal.
At the scale of Shopify (several thousand developers), having a few people focus on improving Rails and Ruby is a drop in the bucket and pays off immensely.
Rails and Ruby are both open source projects not backed by a for-profit organization. It's perfectly normal for a user of such a project to contribute patches... That's how open source is supposed to work...
There are plenty of organizations contributing patches to the Linux kernel (Google, IBM, etc.). Using your logic, does that mean Linux was slow enough that Google allowed a new scheduler to be built internally?
I have just spent weeks at $dayjob resolving Java dependency issues I wouldn't wish on my worst enemy. As much as I dislike Rails, it's nonsense that this is a Rails-specific issue.
Guess what? 99% of apps are neither GitHub nor Shopify. You add some automation to your pipeline with https://github.com/fastruby/next_rails and you are done. I'm maintaining a Rails app from 2006, with more than 1000 LOC (excluding the 2k specs and views), and we are running on Rails 7.1. We had one critical moment around Rails 5.2, but after that we managed to keep our stack always up to date.
Did you really mean 1,000 and 2,000 LOC for tests? Or did you mean 1000k or something else? If you meant 1,000, that is an objectively tiny app, and not relevant to the difficulty of maintaining large Rails applications.
Infra is cheap when you're in hyper build mode. When your growth rate stalls and your expenses are greater than your income, infra gets expensive very quick and will kill your company.
I love it. Apologies if you didn't mean it this way, but given the usual context of this forum, the insinuation here is so clever: Imply that it takes significantly more "compute" costs for a Rails platform, as opposed to what it would take if it were running on something "more performant," say, Java, and the JS front end flavor of the week.
I would argue that the computing costs of any computing endeavor (short of AI, at this point) are dwarfed by the human costs, and my personal experience is that Ruby/Rails is at least a 10x headcount savings over the programmer sprawl required for Java/JS.
Java and JS sure are good for job security. I'll give them that. Throw in an unequivocal demand for Oracle, and Cisco networking gear, and you have the whole Fortune 500 world that got dumped on us from the people who couldn't manage the mainframes, either.
That they can solve it at all is critical information. One could have expected the stack not to be able to horizontally scale this far. You are right, though, that there is also a question of a trade-off between hosting cost on one hand and the developer hours, risk, opportunity cost and lower hosting cost that come with a rewrite on the other, for which one would need to know hosting cost among other critical factors.
A while ago I read that Shopify hosts stores in pods. Pods are self-contained instances that run everything inside them, including MySQL, Redis, etc. Does that mean this number is collective, counting all the pods, and not a single database?
Would love to see the infrastructure bill. Anything is possible if you throw enough $$$ at it.
I'm not a Ruby hater, but the average Joe can't accomplish this. If you consider each store an individual instance with an isolated shard of a DB, it makes sense. But the underlying foundation is immense.
Partitions are usually the key to scale.
I’d imagine parts of their infrastructure would be better served by different runtimes. They’d save a lot of money.
But if your entire team is hyper focused on ruby there is something to be said for a huge monolith.
If you have decent margins on the pages you're serving, Rails is fine. Where you might want to investigate other things is if you're, say, an ad driven business with really thin margins and you want to minimize costs. Or if you've got things dialed and you're just not changing things much any more and you want to eke out some savings.
And in any event, Rails is a good choice to figure out the problem space you're working in. Even Twitter started with it, and objectively, Twitter is very much not the sweet spot for Rails.
If your app is just a PostgreSQL database that needs to be exposed with auth, access control, etc., then yes, Rails is fine.
If you're doing a tremendous amount of parallel processing, it will fall over without throwing lots of compute at it and scaling horizontally. Rails doesn't scale vertically: you need to give it compute, and every other resource it uses will also need more compute. Average Joes are fine with a 2-core VPS. Lots of businesses are not.
I need you to send 5-10 million API calls per day to 20 different APIs … are you using Rails? Every API is rate limited differently, with different batch sizes (each having unique parameters), and it needs to literally be done ASAFP to make certain deadlines. If you want to throw money at the problem, sure. If you want to do the same thing with 1/10th of the resources, you'll use a better runtime: Erlang/Elixir, or Clojure/Golang, something with CSP.
HN is so quick to dismiss things like kubernetes and wax poetic about simpleton life but there are very legitimate reasons to choose alternative tools for your problem space.
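For what it's worth, the shape of that fan-out problem (many APIs, each with its own batch size and pacing) is mostly about isolating per-API workers so one limit never stalls the rest. A toy sketch in plain Ruby with fake APIs and made-up limits; real code would make real HTTP calls and use proper rate limiters:

```ruby
# Each API gets its own worker, batch size, and pacing, so one slow or
# tightly limited API never blocks the others.
APIS = {
  alpha: {batch: 2, delay: 0},   # delay stands in for a rate limit
  beta:  {batch: 3, delay: 0},
}

SENT = Hash.new { |h, k| h[k] = [] }

def drain(api, jobs)
  cfg = APIS.fetch(api)
  jobs.each_slice(cfg[:batch]) do |batch|
    SENT[api] << batch           # stand-in for the real HTTP call
    sleep cfg[:delay]            # wait out this API's rate limit
  end
end

workers = {alpha: (1..5).to_a, beta: (1..5).to_a}.map do |api, jobs|
  Thread.new { drain(api, jobs) }
end
workers.each(&:join)

p SENT[:alpha]   # five jobs in batches of 2
p SENT[:beta]    # five jobs in batches of 3
```

In BEAM or CSP terms, each of those threads would be a cheap process or goroutine, which is exactly why those runtimes get recommended for this workload.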
The key to scaling Rails applications is effective use of caching (at multiple levels).
For an online store (compared to, say, a live action game) so much of the content that you serve will be cached that regardless of your apps runtime a lot of the user experience will be defined by how effective your caching strategy is.
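The core of "caching at multiple levels" is the same read-through idea everywhere, whether it's page, fragment, or query caching: compute once, serve from memory until the entry goes stale. A minimal sketch in plain Ruby (made-up class; no eviction or locking, unlike a real cache store):

```ruby
# Minimal read-through cache with a TTL: the block only runs on a miss.
class TtlCache
  def initialize(ttl)
    @ttl   = ttl
    @store = {}
  end

  def fetch(key)
    entry = @store[key]
    return entry[:value] if entry && Time.now - entry[:at] < @ttl
    value = yield                          # cache miss: do the real work
    @store[key] = {value: value, at: Time.now}
    value
  end
end

RENDERS = []
cache = TtlCache.new(60)

# Two requests for the same product page; only the first one renders.
2.times do
  cache.fetch("product:42") { RENDERS << :render; "<html>product 42</html>" }
end

puts RENDERS.size
```

Rails' `Rails.cache.fetch(key, expires_in: ...) { ... }` is the production version of the same shape, which is why an effective caching strategy matters more than the app runtime for mostly-read storefront traffic.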
I think that Rails still offers enough compelling advantages for developer productivity to offset the (possibly) higher hosting costs.
Although they probably spend most of their time on the application layer, I suspect the most important and trickiest work was done to scale the database.
Depends on what you’re doing. Elixir isn’t winning the war on raw compute but if you need to do literally 2 million things at once it can’t be beat. It’s ostensibly a drop in replacement for rails (Phoenix) but will do everything rails does better including background tasks.
JVM is indeed a powerhouse and that’s where I’d go Clojure without hesitation. Scala might be an easier sell but I’m a lisp fan and have used Clojure in prod (was briefly CTO at FarmLogs a YC company where most of our core infra was Clojure) and it’s really an amazing language but it requires very senior engineers to do correctly.
Having built similar software, at a smaller scale, I think you're massively overestimating the complexity of shopify.
At its core, what they provide is a (largely) read-only set of products, and a (largely) append-only set of purchases.
The only write operation that needs to lock the database is adjusting the quantity of available inventory. You need to pay attention and think things through to do that with good performance, but it's not that complex. And they wouldn't be doing millions of sales per second.
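That one locking write, decrement only if stock remains, is small enough to sketch. Below is the invariant in plain Ruby with a mutex (toy class, made-up names), showing that concurrent checkouts can't oversell:

```ruby
# Atomic check-and-decrement: the only inventory write that needs a lock.
class Inventory
  attr_reader :stock

  def initialize(stock)
    @stock = stock
    @lock  = Mutex.new
  end

  # True if the purchase succeeded, false if we were out of stock.
  def purchase!
    @lock.synchronize do
      return false if @stock <= 0
      @stock -= 1
      true
    end
  end
end

inv = Inventory.new(5)

# 20 concurrent checkouts race for 5 units; exactly 5 can win.
results = 20.times.map { Thread.new { inv.purchase! } }.map(&:value)

puts results.count(true)   # 5
puts inv.stock             # 0, never negative
```

The SQL equivalent is roughly a conditional `UPDATE products SET stock = stock - 1 WHERE id = ? AND stock > 0`, checking the affected-row count to learn whether the purchase won.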
You can probably build 80% of Uber or Twitter simply. It's the 20% that's the rub. Did you know the Uber mobile app has thousands of native screens, for example? [1] It's common to believe that a small simple version is the same as the scaled version, but that's rarely the case.
They might save money on infra, but their speed of development would probably go down, potentially requiring them to either hire more people or deliver features slower. The impact on overall cost might not be that great.
Most software organizations (remember most are not FAANG) will be better served by paying the $$$ to have their engineers focused on building more/better features for customers.
My Master's thesis was on high performance distributed computing. And my conclusion was you likely don't have a problem that is hard enough to justify it. Thank you for promoting reality!
And maintenance and onboarding... I guess finding the right balance is the key. I tend to go with a monolith and use microservices to offload heavy-duty tasks, or to benefit from other languages and frameworks when they are a better fit for a specific problem domain.
That doesn't sound like "microservices", but like a monolith with a few services split out where it makes sense. People have been doing that for a million years, long before the web and certainly long before "microservices" was coined. It's just "using common sense" basically.
I'm not aware of such nuanced definitions. For me, a microservice is a one-endpoint service, designed to do one task, with full introspection, and deployed in the "cloud". As far as I know, I can orchestrate them with monoliths and they're still microservices... or where do you draw the line between services and microservices? The size of the service itself, or whom it interacts with?
There is no One True Definition™, but I think most people understand "microservices" as "an application where splitting the functionality up in small services is core of the architecture", or something along those lines.
"We split off 2 things into small services because it made sense" is rather different. I mean, I don't really care if you call this "microservices", I guess, but it is different, right?
Yup, a monolith goes very long; only at the point where you have hundreds of engineers, or some very specific niche technical problem, would I start thinking about microservices.
It would be without the tooling, yes. Shopify has a merge queue thingy that you can just shove your MR into, and it will eventually get around to deploying it. It even gives you a ping on Slack when your changes are about to go live, IIRC.
How is that the case? This example uses a distributed MySQL cluster which was of course tuned for high performance. Similarly the Rails app is distributed as well. Arguably the Rails app likely wouldn't qualify as "high performance", but it's distributed.
Sorry, I love Rails, but because something can scale (which I never thought it couldn't) doesn't make it a high performance system. That's totally fine, Rails makes other tradeoffs that IMO are more universally useful, even though some people seem to not be able to understand that server cost for most companies is tiny compared to developer cost
They're talking about "distributed" as in a system of services communicating, rather than just copies of the same monolith across multiple instances. The former adds communication and synchronisation overheads, and the complexities of failover, for every extra service introduced.
That's a totally bizarre definition. Having worked on a high-performance in-memory data grid for the last eight years, I can guarantee that you'll get all the fun distributed systems problems even with a single code base. That definition also excludes pretty much all famous distributed systems like most databases, messaging systems like Kafka and Rabbit etc.
What you seem to be getting at, isn't distributed systems, but the totally self-inflicted pain of a service oriented architecture
> Having worked on a high-performance in-memory data grid for the last eight years, I can guarantee that you'll get all the fun distributed systems problems even with a single code base.
Having spent the last 28 years building distributed network-connected systems, this comes across as wildly obtuse.
The point is that there are orders of magnitude of difference in complexity between scaling a system with few communication paths and little distribution of state across process or network boundaries, and scaling one with many paths and state distributed in many locations. We don't tend to start talking about distributed systems when you have a tiered stack of a horizontally scalable component sandwiched between a load balancer and a database, even though in a very strict technical sense that is already "distributed".
Once you start adding message queues etc., then it certainly becomes more and more reasonable to talk about a distributed system, but there is there as well a distinct grey area if dealing with e.g. queues just triggering jobs in the same code base against the same database with respect to the intent clearly expressed by the original comment.
Put another way, ignore the word "distributed", re-read the original comment, and consider that irrespective of which label you're comfortable with, what the comment is doing is drawing a distinction between two classes of systems with wildly different complexity in the distribution of responsibility and state. Where precisely you draw the line is entirely irrelevant.
> What you seem to be getting at, isn't distributed systems, but the totally self-inflicted pain of a service oriented architecture
No, it really was not. This separation between basic 2/3 tier apps and systems with a more complex data flow pre-dates the SOA buzzword literally by decades.
Maybe the distinction here would be one of which scope the respective maintainer cares about. For Shopify MySQL is mostly a black box, they don't need to re-implement their own atomic commit protocol, network partition detection etc., since MySQL did that for them. Implementors of MySQL did have to solve these distributed systems problems though and pick their CAP trade-offs, but I guess that's not the scope Shopify cares about here.
Oh, I read the parent comment to thank them for confirming that "you likely don't have a problem that is hard enough to justify it". But reading it again, it could be read both ways.
Edit: To be clear, I agree that this is an example of distributed, high-performance which is why the comment made little sense to me.
Yes, if you take distributed to just mean "the same code on multiple machines". The GP above probably means "different code on different machines interacting" which brings its very own set of problems.
By that definition pretty much any problem you study in distributed systems theory, can occur in a system that doesn't fit that definition and the most well known examples of distributed systems like distributed data stores, message queues etc aren't distributed systems.
My impression is that it's simply harder to get promoted as an engineer in the industry by using boring, sustainable, unexciting solutions that have been used by everybody else and their dog. How do you even stand out that way? Looks bad on the resume, like you didn't even try. Great for the business, but maybe terrible for one's career?
You could turn that one Rails app into a complex microservices architecture and do a conference talk about it, and get a promotion. Then you can undo the microservices architecture, write a blog post about returning to the majestic monolith, and do another conference talk about it, maybe get another promotion. Abstract, de-abstract, bundle, unbundle, rinse and repeat.
It feels like a tragic situation that's nobody's fault, just the reality of human psychology being wired to reward the wrong things.
Oh no, I want my two-pizza teams with five thousand microservices written in Clojure, Haskell, Erlang and Scala.
The cost argument about this monolith is just clutching at straws. Microservices are not cheaper than a monolith, operationally or infrastructure-wise: logs, monitoring, tracing and what not for each microservice.
Microservices are a solution to certain problems. If the stack is already pretty diverse, needs a lot of separate teams, hiring is hard, coordinating deployments is hard, etc., then they make sense.
Shopify probably looks more like what you ridiculed than not. We can guess that it's not one big team, and it's not hundreds of identical copies of this big monolith (but configured during deployment to run in different roles).
I'm saying that microservices as a tech-cultural phenomena (or even era) was a response to a very specific set of constraints. The whole cloud thing was new, hiring was hard, expertise was scarce, business was booming so loud people got deaf from simply thinking about it, the church of true scalability was omnipotent, tools were crude, marvelously maintained monorepos were only artifacts of seriously wet [FAAN]G-fueled dreams, etc.
It made sense for Netflix, because they had a big Cassandra cluster, too much money, and a very picky organizational/hiring culture, and so on.
This is great, but it would also be great if they accept other solutions than monolith Ruby on Rails on MySQL for their systems design interview rounds.
I've interviewed twice, and there was resistance from engineers both times I argued that MySQL wasn't the right approach. Sure, MySQL could possibly run any use case in the world if you throw enough engineering resources at it and highly optimize for that use case, but why do that if there is another engine specifically designed for your use case? The argument was "Shopify runs on MySQL and we can handle millions of queries... blah... blah...".
I dislike MySQL as much as the next guy, but if you go into an interview situation and argue that their technology choices are wrong, that is a big red flag. Not so much because you're wrong, because you might be 100% right, but because it is likely to indicate to interviewers that you will be difficult to work with, possibly pushing for changes they don't want to happen, and that you won't read social and political situations within the team well, starting with arguing with an interviewer.
It's fine to argue with interviewers, but the threshold where it indicates to both sides that you're probably wrong for that job is pretty low.
As much as producing 'technically sound' decisions is paramount so is being able to work with other people, namely your boss whose decisions are always technically sound if asked. ;)
I think if asked it's fine to probe how much disagreement your boss has the stomach for. Sometimes you're being brought on because they know they need to fix things. But, yeah, don't start to rip apart technical decisions they appear wedded to from the first moment. Not least because a lot of the time as you say 'technically sound' decisions aren't everything.
A whole lot of technical decisions we - me included - have very strong opinions about don't really have that much of a measurable effect on technical outcomes.
I once, many years ago, had a conflict with someone reporting to me because I refused to entertain rewriting our entire frontend in Rails. This was just after Rails was released, and we had a lot of PHP code. We had PHP code because PHP frontend devs were "cheap" and plentiful, not because any of us liked PHP.
I agreed with him about the preference for Ruby, and we used Ruby for other things (ironically, all our Ruby use was on the backend). But he kept pushing with no insight into the relative hiring market at the time, and on the assumption that a rewrite was "trivial", because he had no insight into why our then-current frontend had all of the capabilities it had; he'd chosen to ignore them because he didn't know the roadmap.
He went to my boss - the CEO and co-founder - and tried to get me fired. The CEO went to me and asked if I wanted to fire him instead. I didn't, but I did have a rather serious chat with said developer about our respective roles, and how it was about more than technical preferences, and how he might get a lot further if he actually tried to work more constructively with me instead of thinking he had the clout to get me fired.
Said CEO hired me to run a development department again in my previous job, while to my knowledge he has never again worked with that developer. Looking like a troublemaker to the wrong person can have long-term effects.
I am not arguing their decision is wrong on their real world system. It is a system design interview with a hypothetical situation completely unrelated to their prod system, where I suggest a different db than MySQL. They are the ones arguing that my decision is wrong and MySQL is the correct choice, at which point you have to defend your choices.
> both the times I argued that MySQL wasn’t the right approach
Looking at it holistically, MySQL is ALWAYS going to be the correct approach if you already have an internal team of MySQL DBAs. Either your problems are small enough that it really doesn't matter whether you use CSV files, MongoDB, or MySQL, or they are large and important enough that you want to stick with technology you know and understand, even if it requires 25% extra hardware.
We did a project where we were looking into OpenStack, which was objectively the technologically correct choice. Factoring in training, ramp-up cost, and hardware, it made more sense to just pay VMware.
Without knowing you, I suspect the issue is in how you answer such questions, not whether or not you're technically correct. I'd go with the route of presenting two options: the one you find to be "the right approach" and the one that fits into the company's current infrastructure. Coming mostly from the operations side of things, I find that developers can be pretty clueless about the cost and complexity of operations. Often to the point where you wouldn't trust them to design anything unsupervised, because doing so would end up in an operations nightmare.
I am not arguing their decision is wrong on their real-world system. It is a system design interview with a hypothetical situation completely unrelated to their prod system, where I suggest a different db than MySQL. They are the ones arguing that my decision is wrong and MySQL is the correct choice, at which point you have to defend your choices.
Okay, then yes, that's not normal. I'd never tell a candidate that they are just flat-out wrong; that honestly seems unprofessional and kind of aggressive. I still might say something like, "So I'd go with MySQL here myself. Can you guide me through your reasoning for picking another database system, or the circumstances that might cause you to go with MySQL as well?"
What do they use to shard MySQL? Are they using upstream? FB's fork? What storage engine? InnoDB? MyRocks? At millions of q/s, "MySQL" is a bit meaningless :D
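Since "sharding MySQL" comes up here, a minimal sketch of what application-level, key-based shard routing looks like. All hostnames and function names below are hypothetical, and a production router (e.g. a proxy layer such as Vitess) also handles resharding, cross-shard queries, and failover — this only shows the core idea of mapping a key to one of several ordinary MySQL instances:

```python
import hashlib

def shard_for(user_id: int, num_shards: int) -> int:
    """Route a row to a shard by hashing its key.

    Hashing (rather than taking the raw id modulo N) spreads
    sequential ids evenly and avoids hotspotting one shard.
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % num_shards

# Each shard is just an ordinary MySQL instance; the application
# (or a proxy in front of it) sends each query to the right one.
# These DSNs are made up for illustration.
SHARD_DSNS = [f"mysql://shard{i}.db.internal/mydb" for i in range(4)]

def dsn_for(user_id: int) -> str:
    """Pick the connection string for the shard owning this user."""
    return SHARD_DSNS[shard_for(user_id, len(SHARD_DSNS))]
```

The routing function must be stable (the same key always maps to the same shard), which is why resharding is the hard part real systems spend most of their effort on.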
They may be bragging about that, but they still make people use Liquid and tell their customers that features were released when they're in beta for another 3-6 months.
> But Rails doesn't scale so what are we even doing
This is meaningless without context. What is the cost in dollars and engineering time to scale to that level? Would a native image be able to scale to the same level at half the total cost?
You seem to imply that you know better than the CEO whether Rails is a good fit from both a product and a cost perspective. Unless you are the CFO or have inside information, I'd be skeptical of the idea that I, an outsider to Shopify, would know better than they do what works best for them in terms of tech stack and cost savings.
Even if I were a consultant for them, I would first try to understand the current situation, and only then consider whether a native image at half the total cost is a good solution. What if reducing the hosting costs actually damaged their speed of pivoting and adapting to changes?