Take logging for example. If you buy a log aggregation platform like Splunk Cloud or Loggly the pricing is likely based on the quantity of data you ingest per day.
This can set up a weird incentive. If you are already close to the limit of your plan, you'll find that engineers are discouraged from logging new things.
This can have a subtle effect on your culture. Engineers who don't want to get into a budgeting conversation will end up avoiding using key tools, and this can cost you a lot of money in terms of invisible lost productivity.
Tools that charge per-head have a similar problem: if your analytics tool charges per head, your junior engineers won't have access to this. This means you won't build a culture where engineers use analytics to help make decisions.
This is a very tricky dynamic. On the one hand it's clearly completely crazy to invest in building your own logging or analytics solutions - you should be spending engineering effort solving the problems that are unique to your company!
But on the other hand, there are significant, hard-to-measure hidden costs of vendors with billing mechanisms that affect your culture in negative ways.
I don't have a solution to this. It's just something I've encountered that makes the "build vs. buy" decision a lot more subtle than it can first appear.
What killed us was the combination of wanting to log really aggressively on the staging-master and release-candidate environments (five envs total) and wanting to keep the logs for more than 7 days so the QA team could compare the logs from the current release candidate to those from previous release candidates. Between the volume from having all the services set at the info level and the longer retention, Datadog was >30k a month.
Which at least finally got me permission to set up an open source log manager and switch everyone over to that. Once the initial panic over the Datadog bill died down.
The bill will still be in the low thousands but paying 30-40k per year for a logging solution is probably worth the cost if it means you don't have to maintain it. Having logs and metrics (and the APM if you're using it) all in one place is _really_ nice.
When you start you'll likely pay a few hundred dollars per month tops. Your personal time alone is probably worth more than that.
I at least helped 10x the customers and cut costs from when I joined to before I left (stayed a little over a year)… but man, talk about constraints…
Mostly because you don't want pricing to act as a disincentive to exercising the flywheel.
The mechanisms by which your product is marketed, sold and distributed are core to your business flywheel. For example, if you’re a B2B SaaS, distribution is often way more important than the product itself.
So by the “core to our flywheel” standard you need to build your own email marketing software, your own version of Salesforce CRM, your own analytics tool for triggering marketing funnels, etc.
That’s just hilarious. As someone who runs a business, I wouldn’t spend a single minute of engineering resources on anything that we don’t sell directly to a customer.
I’d go even further and say if you have less than 5000 employees, ALWAYS buy. Never build.
For example, let's pretend I have a fictional company that sells a web server. Marketing, sales, and distribution are all important parts of my business, but my "Core" business is the web server software. If I am looking to build reinforcing feedback loops that give me an advantage over time (flywheels) it's my opinion that you shouldn't be looking to build them (at least initially) in marketing, sales, or distribution.
Another option we've found works is to buy just to get started, then invest in our own project going forward. Getting something off the ground quickly tends to help us better understand the requirements.
Salesforce isn't cheap, but it's not exactly expensive either. However, with this short-sighted shortcut you took, you certainly cost your company big. That data is useless. If you upgrade, transition, or migrate, you now have to spend employee time fixing what was a fixed cost.
Human time is expensive. But what is more expensive is putting artificial costs where they don't need to be.
A brutal experience.
On the “clearly crazy to invest in building your own” front I think it’s useful to add the nuance about how much of that time is purely related to the product and how much is the tuning, integration, understanding the data, etc. which every product requires. I think we’re prone to underestimate the latter and have something like Splunk save less time and cost more than expected because the commodity part it optimized for wasn’t as much of the whole as anticipated.
It also meant that whenever the data size increased, however slightly, the invoice also increased, and managers started investigating whether this increase in data size was truly necessary. Sometimes, even when the data size didn't change, managers kept asking whether it could be further decreased.
Making the billing proportional to X means that X will become a political topic. Choose your X carefully!
That's such a great way to put it.
We ended up running our own ELK stack and it was much better.
For exactly this reason I wanted to build a developer screening tool from scratch. In all the companies I've worked for, I never got the budget to pay for one of the available tools in the market (most have "contact us" pricing). The result is that most teams I've met spend a lot of time manually evaluating coding challenges. This is a waste of time both for the company and the candidate. A custom screening tool allows the company to calibrate it according to the perceived average candidate level, and avoid those algorithm questions that are useless for most small to medium companies.
In my previous company it was used in a few hiring processes and it worked fine - both for back-end and front-end development. It is not open source yet because you need some basic Docker and Rails skills to use it - even though the screening test can be in any language(s) you want (as long as it runs on Docker). If it matches your experience and you want to try it out, please contact me.
"Thinking about this more, I realize that this isn’t a technology problem: it’s a process and culture problem. So there should be a process and cultural solution.
One thing that might work would be to explicitly consider this issue in the vendor selection conversations, then document it once the new tool has been implemented.
A company-wide document listing these tools, with clear guidance as to when it’s appropriate to increase capacity/spend and a documented owner (individual or team) plus contact details could really help overcome the standard engineer’s resistance to having conversations about budget."
You basically just articulated the solution. Spend the money to give more of your team access to the analytics tool, using the argument made above.
It will still be infinitely cheaper (by like 100X) than building anything.
And the more you spend with a vendor, the more they’ll be willing to completely customize their product for you.
If you're an engineer in a larger software organization, you probably don't have that spending authority. Which means you have to spend time convincing the people with that authority to spend that money.
That's enough of a friction point that many engineers won't bother - which is why I talk about the invisible cultural damage that this problem causes.
You and I know that it will be cheaper. The challenge is getting organizational buy-in.
From this sentence alone I'm not sure how much time you spent working for a large company.
It's not just that you have to convince your manager: depending on how much is to be spent, they have to convince theirs, often creating very long decision paths.
And it's not only the time you spend but sometimes also the delay incurred that messes with this approach. Often enough, even with buy-in, you get the answer that it can be planned for next year's budget.
One year delay later, you might get the approval, but it may be too late.
And if the company is large enough often they've already built a subpar internal solution that everyone is convinced you should use. Since it's "already working".
The amount of bureaucracy at large companies cannot be overstated, especially when technology is a cost or auxiliary to the core business.
I’ve run into this twice in my 24 year (!! good god, am I that old??) career.
The worst was during a several month consulting gig for an investment bank, and it was extremely painful.
Every change to the dev and uat environments required sign-off and approval, from sys admins who were in a different timezone.
If the changes had cost implications, it would have been far worse.
In larger organizations budgets are complicated things. Your higher-up with spending authority may agree with you, but they've hit the budget for their group, and so your request for more logging is competing with a request to pay for a new compliance auditing tool or upgraded server capacity for machine learning models.
I'm certainly not arguing build over buy here - but this challenge is a genuine issue which I don't think gets enough consideration.
I’ve experienced this many times.
Fast forward almost a year, at least 7 releases, and probably $200k of payroll on our end and their product barely runs, and doesn't handle all of our standard workflow, let alone edge cases. Best we can assume is they were writing the API from scratch because we needed it and didn't have the skill to do so. Documentation was offensively bad when it was correct, which most of the time it was not, to the point where there were several instances of the endpoints in the docs being wrong, causing us to call support to ask what the real endpoints were we were supposed to be calling.
All this to say yeah, buying is great if the company is professional and has a well-documented product, and the sales and technical pre-sales folks know what they're talking about. But that isn't necessarily the case, even for six-figure purchases. And when it's not the case you can easily burn a non-trivial amount of money before you ever realize it.
It’s such an easy trap to fall into, because I hear similar stories all the time.
I guess the moral of the story is to assume sales is lying to you until proven otherwise? Haha not sure, just do as much research as you can but you’ll probably get burned anyway from time to time
Having dealt with enterprise sales countless times: yes they do, and yes, I do presume so.
They lie in many ways, the most innocent one is by demoing unrealistic perfect scenarios without any corner cases, promising integrations that do not exist ("it's in the roadmap!") etc.
This is so standard it doesn't even deserve mentioning.
I mean... talking to people with every incentive in the world to stretch the truth and/or little technical acumen doesn't sound like a recipe for success.
I'm pretty sure my company wouldn't put me on the front lines during sales for reasons like this
Customer "How are you at X and Y"
Me "Oh, our X is really good, on the other hand our Y is utter shite"
Sales rep next to me: evaporates
If the deal closes, Z% of the company ends up being tied up in trying to kludge Y into doing what was sold, spiking engineer burnout and lowering morale, and furthering a negative relationship with sales.
Plus you've now pissed off your new customer, by lying to them.
This was incredibly effective, as it meant that the client actually trusted us when we said something would work.
That being said, I'd normally avoid calling our products crap (even when they were) and just push the client to use something that wasn't crap.
This only worked though, because we had a separate reporting line to sales, so any VP pressure had to go through our VP (which happened, but not as often as you'd think).
My understanding is that they changed this after I left, with predictable consequences.
I had the pleasure of carpooling with some random owner of a small tech firm, and he flat out said it was difficult to keep on top of salespersons.
Meaning they were pretty shifty by nature and hard to trust.
I'm assuming Salesforce (e.g.) isn't going to suddenly pivot into Lyft-for-dogs, or whatever the product is.
This is why pilots exist.
- Researching what your options are / what's already out there.
- Comparing different alternatives.
- "Hopping on a call" with a sales rep to get a product demo (there's this super annoying trend where many SaaS companies' landing pages don't explain what they do and the only option they give is to "schedule a demo").
For CRUD-like internal tools or simple 3rd party integrations, my experience has been that it's often much faster (typically < 1 hour) to build a production-ready app on Retool (https://retool.com) than it is to even get started with SaaS vendors.
- https://www.forestadmin.com // Fastest way to build self-hosted admin panels on top of SQL (Postgres, MySQL...) and Mongo databases
- https://www.appsmith.com // Open source alternative to retool
- https://www.internal.io // A no-code alternative to all the above
- https://www.basedash.com // Very spreadsheet like XP, YC20 startup
Plus maybe I'm weird but I think it's enough to love the energy you can get from building something. It's fun. And considerable portions of work should be fun. Especially if you work for yourself, where building stuff for money, for other people, often isn't enough without a strong connection between the work, customer, and the subject's own interests and values system.
(IMO a lot of those kinds of projects also build on the edge of one's core expertise, extending it outwards, almost like a recon mission, so it's less of a binary yes/no core expertise condition.)
I think this is good if you can get there. The IKEA of software is pure dopamine. If you had to grow and saw those trees, less so. If you are pouring concrete and get it delivered, pure dopamine. If you have to crush the aggregate by hand, less so.
For example make a JSON serialization library and you'll be asked "how do I spawn enemies from JSON" which is arguably entirely orthogonal to your library but don't answer and get downvoted for bad support. Do answer and you'll just be asked more questions about how to adapt your example to their personal project and you'll have to teach them programming in the process.
These incentives make it almost impossible to make money selling 3rd party code (plugins/add-ons). If you charge what it should really cost given the amount of work put in, the market won't bear it. If you charge what the market will bear, you'll go broke.
Maybe Unity should split the market into "Pro" and "Hobbyist" and the Pro market would have prices more in line with what it actually costs. Check out the prices of libraries like Radgametools.com (you'll have to google the prices) for comparison.
I'd much rather occasionally work on a feature for a well-engineered internal reimplementation than constantly fight with shellscripts wrapping around a solution with a huge impedance mismatch.
> Most enterprise systems require an engineering team to keep them running.
If we wanted a team of 3+ to run the cloud then we could buy openstack, or cloudstack, probably 2+, but that's also without pushing features. And suddenly we wouldn't be profitable anymore. I left, so I guess they will find out.
It's not an easy thing to do. I've found that open source libraries are often more stable, require less ongoing maintenance due to API changes and have better support lifetimes compared to the SaaS equivalent.
Risk is low for commodities, but high for anything "disruptive".
1. Buy some platform/framework which you will then need to hire a small army of costly consultants to integrate and customize for your particular business need. Or
2. "Build" your own solution by orchestrating a bunch of open source technologies to solve your problem.
Moreover, it is not quite clear up front whether 1 or 2 will be costlier in terms of development effort and overhead.
However, if you want a bunch of neat features, custom rollouts, and all that stuff, you're simply not going to be able to get the same value (unless you have HUGE scale) building it, and should just buy it.
Most organizations aren't comfortable saying "this is all we'll ever need" and are worried about both building, buying, and then migrating, which can be significantly more expensive than either of those options in a vacuum.
1. "Buying" a solution that solves your problem exactly and doesn't require engineering resources to implement and maintain
2. "Building" a solution where you have to solve every problem from scratch where you don't have any particular expertise.
> is not a binary choice
Then, offering two options. :)
this is where experience matters.
> The problem then with that is everyone who is working on those services is usually trying to get off of them. After all, no one wants to work on something that their boss doesn’t care about.
> Yes, your CI/CD system is absolutely critical, but it’s easy for executives to not think about. This leads to a failure mode where you have a lost garden of internal tools.
I love maintaining stuff. I don't care that my boss doesn't care about something. I care about... what I care about :)
I don't mind the "boring" tasks like keeping library versions up to date, being on the latest runtime version, and using a recent version of the framework we've adopted (we're currently running on an ancient version of it). I like doing that stuff. I like keeping things tidy. If things are up to date, I can move on to making our CI better, or improving our test coverage, or really anything that improves the whole team's productivity. There's always something to update or polish or improve.
My dilemma is that as much as I'm willing to do all that stuff, I'm essentially not allowed to. My lead and their boss say my skills are too valuable to be spent on that, so instead I must do things like lead the new team they're forming to build a new service... which I'm not interested in at all. I'm not keen on the whole lead thing. Glad to be a follower. I'm also not thrilled about building new stuff. I like to say I'm more like a car mechanic rather than an engineer: I like tweaking and tuning and fixing existing stuff, not creating new stuff.
Does anyone else feel like this? I've tried bringing it up in chats with coworkers and everyone looks at me like I have two heads.
I think people with your interests are really valuable. We've changed around what we index on for performance evals/promotions and what work we assign them so that we can better accommodate these sorts of work preferences and skills. If someone is happier working on an essential but unsexy, unloved corner of your systems or infrastructure, and they're also many times more productive in that area than other engineers who have no interest in it, then you might as well take advantage of it!
The key factor is making sure that the work those people are doing is truly high impact (for example, CI improvement might reduce deployment failures, improve team deployment velocity) and not simply maintenance for sake of abstract cleanliness.
It's really cool that you've done that. And I'm glad to see I'm not alone in this.
You could probably make a career out of being "the optimization guy". Product isn't working well? Ask the optimization guy to take a look before somebody spends a million bucks trying to replace it.
So yes, your skills are too valuable. All engineering skills are. That's the point of buy vs build.
If you truly feel you aren't being allowed to work on the things you care about, it's possible you're simply in the wrong role or company.
If anyone feels like it's hard to get your company to care about this stuff or even let you work on it, maybe it's time to consider a move? :-) You can reach me at email@example.com if you prefer not to reply here.
Systems of Engagement & Systems of Record should employ a buy-first strategy. These are typically more operations-focused systems and as such you're usually better off buying them, but not always (see below). It's possible that some of these systems may also be Systems of Differentiation.
What is a buy-first strategy? It means consider buying first. How much would your operations be impacted by the necessary changes to your workflow? How much customization will be required in order for it to be usable and how are those customizations maintained throughout product upgrades? How much integration is required with existing systems? Buying software is rarely a buy, deploy, and done proposition! Usually the TCO is lower if you buy, but not always - depending on how much customization and integration is required.
So there's a paradox I see which is akin to the credit system where only people who don't need credit are offered it: you can only "afford" to buy instead of build when you have the engineering competence to build - that's when you can intelligently choose not to.
On the other hand, if you lack the internal competence to build and for that reason choose to buy? That's when all the bad things happen. You're going to get screwed by your vendor - they are going to know you aren't technically capable of supporting yourself, they will give you stupid timelines, blown out costs, unreasonable constraints ("it has to have a 16 core, 256GB RAM server or we won't support it") etc etc. You will end up with the worst case of vendor lock-in and a system everybody hates.
So one of the pitches I make when people are making this decision is that you should be building at least some things, because the strategic cost of not doing that affects everything else you do.
1. It is easy to run your own services. You should be investing in internal developer tooling that makes this easy, and in fact that developer tooling should sit in front of any vendor solution you ever buy so that the way it integrates with your alerting, monitoring, data exporting, etc., is completely standardized to be uniform with service delivery in any other system in the org.
2. and 3. You do need complete control over what the application does because it absolutely always is unique and special on a per-use-case basis every time. A good example is search. Anyone who thinks search is a commodity service you can just throw ElasticSearch or Algolia in front of is sorely mistaken and dangerously naive. Every different search use case is going to have different success criteria, different data privacy concerns, different timeliness and freshness concerns, etc. and you need business software to control these elements in ways that fit into standard internal product management and QA procedures.
4. Vendor lock-in is a critical problem. If you choose GCP vs AWS, you are defining culture and you are defining experimentation and exploration that you cannot do. You’re essentially cleaving away many future possibilities from even being testable. It’s much worse than just having a crufty old system to maintain, it’s about brittleness and lack of ability to appropriately empower engineers to consider whatever part of the solution space they decide is needed. Companies that “get it” will prioritize “ease of swapping” so that you can constantly improve and leverage autonomy without needless parochial constraints on what can be considered. Thinking, “yeah but just buying it solved our problem today” is such a death knell of weak leadership who cannot fathom strategy or how to leverage real solution ideation from their staff.
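To make points 1 and 4 a bit more concrete, here is a rough sketch of what "tooling that sits in front of the vendor" could look like, using the search example. Everything in it is hypothetical: `SearchBackend`, `InMemorySearch` and `InstrumentedSearch` are names made up for illustration, and the in-memory backend only stands in for whatever adapter you would write around ElasticSearch, Algolia or an in-house service. The idea is just that application code depends on one small interface, and the wrapper records usage the same way no matter which backend is plugged in.

```python
from dataclasses import dataclass, field
from typing import Protocol


class SearchBackend(Protocol):
    """The only thing application code is allowed to depend on (hypothetical)."""

    def search(self, query: str) -> list[str]: ...


class InMemorySearch:
    """Stand-in backend; a vendor adapter would expose the same method."""

    def __init__(self, documents: list[str]) -> None:
        self._documents = documents

    def search(self, query: str) -> list[str]:
        # Naive case-insensitive substring match, good enough for a sketch.
        needle = query.lower()
        return [doc for doc in self._documents if needle in doc.lower()]


@dataclass
class InstrumentedSearch:
    """Wraps any backend so usage is recorded uniformly, whoever the vendor is."""

    backend: SearchBackend
    calls: int = 0
    queries: list[str] = field(default_factory=list)

    def search(self, query: str) -> list[str]:
        self.calls += 1
        self.queries.append(query)
        return self.backend.search(query)


search = InstrumentedSearch(InMemorySearch(["Build vs buy", "Vendor lock-in"]))
results = search.search("vendor")
```

Swapping vendors then means writing one new adapter class rather than touching every caller, which is roughly the "ease of swapping" being argued for.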
I'm especially angry at the dismissal of the vendor lock-in problem in the main article. I've seen quite a few start-ups trapped by vendors that were very cheap at first (trial period, starter plans, etc.) and ended up eating much of the profits. Not to mention the innovation cost...
For example, a service provider will have a 5-hour outage without any clear indication about what's going on or when it will get resolved. This leaves us in a difficult position: we have to raise an incident with an unknown ETA where we can do nothing but wait. Then when it's all resolved, we either don't get a post-mortem or get a hand-wavy one.
Or a service provider will lose data, which may have been minor to them but critical to our operations. And all we get from them is an "oopsie!"
So over time, we've ended up insourcing mission critical pieces (within reason) since we can engineer solutions with guarantees that are in line with customer expectations and our own targets.
Separately, in some cases, buying and integrating turned out to be a ton more work than building in house would have been. Because when we build in house, it can be built to meet our full requirements in the first place, whereas when buying, we may find ourselves working around limitations in the APIs provided to us.
0: https://www.enchant.com - shared inboxes, knowledge bases, live chat
> My counter-argument to that is there is also lock-in with internal systems. The most common version of this is the keeper of the spreadsheet.
The author then disparages spreadsheets as becoming the exclusive domain of one employee who wouldn't want processes to change.
In reality and my experience, though, spreadsheets are one of the most versatile and accessible systems, and their close cousins (Airtable, Notion) great as well! You can customize it to your own processes and they're pretty universally understood, so the barrier to change is pretty low.
Their close cousins are locked, proprietary, slow, bloated, exceedingly complex, poorly designed (UI/UX), emojized, VC-backed feature extravaganzas with a subscription fee.
Excel experts use the keyboard exclusively; their keystrokes are a melody of efficiency, expressivity and productivity that continues to be mocked in similar fashion to the 2007-era Mac vs. PC advertisements.
The alternative could instead be a system with open standards that many vendors implement, and only relying on standardized behavior.
This works to some extent with for example SQL or C, where you can migrate from one DB or compiler to another with limited effort.
Unless you are actually purchasing whatever it is you’re using... and I’m guessing they aren’t doing that.
Problems mounted - cost, security, integration issues, performance, ... - we couldn’t do anything about any of that, because we just wrote glue code to combine all the SaaS stuff.
I think about that job from time to time and I have two theories as to why this guy was like this - a) lack of trust/knowledge his engineers can build stuff, b) bragging about using a shiny-new-tech is better than talking about writing a Python script that saves data to Postgres.
All in all - good points in the article, but there is more to the decision (as hinted in the summary), building stuff is not just about scratching one’s itch.
It’s cost dependent and you should carefully study both options.
It’s part of the engineering: study fixed and variable costs.
Outsourcing is cost effective when the market is mature enough.
You should maybe not rebuild cloudflare’s core services to operate a website.
Depending on your scale, you certainly should build your own ML stack for instance.
There are personal reasons as well:
5. Innate curiosity and excitement about technology
6a. Get experience
6b. Resume talking point
Many of us picked a career in IT because just like building things out of Legos is fun, building a streamlined full CI/CD pipeline is fun, building a full stack application is fun, etc. Speaking of career, one needs to acquire experience in the new technologies to pad their resume and it's convenient to do it on company time.
I'm not justifying putting your interests ahead of your company's, but understand that's what some of us do. I'd say #5 and #6 are often stronger drivers for decisions than #1-4.
Further, you need to consider that the stuff you buy also needs to stay aligned with your needs over time.
However, you also need to consider this when building stuff.
The alignment is the tricky thing. It is easier to achieve when building in-house, as long as your engineering is up to the job.
Also, note that building in-house doesn't mean engineering "from scratch". Usually it means use lower level components over higher level components/services.
I have a small paragraph from a blog post where I explain this with some math that is probably crappy but captures the idea:
"When you spend time in areas that are not the core of your product, you're actually being financially inefficient. Taking screenshots is likely not a core task of your business so it doesn't make much sense to waste development resources in this area.
Here is a brief example of how inefficient it gets:
A mid-level developer in a small market earns $90K USD per year, working 40 hours per week. His/her effective hourly rate is $47 USD per hour.
Writing a basic implementation of a screenshot utility will take at least 5 hours. But this is just a simple prototype. Many use cases will need to be addressed, and there's likely going to be a large amount of time spent in optimizing, securing, provisioning, testing, etc.
A realistic utility that can be used for a development workflow is going to take at the very least 30 hours of development time, but likely more.
Other un-accounted areas of development time are: documentation, training and maintenance.
Given this scenario, it is very likely that writing a semi-good solution will take more than a week, and maintaining it will take at least an hour every month.
This means that writing this solution will cost almost $2000 USD of development time plus another $500 USD or so, just to support it every year. And of course, this doesn't include the cost of the infrastructure you're paying to run it."
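For anyone who wants to plug in their own numbers, the arithmetic in that quote can be sketched roughly like this. One assumption is mine and not in the quote: 48 working weeks per year, which is what makes $90K come out at about $47 per hour (with 52 weeks it would be closer to $43). The 40-hour build figure is my reading of "more than a week", and the function names are made up for illustration.

```python
ANNUAL_SALARY_USD = 90_000
HOURS_PER_WEEK = 40
WORKING_WEEKS_PER_YEAR = 48  # assumption, not stated in the quote


def hourly_rate(salary: float, hours_per_week: float, weeks_per_year: float) -> float:
    """Effective hourly cost of a salaried developer."""
    return salary / (hours_per_week * weeks_per_year)


def build_cost(rate: float, build_hours: float) -> float:
    """One-off cost of writing the tool."""
    return rate * build_hours


def yearly_maintenance_cost(rate: float, hours_per_month: float) -> float:
    """Recurring cost of keeping the tool alive for a year."""
    return rate * hours_per_month * 12


rate = hourly_rate(ANNUAL_SALARY_USD, HOURS_PER_WEEK, WORKING_WEEKS_PER_YEAR)
one_week_build = build_cost(rate, 40)          # "more than a week" of work
maintenance = yearly_maintenance_cost(rate, 1)  # "an hour every month"

print(f"hourly rate: ${rate:.2f}")
print(f"build (40h): ${one_week_build:.2f}")
print(f"maintenance per year: ${maintenance:.2f}")
```

Under those assumptions the build lands a bit under $2000 and the upkeep a bit over $500 per year, which matches the ballpark in the quote. Infrastructure cost is still left out, as in the original.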
My argument is that the more utilitarian the service, the more cost-effective it is to buy it instead of writing it. Writing it could be simple, but there's a drain on engineering resources when you try to maintain it in the long run.
Nobody cares about your business as much as you. The motivation behind your product/service provider is to take your money. Buying will always be a battle with the other party to get value for your money. They are strongly motivated to offer as little work for as much money as they can get away with, because that is where profit is made.
The provider might even be more competent than you in building. But think about hiring a good programmer that just doesn't want to work vs a mediocre one that is highly motivated. I think everybody has come across these examples and knows the outcomes.
He was incredibly stingy with money and always wanted to know how much it would cost him to have his engineering team implement something vs how much it would cost to buy it and integrate it.
Admittedly he was pathological about this to the point of having us create a timer that counted in pounds sterling that he'd start running at the start of every meeting, but I feel like it made me way more efficient with my time (and hate long meetings)
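A timer like that is only a few lines, for what it's worth. This is a minimal sketch and not the one we actually built: the class name, the injectable clock, and the hourly rates in the example are all made up for illustration.

```python
import time
from typing import Callable, Sequence


class MeetingCostTimer:
    """Running cost of a meeting in pounds, given each attendee's hourly cost."""

    def __init__(
        self,
        hourly_rates_gbp: Sequence[float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._total_hourly_gbp = sum(hourly_rates_gbp)
        self._clock = clock  # injectable so the timer can be tested without waiting
        self._started_at = None

    def start(self) -> None:
        self._started_at = self._clock()

    def cost_so_far(self) -> float:
        """Pounds spent since start(); 0.0 if the meeting has not started."""
        if self._started_at is None:
            return 0.0
        elapsed_seconds = self._clock() - self._started_at
        return elapsed_seconds * self._total_hourly_gbp / 3600


# Example with a fake clock: three attendees, half an hour in.
fake_now = [0.0]
timer = MeetingCostTimer([50, 50, 80], clock=lambda: fake_now[0])
timer.start()
fake_now[0] = 1800.0
half_hour_cost = timer.cost_so_far()
```

With the default clock you would call `start()` when the meeting begins and poll `cost_so_far()` from whatever is drawing the display.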
As for the subject itself: frankly I do not see any advantage of hosting on say Azure over hosting myself. Either requires a good deal of maintenance. And no you can not really rely on Azure doing it for you.
Uptime for me is probably just as good on my servers, as each one I have in my office also has a ready-to-use, up-to-date shadow copy located elsewhere. Azure is way more expensive of course.
The amount of hardware/processing power I have on my own server would cost me a fortune to have on Azure. Scalability does not matter much either because:
I am not Google and do not have to serve the rest of the world.
On top of that my servers are usually high performance native C++ applications that can handle thousands of requests per second sustainably and without breaking a sweat.
Vertical scalability with modern CPUs and multilevel storage is absolutely insane.
In my opinion, cloud hosting only makes sense if your systems are architected around being hosted on the cloud.
If you can build your application in such a way that it runs in the free tier of serverless, then it's going to be considerably more cost effective than renting a dedicated server - but if you can't, then it absolutely won't be.
Serverless - that would be a vendor/architectural trap. Besides, the free tier does not come anywhere close to being able to serve my applications. They serve real medium/big size businesses.
You also mentioned free tier. To me it is irrelevant as the amount of resources it gives is useless for me.
The only thing I can really offer you is that in the spirit of this article, when I did some work that would run on AWS Lambda it was very handy not to have to think about any of the infrastructure that was around the business logic I needed to code.
I fully expect that code will spend its entire useful lifetime running in AWS Lambda with no need for it to escape the vendor or technology behind it. It's even possible no one will think about it until it breaks and stops sending events.
If it costs the company £50/hour for me to look into something, me being able to complete a task quickly and then never look into it again is almost certainly going to save more money than writing a dedicated process and running it on a physical server that I then need to maintain.
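A rough sketch of that comparison in Python. Only the £50/hour figure comes from the comment above; every other number (build hours, upkeep hours, hosting fees, the three-year horizon) is an illustrative assumption, so plug in your own.

```python
HOURLY_COST_GBP = 50  # engineer cost per hour, from the comment above


def total_cost(build_hours, upkeep_hours_per_year, hosting_per_year_gbp, years):
    """Engineer time plus hosting fees over the given number of years."""
    engineer_hours = build_hours + upkeep_hours_per_year * years
    return engineer_hours * HOURLY_COST_GBP + hosting_per_year_gbp * years


# Assumed inputs: a small function on a managed platform that is rarely touched...
lambda_cost = total_cost(build_hours=8, upkeep_hours_per_year=2,
                         hosting_per_year_gbp=60, years=3)

# ...versus a dedicated process on a server that needs patching and babysitting.
server_cost = total_cost(build_hours=24, upkeep_hours_per_year=20,
                         hosting_per_year_gbp=600, years=3)

print(f"managed: £{lambda_cost}, self-run: £{server_cost}")
```

With these made-up inputs the engineer time dominates both totals, which is the point being made: the hosting bill is rarely the biggest line.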
For a VC-backed startup with millions galore and a pressure to beat competition, buying is natural.
For a bootstrapped company or a side project, building your own may be best, especially when the advanced features that make market-leading solutions more expensive do not add a lot of value at the company's current scale, and the growth is not exponential.
As long as you're wiring those things together in ways you can replicate across cloud providers or in your own datacenter or on your own server in the closet, you've achieved the sweet spot in terms of deployment portability.
Running your own (Kubernetes cluster|database instance|pubsub system) on a cloud provider is crazy given that you now have at least a $150K+/yr piece of overhead in the form of at least one (dev)ops person. Save that expense for when or if you decide your needs have become predictable enough that you can start building your own software/hardware/network infrastructure.
Unless advanced devops fu is an important part of your company's mission or valuation, don't do it unless it's saving you a lot of money over the alternatives.
First, check the motivations. Emotional or logical. If the driving motive(s) is ego, that's already a big, red flag. Consider running like a bull in reverse.
Second, intel. Gather information about the situation; educate yourself. Find the apples on the road.
Third, formulate a vision, an image of the final result that all the stakeholders (those whom you cannot ignore, no matter how hard you try) agree on. This is where beautiful diagrams are created. Which don't matter. All that matters is each stakeholder signs & dates it.
Fourth, plan. Use your knowledge gained in step two to formulate a plan for getting from the current situation to the "vision".
Fifth, execute said plan. Find more road apples.
Sixth, debrief. Analyze the outcome. Get a drink with the ones who did the work.
This can all boil down to just you spending 2-3 days running around doing it yourself, or multiple teams spending most of a year herding cats. The important part is knowing the steps, and mentally looking for them, and for those who push to skip them.
This paragraph is confusing me, either it's missing a 'not' or something, or I'm just confused. Is anyone following?
> The question you should be asking is what else could be done instead of tuning your own stuff or building a new internal system. The answer is usually spending more time coming up with the correct architecture instead of fighting fires or developing actual customer-facing features.
Instead of building internal systems you would be coming up with the correct architecture instead of fighting fires or developing actual customer-facing features? Wait, what? There are too many 'instead of's in there and I'm not sure they are all meant how they are said.
I think developing actual customer-facing features is the goal, but here it sounds like the thing you would like to avoid... and I'm not really sure about fighting fires (ideally you want a system where you do less of that, right?) or coming up with the correct architecture (?).
We, software developers and users - people and small organizations - should adopt the opposite slogan:
Build, Don't Buy.
When we build, we improve our knowledge and understanding of systems. We help affect them, through engagement with their developers or through modification and forking. When the challenges and overhead of building exceed our abilities - this will drive us to:
1. Strive for better-fitting software - easier to use, deploy, build and maintain; and
2. Cooperate and collaborate with related organizations and communities/groups of developers and enthusiasts, either to share knowledge on how to get things done more easily, or to distribute the load of work between many parties, each of which can't do it on their own.
Even if a lot of FOSS contributions these days are made at for-profit corporations - the above is key to the future of both software freedom and social freedom.
Let me mention one story here to make the point in the small. During the 80s/90s HP did selectively less OEM work. They bought Fujitsu wave soldering and pick-and-place machines. That is something like 1 million per line. That kind of equipment comes with a serious stack of user manuals. HP threw those out and provided their own set with training. Why? Any fool can plunk down money. Smart OEMs needed a way to get their line engineers to "know how the line thinks" and to move jobs in and out of the line smoothly in an integrated process with attention to SPC, quality and metrics tracking. Yes, they bought, but not by going ignorant.
If there's a general rule for making these decisions, I haven't found one. What's helped me, though, is asking whether or not the problems my team has are any different from those of most other teams. Sometimes it's true, like when I was at an entertainment company working on asset management. Sometimes it isn't, like when I was at the same company figuring out an AB testing solution.
Sometimes "building" means writing 6 shell scripts (each <10 LoC) and "buying" means getting CKA/CKAD certificates for all people involved and spending half a year on training.
For customers, it's better to consciously give their data to one provider within one chosen jurisdiction than to have personal data, documents, activity logs, etc. moved between multiple 3rd parties, and to have to do due diligence on every web service that receives any of their data.
Please, respect your customers and don't share their activity when it's not necessary and costs just a few hundred bucks per month.
Treating humans as "units" instead of as people with rights is a big problem of IT businesses.
I think there are some things that do make little sense for anyone to build unless it's super core. These tend to be low level, cheap developer tools that do a couple of things well.
Trying to "buy and then integrate" larger pieces of software is often more work. Lots of people "buy" because they don't understand the problem. It's a mental opting out. This leads to disjointed experiences internally and externally.
But it all depends on your situation.
1 There is some other expense you can save significantly on by DIY. An example would be DIY k8s on bare metal via bare metal providers for a massively bandwidth intensive service where paying AWS or GCP outbound data rates would be thousands and thousands a month.
2 You are at sufficient scale that there is an economy of scale that can be accessed. This overlaps a bit with the first point but can also happen if you are just big enough. The first point is more about some special need.
Some engineers love to say "I can build this over a weekend". But the main cost is almost always on the maintenance side.
I forgot where I saw this number - in the entire lifecycle of a piece of software, (on average) maintenance cost is at least 8x of the initial dev cost.
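To make that concrete, a back-of-the-envelope sketch in Python. The 8x multiplier is the half-remembered figure from above, not a sourced number, and the weekend length and hourly rate are assumptions for illustration only.

```python
def initial_cost(hours, hourly_rate):
    """Cost of the first version: hours worked times the hourly rate."""
    return hours * hourly_rate


def lifetime_cost(initial, maintenance_multiplier=8):
    """Initial build plus maintenance, with maintenance as a multiple of the build."""
    return initial + initial * maintenance_multiplier


# Assumed: a 16-hour "weekend" at a hypothetical rate of 100 per hour.
weekend_build = initial_cost(hours=16, hourly_rate=100)
total = lifetime_cost(weekend_build)

print(f"weekend build: {weekend_build}, over its lifetime: {total}")
```

So under that multiplier the "weekend" is about a ninth of what you are really signing up for.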
Yes, buying a solution also costs time (and $$$) to integrate and maintain, e.g., using a 3rd party API - time to write code, time to set up monitoring / alerting, error handling... There are always exceptions, edge cases, special situations...
But in general there are fewer and fewer things that a web company needs to build from scratch. It takes less engineering time to launch a web product than before, thanks to all those out-of-the-box solutions, e.g., SaaS, APIs...
I like building a software/web business today more than 10 years ago :) One-person (or tiny team) web businesses will be very common (I'm running one), and running an API business is not bad because it actually saves customers time & money (vs building one in-house), thus they are willing to pay a small fee (compared with hiring one or more full-time engineers).
The word “maintenance” can’t perfectly capture the meaning of all the continuous development, small incremental improvements, bug fixes, operational tasks...
The biggest selling points of such a distribution system should be:
1) It's REALLY easy to set-up, no technical knowledge required.
2) It doesn't require any type of maintenance (eg. can run perfectly at least one year without changing anything, this means that the software should allow for OTA updates).
3) It is a lot (10x) cheaper than a SaaS alternative.
For the service provider there is a fairly fixed to linear base overhead, so it does not make economic sense to make a special offering for the 1/10th or less of functionality you might require at 1/10th of the price.
So for the small business it can mean they get away with a homegrown minimal service at a fraction of the cost, even though the 'quality' of what is produced, for general applications, might also be just a fraction of what integrating the 'off the shelf' offering would give.
I've seen some startups running major deficits simply because they went all out on integrating external services whose aggregated cost far outstripped the margin their market allowed for.
What am I supposed to buy, except some PDF/Excel/whatever library every now and then?
But maybe I'm just not the target audience for this article.
I had just had a look at Azure and had seen their Logic Apps, and quickly whipped up one which downloaded the mail and uploaded the files to SFTP. Nice.
Until a few months ago when it suddenly stopped working because for some reason Azure doesn't like the key exchange algorithms Bitvise SSH server provides. Or at least, that's as far as I've been able to determine. Zero logging on the Azure side so no clue what's wrong.
After spending hours trying to figure out what went wrong, I made my own program that does the same. It took a bit longer than the Logic App, but at least I can fix it when it breaks...
No, we don't need k8s, Kafka, a home rolled CDN, custom authentication, etc.
Would have probably spent a lot more time wrestling with a more general solution to make it fit our particular environment.
Some interesting takes
1) There's more to it than the 'risk' factor, consider value & costs!
2) Ask yourself "What can I actually build right now?" and be brutally honest before attempting any comparison (vs external tools)
3) Solutions that hinder your efforts to move any component outside their ecosystem are best avoided. These prove to be costly in the long run.
However, at the end, somewhat more sensible advice is presented:
> I suggest that you prioritize buying to building, unless building will provide a real sustainable advantage for the business.
I generally only buy things to save time when I really don't have it, and usually as a temporary solution. Buying isn't an all-in or all-out solution. There are many levels at which you can decide to either buy or maintain the thing yourself. In a lot of average/usual situations, reading how to properly set up the service rather than just buying it from x-as-a-service will provide valuable learning not just for the setup itself, but also for the tooling used around it, which will make developers even happier and more productive. For example, I wouldn't buy Redis/Elasticsearch as a service if I can spare a couple of days to read how to set it up (and of course there are always exceptions). Care about security first, then efficiency. It will take time, but it's not as difficult as it's made to seem.
Quick example, a few years ago, I witnessed a company save more than 80% of their hosting cost by switching from Heroku to a simple EC2/Ansible setup. They're still buying, but there are levels. Also, it helps to know what are the actual requirements when you're buying.
Thinking that buying everything as-a-service will save time is a misconception imo, as you'll likely end up having to spend time reading the docs of whatever you're buying. Decision changes will also cost more time because unless that service was using an open standard, you'll likely end up having to change a few things.
There's great value in learning open standards and how to automate setups. With that being said, no one can learn everything at once, so if the need is urgent and you're short on time and have properly studied the financial model of the service, yes, buying is a good option.
Not related to this, but I've been having a lot of fun messing around with Nomad. I've also learned some very interesting things about how Docker networking works because of it.
I wrote a piece just a few days ago talking about the same topic, although it was aimed at earlier stage developers, so there is less detail, but the overall message is similar:
Deciding when to build a custom solution in web development
But do you know how many times engineers started to build something and made a discovery that resulted in inventing something significant? Or created something that gave their company a significant advantage? A not small number of times.
So much effort is spent on hypotheticals. If you think about a problem for too long, it will seem like a worse problem than it is.
Given the limited number of projects creating value, it's often a choice of working on the next failed project or saving the company 100 grand.
It's often simpler to take open source software and adapt it to your needs than to buy commercial software.
I want to agree with the principle that you build what is essential to your business and you buy in the rest. As both an entrepreneur and a developer, I want to focus on the unique, value-adding parts of whatever my business is doing. I want to let someone else handle the mechanical stuff and the formalities, and I have no problem with paying a fair price for that.
But that whole argument is predicated on the idea that outsourcing will get an overall better result than building in-house, so that in some way it saves time and/or money that you can better invest elsewhere. So many services we've used have changed or discontinued functionality and forced us to relearn or reintegrate just to avoid going backwards, so many services we've used have pushed their prices up in real terms while adding little of extra value to us, so many services only ever reach 75% of what we'd like from them and the missing 25% of functionality hurts more and more, so many services we've used just don't work so often that you wonder how they manage to stay in business at all, that I now find it hard to trust or expect good results from any external service at all.
Sometimes you have no choice about outsourcing. No small business has the resources to do some things, particularly in heavily regulated areas like payments and tax, or in large-scale infrastructure like operating a data centre, unless that is part of the purpose of the business. So you choose the least of evils and hope for the best.
But today, as someone just starting to set up another business, I am looking for services that handle everything in a particular area where for practical reasons we can't do it in-house. I want to specify what we need, and have it done and the results delivered. I don't want to care about anything in between. I don't want to be distracted by any related legal and regulatory compliance matters. I just want the job done, properly and legally, for a reasonable price. I have little interest in working with anyone offering less than that, unless I have absolutely no choice.
Maybe this means I actually do agree with the original principle after all, and I've just learned from experience that an extreme interpretation of it is the way to go.
If the answer is no, lean towards buy. If the answer is yes, lean towards build.
For ops, Netflix would say "yes" whereas most companies would say no.
This is particularly true when you have to pay to add some missing feature you need. They will, naturally, try to ask for as much as possible to get what you want.
Others are natural consumers: they develop tastes and grow to be rabid shoppers. It's good to sit in the middle of these two opposites, but often better to decide which extreme you want.
I know I've made the decision too late in life, when I wasted a vast portion of it trying to sell some product/service/idea when really I just wanted the latest Porsche!
In this case we assume there is a cashflow of some sort. How the cashflow comes into being is a public matter for most, but sometimes a private matter. It is assumed that a certain amount of disposable income can be used to acquire tastes and become a 'shopaholic' of sorts.
If you are in a decision-making position, I struggle to see why you would want to spend so much energy installing a Prometheus + Grafana stack instead of just using DataDog.
Even worse if your company attempts to build Prometheus/Grafana from scratch. Just why? What a huge waste of money.
If it contributes to your core business then ok, I can see why you may want to build custom solutions.
Also, with all bought systems there is a hidden cost in managing the legion of user accounts needed to access these new services. It's rarely as easy as just setting up SAML or single sign-on.
The final thing that people never talk about is cost. Yes, there is the cost of an engineer to set it up, but often that cost is lower than the cost of the bought service. This is especially true of pay-per-API-call type services.
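One way to sanity-check that for a pay-per-call service is to work out the call volume at which doing it in-house breaks even. A minimal Python sketch follows; the function name and all the figures in the example are assumptions for illustration, not real vendor pricing.

```python
def break_even_calls_per_month(setup_cost, in_house_monthly, price_per_call, months):
    """Monthly call volume above which in-house is cheaper over the period.

    In-house total = one-off setup + running cost per month.
    Vendor total   = calls per month * price per call * months.
    """
    in_house_total = setup_cost + in_house_monthly * months
    return in_house_total / (price_per_call * months)


# Assumed: 6000 of engineer time to set up, 100/month to run,
# a vendor charging 0.01 per call, compared over 12 months.
threshold = break_even_calls_per_month(
    setup_cost=6000, in_house_monthly=100, price_per_call=0.01, months=12
)

print(f"in-house wins above roughly {threshold:,.0f} calls per month")
```

Below the threshold the vendor is the cheaper option, so this cuts both ways. It also ignores ongoing maintenance time beyond the monthly running cost, which you may want to fold into in_house_monthly.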
Because it could be done in a few commands the second time. And learning is only possible when wasting time on these kinds of things, rather than trying hard to custom fit your use case using some buyable tool.
That being said, I kind of partially agree with you and the author though. Don't maintain a service just to feel that you are in control, with the exception of doing it for learning purposes.
btw, the build vs buy should be a topic of discussion whenever people want to build something. at an absolute minimum they need to understand (and use) what’s already out there before they embark on the journey of building themselves
Ours is the usual case: we have homegrown brokered message queues plus Kafka and 21 more. We have a homegrown single-master replicated DB, but also use DB2, Oracle, Postgres, MySQL. We have private in-core caches with speciality code to knit together data and also use Memcache, Redis. Some of that code is installed from DPKGs; some of it is slightly edited code installed as a service so internal users aren't bothered with the icky details.
We have the Jekyll & Hyde behavior mentioned in OP's write-up and from commenters: on stuff considered core to the company and clients for a sufficiently complex system, we're not going to rely on 3rd party trouble ticketing systems and support. It's in-house. But that resolve falls away in stages and never lasts. We also have people who strongly push the business practical side: we're not here to write Azure or Kafka. We're here to make business apps. Get focused, and get client focused. That comes with an interesting blend of boredom and brand awareness gone wrong: I looked around & I saw or heard people in Dept X are working on caching solutions so ... what'd be the point of you doing it?
Now, here's the interesting question: suppose we needed a distributed ledger (without blockchain) or we needed 2PC for some homegrown caching. Now what? Do we write that in-house as a reusable component? Dig it out of PG or MySQL code if they have it and we know where to look? Suppose we make a business case to upgrade a cluster to kernel bypass NICs. Should we go into Redis and self-modify the code to support that I/O path? No? Frankly, what then is the purpose of a CS guy/gal with a MS in distributed computing otherwise?
My own perspective on this:
- The first 1/3rd of this back and forth is the proxy conversation that flies above reality which is more about risk management and perception of failure i.e. company reputation & gossip mongering.
- The second 1/3rd of this is what people talk about because we can't talk about the underlying CS concepts. Most company employees usually have no ability to formally describe, or otherwise even sketch, 2PC by giving pseudo-code for it on a whiteboard. Or take this another way: see if you can get the lead on your homegrown replicated DB to explain how it works and why it works like we'd see in a senior level CS class: clear, specific, and yet abstract enough not to be talking about superfluous implementation details. Can the audience understand it?
- Insufficient grounding in customer needs, which one requires to better determine what clients really need and what's lacking in the current offerings. Failure here will send you on goose chases to nowhere.
Gedanken experiment: you and I start a new software company broadly in distributed ledgers, caches. We are able to get 25% of MS' distributed computing leads --- not all --- but none of these guys and gals are slouches. What now? Do we write it in house or expand and extend some IBM offering? The difference is we have talent, and there's some reason to believe they can deal with a distributed system's worth of risk.
To end this with humor, here's a nice analogy I once heard: On a Saturday, mom and dad were sick of the noise of their three kids. And the house was a mess. So they send them outdoors to work their energy off. At sunset the house is clean, and they're sitting on the porch watching the sun go down over some wine. They see the kids coming back: they are filthy dirty. They see work: all 3 kids need baths. The clothes have to be washed. And they'll probably destroy the bathrooms they just cleaned. So the father looks at the mother and says: As I see it, we could clean these kids up or make a new one. Not sure if making a new one is wisdom, but it's a hell of a lot more fun.
LOL this is hilarious. Because outsourcing to consulting firms (cough IBM cough) has never resulted in blow-outs and poor delivery.
Buying almost always means being trapped in a snake-oil contract that requires ongoing consulting costs that quickly outweigh the costs of the actual software.
> This can be avoided by making sure that the key differentiators for your business are in-house.
Oh....so you are saying we should build now?