
Software has diseconomies of scale, not economies of scale - henrik_w
http://allankelly.blogspot.com/2015/10/software-has-diseconomies-of-scale-not.html
======
ScottBurson
There's a lot more that can be said about how things change as the scale of a
software system grows.

Every order of magnitude increase requires a new level of discipline. At 10^3
lines, you can do whatever you want -- all your function and variable names
can be one or two letters, you don't need comments (or indeed any
documentation), your functions don't need well-defined contracts because they
only need to work in a few cases, etc. etc. At 10^4 lines, if you're smart you
can still get away with making a mess, but it starts to become helpful to name
things carefully, to add a few comments, to clear away dead code, to fuss a
little over style and readability. At 10^5 lines, those things are not just
helpful but necessary, and new things start to matter. It helps to think about
your module boundaries and contracts more carefully. You need to minimize the
preconditions of your contracts as much as practical -- meaning, make your
functions handle all the corner cases you can -- because you can no longer
mentally track all the restrictions and special cases. By 10^6 lines,
architecture has become more important than coding. Clean interfaces are
essential. Minimizing coupling is a major concern. It's easier to work on 10
10^5-line programs than one 10^6-line program, so the goal is to make the
system behave, as much as possible, like a weakly interacting collection of
subsystems.
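
To make the precondition point concrete, here's a hypothetical C# sketch
(all names are invented for illustration):

```csharp
// Hypothetical sketch: the same operation with a narrow contract
// vs. a minimized-precondition one.
public static class Ids
{
    // Narrow: callers must pass a non-null, non-empty, trimmed string,
    // one more restriction to track mentally across the codebase.
    public static string NormalizeIdNarrow(string id) =>
        id.Trim().ToUpperInvariant();

    // Minimized preconditions: the function absorbs the corner cases,
    // so callers in a 10^5-line system don't have to remember them.
    public static string NormalizeId(string? id) =>
        string.IsNullOrWhiteSpace(id)
            ? string.Empty                 // defined behavior for "no id"
            : id.Trim().ToUpperInvariant();
}
```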

There's probably a book that explains all this much better than I can here,
but perhaps this conveys the general idea.

~~~
wtbob
Very good point — that's why good abstractions are such a win. With functions,
I don't need to think about all the code needed to read the body of an HTTP
request; I just read it.
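
In C#, for instance, that might look like this (a minimal sketch assuming
ASP.NET Core; the helper name is mine):

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public static class RequestReader
{
    // The abstraction hides buffering, chunked transfer encoding,
    // and character decoding; the caller "just reads it".
    public static async Task<string> ReadBodyAsync(HttpRequest request)
    {
        using var reader = new StreamReader(request.Body);
        return await reader.ReadToEndAsync();
    }
}
```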

That's also why I think that macros are so valuable: a single macro can
abstract away an extremely complex and tricky piece of code into a single
line, or two.

With the right abstractions, we can turn 10^6-line projects back into
10^3-line projects.

~~~
ludamad
This alludes to another issue: you can pack a lot into one line, but that
doesn't necessarily mean it'll be a tractable line of code when something
goes wrong.

~~~
wtbob
True enough, and it applies to functions as well as macros. A function named
getThingList which also sets the user's password and sends a rude email to
the Premier of Elbonia is just as bad as an unhygienic macro!

------
siscia
In my opinion the whole idea is a little flawed.

It's not "software" that has "diseconomies of scale"; it's the design process.

Then, because software has close to zero marginal cost, the design cost is
virtually all of what we end up paying.

Take the same milk the author is talking about, and imagine designing the
system that gets the milk from the cow to 2 people every morning.

Not a big deal: get a cow, a farmer and a truck, mix them all together, and
you're pretty much done.

Now imagine serving a supermarket; you start to need something more than a
cow, a farmer and a truck.

Now imagine serving a whole city. What about a whole nation? What about the
whole world?

Simply designing how to serve milk to a small supermarket is a problem, but
since the marginal cost of the milk is not approaching zero, the cost of the
design process will always be less than the cost of the milk (otherwise it
wouldn't make financial sense); hence the whole idea of "economies of scale."

To conclude, I believe that the root causes of "diseconomies of scale" don't
lie in the "software" part but in the "design" part.

~~~
kartickv
Does it matter in practice whether the diseconomies are due to "software" or
"design"? Are you offering a solution to the problem the author outlines?

~~~
siscia
If you don't understand the problem you can't really find a solution...

Other than that, it does matter because in the future you may see the same
problem in other industries... A _very advanced_ 3D printer, for example, may
bring this whole class of problems to the manufacturing industry.

If we learn how to manage the complexity of the design phase we will be able
to apply the same concepts to other fields.

------
crazygringo
This is ridiculous and misleading.

Economy of scale has nothing to do with the price of milk in different-sized
containers, nor does it have anything to do with software complexity.

Economy of scale is about it being cheaper for a large farmer to produce a
liter of milk than for a small farmer, because overhead costs don't increase
linearly.

Likewise, software has economies of scale because, on a per-user basis, it's
far more expensive to support a 1st, or 5th, or 100th user than it is to
support a 100,000th.

The author is fine to note that software gets more complex and costly to
maintain as it gets bigger, but that has _nothing_ to do with economies of
scale -- the author is completely confusing concepts here. Economies of scale
are about the marginal cost per user, not the marginal cost per bugfix.

~~~
kartickv
Maybe he shouldn't have used the term "economies of scale", instead using
"economies of scope", which he does use. Do you agree with his point about
diseconomies of scope?

~~~
threepipeproblm
I appreciate the thought process here, but to my reading the understanding of
economies/diseconomies of scale is quite wrong.

(a) The author correctly points out some diseconomies of scale in software,
i.e. things that cost more when you do more of them.

(b) The author fails to identify that economies of scale typically far
outweigh the aforementioned diseconomies. The main error seems to be basing
the argument on this statement --

"This happens because each time you add to software software work the marginal
cost per unit increases"

-- without considering the idea of dividing the expense by the number of
users, which can increase dramatically for projects reaching a critical mass.
To take a trivial example, most 10-employee businesses can't afford to pay for
a 10,000-line software application. A company with 100 employees may need to
write a 30,000-line application due to the increased complexity of their
environment, but they can afford it because the project now involves 300 lines
of code per employee rather than 1,000.

In short this author accounts for both the numerator and denominator in the
milk analogy that's up front, but then effectively ignores the denominator in
the discussion of software costs.

Of course this is why most programmers work for large organizations, at least
relative to the average employee. It's also why a handful of large software
projects make most of the money in the software business. I'm not happy about
this, btw, but it is the case.

To my reading the author's use of "economies of scope" and "economies of
specialization" is even further off base. For example, the trend over the
last 50 years or so has been towards _increasing_ specialization (which,
again, benefits larger teams; although the app economy may have provided a
brief counterpoint, the same forces are at work there).

~~~
kartickv
Many apps do have diseconomies of scope — they have too many features, and are
therefore hard to use. And the company spent more resources to build a more
featureful app, like hiring more people, but this extra investment has had a
negative return.

Overspecialisation also causes problems, like engineers who don't understand
user experience or empathise with the user or even stop to ask whether the
flow they're building is needlessly complex. Or designers who propose ideas
that are not technically feasible given the platform.

If you can increase scale without increasing scope, like WhatsApp supporting
900 million users with 50 or so engineers, great. If you're increasing scope
in order to increase scale, you can't assume that the former leads to the
latter.

------
gwern
> Suppose your developers write one bug a year which will slip through test
> and crash the users' machine. Suppose you know this, so in an effort to
> catch the bug you do more testing. In order to keep costs low on testing you
> need to test more software, so you do a bigger release with more changes -
> economies of scale thinking. That actually makes the testing harder but…
> Suppose you do one release a year. That release blue screens the machine.
> The user now sees every release you do crashes his machine. 100% of your
> releases screw up. If instead you release weekly, one release a year still
> crashes the machine but the user sees 51 releases a year which don’t. Less
> than 2% of your releases screw up.

This example makes no sense. The user in either case still gets 1 crash per
year. They're actually worse off in the many-releases case because in the
annual update scenario, they can at least block off a few days to cope with
the upgrade (in the way that sysadmins schedule upgrades and downtime for the
least costly times of year), but with a weekly release, they could be screwed
over at any of 52 times a year at random, and knowing Murphy, it'll be
at the worst time. '% of releases which crash' is irrelevant to anything.

~~~
ZenoArrow
I agree, it makes no sense, not only from a planning perspective but also
from an implementation perspective, specifically...

"In order to keep costs low on testing you need to test more software, so you
do a bigger release with more changes - economies of scale thinking."

If you wanted to keep testing costs low, you wouldn't do bigger releases,
you'd create automated tests. You may spend more effort up front on building
the tests, but as long as you make the test components modular and target the
tests at the right level of your application the 'cost' of testing will
decrease over time.

~~~
AstralStorm
Yeah, sure, everything can be tested in an automated way. I can name a few
projects that tried it. They were released with a hilarious number of bugs.

Also writing good automated tests requires a great test developer. The thing
is, anyone with such credentials would be a great developer and as such, not a
tester.

Even if you go fully test-driven, which makes it much cheaper, the cost of a
test-everything development model is surprisingly high for any application of
useful size.

Just imagine trying to write even something as simple as MS Paint with good
test coverage.

~~~
ZenoArrow
Writing good automated tests doesn't necessarily rely on a great developer; it
all depends on how you approach testing. For example, let's say you want to
use Selenium to write web UI tests. One common approach is to have developers
create a page object model, which testers can then use to write readable and
robust tests. Creating a page object model is a simple task, and working with
that page object model is a simple task, as the model guidelines define an
easy to implement and easy to follow control structure (essentially all page
object methods must return another page object, which means you can chain them
together).
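
A minimal C# sketch of that pattern (class names and element ids are invented
for illustration; assumes the Selenium WebDriver bindings):

```csharp
using OpenQA.Selenium;

// Hypothetical page objects: every action returns a page object,
// so testers can chain steps into readable tests.
public class LoginPage
{
    private readonly IWebDriver driver;
    public LoginPage(IWebDriver driver) { this.driver = driver; }

    public LoginPage EnterUsername(string name)
    {
        driver.FindElement(By.Id("username")).SendKeys(name);
        return this;
    }

    public LoginPage EnterPassword(string password)
    {
        driver.FindElement(By.Id("password")).SendKeys(password);
        return this;
    }

    public HomePage Submit()
    {
        driver.FindElement(By.Id("login")).Click();
        return new HomePage(driver);
    }
}

public class HomePage
{
    private readonly IWebDriver driver;
    public HomePage(IWebDriver driver) { this.driver = driver; }
    public string Title => driver.Title;
}

// A tester's chained usage:
// new LoginPage(driver).EnterUsername("bob").EnterPassword("pw").Submit();
```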

Oh, and MS Paint would be easy to create a good test suite for. For what it's
worth, I'm a software tester by trade, so perhaps it's straightforward for
someone who creates tests for a living to know how to approach it; I guess
someone who was inexperienced wouldn't necessarily know how to approach it.

~~~
lovich
Do you have any suggestions on where to learn more about testing? I've gone
through numerous tutorials and read through The Art of Unit Testing but I
still feel like I'm writing tests just to write tests. It's not really
clicking for me

~~~
ZenoArrow
Sure. First of all, even though you're clearly a developer (as you mentioned
writing unit tests), I'd recommend starting at a higher level of abstraction
with BDD tools, basically anything that implements the Gherkin language (I
can't tell you which BDD tool to use as I don't know what language you're
coding in; if you can tell me, I can give you a more specific recommendation).
The idea behind this is that you're writing your code to meet a specification
that your client can also work with, so you can be sure the specification
you're coding against is what the client wants. As an added bonus, the Gherkin
scripts form a type of living documentation, providing clear information about
what the application does whilst also remaining up to date (so long as all
tests must pass before a new version is sent out).
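
For a flavor of what Gherkin reads like (a made-up feature, purely for
illustration):

```gherkin
Feature: Account login
  Scenario: Valid credentials
    Given a registered user "bob"
    When he logs in with the correct password
    Then he sees his account dashboard
```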

Second recommendation is to look into how to avoid test-induced damage, which
is where code gets bloated and more complicated in order to make it more
testable. One major source of problems in this area is the need to create mock
objects. I'd recommend this video from Mark Seemann as a good starting point
in this area, as it looks at how you can create unit tests without mocks:

[https://www.infoq.com/presentations/mock-fsharp-tdd](https://www.infoq.com/presentations/mock-fsharp-tdd)
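
The gist of the mock-free idea, as I understand it (a hypothetical C# sketch,
not taken from the talk): push the I/O to the edges and test the pure
decision logic directly, no mocks needed:

```csharp
using Xunit;

// Hypothetical: instead of mocking a repository to test discount
// logic, make the logic a pure function of its inputs.
public static class Pricing
{
    public static decimal Discount(decimal total, int yearsAsCustomer) =>
        yearsAsCustomer >= 5 ? total * 0.10m : 0m;
}

public class PricingTests
{
    [Fact]
    public void LongTimeCustomersGetTenPercent() =>
        Assert.Equal(10m, Pricing.Discount(100m, 6)); // no mock objects
}
```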

If you can let me know which language you're mostly coding in I'll see if I
can give you some more recommendations.

~~~
lovich
I'm doing 90%+ of my code in C#/vanilla JS at the moment. I was using moq to
mock out my C# objects. Thanks for the help, by the way

~~~
ZenoArrow
You're welcome. As you're familiar with C#, I'd recommend SpecFlow.

[http://www.specflow.org/](http://www.specflow.org/)

As you're also using JS I'm guessing you're creating web apps, so I can
recommend Selenium if you want to automate front end tests. You can code
Selenium tests with C# too, and you can also abstract away the details of the
Selenium implementation to use SpecFlow to write the tests. If you have access
to Pluralsight (if you have an MSDN licence check to see if it's bundled with
your MSDN licence, I think I got a 45 course Pluralsight trial with my MSDN
Enterprise licence), there are some good courses on Selenium and SpecFlow,
including one that takes you through combining the two.

[http://www.seleniumhq.org/](http://www.seleniumhq.org/)

[https://www.pluralsight.com/](https://www.pluralsight.com/)
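
To give a taste of combining the two, here's a hypothetical SpecFlow binding
driving a Selenium page object like the one sketched upthread (step text and
class names invented; driver lifecycle management omitted for brevity):

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using TechTalk.SpecFlow;
using Xunit;

[Binding]
public class LoginSteps
{
    private readonly IWebDriver driver = new ChromeDriver();

    [Given(@"a registered user ""(.*)""")]
    public void GivenARegisteredUser(string name)
    {
        // seed test data for 'name' here
    }

    [When(@"he logs in with the correct password")]
    public void WhenHeLogsIn() =>
        new LoginPage(driver).EnterUsername("bob")
                             .EnterPassword("secret")
                             .Submit();

    [Then(@"he sees his account dashboard")]
    public void ThenHeSeesHisAccountDashboard() =>
        Assert.Contains("Dashboard", driver.Title);
}
```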

~~~
lovich
You've got it right on the web apps. And thanks again, this should be a big
help

------
sp527
This is a complete misinterpretation of the concept. The only economy of scale
that matters in software is that (in theory, barring certain scaling
considerations which are in any case becoming more abstracted out thanks to
IaaS) the same program can be run to support 10s of millions of users. Each
additional unit costs very little to produce (serve it up via a CDN, some
amount of network calls, processing and data storage over the lifetime of a
user).

~~~
henrik_w
My read of the article is that he argues that _software development_ doesn't
have economies of scale - which I agree with.

~~~
rogersm
But the issue is not that this happens because it's software development. The
reason is that software is a job in which cost increases linearly with
manpower. There are no silver bullets that can magically reduce the number of
people needed.

Also, increasing the number of people makes things even worse as communication
problems increase, but I think we all agree on that.

But the main problem with the article is that milk is always the same, whether
in a glass or in a tank car. Software is completely different: more
functionality requires more lines of code, which means more people... and more
cost.

------
heisenbit
The argument that software _development_ has diseconomies of scale is well
known; e.g., it was incorporated long ago into the COCOMO estimation model
(its basic form is sketched after the list below). It would be interesting to
reverse the question: where does software development really have some
economies of scale? In what situations does scaling up the team or the chunk
of software under consideration make sense?

- system testing?

- roll-out?

- bulk purchasing of licenses for development?

- architecture, in the sense of leveraging consistent frameworks, naming etc.
across a team?

- infrastructure?

- development time too long? (according to COCOMO there is an optimal time,
and beyond it effort increases, although slowly)

- management of "hit by bus" risk?
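
For reference, the basic COCOMO form mentioned above (coefficients vary by
project class):

```latex
% Basic COCOMO: effort E in person-months as a function of size in
% KLOC; an exponent b > 1 is exactly a diseconomy of scale.
E = a \cdot (\mathrm{KLOC})^{b}, \qquad b \approx 1.05\text{--}1.20
```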

~~~
yk
When I think about economies of scale and software development, all the
examples I come up with have the property that any gains diminish quickly.
For example, a large system has a better build system, which catches more
bugs, so the developers are more productive, until the now-larger project
eats the added productivity due to higher complexity.

On the other hand, what in traditional industries would be called production,
that is producing the specific website for one request, has a rather absurd
economy of scale. With a single server and a static site you can serve a
sizable fraction of the world population, with negligible marginal costs of
serving an additional user. Actually Metcalfe's law suggests that the marginal
cost of serving an additional user is negative, and hence we get the behemoths
like Google and Facebook instead of competition among a few different
corporations.
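
Spelling that Metcalfe point out (my back-of-the-envelope sketch, not from
the article):

```latex
% Metcalfe: network value scales with the square of the user count n,
% so the value added by one more user grows linearly in n, while the
% cost of serving that user stays roughly constant and tiny.
V(n) \propto n^2
\quad\Rightarrow\quad
V(n+1) - V(n) \propto 2n + 1 \gg c_{\mathrm{serve}}
```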

------
nradov
Small, frequent releases are a great idea for web applications and (to a
lesser extent) mobile and desktop consumer apps where updates are
automatically pushed through some sort of app store. But this totally doesn't
work for software licensed to enterprise customers for on-premises use. Every
upgrade you release means your customers now have to incur an expense to test
your update with the rest of their environment, retrain their users, and roll
out the update. Release too often and your customers will be unhappy. So the
tolerance for defects goes way down.

~~~
kasey_junk
There are lots of software models where you can't rerelease _at all_. The
amount of formalism in your software development model is highly correlated
with your release costs.

------
poof131
A few days ago I would have agreed wholeheartedly. Now, after reading the
article about the formation of AWS, I'm not so sure.[1] It seems that Amazon
(and others such as Google) have achieved economies of scale in development
with internal services and well organized APIs. Certainly much of this is
being exposed externally for profit, but it does seem to indicate that well
designed software has economies of scale for the development of future
software. Certainly one monolith application with a team of a thousand
starting from scratch is a bad idea, but once internal APIs and services
exist, these definitely seem to aid in the rapid development of future
software products for a company, which seems like economies of scale to me.

Bloated bureaucracies and bad processes can negatively impact any company in
any industry, not just software. Some of the article’s logic doesn’t seem to
differentiate software development from anything else, such as “working in the
large increases risk”. So while a large monolith application is risky,
building a hundred million widgets is risky too. Better to iterate and start
with a prototype and expand, but the same goes for other industries too:
better to prototype and market test your widget before mass production. Seems
to me like the article is talking more about lean and agile and process in
general than about economies of scale.

[1]
[https://news.ycombinator.com/item?id=12022915](https://news.ycombinator.com/item?id=12022915)

------
jondubois
What the article says is true for now, but it doesn't mean it will always be
true.

Making and transporting large milk bottles is very efficient today (because
it's all automated and there are well-tested processes in place), but it
wasn't necessarily always like this. When people were still figuring out how
to make glass bottles by hand (through glassblowing), bigger bottles were
probably more challenging to make (more prone to flaws and breakage during
transportation) than small bottles. So they probably just figured out what
the optimal size was and sold that one size.

With software, it's the same thing: we don't currently have good tooling to
make building scalable software easy. It's getting there, but it's not quite
there yet. Once Docker, Swarm, Mesos and Kubernetes become more established,
then we are likely to see the software industry behave more like an economy of
scale.

Once that happens, I think big corporations will see increased competition
from small startups. Even people with basic programming knowledge will be able
to create powerful, highly scalable enterprise-quality apps which scale to
millions of users out of the box.

~~~
fennecfoxen
> Once Docker, Swarm, Mesos and Kubernetes become more established, then we
> are likely to see the software industry behave more like an economy of
> scale.

> Even people with basic programming knowledge will be able to create
> powerful, highly scalable enterprise-quality apps which scale to millions of
> users out of the box.

I must disagree. The real problem with scalability is that any system that
scales enough must become distributed, and distributed systems are obnoxiously
difficult to reason about, and as such remain difficult to program and to
verify.

Talk to me about Docker and Swarm and the like hosting technology platforms
and frameworks that make it trivially straightforward to program distributed
systems reliably, and really hard to program them wrong, and we might have the
utopia you speak of.

~~~
greenshackle
Powerful abstractions make the promise that you'll only have to learn the
abstraction to write powerful software, so "even people with only basic
knowledge will be able to do X using abstraction Y".

The promise is almost always false. All abstractions are leaky, and if you do
serious development with them, inevitably bugs will bubble up from below and
you'll have to dive into the messy internals.

For example, ZeroMQ makes distributed messaging relatively painless. Someone
with very little knowledge of the network stack can write simple programs with
it easily. But for any serious enterprise application with high reliability
requirements you'll eventually run into problems that require deep knowledge
of the network stack.
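
To illustrate the painless part, here's a hypothetical echo pair in C# using
NetMQ (a ZeroMQ port). It's the reliability corner cases, not this happy
path, that eventually force you below the abstraction:

```csharp
using System;
using NetMQ;
using NetMQ.Sockets;

// Hypothetical sketch: the library hides connection management,
// framing and retry behavior entirely.
using var server = new ResponseSocket();
server.Bind("tcp://*:5555");

using var client = new RequestSocket();
client.Connect("tcp://localhost:5555");

client.SendFrame("Hello");
Console.WriteLine(server.ReceiveFrameString()); // "Hello"
server.SendFrame("World");
Console.WriteLine(client.ReceiveFrameString()); // "World"
```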

> Talk to me about Docker and Swarm and the like hosting technology platforms
> and frameworks that make it trivially straightforward to program distributed
> systems reliably, and really hard to program them wrong, and we might have
> the utopia you speak of.

Yes, that.

~~~
jondubois
Your argument definitely applies to Backend-as-a-Service kinds of software,
and I agree 100%, but the nice thing about the Docker/Kubernetes combo is
that it gives you an abstraction in a way that doesn't prevent you from
tinkering with the internals.

The only downside I can think of is that tinkering with those internals can
become trickier in some cases (because now you have to understand how the
container and orchestration layer works). But if you pick the right
abstraction as the base for your project, then you may never have to think
about the container and orchestration layer.

~~~
greenshackle
Maybe I was unclear; what I meant is that usually you _need_ to tinker with
the internals at some point. Which is fine, but it does mean you need more
than basic knowledge to use the tool productively. (And if the software is
proprietary and poorly documented, you're SOL.)

The lie is that this tool is _so easy_, you just have to read this 30-minute
tutorial and you'll be able to write powerful software without even needing
to learn its internal mechanics.

I haven't used Kubernetes; it's possible it's so good that you don't need to
learn the messy details. I'm just sceptical of that claim in general.

------
fizx
By this guy's definition, housing construction has diseconomies of scale. You
wouldn't believe how expensive a thousand story tall condominium complex would
be!

------
musesum
Article conflates distribution with production. Production would be a bigger
cow.

Deep learning benefits from scale: both in computation power and in a larger
corpus.

From the example: is the final product the cow, the milk, or the nutrition it
provides? Is a software product the lines of code, the app to download, or the
service it provides?

~~~
HillaryBriss
Yes.

It seems like the article's milk analogy is applied to the wrong thing:
another pint of the same exact product.

Maybe it's more apt to make the analogy with a pint of a different type of
milk product, a type of milk with brand new features. I'm talking about real-
world products like these which have appeared in the last 10-20 years:

* Almond Milk

* Cashew Milk

* Coconut-based Milk products (not traditional coconut milk)

* Soy-based Milk products (not traditional soy milk)

* Lactose Free Milk

* Grass Fed Cow Milk

* Organic Cow Milk

* Omega 3 Enriched Milk

etc.

We would not expect the marginal cost of these to be less than that of
another pint of conventional cow's milk, and indeed, it isn't.

That said, I suspect that the article is basically going in the right
direction. I mean, it seems like additional features on a complex software
project really do cost much much more, on a percentage basis, than a new kind
of milk beverage does.

~~~
musesum
Interesting take on production. Multiple kinds of milk that use the same
distribution medium of cartons and grocery stores. These products are
decoupled from each other, so the development cost is comparatively linear.

I agree with OP's premise that productivity diminishes with project
complexity - a decades-old problem addressed by Fred Brooks' The Mythical
Man-Month. But a lot of the complexity is now wrapped in reusable components.
It is now possible to write a component that is shared by thousands of
projects, somewhat akin to a milk carton that can hold many kinds of milk.

------
kriro
What he describes (a product growing in complexity) isn't a typical
economies-of-scale type of product. I'd also argue that the general
observation is true for all complex products. Even a car has relatively high
R&D costs and only reaps the benefits of economies of scale (lower per-unit
production cost than the competition) once it is ready to ship. Unit 1 is
going to be very resource-intensive; the following units are what contribute
to the scale (and, like he said, for software that scale is amazing).

However, maintenance costs in software are usually really high, especially
since there is a tendency to gradually "improve" the same code base instead
of ripping it out and building a new one every n years (buying a new car).

------
vnchr
We're transitioning from a monolithic PHP app to a Node microservices
architecture. The microservices approach seems to take these scaling issues
into account, allowing for narrow focus on each service during initial
development and ongoing maintenance.

------
hyperpallium
[https://wikipedia.org/wiki/The_Mythical_Man-Month](https://wikipedia.org/wiki/The_Mythical_Man-Month)

------
ucaetano
Any production system has the same issues; they're usually called overhead
or, in some cases, agency costs.

This is captured well in the marginal productivity of labor: the returns on
adding more labor decline as you add more labor.
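
In symbols, that's just the standard diminishing-returns condition:

```latex
% Diminishing marginal product of labor: output Q(L) increases in
% labor L, but each extra unit of labor adds less than the previous.
\frac{\partial Q}{\partial L} > 0,
\qquad
\frac{\partial^2 Q}{\partial L^2} < 0
```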

Talking about actual economies of scale, software has massive economies of
scale. The marginal cost of serving one additional copy of software (in the
old "software as a product" way) is close to zero.

------
_pmf_
When developing product lines, you can enable tremendous leverage through
good architecture over time. The problem is scaling with people, i.e. this is
not something that can be accelerated by involving more people.

------
wodenokoto
This is usually not what is meant by economies of scale.

Very few production lines get cheaper as they get more complex.

But software, more than most products, gets cheaper per unit as you scale the
number of units.

------
hackaflocka
This article is unoriginal click-bait.

Software _does have_ economies of scale. How much did a copy of Windows 7 or
Office 2013 cost? Only about $100? That's because the more we
produce/supply/consume, the cheaper it gets, just like milk.

The notion that there's an optimal amount of human capital needed for a
project is nothing new. We've all heard of "Too many cooks spoil the broth."

------
mehh
bigger projects have bigger risks ... no shit sherlock

------
draw_down
Some great points in the article. Unsurprising that people here don't want to
hear them.

