
A single character code change - nfrankel
https://blog.pitest.org/how-i-once-saved-half-a-million-dollars-with-a-single-character-code-change/
======
gmfawcett
Good lord, this long-winded writing style is maddening to read!

Here's what he changed:

> Set<Thing> aSet = new HashSet<Thing>();

to:

> Set<Thing> aSet = new HashSet<Thing>(0);

His explanation:

"Most of these sets were empty for the entirety of their life, but each one
was consuming enough memory to hold the default number of entries. Huge
numbers of these empty sets were created and together they consumed over half
a gigabyte of memory. Over a third of our available heap."
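
For context, a minimal sketch of the difference (the 16-entry default is the current JDK's HashMap default; per the article, the old JDK allocated that table up front even for sets that stayed empty):

      // Default constructor: backed by a HashMap sized at the default initial
      // capacity (16 in current JDKs); on the old JDK described in the article,
      // that bucket table was allocated up front for every set.
      Set<Thing> defaultSized = new HashSet<Thing>();
      
      // Explicit zero initial capacity: no bucket space reserved for sets that
      // are expected to stay empty.
      Set<Thing> zeroSized = new HashSet<Thing>(0);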

I'm not sure he ever got around to explaining how it saved "half a million
dollars", though.

~~~
jcims
Have you ever noticed that it's becoming more difficult to find recipes
without a six-page blog entry on top of them? This is kind of like that.

I love the word bloviating for times like this. It's not really an
onomatopoeia but something like it. It's also somehow clearly an insult.

Edit: jrockway has actually changed my perspective on the recipe thing... if
people want to put a six-page backstory and photos on top of a recipe, that's
completely their right, and I would never suggest that more quality (?) content
is a bad thing. My problem, it seems, is with Google putting that as the top
result when I'm searching for a recipe. Especially when I'm on my phone in a
store, getting spammed by some captive portal, while trying to figure out if
the damn cookies I'm making need baking soda or baking powder.

~~~
vharuck
For recipes, it's possibly a way to qualify for copyright protection[0].
Recipes themselves aren't considered original enough, so authors can encase
them in a copyrightable shell of narrative.

There might be other reasons for this on the net. And I wouldn't be surprised
if new recipe bloggers write narrative because "that's how it's always been
done."

[0] [https://info.legalzoom.com/recipe-copyright-laws-20049.html](https://info.legalzoom.com/recipe-copyright-laws-20049.html)

~~~
beerandt
Except the copyrightable part is not the part anyone wants. But I think you're
on the right track.

I'd argue it's the authors trying to differentiate (and add value to) their
copy of the recipe vs the hundreds of others. And I suspect some do believe
that adding dialogue or slightly changing the recipe covers them from copyright
claims (not realizing recipes aren't covered) or provides them with some
copyright protection of their own.

Others probably do it because that's the way it's always been done, but don't
realize what they're trying to emulate...

It's the Martha Stewart / Alton Brown recipe for success, where the details
are more interesting than the underlying recipe. But people mistake the extra
detail for sentimental fluff (Martha Stewart) or technical babble that they
don't understand (Alton Brown), and so they just perceive the formula as
recipe plus fluff, without seeing the value that fluff should add.

Or they misjudge the value that they think they are adding with their own
fluff.

Also probably relevant: recipes (along with product pages) are a couple of
categories that Google treats drastically differently than general web pages.
As a result, this has had unintended consequences for the worse, as far as
overall quality of the web goes. I'm looking at you, affiliate links.

~~~
cpach
_”Or they misjudge the value that they think they are adding with their own
fluff.”_

This. There are some blogs that really succeed with the storytelling. I don’t
mind the stories on Food Lab, for example. But if it’s just an incoherent and
fluffy word salad, then I’d rather just read the recipe straight away.

~~~
beerandt
_Good_ storytelling has its own value (to some), but I consider it to be
different from what Food Lab, MS, or AB are doing, which I guess you could
more accurately call a narrative of constructive details, as opposed to just
an experience narrative to go along with the recipe.

The experience narratives can certainly be constructive as well, like maybe
describing different versions of dishes tried at various restaurants when
trying to recreate a recipe. But even most of these aren't helpful, depending
on the author.

And of course the worst is the people who just try to describe an appropriate
weather condition or family feeling that goes along with a meal, à la _Giada_.
Of course, Giada is proof that some people can make even that model work well.
It's just that most people on the internet don't.

------
kinkrtyavimoodh
There is a 'pre' in 'premature' for a reason. If you don't seek to optimize
even when it is 'mature', you are not a good software engineer.

In this story, instead of holding meetings, why did no one fire up the profiler
and actually try to figure out why the heap size was so big? If the senior
architect's first thought is to come up with an elaborate proposal even before
they understand the memory and processing budgets of their program, what are
they a senior architect for? It's like trying to shave off microseconds of
processing time while your network call takes milliseconds.

Good style guides strike a balance: common caveats and easy-to-miss gotchas
are avoided by making rules around them (and then it's no longer premature
optimization, because someone already did the math and decided on the
optimization, so each new programmer using it is not optimizing but merely
following a rule in the style guide), while the complex cases are left for the
programmer to optimize as and when needed.

Many people who scoff at opinionated style guides forget that in a big company
you don't want 1000 engineers to spend time independently figuring out whether
X needs to be optimized [1]. If someone can do the profiling, figure out what
should be done in 99% of cases, and make a style rule out of it explaining the
rationale, it saves everyone the hassle. I believe Google does exactly this.

[1] Apart from the waste of time, not every engineer might be capable of
actually figuring it out. Many of the intricacies of C++, for example, aren't
immediately obvious even if you have spent years with the language. It's more
efficient to hire a few C++ super-experts, have them do the profiling, and
come up with style guidance.

~~~
Ididntdothis
That’s what I was thinking. With any piece of software where there are
performance concerns, be it memory or speed, the first thing you should do is
fire up a profiler and get some hard data. I usually profile almost all my
stuff just to see if anything unusual or unexpected shows up. This is often a
good learning experience.

------
hinkley
Author discusses a Java 1.3 app where he saved tons of memory by reducing leaf
collection sizes.

I had sort of the opposite with a Java 1.2 code base. Some clever person had
noticed that a bunch of vectors often had only 4-7 entries, so they overrode
the default (10) to 7. This is unfortunately greater than n/2, which will
become important momentarily.

In Java, collections tend to double in size when they run out of space.
Growing by a multiple instead of a constant amortizes insertion time to O(1).
So if you start at 7, you go to 14 next.

As I’m sure you can guess, the number of entries crept up to 8 over time.
Switching back to the defaults reduced the wasted space from 6 to 2 slots, and
dropped our memory usage by more than 10%.
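
If it helps, a small self-contained sketch of that growth behaviour (using modern syntax; Vector exposes capacity(), so the arithmetic is easy to check, and the numbers assume the default doubling growth policy):

      import java.util.Vector;
      
      public class VectorGrowth {
          public static void main(String[] args) {
              Vector<Integer> tuned = new Vector<>(7);  // the overridden initial capacity
              Vector<Integer> plain = new Vector<>();   // default initial capacity of 10
              for (int i = 0; i < 8; i++) {             // entries crept up to 8 over time
                  tuned.add(i);
                  plain.add(i);
              }
              // The tuned vector had to double from 7 to 14: 6 wasted slots.
              System.out.println(tuned.capacity());     // 14
              // The default-sized vector still fits in 10: 2 wasted slots.
              System.out.println(plain.capacity());     // 10
          }
      }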

------
paulddraper
I had a somewhat similar discovery with V8 (JS VM):

V8 allocates a whopping 64 bytes per empty object.

    
    
        {}
    

But if you initialize it with a single property, the size _shrinks_ to 40
bytes. [1]

    
    
        {v8Hack: null}
    

I too was creating lots of objects that would stay empty, but V8 (incorrectly)
assumed I was going to change them later, so it reserved some extra space. [2]

Unfortunately, there is no size hint like Java's HashMap has; the closest you
can get is to create your objects with dummy properties.

[1] [https://stackoverflow.com/questions/59044239/why-do-empty-ob...](https://stackoverflow.com/questions/59044239/why-do-empty-objects-take-more-memory-than-non-empty-ones)

[2] [https://www.mattzeunert.com/2017/03/29/v8-object-size.html](https://www.mattzeunert.com/2017/03/29/v8-object-size.html)

------
stygiansonic
Although the article is a bit long-winded as others have pointed out, it did
mention the importance of _profiling_ your application before making any
changes.

In particular, you should always aim to profile your app when running under a
production load so that you do not have to make assumptions about its
behaviour. Something like async-profiler[0] is good, since it avoids the
safepoint bias issue and can also track heap allocations.

0. [https://github.com/jvm-profiling-tools/async-profiler](https://github.com/jvm-profiling-tools/async-profiler)

------
bcrosby95
Back in the '00s our servers were suffering under heavy load. Normally they
just bought more servers, but I asked if anyone had ever profiled our app and
no one could remember doing so. I did, and found that our home-grown ORM did a
couple of really dumb things, including accessing the disk each time it
allocated a new object that represented a row (wtf). After spending a couple
of days fixing those things, we needed a quarter of the servers we did
beforehand.

------
redis_mlc
Randal Schwartz saved more than that by enabling caching in a template for one
of Yahoo's properties near LA.

So it really is possible.

As a DBA, I routinely save that (and more) by doing query performance tuning.
I aim for 500x-1,000x faster on physical hardware, and 10,000x in the cloud,
thanks to EBS/pSSD latency.

And not using Hadoop.

------
ping_pong
Terrible article.

But even worse, the "fix" was what I would consider a spaghetti hack. They
create a HashSet that is mostly never used? The real fix is to create the
HashSet when you know you need it and don't create it if you don't need it. It
sounds like the underlying code has expectations that the HashSet exists and
is valid, which in and of itself is bad code.

Check whether it's null, and if it is null and you need it to not be null,
allocate it at that point, with size 0 or the default capacity, whichever
makes more sense.
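
Something like this minimal sketch of the null-check approach (class and field names are hypothetical, not from the article):

      import java.util.HashSet;
      import java.util.Set;
      
      class Container<T> {
          private Set<T> things;  // stays null until something is actually added
      
          void add(T t) {
              if (things == null) {
                  things = new HashSet<>();  // or new HashSet<>(0), whichever fits
              }
              things.add(t);
          }
      
          boolean contains(T t) {
              // Callers never see the null; an absent set simply reads as "empty".
              return things != null && things.contains(t);
          }
      }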

~~~
hk__2
> It sounds like the underlying code has expectations that the HashSet exists
> and is valid, which in and of itself is bad code.

How is that bad code? No NPE and other null-related issues, cleaner API,
easier to test.

~~~
remote_phone
The fact that it cost $500,000 in real money because of this code is more than
enough proof that it’s bad code. Plus the fix is a small enough change that if
it gets modified again, it will cause further money loss.

~~~
hk__2
> The fact that it cost $500,000 in real money because of this code is more
> than enough proof that it’s bad code.

If this is a proof of anything, it’s that it can be optimized; not that it’s
bad code.

------
mannykannot
> I had to spend time justifying the presence of that ‘0’ many times over the
> years as other developers questioned its purpose.

One of the better cases for using appropriate comments...

~~~
1-more
Slightly zestier idea: make a helper function `makeMostlyEmptyHashSet` and a
linter rule that warns on `new HashSet`. You can turn off the linter warning
with a comment in every linter I've used, and that comment could include a
justification for the default sized `HashSet`.
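
A hedged sketch of what that helper might look like (the class and method names are made up; the linter rule itself depends on which linter you use):

      import java.util.HashSet;
      import java.util.Set;
      
      final class SetFactory {
          // Zero initial capacity: these sets are expected to stay empty for most
          // of their lifetime, so no bucket space is reserved up front.
          static <T> Set<T> makeMostlyEmptyHashSet() {
              return new HashSet<>(0);
          }
      }
      
      // Call sites that genuinely need the default capacity use `new HashSet<>()`
      // directly, with a linter-suppression comment explaining why.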

~~~
egdod
Having to justify using a default is a little too zesty for my taste.

------
jandrese
I have to wonder about an architecture where he's creating thousands of these
hash objects _and then rarely using them_. Initializing them to size 0 was a
workaround, but it makes me wonder if maybe he should consider lazy evaluation
where the objects are only created if they're actually used? That's still a
bunch of CPU cycles burned churning through the setup/teardown of empty hash
tables.

~~~
ambulancechaser
The article mentions that they rarely had elements, not that they were rarely
used. They might have been tested for membership in lots of hot paths. There's
no way to know. And I bet a lazily-instantiated set might take up even more
space than a regular set anyway. And if they are often used, even if never
added to, that would have made the problem even worse.

------
enitihas
If your Java application needs a lot of small collections, it may make sense
to use Guava immutable collections. They have specialised overloads for lots
of sizes, and small collections will not take any more memory than a hardcoded
minimal collection.
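
For illustration, a minimal sketch assuming Guava is on the classpath (ImmutableSet.of() with no arguments returns a shared empty instance, and the small-arity overloads use compact specialised implementations):

      import com.google.common.collect.ImmutableSet;
      import java.util.Set;
      
      class SmallSets {
          static final Set<String> NONE = ImmutableSet.of();          // shared empty singleton
          static final Set<String> ONE  = ImmutableSet.of("a");       // compact single-element set
          static final Set<String> TWO  = ImmutableSet.of("a", "b");  // no oversized bucket table
      }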

------
ChrisMarshallNY
It seems to mostly be about optimization.

There are some good points, but, like so many of these topics, it enters the
world of "orthodoxy," where we all have to do things the same way, everywhere,
and in all places.

Optimization (effective optimization, anyway) is a fairly intense process. For
one thing, common sense doesn't really apply. What _seems_ to be an
optimization can sometimes do exactly the opposite, like breaking cache, or
introducing thread contention.

It can also introduce _really weird_ bugs that are hard to track down, and
strange code structures that relatively inexperienced maintenance programmers
may have difficulty grokking.

It's all about the metrics. Profilers are your friend.

Also, in many cases, optimization isn't necessary at all. If the code controls
response to a tab selection, then the code that redraws the tab is likely to
be executed in a separate thread, anyway, and done at the pleasure of the OS.
Why bother optimizing the lookup index to save a few microseconds?

It's another matter entirely when we are iterating an array that is many
thousands of elements long. In some cases, using higher-order functions can
actually decrease performance (but YMMV). I have sometimes had to replace a
_map()_ with a _for_ loop.

We measure and find hot spots, and then concentrate on them.

If we are working with mobile or embedded, then we also need to worry about
power consumption. That's a fairly fraught area, right there, and optimization
can actually cause power drain.

~~~
gav
> It's all about the metrics. Profilers are your friend.

In my experience, profilers are not used nearly enough. They also tend to be
used when systems are at or near crisis already.

It's way more valuable to have profilers run on every build (or at least every
deployment) to see if there are regressions. Having performance metrics in
production to continually track key indicators is critical. All software is
built with assumptions that impact performance, and all of a sudden those can
be invalidated. There's a lot of O(N^2) code that's just fine right up until
it explodes.

Some examples I've run across:

- A query API whose responses could contain metadata saying "throw away this
result and re-query with these values instead", which worked great for years
because it only impacted a single-digit percentage of queries. One day the
users decided to really start using this feature, and found that executing a
100ms query and then running another 100ms query on top of it was way worse
than merely doubling the average.

- A system where you could configure plugins to make "pretty" URLs by running
a list of regular expressions. Again, this worked great for years, until both
the number of URLs and the number of plugins increased and 25% of the page
generation time was spent running regular expressions.

~~~
ChrisMarshallNY
Good advice.

------
mirekrusin
The takeaway should be: run a profiler at least once before hiring extra
architects and spending half a million on hardware/licenses.

------
pdfernhout
I've been trying to get WordPress to optimize memory usage for four years on
this ticket:
[https://core.trac.wordpress.org/ticket/34560](https://core.trac.wordpress.org/ticket/34560)
"Right now, you will unhappily see "500 server error" and "white screen of
death" results eventually if you, say, edit a 400K page for hundreds of
revisions, even with a 256MB server memory limit. ... There are apparently
_three_ requests for all posts versions being made by the (Page) editor."

That might be some kind of record, _both_ for inefficient coding and for not
caring much about inefficient coding, given all the extra expense millions of
WordPress users face to host such inefficient code. Either that, or maybe few
WordPress users make many edits to large pages?

------
dekhn
One of the more interesting optimizations I've seen is in the C++ string
class. A class instance takes up 24 bytes, much of which went unused, so for
short strings the data is stored inline rather than behind a pointer to
external memory. This is a big win if you have an app with millions of tiny
strings.

~~~
chrchang523
Note that std::string data representation, and the associated set of small-
string optimizations, are highly implementation-dependent; it even differs
significantly between clang and gcc's standard libraries. For gcc, std::string
size has been 32 bytes on 64-bit targets for the last several years.

~~~
dekhn
Yep. You can see some more context here:
[http://scottmeyers.blogspot.com/2012/04/stdstring-sso-and-mo...](http://scottmeyers.blogspot.com/2012/04/stdstring-sso-and-move-semantics.html)

I don't know if there are any canonical writeups on the various string
implementations and their performance implications wrt modern hardware.

------
raverbashing
Premature optimization is not great, but for low-hanging fruit, just do it.

Even though developer time is more expensive than machine time, it seems
people are forgetting best practices and/or minor improvements that can have a
big impact.

~~~
saalweachter
This is somewhere where project style guides and code reviews can yield tiny,
incremental wins.

------
exabrial

The form:

      Boolean b = Boolean.valueOf(true);

is slightly more efficient than the subjectively uglier:

      Boolean b = new Boolean(true);
    

So, a couple of notes on that. The "new" operator is cheap, but still involves
an allocation. Oddly enough, one of the strangest bugs I found back in the day
was:

    
    
      Boolean a = new Boolean(true);
      Boolean b = new Boolean(true);
      System.out.println(a == b);   // prints: false
    

But with objects and wrapper types, the .equals() method is the correct way to
compare two objects.
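
For completeness, valueOf hands back the cached Boolean.TRUE/Boolean.FALSE instances, which is why it avoids the allocation, though .equals() is still the right comparison for wrapper types:

      Boolean a = Boolean.valueOf(true);
      Boolean b = Boolean.valueOf(true);
      System.out.println(a == b);       // true -- same cached instance (don't rely on this)
      System.out.println(a.equals(b));  // true -- the comparison you actually want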

~~~
stygiansonic
In my opinion this behaviour shouldn’t be considered strange: "new" always
means instantiation of a new object, and the "==" operator for objects always
checks object identity. Since both operands are objects, the above applies.

This is another reason to avoid boxed primitives whenever possible.

~~~
exabrial
Agree. The "new" operator in my case was hidden deep inside a corporate
framework where the guy that invented it had left a long time ago. Quite an
adventure down the rabbit hole to figure that out!

Boxed primitives are useful when working with systems and languages where
"null" is a concept. Take for example, interacting with a database where you
have a boolean column that has true/false/null. That translates well to boxed
primitive types.

------
gwbas1c
I really want to know more about "why." Specifically, more about why the
design of the system leads to creating tons and tons of empty hashsets.

This leads me to believe that there are lots more meaningful optimizations
remaining that someone who better understands the system could implement.

------
kazinator
> As I found out while writing this article, modern JDKs do not allocate any
> space in a HashSet until something is added.

Ah, but the object itself still exists. If a large space of objects is needed
but sparsely used, the thing to do is to make that space itself lazy, not the
individual objects.
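
A rough sketch of what "make the space itself lazy" could look like, e.g. only materializing a set for keys that actually receive entries (names here are hypothetical):

      import java.util.HashMap;
      import java.util.HashSet;
      import java.util.Map;
      import java.util.Set;
      
      class SparseSets<K, V> {
          private final Map<K, Set<V>> setsByKey = new HashMap<>();
      
          void add(K key, V value) {
              // A set is only materialized for keys that actually receive an entry.
              setsByKey.computeIfAbsent(key, k -> new HashSet<>()).add(value);
          }
      
          boolean contains(K key, V value) {
              Set<V> set = setsByKey.get(key);
              return set != null && set.contains(value);
          }
      }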

------
tudelo
Most of my Java optimization attempts have ended in "oh, library X has
deadlocks", where library X couldn't be avoided without a massive time
investment (big libraries like Hibernate, Oracle DB drivers)...

------
phs
Shucks! I came in here guessing it would be adding the trailing period on a
domain name, to disable a long list of resolv.conf entries.

------
asveikau
The article advises against applying this change everywhere, but I think this
actually sounds like a good argument that the Java standard library should
adjust the default constructor for Set. I am going to guess that Sets usually
contain a small number of members. Paying memory allocation costs [amortized
constant time] for large sets sounds a lot better than reserving
default-capacity memory up front for small ones.

~~~
nradov
What you're really asking for is to change the backing
HashMap.DEFAULT_INITIAL_CAPACITY field.

[https://hg.openjdk.java.net/jdk/jdk14/file/f77e9e27b68d/src/...](https://hg.openjdk.java.net/jdk/jdk14/file/f77e9e27b68d/src/java.base/share/classes/java/util/HashMap.java#l237)

There's no hard data to show that a smaller value would give better
performance for typical Java workloads. Resizing operations are quite
expensive.
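
As a side note, when the expected size is known, callers can avoid resizes themselves instead of relying on the default; a hedged sketch (the method name is made up, 0.75 is the JDK's default load factor):

      import java.util.HashSet;
      import java.util.Set;
      
      final class PresizedSets {
          // Size the backing table so expectedSize elements fit without a resize:
          // HashMap resizes once size exceeds capacity * loadFactor (0.75 by default).
          static <T> Set<T> newHashSetWithExpectedSize(int expectedSize) {
              return new HashSet<>((int) Math.ceil(expectedSize / 0.75));
          }
      }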

~~~
asveikau
It sounds like you are squeezing my suggestion into a very specific solution.
If there were a fix that didn't tweak the current knobs but made low numbers
of elements cheap, while still minimizing resizes once some reasonable
heuristic for "mid-size to large" hash tables kicks in, would you oppose it
universally? Resizing a small number of elements is not going to be bad.

------
unilynx
The Windows /3GB bootflag might have also given the needed 0.5GB of extra heap
space, but apparently the JVMs couldn't really make use of it according to
[https://stackoverflow.com/questions/21631167/java-heap-space...](https://stackoverflow.com/questions/21631167/java-heap-space-and-the-ram)

