
Java.­math.­BigDecimal toString is not thread safe - ThomasKrieger
http://vmlens.com/articles/java-math-BigDecimal-toString-is-not-threadsafe/
======
TazeTSchnitzel
I don't understand the race condition here.

stringCache is only ever assigned to with an initialised string (the output of
stringCache()), so given Strings are immutable and assuming Java's assignment
is atomic, how can you get a race condition that causes a problem? There
shouldn't be a TOCTTOU issue either given stringCache is copied to sc before
being null-checked.

What is the statement reordering doing here to break this? How can it be
prevented? (Does it need write barriers or something?)

~~~
joncrocks
The issue may be to do with the guarantees that the JVM gives you about orders
of operations and write visibility across threads.

The author hints at a possible solution when they say "As we see a non-
volatile field stringCache", the key being 'volatile' which is a Java keyword
with a specific meaning that constrains the order of operations and write
visibility across threads.

See [http://tutorials.jenkov.com/java-concurrency/java-memory-
mod...](http://tutorials.jenkov.com/java-concurrency/java-memory-model.html)
for a reasonable explanation.

edit: suggestion elsewhere that it could be a wonky JVM implementation, so may
not be a BigDecimal problem after all

~~~
exDM69
> the key being 'volatile' which is a Java keyword with a specific meaning
> that constrains the order of operations and write visibility across threads.

I must point out that "volatile" in Java means something completely different
than what it means in C and related languages.

In multithreaded C code, "volatile" is almost always incorrect. There are only
a few correct and portable uses of volatile (such as dealing with setjmp or
unix signal handlers).

~~~
_pmf_
The most important thing being that in Java, volatile is a memory barrier,
whereas in C it isn't.

One thing I'm unsure of: I think in C, a volatile write to some location and a
volatile write to another location (even without any data dependency) may not
be reordered); is this correct?

~~~
mikeash
C's volatile guarantees very little. It's mostly there for compilers to
implement in whatever way is useful for the environment they run in, and so
there isn't much portable code you can usefully write using volatile.

All it really guarantees is that reads and writes won't be elided. For
example:

    
    
      *x = 42;
      *x = 43;
    

If x is a normal pointer, the compiler can eliminate the first line. If it's a
pointer to volatile, the compiler must write both values.

Volatile predates multithreading in C (it was meant for memory-mapped IO and
similar things) and hasn't been updated for it, so it has pretty much no
useful properties for multithreading. There are _no_ guarantees about
reordering when it comes to multiple threads. You're guaranteed to see reads
and writes in the correct order from the perspective of the thread your code
is running on, but the compiler won't insert any memory barriers, so it's
completely up for grabs how other threads might see it.

(More completely, it depends entirely on your CPU's memory model. If you're on
an architecture which does strict ordering at the hardware level then you
could potentially take advantage of that. If you aren't then you'll see
whatever crazy results hardware reordering might produce.)

In contrast, Java's volatile is _only_ about multi-threading. So really, the
only thing that's similar between the two languages' use of volatile is how
they spell the keyword.

------
pulse7
Very similar bug was in Digitalk Smalltalk 25 years ago: conversion from
integer to string used a global cache... I notices this while running a batch
with progress indicator on UI. Sometimes integer to string conversion in batch
didn't work and I couldn't find the reason. After loooooooong period of
extensive debugging the issue was pinned down to global cache in integer to
string conversion, which was not thread safe. At first Digitalk didn't want to
fix it "because of performance issues", but in later versions it was fixed...
This bug can cause very serious side effects...

~~~
xxs
BigDecimal has nothing to do with the issue. There is no race conditional. The
code is textbook example how to use (cheap) unsafe publishing with no
locks/CAS -- assign to local variable and 'return' the variable value (not a
class member one). There is no global cache, it's per BigDecimal instance. As
for 'this bug', yes JVM concurrency bugs are serious by definition.

------
gwbas1c
I don't understand something:

> stringCache = sc = layoutChars(true);

How does this assign an uninitialized String to stringCache? I thought String
is immutable and the String object is fully calculated at assignment? Where is
the StringBuilder used when stringCache and sc are objects of type "String?"

Shouldn't the "problem" be that two threads might call layoutChars at the same
time, leading to some extra CPU cycles wasted and some extra garbage?

I suspect that the real source code is different, or the real problem that
leads to the NRE is different.

~~~
mmastrac
You are correct that this is allowed by the Java spec (1). Final fields are
guaranteed to happen before reference assignment, and strings are explicitly
mentioned. It appears to be an ARM JVM bug.

1\.
[https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.htm...](https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.5)

------
xxs
Not sure if late for the party... But there is no bug int the impl.
BigDecimal.toString(). It's a JVM bug. It blatantly violates the spec of the
original JMM (as of Java 1.5 and backported to 1.4)

Not only that but it lacks optimization/intrinsics for String. String.length()
should never/ever yield a NPE. I'd expect a process crash than adding metadata
for trapping read access faults.

\---

Edit: final fields have become ubiquitous even for objects that are safely
published [via volatile, synchronizeed, CAS, before thread.start()]. All
wrapper classes Integer, Long, etc. have them. Final fields are very much
advised to be used and they do help reliability and readability by making it
easy to reason about object state (i.e. it doesn't change once seen). On x86
field fields require a compiler barrier at best as the writes are not
reordered. In other words they are extremely cheap. Stuff like
AtomicReference.lazySet is next to free and a welcome way to build fast
concurrency primitives.

There is Doug Lea's parer[0] for the improved jmm. There are different
versions of ARM architecture with different memory models, overall ARM is
considered weak. v7 has dmb[1] only, and to my knowledge it's not cheap.
Skipping dmb requires rather deep analysis, so I wonder if that was the case
experienced by the poster. ARMv8 has store-release fence but I don't know how
efficient would be spamming it.

0:
[http://gee.cs.oswego.edu/dl/html/j9mm.html](http://gee.cs.oswego.edu/dl/html/j9mm.html)

1:
[http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc....](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CJAIAJFI.html)

------
GreenToad
In java lang specs: "A thread that can only see a reference to an object after
that object has been completely initialized is guaranteed to see the correctly
initialized values for that object's final fields." This makes it a JVM bug
because no thread should be able to see uninitialized final fields of an
object.

------
vivaamerica1
Agree with others that String.length() should never ever throw NPE from within
String, regardless of it being called from whatever thread.

Now, in the comments there's another idea suggesting the unsafe publication of
String (partially constructed). This would be the case if String.value wasn't
final (essentially the case of double-checked locking). But that's not the
case here because String.value is final[0]

While it's a JVM bug, making stringCache volatile would be one way to fix it -
unless the JVM is also broken around volatile...

[0] [https://stackoverflow.com/questions/11306032/please-
explain-...](https://stackoverflow.com/questions/11306032/please-explain-
initialization-safety-as-spelled-out-in-java-memory-model)

------
gus_massa
Did you fill a bug report about this?

------
benmmurphy
is there even a good reason for this to be cached. this almost like a memory
leak. if someone wants to cache the result of toString they should be doing it
in client code. this shouldn't be something that is forced on everyone.

~~~
Cthulhu_
And yet, it's an interface, as a user you don't notice this performance
improvement at all - it returns a string version of an (immutable) data
structure, how it does that and whether it caches it is not something that the
client can control, and thus not something they should care about too much.

What they should and will care about is the performance of
BigDecimal.toString(), and if it's cached then the 2nd and onward calls will
be very fast.

Mind you, I'm sure there's a lot of similar optimizations in a lot of
toString() implementations; a generic implementation that is threadsafe would
be preferable to a roll-your-own-caching.

~~~
wruza
Fyi, there are well-known safe approaches like memoize() to do that instead of
roll-your-own. Selecting hidden cache field for every damn value seems like
the legacy architectural error. Also, on some platforms not caching this at
all would be better than caching under atomic primitive.

------
osi
the OP observed the problem on a raspberry pi. sounds like it could be a
problem with the ARM JVM that is being used.

------
chvid
The blog author is wrong; this code is not the source of the problem his test-
case triggered.

------
ht85
Is there a reason why it doesn't do something like this?

    
    
        if (cache == null) cache = computeCache()
        return cache

------
Someone
_" BigDecimal is an immutable data type. So every method of this class should
be thread safe."_

I'm not familiar with the ins and outs of Java's API and memory model and this
is unexpected, but should it? If so, where does the documentation make that
promise?

Also (nitpick) one _can_ subclass BigDecimal and make it mutable (a design
error that won't be fixed because of backward compatibility)

~~~
juancn
This is a JVM bug, BigDecimal is thread safe, but there's peculiarity of the
java memory model that the ARM JVM is not honoring and thus the bug.

~~~
Someone
That's what my question is about. The OP claims _" immutable implies thread-
safe"_. My question is whether the spec promises that, and if so, where.

Reading
[https://docs.oracle.com/javase/7/docs/api/java/math/BigDecim...](https://docs.oracle.com/javase/7/docs/api/java/math/BigDecimal.html),
I can see the class is immutable, but that page mentions neither "thread" nor
"concurrent".

And yes, I think it has to define what it means by those terms before one can
assume that seemingly obvious claim to be true. Reason? Both _thread-safe_ and
_immutable_ are fairly vague terms that different readers can interpret
differently. For example, BigDecimal is declared to be immutable, but, in the
implementation being discussed, has a field that can get modified when one
calls _toString()_ on it.

~~~
hyperpape
Immutable implies thread safe. Somewhere in the concurrency section of
Effective Java, it both that this is true and that documenting a class as
immutable is good enough to document that it's thread safe.

Not sure if the spec lays it out, but it's a consequence of the spec.

------
lozzo
which version of Java ?

~~~
dudul
Seems to be at least until Java8

~~~
pjmlp
And Oracle specific, other JVMs might behave differently.

------
british_india
Was just looking at this problem!

------
dingo_bat
I am a complete java noob, but it seems like in the test program the same
variable testBigDecimal is being shared by different threads without any lock
or mutex or any sort of concurrency control! Won't any function working on
testBigDecimal be thread unsafe if it was not specifically written assuming it
was working on a shared object? Why is this news?

Disclaimer: I am so java illiterate that I am applying my C understanding to
the code snippet.

~~~
hyperpape
It's immutable, and immutable objects are thread safe. This is also true in C:
you can safely pass references to structs between threads with no locks, as
long as you never mutate them. Java just enforces that immutability if you
write the class the right way.

...the bug is that it's not really immutable, insofar as it's using an unsafe
mutation under the hood to implement its toString method. So it either needs
to document the fact that it's not immutable despite the appearance otherwise
(crazy), or fix the problem.

~~~
cesarb
The rules for C and Java are different. In Java, it's guaranteed that the
assignments in the constructor to fields marked as final is visible to other
threads before the newly constructed object is (unless you stash a reference
to it before returning to the constructor), but C has no such guarantee, so in
C you have to put the memory barriers yourself.

~~~
hyperpape
Yes, what Java gives you for immutable objects is a guarantee of safe
publication. In both languages, once you have the object published, you can
freely share it between threads.

------
nerdherfer
There's an easy fix: use a better language.

------
merb
This is even more strange since it uses it's own implementation of toString
instead of NumberFormat (which is not thread safe)

Edit: This could be easily fixed with double-checked locking

