
Java: "+=" applied to String operands can provoke side effects - javinpaul
https://bugs.openjdk.java.net/browse/JDK-8204322
======
kjeetgill
Wow. This is a pretty big bug. The official bug report:
[https://bugs.openjdk.java.net/browse/JDK-8204322](https://bugs.openjdk.java.net/browse/JDK-8204322)
as Stuart Marks, a jvm architect, posted. I'm surprised it stayed uncaught for
so long.

For the unfamiliar, the way string concatenation is compiled changed with
invokedynamic.

Previously, it would compile to either Sting.concat calls or transform into a
StringBuilder, but with invoke dynamic we can leave it to the runtime to
optimize when it first runs.

~~~
ballenf
Edit: This is obviously a bug and needs to be fixed. My post is more about how
the syntax "+=" is usually explained as shorthand for "x = x + 1"
([https://www.tutorialspoint.com/java/java_basic_operators.htm](https://www.tutorialspoint.com/java/java_basic_operators.htm)),
when it's not strictly true (depends on what's in "x").

"+=" is [usually explained as] shorthand for repeating the left side. Why
should

> foo[i++] += 1

evaluate differently than

> foo[i++] = foo[i++] + 1

?

The way it currently works is the way I would expect it to work, but I'm
evidently in the minority here. I also don't write Java and since this is a
change from Java 8, it needs to be fixed.

Let me state that again: yes, of course this bug needs to be fixed.

~~~
jjnoakes
> The way it currently works is arguably the way it should work, irrespective
> of the spec.

The spec is how it should work, regardless of developer's opinions.

> Why should [a] evaluate differently than [b]?

In my opinion (and I'm guessing in the opinion of the spec authors, though I
can't be sure), saying "x += y" means "add y to x", so I'd expect the language
to:

    
    
      1) Figure out what "x" is once
      2) Add "y" to it
    

Your opinion seems to be that saying "x += y" means "desugar to x = x + y" and
you expect the language to:

    
    
      1) Figure out what "x" is as a value
      2) Add "y" to the value of "x"
      3) Figure out what "x" is as a location (which may differ from 1)
      4) Store the result into "x"
    

To me step 3 is very surprising, that "x" may differ, which is why I think the
spec's definition is the more intuitive one (plus, it doesn't require
knowledge of "desugaring" things).

------
Terr_
> Operator += is broken for Strings in Java 9 and 10

The title [Edit: since changed] is a little bit fear-monger-y. It's only
broken when the left hand side is not safe to evaluate more than once.

It can be prevented with a compile-time flag.

~~~
java-man
I think it's a big deal: introduced in 9 and fixed in 11? It suggests two
things: Oracle has gaps in the regression test suite, and they don't consider
compiler error such as these as a big issue.

Java 9 broke other things. Here is one: [https://github.com/andy-
goryachev/FxDock/issues/8](https://github.com/andy-goryachev/FxDock/issues/8)
WebView is working fine in 8, totally broken in 9, partially broken in 10. Do
they have a QA department?

~~~
kjeetgill
Yea, I'm surprised the infamous TCK regression suite didn't include something
so fundamental.

That said, javafx is a bit of an orphan compared the rest of the JDK. And
that's why it's getting pushed out.

~~~
java-man
that's another gripe...

~~~
kjeetgill
I don't know if this is your issue but in a minor Java 8 release, the JavaFX
team made all references held by a webview's JS weak references.

What a dramatic change over what seems to be a solvable memory leak. I only
toyed with JavaFX but at least it's webview is dead to me.

See:
[https://bugs.java.com/view_bug.do?bug_id=8154127](https://bugs.java.com/view_bug.do?bug_id=8154127)

~~~
java-man
interesting. no, it is not my issue - in my case nothing gets rendered in
WebView, the view stays blank.

and another thing - how does one submit a defect report to Oracle? neither
bugs.openjdk.java.net nor bugs.java.com have a clear link, and I can't find
where one can register an account.

------
nine_k
Adding side effects to your array index expression and depending on order of
their evaluation is a bad idea anyway.

Unfortunately, the `++` / `--` made it to Java, and are still used sometimes,
despite JVM not being PDP-11, and definitely not trying to conserve every
precious byte of RAM.

~~~
wrs
Apparently it’s nothing specific about arrays or ++ — _any_ expression on the
left hand side of the compound operator is evaluated twice, so any side
effects happen twice.

~~~
nine_k
If you read the original post to the end, you'll notice a piece of code that
seemingly refutes this statement. That is, in case of Java 8, the _index_
expression is evaluated once. and in case of Java 9, twice.

~~~
Dylan16807
What does that refute? You seem to be agreeing with the post you're replying
to.

But importantly your original point about "depending on order of their
evaluation" is wrong. Most of the examples don't care about order at all. They
break because things happen the wrong number of times.

------
freefal
I used that operator in a tight loop before the compiler was smart enough to
automatically convert the code to use a StringBuilder. Boy was that slow...

~~~
rmetzler
I didn't touch much Java Code for ~ 10 years now. I remember reading about the
string concatenation problem in the Effective Java book. Using a StringBuilder
was the very first thing I usually tried when I needed to improve performance
on some unfamiliar code.

I didn't know the Java compiler converts this automatically now. Since which
version is this? I suspect there might be still edge cases where it doesn't
work.

------
ianamartin
Time to rewrite Java in Rust.

~~~
dtech
I know it's a (bad) joke, but Rust wouldn't solve this. This is incorrectly
implemented logic and Rust will not prevent you from doing that.

~~~
kibwen
To be marginally less facetious, Rust would "fix" this problem by dint of
deliberately neither having a ++ operator (in either post- or pre-increment
varieties) nor by allowing assignment operators (including augmented
assignment ops) to evaluate to the value of their RHS, so the example from the
OP would have to be written in a different fashion. :P (Rust has previously
had bugs of its own related to the evaluation order of += and friends, it
seems like a problem that nearly every language has at some point...)

~~~
Dylan16807
> neither having a ++ operator

The bug happens with a function call on the LHS too.

> nor by allowing assignment operators (including augmented assignment ops) to
> evaluate to the value of their RHS

I'm confused. The assignment is happening last, and whatever it evaluates to
is being discarded. How would changing what it evaluates to do anything?

But more importantly, the comment was about implementing _Java itself_ in
Rust, not the broken function, and that is why it wouldn't change anything.

------
kazinator
Lisp macros get this right.

    
    
      [2]> (macroexpand '(incf (aref a (incf i)) x))
      (LET*
       ((#:G3214 A) (#:G3215 (INCF I)) (#:G3217 X)
        (#:G3216 (+ (AREF #:G3214 #:G3215) #:G3217)))
       (SYSTEM::STORE #:G3214 #:G3215 #:G3216)) ;
      T
    

You can't just blindly expand the += by copying the expressions/AST node on
the left. Or in general; you don't take expressions in the program and blindly
duplicate them. How can someone working on compilers not know that???

~~~
kjeetgill
I see what you're getting at, but that's not the mistake that was made here.
People working on it knew what it should do. It's in the spec so Java agrees
with you.

This bug is a little narrower than this makes it out to be, it ONLY fails when
the array on the left is a String[ ]. The way String operations get compiled
down to bytecode changed a lot around those releases so the bug is more like
an unforeseen interaction between optimization passes than a naive MST/Macro
implementation.

The "funny" thing here to me is the surprising gap in the regression testing
given the quality of most of the JDK. go figure

------
pleasecalllater
Yea... what was the Bloch's explanation for not having operator overloading in
Java? I think it was something like "it's not allowed because I've seen too
many programmers who did that in a wrong way" :)

On the other hand this explanation is stupid, as a programmer can implement
`add()` in a wrong way.

------
oblong
> "side effects"

A side effect would be like submitting an order for am Amazon Echo by mistake.

This is a bug, a serious bug, that leads to variables being assigned the wrong
value.

~~~
brianpan
"side effects" because this is only a problem if your lhs is not stateless
(has side effects).

