In Java 3 = 12 (virtualspecies.com)
36 points by sayham28 on Nov 15, 2017 | 52 comments

1) Left-associativity

2) The price you pay for type coercion. Be thankful Java defines the order of evaluation. Not all languages which coerce specify the order of evaluation.

Yes, but that's strange. Java is such a statically typed language that you would not expect it to do that if, like me, you don't know it very well.

Python, on the other hand, is very dynamic, and yet:

     >>> print(1 + 2 + "=" + 1 + 2)
     Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
     TypeError: unsupported operand type(s) for +: 'int' and 'str'

Dynamic vs. static typing is independent of strong vs. weak typing. Strong typing basically means allowing fewer implicit conversions; weak typing allows more. Static means type checking happens before runtime; dynamic means it happens at runtime.

Thanks for that explanation. So Java is weakly typed? Or would you say less strongly typed? Is the strength determined by the amount of implicit type coercions that are available?

Does static vs dynamic also imply that the strong-ness is checked at compile time vs runtime? In other words, in Python this bug can lurk undetected in your code:

    list = (1, 2, 3)
    x = list + 1
while in Java it will throw the error at compile time:

    int[] list = {1, 2, 3};
    Object x = list + 1;

> In other words, in python this bug can lurk undetected in your code

Well, if you use Python 3 and a modern editor, you will get a big red warning. mypy has been able to check types for a while and a lot of editors embed it.

But mypy is basically a static type checker for python. So my point still stands that static implies that strong-ness is checked at compile time, or IDE time in this case.

Thinking about this more, if all that strong vs weak is referring to is implicit type coercions then it is a useless confusing term. Static vs dynamic refers to when the type checking occurs, implicit type coercions are just one of the things the type checker looks for.

I've always seen the difference as this:

- In statically-typed languages, variables are containers, which allow only objects of a certain type to be contained within.

- In dynamically-typed languages, variables are more like labels which can be attached to different objects, which can be of different types.

Each of these approaches has its merits, which is obvious when you look at the lengths languages of one type go to in order to emulate the features of the other: interfaces and generics in statically-typed languages, or type hints and static checkers in dynamic ones. But I have yet to see a language that would offer both types of variables...

As a Python developer, I can easily imagine a PEP that would introduce some kind of "container variables" that would accept only properly typed objects and throw a syntax error if a violation occurs at compile time. But I'm not holding my breath, with a language which ironically doesn't even have an implementation for constants.

> Does static vs dynamic also imply that the strong-ness is checked at compile vs runtime?

It's less like "implies" and more like "it's the definition of".

Strong vs weak typing also means whether you can circumvent the type system. So C/C++ are weakly typed systems because the laws of types are more like guidelines there and you can cast around them to reinterpret memory contents.

Ruby is like Python; Perl and JavaScript are like Java. Type coercion has little to do with being strongly or weakly typed, dynamically or statically typed, interpreted or compiled, which are orthogonal axes. It's more like syntactic sugar to save a str(), a .to_s or a printf pattern. Maybe the Java implementation just overloads the + operator and converts the types inside it.

Strong/weak is not a useful thing most of the time, but when people do try to define it, often they do so in terms of the presence and number of implicit conversions the language will perform for you in order to allow operations to succeed despite being provided operands of the wrong type.

Also, overloading the "+" operator is basically what Java is doing, which is an odd inconsistency for a language which doesn't allow end-user code to do operator overloading.

This is possible exactly because Java is statically-typed. Java's type system works over the operands, and does implicit conversions before the expression gets a chance to run. It sees values and does implicit magic from left to right.

Python, on the other hand, does not need to do a typing pass at parse-time. Instead, literals are stored as values, and those values are tagged with their source class. Python's way of doing it means that a + operation is called upon with an integer and a string.

Java might be able to yield the same kind of result as Python if it had operator overloading. It could define Integer's + operator as only taking other Integer values, therein yielding a type error at parse-time.

My own language uses ++ instead of + to distinguish between addition and concatenation, though I've considered some other token since the two are quite similar to each other.

Actually Python does have a way to change addition semantics. It's just not defined for this operation.

> This is possible exactly because Java is statically-typed.

Are you saying that Python can't make "a"+1 become "a1" because it lacks static typing? That's not true. If they wanted to, they could have added an overload for the string+integer case.

> Java might be able to yield the same kind of result as Python if it had operator overloading.

Not necessary. They would have just had to omit the built-in overload for string+integer (and perhaps also for string+object, given the newish auto-boxing feature).

This has literally nothing to do with static typing. It is 100% due to implicit promotion (demotion?) of addition operands to string if either operand is a string.

All static typing means is that the type check, and so the decision to insert the conversion call, is done at compile time. Dynamic typing just means that the type check and the decision to insert the conversion are done at the point the + is evaluated.

That's also strange to me.

You are seeing this error because Python is a strongly typed language.

But what's weird is that Java is also a strongly typed language[0]

0: https://en.m.wikipedia.org/wiki/Java_(programming_language)

Strong v weak typing isn't an absolute categorisation, it's a spectrum. There aren't many places Java will automatically coerce types, and the ones that do exist are narrow (numeric promotion, string concatenation, and I can't think of any others).

I'm glad some of the newer languages like Rust and Go just dump implicit coercion altogether. Yeah, it's convenient, but it's not worth the headache and subtle bugs.

I would have expected the compiler / type checker to barf at the example code. Then again, I am not a Java person.

However, I do dislike relying on automatic type coercion anyway, because my intuition is not good at predicting what will happen, and I am too lazy to learn all the rules. ;-)

This isn't at all surprising if you know how Java works; you could probably point out this kind of strange edge case in nearly any language if you look hard enough.

It evaluates operands left to right, so 1 + 2 happens as integer addition since both operands are integers; then 3 + "=" does string concatenation since one of the operands is a String; then "3=" + 1 evaluates as a concatenation for the same reason, giving "3=1" + 2, which evaluates to "3=12" by the same logic. Parentheses solve this by explicitly specifying the order of operations.

And explicitly specifying the order of operations is almost always a good idea, both for making sure the code does what you want and for maintenance purposes.
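To make the walkthrough concrete, here is a minimal sketch of both forms. The class and method names are made up for illustration; the expressions are the ones from the thread.

```java
// Hypothetical class name, used only for illustration.
class ConcatOrder {
    static String unparenthesized() {
        // Grouped as ((((1 + 2) + "=") + 1) + 2):
        // one int addition, then three string concatenations.
        return 1 + 2 + "=" + 1 + 2;
    }

    static String parenthesized() {
        // The parentheses force the right-hand sum to be computed as an int first.
        return (1 + 2) + "=" + (1 + 2);
    }

    public static void main(String[] args) {
        System.out.println(unparenthesized()); // 3=12
        System.out.println(parenthesized());   // 3=3
    }
}
```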

> you could probably point out this kind of a strange feature case in nearly any language if you look hard enough

I disagree. I think it's a product of bad language design.

I disagree. I am not an expert in Java language subtleties, but I had no problem predicting and explaining this output. I think it is a product of the language's simplicity. I have deeper knowledge of C and more experience with it, but I regularly fail to predict the output in similar exercises.

The bad design here is having implicit type conversion at all. The few characters you save with this rarely used feature are not worth the potential confusion. But it is a trade-off, Java is normally criticized in the other direction, for forcing you to be overly explicit.

Does the same in .NET


Used .NET for 15 years and never knew this...

It evaluates left to right.

(1+2)=3 then concat with string = then concat with 1 and concat with 2.

The optimiser converts these concats to a StringBuilder.


Second time I've seen something like this on HN.

I really don't think this should be confusing - having a decent grasp on associativity and precedence is a pretty useful skill - so for example understanding WHY println(3 * 2 + "-" + 6 * 3) behaves "differently" is something I'd hope people would get.
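A small sketch of why that println example behaves "differently", assuming nothing beyond standard Java precedence rules. The class and method names are invented for the example.

```java
// Hypothetical class name, used only for illustration.
class PrecedenceDemo {
    static String withMultiplication() {
        // * binds tighter than +, so both products are computed as ints
        // before any concatenation happens: 6 + "-" + 18.
        return 3 * 2 + "-" + 6 * 3;
    }

    static String withAddition() {
        // All operators are + here, so grouping is purely left to right:
        // (((3 + 2) + "-") + 6) + 3.
        return 3 + 2 + "-" + 6 + 3;
    }

    public static void main(String[] args) {
        System.out.println(withMultiplication()); // 6-18
        System.out.println(withAddition());       // 5-63
    }
}
```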

And let's be honest, you definitely don't want to 'fix' this by tweaking precedence to include types. That way madness lies. ;)

Definite shout to ubernostrum's comment though - perhaps the confusing thing is that string concatenation is +, but a world of explicit Appends seems horrid, and that ship has very much sailed.

Just be (maybe) glad java doesn't allow generalised operator overloading ;)

I think + for append is fine, it certainly seems to make sense to me that you get "helloworld" from "hello"+"world".

I think the confusing thing is the type coercion. Does it make sense that "2"+3 is "23" but 2+"3" is not 5?

    (3*4).toString() + "=" + 3.toString() + 4.toString()
would not be as surprising, since the two sides are visibly different.
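For reference, a quick sketch of the "2"+3 versus 2+"3" case in real Java. Both orders concatenate, and getting 5 needs an explicit parse. The class and method names are hypothetical.

```java
// Hypothetical class name, used only for illustration.
class CoercionSymmetry {
    static String stringFirst() {
        return "2" + 3; // the int is converted to a String
    }

    static String intFirst() {
        return 2 + "3"; // same rule: one operand is a String, so concatenate
    }

    static int explicitSum() {
        // Numeric addition requires converting the String yourself.
        return 2 + Integer.parseInt("3");
    }

    public static void main(String[] args) {
        System.out.println(stringFirst()); // 23
        System.out.println(intFirst());    // 23
        System.out.println(explicitSum()); // 5
    }
}
```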

In Ada the (String) concatenation operator is '&' (and of course, no implicit promotion to String there, you'll have to use your_type'Image(your_var) or with GNAT your_var'Img).

I expected to see something more sophisticated than a simple implicit conversion to string.

And in php (after https://eev.ee/blog/2012/04/09/php-a-fractal-of-bad-design/):

"foo" == TRUE and "foo" == 0 but TRUE != 0

123 == "123foo" but "123" != "123foo"

"6" == " 6", "4.2" == "4.20" and "133" == "0133", but 133 != 0133, "0x10" == "16" and "1e3" == "1000"

I suppose that's something every Java programmer should know: How types are automatically converted during "arithmetic" operations. There are some subtleties (like byte+short => int), but it's not too much to remember.
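The byte+short => int subtlety can be observed at runtime by boxing the result, as in this sketch (class and method names are made up for the example):

```java
// Hypothetical class name, used only for illustration.
class NumericPromotion {
    static Object sum() {
        byte b = 1;
        short s = 2;
        // Binary numeric promotion widens both operands to int before adding,
        // so the boxed result is an Integer, not a Short or a Byte.
        return b + s;
    }

    public static void main(String[] args) {
        System.out.println(sum().getClass().getSimpleName()); // Integer
    }
}
```

This is also why `short r = b + s;` does not compile without a cast.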

The deeper question is: Was it a wise choice to disallow operator overloading BUT use "+" as a string concatenation operator AND convert all values to strings during string concatenation?

I think not, but it's a rather moot point, since that design decision is not likely to change...

Eh, it's just a design decision. They could have required that people construct strings using printf-style formatting (which you can do, with String.format(), but isn't required), or have some sort of interpolation (like Scala does with s"This var is $foo"), but they just opted to let people concat with + instead. You could argue it any way, I suppose; there's no one right answer.
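As a sketch of the String.format() alternative mentioned above, here the values are computed as ints and only the format string decides how they are rendered. The class and method names are hypothetical.

```java
// Hypothetical class name, used only for illustration.
class FormatDemo {
    static String formatted() {
        // Each %d receives an int argument; no implicit concatenation is involved.
        return String.format("%d=%d", 1 + 2, 1 + 2);
    }

    public static void main(String[] args) {
        System.out.println(formatted()); // 3=3
    }
}
```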

The behavior in the OP's example is unsurprising for a language that defines + to be left-associative and has a well-defined order of operations.

That's why I'm glad D chose '~' as a binary concatenation operator so that there is never any confusion.

There is something to be said about using all the special characters of the ASCII table in terms of readability.

Looking forward to languages that will start making good use of Unicode! I am actually surprised Greek letters haven't made it already. Lots of computer scientists are mathematicians who are used to math papers looking like an Ancient Greek tablet.

The Fortress programming language was a research project at Sun that incorporated mathematical notation.

See page 24 of http://www.oracle.com/technetwork/systems/ts-5206-159453.pdf

They also had an ASCII-only representation for everything, so you could use a regular text editor if you wanted, kind of like Markdown.

I feel the title is a bit unfair. Technically it is '= + 1 + 2'. Which is expected as a Java developer.

Coming from other languages maybe it does look a bit odd though

This was covered in literally the second exercise session (so the second hour of learning Java) at university when I learned Java.

I had to look twice at this. I was surprised that I could add the string "=" to 3. I guess what's happening here is that the int gets boxed and then implicitly converted to String. The proof of the pudding would be to try it with a simple string assignment rather than System.out.println. What happens after the "=" is more straightforward.

This code gives the same result:

        String splat = 1 + 2 + " = " + 1 + 2;
So the author's assertion that it is based on some `System.out.println` compiler magic is false.

The JLS spec [0] accounts for this in its treatment of the '+' operator:

"If the type of either operand of a + operator is String, then the operation is string concatenation."

This does seem like idiosyncratic behaviour for Java, to be honest.

I don't think it's the compiler doing it either, because this will throw a null-pointer exception:

        Integer borken = null;
        String splat = borken + 2 + " = " + 1 + 2;
EDIT: Also interesting, if I drop the language level to 1.4 I get this error:

    Plus.java:8: error: bad operand types for binary operator '+'
        String splat = borken + 2 + " = " + 1 + 2;
[0] https://docs.oracle.com/javase/specs/jls/se7/html/jls-15.htm...
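Here is a runnable sketch of the null case, wrapping the snippet above in a try/catch. The second method is an added contrast, not from the original comment: when the null Integer sits directly next to a String, it is concatenated as "null" instead of being unboxed. The class and method names are hypothetical.

```java
// Hypothetical class name, used only for illustration.
class NullUnboxing {
    static boolean numericAddThrows() {
        Integer borken = null;
        try {
            // borken + 2 is numeric addition, so borken must be unboxed first,
            // and unboxing null throws NullPointerException.
            String splat = borken + 2 + " = " + 1 + 2;
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    static String concatWithNull() {
        Integer borken = null;
        // Integer + String is string concatenation: null is rendered as "null".
        return borken + " = " + 1 + 2;
    }

    public static void main(String[] args) {
        System.out.println(numericAddThrows()); // true
        System.out.println(concatWithNull());   // null = 12
    }
}
```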

My question is: why shouldn't it be, assuming there is no compiler error and no funny pointer arithmetic? If the surprising part were the lack of an error, then 1 + "0" would be a sufficient example. Is the surprising part that left associativity and the precedence of the + operator do not depend on data types?

This is standard behaviour I would expect in most languages that have implicit type conversions. The article seems to be willfully disregarding that. I can think of no language that would not produce either that string, a fault, or (on the outside) NaN. Java is doing a completely sensible thing here.

The majority of operators are left associative, and that associativity is not dependent on the types involved.

At a very high level, let's say the grammar is:

    Expr :- Expr + Primitive
         |  Primitive
    Primitive :- Integer
              |  String

And then determining the type is always:

    type(Integer) = int
    type(String) = string
    type(Expr) = type(Primitive) or minimum_type(left, right)
With the simple rule (in this case)

    minimum_type(left, right)
        if (left == right) return left;
        return string
In Java the type conversions are done as a pass at compile time, doing something like this:

    for (node in AST)
        if (type(node) != required type)
            replace node with type_conversion_node(node, required_type)
But none of this is Java-specific -- all statically typed languages (with implicit conversion) would do this. Dynamically typed languages in general cannot do this (in practice you can do a degree of it with local inference, etc), but that just changes when the type check is done.
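The typing rule above can be sketched in Java as a left fold over the operands of a + chain. This is a toy model of the pseudocode, not how javac is actually implemented; ChainTyper, Type, minimumType and typeChain are all invented names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the pseudocode above; all names here are hypothetical.
class ChainTyper {
    enum Type { INT, STRING }

    // minimum_type from the comment: equal types stay as they are,
    // any mismatch falls back to string.
    static Type minimumType(Type left, Type right) {
        return left == right ? left : Type.STRING;
    }

    // For a left-associative chain a + b + c + ..., return the type of
    // each '+' node, innermost (leftmost) first.
    static List<Type> typeChain(List<Type> operands) {
        List<Type> nodeTypes = new ArrayList<>();
        Type acc = operands.get(0);
        for (int i = 1; i < operands.size(); i++) {
            acc = minimumType(acc, operands.get(i));
            nodeTypes.add(acc);
        }
        return nodeTypes;
    }

    public static void main(String[] args) {
        // Operand types of 1 + 2 + "=" + 1 + 2
        List<Type> operands = Arrays.asList(
                Type.INT, Type.INT, Type.STRING, Type.INT, Type.INT);
        System.out.println(typeChain(operands)); // [INT, STRING, STRING, STRING]
    }
}
```

Only the first + is typed as int addition; once a string enters the chain, every later + node is a concatenation.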

Edit, because the same confusion about static/dynamic/strong/weak typing is being made in multiple places:

Strong typing: The language does not allow you to perform an operation on a type if that operation were invalid when applied to that type. In Java (to make it explicit) I could do (int[])someUnrelatedObject - that cannot be statically verified but it will fault at run time, because it is strongly typed. The equivalent in C will happily go off into the weeds and destroy everything.
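A minimal sketch of that cast, assuming the unrelated object is a String. It compiles, and the check happens at run time. The class and method names are hypothetical.

```java
// Hypothetical class name, used only for illustration.
class CheckedCast {
    static boolean castFails() {
        Object someUnrelatedObject = "not an array";
        try {
            // Accepted by the compiler (an Object might be an int[]),
            // but verified at run time.
            int[] numbers = (int[]) someUnrelatedObject;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails()); // true
    }
}
```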

Weak typing: A language does not guarantee that invalid operations will be prevented. For example C and C++ do have types, and they will disallow invalid uses, but there is no guard to prevent you from casting to an unrelated type. A more extreme example is some older languages that don't even disallow argument mismatches, etc.

Static typing: the types in the code are static, specifically they do not change run-to-run. The types of fields, variables, etc in the source could be inferred, or made explicit by the programmer, but they are all known before the code is ever run. So C, C++, Java, etc are statically typed. Mostly. Because of Java's boneheaded array subtyping behaviour, Object array[]; array[0] = someObject; can fail at runtime due to a dynamic type check.
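The array subtyping problem can be reproduced with a sketch like this one, where the Object[] variable actually refers to a String[] (class and method names are made up for the example):

```java
// Hypothetical class name, used only for illustration.
class ArrayCovariance {
    static boolean storeFails() {
        // Legal at compile time: String[] is a subtype of Object[].
        Object[] array = new String[1];
        try {
            // Type-checks statically, but the runtime checks the element
            // type of the real array and rejects the Integer.
            array[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails()); // true
    }
}
```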

Dynamic typing: some or all expression types are not known until the code is run. Everyone knows Python, JS, etc.

(getting tired running out of steam)

So anyway, you can have:

Strong/Dynamic: Python, JavaScript, ...

Strong/Static: Haskell, etc

Weak/Dynamic: Can't think of a good example here because tired :D

Weak/Static: C

And obviously there's a tonne of in-between

C++: all of C, plus statically type safe casts, and dynamically type safe casts

Objective-C: all of C, but the ObjC object system which lets you send arbitrary messages to arbitrary targets. But you don't have to have the correct parameter types...

...

And this all ignores what caused this article in the first place: implicit type conversions. These do not affect the Strong/Weak description of the language if they are well defined, e.g. people often complain about string promotion, but neglect to comment on someInt + someFloat promoting the int to a float, even though such a promotion can lose data. The distinction is purely a matter of "does the language allow you to perform an operation on a value of a type that is not compatible with that operation". One way to achieve that is to say it's completely illegal: throw an exception at runtime, fail to compile, etc. Another alternative is to implicitly convert the type to something that is valid. Note that /converting/ a value changes the value being used. Compare that to the not type safe version where you literally ignore the type of the value being used. In the case of int + float it's the difference between converting the int to a float value, and just treating the bits of the integer as if they were the bits in a float.
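The converting versus reinterpreting distinction can be sketched in Java, where reinterpretation is only available through an explicit library call (Float.intBitsToFloat). The class and method names are hypothetical.

```java
// Hypothetical class name, used only for illustration.
class ConvertVsReinterpret {
    static float converted(int i) {
        // int + float: the int is implicitly converted to the nearest float.
        return i + 0.0f;
    }

    static float reinterpreted(int i) {
        // Treats the 32 bits of the int as if they were an IEEE 754 float.
        return Float.intBitsToFloat(i);
    }

    public static void main(String[] args) {
        System.out.println(converted(1));        // 1.0
        System.out.println(reinterpreted(1));    // 1.4E-45
        // 2^24 + 1 is not representable as a float, so the conversion loses data.
        System.out.println(converted(16777217) == 16777216.0f); // true
    }
}
```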

Many apologies about the awful writing, but it's late :D

I was expecting something more interesting, very disappointed.

I think it's mildly amusing that Python produces a TypeError here, while the altogether stricter Java does not.

One of these plusses is not like the other.

Things to consider here are:

- operator precedence

- implicit type casting

- how operator+ is defined per each type (e.g: numbers add, strings concatenate)

The interesting thing is that the "obvious correct" behaviour would require the precedence of the + operator to depend on the types of the operands - which is obviously impossible if it's done at the parse stage...

So on mobile you can't see the entire width of that website, but trying to scroll to the left or right changes articles. Absolutely brilliant. Why people think that overriding the scroll on mobile is a good thing is beyond me.

The explanation is horribly wrong.

1 + 2 + "=" + (1 + 2)

Or as one might write in real life:

  (1 + 2) + "=" + (1 + 2)
A few braces here and there are a very practical substitute for wasting brain cycles on subtleties in operator precedence and associativity.

Oh but it's not an article about how shitty javascript is, so let's just ignore it.

so does javascript
