Java and Scala’s type systems are unsound (2016) (acm.org)
38 points by rbanffy on Nov 23, 2021 | 52 comments



Are there any java programmers around who are surprised by this? The docs don't beat around the bush: They state rather clearly that [A] generics can be used to cause heap corruption and [B] `null` is all types and no types at the same time (it can be cast to anything, it is `instanceof` nothing).
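Point [B] is easy to check for yourself. A minimal sketch (the class and method names are made up for illustration):

```java
final class NullDemo {
    // null is not an instance of any type, so this is false whatever type you test against.
    static boolean nullIsInstanceOfString() {
        Object o = null;
        return o instanceof String;
    }

    // ...yet a cast of null to a reference type succeeds at compile time and at run time.
    static String castNullToString() {
        Object o = null;
        return (String) o;
    }

    public static void main(String[] args) {
        System.out.println(nullIsInstanceOfString()); // false
        System.out.println(castNullToString());       // null
        // The same null reference can be re-cast to an unrelated type without complaint.
        Integer alsoFine = (Integer) (Object) castNullToString();
        System.out.println(alsoFine);                 // null
    }
}
```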

Any thought put into what that might mean for the soundness of the type system trivially leads to the 'correct' conclusions.

I'm not sure the headline is 'deserved' (it insinuates more alarm than I think is warranted), and the actual paper linked __definitely__ is engaging in some unwarranted hyperbole. "The existential crisis of null"?


Using casts between types with different generics can cause "heap corruption", but the compiler will give warnings on them (except if explicitly suppressed). The article doesn't use any such casts, but can still "cast" any non-null object of any type to any other, without any warnings emitted.

The trick that makes this happen, as far as I remember, relies on the Java spec prescribing one concrete way to handle generics. To emphasize how weird the situation is: multiple other compilers (including the official Java 9 one, if I'm reading correctly) actually emitted an error on it, despite it "technically" being "valid" Java code.

edit: this is of similar (mostly academic) significance to, say, finding a case where you can double-free in Rust without any unsafe code. Sure, Rust is unsafe if you use "unsafe", but other than that, you should hope it's sound. It's not nearly as catastrophic here, since JVMs are already pretty much just dynamically typed, but it's still a bad thing to have.


More accurate comparison: A double drop in rust. A double free would generally be a library bug


No, you can't cause "heap corruption". One of the casts inserted by the javac compiler can fail at runtime though. The JVM does not know or care about generics; they are "erased."


Having a Dog in a list of Cats is called 'heap corruption' in most places (including the relevant mailing lists on openjdk itself). It doesn't cause core dumps or security issues, it merely means any attempt to meaningfully interact with it will cause a ClassCastException on use.

Still called 'heap corruption'. Even if as far as the JVM is concerned, that's just a list of whatevers - it doesn't matter, on-use the bytecode will cast.
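For reference, the Java Language Specification's own term for this situation is "heap pollution". A minimal sketch of the Dog-in-a-list-of-Cats case, assuming two made-up classes Cat and Dog and a raw-typed alias to get the Dog in:

```java
import java.util.ArrayList;
import java.util.List;

final class HeapPollutionDemo {
    static class Cat { String meow() { return "meow"; } }
    static class Dog { }

    // Puts a Dog into a List<Cat> through a raw-typed alias.
    // Without the annotation, javac reports an unchecked warning for the add call.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<Cat> pollutedCats() {
        List<Cat> cats = new ArrayList<>();
        List raw = cats;
        raw.add(new Dog()); // succeeds at run time: the JVM only sees a list of Objects
        return cats;
    }

    // Returns true if reading the element as a Cat fails with ClassCastException.
    static boolean failsOnUse(List<Cat> cats) {
        try {
            Cat c = cats.get(0); // javac inserts a cast to Cat here
            c.meow();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Cat> cats = pollutedCats();
        System.out.println(cats.size());      // 1: the insert itself went through
        System.out.println(failsOnUse(cats)); // true
    }
}
```

The failure shows up where the element is used, not where it was inserted, which is what makes these bugs annoying to track down.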


I interpreted "The existential crisis of null" as funny catchy title for conference talk or some such.

I am a Java programmer, actually like Java, and took no offense at it.


>actually like java

I honestly can't fathom how that works, I thought aversion to verbosity is near-universal among programmers, so how can somebody _like_ (not just use or be okay with) a language that requires you to state the maddeningly obvious and do a mountain of paperwork all the time, so many times and for the simplest most trivial tasks. It's the modern day COBOL and worse.

When I first saw Java coming from C++ and Python object models (among others, but those two I'm most familiar with), I felt the same rage upon seeing that the accessibility modifier must be put before every field or method as people feel when they have to jump through stupid bureaucratic hoops. C++ just requires you to put "['public'|'protected'|'private'] :" then every field or method coming after that colon will be of the accessibility before it, which is the commonsense thing to do because usually all publics are together and all privates are together. It's something so very trivial and so utterly lowbar that C++ managed to do it right on the first try, I don't know what frame of mind was playing out in the mind of the first person who thought that this common thing needs to be multiplied out and stated for every field and every method it annotates.

This has been the archetypal model of my interaction with Java ever since: every time I hear the name I instantly think of the stupid and meaningless convoluted procedures typical of corrupt governments and bloated administration systems. "veryLongTypeName veryLongObjectName = new veryLongTypeName" everywhere as far as the eye can see, obscuring the dance of algorithms and data structures expressing the core logic in favor of trivialities about allocations and declarations. Then the designers introduce the useless syntax sugar "var" and gloat about how that enhances readability in some cases, as if the language's syntax shouldn't have been designed like that, and trimmed much, much more than that, from the very beginning.

"Everything is a class" because we want to be so cool like our idol Objective-C's idol Smalltalk, so you have to do stupid things like naming a class "fooUtils" to signal it's just a namespace of static methods. But then they forget that not everything is a class, and generics can't be used with primitives because reasons. Never mind the extremely obvious and painfully trivial solution of having the compiler generate a generic_primitiveType class for every generic it encounters. Even if that's still a quick-and-dirty hack that only works for code with source available, it would be infinitely better and cleaner than the hack job they did with autoboxing and unboxing.

Etc, etc, the language is so comically badly-designed it's baffling, I don't think they could have done a worse job if they tried.


'public:' / 'private:' / whatever may save you a couple milliseconds when defining a new class, sure. At the cost of having to scan arbitrarily far backwards to find the visibility of a given thing. I personally often don't group things by public/private, but by their utility (e.g. blocks of fields by what part of state they store, regardless of what of that state is public, or putting private fields caching values near the public function that sets/reads them).

I like being able to immediately look at a line, and know what type of variable it declares, and, separately, look at what the variable definition does. When reading code top-to-bottom, you can pretty much ignore variable types, but when scanning code for what you want, you pretty much ignore code, and look only at variable types.

Having to wrap your static utilities in a class does seem a bit pointless, but the alternatives are having a whole other system for outside-of-any-class static methods (which complicates the language model and makes moving parts of your program around harder), or having 'namespace YourUtils { … }', which isn't all that different from a class anyway.

Generics are by far the thing I like least about Java. And that's represented by how they were added to Java - bolted on, not actually properly designed in. A problem with autogenerated types is that that'd generate a new type for each compiled jar file, and including jar files would then have the same type defined multiple times. As far as I understand, being able to trivially fall back to non-generic types was also a reason for the weirdness. (yes, these are horrible reasons, and have resulted in a vastly worse language. Unfortunately the designers valued keeping old things working over making future things better. Yes, Java could easily be better. But it still manages to be good enough for people.)


>'public:' / 'private:' / whatever may save you a couple milliseconds when defining a new class

The issue is not at all the time you save at write-time, this is completely unimportant and automated by [IDEs|text editor macros] anyway, the issue is reading the code later. Imagine that every time I write a sentence I prepend "Banana699 says" or "Banana699 thinks", this could be automated enough to be non-bothering to me as I write it, but imagine the sheer annoying line noise that will be my paragraphs with this. This is also bad for me (the writer) because I read and re-read my words when I write, having it full of noise clutters my thought process and impedes thinking about important things.

> I personally often don't group things by public/private, but by their utility (e.g. blocks of fields by what part of state they store)

You can still do this in C++, the class declaration can contain any sequence of "modifier: declarations" clauses. (At the very extreme end, you can simply prepend 'modifier:' to every field and method and just have your Java syntax back with an additional ':'. The fragment "public: int x; private: int y; public: int z;" is valid C++ inside a class or struct declaration and does exactly what you'd think it would. But this is obviously non-optimal in this case and so many others.)

>you can pretty much ignore variable types

Just like you can ignore my irrelevant username if I put it before every sentence I write, but the mere presence of irrelevant info steals your attention and clutters the code for no good reason.

>look only at variable types

Which you can still get and very easily at that in an inferred-types language if you're using any moderately modern IDE/TextEditor. Haskell is one language whose type inference can feel like magic sometimes, and yet a simple hovering over the identifier in even a lightweight IDE like repl.it reveals what you want. It's not a problem at all. Not all info has to be stored in the raw text, after all even Java is not bad enough to require you to use "new foo.bar.baz.class" even though it would be "clearer" in the strict sense of the word. Type inference simply applies a step further, and you can completely ignore it if you want.

>Having to wrap your static utilities in a class does seem a bit pointless

Meh, it's not a hill I'd want to die on, just a single example that I could have ignored if it was isolated.

>generate a new type for each compiled jar file, and including jar files would then have the same type defined multiple times

This is a bit confusing, I will rephrase it so I can be sure what I'm replying to, you can correct me if I misunderstood you.

You're highlighting the problem that if library B and library C both had a common ancestor library A, and they both use the generic type Ta<int>, then the compiler (or possibly two different compilers) generates Ta_int_Autogen two times for each of them because they are totally isolated. But if a library D includes both of them, then that would mean including two versions and that would cause bad performance and possibly other funny things.

That's a C++ template problem. It can be completely avoided in Java's object model because there is (currently) a finite (and very small) number of types that you need to autogenerate. The compiler can simply treat "class generic<A> {...}" as syntax sugar for the following declarations: "class generic_for_object {...} class generic_for_int {...} [... and so on for the rest of the primitive types]". This is done at declaration time, i.e. in library A itself. All downstream code would then use this one version and this one version only, there is nothing else.

The reason C++ can't do this is that the semantics of C++ generics is literally and explicitly "Yes, this is just copy-paste for every type you instantiate the generic on". The compiler has no other choice but to instantiate on a use-by-use basis. If it started eagerly auto-generating at declaration time, it would never run out of types to auto-generate for. (It can't even know all the types the generic could be instantiated on, nor a common ancestor for all of them.)

>As far as I understand, being able to trivially fall back to non-generic types was also a reason for the weirdness

I don't actually understand what that means. What's the fallback and who's doing it? When and why?

>But it still manages to be good enough for people

Anything can be good enough if you're constrained enough, what I found surprising is the "liking" part. For me, liking a language means that, in absence of all other constraints and externalities (back-compat, hype and who-else-is-using-it syndrome, perf considerations,...), you would prefer that language over any other. I find it alien that somebody can think that of Java, the language is completely out-of-sync with every frequency in my brain, it's like trying to sing while a rabid dog is barking in your ear.


> Imagine that every time I write a sentence I prepend "Banana699 says" or "Banana699 thinks"

It's more like starting paragraphs with ">" here. It'll also consistently be at the start of single lines, one above each other, so there's a clear guide on what to skip reading if you don't want to. With the benefit that if you do want to read it, you don't have to scan for context. Java likes putting context locally.

> but the mere presence of irrelevant info steals your attention

Except it doesn't. I literally just don't look at it if I don't care. And, unlike your username being repeated, it's actual information, a different value for each line.

> hovering over the identifier

Sure, that's also a way to do it. I have my java editor show the full type after each "var" instance, but even that's messy because it makes the editor more unpredictable (you can't move the cursor mid-type-description, they interfere with alignment, etc)

> Not all info has to be stored in the raw text

But there's still (unfortunately) a lot of value in having it. GitHub, or 'cat'ting in terminal won't provide you with the descriptions, and the cases when I don't have the thing open in an IDE will also be the cases I'm the newest to the project, and as such need as much help reading things as possible!

> Java is not bad enough to require you to use "new foo.bar.baz.class" …. Type inference simply applies a step further

There may be 10 or so classes imported in a file. There may be hundreds of variables in a file. foo.bar.baz.class would be duplicated dozens of times, and so is worth remembering. The type for a variable is duplicated.. twice, max? And, outside of "Type name = new Type();", it's useful duplication too, in most cases.

> because there is (currently) a finite (and very small) number of types that you need to autogenerate

Not quite "currently" as of Primitive Objects[0], which are available as a preview feature. And adding generics of primitive types is also a thing being worked on[1].

> I don't actually understand what that means. What's the fallback and who's doing it? when and why

An existing, pre-generics library has "void setProperties(HashMap map);". You want to pass a "HashMap<String,String>" to it. So you do, and it works.
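A minimal sketch of that fallback. The setProperties below is a hypothetical stand-in for the old library method; it only returns the map's size so there is something to observe:

```java
import java.util.HashMap;

final class RawFallbackDemo {
    // Stand-in for a pre-generics library method that takes a raw HashMap (hypothetical).
    @SuppressWarnings("rawtypes")
    static int setProperties(HashMap map) {
        return map.size();
    }

    public static void main(String[] args) {
        HashMap<String, String> props = new HashMap<>();
        props.put("user", "banana");
        // A parameterized HashMap is accepted where the raw type is expected; no cast is needed.
        System.out.println(setProperties(props)); // 1
    }
}
```

This works because, after erasure, HashMap<String,String> and the raw HashMap are the same class at run time.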

> For me, liking a language means that, in absence of all other constraints and externalities (back-compat, hype and who-else-is-using-it syndrome, perf considerations,...), you would prefer that language over any other.

People can like multiple languages. Especially different languages for doing different things. Doesn't even have to be that constrained for some to arrive at Java. As-is, Java has wonderful IDEs, nice performance, trivial memory management, acceptable syntax, is portable, quite simple (both in terms of language complexity and usage, even if at the cost of verbosity), is OOP enough (if you're into that kind of thing), and will have libraries when you need them. Which is plenty for at least the things I do.

[0]: http://openjdk.java.net/jeps/401

[1]: http://openjdk.java.net/jeps/218


>It's more like starting paragraphs with ">" here.

It's a full word, not a single letter operator. Its uses add up to noise pretty fast.

>Java likes putting context locally

How 'non-local' would a bunch of declarations become if you factor out a common modifier and hoist it just a few lines above them? Is your screen real estate situation really so desperate that you can't afford a single new line?

>Except it doesn't. I literally just don't look at it if I don't care.

I have a feeling we won't reach a productive understanding about this, you don't seem to realize that I'm not talking about conscious control. Your brain has an amazing ability to filter out conversations, you can talk to your friend in the middle of a 10000-person company and still be heard loud and clear. Even with this capability, do you like your conversations to always be in the middle of 10000-person groups or in quiet cafés?

>unlike your username being repeated, it's actual information, a different value for each line

By this argument why write functions then or generics or really any abstraction at all? Every function invocation or generic instantiation is unique and contains novel information, so why not write the whole thing each time for extra readability?

If there is no value in factoring out the repeated boilerplate that doesn't change much why aren't we all itching software ALU-control-signal by ALU-control-signal?

>Not quite "currently" as of Primitive Objects

I'm indeed aware of that, that's why I said 'currently', meaning the current stable release. This won't break my scheme either, because those are full-blown objects that are part of the usual class hierarchy and could be manipulated as such, so they are covered by the generic_for_object version. There is another separate proposal to make the boxing of primitives use primitive objects so as to make it as performant as naked primitives and fix the 25-year-old stupid rookie mistake in the language design. This still wouldn't make HashMap<float, bool> legal, it would just make the boxed version more performant. Basically, you would still be left with a small ugly hole in the language where special cases are introduced just for fun. Because severing the type system into primitives and classes at the language level was never a good idea.

>An existing, pre-generics library has "void setProperties(HashMap map);". You want to pass a "HashMap<String,String>" to it. So you do, and it works

The autogen solution isn't contradicting the boxing approach, if you absolutely need to use this function, just declare your HashMap to be of <Integer, Integer> instead of <int, int>. The advantage is that you don't need to most of the time, the generated class would still be there for you when you need the performance and have modern libraries to match.

>People can like multiple languages. Especially different languages for doing different things.

Sure thing, but Java is out-designed by nearly every serious language made after 2000, and most languages made after 1980. Even its niche is overflowing with superior competitors, so I don't understand why you would like it when everything good minus everything bad in it is offered by languages that fill its exact niche.

>nice performance, trivial memory management [...], is portable,

That's the praises of the JVM being sung here, not java the language.

>simple both in terms of language complexity and usage

Java is anything but simple, you're literally commenting on a thread discussing how the designers broke the type system by accident. One designer is on record saying that adding constraints to generics (<T implements some_interface>) made contributing to Java compilers require too many PhDs. In terms of usage, Java frameworks are some of the most bloated Lovecraftian horrors I have ever seen in my life, countless meaningless configuration here and there with no obvious rhyme or reason why this goes there. That's obviously just a single experience, but I have seen it echoed too many times to not be convinced it captures some underlying truth.

>is OOP enough

Aped it without understanding from smalltalk (through objective C) and couldn't even do it right.

>will have libraries when you need them.

Again, a feature of the JVM. A compiled .class knows nothing about Java or any other language it happened to come from.


> It's a full word, not a single letter operator. Its uses add up to noise pretty fast.

But it's still 1) a different color; 2) always in the same place (along with static/final). For me to call something "noise", it must be unpredictable, or otherwise hard to avoid. A thing consistently at the start of each line and in a different color is very easy to avoid.

> is your screen real estate situation really that desperate you can't afford a single new line?

It's not screen real estate, it's how much effort you have to put in to find something. In Java, to find whether a thing is public, I have to just look left a bit. In C++, I have to scan multiple lines upwards for a 'public:' or 'private:'. (but vertical screen real estate is indeed a ton more expensive than horizontal)

> Your brain have an amazing ability to filter out conversations

My whole point is that the filtering is extremely trivial. Not as trivial as ignoring whitespace at the starts of lines, sure, but still very much not a problem, at all.

> By this argument why write functions then or generics or really any abstraction at all? Every function invocation or generic instantiation is unique and contains novel information, so why not write the whole thing each time for extra readability?

That information is both very trivially derived, and not that much more useful than the parts. A single type annotation, however, can save a long time on trying to track down what you're even working with. Arbitrary expressions are much more complex, requiring you to either use the IDE to assist in understanding (and I can surely scan plain text much faster than doing any interaction with an IDE), or have read & understood much of the preceding/surrounding code to be able to infer the types of things.

> Because severing the type system into primitives and classes at the language level was never a good idea.

Actually, I prefer them being separated. But that's the performance-wanting part of me, not the language-design one. If done well (which java of course hasn't), it allows for a bunch of trivially achievable performance improvements, and shouldn't complicate usage. (C# has that with class vs struct, but I don't know enough to comment on how easy they are to use)

> That's the praises of the JVM being sung here, not java the language.

Not every language can performantly map to the JVM though. And, given that the JVM was made for Java, I think it's very fair to praise java for good things about it. End result is the same.

> <T implements some_interface>

ok yeah those are rough, with all the different variations that exist. But luckily they're used very infrequently, and are nevertheless not that hard to understand in practice.

> Java frameworks are some of the most bloated lovecraftian horrors I have ever seen in my life

Even with those around (and oh boy are they), there are plenty of actually good frameworks/libraries too.

> Again, a feature of the JVM. A compiled .class knows nothing about Java or any other language it happened to come from.

I usually prefer source code as documentation over actual documentation, so the library source being in the same language that I'm writing is very valuable to me.


Nope, but some papers are required to state the obvious I guess.


I mean. I don't believe Java or Scala claim type correctness or whatever the term is, compared to more formal languages.


I'd classify this one under: So what? If you beat your tools hard enough, something will give. The simple solution is to stop abusing your tools.

Generics were bolted on in Java 5, and some painful compromises had to be made to have both backward compatibility and generics. The whole type erasure thing being the prime example.

There is some kind of heap corruption in that you can force a Duck in a list of Cars. But the very first time you try to use that Duck as a Car, the JVM will catch it and hit you with an exception. You won't get C-like undefined behavior this way.

Java had some painful bugs with real-world consequences. This is not one of them.

Now if you want to really abuse Java, try this one :

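  // needs: import java.lang.reflect.Field; import java.util.*;
  // (run inside a method that declares "throws Exception")
  Field field = Integer.class.getDeclaredField("value");
  field.setAccessible(true);
  field.set(1, 666); // overwrites the value stored in the cached Integer for 1

  List<Integer> list = Arrays.asList(1, 2, 3);
  System.out.println(list); // prints [666, 2, 3]

and Java allows half of the number 1s in your program to magically change into a 666. Total mayhem ensues. Solution: Don't do this.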


Type erasure gets blamed for too much. Want to see some languages that make generics sing? Take a look at SML, Haskell, or Rust. What's another interesting characteristic that unites them? They erase all types. Not just generic type parameters like Java does. Everything is type erased. And they do just fine.

I would argue that the real problem with Java's rollout of generics is at the language level. The language itself introduced some strange gaps in the semantics of generics, such as failing to build any decent bridges between arrays and generic collections. I don't know that this was necessary. Scala and Kotlin are working out OK, and they have unified arrays with generics. And the other problem is that generic types are not exactly first-class types in Java, even before type erasure takes place. This means handling some key use cases for generics requires resorting to the "super type token" pattern, which is awkward and still not all that well known.
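For readers who haven't met the "super type token" pattern, here is a minimal sketch. TypeRef is a made-up helper name for illustration; Jackson's TypeReference and Gson's TypeToken are built on the same idea.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical helper class: captures its own type argument at construction time.
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        // A subclass records its generic superclass, TypeRef<...>, in the class file,
        // so the type argument survives erasure and can be read back reflectively.
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }

    Type type() { return type; }
}

final class SuperTypeTokenDemo {
    static String describeListOfString() {
        // The trailing {} creates an anonymous subclass; without it there is nothing to read.
        return new TypeRef<List<String>>() {}.type().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(describeListOfString()); // java.util.List<java.lang.String>
    }
}
```

The awkward part is exactly that empty pair of braces: you have to declare a new class just to say "List of String" as a value.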

I don't think I'm alone in this sentiment. Coders at Work has an interview with a developer who worked on Java 5, and he also expresses a belief that, in retrospect, the problem with Java generics wasn't type erasure, it's that they rushed the feature and some of the language-level design characteristics weren't fully baked yet. (He didn't go much into specifics, though, so I can't say if our opinions agree in the specifics.)


> Type erasure gets blamed for too much. Want to see some languages that make generics sing? Take a look at SML, Haskell, or Rust. What's another interesting characteristic that unites them? They erase all types. Not just generic type parameters like Java does. Everything is type erased. And they do just fine.

You're both right and wrong.

You're technically correct, which we all know is the best kind of correct, but you're wrong because you're leaving out very important Gestalt context. (EDIT: Actually, you vaguely touch on it, but you focus on arrays and Lists, which I think misses the point)

You're correct because Rust and Haskell, et al, have fully erased types (well... Rust has trait objects, which are implemented with vtables- does that count?), and nobody really complains about it. So, in a very specifically-literal sense, you're right that it's not the type-erasure that is the (sole) problem in Java.

The real problem is that Java is BIG on *runtime reflection*. You can't "reflect" on types that don't exist, so every time you do use reflection (which is approximately 100% of Java projects) and generics (which is also approximately 100% of Java projects), you're in for nonsense. Between generics, reflection, and null, I literally feel my blood pressure rise when I think about dealing with Java, and even Kotlin.

Where Java leans on runtime reflection, Haskell and Rust cannot and do not. And, therefore, the communities have found compile-time (or just tedious write-time) solutions to things like (de)serialization.

So, type erasure isn't the problem, and runtime reflection isn't a problem. But you can't do both and expect anything good to come of it.
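A small sketch of what that combination looks like in practice (class and method names are illustrative only): the object you reflect on has no record of its type argument.

```java
import java.util.ArrayList;
import java.util.List;

final class ErasureReflectionDemo {
    // Both lists share one runtime class; the type argument is not stored in the object.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Reflection can report the declared type variable (E), but not what it was bound to.
    static String typeParameterName() {
        return new ArrayList<String>().getClass().getTypeParameters()[0].getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());   // true
        System.out.println(typeParameterName());  // E
    }
}
```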


> I would argue that the real problem with Java's rollout of generics is at the language level. The language itself introduced some strange gaps in the semantics of generics, such as failing to build any decent bridges between arrays and generic collections. I don't know that this was necessary.

It was necessary. Java made the mistake of letting its built-in arrays be (incorrectly) covariant in their element type. Java's implementation of generics is, fortunately, correct in how it handles type parameter variance. However, this means that generics are incompatible with Java's built-in arrays. I don't think there is a way to "bridge" them seamlessly without making generics unsound.
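A minimal sketch of the array side of this (names are illustrative): arrays of reference types are covariant, so the type error is only caught when the store happens.

```java
final class ArrayCovarianceDemo {
    // Returns true if storing an Integer through an Object[] view of a String[] is rejected at run time.
    static boolean storeFails() {
        Object[] objects = new String[1]; // allowed: String[] is treated as a subtype of Object[]
        try {
            objects[0] = Integer.valueOf(42); // compiles, but the array checks its element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails()); // true
        // The generic equivalent is rejected by the compiler instead:
        // List<Object> objects = new ArrayList<String>(); // does not compile
    }
}
```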


I was under the impression that Rust uses monomorphization, not type erasure in its implementation of generics? (Although it does have a form of type-erasure in &dyn Trait objects)


It monomorphizes generic functions and methods. Type erasure happens to the actual values.

So, for example, at run time an Option<i32> ends up just storing one word for indicating whether it's Some or None, and another for the value when it's Some, but doesn't store any information to indicate that those 8 bytes represent an Option<T> of any kind, let alone exactly what kind of value is being stored in it. None of that is needed at run time, because the compiler was able to statically verify that no function is going to try to read them as a u64 instead.


I have never heard such a definition of type erasure, generally it means that the concrete type is hidden behind a generic "OO" interface which is itself hidden behind a value type providing the same interface / API


The difference is subtypes being treated as super types, ie, OOP stuff. Type erasure is different in a language where basically any typed value could also contain a value of any subtype. All those other languages you mention make that functionality opt in, which is why they aren't object oriented languages.


One could imagine, though, an object-oriented language that doesn't retain objects' actual types, just pointers to their virtual function tables. Perhaps, as an optimization, it even skips that in cases where it can prove that all method calls can be statically dispatched.


Exactly. Every programming language out there has some cases where you can break the established rules by doing weird stuff. When this sort of thing happens, there's two things to look at:

1) Is this likely to happen by accident?

2) Does this enable some functionality not otherwise available (even if it's still a bad idea)

Looking at the paper, it seems like neither of these is the case for Java/Scala types. Given that, I would categorize this as a weird quirk of Java and Scala, and leave it at that.


I don't think this has to be true. We should be able to build perfectly logical and deterministic type systems, and my understanding is that Java was unambiguous before generics were added.


I'd guess ARandumGuy has the soul of an engineer. Confronted with an imperfection, the question is how bad it is in a world full of even bigger imperfections. Is it worth the time, effort and resources to fix it?

Gunax has the soul of a mathematician. Confronted with an imperfection, the question is how to eradicate this blight from the world and get us one step closer to perfection, no matter the cost.

Both are equally correct reactions, of course. Just nice to see them in the same place.


> Now if you want to really abuse Java, try this one :

This is no longer possible out of the box after "JEP 403: Strongly Encapsulate JDK Internals" [1].

[1] https://openjdk.java.net/jeps/403


That will teach me to test on Java 8. Oh well. The whole point was it was a bad idea.

UPDATE: I see there is a flag --illegal-access=permit so if you want to abuse Java, you can still go ahead ;-)


Support for --illegal-access=permit was removed with Java 17.

But you can use "--add-opens=java.base/java.lang=ALL-UNNAMED" instead if you really want.

If you really want to shoot yourself in the foot. You can always go native...


OpenJDK 64-Bit Server VM warning: Ignoring option --illegal-access=permit; support was removed in 17.0


Damn you guys/girls are fast.


As of Java 17, no longer. This option was intended as a migration path for badly behaving code, and as of today it no longer works.


In Smalltalk you could write "Smalltalk become: nil" which would replace all references from Smalltalk (global variable holding the whole Smalltalk) to nil. VM would crash instantly... But who cared? Developer had power to do all this...


I did that once with `True := False` or something equally stupid. Of course it worked and the machine froze instantly.


The tinkering with the Integer cache to make 2 + 2 = 5 : https://codegolf.stackexchange.com/a/28818

The trick is to get the int literal boxed so that it hits the cache - which requires a printf (rather than println).

And yea, don't do this.
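
For anyone wondering why the boxing step matters, here is a harmless sketch with no reflection in it. It only shows that boxed small values come from one shared cache, which is the thing the linked answer tampers with. The class name `CacheDemo` and the helper `box` are made up for illustration.

```java
class CacheDemo {
    // Boxes an int through Integer.valueOf, which is documented to cache -128..127.
    static Integer box(int i) {
        return Integer.valueOf(i);
    }

    public static void main(String[] args) {
        // Two boxings of the same small value yield the very same object,
        // so corrupting that one object would affect every boxed use of 4.
        System.out.println(box(4) == box(4));

        // printf takes Object..., so the int result is boxed before formatting.
        System.out.printf("%d%n", 2 + 2);

        // println has an int overload, so no Integer object is involved here.
        System.out.println(2 + 2);
    }
}
```

On an untampered JVM both print calls show the same number; the difference is only in whether an Integer object from the cache is involved.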


There was an old-time "hacker test", one of the questions of which was "Ever change the value of 4? In a language other than FORTRAN?"

So I guess the answer here would be "yes" to both parts. Well done!


As an occasional dabbler in Forth, I can answer in the affirmative.

Well, I'm not sure I redefined 4 specifically, but I tried out redefining a few integer literals.


The Java module system doesn't allow this trick to work anymore.

java.lang.reflect.InaccessibleObjectException: Unable to make field private final int java.lang.Integer.value accessible: module java.base does not "opens java.lang" to unnamed module
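
A minimal probe for checking what your own JDK does, assuming nothing beyond the standard reflection API. The class name `ReflectProbe` is hypothetical. It only tries to open the field and never writes to it. The outcome depends on the JDK version and on flags such as --add-opens, so no expected output is given.

```java
import java.lang.reflect.Field;

class ReflectProbe {
    // Returns true if reflection is allowed to open Integer.value on this JVM.
    static boolean canOpenIntegerValue() {
        try {
            Field value = Integer.class.getDeclaredField("value");
            // On strongly encapsulated JDKs this call is the one that throws.
            value.setAccessible(true);
            return true;
        } catch (NoSuchFieldException e) {
            // The field layout differs on this JDK.
            return false;
        } catch (RuntimeException e) {
            // InaccessibleObjectException (Java 9+) and SecurityException
            // are both RuntimeExceptions.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Integer.value openable: " + canOpenIntegerValue());
    }
}
```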


This trick does not work with Java 11+; you cannot peek and poke at the JDK internals anymore.

It's the reason why the transition from Java 8 to 11 has been painful.


Um, why does this only change half the number 1s in your program?


That use (I haven't checked it; see my other post in this thread, which I know works: http://ideone.com/o1h0hR ) involves the boxing of an int literal to an Integer.

In making it into a List<Integer>, it goes through boxing.

https://docs.oracle.com/javase/tutorial/java/data/autoboxing...

The "half" means that this doesn't impact things like "int foo = 1 + 1" which will be 2 rather than 1,332. It's not because it's half of the uses of 1, but rather that some portion of the code will be converting int to Integer and that will impact those uses.
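
A small sketch of which expressions stay primitive and which go through Integer, using overload resolution as the detector. The names `BoxingDemo` and `kind` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class BoxingDemo {
    // Chosen when the argument is a primitive int.
    static String kind(int x) {
        return "int";
    }

    // Chosen when the argument is already an object, e.g. an Integer.
    static String kind(Object x) {
        return "boxed " + x.getClass().getSimpleName();
    }

    static String arithmetic() {
        return kind(1 + 1); // pure int arithmetic, no Integer objects
    }

    static String fromList() {
        List<Integer> list = new ArrayList<>();
        list.add(1); // autoboxing happens here
        return kind(list.get(0)); // the element is an Integer
    }

    public static void main(String[] args) {
        System.out.println(arithmetic());
        System.out.println(fromList());
    }
}
```

Only the second path ever touches an Integer object, so only that path could be affected by a corrupted cache.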


My understanding is that Scala 3 introduced a sound type system as a key feature.


This is also my understanding. Based on Dependent-Object Types calculus.

Scala 3 is one of the best-designed languages I've ever used.

I came into learning it without Scala 2 experience, and not a big fan of hardcore FP (a-la Haskell), and it's intuitive to write as you would Python or Typescript if you choose to.


Any tips, tricks, or resources for learning Scala 3? I have no Scala 2 experience either.


Yeah, definitely!

First/foremost, the community essentially held my hand, especially during the early weeks. I would ask questions on their Discord and always someone would answer them and explain concepts if needed:

https://discord.gg/scala

They have a great community. The experienced users tend to be heavily FP-oriented but they're respectful of the "I'm not really the Haskell type." sentiment.

I went through the Scala 3 "concepts" book on the website, and then started to build things + ask questions about best practices or how to do stuff on Discord:

https://docs.scala-lang.org/scala3/book/introduction.html

They have also updated the free online courses "Effective Scala" and the Scala specialization to Scala 3:

https://www.coursera.org/learn/effective-scala

https://www.coursera.org/specializations/scala


Thanks I'll take a look


It's the reason why it took over a decade [1]. It's also why some Scala 2 features were removed: they couldn't be proven sound. Haven't seen anything regarding generics there though.

[1] https://www.scala-lang.org/blog/2016/02/03/essence-of-scala....


The article says: "polymorphism was not integrated into the Java Virtual Machine (JVM), so these examples do not demonstrate any unsoundness of the JVM."


Btw this is the same problem with any kind of proof system that depends on terms and has exceptions / divergence / null (i.e. anything that can generate a term of any type).

Same problem would appear in Haskell, where you can define a term of any type (using a divergent function); the actual value is never executed (because of laziness), so it would crash at runtime. In OCaml, the type system would accept this, but it wouldn't crash (because the "any type" term cannot actually be created, but would result in an exception and/or divergence).
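
To make "a term of any type" concrete in Java terms, here is a rough sketch. The class and method names are hypothetical. Both methods type-check for every T even though neither ever produces a real value of type T.

```java
class AnyType {
    // Type-checks for every T because null is assignable to every reference type.
    static <T> T viaNull() {
        return null;
    }

    // Also type-checks for every T: the method never returns normally.
    static <T> T viaException() {
        throw new IllegalStateException("no value of this type exists");
    }

    public static void main(String[] args) {
        String s = viaNull();   // accepted as a String
        Integer i = viaNull();  // accepted as an Integer
        System.out.println(s + " " + i);

        try {
            String t = viaException(); // accepted by the compiler, fails when run
            System.out.println(t);
        } catch (IllegalStateException e) {
            System.out.println("diverged: " + e.getMessage());
        }
    }
}
```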

The correct solution is for the type system to track "computability" and require that all terms used in proofs are actually well-defined. Liquid Haskell in particular had this problem that they had to solve by enforcing strictness for such terms.

(Also, technically generic functions in Java/Scala aren't "parametrically polymorphic" - which means that you cannot do any operation on a value of polymorphic type - not even `typeof`).
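
A sketch of what that last point means in practice (names are made up for illustration): the signature below says nothing about T, yet the body can still branch on the runtime type of its argument, which a truly parametric function could not do.

```java
class NotParametric {
    // Generic in T, but the result depends on what T turns out to be at runtime.
    static <T> String inspect(T x) {
        if (x instanceof String) {
            return "String";
        }
        if (x instanceof Integer) {
            return "Integer";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(inspect("hello"));
        System.out.println(inspect(42));
        System.out.println(inspect(2.0));
    }
}
```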


Modern Java, however, will straight up tell you where the trick lies and refuse to budge:

"Inferred type 'B' for type parameter 'B' is not within its bound; should extend 'U'"


Does this also apply to Scala 3.0?


No. Its type system is sound.

(Assuming the proofs hold and they've actually implemented what the proof is based on.)



