How generics were added to .NET (mattwarren.org)
225 points by matthewwarren on March 5, 2018 | 124 comments



I started looking at the source diffs and it triggered a panic attack. That's a ton of work.

All the more reason to admire and appreciate MS Research and the development of C# generics. Really amazing accomplishment.

And LINQ is one of the greatest inventions in modern-day programming.


LINQ is the reason I became a .NET developer after leaving university.

I was bored of Java at university, and as the project I was basing my dissertation on was a web-based search engine I decided to use C# and ASP.NET. The professor helping me was fine with it, since he didn't know PHP or Ruby, and he felt he could read C#.

I came across some stumbling blocks with some logic, so I remember posting on a forum somewhere asking for help with setting up a data structure. Someone sent me some LINQ, and I thanked them for the pseudocode. When they pointed out that it was real functioning code, I plopped it into Visual Studio and the whole thing worked. From that moment, I fell in love with the language - ASP.NET WebForms less so, but shortly after this MVC became a thing in the .NET world and I jumped on the bandwagon.

A decade later, and while I've moved on to other languages I still fight C#'s corner, and LINQ/the lambda syntax is one of the first points I raise when people bitch about Micro$oft. The only negative about working as a .NET developer is being locked into an ecosystem, so if Microsoft and the .NET Foundation can get .NET Core to be feature compatible with .NET Standard then they've got a winner on their hands.


Many other languages offer similar functional APIs for querying collections; here are C#'s 101 LINQ Examples rewritten in different languages:

   https://github.com/mythz/swift-linq-examples
   https://github.com/mythz/kotlin-linq-examples
   https://github.com/mythz/clojure-linq-examples
   https://github.com/mythz/java-linq-examples
   https://github.com/mythz/dart-linq-examples
   https://github.com/omnibs/elixir-linq-examples
   http://templates.servicestack.net/linq/
Java 8+ has since improved with its Streams API. The primary benefit that sets LINQ apart is that you can easily capture the Expression tree of a LINQ expression and rewrite its intent to apply to other sources, commonly used in LINQ-to-SQL to provide a typed Expression API for generating equivalent SQL to run on an RDBMS.
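To illustrate what capturing the Expression tree means, here's a minimal sketch (illustrative names only): the same lambda can either be compiled to a delegate or captured as data that a provider can inspect and translate.

    using System;
    using System.Linq.Expressions;

    class ExpressionCaptureSketch
    {
        static void Main()
        {
            // Compiled to IL: you can only invoke it, not inspect it.
            Func<int, bool> asDelegate = x => x > 5;

            // Captured as an expression tree: the compiler builds a data
            // structure describing the lambda instead of emitting IL for it.
            Expression<Func<int, bool>> asTree = x => x > 5;

            // A LINQ provider walks this tree and can rewrite it, e.g. into SQL.
            var body = (BinaryExpression)asTree.Body;
            Console.WriteLine(body.NodeType);   // GreaterThan
            Console.WriteLine(body.Left);       // x
            Console.WriteLine(body.Right);      // 5

            // It can still be compiled and executed locally when needed.
            Console.WriteLine(asTree.Compile()(10));   // True
            Console.WriteLine(asDelegate(3));          // False
        }
    }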


Not OP, but I think these are the equivalent of LINQ to Objects. LINQ also has the capability to treat lambdas as expression objects, so instead of executing them, you may further interpret them and turn them into something else, like SQL statements.


I covered capturing LINQ expressions in my last sentence.


I am a .NET developer; I thought I would end up a Java dev or whatever else web backend work brought. But the years are showing .NET is doing well. Microsoft has skin in the game with the .NET ecosystem, and it is still "developers, developers, developers". Other ecosystems don't have a huge company living off them. Oracle does not have the same skin in the game with Java; they will just take over the next big thing and milk it.


Oracle was one of the first companies to adopt Java.

They used to do presentations together with Sun, had the crazy Network Computer idea together with those JavaStations at Sun, ported their installers to Java, were the first RDBMS to allow stored procedures to be written in Java, and had their own JVM, IDE and JEE server long before Sun started to get into trouble.

They have kept the Java team together, saved Sun Labs, turned the Maxine research work into Graal, and are making the effort of supporting AOT compilation (taboo at Sun), ...

I am a polyglot developer, and while Oracle could be doing a better job, they are surely doing much better than Sun was doing in its later days.

Who knows, if Google hadn't ripped off Sun and they were getting some Android revenue, maybe Sun would still be around.


> Who knows, if Google hadn't ripped off Sun and they were getting some Android revenue, maybe Sun would still be around.

Or maybe if Sun had shown some promise with J2ME and tried to challenge iOS, it would have been the main iOS competitor instead of Android. But J2ME did not deliver much, and the idea of grabbing some revenue from bullshit patents, which Oracle tried, failed.

In the copyright case Oracle may still get some money, but after belatedly joining the CNCF, Oracle would not like to be on the warpath with Google.


J2ME was doing perfectly fine in Nokia and Sony Ericsson devices, and we already had SavaJe OS when Android happened.

Android fragmentation, OEM customizations, lack of enforced updates and cheapo unresponsive 100 € devices are no better than the devices that gave a bad name to J2ME.


> so if Microsoft and the .NET Foundation can get .NET Core to be feature compatible with .NET Standard

What? It literally already is, all the time. If anything, it's the full .NET Framework that often lags behind the standard.


Regarding LINQ: I really like certain parts of it and dislike others (mainly inline Select statements and the Entity Framework built on top of it), but I'd be hard-pressed to call it the greatest invention of any time. PEP 255 laid the groundwork for similar language features in 2001, and I doubt that the Python guys "invented" all of that out of thin air (cf. Reference #4 in the PEP).

What you can argue is that LINQ made this stuff available to more developers (with useful syntax and in a statically typed language, which also must have taken a lot of work), but filter/map/reduce have been around for quite a while.

That obviously doesn't make it less useful or the underlying tech less impressive though.


The magic behind LINQ is that it's a query layer on top of a backing data store. When it first came out, you'd see data providers written against Twitter, the file system and everything like that. Then the Entity Framework and NHibernate LINQ providers came out. It was great: it hid a lot of details from the user and brought static type checking to otherwise very hard to "verify" SQL. You'd now think in terms of objects, and not tables.

The LINQ compiler compiled the whole query into an AST, which you could then convert into a query your backing data store could understand.
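To make that concrete, here's a toy sketch of the idea (nothing like the real NHibernate/EF providers, which handle vastly more cases; all names are made up): walk the captured AST and emit an equivalent SQL fragment.

    using System;
    using System.Linq.Expressions;

    // Toy "provider": translate a predicate's expression tree into a SQL-ish
    // WHERE clause. Real providers handle method calls, parameters, nulls, etc.
    static class TinySqlTranslator
    {
        public static string Translate<T>(Expression<Func<T, bool>> predicate)
            => Visit(predicate.Body);

        static string Visit(Expression e) => e switch
        {
            BinaryExpression b when b.NodeType == ExpressionType.AndAlso
                => $"({Visit(b.Left)} AND {Visit(b.Right)})",
            BinaryExpression b when b.NodeType == ExpressionType.GreaterThan
                => $"({Visit(b.Left)} > {Visit(b.Right)})",
            BinaryExpression b when b.NodeType == ExpressionType.Equal
                => $"({Visit(b.Left)} = {Visit(b.Right)})",
            MemberExpression m => m.Member.Name,                     // treated as a column
            ConstantExpression c => c.Value is string s ? $"'{s}'" : $"{c.Value}",
            _ => throw new NotSupportedException(e.NodeType.ToString())
        };
    }

    class Demo
    {
        class User { public string Name; public int Age; }

        static void Main()
        {
            // Prints: ((Age > 18) AND (Name = 'Ada'))
            Console.WriteLine(
                TinySqlTranslator.Translate<User>(u => u.Age > 18 && u.Name == "Ada"));
        }
    }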

To date, I still think LINQ was great. I was indifferent to the "SQL-ized" LINQ, but I liked the lambda syntax quite a lot. It was a nice abstraction and a pretty good language to write queries in.

Disclaimer: I worked on the NHibernate LINQ integration ~10 years ago.


What is your opinion on the state of NHibernate?

I started using it in 2009. I am still using it in my projects, but I am not sure if NHibernate is feature complete or if the project is nearly dead.

It works fine for me (if it is not worth the effort, I use Dapper.net), but it doesn't seem that there is much active development going on.


On the way out (daily user here).

I started a project about 4 years ago; it was EF vs NH. Went with NH because EF was still pretty immature.

The momentum is pretty clearly toward EF, IMO. NH is missing a lot, including proper migration tooling and deep async support. But I don't think that's the real issue. The more foundational problem, in my view, is that NH deeply assumes blocking/synchronous database access. I have no idea how NH lazy loading will ever work with async. I imagine you'll have a bunch of FK refs all marked Task<T> that will have to be awaited? In any case, my bet is that EF will get there a lot more quickly than NH.


To my surprise, it seems fairly active: https://github.com/nhibernate/nhibernate-core

I stopped working on NHibernate around 2010-2011, when I started my masters. I only kept in touch with one friend - I am not a very sociable person. So I guess the short answer is, I am not sure, to be honest.

Back in 2008-ish, I thought it was going to die because Entity Framework provided great support for LINQ, and it had a lot of hype. A bunch of enterprises moved away from NHibernate (we also had a lot of legacy from XML configuration - a bunch of internal data structures were referencing XML), so it was clunky. Fabio and Oren (Ayende) did a bunch of improvements, and I worked a bunch on shortening initialization times and so on. I think Ayende did a very good job with LINQ initially, but there were plenty of edge cases that I remember having to fix :) It was my first open-source project as a core member (the second and last was the Castle Project) - so it has a very special place for me :)


Okay - it really seems to be quite active right now. Thank you for your insight.

I've used the LINQ queries in one of my projects, but I like the QueryOver statements more (a little bit too much black magic in the LINQ queries).

But I believe that it was a great effort - so let me thank you for that!


In my experience you shouldn't even touch the black magic stuff unless you're a small org or it's a throwaway app.

LINQ with EF is great for simple queries and updates; for anything else you're in for serious performance problems as soon as the app scales. Better to drop to normal SQL for complicated data loading.

Even MS can't write decent LINQ queries; their ASP.NET identity provider is now the most 'expensive' bit of our app because they used expensive LINQ queries instead of raw SQL queries. Admittedly our use-case is abnormal, as the specific problem we have is that, because of the way users are added, their password gets reset almost immediately, as they're invited by an organiser. It also means new users are constantly being added. When the password reset is saved it completely unnecessarily "verifies" the update by making sure the username and email are unique, which means two UPPER()s and CONTAINS()s on string fields. Unlike the old provider there's no usernamelowered field already in the db to avoid this.

We can fix it by overriding these queries, but it's annoying that the ASP.NET team took a core framework piece that was very performant, with fast-performing SQL, and made it substandard. At least I can look at the code now ;).


I've never thought of LINQ as black magic and have scaled some astoundingly complex LINQ queries in a past life. I've even done some things in LINQ optimization that you couldn't do in just "normal" SQL (clever joins of databases on different servers without linked servers; complex client-side caching and statistics work).

I think LINQ gets a lot of flak it doesn't necessarily deserve in complex queries due to people stopping at the black box and assuming black magic. It's a very functional programming paradigm embedded in an otherwise procedural world, and so the skills to debug complex LINQ should be unsurprisingly just a bit different than debugging most else in C#. I don't blame people for stopping at the black box. I just think more people should know that you can do more than stop at the black box.

(Also, it amuses me that your example from ASP.NET identity's changes has nothing to do with raw SQL queries versus LINQ, and everything to do with denormalization versus the query execution engine. It shouldn't be a huge surprise that Microsoft might trust SQL Server to be able to UPPER() and CONTAINS() fast enough, and consider the queries rare enough that it isn't necessary to denormalize that information.)


Check out RavenDB and see how LINQ can be a first-class citizen in a database.


Most of the points you brought up existed even in .NET before LINQ, with DataTables/DataSets.


For me DataTables were magic objects with no type safety at compile time.


After LINQ, I entirely refused to work on DataTables code without refactoring it to something more sane, even if just ugly DBReader junk and its version of casting hell. I was briefly sad that .NET Standard 2.0 had to bring in DataTables and preserve that ugliness for the rest of foreseeable time.


Type-safety is a blessing.


DataTables are pretty much how regular SQL behaves; there is no magic behind them, right?

I would have preferred using DataReader, most often.


Filter/map/reduce are ubiquitous ideas from the functional world, which in turn are ideas drawn from mathematics. Math has pre-existed many of our programming constructs, so in that sense, nothing was really invented as such.

To me, what LINQ offers is a very nice orthogonal DSL that allows programmers to express these thoughts. The underlying constructs are universal, but one could argue that LINQ itself was invented. LINQ was apparently guided by categorical thinking [0], which makes it fairly elegant, but in terms of expressiveness it is probably similar to SQL. It's all set theory underneath anyway. Unlike SQL, however, LINQ forces the SELECT statement to the very end of the query expression, which makes IDE auto-complete work better due to the available context. :)
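For instance (a tiny illustrative snippet, nothing beyond System.Linq): the standard operators are just filter/map/reduce under friendlier names, and in query syntax the projection really does come last.

    using System;
    using System.Linq;

    class FilterMapReduce
    {
        static void Main()
        {
            var nums = new[] { 1, 2, 3, 4, 5, 6 };

            // filter / map / reduce, spelled Where / Select / Aggregate
            var sumOfEvenSquares = nums
                .Where(n => n % 2 == 0)                 // filter
                .Select(n => n * n)                     // map
                .Aggregate(0, (acc, n) => acc + n);     // reduce

            // The same filter/map in query syntax. Unlike SQL, `select` comes
            // last, so the IDE already knows the range variable's type when
            // you write the projection.
            var evenSquares = from n in nums
                              where n % 2 == 0
                              select n * n;

            Console.WriteLine(sumOfEvenSquares);                // 56
            Console.WriteLine(string.Join(",", evenSquares));   // 4,16,36
        }
    }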

I first had the realization that most data manipulation operations on list-like or table-like objects were actually just glorified set operations (and there was a certain universality to them regardless of query language syntax) when I read this quote [1]: "The turning point was a conversation with Erik Meijer. He told me how categorical thinking guided him in designing LINQ, and explained how category theory brings out a simple relationship between various kinds of databases. The diagram below comes from a paper he wrote on this. It turns out that so-called No-SQL databases are in a categorical sense co-SQL databases."

[0] https://queue.acm.org/detail.cfm?id=2024658

[1] https://www.johndcook.com/blog/applied-category-theory/


I always saw Linq as the iPhone of functional programming: it didn't really invent anything, but put existing things into such an appealing package for mass-market that it really finally made all those things popular and usable by people (developers) at large.


This is a broader theme in how technology develops.

When I look at technologies like Bitcoin, it's just proof of work/puzzles (a well-understood defense to Sybil attacks) + basic crypto primitives like signatures + some of the p2p stuff from BitTorrent, etc.

There's real genius in combining the pieces, though. I also spent a few years in serious CS academia, so it's not clear to me how widely these things (especially puzzles) were known beyond universities.


Yes, I think you sum it up pretty nicely.


It seems you're just thinking of the IEnumerable API and not LINQ at large, which includes the Expression and IQueryable APIs and enables things like LINQ-to-SQL.


...which I'm also not a big fan of after having seen their implications in older products (i.e. ones that had to be maintained for a while). In most cases, they made the problem worse.

Plus, while they are enabled by other techniques, those weren't invented in LINQ, either.


>PEP 255 laid the ground work for similar language features in 2001, and I doubt that the python guys "invented" all of that out of thin air (cf. Reference #4 in the PEP).

LINQ is not about generators...


The IEnumerable parts are mainly enabled by coroutines, which for this case more or less equate to generators. The other parts are pretty much what I've already seen so many problems with that I wouldn't touch them with a 10-foot pole (see the "black magic" comment by others, which sums it up nicely for me).
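For reference, the generator part looks like this in C# (a trivial sketch):

    using System;
    using System.Collections.Generic;

    class YieldSketch
    {
        // The compiler rewrites this method into a lazily-evaluated state
        // machine (a generator), which is what the IEnumerable side of LINQ
        // is built on.
        static IEnumerable<int> Fibonacci()
        {
            int a = 0, b = 1;
            while (true)
            {
                yield return a;
                (a, b) = (b, a + b);
            }
        }

        static void Main()
        {
            foreach (var n in Fibonacci())
            {
                if (n > 50) break;
                Console.Write(n + " ");   // 0 1 1 2 3 5 8 13 21 34
            }
        }
    }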

But those are also more or less just about building ASTs and transforming them, something Lisp has done for decades.


> I started looking at the source diffs and it triggers a panic attack. That's a ton of work.

Don't forget it took several people a few years to do all that! But yeah, I agree, it was an impressive achievement!


This surprised me:

> It was only through the total dedication of Microsoft Research, Cambridge during 1998-2004, to doing a complete, high quality implementation in both the CLR (including NGEN, debugging, JIT, AppDomains, concurrent loading and many other aspects), and the C# compiler, that the project proceeded.

I didn't realize that it was less a "must-have" and more of a "research and if-possible" task. I wonder how the .NET framework/languages would have changed if they had gone with type erasure. Would we even still be talking about .NET today?


We're still talking about Java, which does type erasure. Oracle has really been phoning it in for many years, and despite that the platform is still going strong. Killing large, entrenched platforms is not easy.

Windows remains the 800 lb gorilla on the desktop, and C# is Microsoft's recommended environment on it. It's hardly surprising that .NET remains relevant.


I've been working with .NET since it was in beta in 2001, and I've never considered .NET's strength to be its desktop story. .NET (and now .NET Core) are awesome because of C# and the BCL's generally excellent design. .NET is an excellent choice for building server-side applications of all kinds. .NET Core removed that last complaint I had about .NET: it wasn't cross-platform.

It's a great time to be a .NET developer.


.Net is also well designed. Some of the most subtle design choices drastically exaggerate poor choices in other languages - one example that has always stuck with me is how well C# namespaces are designed. Not only because they are frictionless, but also because of the lack of global namespace types in the standard library. It's a tiny detail, but .Net is just that: hundreds of well-executed small details.


I think the only way you get this is having someone very, very experienced leading the project. MS had Anders Hejlsberg, who hasn't done much in life, except leading the Turbo Pascal and Delphi efforts, two of the most successful rapid app design languages of the 90s (note: heavy sarcasm in last sentence)

I'm not one bit surprised that this is the second or third language from the guy leading the C# effort.


I disagree with you about namespaces. I find Java's packages more useful since they give you access control, which increases encapsulation. If I want a factory method guarding the instantiation of a class in Java, I can make the factory public and the class package-private. In C# the only answer for this type of encapsulation is assemblies, but they're much heavier weight and it is recommended not to have too many of them in a project.


You can make the constructors on the class private and have a public static method on the class that does the instantiation.
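Something like this (a trivial sketch, made-up names):

    using System;

    public class Widget
    {
        // Only the factory method below can construct a Widget.
        private Widget(string name) { Name = name; }

        public string Name { get; }

        public static Widget Create(string name) => new Widget(name);
    }

    class Program
    {
        static void Main()
        {
            var w = Widget.Create("gizmo");
            Console.WriteLine(w.Name);
            // var bad = new Widget("nope");  // won't compile: constructor is private
        }
    }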


For the simple case yes. But if you want several classes cooperating, maybe helpers or related data classes, your options are assemblies or like you said to shove everything into one giant class. You have less flexibility of separating things out.


On large platforms, I am going to piss off people, but I'll do it anyway: it's not because a language is widely used that it is a well-designed language. JavaScript, VBA and PHP immediately come to my mind. There are many other reasons for a language to be a heavyweight other than its own merit.


This reminds me of Benjamin Graham's comment about the stock market: "a voting machine in the short term, but a weighing machine in the long term".

In the short-term, languages with good developer outreach and other factors win. But you don't get things like generics or the .NET TPL without some serious long-term vision. I really do believe well-designed languages win over a long enough timescale.


There are many metrics to judge a programming language by, and even different "axes". Going from theoretical correctness, consistency, type system soundness all the way to ease of development, "batteries included", empowerment, developer productivity (on small and very large programs, as that is not the same), and more recently "forcing" good practices.

Even popularity is a value, as it automatically provides a community, but it's also a pitfall, as massive popularity inevitably means the average programmer of language X becomes as smart, careful and reliable as the average programmer irrespective of language. And, to put it mildly, that's not a positive evolution. That, above all others, was VB and Delphi's downfall.

On three axes, I would argue VB, Javascript and Delphi did/do incredibly well: ease of development (of small-ish programs), and the deployment story, as well as the empowerment they provided.

On things like consistency, developer productivity, batteries included (Delphi was better on the batteries included front), type system soundness, ... they were somewhat sub-par.

It's just what people value at the time. And of course, it is critical during a career in development to distinguish yourself from the average programmer.


Yeah, but I wonder if .NET wouldn't have succeeded until .NET Core without generics as a distinguishing feature. I feel like generics + some of the features built on them (LINQ+TPL+etc.) is what let .NET/C# stand apart from Java.

I'm not necessarily saying it would have failed, but I do wonder :)


C# succeeded based on the strength of Visual Studio.

Sure, the language being nicely designed and having some code translation tools helped people switch from VB.

But C# was the "best supported" language in Visual Studio. The productivity gains for developers were too real to ignore; managers saw it, and bought into the Visual Studio ecosystem en masse.


In turn, Visual Studio has great support for C# because of C#'s great type system.


It's already at the beginning of the article, but it should be more visible how MS Research and Don Syme saw the opportunity to add generics (needed for F#) and added them, with a design that already supported later additions like variance.

Not just the design, but the real implementation too, in an already complex existing CLR codebase. And that includes a lot of areas like NGEN, AppDomains, etc., so it is no small feat at all for a production-ready framework already used by lots of developers.

I have been programming with the .NET Framework since v1.0; v2.0 (with generics) was a clear cut, so it is not something you can add too late, because it becomes pervasive in a framework that leverages it in its design (generics in v2.0, LINQ in v3.0, TPL and async in v4.0). As a note, .NET continues to support the non-generic collections for backward compatibility, but their usage is deprecated.

Really interesting piece of .NET history in https://blogs.msdn.microsoft.com/dsyme/2011/03/15/netc-gener...


> It's already at the beginning of the article, but it should be more visible how MS Research and Don Syme saw the opportunity to add generics (needed for F#) and added them, with a design that already supported later additions like variance.

I definitely agree, that's why I put it right at the start of the post :-)


> Ultimately, an erasure model of generics would have been adopted, as for Java, since the CLR team would never have pursued a in-the-VM generics design without external help.

Can someone explain why "in-the-VM" generics design is better than the erasure model of Java?


In Java, you can't make runtime decisions based on the generic type of an object. Eg. you can't have a List<?> and check to see if it's a List<Foo> or a List<Bar> -- that information has been erased.

You could look at the head of the list to make that determination (because the List members still have a concrete type), but that isn't a general-purpose solution.

Languages like Scala use manifests to preserve the information but it's a major kludge.
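For comparison, the C# side of this (a small sketch): because the constructed type survives to run time, checks like these just work.

    using System;
    using System.Collections.Generic;

    class ReifiedCheckSketch
    {
        static void Describe(object o)
        {
            // The runtime knows the full constructed type, so we can test for
            // List<string> vs List<int>. The equivalent instanceof check is a
            // compile error in Java because the type argument is erased.
            switch (o)
            {
                case List<string> s: Console.WriteLine($"List<string>, {s.Count} items"); break;
                case List<int> i:    Console.WriteLine($"List<int>, {i.Count} items");    break;
                default:             Console.WriteLine(o.GetType());                      break;
            }
        }

        static void Main()
        {
            Describe(new List<string> { "a", "b" });   // List<string>, 2 items
            Describe(new List<int> { 1, 2, 3 });       // List<int>, 3 items
        }
    }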


Looking at the head of a List does not work. The class of that element is within the scope of <? extends ListType>, an is not necessarily not ListType.


To spell this out so it's a bit more obvious, given a Thing at the head of a list, you could have:

    - a List<Thing>
    - a List<ThingSuperclass>
    - a List<? extends Thing>
    - a List<? super ThingSubclass>
    - a List<IThing>
    - a List<ILiterallyAnyImplementedIFace>
    - a List<IHaventForgottenInterfacesCanExtendOthers>
    - a List<Object>
    - a List<I'm probably forgetting some possibilities>
And you have no way of knowing which it is.

So yes, it's a boundary, which is significantly more than nothing. But far less useful than it feels like at first.


*and is not necessarily ListType.

Idiot.


For the downvoters, the author is correcting themselves after the edit window. It's clearly self-deprecating humor, not a personal attack.


Not to mention that an empty list still has a type...


On the way to looking at other things(tm), I found a blog post by a CS language researcher talking about type erasure.

His two points:

Being able to do type erasure is great because it means your language is coherent. (You don't need any nasty kludges in the runtime.)

And then everyone ratfucks themselves by actually implementing type erasure, and now your language will never have good tooling and thus will die a lonely unmourned death.

He commented that C/C++ has type erasure and it's also a massive kludge (via the ELF format). No one will invest that much effort into a new language.

tl;dr: Type erasure, it's great, don't do it.


What tooling do you think is missing due to erasure?


Not a CS person, but I think if you keep the type information pinned to the object in memory, then a debugger can use the object's type information in memory to match it with the object's definition in the program. That makes creating a debugger easy.

Imagine wanting to know what the object at address 0x34199920 is. If you have type info[1] you can cross reference the type info with 'generic array' and then do the same with element 2, address 0x34199920 and know it's a 'foo object'.

[1] Say the first 4 bytes is _always_ a type index that the compiler or interpreter generates.


But then what is the difference between a collection of type object and an erasure-model generic collection?


At runtime? None, it's a static (compile-time) feature, though the compiler does add some type assertions & casts automatically.

Note that this is not necessarily an issue for very statically-typed languages e.g. if I remember correctly GHC uses erased generics (they don't exist at runtime) which works fine because Haskell has neither pervasive RTTI nor specialisation, so the type erasure has ~no runtime visibility or impact.


At the VM/bytecode level, not much (if anything?). The benefit comes at compile-time, where the compiler can verify that you're putting the correct types into a collection, and that you're expecting the correct types out.


> At the VM/bytecode level, not much (if anything?).

I don't think this is correct. In C# code at runtime, and in the bytecode, you can inspect types for metadata about their generic type params. It is present in the .Net bytecode.

See https://docs.microsoft.com/en-us/dotnet/csharp/programming-g...

> When a generic type or method is compiled into (bytecode), it contains metadata that identifies it as having type parameters.

from https://docs.microsoft.com/en-us/dotnet/csharp/programming-g...


But C# generics aren't erasure-model collections, are they?


No, they are not. The point is, it is different at the bytecode level.


You've got that completely the wrong way around. An erasing compiler has full access to the type (obviously - it's doing the erasing) and so can verify whether you're putting the correct types in or out of a collection.

Where erasure can cause issues is at runtime. For example:

    X isa List<Int>
Even if X is indeed a list of ints that information isn't held at runtime, so the test can't be supported.

Having said that, in some respects this is a C# / Java ecumenical matter. Haskell, for example, doesn't bake types into output code, so I suppose it erases even more than Java, but it supports very rich polymorphic behaviour.


Performance: you can get rid of all those bridge casts without the need for an inter-procedural analysis. However, since C# also supports structs, generic instances of structs can be (and most of the time are) inlined to realize struct performance benefits.

Ease of programming: your generic type bindings don't disappear at run time and can be used in run-time logic decisions, basically violating Bracha's law (types shouldn't influence run-time behavior), but I found it incredibly useful for meta-programming reasons.
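A tiny sketch of what that looks like (illustrative names only):

    using System;
    using System.Collections.Generic;

    class RuntimeGenericsSketch
    {
        // The type argument is available at run time, so generic code can
        // branch on it or reflect over it.
        static string Describe<T>(IEnumerable<T> items)
            => $"a sequence of {typeof(T).Name}";

        static void Main()
        {
            Console.WriteLine(Describe(new List<int> { 1, 2 }));   // a sequence of Int32
            Console.WriteLine(Describe(new[] { "a", "b" }));       // a sequence of String

            // Constructed types also expose their arguments via reflection.
            Console.WriteLine(typeof(List<string>).GetGenericArguments()[0]);  // System.String
        }
    }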


As a Haskell aficionado I can appreciate type erasure and the idea that you don't need to know shit at runtime. However yes when using hybrid languages like C# you have to cheat sometimes. Although anything that uses reflection makes me cringe and there is the practical concern that analysis tools such as find all references etc. are of less use when there is stuff going round and reflecting on shit.


If you admit dynamic casts/tests in your language, then why go only half way? Such inconsistency leads to bad usability. Haskell is at the other extreme (no dynamic casts) but then it is consistent about it.


Yeah. People complain about Java type erasure, but the problem is not that they have type erasure, it's that they did it only halfway.


Yeah, I can see that. Basically you've got two options, go the Haskell way and have zero runtime reflection support and full type erasure, or go the C# way and have full reflection support and no type erasure. Java it's just painful because you get runtime reflection except it's incomplete because half the type info was erased at compile time.


This seems orthogonal though. Just because a runtime uses reified generics, doesn't mean that it has to provide reflection on them.

Is there something special that type erasure adds, that you couldn't have Haskell without it? From a layman's perspective it seems that Haskell could be implemented with either an erased or a reified generics model under the hood, without changing the public surface of the language. But is there something that type erasure enables that reified generics does not?


> This seems orthogonal though. Just because a runtime uses reified generics, doesn't mean that it has to provide reflection on them.

It's not completely orthogonal as erased generics are annoying/problematic when the language does provide reflection/RTTI.

So you've got 4 (reification x reflection) states 3 of which are fine:

* if you have erasure and no reflection (Haskell) you're fine: you don't have runtime types but they don't matter/are inaccessible

* if you have reification and reflection (C#, C++/RTTI) you're fine: you can access runtime types and have them

* if you have reification and no reflection (Rust, C++/noRTTI) you're fine: you can specialise & discard types at runtime

* if you have erasure and reflection (Java) you're fucked: you can access types at runtime, but many aren't here anymore

> From a layman's perspective it seems that Haskell could be implemented with either an erased or a reified generics model under the hood, without changing the public surface of the language. But is there something that type erasure enables that reified generics does not?

A simpler implementation.


Sure you could put reified generics in there, but what would you use them for if you didn't have reflection?


Think of all the places where you need to pass in an explicit type to a Java class to work around the fact that the types are erased (list.toArray() being the canonical example, but reflection uses hit this a lot as well). That's the downside.

The only benefit to the erasure model is it's easier to implement for the runtime.


Interop is another advantage of the erasure model. Getting a version of Scala to run with reified generics on the CLR is a huge, perhaps impossible, undertaking. And if you can't do that, interoping with generic C# code becomes really hard. On the other hand, Scala interoping with Java code, generic or otherwise, is much easier.


I've heard this argument a few times now, it's an interesting point, having 'reified generics' imposes a tax on any language that wants to use that runtime.

Here's a few other examples of the same point, https://twitter.com/jon_cham/status/969929683587432450 and https://twitter.com/headius/status/958371298975080448


> Interop is another advantage of the erasure model

And, in the case of F#'s type providers, it means not having to generate zillions of types when coding against some huge schema. But, on the other hand, it does give you the option to do so.


If the bindings are reference types, the CLR will economize their representations as long as they don't have any static state, I think.

I'm not sure if F# is leveraging the DLR, but there is a lot of fun that can be had there in generating types on the fly, on par with the most dynamic languages out there. It is too bad that it has been nerfed with the UWP/AOT compilation push.


> If the bindings are reference types, the CLR will economize their representations as long as they don't have any static state, I think.

Yep, I knew about that, but type providers don't necessarily generate generic types. So, I meant something a bit different.

As you know, if Foo<T> is an ordinary generic, then T must be a CLR type, which is to say that it must be code that was somehow turned into a CLR type (usually by a compiler such as C#). But, if Foo<> is a type provider, then arbitrary code of an arbitrary language can go in between the angle brackets (which is why type providers can accept strings). Essentially, each type provider is a compiler, but the code that is generated must be expressed (or "wrapped", if you will) as a CLR type. But these types need not be generic (and usually aren't, I think). So, not having erased types in F# is akin to having a compiler that can't generate loops or subroutines†.

†(Perhaps a better analogy is a compiler that can't eliminate tail-calls, since erasing everything to a certain base type (that F# lets you choose) is kind of like reusing stack frames vis-à-vis the analogy)


It is possible to generate interface definitions in C# on the fly (at run-time), but unfortunately it requires going outside of the CLR or even the DLR. Resource usage also becomes a problem since you have to generate it in a new assembly.


> It is possible to generate interface definitions in C# on the fly (at run-time),

Huh, I never thought about this restriction, so you can't add a new interface with 'Reflection.Emit', you can only add a class that implements an existing Interface, is that right?

> but unfortunately it requires going outside of the CLR or even the DLR.

By this do you mean writing raw CLR metadata yourself and then somehow injecting it into the running process, or something else? Either way, it sounds like an interesting technique. Any links to how it's done?


Using a TypeBuilder, you just set the type attributes to interface and define it otherwise like a class. See https://stackoverflow.com/questions/136528/using-nets-reflec...
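Roughly along these lines (a minimal sketch; see the link for a fuller example):

    using System;
    using System.Reflection;
    using System.Reflection.Emit;

    class EmitInterfaceSketch
    {
        static void Main()
        {
            // Host the new type in a throwaway dynamic assembly/module.
            var asm = AssemblyBuilder.DefineDynamicAssembly(
                new AssemblyName("Dynamic"), AssemblyBuilderAccess.Run);
            var module = asm.DefineDynamicModule("Main");

            // An interface is just a type marked Interface | Abstract, with
            // abstract methods and no constructors.
            var tb = module.DefineType(
                "IGreeter",
                TypeAttributes.Public | TypeAttributes.Interface | TypeAttributes.Abstract);

            tb.DefineMethod(
                "Greet",
                MethodAttributes.Public | MethodAttributes.Abstract | MethodAttributes.Virtual,
                typeof(string), new[] { typeof(string) });

            Type greeter = tb.CreateType();
            Console.WriteLine(greeter.IsInterface);          // True
            Console.WriteLine(greeter.GetMethod("Greet"));   // System.String Greet(System.String)
        }
    }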


Thanks for the link, I see what you mean now


Scala used to target CLR.

It was abandoned because interop was hard, and interop was Scala's strategy for success.


Scala used erasure to target the CLR.


Yep. And erasure meant that it couldn't interop with generics, which was a big deal for them.


Yes, that is correct.


Note also that Java does have reified generics, but only for arrays. The designers of the JVM from early on did realize the advantages of reified generics. Type erasure is not some principled stand being made by the JVM.


Well, really their hand was forced because they needed to be able to specialize all of the array-of-primitive types. It's not so much that they saw the advantage as that they couldn't imagine a language where you can't express `byte[]`. But the point stands that they could have realized that `ArrayList<byte>` is almost as fundamental and created reified generics for user-defined types as well.


They could have had type erasure for all arrays of non-primitives. Their hand was not forced.


> Note also that Java does have reified generics, but only for arrays.

By that count, so does C.


Java has generics but they are only reified for arrays.

C has types that are only generic in the case of arrays.

There is a difference.

In Java I can write this function

    <T> String getTypeOfArray(T[] arr) {
        return arr.getClass().getComponentType().getName();
    }
and that will return the name of type T at run time. If instead of T[] I use List<T>, getComponentType will return null and there is no way to access what type T is at run time. At compile time List<T> and T[] behave the same, but at runtime the List loses what type it contains.

In C I can't write a function with that signature at all because it doesn't support generic functions.


C has no runtime type information at all. Java knows the difference between Integer[] and Float[] at runtime but not between List<Integer> and List<Float>.


list.toArray() is only a problem in Java because arrays aren't erased. Mixing erasure and non-erasure in the same language is a problem, but it's not a problem with erasure.


2 main things:

- Boxing/unboxing would not be an issue anymore, so you gain performance.

- Runtime types are enforced: List myList = new List<String>(); myList.add(new Object());

In Java, this is OK. In C#, it throws.


Testing this, it doesn't throw in C#, it just doesn't compile because you can't declare a variable to be a List without declaring the type with it. If you specify myList as List<String> then it still doesn't compile.


For C#, try compiling with IList instead of List.

IList myList = new List<String>(); myList.Add(new Object());


To be fair, raw types like List are really deprecated and only exist in order to interface with legacy APIs.


Java? In a lot of cases you still need them. Anything that touches reflection is in that category; plenty of DI/IoC systems, mappers, etc. rely on them.


Can you give an example?

Most of the time when reflection is needed I see something like this instead of just a plain list type.

    <T> void foo(List<T> list, Class<? extends T> type) { ... }


You'd have to call the Java variant via reflection to show the difference at runtime, as the compiler would already enforce that myList.add(new Object()) won't compile. The point stands, though.


Reflection is not required. I actually had the code; it somehow got deleted:

List myList = new ArrayList<String>(); myList.add(new Object());

this would compile.


Ah, I missed the missing generics on the declaration. Sorry.


I would say Java and C# have both failed on your second point, separate from their respective VMs' issues.

It would be nice if mistakes like that were deprecated then eventually fixed, even if over many years/versions.


Does anybody remember the article about their generics implementation in assembly? How they basically were able to reduce the cost of generics by observing they needed only 3 implementations, by writing specialized versions: one for 8-64 bit values (just one version), one for 64-bit pointers, and one for structs (structs passed in by having length and value).


> Does anybody remember the article about their generics implementation in assembly?

Do you mean http://joeduffyblog.com/2011/10/23/on-generics-and-some-of-t... or https://blogs.msdn.microsoft.com/joelpob/2004/11/17/clr-gene...

They're 2 articles I came across that covered the low-level details of the generic impl.


Did they really have to split the library into generic/non-generic? I think Java handled that much better.

It's been more than a decade since I last used C#, so excuse me if I recall incorrectly.


I disagree. I think that C# generics are way better. The C# generics are supported at the virtual machine level, whereas Java just pretends to do generics. Java generic types are unknown to the virtual machine since the compiler just compiles a List<Foo> to a list of objects, leaving the VM ignorant of what it really is.

I thought the Java implementation was almost a workaround to avoid the hard work of a true generics implementation.

The fact that C# generics are understood by all parts of the .NET ecosystem makes it so much better and boosts the language to a much higher level than Java.

C# and Java always seemed to be roughly the same to me until C# got generics, and that boosted C# to a much better plane of existence, which was followed by a great deal of innovation such as LINQ, async-await, lambdas, etc, all of which benefit from C# generics.


To answer the parent's question: Yes, it was necessary. Doing it the way they did allowed source compatibility with pre-generic or non-generic-aware code. Passing a List<string> to something that expects an IEnumerable works just fine and effectively gives you the Java solution (at a source level) when talking to older APIs! This is why interface IEnumerable<T>: IEnumerable.
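Concretely (a small sketch):

    using System;
    using System.Collections;
    using System.Collections.Generic;

    class BackCompatSketch
    {
        // A pre-generics API that only knows the non-generic interface.
        static void PrintAll(IEnumerable items)
        {
            foreach (object item in items)
                Console.WriteLine(item);
        }

        static void Main()
        {
            // Works because IEnumerable<T> : IEnumerable, so generic
            // collections slot straight into pre-generic code.
            var names = new List<string> { "Ada", "Grace" };
            PrintAll(names);
        }
    }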

>I disagree. I think that C# generics are way better. The C# generics are supported at the virtual machine level, whereas Java just pretends to do generics. Java generic types are unknown to the virtual machine since the compiler just compiles a List<Foo> to a list of objects, leaving the VM ignorant of what it really is.

It is worse than that. Java generics don't work with value types at all thanks to this. A C# List<int> is much closer to a zero-overhead abstraction. It does not box the primitive int values. If Java ever adopts primitive generic types it will break compatibility or introduce massive performance overhead, as touching any non-generic API forces a boxing conversion. Either it negates the reason for type-erasing originally or it eliminates the main benefit of non-boxing value type support (performance).

(I should clarify that Java doesn't have first-class value type support either so this really only applies to primitives)
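The boxing difference in a nutshell (an illustrative sketch):

    using System;
    using System.Collections;
    using System.Collections.Generic;

    class BoxingSketch
    {
        static void Main()
        {
            // Pre-generics collection: each int is boxed to object on Add and
            // must be unboxed (with a runtime-checked cast) on the way out.
            ArrayList untyped = new ArrayList();
            untyped.Add(42);              // boxing allocation
            int a = (int)untyped[0];      // unboxing + cast

            // Reified generics: the CLR specialises List<int> so the backing
            // store is an int[]; no boxing, no casts.
            var typed = new List<int>();
            typed.Add(42);
            int b = typed[0];

            Console.WriteLine(a + b);     // 84
        }
    }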

>I thought the Java implementation was almost a workaround to avoid the hard work of a true generics implementation.

It was a deliberate design decision to retain compatibility between pre-generic and post-generic Java code.

Personally I think that was the wrong tradeoff; there is never a better time to make a breaking change than right now. The cost only ever increases with time. It was also clear back then that Java would exist for far longer with generics than without and that far more code would be written in Java post-generics than pre-generics. The result is everyone who uses Java is stuck with limitations and negative performance impacts forever, rather than accepting some short-term pain.

It is also my personal opinion that source compatibility is what developers actually cared about and Sun should have told the stodgy risk-averse big Java houses to suck it up and get ready for JVM v2. It would have been a good opportunity to fix a few other things in the JVM.

C# had the benefit of hindsight in some ways. I'm not sure if Java demonstrates how open delivers inferior results, how bad leadership can impact a project, or just what happens if you pay attention to what "enterprise" customers claim they want.


> Personally I think that was the wrong tradeoff; there is never a better time to make a breaking change than right now. The cost only ever increases with time.

Yes, but from a marketing perspective breaking with the past is usually not desired. Especially if your product is currently being adopted and is already partially adopted in both hardware and software (which might have been the case when they implemented generics?).


The Java implementation is a workaround to avoid breaking processors that executed Java bytecode. ARM Cortex can still theoretically do it, though it now just traps into a soft VM.


Java didn't handle it much better; they simply chose a different tradeoff. They chose to implement generics via type erasure, which basically means List<String> and List are the exact same class at runtime.

.NET chose to implement specialised generics. Many people think that is a better tradeoff.


Note that the only generic type Java originally had, the array, has C#-style reified generics. The early JVM implementors should have felt a bit dirty providing reified generics support for arrays at the VM level without at least having a forward-compatible VM-level provision for general reified generics.


See also: syntax-baked reified arrays, slices, maps, and channels in Go (but "no generics", right?)


I think they made the right choice at the time. Targeting devices smaller than desktop computers about 5 years before C# (in a time when Moore's law was in full swing), they had to cram the JVM into much smaller systems than C# targeted.


C# made the exact same trade-off in 1.0.


I'm not sure what you mean by "split the library," but what Microsoft did do is introduce generic versions of many popular and essential classes. For example, the untyped ArrayList became Collection<T>, List<T>, etc. Microsoft did this to retain backward compatibility with existing code.


If it took 6 years to get generics into .NET, we can't fault the Golang guys for taking their time too.


It took them 6 years to do it properly on top of an existing framework with the aim to maintain a crazy level of compatibility.

They did so because they knew the advantages to be massive. And so does everyone else at this point.

That Golang decided to forego that wisdom when they started from scratch (and thus created a complex compatibility story if they were to add it later) is entirely their own fault.


So the Golang guys are in the same situation now, aren't they? Isn't it a bit unfair to fault them for not including generics from the beginning?


> Isn't it a bit unfair to fault them for not including generics from the beginning?

No. Because that would be much easier and was the only obvious answer.

They took a shortcut, hoped nobody noticed, and later, when people started complaining about this obvious omission in a modern language, tried to weasel their way out of it with some "complexity" bullshit and how "people wouldn't understand".

So I can fault them for both not including it and then later being disingenuous about why.

For this glaring omission the golang designers deserve nothing but ridicule.


Golang generics are templates, are they not?


Golang doesn't have user-defined generics (parametric polymorphism).



