The breakthroughs aren't so much in language features themselves as in their implementations. GC can be largely pauseless for many practical purposes and a GC language can be within 2x-3x the performance of C-like languages.
Also, memory has become plentiful and cheap enough that we can use immutable data structures and a functional style in many contexts, which definitely feels like a 'level-up'.
Concurrency has been getting easier too, with many languages supporting coroutines/async and/or threads. Reference capabilities are exciting as in Pony, Rust, or Clean.
In general there's a great convergence where ergonomics are improving (editors, compilers, build/package systems) and each language evolves to adopt features of other languages.
I just dabbled in the Sorbet type checker after not writing any C++ since the 90s, and it was surprisingly browseable/readable; I could map the concepts onto my recent Ruby, Java, and Go knowledge.
It's nice to read a positive comment like yours occasionally, because the vast majority of the time, I'm just disappointed in how bad our programming tools (including languages) are.
It's become a meme in my office that I'm the guy constantly bitching about how stupid our languages are. This week I was back on my soap box about the fact that almost zero mainstream (statically typed) programming languages can even let you write down that you want a non-empty string. In some languages you can implement your own class/type that wraps around the built-in string type, but most of the time you are now introducing memory and CPU overhead, can't use your type with other APIs that expect strings, etc. So nobody does that. But ask yourself this: how often have you ever written a function that requested a string as input and actually wanted an empty string? How many times did you not even think about what would happen if someone DID pass an empty string?
Same goes for positive and non-negative numbers. How many times did you write "int" when you actually only wanted a positive number? If I have to type "if x <= 0 throw FooException" as the first line of one more function, I'm going to scream. (A lot of languages do have unsigned ints, to be fair. But some still don't, or their unsigned ints are almost useless.)
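To make that concrete, here's a minimal Kotlin sketch of the difference I'm after; PositiveInt and the function names are just illustrative, not something any standard library hands you:

    // The check I keep retyping at the top of functions:
    fun scheduleRetries(maxAttempts: Int) {
        if (maxAttempts <= 0) throw IllegalArgumentException("maxAttempts must be positive")
        // ... do the actual work ...
    }

    // Versus stating the requirement once, in a (hypothetical) type:
    @JvmInline
    value class PositiveInt private constructor(val value: Int) {
        companion object {
            fun of(n: Int): PositiveInt? = if (n > 0) PositiveInt(n) else null
        }
    }

    fun scheduleRetriesTyped(maxAttempts: PositiveInt) {
        // no check needed here: an invalid value can't reach this point
    }

The check still runs at runtime, but it runs exactly once, at the boundary, instead of at the top of every function.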
People make all kinds of Stockholm Syndrome-y excuses for it ("validate inputs at the edges is all you need"), but the truth is that (most of) our languages are just plain deficient at expressing really basic ideas.
Thank goodness there are some languages that do try to make it possible to write "newtypes" and try to make concurrency safer. Now if we could just get everyone to adopt those in less than a decade, maybe we'd be able to get to the next generation after that, and then maybe we'll have good programming languages before I die.
I'm not so sure that I sympathize with your example. Why not a type for even numbers? Odd, prime, not-prime, etc?
You really are asking for a type that is "valid data." Commendable, but not a static property of data. As a fun example, what is a valid email address? Once established as valid, how long will it stay that way? If invalid, how long until it can become valid?
Do I think better typing can be a boon? Absolutely! Can it also be a burden? Absolutely!
I meant that line as a bit of a tease to other tricks.
And it isn't like this isn't done often. Take the lowly format string in C-family languages. With the requirement that the format string has to be static, it is common to fail builds if you give it a bad format string or supply the wrong number of arguments.
Why do you think that? It might require a different mental model of types in order to see the benefits but I can't believe anyone is working on a project that doesn't have any sort of properties or 'business' logic that dependent types wouldn't help encode. Have you read anything about type-driven development? https://blog.ploeh.dk/2015/08/10/type-driven-development/
I didn't mean for that to read as dismissive as it clearly does. Should have waited till I had more to say. :)
I don't think the number of places dependent typing could have helped me is zero. Just like linear typing is nice. I just know that many of the dependencies I work with are much more dynamic in nature, such that getting them in static types has tough ROI to justify.
Bluntly, the hardest program failures I've ever seen have been second systems from people learning generics and higher kinded types. I'm convinced these techniques are worth learning. I'm not convinced that they provide the immediate return on effort that they are usually sold with.
I actually meant my "looking forward" to be sincere. I've used tooling that can point out exactly where an injection attack is possible, and that was quite nice. That said, I expect more of these tricks in my tooling, not necessarily crafted by me.
Elsewhere I pointed to format strings checked at compile time for valid shape and arguments. This is actually common in lisps, ironically enough, and is akin to this sort of affordance.
I have just also seen too many projects fail that thought they could get it all in the types. It is very frustrating.
So... why NOT a type for even numbers? Or prime numbers?
I'm not asking for someone to supply these things in the standard library of $FOOLANG. I'm saying that very few languages offer the tools to define such things without it being very cumbersome and often incurring significant runtime overhead.
I know it's possible to do better because I've read up a bit on Ada. I've used Rust and written my own "newtypes" with the Deref "trick". I've toyed with Haskell.
I don't want to write a validator and litter calls to it throughout my code base.
I want the "validator" to be the constructor of a type (or a refinement mechanism, etc).
Then every function in my code base can have a clear contract via its type signature. What good is a function that says it accepts a string and returns a string if it actually just explodes when given some strings? That's not the same thing as returning a string...
I want to do three-ish things:
1. Make my function's type signature be correct. It is NOT correct to say that you accept an "int" but actually crash if you're given `0`.
2. Push bug catching to compile time instead of runtime. Calling a validation function inside a function is a runtime check. In the meantime I can write 100 calls to a function that I know will cause a crash and the compiler will say nothing.
3. Push type requirements to the callers of functions. Everyone preaches "validate at the edges", but if your function signatures had very precise types, that would happen "automatically." Your code won't compile unless you pipe the correct types all the way from the top function call down to the bottom one.
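Here's a rough Kotlin sketch of those three points together; NonEmptyString and the call sites are made up for illustration:

    // Hypothetical NonEmptyString; the point is the signatures, not the wrapper itself.
    @JvmInline
    value class NonEmptyString private constructor(val value: String) {
        companion object {
            fun of(s: String): NonEmptyString? = if (s.isNotEmpty()) NonEmptyString(s) else null
        }
    }

    // 1. The signature is honest: this function cannot be handed "".
    fun printMailingLabel(name: NonEmptyString) = println(name.value)

    fun handleRequest(rawName: String) {
        // 3. The caller at the edge is forced to deal with the empty case right here...
        val name = NonEmptyString.of(rawName) ?: return  // or respond with a 400, etc.
        // 2. ...and passing a plain String below simply won't compile.
        printMailingLabel(name)
    }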
My point was that focusing on an empty string as the hill to die on was a touch artificial.
My argument for "why not" is that I don't think it pays off for most uses. You will wind up putting a ton of logic into the types, but then you have to do a ton of logic to correctly serialize into the types you have.
Do I think there are times/places this could pay off? I'd hope/expect so. But where the data hits the wire is likely not where you can set many of these constraints such that your type system can really help with them.
> My point was that focusing on an empty string as the hill to die on was a touch artificial.
There must be a term for this phenomenon. My original reply that sparked this thread could be tl;dr as "I get frustrated with most programming languages. People at work know me as the guy that always complains about programming languages. One example of that is that I recently complained about most languages making it hard to statically guarantee that an input string is non-empty."
But then I get painted as "dying on a hill" for non-empty strings. It was one example, and it's not even the most important complaint I have about current programming languages. It was just one that came up this week at work because we literally had to deal with a bug from a REST API of a big company because THEY sent an empty string in a JSON payload that isn't supposed to be empty. Their code obviously missed a check somewhere and instead of sending a 404 response, they sent a bad payload (according to their own docs).
> My argument for "why not" is that I don't think it pays off for most uses. You will wind up putting a ton of logic into the types, but then you have to do a ton of logic to correctly serialize into the types you have.
>
> Do I think there are times/places this could pay off? I'd hope/expect so. But where the data hits the wire is likely not where you can set many of these constraints such that your type system can really help with them.
This is hard to debate because we're speaking in very abstract terms. Obviously different domains will have different needs, etc.
So, unfortunately, I'm not sure I follow your argument for "why not". But here's my argument for "why". A lot of times, when we design software, it ends up working in "layers". Some function calls some other function calls some other function, etc.
If that "bottom" function requires something like a non-empty string (maybe that function is going to print a mailing label and it would be ridiculous to waste printer time on a blank label), you have two options: compile time enforcement or runtime enforcement.
In my experience, the majority of the time, "we" choose runtime enforcement, even in statically typed languages. What does runtime enforcement look like? Usually it's one of two things: you throw an exception or you return some kind of failure value.
If you throw an exception, it bubbles all the way up and your top level main loop has to catch it and understand how to handle it. That somewhat implies that your top level has to know everything that could go wrong at any layer of your code.
In addition, you now have a dilemma. You know that if you pass an empty-string through, that you'll eventually hit the function that requires a non-empty string. So you have two sub-options for this option. You can validate at the top level and pass it only if it's valid, or you can pass everything through and potentially do a lot of computational work before hitting the failing function and throwing away all that work. Most people choose to validate at the top level. So, you're validating at the top level anyway, and you're possibly validating in TWO places (the function doesn't know who might call it)- not DRY.
Other issues with the (unchecked) exception approach: It ALSO means that the type signature on your bottom-layer function is LYING. It "said" that you could pass it a string and it would return Whatever. It was wrong. You passed a string and instead of returning a Whatever, it started unwinding the stack for you. That's not static typing. Callers of this function can't trust its type signature to be complete. Instead they now have to read documentation (hopefully you wrote some). But if they already have to read documentation to understand what your function accepts and returns, why did we bother writing the types at all? Just use Python or JavaScript and don't put types on anything. You just have to read the docs to know how to call it and the types will never get in the way.
The other runtime option is that your function might return a value that indicates failure. Some languages have "Result" or "Try" types. But if you do that, then the function that calls that bottom function has to handle the return value. There's a good chance that the "second layer" function can't really handle the failure, so it ALSO has to return a failure value. Etc, etc, until every function in the call chain has altered its return type to indicate that it may fail. Then your top level loop inspects the return value and handles the failure. I include Java-style checked exceptions in this category and not in the above "exception" category.
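To illustrate that last option with a small (hypothetical) Kotlin sketch: only the bottom layer can actually fail, but every layer above it now has to carry the failure in its return type:

    // Only this bottom layer can actually fail...
    fun printLabel(name: String): Result<Unit> =
        if (name.isEmpty()) Result.failure<Unit>(IllegalArgumentException("empty name"))
        else Result.success(println("label: $name"))

    // ...but every layer above it has to widen its return type anyway.
    fun shipOrder(name: String): Result<Unit> = printLabel(name)
    fun handleRequest(name: String): Result<Unit> = shipOrder(name)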
This return-failure-value approach pollutes all layers of your code even worse than if you just had a NonEmptyString type! Instead of the top level inspecting the bubbled up failure value after doing a bunch of computation, it could have just tried creating a NonEmptyString type from the input. If it failed, then the top-level handles the failure same as before, but didn't waste a bunch of CPU and clock time doing computations before hitting the failure. Furthermore, it's very DRY because the validation logic is in the type itself, either in some kind of type refinement mechanism, or in a factory function, etc.
Furthermore, the NonEmptyString type approach gives you more compile-time safety from bugs. If you try to call that bottom function with a maybe-empty-string, it won't compile. In the other cases, you'll only find out at runtime, even though you KNOW ahead of time that it's illegal to do so. I hope your tests cover everything.
First, an ack that I am almost certainly not touching all of your points. My apologies on that.
I think I am going to lean in on our arguments. I would rather change mine from "why not" to "why this isn't worth forcing".
That is, on the logic and aims, I fully agree with you. It is more that in practice, I have seen this fail too many times. I expect and look forward to it succeeding some day, but I still caution against jumping all in.
I should also ack that I am pretty happy with how Common Lisp does this. By mixing in read time and evaluation time, you can actually get a lot of this. I'm on my phone right now, but (format nil "~{") will not evaluate in sbcl. Instead, it will indicate an error in the format string. This is similar to how C will fail the build with -Wall and -Werror on similar bad format strings. The difference is that in Lisp, you can add such evaluation-time checks as a user.
As for why we do so much at runtime, my assertion is we check at runtime that which is determined at runtime. When getting data from outside the static program, there is little help the static type system can offer. So, deep in your system, get things out of strings and primitives as soon as you can. Don't pass a string username, pass a username. This gets you essentially what you want, but flags where the invalid value could have come from, as well as where it could have been used. And lets you add on other validations. AuthenticatedUsername and UnauthenticatedUsername, for example.
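Something like this, as a rough Kotlin sketch (all the names are just illustrative):

    @JvmInline
    value class UnauthenticatedUsername(val raw: String)

    @JvmInline
    value class AuthenticatedUsername(val raw: String)

    // The only place a bare String becomes a username: right where it enters the system.
    fun parseUsername(input: String): UnauthenticatedUsername? =
        input.trim().takeIf { it.isNotEmpty() }?.let { UnauthenticatedUsername(it) }

    // The only place an unauthenticated name can be upgraded to an authenticated one.
    fun authenticate(name: UnauthenticatedUsername, password: String): AuthenticatedUsername? =
        if (credentialsOk(name.raw, password)) AuthenticatedUsername(name.raw) else null

    fun credentialsOk(user: String, password: String): Boolean = TODO("backend check")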
I get that you called those out as another category of error, but types are explicitly for that, as well.
And finally, in somewhat reverse form, I agree that arguments on specifics as a proxy for general are frustrating. For my part, apologies on adding to it.
> As for why we do so much at runtime, my assertion is we check at runtime that which is determined at runtime. When getting data from outside the static program, there is little help the static type system can offer. So, deep in your system, get things out of strings and primitives as soon as you can. Don't pass a string username, pass a username. This gets you essentially what you want, but flags where the invalid value could have come from, as well as where it could have been used. And lets you add on other validations. AuthenticatedUsername and UnauthenticatedUsername, for example.
See, but that's exactly what I'm saying. Get out of primitives ASAP. I just want our languages to allow us to do that with minimal ceremony and minimal performance cost. Most languages fail on at least one of those. As a result, most developers don't do it, and I'm stuck interacting with their code.
Type checking is not a panacea. You cannot use type checking to spot and fix all bugs.
Funny enough the problem you raised was trivially solvable by creating a BNF-like checker based on the spec for "THEIR" API response data.
YOU just didn't want to do it.
I'm pretty sure there are solutions at least in C++ and Java (eg. validate at the edge and convert it into a ValidatedString type and use this type throughout), but honestly, you seem like the type that complains about missing incoherent features that can be trivially resolved (and doesn't recognize that the "imperfect" parts are an inherent part of the problem as opposed to being a deficiency in tooling), so I'll just leave my comment at that.
> Type checking is not a panacea. You cannot use type checking to spot and fix all bugs.
I didn't say it was or it could.
> Funny enough the problem you raised was trivially solvable by creating a BNF-like checker based on the spec for "THEIR" API response data.
>
> YOU just didn't want to do it.
I didn't write the code, but if I did, I WOULD have used a NonEmptyString type for the deserialized result of the response body. That's... pretty much my whole freaking point here.
The "problem" is that my colleagues, as well as many, many, devs, including the ones at the company in question, are going to do the easiest thing they can. If your language provides String, and writing your own NonEmptyString type would require a bunch of boilerplate, most devs are just not gonna do it. "We'll never have an empty string here, anyway" they'll say. Until they do.
And, as I pointed out in another comment, I DO write these types in languages like Java. It's a ton of boilerplate and the language fights you at every turn. In Java, most texty APIs expect a String instance, which is a final class. So I have to wrap String with my class and convert to-and-from explicitly all over my code. Not to mention the performance overhead and optimization loss from wrapping a primitive type.
So, thanks for the condescending attitude, but I DO put my money where my mouth is and I still feel completely justified in criticizing the current state of software development tools.
If our tools and "best practices" are adequate, then how in the hell did Apple, a trillion dollar company, just release a version of macOS a year ago that literally had a calculator app that gave the wrong arithmetic answer if you used it too quickly?!
Every time I criticize software dev stuff and someone replies and tells me everything is totally fine, it only makes me stronger.
I'm rambling because our programming languages make it awkward, difficult, and performance-sub-optimal to do such things.
If you do what I often do, you have to wrap and unwrap your primitives explicitly so that you can use APIs that others have written.
You take a performance hit with all of the boxing and unboxing.
It's absolutely still a missing language feature if the language doesn't have ergonomic "newtypes". Just because it's possible to write a ton of poorly performing boilerplate to accomplish my goal of type safety doesn't mean that I think we're done. You can do these things with essentially zero runtime overhead and almost no boilerplate and friction in the code. It's entirely possible to do newtypes and refinement types in Rust, Haskell, D, Scala, even TypeScript. It's just that most of those languages are not very popular. TypeScript is the most mainstream of the lot.
I think "good programmers" are more of the problem than bad programmers. Good programmers are so used to these shit languages and all of their workarounds (excuse me: "design patterns") that they can't even see the forest for the trees and realize how tedious and stupid most of our work actually is. There's no Earthly reason it should take a million lines of code to write a CRUD app (I'm counting dependencies).
I sympathize if you put it this way. The boxing and unboxing of types can probably be done better in some languages. I suspect a well-optimized C++ compiler can probably make the perf cost mostly free if you give the class the correct magic keywords.
It's just that you seem to make a bigger fuss out of the problem than it actually is. Even with Java, there's boilerplate and some performance cost, but it's doable, and the real reason people don't do it (when they should) is that they underestimate the risk of making mistakes. (eg. Nobody uses JSON if the requirement is to squeeze out every single ounce of performance from the CPU; there are better binary protocols.)
Maybe it's indeed Stockholm syndrome, but I don't really see it as a problem of the "world at large". The tech is there, it's easily and freely available. Of course there's trade off between popularity and technical superiority, as always, but there's really nothing preventing you from writing Rust or TypeScript right now, if you feel strongly enough about it. So your job requires you to use a 3 decade old language? You can always change jobs or start your own thing using new tech. I mean, I'm using these 3-decades-old languages at work but at least I know it's a choice I make in exchange for decent pay and a stable job. Complaining about the lack of adoption of new tech (or coworkers not utilizing the type checking system properly) and then not taking proactive steps to fix the problem seems a tad bit hypocritical to me... The reason the world has not caught on with better tech is precisely due to the same reasons that you're not using it in the first place.
That's easy: a valid email address is one that is well formed, conforming to the specification for email addresses.
I know what you're trying to get at, but that's just a category error. It's a misuse of the term valid in this context. For example, my mail archive contains many emails from valid addresses for which there happens to be no currently active mailbox endpoint. They're still valid data though. The fact that people sometimes use the term valid to mean something completely different is just an unfortunate linguistic accident, but it's a mistake to think it's meaningful.
That's probably because you are looking at the inlined "assembly" of the definition. If you name and reuse all the partial patterns it becomes much clearer. Though regex is a cool obfuscation method.
It would then be at least 3/4 of a page long, I guess.
The fun part is this is not even the full truth. As the list of TLDs isn't very static anymore, it's additionally difficult to determine whether a host name is valid. That is only possible with some dynamic list (or a regex that would grow indefinitely and constantly change). The presented solution doesn't even take this into account.
The source page I've linked is a quite interesting read on that whole topic.
You probably would want to reuse referenced definitions like domain and IP which are not email specific. But yes, all of our JS could be much shorter if we used APL, but most of us like readability :P
I kind of don't get why the TLD should be validated. Does it matter any more than whether a subdomain is not registered or an IP is not reachable? I think "valid as in potentially deliverable" and "actually deliverable" should be distinguished (like well-formed XML vs. schema-validated XML).
The TLD part matters as some part of the email format is defined through the format of a valid host name. "something.com" is a valid host name, but "something.something" isn't currently a valid host name. So an email address "something@something.something" isn't a valid email address (currently).
But at the end of the day this is all moot, imho. The "only" sane test to check the validity of an email address when someone shows you one is whether you can successfully deliver mail there.
Because even if an address is formally valid, that doesn't mean it will get accepted by all systems on its way. Almost nobody follows the under-specified, confusing, and contradictory specs to the letter.
That was my point in the first place: trying to validate email addresses is a rabbit hole. It's for sure anything but "simple", as claimed above.
The point I was making is that whether or not you can successfully deliver email is not a sensible test of the validity of an email address, looking at the address purely as data. As I pointed out, my email archive contains many email addresses that are no longer ‘valid’ by your definition, but they are still valid as data.
By your definition email address validity changes literally on a moment to moment basis. Addresses are becoming invalid constantly and new ones are becoming valid constantly. It’s not a useful definition of validity, and not even something you can test meaningfully.
I already got your point, and I think it's valid.
That's why I've formulated my "definition" carefully:
> the validity of an email address when someone shows you one
It's of course not a "definition" someone could write down into a spec. But it's by far the best "informal validity check" in practice. It checks whether an email address is currently valid. You practically can't do more anyway!
The "formal validity" of an email address changes with time nowadays as I've pointed out: It depends directly on the formal validity of the host name part which can change over time given the fact that the list of TLDs changes over time (which wasn't the case at the time those specs have been written; fun fact: there is more than one spec, and they're contradicting each other).
To add to that, there are two more important aspects: Firstly, an email address you can't send mail to is mostly worthless in practice, as it can't be used for its primary purpose. Secondly, even perfectly "valid" addresses (by the spec) aren't accepted by a lot of parties that claim to handle email addresses! I guess a lot of systems would for example refuse an address looking like "-@-", wouldn't they? But it's perfectly valid!
My initial argument was that claiming that it's "easy" to validate email addresses is wrong in multiple dimensions. In fact it's one of the more complicated questions out there (given the tragedy of the specs).
Why is that valid? You already introduced a new term for this concept, well formed.
Amusingly, this distinction constantly annoys me with contact managers. I don't want to change my friend's phone number to the new one. I want to mark the old one as no longer active, and add the new one. Similar for deceased family.
Back to the point, you seem to want null punning for the empty string. But you still need to check for validity of the value when it is not empty. What does that really gain you?
> Why not a type for even numbers? Odd, prime, not-prime, etc?
Those are not mathematical entities in the same way as are the sets W (whole numbers) or N (natural numbers, indices). It is reasonable to specify your domain closely, otherwise you end up with sqlite.
"Why not a type for datetimes or complex numbers?" would be a better question, as those are different kinds of things but not quite so frequently used in programming.
> It is reasonable to specify your domain closely, otherwise you end up with sqlite
It seems like you're trying to use SQLite as an example of something you don't want to end up with. This seems backwards so it weakens your argument. SQLite is one of the most widely deployed, reliable, broadly applicable and all round useful pieces of software ever written.
OT: It's funny how primes are given this property of 'primeness' and not-primes are those that don't have the property, when it's actually the opposite. The not-primes have the property of being composite (a product) and the primes are the negative space of numbers excluding composite numbers.
Check out Clojure spec for a very expressive way of defining data requirements. It allows you to use arbitrary functions to describe data requirements. That way you are not limited to static, compile-time-only descriptions of data flowing through your program.
I used Clojure on a project while spec was still alpha/beta or something, so I never used it. It does sound interesting, but I'm skeptical. Even the way you described it- I'm still just writing a function to validate my data, aren't I? Is that truly any different than just calling `validateFoo()` at the top of my functions in any other language?
There's more power than that in spec. For example, you can globally define :address/zip to be a string that's more than, say, N characters long. Now anytime you encounter an :address/zip, regardless of whether it is inside, say :billing-address or :shipping-address dictionary/map, it can be checked for those function predicates.
If D had unlimited access to C code at CTFE then maybe you could implement refinement types at compile time as a library, but for now you are stuck with what Walter has posted. I implemented a sketch of the API for this ages ago, but the actual type checking had to be done at module load time because of the aforementioned (justified) restrictions on CTFE.
I love that! So it's the `alias s this` that's the magic here, I assume? It's making the `this` pointer actually point to the stored field? That's awesome!
Yeah, pretty much. I know such things exist on the periphery, but I think it's extremely disappointing that we keep churning out "new" languages that are basically "Here's C again, but with one interesting feature from the 70s!"
I worked on a Scala project once several years ago and I really didn't "get it". The more I use Kotlin, the more I wish it were Scala.
It’d also be handy to put other limits on my numeric variables, for instance to automatically throw an exception if an angle delta goes out of the range -180 to 180, or whatever.
Yes, it is handy. This is why I like Ada (or the idea of it, it's rarely been used in my work because it's hard to sell others on) for safety critical systems. With SPARK/Ada you can even work towards proofs that your code won't assign outside of that range so that you don't end up with runtime exceptions.
>how often have you ever written a function that requested a string as input and actually wanted an empty string?
To be fair, I do it quite often. Most of the strings I deal with in my code are coming from user input, and most of them are optional. They are usually just passed to/from a database. If the string has some internal meaning (like URLs or file paths), it usually gets wrapped in an object anyway.
If you're processing some formal language or network protocol, that's another story.
Let me ask you this, though. If your strings that come from user inputs are optional, doesn't that mean they could also just not be present (as in null)? Why do you need or want two different ways to express "nothing"? Are all of the text fields just funneled right into the database without checking/validating any of them? I've written a number of RESTy/CRUDy APIs and I can't count the number of "check that username isn't empty" checks I've written over the years.
The argument for disallowing nulls is much stronger than the argument for demanding a compiler-enforced non-empty string. I definitely support the ability to declare variables, including strings, as non-nullable. An empty string is simply analogous to the number 0. It doesn't really overlap in meaning with null. It's true it would be useful to occasionally disallow the number 0, but only very occasionally. The obvious example is division, but having a representation of +/- infinity alleviates some cases.
> I've written a number of RESTy/CRUDy APIs and I can't count the number of "check that username isn't empty" checks I've written over the years.
Paraphrasing: "I've written the same kind of method over and over throughout my career and have been unable to (or made no attempt to) abstract it away." I love strong type systems, but it doesn't sound like the type system is your problem here. The problem is that you're constantly re-implementing the same business logic.
> The argument for disallowing nulls is much stronger than the argument for demanding a compiler-enforced non-empty string. I definitely support the ability to declare variables, including strings, as non-nullable. An empty string is simply analogous to the number 0. It doesn't really overlap in meaning with null. It's true it would be useful to occasionally disallow the number 0, but only very occasionally. The obvious example is division, but having a representation of +/- infinity alleviates some cases.
Well, of course you should be able to declare something non-null. What do I look like, someone who likes Java? :p My point wasn't that I WANT to use null instead of an empty collection/string, it was that our languages give us multiple mechanisms by which we can pass in "nothing" for strings/collections, but they give us zero ways to ask for a non-empty string/collection. That's super frustrating! Yes, of course "null" and "empty set" are technically and semantically different. But they're close enough that you could actually deal with having only non-empty strings + nulls and be able to mostly express what you want. That's not the case if I actually want a non-empty string in many of today's languages. Not if I want it to be usable with other APIs and the standard libraries, that is.
> Paraphrasing: "I've written the same kind of method over and over throughout my career and have been unable to (or made no attempt to) abstract it away." I love strong type systems, but it doesn't sound like the type system is your problem here. The problem is that you're constantly re-implementing the same business logic.
Eh, no. I haven't worked in the same language or on the same project for my whole career. So, yeah, I've noticed that I'm pretty much always either defining a bunch of boilerplate types up front or I regret not doing it when a 0 hits the database because someone wasn't careful with their math or had an off-by-one error.
Maybe I should publish a book a la Gang of Four and call it "Static Type Patterns". ;)
So, yeah. Believe it or not, I HAVE implemented stuff like PositiveInt and NonEmptyString a bunch of times in a bunch of languages. And it's better than not having it, but it still sucks because in several of those languages, it means that I have all kinds of noise converting to and from, e.g., the native string type. And that's because most of the above languages have no concept of "newtypes" and have no intention of letting programmers define or refine "primitives".
It's not really about "business logic". It's about "I know how to describe the shape of this data, but my statically typed language won't let me."
But, the reason I'm bitching about it is because, in most languages, you then cannot use your Username type in place of a "native" string, where some third-party or standard library function just expects a string. So now you have to convert back and forth.
And if this is a language like pre-record Java, fucking forget it. Define the class, implement equals(), implement hashCode(), write a getter for the wrapped string so that you can pass its guts to functions that expect strings, write the constructor. Speaking of the constructor, do you let the constructor throw an exception? Do you make the constructor private and write a factory function? Does that factory throw an exception or return a failure value? Checked exception or unchecked exception?
Now do that everywhere that you want a "newtype".
Is it possible? Absolutely. I've done it. Is the amount of effort for such a simple concept reasonable? No.
And my other point is that I actually want a non-empty string MUCH more often than I want a potentially-empty string.
I think that programmers are especially prone to just internalizing bullshit and papercuts. We like "solving puzzles" and we're pretty smart and adaptable. So when we encounter something that's actually kind of insane, but we eventually figure out a workaround, we completely forget that it was ever insane in the first place (see: Gang of Four patterns).
Have you worked with Kotlin yet? Since 1.5 Result<T> is a valid return type. Inline classes are thin wrappers and compiled out. Data classes automatically give you a toString method. With sealed classes you can implement ADTs.
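A quick sketch of how those pieces fit together (the names are made up):

    @JvmInline
    value class UserId(val raw: Long)              // inline/value class: thin wrapper, compiled away in most positions

    sealed class Lookup                            // sealed class hierarchy as an ADT
    data class Found(val id: UserId) : Lookup()    // data classes: equals/hashCode/toString for free
    data class NotFound(val key: String) : Lookup()

    fun parseId(s: String): Result<UserId> =       // Result<T> as an ordinary return type (1.5+)
        runCatching { UserId(s.toLong()) }         // toLong() throws on bad input; runCatching wraps it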
I have. I generally like Kotlin, but even that makes it a little too cumbersome to work with "newtypes" such as a NonEmptyString type. Here's what I do for a NotBlankString in Kotlin 1.5+:
    @JvmInline
    value class NotBlankString private constructor(private val value: String) : CharSequence {
        override val length: Int
            get() = value.length

        override fun get(index: Int): Char = value[index]

        override fun subSequence(startIndex: Int, endIndex: Int): CharSequence =
            value.subSequence(startIndex, endIndex)

        override fun toString(): String = value

        companion object {
            fun of(value: CharSequence): NotBlankString? =
                if (value.isBlank()) null else NotBlankString(value.toString())
        }
    }
The problem is that most APIs in Kotlin and Java (including the standard library as well as almost all third party libraries) work specifically with String, which is a final class. So, using my NotBlankString is a pain in the ass because I have to explicitly call toString() for most APIs.
Also, I do implement CharSequence, because String does. But CharSequence is a terrible interface and we should probably just pretend it doesn't exist. I somewhat regret even acknowledging its presence.
One of the limitations of value classes is that they cannot implement an interface by delegation, either. So I have to implement CharSequence by hand, instead of writing `: CharSequence by value`.
If you define a "newtype" in Kotlin by using a value class, don't forget to override toString to call value.toString(). By default, it's going to print like a data class format: "NotBlankString(value=foo)"
But, overall, this is much better than the situation in many languages. It's still just awkward enough, though, that I think a lot of people don't bother.
In my opinion, a statically typed language should HIGHLY prioritize the ergonomics of defining and using custom defined types. Ideally, I would be able to declare somehow that NotBlankString can do everything a String can do, and therefore be able to pass my NotBlankString type into any function that asks for a String. It would also be better for ergonomics if I could define my own type refinement, instead of needing to call a constructor- kind of like how Kotlin does "smart casting" with null and sealed types:
    val s: String = getSomeString()
    if (s is NotBlankString) {
        doStuff(s)  // takes a NotBlankString
    } else {
        doOtherStuff()
    }
Thank you very much for the detailed writeup. Several new things to learn for me in there. Not being able to use delegation here is a bummer, would be a great fit.
While trying to write up a clever solution using [0] I stumbled over String.isNullOrBlank()
I hear your dismay, but you could easily build your own library of string validations, which you can extend however you want and reuse as much as you need.
In some languages you can get a choice. Broken record time, but Ada:
    type Byte is mod 2**3;
This will permit any value in the range [0,7] and when you exceed it (in either direction) it will wrap around (the desired action if you choose this type). In contrast:
    type Byte is range 0..7;
Will give you a runtime error when you exceed the bounds by trying to increase beyond 7 or decrease below 0. Having this choice is nice, you get to decide the semantics for your system.
Ada's good, but it will never win. The open source compilers are solid, but no one wants to learn it. It's (depending on who you ask): Too old, too verbose, BDSM programming, not Rust, not Haskell, not C.
It has a lot of positive features going for it, but it is verbose. That verbosity is a major distraction for people who can't handle it, they want their very short keywords or, better, no keywords just symbols. Curly braces are somehow better than begin/end even though begin/end really aren't hard to type. Ada shines, particularly, in the long tail of system maintenance, not in the writing (arguable: the type system certainly helps a lot in the writing, the syntax doesn't). So I press for it where it belongs, and don't where it doesn't. But when someone laments the state of type systems, I point it out.
The verbosity can only be removed at a superficial level. Many people object to begin and end, but they'd chafe even more if they got further. For good reasons (Ada leans toward explicit declarations over implicit ones) you have to list each subprogram you depend on in a module by name, even if you don't import its symbols. And you have to explicitly instantiate generics (like C++ templates) rather than the compiler inferring your intention. Things like that cannot be removed from the language, so only a superficial improvement can be achieved, and for what benefit?
It's difficult to design whole languages (you won't get anywhere close to adoption in production at important places in under ten years, I guess; and that's if you're lucky, of course). Having something that has already proved its merit, with strong foundations, is for sure a good starting point for a "new language" (as it would streamline the process of creating such a "new" language drastically).
People are craving powerful, safe, fast languages. So there is a market.
If it's "only" about the verbosity and some cumbersome edges this could be fixed with some "overlay syntax".
To be honest, I'm one of those people that looked a few times at Ada as it was said to be powerful, fast, and safe. But it looks so awkward in my eyes, even though I know I should not judge languages by syntax! The look-and-feel is just a strong factor. I guess it's not only me…
So I suspect a syntactic make-over could give new life to an old but powerful language. Just by making it look more "modern" (whatever this means).
Definitely runtime, potential for compile time. The compile time checks will at least prevent obvious cases (assigning a value out of the range like using either Byte type above: T := -1 will get a compile time error). Using the second Byte type, paired with SPARK/Ada, this bit of code should set off the proof system and prevent compilation:
    procedure Foo is
       type Byte is range 0..7;
       T : Byte := 0;
    begin
       T := T - 1;
    end Foo;
(Not that that's useful code, but a basic example.) That shouldn't make it past the SPARK/Ada system to compilation. Now change it to this:
    function Foo(T : Byte) return Byte is
    begin
       return T + 1;
    end Foo;
and SPARK/Ada should warn (been a bit, but should also fail to compile) that this could cause overflow.
> subtracting one from zero and getting max_uint can be its own brand of fucking horribly awful.
Agreed. Silent wrap-around on overflow is also a really terrible idea.
> having the language itself throw in that circumstance can also be exactly what you don't want.
Then you have to check before you do it. Or call a special overflowing function. The default should absolutely be a crash. That wouldn't even really be anything new: what happens when you divide by zero in most languages? Index an array with an index > length? Should the array index just wrap around and give you the element at index % length?
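For what it's worth, the JVM already lets you opt into crash-on-overflow arithmetic via Math.addExact and friends; it's just not the default. A tiny Kotlin illustration:

    fun wrappingIncrement(x: Int): Int = x + 1              // silently wraps: Int.MAX_VALUE becomes Int.MIN_VALUE
    fun checkedIncrement(x: Int): Int = Math.addExact(x, 1) // throws ArithmeticException on overflow instead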
Statically typed languages give you u8/i8 types of numbers.
Maybe having a non empty string, or non empty list, type is useful now and then, but in practice, just have your code work just as well on both empty and non empty values, and you're good to go.
I'm pretty happy with the languages we have today overall (Kotlin and Rust at the top, but C#, Swift, and Java get an honorable mention).
Your comment kind of galvanizes my view that most of us suffer from Stockholm Syndrome with respect to our programming languages.
As another commenter said, some statically typed languages give you unsigned numbers. Maybe most of them do. But definitely not some of the most popular ones. And out of the ones that do, they are often really unhelpful.
C's unsigned numbers and implicit conversions are full of foot-guns.
Java basically doesn't have unsigned ints. It does kind of have this weird unsigned arithmetic API over signed ints, but it's really awkward and still bug-prone to use.
Kotlin's unsigned numbers API is very poor. Very. Kotlin does not give a crap if I write: `Int.MIN_VALUE.toUInt()`, so it's perfectly happy to just pretend a negative number is an unsigned number. Not to mention that unsigned number types are implemented as inline/value classes which don't even actually work correctly in the current version of the language (just go look at the bug tracker- I literally can't use value classes in my project because I've encountered multiple DIFFERENT runtime crash bugs since 1.5 was released). It, like Java and C, etc, is perfectly happy to wrap around on arithmetic overflow, which means that if you didn't guess your types correctly, you're going to end up with invalid data in your database or whatever.
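For concreteness, this is the kind of silent conversion I mean:

    fun main() {
        val u: UInt = Int.MIN_VALUE.toUInt()  // compiles and runs without any complaint
        println(u)                            // prints 2147483648: the negative value silently reinterpreted
    }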
Rust and Swift have good unsigned number APIs.
Notice also that I didn't say anything about non-empty collections. Yes, I absolutely want non-empty collection types, but that is nowhere NEAR as important and useful as non-empty strings, even though they might seem conceptually similar. I'm willing to assert, like the bold internet-man I am, that you and all of the rest of us almost NEVER actually want an empty string for anything. I never want to store a user with an empty name, I never want to write a file with an empty file name, I never want to try to connect to a URL made from an empty host name, etc, etc, etc. It is very often perfectly acceptable to have empty collections, though. It's also very frequent that you DO want a non-empty collection, which is why we should have both.
We don't even need empty strings (or collections, really) if we have null.
You say you like Kotlin and Rust and I work with both of them extensively. I can point out a great many shortcomings of Kotlin in particular. I used to be enamored with it, but the more I use it, the more shortcomings, edge cases, bugs, and limitations I run into. Rust is pretty great, but even that has some real issues- especially around how leaky the trait abstraction is. But at least Rust's excuse is that it's a low-level-ish systems language. It's these "app languages" that irritate me the most.
> We don't even need empty strings (or collections, really) if we have null.
This feels exactly backwards to me: I almost always want my sequence-like types to have a well-defined zero-length element, and I almost never want to allow a NULL value for a variable. NULL is so much worse than [] or ''. Think about concat(). When the trivial members of a type support most of the same behaviors as the nontrivial ones, that makes error checking so much easier.
Of course you're right! That statement was intended to be hyperbolic. I'm not actually suggesting that it would be a good idea to NOT have zero-sized collections.
I'm just exasperated that most of these modern, statically typed languages, give us TWO ways to write "nothing" (null and empty) and ZERO ways to write "must have something".
You COULD, theoretically, write a concat() that takes multiple, nullable, non-empty strings and returns a nullable non-empty string. Of course that's not ideal and would be horribly unergonomic. But you COULD do it. But how do you write a concat() that statically guarantees that if any of its arguments are non-empty that its output will be non-empty? You pretty much don't.
Don't get me wrong, I mostly like working in Kotlin. But the more I do, the more I realize what a hodgepodge of features it really is. A bunch of features don't really work together that well. And a bunch of features are only 80% (or less) of what I actually want.
For example, what if I don't like data class's stupid copy() method? How do I opt out? You can't.
Value classes can't be used as varargs, can't implement an interface by delegating to the wrapped value. (Forgetting the fact that they're just buggy as all hell and I keep getting runtime crashes from them being overly aggressively optimized away)
And it, of course, inherits a ton of badness from Java. Like the crappy type-erased generics, lack of type classes, etc. It chose to just use Java's standard collection types and encourage copies instead of using persistent collections by default. It hides the mutability under "read only" interfaces, but that's not at all concurrency-safe. Map is not a Collection or an Iterable, which is dumb. Map<K, V>'s type parameters are also allowed to be nullable types, which means that Map.getOrElse{} is actually wrong in the standard library.
Since Kotlin chose nullable types instead of Option<T>, you can't express "nested" emptiness. This means that the Map API is janky if you need to store nullable types. Map.get returns null if there was no value stored, but what if the value stored IS null? Then you have to call Map.contains(key), which means you have to hash the key twice to reliably pull out values.
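A small example of that jankiness:

    fun main() {
        val m: Map<String, String?> = mapOf("k" to null)

        println(m["k"])        // null: but is that "no entry" or "an entry whose value is null"?
        println(m["missing"])  // also null, so you can't tell the two cases apart from get() alone
        println("k" in m)      // true: the disambiguating containsKey call hashes the key a second time
    }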
Belated thank you. I've been chewing on your observations. You're absolutely right about the papercuts.
I have to ponder the (lightweight) value objects stuff. And
> you can't express "nested" emptiness
Spot on.
I dislike all the current null mitigations: nullability (question marks), @nullable, Optional<?>.
My hobby language takes a different path. All nulls are actually Null Objects (under the hood). So method chaining cannot break.
re: type classes
Ya, a nice reminder that I need to learn Haskell. Stuff like that is still beyond me. I hope innovators continue to noodle with more accessible Haskell, OCaml, etc.
Being late in my career, I'm motivated to create a better "blue collar" programming language for 99% of the work I've done. Meaning data processing. Ingest data, mostly strings, munge stuff, spit out results.
Another problem is that no language hits the sweetspot of a truly general purpose language. For example Rust doesn't allow freestyle functional programming (Haskell relies on a garbage collector for a reason), whereas at the other end of the spectrum Haskell doesn't allow precise control of CPU usage.
You could have a language that offers both in different parts of the program. Just like Rust has an "unsafe" keyword, you could have a "garbage_collected" keyword or a "functional" keyword.
>GC can be largely pauseless for many practical purposes and a GC language can be within 2x-3x the performance of C-like languages.
To that end, it seems like only recently we've seen automatic reference counting [Obj-C, Rust, Swift] and/or compile-time garbage collection [Mercury] in a non-toy implementation. "Breakthrough" is a difficult word because it refers to discovery and impact, with the latter coming long after the former, and it's not clear if ARC is really a game-changer for any serious applications, but it seems interesting at least.
That is atomic reference counting, not automatic reference counting. With automatic reference counting, you do not need to wrap the variables, and you do not need to increment or decrement the counter. Rust requires that you actively make your values reference counted by wrapping them explicitly, and makes you bump the count explicitly. It uses RAII to decrement the count automatically though.
Seems like a good way forward wrt memory management and concurrency: using ASAP inside a component, and delegating concurrency and component cleanup to Composita.
I feel like Kay has taken a rather too narrow view of what counts as programming. IMO here are the breakthroughs in the last 20 (ish) years:
1. Stack Overflow - search for your problem, copy and paste the answer.
2. git - Revision control that is low enough overhead that you need to have a reason not to use it.
3. Open-source software as a commodity - Unless you've got very specific requirements there's probably a system that can do most of your heavy lifting. (Almost) no-one writes their own JSON parsers or databases or web frameworks. Using open-source software is low overhead and low risk compared to engaging with some vendor for the same.
4. Package managers - By making it easy to include other peoples code we lower the bar to doing so.
The common thread here is code-reuse. None of the above are programming languages, but all have driven productivity in building systems and solving problems.
Code reuse was the holy grail of the 90's. People expected classes to be the unit of code reuse, then enterprise services, then after a dozen years of disillusionment we finally got the recipe right.
Actually, everything is rewritten over and over, be it because of the language used, the frameworks, or the architectures.
Polyglot runtimes that could enable true code reuse at least across language boundaries (like GraalVM) are just emerging.
For the higher-level building blocks, though, there's still nothing that enables efficient reuse. (People try, of course. So mentioning micro-services was no joke, actually.)
It's a blessing and a curse. I think software would be better if coders read man pages and other documentation more often (and standards like RFCs where applicable).
> git - Revision control that is low enough overhead that you need to have a reason not to use it.
RCS - 1982
CVS - 1990
They are limited compared to git, but they perform the main function: tracking changes in text files, letting you see previous versions, a diff for each change, and commit messages. CVS, compared to tarballs for each release (or worse, a mess of .bak, .bak2, etc. files), is a breakthrough. Subversion, Mercurial, and git are IMHO just evolutions of earlier VCSs.
> Package managers - By making it easy to include other peoples code we lower the bar to doing so.
CPAN - 1993
FreeBSD pkg_add - 1993
> Open-source software as a commodity
Here I fully agree. Open source started to get some traction 20ish years ago (probably thanks to more widely available Internet access and support from corporations like IBM), but its use is still growing.
When I look back it seems to me that 1990s were very fruitful and the next 20 years progress in software was somewhat slower, but progress in hardware enabled previously impossible stuff without revolutionary changes in software.
The point was that these were either dismissed or weren't considered by Kay when describing breakthroughs in computer programming. They aren't breakthroughs in computer programming languages, but IMO are breakthroughs in computer programming.
> RCS - 1982 CVS - 1990
I don't accept CVS as a breakthrough in the same way as git has been. Back in 2000 - 10 years after CVS - using source control wasn't a given. We had articles like "The Joel Test"[1] encouraging teams to use source control. CVS was a pain to set up and limited once you did. Thanks to git (and DVCS in general) using source control is the default for 1 person throwaway projects up to projects with thousands of contributors and million lines of code.
SVN fixed a lot of the issues with CVS, FWIW; setup, maintenance, robustness were much improved. But, yes, git was a major breakthrough. Even if you ignore the advantages of a DVCS and git's raw speed, git provides the ability to branch freely, with reasonable confidence that you can merge without spending days reconstructing your divergent source ("merge hell").
> We had articles like "The Joel Test"[1] encouraging teams to use source control.
IMHO source control is used more widely nowadays not thanks to git per se, but thanks to the availability of free VCS hosting platforms like github.com
Subversion IMHO is more beginner friendly than git, but AFAIR we didn't have good Subversion platforms (with cheap or free private repos). Sourceforge added SVN in 2006 (which is late), but the SourceForge GUI is an abomination and there are reasons not to trust it: https://en.wikipedia.org/wiki/SourceForge#Controversies
I suspect this has a lot to do with the architecture of git allowing relatively "dumb" servers. To run a git server, you only need the capacity to receive, serve and compare git hashes and blobs.
For SVN at least (last I used it), the server is expected to perform all sorts of potentially expensive operations (esp. for large repos): diffs, merges, branches, etc. since the client does not have the full repo/history. Given at the time computing power was less cheap, it would mean that hosting SVN services incurred a non-trivial cost.
(And IIRC CVS was so bad that I don't think anyone in their right mind should actually try to host a free/cheap service around it.)
Also, the initial versions of SVN were released around 2004, while git 1.0 was technically in 2005. It took a while for people to get used to git, but given its technical superiority (and its momentum given Linus' approval) and being less demanding on the server side to host, hosting for git was a correct decision to make anyway.
YouTube. My son learns a lot of his programming from searching YouTube and watching videos. Doesn't seem like it would be high density, but I'm amazed that there are really good videos/tutorials on pretty obscure topics. And since all YouTube videos have dates, it's actually easier to find current tutorials.
Are those tutorials good because they're videos, or despite being videos? I'm inclined to believe the latter. What sort of content is there in programming lessons that wouldn't be better presented as text?
I think Kay's complaint about engineering rigor ignores the explosive growth of programming. Sure, bridge-builders have rigor; there's also probably about the same number of them today as there were 50 years ago.
The number of programmers has grown by at least two, maybe three orders of magnitude over the last half century. And more importantly, almost anyone can do it. A kid whose closest approach to structural engineering is building a balsa bridge for a weight competition can also write and release an app for Android or iOS that will be seen by millions. Even if it's not a success, it's still just as real a program as MS Office.
That level of access guarantees amateur-level code, and the rigor Kay is suggesting would kill the software industry as we know it.
Yeah, and I'll tell you, as someone who gets to look at a lot of those CAD models that are supposedly introducing rigor, they're often in exactly the same kind of condition as internal codebases.
> That level of access guarantees amateur-level code, and the rigor Kay is suggesting would kill the software industry as we know it.
I don't believe this follows. The level of rigor he's lamenting could be constrained to certain categories of software, based on their impact or information content. Amateur or informally specified systems can still satisfy everything else. There is no reason for most systems that don't touch PII or critical systems or other similar categories of software to have serious engineering behind them if people don't want them to.
Sure, I meant if that rigor was applied to the whole software industry, not selectively, exactly as you say. And that level of rigor is applied sometimes. The most rigorous example I know of is the Space Shuttle guidance system (I think that was it; I read about it twenty years ago). Two independent teams write two entirely separate programs, and then in practice (again, from memory) two versions of program A run, and if they disagree program B is the tie breaker.
Also their QA process was completely adversarial. Finding a bug was a major success for QA, and a major failure for the dev team. They found something crazy like 1 bug per million lines of code.
> ...the rigor Kay is suggesting would kill the software industry as we know it.
You say that like it's a bad thing.
Anyone's plans to make software sane would kill the software industry as people love and hate it. A substantial portion of this industry involves supporting horrific abortions of systems that seem to live far too long. That includes the functional programming people and anyone who believes some method of theirs will produce an explosion of productivity. Hopefully, the effect will be people quickly rewriting old systems to be sane and creating many new systems.
Unfortunately, such starry-eyed idealism is unlikely to be realized, and the rolling-a-thousand-pounds-of-jello-up-a-hill jobs are safe. But this kind of idealism is still needed to motivate the systems builders, so it's not a complete loss.
Well, rigor and quality in the building industry are location-dependent. I've read the blog of a builder who describes how architects regularly produce dangerous (too thin or simply missing load-bearing beams) or straight-up impossible (a gable of negative size, yep) designs. The solution is that the builders just build whatever makes sense, and sometimes the contractors simply don't notice.
I'm ignorant in this domain, but wouldn't it be up to the structural engineer to make sure the plans are sound? I always thought that architects dream it up and engineers are responsible for the physics.
My problem with this is that it implicitly assumes languages are the unit of progress. It's unlikely that we will get progress solely by coming up with better languages; it's more that different people invent different tools to suit different problems. In the explosion, we are more likely to find what we need.
I don't think I've ever used just one language for a project.
The progress at least personally is that there are now so many resources to use a bunch of different languages to knit together a solution that a lot more can get done. You can write your low latency code in a variety of languages, and ship the data via a web server to a browser or a mobile app. For every part you have several choices of tech to use, and they're realistic choices with lots of help online. Almost everything is available for free, so you can try all sorts of libs without committing to one. There's no longer a need to sit down with a textbook and learn a bunch of stuff before starting, you can just jump in with a brief tutorial and explore the language or lib as you discover them.
The plethora of choices also teaches you a lot about languages in general, because you see so many that you can start to make generalizations.
LLVM could very well be thought of as a major breakthrough. The intermediate representation format has enabled so many compilers and languages now that it's pretty insane, in my opinion.
Metamine represents the latest breakthrough in programming: it offers a mix of traditional declarative programming and reactive programming. The "magical equals", for lack of a better term, lets you do "reactive evaluation", the opposite of lazy evaluation.
If any of the terms that a term depends on changes, its result is updated, then everything that depends on it, and so on. You can use the system clock as a term, and thus have a chain of things that update once a second, etc.
Being able to use both reactive and normal programming together without breaking your brain is a whole new level of power.
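The evaluation model is easy to sketch. Here's a hedged TypeScript toy (Cell and computed are my own names, not Metamine syntax) where changing any input eagerly recomputes everything downstream:

    type Listener = () => void;

    // A mutable value that tells its dependents whenever it changes.
    class Cell<T> {
      private listeners: Listener[] = [];
      constructor(private value: T) {}
      get(): T { return this.value; }
      set(v: T): void {
        this.value = v;
        this.listeners.forEach((fn) => fn()); // push the change downstream
      }
      subscribe(fn: Listener): void { this.listeners.push(fn); }
    }

    // The "magical equals": out = f(inputs) stays true as the inputs change.
    function computed<A, B>(inputs: Cell<A>[], f: (xs: A[]) => B): Cell<B> {
      const recompute = () => f(inputs.map((c) => c.get()));
      const out = new Cell(recompute());
      inputs.forEach((c) => c.subscribe(() => out.set(recompute())));
      return out;
    }

    const a = new Cell(1);
    const b = new Cell(2);
    const sum = computed([a, b], ([x, y]) => x + y);
    sum.subscribe(() => console.log("sum is now", sum.get()));
    a.set(10); // prints "sum is now 12"

A clock could then just be a Cell that something ticks once a second, and everything computed from it updates along with it.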
It's brilliant stuff, and it seems to have been yoinked from the internet. 8(
Haskell is great at mutating variables. Reactive programming was even pioneered in it. It's just that the Monads that allow you to do this are somewhat like sticky tar - everything they touch becomes a part of them.
The traditional structure of Haskell programs is to build a pure functional 'core', and layer around that the 'sticky' parts of the code that need to interact with the outside world.
Haskell has do notation and monads (F# has computation expressions, which are similar). These allow you to implement things as libraries that in most other languages would require changes to the compiler. Lisp macros can do it too.
You could push the reactivity monad right down to the core of your application to get the benefits of this reactive language.
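For readers who haven't met this idea, here's a rough TypeScript sketch of the "implement it as a library" point: an Option type whose chain() gives early-exit sequencing with no special language support (Option, some, none, and chain are made-up names for illustration):

    // A value that may be absent, plus a way to sequence computations on it.
    type Option<T> = { kind: "some"; value: T } | { kind: "none" };

    const some = <T>(value: T): Option<T> => ({ kind: "some", value });
    const none: Option<never> = { kind: "none" };

    // The monadic bind: run the next step only if the previous one produced a value.
    const chain = <A, B>(o: Option<A>, f: (a: A) => Option<B>): Option<B> =>
      o.kind === "some" ? f(o.value) : none;

    const parse = (s: string): Option<number> =>
      Number.isNaN(Number(s)) ? none : some(Number(s));
    const recip = (n: number): Option<number> => (n === 0 ? none : some(1 / n));

    console.log(chain(parse("4"), recip));    // { kind: "some", value: 0.25 }
    console.log(chain(parse("oops"), recip)); // { kind: "none" }

Do notation (or F# computation expressions) is essentially syntax that flattens those nested chain() calls; the semantics live entirely in the library.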
Since I left university (eighties) I have only been positively impressed by two languages. One was the Wolfram language. I haven't used it; I'm just going on the demo here, but the idea of having not just a powerful language, but also a massive database with useful information to draw from, seems to elevate it above the usual sad collection of new ways to spell variable declarations and loops.
The other is Inform. I haven't used that either, and of course it is highly domain-specific, but within that domain it seems a pretty damn cool way to write text adventures.
All of the graphical programming systems seem to fail the moment they scale up above what roughly fits on one screen. Labview is a disaster once it is more than just a few boxes and lines.
And everything else is, as far as I can tell, just the same bits and pieces we already had, arranged slightly differently. We were promised hyper-advanced fifth-generation programming languages, but the only thing that seems to come close is Wolfram, and that's hardly mainstream.
I'm occasionally wondering if the whole field might not improve mightily if we stopped focusing so much on languages, and instead focused on providing powerful, easy to use, elegant, well-documented APIs for common (and less common) problems. Using something like OpenSSL, or OpenGL, or even just POSIX sockets, is just an exercise in (largely unnecessary) pain.
It sounds like you and Alan Kay are both expecting novel problems to be solved by new programming languages. That is an extremely inefficient way to do it: you need to come up with new compilers, documentation, standard libraries, communities, etc. Instead, programming languages have become general enough that most new problems are being solved within existing languages, instead of by inventing new ones.
I have no idea what a "hyper-advanced fifth-generation programming language" is even supposed to look like.
> I'm occasionally wondering if the whole field might not improve mightily if we stopped focusing so much on languages, and instead focused on providing powerful, easy to use, elegant, well-documented APIs for common (and less common) problems.
But a programming language is nothing but a well-documented API! How would your suggested solution even differ from a programming language?
That's not quite it. Let me explain... The first computer language I used was BASIC. It was excellent for what it was, but clearly it had a limit on how much you could do with it, on account of it missing amenities like named functions, variable scopes, etc. A 'function' was simply a line further down the program that you branched to using GOSUB. And there was only one scope, which was fine since the space available for actually writing programs was a microscopic 23KB anyway. It had for-loops, but GOTO was still very much present as a tool to control program flow.
Moving on from there, I learned Pascal, which was clearly a major step up in terms of what you could achieve. Using the primitives available in Pascal (structured programming, named functions, scopes, etc.), it is possible to write larger programs than you can in BASIC: the higher level of abstraction makes it easier to reason about larger programs.
From there C++ was another step up: the ability to define objects and encapsulate a great deal of implementation detail is another weapon in your toolbox to combat chaos, thus allowing you to write ever more complex programs without losing control over what they are doing.
And then... there was nothing. There appears to be no step beyond object-orientation that lets you create even larger programs with even less effort. There may be languages that are syntactically easier than C++ (although after using it for a quarter century it no longer bothers me), but they just hide minor implementation details, at the cost of lower performance. That's not greater abstraction, it's just greater convenience.
What we were all hoping for was that next step: languages that provided an even higher level of abstraction, allowing you to create even larger programs without losing the ability to reason about them. This was what 5GLs promised, but we didn't get them because nobody could figure out what they would look like. Apparently we have reached the maximum level of abstraction that we can express in source code.
So the progression of computer languages looks somewhat like this: 1GL (plain assembly), 2GL (unstructured languages like BASIC), 3GL (structured languages like Pascal), 4GL (object-oriented languages), 5GL (not, as of yet, invented). Each level represents a clear step up in terms of abstraction, and after four steps we seem to have run out of steam, and are now mostly busy reinventing things we already had with slightly different syntax. To me, at least, that's a disappointment.
My disappointment with available APIs is perhaps simply because I program a lot in C++, which inevitably means having to deal with an anemic standard library (there's not even a standardized socket interface in there), and a wild array of C libraries. Some of these are very good, with an elegant interface design and excellent documentation, but many are just painfully bad, with virtually no documentation, and apparently every effort made to confuse the hell out of their potential users. Really, some of this could be so, so much better...
Isn't abstraction merely an "indoctrinated" kind of convenience?
Surely you can, with a huge amount of inconvenience, achieve in Pascal what you routinely do in C++, by passing "self" into function calls and manually implementing vtable lookups? :)
(You might have a point with C++ templates but that's just saving yourself the trouble of a bunch of copy and pasta....)
Similarly, the "2GL" to "3GL" transition merely gives you the convenience of not having to maintain a call stack. There's no magic; you just save yourself the trouble of writing boilerplate code for pushing and popping pointers and stack variables.
I would argue that the only reason it's considered a "paradigm shift" is that you were "indoctrinated" into thinking those advancements were so great that they were "more than" mere convenience. But, at least in retrospect, they are _trivial_ when compared with the advancements made in recent years (e.g. garbage collection, Rust's safe memory management, import tensorflow, etc.). And while the old-school programming language advancements surely boost programmer productivity and accuracy, I don't think it's fair to say they're more important than, say, the convenience of an "import fancypackage" that does 90% of the work for you.
Perhaps the form of improvement is often not in a new programming language (because perhaps classical languages are "good enough" for most cases), but I know I wouldn't enjoy programming with the tools available 20 years ago.
PS: I suspect your C++ background might also have tinted your perspective a bit. The examples I gave as more recent advancements (i.e. GC, Rust, TF) aren't (readily) available in C++. Sure, you can say you prefer the speed of C++, which is entirely fair, but you might be pleasantly surprised if you looked outside.
Indoctrination: perhaps it is, but I don't think so, since other people make largely the same distinction, as witnessed by a series of articles on wikipedia (https://en.wikipedia.org/wiki/Fifth-generation_programming_l...). Although they have a slightly different definition of 4GL than the one I used.
Ultimately Turing-completeness means there is no difference in what you can achieve with various languages. What matters are other things: the convenience of getting the work done, performance, etc. In my experience Pascal and C++ really do differ significantly, with C++ allowing you to automate a hell of a lot more than Pascal would.
I disagree that GC is a more meaningful step forwards than, say, structured programming - this is something that is now so pervasive that it is no longer recognized for the revolution that it really was. GC is just one method of freeing memory, but it's not the only one; C++'s RAII does the same, and tracks any resource you care for, not just memory. I totally agree on the importing though. Arguably vcpkg has been a game changer in that sense.
Programming 20 years ago wasn't all that bad, really. Figuring something out usually meant reading a book. There were fewer useful libraries around and they were less capable, but on the other hand, they also weren't as ridiculously complex as today's libraries sometimes are.
And I'll readily admit that C++ informs my comments ;-) I mean, I've been writing C++ since 1996 or so, and for most of that, pretty much full time...
What I've observed is that programming languages have significantly caught up with the math behind them. And it's not uncommon for some of the nearly-cutting-edge math to be hundreds of years old, or more. I recently looked into algorithms for finding the GCD of two numbers, and Euclid's algorithm from 300 BC is still basically the best way to do it (with an optimization for our binary representation happening in 1967). As programming languages catch up to the underlying mathematical theory, they're going to start progressing at the pace of math breakthroughs. That kinda sounds like a good thing to me.
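For concreteness, here is that ancient algorithm in a modern language (a TypeScript sketch; the 1967 binary-GCD refinement just swaps the modulo step for shifts and subtraction):

    // Euclid's algorithm: repeatedly replace the pair (a, b) with (b, a mod b)
    // until the remainder is zero; the last non-zero value is the GCD.
    function gcd(a: number, b: number): number {
      a = Math.abs(a);
      b = Math.abs(b);
      while (b !== 0) {
        [a, b] = [b, a % b];
      }
      return a;
    }

    console.log(gcd(252, 105)); // 21
    console.log(gcd(17, 5));    // 1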
I don't see object-oriented languages as the pinnacle. Notably missing from your list are functional languages, which I also don't see as the pinnacle, but rather "beside" OO langs. A 5GL to me is one that incorporates the best of each paradigm. I generally program in C# and TypeScript, and I'm constantly switching between an imperative style, an OO style, and a functional style. C# is almost my ideal language; its type system is just a little too weak. Perhaps it's a 4.5G language. But let's look at what you can do with it:
* Strong OO support
* First-class functions (functions-as-data)
* Strong support for reflection, allowing powerful IoC and types-as-data
* Good generic type support (reified! Yay!)
* Generators with yield
* Async/await flow programming, including "yield async" mixing the two
* Low-level bit manipulation, arrays of structs (certain memory guarantees), unchecked operations when you need them, and other C-style concepts
* LISP-style macro manipulation (code-as-data) using Expressions (under-appreciated IMO)
* No higher-kinded types :(
* LINQ, if you're into it ... but I prefer fluent because it's more extensible and language-idiomatic (although you can write your own limited LINQ-style methods)
Add in higher-kinded types and how could you not call that a 5GL? I honestly don't know what else you could possibly want from a programming language, except maybe Rust's borrow checking (which frankly is a recent breakthrough in computer programming, kinda challenging the premise of this whole thing).
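As a rough illustration of that paradigm mixing, here's a hedged TypeScript sketch (the other language mentioned above; Order, bigOrders, and loadOrders are made-up names) hopping between OO, functional, generator, and async styles in one small program:

    // OO: a plain class holding data.
    class Order {
      constructor(public id: number, public total: number) {}
    }

    // Functional: a pure pipeline over data with first-class functions.
    const bigOrders = (orders: Order[], min: number): number[] =>
      orders.filter((o) => o.total >= min).map((o) => o.id);

    // Generator: lazily produce ids on demand.
    function* orderIds(orders: Order[]): Generator<number> {
      for (const o of orders) yield o.id;
    }

    // Async/await: imperative-looking flow over asynchronous work.
    async function loadOrders(fetchOne: (id: number) => Promise<Order>): Promise<Order[]> {
      const results: Order[] = [];
      for (const id of [1, 2, 3]) {
        results.push(await fetchOne(id)); // sequential for clarity
      }
      return results;
    }

    // Usage with a stubbed fetcher (hypothetical data).
    loadOrders(async (id) => new Order(id, id * 100)).then((orders) => {
      console.log(bigOrders(orders, 150)); // [2, 3]
      console.log([...orderIds(orders)]);  // [1, 2, 3]
    });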
My next big idea is a programming language whose type system is written with the "same sauce" as the language itself. So the output of a "program" could be the type system for another program. You could write your software in "tiers" of progressively more strict/knowledgeable type systems. That's one thing I haven't seen before that sounds neat, but who knows if it's actually useful.
I think the biggest programming productivity boosts since 1984 haven't been about programming languages, but about tools.
Specifically, distributed version control and dependency management tools.
Being able to collaborate with developers anywhere in the world, and being able to pull in any library with a single declarative line in a configuration file, increases productivity more than any improvement to a programming language ever could.
Nice read. Personally, I am not as unhappy with the state of tooling. With LSP, many programming languages are getting better VSCode, Emacs, Vi, etc. support. XCode is actually nice to use, on a fast M1 Mac. I think wide adoption of deep learning is the biggest breakthrough in recent years.
EDIT: I have used Lisp heavily for 40 years, and I have done a few Smalltalk projects.
Early programming language advancements were about abstracting away the basic repetitive stuff (function call stack manipulation) and the hardware details (register selection, memory addresses). They did it in a way that was minimally "leaky"; debugging a C program you may be aware of the call stack and registers, but most of the time you'll be just fine working at the abstraction level of function parameters and local variables.
Since then we've added tons more boilerplate and hardware to the standard application deployment: it runs over multiple servers and clients using various network protocols, interacts with databases and file systems, etc. But modern solutions to these are mostly code generation and other leaky layers; it's likely you can't debug a typical problem without reading or stepping through generated code or libraries.
What I'd like to see in a new programming language is some abstraction of an application that has persistent data and is distributed, with the details being more or less compiler flags. And comes with debugging tools that allow the programmer to stay at that level of abstraction. But most new language announcements come down to some new form of syntactic sugar or data typing.
Maybe calling these "breakthroughs" is a stretch, but anyway...
Swift. I think that this is the best general purpose language ever created. By "best" I mean that it has the highest productivity (which is determined by readability, static safety, expressiveness, etc.) after normalizing for factors outside of the language spec, e.g. tooling, libraries, compilation and runtime speed.
React and Svelte. The first breakthrough was React, the second generation is Svelte and similar frameworks.
Async / await. This is a major improvement to the readability and mental-model simplicity of the most common type of concurrency code.
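A small TypeScript sketch of that readability win (fetchUser and fetchPosts are hypothetical stand-ins, stubbed so the example is self-contained):

    // Hypothetical stand-ins for real I/O.
    const fetchUser = async (id: number) => ({ name: `user${id}` });
    const fetchPosts = async (name: string) => [`${name}'s first post`];

    // Then-chaining: the control flow is spread across nested closures.
    function firstPostThen(id: number): Promise<string | undefined> {
      return fetchUser(id).then((user) =>
        fetchPosts(user.name).then((posts) => posts[0])
      );
    }

    // async/await: the same logic reads top to bottom, and errors use try/catch.
    async function firstPostAwait(id: number): Promise<string | undefined> {
      try {
        const user = await fetchUser(id);
        const posts = await fetchPosts(user.name);
        return posts[0];
      } catch (err) {
        console.error("lookup failed", err);
        return undefined;
      }
    }

    firstPostAwait(1).then(console.log); // "user1's first post"

Same behaviour either way; the await version is the one you can still read six months later.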
I think that programming is the most confused discipline there is. I've mentioned this in previous posts. In any other field there is usually an established body of knowledge, either in science or in professions like accountancy.
In programming we're still arguing about whether the debits should go on the left and credits on the right, or vice versa. By that I mean, facetiously, we're still arguing about what programming language to use.
Most new sciences - e.g. electronics - gather a mature body of knowledge relatively quickly. We know exactly how transistors work and how to use them, for example. Nobody argues about how to calculate the current flowing through a wire.
This, to some extent, happened in programming, especially early, but the matter still seems far from settled.
Some poster on here once noted that there are no new paradigms in programming. Their argument is that neither functional programming, OO, nor any other invention, as good as they are, constitutes a paradigm shift. A paradigm shift is so fundamental that it uproots our whole conception of how the universe works, like the shift away from using epicycles to describe the motion of the planets to using a theory of gravitation.
I've been tinkering around with microcontrollers lately, and it has given me a perspective that I suspect many don't have. I like C++, and I think it's suitable for high-level programming, due to things like deterministic destruction and niceties like strings and vectors.
But I've come to the conclusion that C++ is marginal, at best, on microcontrollers. Things tend to be far more static, so these niceties don't count for much. C++ also needs extra fiddling if you start from scratch; things like exceptions take effort to get working.
C is low-level. You get to build things block by block. If your microcontroller supports some nifty hardware acceleration feature, then you can use that.
And that's the thing. A high-level language is an abstraction. It can't decide what should happen at a very low level. So you have a convenience, but you also take a hit.
So perhaps looking for a better language is a chimera, because how you do something is dependent on what it is you're trying to do. It also means that C is never likely to go away as a language, and that C++ was a clever idea in that it built on C.
> Most new sciences - e.g. electronics - gather a mature body of knowledge relatively quickly. We know exactly how transistors work and how to use them, for example. Nobody argues about how to calculate the current flowing through a wire.
Ever since we made computers fast enough to implement basic things in the 1980/1990s, the debate has never been a "scientific" one. It's all about programmer preference and psychology.
The equations for electricity are relatively simple. Yet we don't have equations for programmer brains. The only way we can "scientifically" experiment with programming languages is to invent new languages/features and see how programmers react. Do they make fewer mistakes? Do they see increased productivity? Do they "like" it or even swear by it?
To a very large extent (more-so than most would imagine), trying to pin down "programming best practices" is like trying to set a standard for writing novels. You have common tropes and literary devices (aka "design patterns"), but the expressiveness of languages (both human and computer) makes it hard to make final, scientific conclusions.
The electricity-in-a-wire thing is really much simpler.
I wouldn't call it a breakthrough as the ideas are not new, but I consider Rust to be a major practical advancement that brings ideas into the mainstream that were formerly just for "academic" languages.
IMO it's the first viable alternative to C and C++ for systems programming ever, and its safety features represent a successful transfer of academia's provable-safety ideas into a practical language that people actually want to use for real stuff.
As for true breakthroughs I'd say the last big one was modern deep learning.
I'd say being able to program GPUs as general-purpose compute devices, and general SIMD vectorization in high-level languages, is pretty significant. It has opened up many applications, like machine learning, that were previously unavailable.
IMO some C++20 features like coroutines rank pretty high in introducing new ways of programming.
I have a question about that, and despite asking in several places, and reading lots of documents, I've never been able to find an answer that makes sense. Maybe you can help me.
I have an algorithm that I used to run on a SIMD machine. It's a single, simple algorithm that I want to run on lots of different inputs. How can I run this on a GPU? From what I read about GPUs it should be possible, but nothing I've read has made any sense about how to do it.
Can you point me in a suitable direction?
My contact details are in my profile if you'd like to email to ask for more information.
Hi Colin. There are a few ways to go about it; it requires getting some initial tedium done to get started. I would recommend the course https://ppc.cs.aalto.fi/ as a resource; it goes through the specifics of implementing an example data-crunching program using both a vectorized/SIMD CPU approach (ch 1-2) and a GPU approach (ch 4, using Nvidia CUDA specifically). Another approach would be to upload data to the GPU in dx/vulkan/metal/opengl buffers or textures and run shaders on them; there are plenty of resources out there, but I understand it's tricky to find a suitable one. Happy to discuss more.
I think you'd be better served by using something like OpenCL. Especially if it was already running on a SIMD core, it should be relatively straightforward to port to OpenCL. The mental model to follow would be to think of a GPU as a collection of identical SIMD machines that are programmed in a simplified version of C.
Thinking further: if you have never programmed GPGPU processors, you may be better off porting your code using NVIDIA CUDA (assuming you are targeting desktop graphics). It has better tools and overall a better ecosystem.
It always feels like Alan Kay is wistfully talking about how there could have been an alternate future where he and his ilk would do programming and computer engineering the "right way" and somehow the world has lost its path. He waxes eloquent about how he enabled the geniuses at Xerox PARC to do deep work and how they changed the world.
I find his talks inspiring, but I also find it tone-deaf that he doesn't understand the situation he was in, nor does he see how the world has changed. Xerox PARC was, in my opinion, a by-product of top-down "waterfall"-like business practices that he happened to be at the apex of. For most of the rest of us, we have to get by with minimal funding and try to push ideas to an oversaturated market.
What really irks me is how he still has this view that somehow, if all the world's intellectuals just got together or just got funding, they would come up with the next genius ideas and could deliver them to the rest of the world, like God's messengers.
Here's what he missed:
* Free and open source. It's not software that's eating the world, it's free and open source software that's eating the world. Most of the world's infrastructure runs on FOSS; without it, we'd be living in a developer hellscape of choosing which crippling Apple or Microsoft licensing fees we'd need to pay just for the privilege of compiling our programs.
* Git. Git has allowed project management and software sharing with an ease like nothing before. Even though GitHub is a proprietary enterprise, it has created massive value for the community through its ease of sharing.
* Javascript. Write once, run anywhere? Javascript is the only language that comes even close. Data representation that's portable? JSON is the only one that comes close. Javascript has its warts, but it brings the best of functional languages in a procedural language's skin. Javascript is the only reason I don't dismiss functional languages offhand, because Javascript actually makes those concepts useful.
* Docker (and perhaps some WebAssembly/Javascript environment). I think we're closing in on the idea that Linux boxes are the analogues of "cells", where each has its own local environment and they can be composed to form larger structures. Linux images may seem bloated now, but when space and compute pass a threshold of cheap/fast, it won't matter.
* GPUs. Moore's law is alive and well with GPUs. "Out of the box" gets you a 10x-100x speedup on algorithms ported from CPUs. I've heard of 1000x-10000x in some cases. It's not just raw performance; it's also designing data structures that work on GPUs, are more data-centric, and are designed for small, independent, and highly parallel workloads.
And I'm sure there are many more. I would add cryptocurrency but maybe that falls outside of the "programming" scope.
For all his preaching, Alan Kay still doesn't "get it". If there's a billion-dollar monopoly that has a small, independent research group with no ceiling on funding and Alan Kay at its head, great, I'm sure they'll come up with some wonderful ideas. But we don't live in that world anymore, and the playing field has been leveled because of the points above and many more (maybe even simply because cheaper compute is available).
It never really occurs to Alan Kay that maybe his type of research is not the only way. Better yet, maybe Alan Kay's style of research is gone because we've discovered better ways.
If Alan Kay really wants to see the world's next ideas birthed into existence, why isn't he at the forefront of free and open source, championing the software that enables people to do his type of fundamental research? If you really want a trillion-dollar ROI, invest a few million in FOSS.
Thank you for this comment, I found it to be tremendously useful in helping me understand some of the attitudes of the present day.
I did understand the situation I was in, and I do see that the world has changed.
Arguments against passionately held positions are usually fruitless, and I don't like debate as a form. I will point out e.g. that there was nothing at all "top down" about the Xerox Parc process (that can be easily checked).
Leaving out the "anti-elite anger", I think that another round of the ARPA-IPTO type funding and research community (of which Parc was a part in the 70s) would make an enormous difference in the richness and levels of ideas and technologies available for computing to choose from.
But let me note that the uptake of the 60s and 70s inventions by the larger field was a bit spotty and introduced a fair amount of noise when something was adopted at all. This would likely be the fate of many newer, better inventions than we were able to do 40 and more years ago.
The writer says: "For most of the rest of us, we have to get by with minimal funding and try to push ideas to an over saturated market." There's no question that the current side-conditions in commercial computing are stifling (and they were when I was a journeyman programmer in the early 60s: every important choice -- of problem, HW, tools, etc was already made and stipulated).
It's worth pointing out here that e.g. many of the most important inventions at Xerox Parc were done by a grand total of 25 researchers plus about an equal number of support folks. That represents a tiny percentage of "mad money" that most Fortune 500 companies would regard as "nothing". The cost in dollars is not the reason they don't invest in new inventions in computing, especially software.
One observation that I think obtains here, is that a very large percentage of computer people take much more joy in "devising" than "learning" -- and haven't tackled the idea that "lots of learning" will greatly uplift "devising".
I found the list above very illuminating in understanding where the author might be coming from. What's interesting is that it does represent "new things" that have appeared in the general world of computing -- and have happened after many of the ARPA/PARC etc. inventions.
I think that anyone who can find the perspectives to be able to criticize this list will also be able to see some of what has happened. What does it mean in the large that these are the solutions endorsed in the current day?
Something not on the list per se is the web and especially the web browser. Most computerists I talk to are unable to really criticize these, especially the latter. (The "new normal" seems to be inescapable "reality".)
I think a good one to pick on the list here would be "Docker" and "containers". There is a lot to be learned here, both about computing and people who do computing if this could be criticized deeply, and alternatives identified.
I find it interesting that the author purports to read my mind. This doesn't seem to be working.
>If Alan Kay really wants to see the worlds next ideas birthed into existence, why isn't he at the forefront of free and open source?
He was, he created Smalltalk, released the source code and made it as open as it can be. You can see the source of everything.
The only problem was that very few people were interested in that. Most children were interested in just one thing: games (the best quality they could get). And adults wanted to run professional programs on their inexpensive machines.
He just could not understand why people chose other languages, like C, that were anathema to him, but that let people extract all the juice from their cheap computers in order to create and play games, and also serious programs that previously ran only on mainframes.
As an academic with early access to machines costing "hundreds of thousands of dollars" each (and that's not accounting for inflation), he was too isolated from the rest of the world to understand what happened later.
What special access to the divine do you think you have that would encourage you to report as facts the last two sentences (which are actually quite false)? You are confusing your internal inferences with what is actually going on -- and to the extent that you feel you can tell others.
If you are trying to bluff your way through -- to appear knowledgeable when you aren't and can't be -- this is a behavioral syndrome that will not serve you at all well overall.
I really enjoyed this critique of AK, but you could really stuff Docker and Javascript into "portability", to include technologies that provide the same benefits to other focuses outside of application programming. Intermediate representations like LLVM and WASM, and even environments like Wine and the Windows Subsystem for Linux, have felt like major paradigm shifts.
Write once, run anywhere: I used to demo for LiveCode https://livecode.com/ at MacWorld and WWDC where I would code a basic app and build separate single-file standalones for Mac, Windows, and Linux, all while holding my breath.
I think to understand the "next" level of programming it's important to broaden our definition. "Programming" should be more like "defining a system to turn a specific input from a defined input set into a corresponding action or output."
That's too broad, because it includes the formula =A1*2 in Excel. But at some greater level of complexity, an Excel spreadsheet transforms from "a couple formulas" to "a structured tool designed to keep track of our orders and invoices" -- in other words, a program.
On that basis, the recent advances include spreadsheets, database tools, and JavaScript along with other scripting languages.
You can now solve previously intractable problems, like practical formal verification of big C/C++ codebases. There's no need anymore to write test cases, as formal verification tries all possible input values, not just a few selected ones. It also checks all possible error cases, not just the ones you thought about.
Could you expand on what you mean by this? Because it sounds like a level of hype bordering on nonsense.
1. How is a SAT solver going to do formal verification? How do you turn a formal verification into a SAT problem?
2. You can only formally verify what your formal verifier can handle, which is usually somewhat less than "everything". Can your SAT-driven formal verifier verify that the worst case response time of the system is less than X?
3. Formal verification tends to be slow. If I can write and run unit tests orders of magnitude faster than the formal verifier, then the formal verifier is going to not be used much.
SAT solvers will also be applied in traditional compilers soon.
The remaining problem with formal verification and compilation is proof of termination, which is not a SAT problem.
Running all test cases via CBMC is orders of magnitude faster than writing tests for 100% coverage (line coverage only, not covered values). It's like one day vs one month. And I've never seen code with full value test coverage.
For guaranteed timing you need special tests, not a SAT solver. The solver just helps with the proofs.
Even if everything you say is true, it's true the first time. Then I make a change to the code. Now if I have unit tests, I have to fix a test (maybe minutes, maybe an hour) and re-run the tests (minutes). Or I have to re-run the formal verification (a day).
The biggest shift lately is probably on the ops side. The roles of sysadmin and DBA hardly exist anymore as full-time jobs, and the goals of containerization and avoiding snowflake servers have lots of positive spill-over effects for the developer side as well.
Today I can snapshot the volumes of my containers, completely format my computer, download and reinstall Ubuntu from scratch, download and install any IDE and all the tools I need, clone a project from GitHub, npm install thousands of dependencies, spin up a docker-compose network with a distributed NoSQL database and a few services, commit and push an MR, and get relevant test results back from GitHub CI plus someone reviewing the code - all in one day. 10-15 years ago (without even considering the difference in download time), every step here would have taken at least a day, involving complicated configuration and often causing non-revertible modifications to the host system.
Containerization and sandboxing can also be seen as being big enablers for the mobile app stores and untrusted browser applications being able to deploy to clients instantly.
I'd also say UI frameworks are a big deal. Windows, Android, iOS, and the Web all have much more powerful tools than, say, Win32 or whatever old baseline one wants to compare with - especially declarative web front-end frameworks like React and Vue. Some might call these fad frameworks, but I for sure can write better web apps with less than half the number of lines, and with much less spaghetti, compared to what we used to do in 2005 (just close one eye to the webpack complexity). If that's not a paradigm shift, I don't know what is. Unfortunately this hasn't always resulted in much better applications for end users when looking at the average app, probably due to the lower barrier to entry and a race to the bottom.
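To make "declarative" concrete, here's a toy TypeScript sketch of the core idea (state in, markup out, re-render on change), assuming a page with an element whose id is "app"; real frameworks add components and efficient diffing on top of this:

    type State = { count: number };
    let state: State = { count: 0 };

    // Declarative: the whole view is just a function of the current state...
    const view = (s: State): string =>
      `<button id="inc">Clicked ${s.count} times</button>`;

    // ...and instead of hand-patching DOM nodes, we re-render on every change.
    function setState(next: State): void {
      state = next;
      render();
    }

    function render(): void {
      const root = document.getElementById("app");
      if (!root) return;
      root.innerHTML = view(state);
      document
        .getElementById("inc")
        ?.addEventListener("click", () => setState({ count: state.count + 1 }));
    }

    render();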
Programming languages, IDEs, and tooling overall are miles better than before, even if it's hard to pin down any specific milestone where the shift happened; all the small gains add up to a big deal on all fronts. C++ is still C++ and Java is still Java, but even there a lot of the verbose boilerplate has been reduced, giving much better ergonomics. Then you've got things like Kotlin, Typescript, and other hybrid dynamic/static + hybrid OO/functional languages getting improved and more widespread, async everywhere, Rust bringing something to the table to finally replace C/C++, and package managers used by everyone, really enabling code reuse. You now also have almost all popular languages converging and being equally capable of doing everything, not one language per use case.
Then there are tons of other cools things like GPGPU, machine learning, big data, EBPF, crypto and security. It's just too much to grasp.
He was speaking as if it were still the 1980s. Now we have cloud-based ML doing the optimization finding.
ML will (eventually) be able to generate biz-logic code, and most assuredly infrastructure config (a cloud API has a limited set of possible configs of value to our sorting patterns) and UI code (we gravitate towards a limited set of UX, it seems), to solve much of our daily programming work.
ML can’t invent future ideas, it can’t evolve itself without us making new hardware for it. But it will implode the blue collar dev job market eventually.
> ML can’t invent future ideas, it can’t evolve itself without us making new hardware for it. But it will implode the blue collar dev job market eventually.
I have doubts about that. Open source code seems to achieve a large part of the same function as ML (implementing "boring" code for the Xth time), but it has only increased the number of blue-collar devs. On the other hand, some no-code tools are opening programming to a large number of people (Excel, actions on your iPhone, things like that). Programming is one of those things that everyone would benefit from knowing, but time is limited, so anything that lowers the barrier to entry will just lead to more programming. And more programming means even more programming (someone has to develop the no-code tools, the cloud infrastructure, etc.).
Remember: human function names and object names are for human consumption. We could write a whole lot less if not for all the programmers who need context.
We know from our hardware platforms what we can and cannot compute; their spec defines the limits. We don’t need dozens of competing languages when our goal is “reserve memory, compute values in that memory in this order, free memory when done”.
My startup is focusing on learning what code shapes are OK from a security standpoint and developing a filtering tool that blocks merging commits that violate that spec.
We have a lot of interest from DOD and SV companies, in the form of "pre-emptively avoid the security issues caused by coders who write stupid code."
Eventually we’ll have tailor made hardware with little to no generally programmable surface area.
Because our generation of computing is behind us. Kids now just want the thing to emit results.
The thing that’s holding them back is maintenance of “career oriented job life”. Rather than build programs, software people babysit dependency lists and process. Google products are a mess because it’s about capturing worker and customer agency, not engineering novel things.
> But it will implode the blue collar dev job market eventually.
There is no "blue collar dev job market", and the end of the dev job market that has a hint of a bluish tinge in the collar is the end that is continuously being automated away by tooling progress. But that just expands the scope to which it is useful to apply software development, increasing jobs and wages in software.
> ML will (eventually) be able to generate biz logic code, and most assuredly infrastructure config (a cloud API has a limited set of possible configs of value to our sorting patterns), UI code (we gravitate towards a limited set of UX it seems), to solve much of our daily programming work.
It's not "machine learning", but we already have tools that develop code in all of those domains from higher-level descriptions than what previous generations of coders supplied. And each generation of that tooling just serves as a progressively greater output multiplier on time devoted to developing software.
I dated a "web developer" who worked for the county. She used a Photoshop-like tool to update the layout of an intranet, and occasionally straightened out some PHP.
She knew nothing of computer science, just the tools she was trained on. She knew nothing about how the hardware works, that RAM and SSD are different types of memory, or which situations make one preferable to the other, because that was already figured out in the tooling.
There are a lot of people who work like that and call themselves software engineers.
We’re ultimately trying to make hardware do something, but rather than that, we invented more ornate interfaces to the same old hardware for … jobs.
A hardware-based future where the onboard ML chip can generate an AAA game or Pixar-level media will happen. Because then they don't have to pay a bunch of coders or artists.
Greer's "program the perimeter, not the area" comes to mind. Most software is simple logic and network effects. As our manufacturing processes allow us to make customizable silicon from step 1, why not listen to Greer and bake the best perimeter for a task into hardware?
Large parts of computer science are about languages.
There is a lot of progress. (Research usually needs to be novel.)
It's just that that progress usually arrives in the mainstream 20+ years later, in very small batches. So it's not very visible when you look only at the practically relevant things.
I'm not sure that was a breakthrough made by Rust, but I think people agree that Rust was the first language that provided those things that people actually have wanted to use outside academia and narrow industry groups.
This is at least part of the reason that reasonably strong engineers can learn a new programming language in under a day. The paradigms just aren’t that different.
I realize there’s a cottage industry of folks creating new languages all the time. But when you read the docs they all bucket into a few categories, and the differences are syntax, tooling and how system services are accessed.
All that being said, these three categories do matter. Programming using tools like autocompletion, syntax highlighting and the like does speed productivity.
But, like human language, at some point the “right” set of primitives are discovered. From that point on changes become more about culture and fads than concepts.
Try learning J in a day. It...won't go well -- but in a good way. :-) As you say, for many languages the paradigm is some variation of C, and the concepts are largely interchangeable. Then you approach something like J, where (trivial example)
+/%#
returns the average of a list by composing three functions: + sums; % divides; # counts; with a modifier / that distributes + throughout the list; and you begin to realize you're not in Kansas anymore.
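Roughly, spelled out in TypeScript terms (a sketch of what that J train does, not of J's full semantics), the three verbs form a "fork": the outer two are applied to the same argument and the middle one combines the results:

    // (f g h) x  ==  g(f(x), h(x))  -- J's "fork" pattern, approximated.
    const fork =
      <A, B, C, D>(f: (x: A) => B, g: (l: B, r: C) => D, h: (x: A) => C) =>
      (x: A): D =>
        g(f(x), h(x));

    const sum = (xs: number[]) => xs.reduce((acc, n) => acc + n, 0); // +/
    const count = (xs: number[]) => xs.length;                       // #
    const divide = (a: number, b: number) => a / b;                  // %

    const average = fork(sum, divide, count); // (+/ % #)
    console.log(average([1, 2, 3, 4])); // 2.5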
I don't think the "right" set of primitives is as obvious as you say. Obviously branching and loops are foundational to many languages, but even those are optional given the right design, and they can be implemented in very different ways.
Definition by function/combinator composition isn’t weird to a modern programmer (Haskell does a lot weirder); and vector processing isn’t really weird either — even Java programmers are familiar with chaining transformations and reductions on Streams these days.
Instead, not knowing J, the only† thing that’s weird about that J expression to me, is that both +/ and # are receiving the same implicit vector argument, without any Forth-like “dup” operator needing to precede them.
Are these operators defined to take an implicit single receiver vector? (Maybe whichever one’s on top of a “vector result stack” separate from a “scalar result stack”? Maybe the first one passed to the implicit lambda expression this code would be wrapped in?) Or is syntax sugar here hiding the argument, the way Smalltalk hides the receiver in successive expressions chained using ; ?
What would a compact-as-possible J expression look like, to sum up the contents of one vector, and then divide the result by the cardinality of a different vector?
—————
† Well, there is one other, more technical thing that's "weird" about the expression above: it's the fact that, unless +/ is an identifier separate from +, then the lexeme-sequence + / has to either derive its AST by lookahead, or by + being a stack literal and / an HOF. But then % is, seemingly, a binary infix operator. You don't usually find both of those syntax features in the same grammar, both operating on arbitrary identifier-class lexemes, as it would usually cause an LALR parser for the language to have shift-reduce ambiguity. Unless, I suppose, the lexeme / isn't lexed as "an identifier" terminal, but rather its own terminal class, one that's illegal anywhere but postfix of an identifier.
Frankly, I suck at J. I've solved a few dozen project Euler problems with it, but that's about it. If you want to see how deep the weeds get, check out this essay on solving the n-queens problem: https://code.jsoftware.com/wiki/Essays/N_Queens_Problem
+ is a "verb"
/ is an "adverb" -- it modifies the functionality of whatever verb it is applied to.
Almost all verbs can be binary or unary, sometimes with surprising (to me, a newbie) consequences. I have no idea how it gets handled underneath the hood.
It's a bit ironic that we're essentially having a blub language debate on ycombinator's web site. I'll defer to Paul Graham's response: http://www.paulgraham.com/avg.html
Going back to the original comment, the statement was that any experienced engineer could learn a language in a day. At risk of gatekeeping, I'd argue that by definition any experienced engineer would have some experience with an array based language like APL or MATLAB so that no, J does not qualify as a different paradigm.
The implication of the original comment was that any engineer experienced in C-style languages could learn any other C-style language in a day. I'd even argue that point, but it definitely doesn't get you J in a day, except insofar as J supports most C-style syntax.
So you can program in J like a C-programmer in a day. You definitely can't program like a J-programmer. And to say a competent engineer would already have array-language experience sidesteps the point.
The original comment said nothing about C-style languages. That must be something you read into their comment.
Learning J is like learning Perl or regular expressions. Nobody really wants to engage in such an activity, but people do what they need to do. Depending on their level of experience, a person who understands imperative and declarative paradigms along with the language's execution model can absolutely learn J within a day, because it only differs in syntax from the existing languages.
"This is at least part of the reason that reasonably strong engineers can learn a new programming language in under a day. The paradigms just aren’t that different."
This literally says "the paradigms aren't that different." So if you accept that C and J are different paradigms, then because the paradigms aren't that different, a C programmer could pick up J in a day.
What it doesn't say is that truly different paradigms take more than a day to learn, or that competent programmers already understand all the different paradigms.
Perl is C-like. As I said elsewhere, J supports C-style syntax, so sure you can program J in C-style in a day. But that's not J. This is J:
>To a J programmer, that's not just clear, it's obvious.
Is the argument that J just has really weird syntax? It feels trivial to me to make a mess of syntax that makes learning a language obtuse. Unless there's something more fundamentally different about J, this is the first I'm hearing of it.
Their argument would be that J works differently, and once you internalize how it works, many hard problems become tractable. Their slogan is something like, "understand the problem, and it's done" -- meaning that once you understand the problem, the second step of translating the solution into code is near-trivial.
I would argue that's at least partially because J programmers are in general incredibly smart. (I don't count myself as a J programmer.)
You are the only person who is talking about C. It is a leap to go from "experienced programmer" to "that means C-only programmer". You are using the term "programming paradigm" incorrectly. C is not a programming paradigm. The paradigm of C is that it is an imperative procedural language. In the strictest sense it is a functional language, since functions are first class citizens, but it is not really a functional language as used in practice, such as when discussing function application.
And I didn't say C is a functional, array language either, so we're agreed on that. I'm lost about what your objection is at this point, so I'm happy to drop this.
Is Javascript a C-style language? Because even though the syntax looks superficially similar, REAL Javascript programmers apparently `npm install left-pad` instead of writing their own.
I guess it depends on how much past experience they had with an array-based, stack-based, logic-based, pure functional or lisp family language, and how long ago that was. I can believe an experienced engineer can learn the basics of any language in a day. But being proficient and idiomatic is another matter.
J is functionally equivalent to APL and a few other related languages like K. In the end you can call anything "syntax", but I'd invite you to give it a try. If C is British English and Python is online English, then J is Russian, or maybe even Mandarin.
ÅF - the Fibonacci numbers less than the current number on the stack, in 05AB1E. A good tip for when you get THAT code interview and can pick the language. You just need to know how to type Å if you are not from Denmark.
But you don't know common compiler/interpreter pitfalls, tooling, frameworks... so in the end you still need a long time to get familiar with it for productive work.
Sure, you know the vocabulary, but it will still take time to learn the rest.