Microfeatures I'd like to see in more languages (buttondown.email/hillelwayne)
297 points by Tomte on Jan 5, 2023 | 531 comments



> Frink has special syntax for date values. You can write # 2001-08-12 # to mean the date 2001-08-12

Frink? Frink?

This Visual Basic erasure can not stand. The #-delimited date literal syntax has been found all over the VB family: Visual Basic, VBA, VBScript, and it persists to this day in Visual Basic .NET.

Of course, being a VB syntax, it's completely cursed. You can put

    # 01/05/2023 # 
in a VB file, and what date it represents will depend on what locale it's.. compiled? executed? evaluated? in? Maybe? Depending on which VB dialect you're in? Good luck. Some of those languages would also accept

   # 01/05/23 #
And they might even agree about what century it's in.


The cursed aspect that you mention is a mighty fine argument for why the article should indeed choose Frink as the language to steal from.

https://frinklang.org/#DateTimeHandling


A VB date literal is and always has been locale-independent. Originally, they did make the mistake of making it always US-style, M/D/Y. VB.NET added more sane formats such as YYYY-MM-DD.

Now if you tried to convert a Date value to String at runtime, yeah, that would use the current locale. That was a constant source of bugs in VB6 apps running on non-US locales (including, famously, at least one Microsoft installer).


Are you sure? Maybe for Visual Basic proper, but I thought the VBA date literal basically followed Excel date parsing rules, so was localized depending on your Office install.


Apparently I'm misremembering: https://www.engram9.info/excel-2007-vba-2/date-literals.html

Still, this violates the principle of least surprise for non-American developers, at least, so seems pretty on brand for VB.


> what date it represents will depend on what locale it's.. compiled? executed? evaluated?

oh VB. you were so close to perfection. never change.


We don't have to follow locale; we could just force ISO 8601 and there won't be any ambiguity.


Yeah, but this is Windows we're talking about. Even the SCHTASKS program (kind of like /usr/bin/at) takes a date which is locale sensitive, making it absolutely useless for scripting. Check out this answer to a question about how to use it: https://stackoverflow.com/a/18730884


The language doesn't have to care about $PLATFORM, if the spec says dates in source are ISO 8601 they're ISO 8601 or they're invalid.


I think the argument here is that VB is cursed because Windows is cursed; it inherited the bias toward cursedness. Of course other languages can choose to do better.


Or you could just use the standard '2001-08-12' with no special syntax needed and avoid the problem altogether. ISO-8601 or it's invalid. There's no reason to allow other formats or need other syntaxes when we have a 35-year-old standard to use.


There’s a Java compiler plugin, manifold, that makes this type of syntax easy. There’s an example covering the date “literal” case…

https://github.com/manifold-systems/manifold/tree/master/man...


But is that December 8th or August 12?


>> Frink has special syntax for date values. You can write # 2001-08-12 # to mean the date 2001-08-12

> But is that December 8th or August 12?

I didn't even think YYYY-DD-MM was a possibility. Maybe I can grant 08/12/2022 is ambiguous and so is 08-12-2022, but can we please agree YYYY-MM-DD can't have variations?


The first rule of timestamps is, we agree on nothing!

Reverse-middle endian YYYY-DD-MM is used in Kazakh writing (https://en.m.wikipedia.org/wiki/Date_format_by_country )


Anyone who packs multiple values into a single string hates the world and possibly themselves.

Nowhere is it written that we must use the same field separator between the ambiguous fields. We could fix this problem by using the separators as type signifiers.

We could also stop using day <= 12 in our examples, which would also help a ton.


My favorite is uniform function call syntax. In several languages (Nim, Koka, D, …), you can always write bar.foo(baz) instead of foo(bar, baz) and vice-versa.

Another one from Nim is the implicit result variable. Instead of having to do this:

    func sum(nums: seq[int]): int =
      var total = 0
      for num in nums:
        total += num
      return total
you just do this:

    func sum(nums: seq[int]): int =
      for num in nums:
        result += num
It saves so much time and I'm disappointed that more languages don't have it.


> My favorite is uniform function call syntax. In several languages (Nim, Koka, D, …), you can always write bar.foo(baz) instead of foo(bar, baz) and vice-versa.

To me, these are "Tell bar's foo to do something with baz." and "Tell foo to do something with bar and baz.". So being 'able' to flipflop the syntax is at least temporarily semantically confusing.


Except I find the concept of state "doing something with" other state unhelpful.

Why is bar working with baz and not baz working with bar?

I struggle with this in Unity:

    Player.collect(Pickup) // this?
    Pickup.boost(Player)  // ...or this?
Instead, the code should describe the interaction between the two units of state:

    onCollide(Player, Pickup) 
If we structure our code like in the last example, it makes sense to weaken the `a.b` vs `b(a)` distinction, and instead use the dot as a kind of pipe-operator.


It shouldn't be that confusing. Ideally, a method should be defined on an object only if it needs to access the encapsulated state of that object. Otherwise, it should probably be a free function.

So, in your case, depending on other modeling decisions, I could argue either for

onCollide(Player, Pickup) - if Player and Pickup are both plain data, and don't need to guarantee any invariants

Player.collect(Pickup) - if Player actually has to ensure some invariants (such as health < 100)

I don't see any good arguments for pickup.boost(Player), in typical games. Of course, if both the Pickup and the Player have some invariants that need to be maintained on a collision, then arguably the design has to be changed at a deeper level.
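
To make the invariant concrete, here's a rough sketch in Python (the Player/Pickup details are invented for illustration):

    from dataclasses import dataclass

    @dataclass
    class Pickup:
        boost: int  # plain data, no invariants of its own

    class Player:
        MAX_HEALTH = 100

        def __init__(self, health=50):
            self._health = health  # encapsulated state

        def collect(self, pickup):
            # a method, because it must uphold the health <= MAX_HEALTH invariant
            self._health = min(self._health + pickup.boost, Player.MAX_HEALTH)

    # free function: plain coordination, no invariant of its own to protect
    def on_collide(player, pickup):
        player.collect(pickup)

    p = Player()
    on_collide(p, Pickup(boost=75))  # health capped at 100, not 125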


But this isn't a discussion about pickups specifically. Attacking an example is pointless if the thing you take issue with is incidental and not fundamental to the argument. I bet you could think of a case where you pick something else and OP's example meets your standards.


My point was that there is a meaningful, and I believe relatively simple, distinction to be made between free functions and methods bound to an object - a distinction which UFCS doesn't really help with. For any given example, I believe there is a reason to prefer one over the other, and I showed what reasoning I would use for the particular example raised by OP.


I got that and it's a good point.

Although I rarely see objects used this way. Often, methods are used to implement all related functionality. Unity even strongly encourages this (...at the moment; they are working on an Entity Component System, which will work more like my third example).

I concede that languages shouldn't use the dot as a syntactic tool, be it through Extensions[1] or UFCS, but rather offer a pipe-operator. If they don't, I'd still prefer UFCS rather than no way of chaining at all.

[1] Extensions for interface/protocol conformance are fine of course.


I think which of the three is best depends on the context:

What is the code driving it?

- looping over players to update them?

- looping over objects to update them?

- some other event loop?

I actually like the third least:

The first two tell me what is happening, but the third doesn’t — I could be colliding to block motion or I could be picking up.


That arguably depends on your POV. Thinking like Python, where a method always declares 'self' as the first argument, a function is just one thing (there's no such thing as a method). Dot syntax is then just syntactic sugar for passing the first argument, and there's nothing special about methods. You can always pass the first argument manually.

In other words, to me it's simpler and therefore less confusing.
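
For instance, in Python the two spellings are literally the same call (though note this only goes one way; free functions don't gain dot syntax). Toy Greeter class just for illustration:

    class Greeter:
        def greet(self, name):
            return f"hello, {name}"

    g = Greeter()
    print(g.greet("world"))           # hello, world
    print(Greeter.greet(g, "world"))  # identical call: greet is just a
                                      # function whose first argument is g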


The main selling point is the ability to chain calls

  foo.do_this(bar).do_that(baz)
instead of

  do_that(do_this(foo, bar), baz)


That is easily handled by macros without sacrificing semantic clarity:

  do_this(foo, bar) |> do_that(baz)
Or an explicit version:

  do_this(foo, bar) |<> do_that(<>, baz)
(Yes, this is a bastardization of Elixir's pipe and some Lisps' arrow macros.)


The use case for this sort of thing I like best is extending objects without touching the object.

E.g. in D to convert between different types you can use std.conv:

    import std.conv;

    "123".to!int;
    123.to!string;
Ints don't need to understand string building and strings don't need to understand int parsing. The conversion code just needs to declare a couple functions taking the right arguments and it just works. To me the above is much more readable than

    to!int("123")
    to!string(123)
in any case. It's also quite nice when dealing with C APIs since it allows you to pretend they are OOP in quite a lot of cases. e.g. with SDL:

    SDL_CreateRenderer(window, -1, 0);
turns into

    window.SDL_CreateRenderer(-1, 0);
and say I'd like to have a function to initialize all the renderer stuff in one go? I can simply declare

    void CreateRendererAndInitialize(SDL_Window* window){<snip>}
and now you can do the following:

    window.CreateRendererAndInitialize();
It removes a lot of pain from extending 3rd party types you see in other languages.


I find the distinction between “foo” and “bar's foo” unnecessarily confusing. For example in C++, why is getting the last element of a “vector” something that belongs to it, but reversing a “vector” is something external?


> and vice-versa

Doesn't the reverse pollute the function namespace? If every obj.fun() can be written as fun(obj), doesn't that cause ambiguity with a previously imported global function fun()?


In Rust you can do this, but they are namespaced with the name of the struct. So you can do Object::fun(obj).


You can also `use Object::*` to be able to do this without prefixing. Also, you can throw use statements into functions, so this is pretty useful in specific cases.


In a language without function overloading or namespaces/modules, yes, it would be a major limitation.

But since most languages have both, I don't think it's a serious concern. I don't know of any language where the only namespace support is classes, so that all functions go in a global namespace unless they are methods on a class - maybe you could argue C works like this (where "methods" are function pointer members of a struct)?


Not unless the function signatures overlap entirely. That's rare enough that I don't think I've ever encountered it in several years of writing Nim.


> doesn't that cause ambiguity with a previously imported global function fun()?

Yes, although overloading can mitigate the issue.


Yah I ran into that issue more in Julia. There's a built-in function to help find clashing functions.

In Nim however I've only had it happen a handful of times in a few years. Then you just need to use the module name to qualify it, or change your imports.


That sounds like no fun() at all.

Right, I'll show myself out...


If you have two functions abracadabra(foo) and foo.abracadabra() which do different things, you should rename one of those functions tbh.


The problem is when you have the functions:

Square.computeArea()

Circle.computeArea()

Clearly these should do different things. I suppose "computeArea(shape)" does dynamic dispatch based on the type of shape? But you're still putting every function defined on every type in your entire codebase in a global namespace. It's not obviously awful but I'd definitely be a bit nervous about it.
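
For what it's worth, here's roughly what that dispatch could look like, sketched in Python with functools.singledispatch (the Square/Circle definitions here are just for illustration):

    from dataclasses import dataclass
    from functools import singledispatch
    import math

    @dataclass
    class Square:
        side: float

    @dataclass
    class Circle:
        radius: float

    @singledispatch
    def compute_area(shape):
        raise TypeError(f"no area defined for {type(shape).__name__}")

    @compute_area.register
    def _(shape: Square) -> float:
        return shape.side ** 2

    @compute_area.register
    def _(shape: Circle) -> float:
        return math.pi * shape.radius ** 2

    print(compute_area(Square(side=2.0)))    # 4.0
    print(compute_area(Circle(radius=1.0)))  # 3.141592653589793

With this arrangement only the one name compute_area lands in the namespace; the per-type implementations hang off it.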


You only need dynamic dispatch if dynamic types are involved. If you can always resolve to concrete types, then you can statically resolve the required function.


It doesn't have to be dynamic dispatch; you could also do static dispatch at compile time based on type information.


I dunno, I kinda like that, add sum types and you can pass anything to computeArea.

The problem really starts when you have

typeA.func()

typeB.func(x)

typeC.func(x=default_value)


In Rust, it's still qualified by the type you're calling it on, which I assume is also the case in other languages.


> which I assume is also the case in other languages.

Nope. Rust has an extremely restrictive form of UFCS. In fact officially it's not called UFCS, but "Fully Qualified Path syntax".


Implicit result var is also in Delphi, as Result. The original Pascal way is to assign to a variable with the same name as the function, which looks kind of odd and is overloaded in recursive scenarios, since functions with no args don't need parens to invoke.


The last time I used Matlab, it also required you to assign your return value to a variable with the same name as the function. Presumably GNU Octave also has the same feature.


I've only seen this proposed as a feature to help with template meta-programming (though it could also make similar sense in any dynamic language).

There, it helps if a template can say `t.foo()`, and, with UFCS, can use that template with any t for which either `foo(t)` or `t.foo()` exist. In contrast, in C++ today, a template using `t.foo()` limits its own use only to types that have a foo() method, probably unnecessarily (note that writing `foo(t)` in a C++ template is less limiting, as someone who controls neither the template nor the type of t can still define that function).

However, outside of this use case, I think conflating these two is more of a negative than a positive. It means that there are twice as many places where I may need to look up the definition of foo(), at the very least. So I wouldn't add this to any static language that doesn't support templates or macros.


I think not introducing UFCS originally in C++ is one of Stroustrup's biggest regrets.


What do you mean by “twice as many places”? And looking up definitions is a job for the language server.


Without UFCS, if I see obj.foo(), I know I can look up the definition of foo() in the definition of obj's type, or in its supertypes. Even for foo(obj), there is often a canonical place where such functions are defined. With UFCS, I need to look in both of these places until I can find the right definition.

And sure, the IDE/language server/other tooling can often help, but not always (e.g. if I'm browsing some code on Github). Either way, more ambiguity for no gains is typically not a good idea, even if the downsides are minor (again, I am very much in favor of UFCS where it's directly useful, such as C++ or D).


In languages that have UFCS, there's no such thing as classes, so… you simply look for the procedure.


D certainly has classes, and a member function has access to private members, while a free-floating function does not. Nim indeed doesn't seem to have this distinction at all, and only seems to support encapsulation at the module level, not the class level (as far as I could tell from very brief searching - I have never programmed in it).


I'm not so sure about that. Did you ever check out extension methods in C#? They are a bit like what you're describing, but not so radical.


FWIW the implicit result variable is as old as FORTRAN and ALGOL, although the common practice then was to name it the same as the function. Delphi is one language that inherited that (via Pascal) but renamed it to Result, although I don't know whether it originated there, or whether Nim picked it up from Delphi.


I think you can also do the second in Go with named returns, e.g.

    func sum(nums []int) (result int) {
        for _, n := range nums {
            result += n
        }
    }
No clue if it's an idiomatic usage, and named returns always felt a little too magic for me.


It's kinda fine if it's a short function like that[1]. When you get above 10-15 lines in a function though, it's easy to lose track of what's a return variable and what isn't.

[1] and everything in the codebase uses that style otherwise it's annoying to have to context-switch every 5 minutes.


Lots of languages will implicitly return the final expression— I feel like that's a decent compromise. Not quite as magical as an actual named variable that just exists, but not as clunky as needing as explicit `return` every time.


I always disliked implicit returns, and over the years and having dealt with many more codebases, some quite large, I've learned to dislike any implicitness.

I would much prefer that you must explicitly return a value (even if it's through an implicitly declared 'Result' variable) rather than just 'try to guess what happened here, in this long function with lots of expressions'.

There are a few exceptions, like Forth, where you really have to keep the current state of the stack in mind at all times anyway. Those exceptions naturally tend toward very small functions. Most languages don't, and the result is inevitably difficult-to-understand bugs.


Honestly, I don't like that style for languages which allow multiple side-effects as well. For example,

  (progn 
    (print "something") 
    0)
Is honestly pretty ugly from my point of view.


Sure, definitely, and in a case like that (say, in rust), I would just put an explicit `return` in. But there are lots of other scenarios where the result is naturally being returned by the final expression and it's quite convenient to elide the extra keyword.


you can do

    func sum(nums []int) (result int) {
        for _, n := range nums {
            result += n
        }
        return
    }
and it will work same as if you would do

    return result
As for omitting return entirely, I hated it in every language where I saw it. It just feels wrong to not have a return in functions that return stuff.


I think missing `return` works if the language is designed around everything being expressions, so the function is just written as a single expression. I agree that in procedural type paradigms I like having the `return` keyword over things like "return the result of the last expression".


This case is a bit different - omitting the return simply returns the implicitly declared result variable, which makes perfect sense.



Initialized to what type and value?


The procedure's return type and its default (zero) value.


I don't like the implicit result variable from a scope point of view.


What do you mean?


Negative array subscripts. So a[-1] means the last element of an array, a[-2] means the second last, and so on.


Any language that supports overriding the index operation should support this. You should be able to do this in C# with a struct with a backing array, for instance. If you're going to do this, use the word "Circular" in it, and I would also insist that if a has 4 elements, then a[0] == a[4] == a[8]. In other words, you always just take the (positive) index modulo the size of the array. Then a[-1] is the same as a[N-1] for an array of size N. This could be useful in a lot of contexts, but should be made explicit.
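
Here's roughly what I mean, sketched in Python instead of C# since operator overloading keeps it short (the class name is invented):

    class CircularArray:
        """Hypothetical sketch: every index is taken modulo the length,
        so indexing never goes out of bounds and a[-1] is the last element."""

        def __init__(self, items):
            self._items = list(items)

        def __getitem__(self, i):
            return self._items[i % len(self._items)]

        def __setitem__(self, i, value):
            self._items[i % len(self._items)] = value

    a = CircularArray([10, 20, 30, 40])
    print(a[0], a[4], a[8])  # 10 10 10, i.e. a[0] == a[4] == a[8]
    print(a[-1])             # 40, same as a[3]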


You'd have to care about the difference between indexing with a literal and indexing with variables of different types/widths, and how indexing with a variable interacts with the size of the array.

For instance if you have an int array that contains the numbers 1-250 and you index with a uint8 variable i,

  for (uint8 i = 247; ; i++) {
    // print circ_arr[i]
  }
for the values of i near the overflow points of the circular array and of the uint8 it gets weird:

  i    circ_arr[i]
  247  248
  248  249
  249  250
  250  1   # 250 % 250 = 0
  251  2   # 251 % 250 = 1
  252  3   # ...
  253  4
  254  5
  255  6
  0    1   # i overflows to 0
  1    2
  ...


Right, the calling code needs to handle its own integer overflows, of course. And if your circular array has a size other than a power of 2 you can get only a partial enumeration in the cycle that includes overflow. Sure. But are you really indexing an array of unknown size with a uint8? It's really impossible that there might be more than 255 things you ever care about? No. Everyone who is using a uint8 to index arrays is either doing something extremely low-level and fiddly where abstractions like this simply don't apply, or they're idiots who are doing cargo-cult shotgun "optimization" because they don't know how to write code that works.

If you're indexing with a [u]int32 you need to worry about this once every 4 billion increments, and if an incomplete cycle is a show-stopper for you, you can compute a safe modulo yourself based on the size(s) of your circular array(s), but more likely you just need something else. But really, you don't care if your cache hiccups a little once every 4 billion caches.

You make a good point, of course, I'm just allergic to people poking holes in back-of-the-napkin explanations of things with the trite "but integers can overflow!" It's one of the most common well-actuallys written on this site. Of course integers can overflow. They almost never do though, do they? And if they do, a test fails and you add a single line somewhere to fix it.

I really think the vast majority of programmers are too often thinking about bits when they should be thinking about math.


The point made is independent of the integer size and I think we should assume in good faith, that uint8 was chosen for the purposes of an example.


Um, maybe, but then his example is sixteen million times worse than reality! If I argued against some technology by showing how bad it would be if it were sixteen million times worse, that's just not a very good argument, is it?


In a language where arrays are fixed-size, I think the proper solution is to have arrays not indexed by integers, but by a custom modular type that depends on the array with values in [0,n) that allows literals in the range [-n, n-1], with literal ‘-1’ being a different way to write ‘n-1’, etc.

You’d need a way to get that type, for example as

  float a[10,20]          // two-dimensional array of floats
  typeof(a.dims(0)) i = 0 // modular type with values in [0,9]
  typeof(a.dims(1)) j = 0 // modular type with values in [0,19]
or, slightly neater:

  auto i = a.indextype(0)
  auto j = a.indextype(1)
Ugly syntax, but in a modern language, most code would probably do something like

  for (i,j,value) in a
where the types are inferred.

Having those modular types means the compiler would do the arithmetic correct for the array, while the negative literals allow programmers to specify “last” and “next to last” correctly.


Why is an overflow that is generated by the caller your concern?


> you always just take the (positive) index modulo the size of the area.

That's something I'd like in a bunch of languages - a real modulo operator that always returns between 0 and n, even for negative inputs, rather than a remainder operator that's advertised as a modulo operator. Grrrrr!!!!!


What 'mod' is for, innit.


Right, cause why let the language do it automatically when you can use a bug prone manual implementation?


If you can't correctly implement a circular buffer with mod (or 'and') then the language can't save you, nothing can.


"If you can't correctly implement a for loop using assembly, then the language can't save you, nothing can"


Which is true. What's your point.


In C#, -1 % 2 == -1. Doesn't help.


It's a little verbose and can probably be reduced, but "((x % m) + m) % m" always works. Although it's probably not better than a check if x < 0 since branch prediction will get that right almost always.
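
For example, here are both behaviors side by side, sketched in Python (whose % is already a floored modulo), with math.fmod standing in for a C-style remainder:

    import math

    print(-1 % 2)            # 1: floored modulo, sign follows the divisor
    print(math.fmod(-1, 2))  # -1.0: truncated remainder, sign follows the dividend

    # the ((x % m) + m) % m trick, with fmod standing in for a remainder-style %:
    def true_mod(x, m):
        return (math.fmod(x, m) + m) % m

    print(true_mod(-1, 2))   # 1.0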

I do think it's quite odd and frustrating that modulo can return negative numbers and I don't really get the reasoning there, but there's probably a good reason I don't know about.


The only reason is that C does it that way. And C does it that way because machine code does it that way.


C#'s % is explicitly the remainder operator, not the modulo operator.


Not according to reflection.

    Expression<Func<int, int>> lambda = n => n % 2;
    Console.WriteLine(((dynamic)lambda).Body.NodeType);
Output is "Modulo".


Hmm... I haven't read the official spec, just the Microsoft documentation: https://learn.microsoft.com/en-us/dotnet/csharp/language-ref... In any case, the behavior is rem, not mod.

Ada has both rem and mod operators. I'm not sure how many other languages have operators for both.


Ecstasy uses % for modulo, and /% for divrem (division and remainder). So "a = b % c" for calculating the modulo, but "(a, r) = b /% c" to get the quotient and remainder.


Where would the circular indexing like that be useful?


It's frequently used in signal processing, to the point where it's considered one of the defining features of DSPs[1]. One common case is filtering over a fixed-size buffer of samples. If you have circular indexing, you can simply overwrite the earliest sample and increment the base reference to the next element.
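
A minimal sketch of that pattern in Python (names invented; real DSPs do this with dedicated circular-addressing hardware):

    class SampleBuffer:
        """Fixed-size circular buffer: each push overwrites the oldest
        sample, the write position just wrapping modulo the size."""

        def __init__(self, size):
            self._buf = [0.0] * size
            self._count = 0  # total samples ever written

        def push(self, sample):
            self._buf[self._count % len(self._buf)] = sample
            self._count += 1

        def average(self):
            # e.g. a crude moving-average filter over the last `size` samples
            return sum(self._buf) / len(self._buf)

    taps = SampleBuffer(4)
    for s in [1.0, 2.0, 3.0, 4.0, 5.0]:
        taps.push(s)       # the 5.0 overwrites the 1.0
    print(taps.average())  # (5 + 2 + 3 + 4) / 4 = 3.5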

I'm not sure I'd want it for every list, but there are certain places it's nice.

[1] https://www.allaboutcircuits.com/technical-articles/circular...


First of all, it handles the "get me the 2nd to last element" case automatically, but in a way that doesn't feel like a weird edge case: it's more "mathematically sound", basically. I always want mathematical soundness if possible because it leads to serendipity, the opposite of technical debt. Where technical debt is "dammit, this is going to take so much longer than it should!"; serendipity is "oh wow I can implement this cool new feature just by combining these other two things in a new way, in like 2 lines. This is going to be way faster than I thought." Mathematical soundness / purity leads to serendipity.

Directly, it supports caches very well. You just increment the number of things you've ever cached and that's where your next cached value goes; you don't care when it overwrites an old value.

There are other cases where you just need some variant of a thing, but you don't actually care that much about which variant you get. You might want to vary your wording in auto-generated text, for instance, by rotating synonyms. Or rotating the tiles you use in a 2D game. In this case I'd define an interface where you pass in a "seed" integer and it gives you back some deterministic example; a circular array is the simplest implementation of this interface (but there are others).

You could also do simple load balancing by sending work to Worker[workCount++]. While usually you want to track each workers' existing workload (because the work takes unpredictable time), this simple approach could be sufficient if all your work completes in about the same time.

If you're doing fancy math or science computing, you may be working with finite groups or fields, whose elements you could stick in an N-dimensional circular array (based on the characteristics of the field).
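
That Worker[workCount++] idiom, sketched in Python (the Worker type and its submit() method are hypothetical):

    class Worker:
        def __init__(self, name):
            self.name = name

        def submit(self, job):
            # hypothetical work submission; a real worker would queue the job
            print(f"{self.name} <- {job}")

    workers = [Worker("w0"), Worker("w1"), Worker("w2")]
    work_count = 0

    def dispatch(job):
        global work_count
        # the index wraps around automatically, no bounds to think about
        workers[work_count % len(workers)].submit(job)
        work_count += 1

    for job in ["a", "b", "c", "d"]:
        dispatch(job)  # goes to w0, w1, w2, then wraps back to w0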


Sonic Pi, the live music coding environment, has a circular list structure type called a 'ring'. This proves curiously helpful for a bunch of musical scenarios.

Like:

- you can put a short chord sequence into a ring, and it now functions as a list of as many repetitions of that chord sequence as you like. You can just loop over it forever (which is kind of the essence of how sonic pi live-loop play works)

- you can put the notes that make up a scale into a ring, and use it to extract specific chords - like, take the 1st, 3rd, 5th, 7th and 9th note - from just a seven note scale.

- you can use rings of booleans to capture drum patterns and rings of notes to capture melodies, and loop them forever

- etc. etc.


As an opt-in choice, perhaps, but as a default and unalterable behaviour it can be a bloody timewaster when a negative subscript in your work really indicates a bug. I've hit that in Python and didn't enjoy it.


A nice alternative I've seen is that negative index is an error, but there is special syntax for indexing from the back like array[end], array[end-1], array[end-n], where n is a (positive) variable. Likewise, end can be used in range definitions like array[5:end]. Julia and Matlab both have this.


C# has a very nice approach to this: indices aren't simple numbers, but values of type Index [1], which store both the offset and the direction, and can be implicitly created for plain ints. When you do want to index from the end, you use the unary ^ operator to create a reverse index. Thus, you can write things like a[^1] or a[0..^1].

But, more importantly, it means that any custom collection type can define an indexer that can handle reverse indices in the manner that is appropriate for that particular collection; it's not just for arrays.

[1] https://learn.microsoft.com/en-us/dotnet/api/system.index


That's a lot of machinery which I feel is going to benefit relatively few people and cases. I suppose I should learn it just in case but my suspicion is that MS is adding extra stuff which they hope people will use which will act as a lock-in to C#. Ergo the benefit of this is to MS not to the end user ISTM.


All the indexed sequential collections in the standard library use it, for starters, and those collections are in turn used by the majority of users.

OTOH a convenience feature as a lock-in is hard to believe.


Indexing forwards, sure. How many backwards then?

And it might be possible to add a static method to array to index backwards yourself (can't remember what they are called, but they look and act like methods on the object but aren't).


All standard .NET collections with defined order and O(1) indexing support indexing backwards.

And no, it's not possible to do this using an extension method, unfortunately - there are no extension properties or indexers in C# (yet; it's something that keeps coming up). But then again, if and when they add extension indexers, this arrangement with a custom type is what'd allow you to write one that does backwards indexing on a collection type that doesn't support it out of the box.


Nim, which is not controlled by a corporation, does it the exact same way. The unary ^ operator applied to an integer creates a value of type BackwardIndex.


@int_19h, @xigoi perhaps you're right, but how many times have you ever indexed backwards? Other, I grant, than to get the last item in a list. If it's more general then reversing the list would be better; alternatively you might have

   lst.reverse()[x]
which the compiler could guarantee to recognise and simply implement as a calculation.


Well, I write plenty of Python code, so it actually comes up quite often. The annoyance with Python is that it just treats negative values as magic, so if you accidentally end up with a computed negative index, it silently does the wrong thing. But the alternative approach with explicit index-from-end syntax - whether like in C# and Nim, or like Julia and Matlab - doesn't have that problem; it's pure convenience.

And yes, of course, you can always do the same in some other, more verbose way. But why should we tolerate that verbosity when there's a solution that makes code both shorter and more readable? I rather hope that more languages will adopt one of these techniques.


(per your other posts, extension methods is their name. And they aren't supported here, got it).

> it silently does the wrong thing

yeah, my original complaint was this

> why should we tolerate that verbosity when there's a solution that makes code both shorter and more readable?

Because it's a balance. How much it benefits how many users to what degree vs. extra cost of implementation and maintenance. If you're not careful you go down the kitchen sink road and end up with bloat. Be careful when adding stuff cos you have to support it forever.

Anyway, thoughtful answers thanks.


I like D's approach of using `d[$]` to get the last element, `d[$-1]` to get the element before last, etc.


`$` refers to the length of the array, so `d[$]` is an array-bounds error. `d[$-1]` is needed for the last element, but you can’t do that blindly, you have to check that the length is nonzero or you’ll get unsigned underflow to ulong.max


Which will also cause an array-bounds error. So you're covered!


Perl has $#d for the index of the last element, so $d[$#d] is the last element.


Yeah, but $#list is better used for writing loops:

  foreach my $i (0..$#list) {
    say "$i: $list[$i]";
  }
For getting the last element from a list you can just use -1 (and of course further negative numbers work like you would expect, -2 is second to last and so on):

  my @last_three = @items[-1, -2, -3];


yes, I forgot Perl has negative indexes!


$my_properly_named_array[$#my_properly_named_array] doesn't look like QoL.


I was really surprised when I learned my second language (after Python) to realize that this wasn't standard!


I'm fine with an array type that supports this, but not as the default. I've been bitten in the past by code that ran error-free while giving incorrect output due to this feature suddenly making the indexing valid. I'd prefer it to be opt-in somehow so that the default behavior for negatives is invalid, not silently wrong yet valid behavior.
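
In Python you can at least opt in to the stricter behavior locally with a small wrapper; a sketch (slices would need the same treatment):

    class StrictList(list):
        """Opt-in strictness: negative integer indexes raise
        instead of silently wrapping around."""

        def __getitem__(self, i):
            if isinstance(i, int) and i < 0:
                raise IndexError(f"negative index {i} not allowed")
            return super().__getitem__(i)

    xs = StrictList([1, 2, 3])
    print(xs[2])   # 3
    print(xs[-1])  # raises IndexError instead of silently returning 3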


Matlab has the opposite problem, where your syntactically valid but mysteriously non-working code suddenly has a larger array, because assigning past the end silently grows it.

EDIT: On the other hand, I think Matlab's array(end - number) indexing syntax is a good compromise between convenience and less error-prone explicitness.


You can also overload end in Matlab for your custom classes as it's actually a method.


Ada has 'Last, 'First and 'Length attributes to similar effect.


While I agree, I've met people who think the idea of a negative index is completely absurd. Their brains seem to immediately reject the concept.


There are always some of those, for every novelty. Sometimes it's me.


Only downside I can see is the performance hit. You need to check if it's negative, then you now need to calculate the length first… which is a bit of overhead, but considering how often you use arrays in a tight loop… I mean, if you can optimize out the check because you can detect that the index is always positive…


You can have an optional length prepended to arrays that use that feature.


Yes please, there was nothing on the article's list that particularly resonated with me but I really wish this was standard in every language.


The problem is that it adds runtime overhead and can cause silent bugs.


JavaScript has something akin to that:

[1,2].at(-2) returns 1


That's for languages that can't define arrays with custom start/stop indexes. But those that do have custom indexes can very easily implement it as a helper (for example array.indexFromLast(1), which means array[Length(array)]). This way you can have the best of both worlds.


Surely if your language has custom indexes / ranges `Length(array)` is completely broken and the language provides something like "Index'Last" you can hook on?

Because an array with indexes [3, 7) has length 4, but 4 is not the index of the last element.


That's what Ada does, yes. You'd let the array (or whatever collection) do the work for you:

  for I in A'Range loop
    A(I) := A(I) + A(I);
  end loop;
Whatever the range is, this will work. If you really need the first and last elements or want to be explicit:

  Start := A'First;
  Stop  := A'Last;
And if the type of the range (since any discrete type can be used) doesn't support simple incrementing with +1 or similar, you can use 'Succ to step through:

  Index := A'First;
  Index := Whatever_Type'Succ(Index);
Also 'Pred to work backwards. Those can be wrapped up in a simpler function if desired.


And with its array slice mechanisms, Ada is one of the easiest/most productive languages for handling arrays.

Being able to give subarrays to a procedure, preventing buffer overruns everywhere, and reducing screw-up scope everywhere is a superpower I didn't know I needed before I started writing proven parsers.


Yup, correct. What I meant above with array[Length(array)] is for the languages that don't have it. Let me be more clear.

C/C++ doesn't have custom array indexes, and as such <array[std::size(array) - 1]> returns the last element of said array.

Delphi has custom array indexes and as such, taking your example with defining an array in the form <example_array : array[3..7] of integer>, I would not get the last element in case of <example_array[Length(example_array) - 1]>. In this case I would have 2 options. Option 1 would be to use the <High> function, as in <example_array[High(example_array)]>, to access the example_array[7] element. Delphi also has a <Low> function, so you can iterate through a custom-defined array by using the <for> keyword with the help of them. Option 2 would be to actually build my own helper (this is the most wanted case when you're dealing with multi-dimensional arrays that also have custom indexes) and I would have something like <example_array.FromLastIndex(0)> to access the example_array[7] element.

Hope this cleared the confusion.


As long as there exists a bijection between whatever you choose as an index and the natural numbers starting from 0 it is fine. (I.e. the range of valid indices must be a countable set) In your example that bijection could be:

  3 -> 0
  4 -> 1
  5 -> 2
  6 -> 3
This works for vectors as well, so why not have a range from (0,0) to (5,5) to index into an array arr? You could write the function that does the mapping manually:

  arr[(x,y)] = backing_array[x * 5 + y] //bounds checks omitted
But here it can be automated quite simply to allow for vectors of even 3 or 4 dimensions.

Just know that custom indexes / ranges are not automagically broken. Personally, I like how much easier it is to read the intent with custom indices.


Yes, then I would expect 'First and 'Last with the obvious meaning, and something like 'Range which returns an iterator of all indices.


if it works on literals only, so a[x] doesn't work if x is negative, then OK.

otherwise it seems like a source of errors that are hard to spot.


This is such a weird take to me. You're saying: I want to add a rule, where this structure responds to a request in a certain way, based on how the programmer wrote the request in the calling code. Layer upon layer upon layer of weird, janky, edge-case, pseudo-rules, with no consistency, no clear mental model; an absolute nightmare of a programming language. No longer can you possibly intuit what a[-1] really means, nor can you intuit the rules of indexing. You've broken TWO mental models in one fell swoop. No longer can I look at your programming language and assume that anywhere I see a 7, I can replace it with a variable whose value is 7. That is no longer true in your language! Variables no longer work intuitively in your language. Think about that! What an absolute nightmare!

This is exactly the difference between a language like PHP and a pure functional language. PHP says: usually we want to do X, but sometimes Y, so we'll make Z which does X unless Q is true in which case T1 will be set and Y will happen most of the time when you want it assuming you called it the right way and put an @ in the right spot otherwise P will happen because I hadn't had lunch when I wrote that and it seemed like P was pretty likely to be the case when T1 was set but an @ was not written but lately I've been feeling like maybe T2 should also be set sometimes so if you call Z and you want X but T1 is written and you don't want to write an @ then you can just set CONSTANT_FOO_BAR_WITHOUT_X_SET_AT to 17 because the other 16 codes are already used for other things.

Functional languages say: what if everything was just math?


I disagree about purity. At some point "math purity" or "math correctness" may not be desirable.

In this case everything is about intention.

In general, accessing an index out of range (above or below) is not desirable; in almost all cases this is a bug.

And now, in my opinion, `array[-1]` where `-1` is hardcoded would say, with full intention, that the last index is desired.

Basically it would be translated to `arr[arr.Length - 1]`. You don't write `array[-1]` by accident, because without wraparound behaviour it would be clearly wrong.

Meanwhile, when the index is calculated, it should result in an error.

The rules are pretty simple, I'd say: if you desire to use the "reverse syntax" then you can, but when you use variables, which may be calculated wrongly, you will receive an error.


> This is such a weird take to me.

TLDR: Not that weird. If it is something that is almost certainly going to fail code-review, then may as well let the compiler fail it.

Long:

Just because I want only literals allowed someplace, or only values allowed in other places is not even close to weird.

Most places, code review won't let a function call like `foo(true, false, true, false, true)` through, because the potential for errors is so high and the readability is low.

With this take I can see code review easily getting into the weeds for each `bar[x]` to determine if x will wrap around, while letting `bar[4]` through because it is clear it will not.

Right now, with most languages, we simply let `bar[x]` through because if it is out of bounds it will throw an error/panic/etc. I think it can only silently return wrong data in C and C++.


Ruby happily allows this and I can't recall it ever being an issue. It's no more prone to errors than `x` being greater than the number of elements.

> arr = ["a", "b", "c", "d", "e"]

> x = -2

> arr[x]

=> "d"


If you intended to calculate an index and you accidentally get len(arr), you get a runtime error. But if you accidentally get -1 you silently get the last element instead. Similar to the argument about signed/unsigned indices in low level languages.


If you calculated wrong you can land on a positive-but-existing value just fine...

> Similar to the argument about signed/unsigned indices in low level languages.

Think that one has to do more with convenience, where most stuff uses int by default


Elixir's sigils are amazing. There are date sigils that allow you to do what the OP does: ~N[2023-01-01 12:00:00]

But you can also define your own sigils to create new "custom syntax" for almost any struct. Kind of a special case of reader macros, I guess. Very convenient.


Swift has the ExpressibleBy<Type>Literal series of protocols for this.

For example you could write an extension on Date to add initialization from a string:

    extension Date: ExpressibleByStringLiteral {
        public init(stringLiteral value: String) {
             // parse the string here. 
        }

    }

You can then do things like:

    let happyNewYear: Date = "2023-01-01 12:00:00"

There are protocols for all literal types. For example, you could implement ExpressibleByIntegerLiteral and have it init the Date object from a Unix timestamp. There is even an ExpressibleByNilLiteral.


It's "funny" that things that are considered an anti-feature and something that needs to be avoided by all means in one place is considered a great feature in another place. This points strongly in the direction that there is no logic behind such "considerations".

What you just showed was a implicit conversion from String to Date. Something you would get beaten up for in Scala land.


C++ has the same with implicit constructors, generally considered to be a footgun that should be disabled with the explicit keyword unless such a conversion makes sense; implicit constructors are otherwise the default. For example, vector has a constructor which takes an integer size argument; if it weren't explicit, you could accidentally do vector v = {10}, which would construct a vector with 10 empty elements instead of one element with value 10. This also has to do with the ambiguous curly brace syntax in C++.


Well, even JavaScript has tagged template literals. Is this really that special?


Elixir sigils[1][2].

Eg: ~w(foo bar bat) is a word list. `~ letter bracketed-text letters-as-modifiers` desugars as sigil_<letter>(text, modifiers). Similar to foo_str() of Julia[3], but for one-letter names and more brackets. But not the unicode brackets of Raku.

[1] https://elixir-lang.org/getting-started/sigils.html [2] https://hexdocs.pm/elixir/main/syntax-reference.html#sigils [3] https://docs.julialang.org/en/v1/manual/metaprogramming/#met...


The most recent addition to the sigil family being Phoenix's new ~p"/health", which is an HTTP route string that automatically verifies whether the route exists, and returns compile-time warnings when you link to a path that doesn't. It's fantastic, and really surprising it took this long to be added to any web framework.


I really adore Python's iterable- and keyword-unpacking operators (*some_iterable, **some_mapping):

    arr = [0, 1, 2, 3, 4] 
    head, *body, tail = arr # head=0, body=[1, 2, 3], tail=4
    head, *rest = arr
    head, *_, tail = arr
    *_, tail = arr
    first, second, third, *rest = arr

    foo = {"a": 1, "b": 2, "c": 3}
    bar = {"b": 9, "x": -1}
    {**foo, **bar}  # -> {"a": 1, "b": 9, "c": 3, "x": -1}


Maybe more obvious, it also works the other way around. Saving you from doing [head] + body + [tail] or keeping copies of temporary arrays manipulated with append and extend.

    arr = [head, *body, tail]
Same with dictionaries, often very useful when you need to include **os.environ with modifications into a subprocess.
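
For example, something like this (the override variables here are made up):

    import os
    import subprocess

    # child process gets the parent's environment plus a couple of overrides
    env = {**os.environ, "LOG_LEVEL": "debug", "TZ": "UTC"}
    subprocess.run(["env"], env=env, check=True)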


I was really disappointed when they removed support for argument unpacking. It was just so useful, especially in lambdas, for example

    Mdist = lambda (x1,y1),(x2,y2): abs(x1-x2)+abs(y1-y2)
    
    p1,p2=(1,2),(3,4)
    
    Mdist(p1,p2)  # 4


why bother with the distinction between one * and two *?

in javascript the … operator neatly does both


Python has keyword arguments to functions. In that context, a single star packs or unpacks a list of positional arguments and a double star packs or unpacks a dictionary of keyword arguments.
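
A minimal illustration of both, packing in the signature and unpacking at the call site:

    def f(*args, **kwargs):
        print(args, kwargs)

    f(1, 2, x=3)        # prints (1, 2) {'x': 3}  (packing at the definition)

    args = (1, 2)
    kwargs = {"x": 3}
    f(*args, **kwargs)  # the same call, unpacked at the call site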


To complete the description, dictionaries are iterable by key only by default[1], so *dict would be ambiguous if it could splat both.

[1] a misfeature, IMNSHO.


Technically, `...` acts as either the spread or the rest operator, depending on the location. I recall there was an initial impression of intimidation among people who were familiar with ES5 syntax when trying to adopt these new features because of this.

On the other hand, I'm fairly certain that having to visually disambiguate between `*` and `**` and remembering which did what would have gotten a similar reaction.


One of the best truly micro features I've seen recently (can't remember which language unfortunately - it wasn't a mainstream one) is general binary literal syntax of the form:

    0x[de ad be ef 00]
So much nicer than the usual condensed format. And I think it'd be valid syntax in any language that allows binary integer literal.


I like that Elixir bitstrings[1] don't have to be bytes. <<2:3>> is three bits 010.

[1] https://elixir-lang.org/getting-started/binaries-strings-and...


A lot of languages allow underscores in numeric literals. Something like 10_000. You can put them anywhere and they get ignored. I don’t know if they also allow it for hex numbers.


> I don’t know if they also allow it for hex numbers.

They do e.g. Python

    >>> 0x_ab_cd_01_23
    2882339107
or

    >>> 0b_0010_0100
    36


Even C these days, although they chose ' instead of _ :-(


My guess is that they did that because there are real human languages where ' is used as a thousands separator.


There are also real human languages where a space is used as a thousands separator, and an underscore is kind of the programming equivalent of a space.


A half space is British Standard for thousands separator if memory serves, also standard in Norway.


The reason they used quote is outlined in the proposal: https://open-std.org/JTC1/SC22/WG14/www/docs/n2626.pdf

TL;DR is that it's because C++ uses quote. The reason C++ uses quote (they considered underscore) is because of a fairly obscure C++ feature called user-defined literals, which I'd never even heard of: numbers can be suffixed with a custom identifier, and since a single underscore is a valid identifier you can't use that. (https://en.cppreference.com/w/cpp/language/user_literal)


Swiss German, for example. But space, half space and dot are also common in Europe. Just not a comma, because that is used as a decimal separator (fractions) in most countries.


It's an interesting mix of "hex" and "array" into one syntactic composition. I do kind-of like it.

Here's how we ended up supporting byte strings (# prefix) in Ecstasy, in this case multi-line:

    Byte[] bytes = #|12 34 56
                    |78 9a BC
                    |dE f0
                   ;

    console.println($|bytes=
                     |{bytes.toHexDump(4)}
                   );
Which prints:

  bytes=
  00: 12 34 56 78 .4Vx
  04: 9A BC DE F0 .¼Þð


Is that giving you a number or a bytestring?


A bytestring I think. Although in statically typed languages there's no reason why it couldn't be both depending on the inferred type.


ohhh that's cool.. in Ruby you can do 0x_dead_beef_00 or any other combo


D's approach would be:

    0xde_ad_be_ef_00


Wouldn't endianness interfere with this notation and the preceding one?


Of course it does - endianness interferes with any sort of numeric literal. Doesn't matter if it's hex, decimal, underscored, whatever.


I think you are missing the point - there is no endianness in normal written text, (or code), but there is endianness when that is translated to an actual number - what order are those bytes intended to be used in?

(Given they have been listed separately rather than as a single number).


What order are the bytes in 0xabcd intended to be used in? I don't see how making it 0x[ab, cd] would be ambiguous, it's the same assumption of reading left to right.


Absolutely damned is the combination of several of Kotlin's features:

1. Lambda functions can be defined with `{}`.

2. `foo(bar, somefunc)` is the same as `foo(bar) somefunc`. In other words, if the last parameter is a function, it can be provided AFTER the closing parenthesis.

3. Interfaces that require only one method can be implemented inline with a lambda function (i.e. `{}` syntax for a no-param function).

Combined those three features, the code may look like that:

    routing {
        static("/statics") {
            files("css")
        }
        get("/foo") {
            call.respondText("Hello world!")
        }
    }
So you can make a config-looking file which is just pure Kotlin, with static type checking, autocomplete, suggestions, "this" etc.

It's so damned, I'm surprised the author didn't mention it.


What exactly do you mean by "damned" here?


I too am confused by the unique use of this word


I am assuming the author is using it as one would use "wicked" (with a positive connotation)?


I don't know why teenagers insist on taking a negative word and making it positive, especially when teenagers don't like adding emphasis to words, so you have no idea whether they think it's good or bad.

Not that I was completely innocent of this at that age.

/Rant.


That's so sick.


That's right, forgot the word (not a native speaker unfortunately).


All of these were copied from Groovy, in case you didn't know it.


It reminds me of Scala too, but I have no idea which started when. And of course both could have come up with this fairly independently. (After all it is the norm in functional style.)


Almost all Kotlin features are Scala rip offs.

Kotlin was started even as just a poor Scala clone. (Because JetBrains didn't manage to get a working Scala plugin for their IDE, so they thought it would be simpler to create their own "simpler" version of the language).


Kotlin's features are amazing for designing DSLs but I think your code sample uses more than just the 3 points you mentioned. Specifically the `call.respondText(..)` part. I assume that in this example the second argument to the `get` function is actually a "lambda with receiver", which means that the lambda executes with another object as the receiver (and that object is bound to "this" inside the lambda block), which makes the `call` object available.


Isn't it cool that you can't know what the code is actually by just looking at it? /s

Kotlin's scope injection is one of the most terrible "features" ever invented. It's dynamic scoping on steroids!

But dynamic scoping was long ago deemed a horrible bug and never ever made it again into any new language.


> 1. Lambda functions can be defined with {}.

Directly "stolen" form Scala.

> 2.foo(bar, somefunc) is the same as foo(bar) somefunc. In other words, if the last parameter is a function, it can be provided AFTER closing parenthesis.

Just an irregular syntax quirk that tries to get around the fact that Kotlin does not support multiple parameter lists, like the language where most Kotlin features come from, Scala.

> 3. Interfaces that require only one method can be implemented on-side with a lambda function (i.e. {} syntax for no-param function).

That doesn't have anything to do with Kotlin. That's Java's SAM (Single Abstract Method) feature.

> I'm surprised author didn't mention it.

The author seems not to know any Scala. Otherwise the lists would show mostly only Scala features… ;-)


Elixir's testing library uses metaprogramming to show the code that fails and what the values were on both sides of a comparison.

I.e.

    a = 1; b = 2
    assert a == b
Will fail with an error like:

    Assertion failed,
    a == b
    Left is 1
    Right is 2
So you don’t have a bunch of assert-functions; you just assert anything and it will spit out a decent error.


With how old pytest is, I assume that's where they got it from. Does it perform recursive value printing, or bespoke comparisons?

e.g. in pytest it won't just print out the values of "a" and "b", it will recursively document intermediate values until it's reached the toplevel expression:

    assert f() == g()
    assert 42 == 43
      where 42 = <function TestFailing.test_simple.<locals>.f at 0xdeadbeef0002>()
      and   43 = <function TestFailing.test_simple.<locals>.g at 0xdeadbeef0003>()
and it's possible to customise the report so you can report as a diff:

    assert "foo 1 bar" == "foo 2 bar"
      - foo 2 bar
      ?     ^
      + foo 1 bar
      ?     ^


This is one that I like a lot. Years ago (1997 timeframe) I had implemented it in a Java compiler, and a few years later in a Java library (https://github.com/oracle/coherence/blob/4e6e343e1ffd9bbfea3...) that would create an exception on the assertion failure and parse its stack trace to find the source code file name, and read it to find the text of the assertion that failed, etc. so it could build the error message ...

In Ecstasy, we built the support directly into the compiler again:

    val a = 1;
    val b = 2;
    assert a == b;
Produces:

    IllegalState: a == b, a=1, b=2


Yes! That feature is great. I got used to it in Elixir and Nim provides it as well. It's one of those little things that makes programming nicer.


ScalaTest does the same.


With regard to strings, they give a good example in Lua, but oh boy wait until this person hears about Perl (:

There's a whole section in the manual [1] for string quoting operators (qq, qw, qx, ...)

In general, I feel like Perl is one of those languages that has a high amount of these "quality of life" syntactic features, and helps make it enjoyable to write, once you get over the learning curve.

[1] https://perldoc.perl.org/perlop#Quote-Like-Operators


qw was great. Even in Python I sometimes write:

  usernames = '''
    foo bar baz
    hello world
  '''.split()

  # instead of this, which needs too many keystrokes
  usernames = ["foo", "bar", "baz", "hello", "world"]
Interestingly, Python named tuples have similar interface for fields:

  # all of these are equivalent
  EmployeeRecord = namedtuple('EmployeeRecord', ['name', 'age', 'title'])
  EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title')
  EmployeeRecord = namedtuple('EmployeeRecord', 'name age title')


Ruby was inspired by Perl and also has this. Pretty great


Raku (once Perl 6) generalizes quoting as a Q[1], followed by optional "how should this behave" adverbs, and text bracketed by any unicode bracket pair. So Q:w <foo bar> is a list of two words. And it has Perl-like qw/foo bar/ as sugar. Heredocs are Q:to/THEEND/ ... \nTHEEND . I'm unclear on whether you can extend this without defining your own Q-like thing.

Julia allows[2] defining your own non-standard string literals. foo"bar"hee and qux`...` desugar as macro calls foo_str("bar","hee") and qux_cmd("..."). But they lack the bracket flexibility.

http://rigaux.org/language-study/syntax-across-languages.htm... briefly sketches other languages.

[1] https://docs.raku.org/language/quoting [2] https://docs.julialang.org/en/v1/manual/metaprogramming/#met...


That's actually a great point. Perl has so many syntactic sugars that it is a poster child for too much variety in the ways you can write things, which makes it much harder to read someone else's code.


Yeah, it's a trade-off that's not discussed enough, IMO. Most general programming advice is directed at "programming in the large", involving many people, over a longer period of time, and the maintainability issues that go with that.

But you give up something to get benefits in those areas. Making use of the expressive power of something like Perl is a wonderful sensation. The barriers between thought and making it happen are lower, and so you can be remarkably productive. It is also just more fun, I find, which has subtle and under-valued long-term benefits.

But yeah, agreed that comprehending someone else's Perl-fueled vision quest can be ... rough (:


Perl has so much syntactic sugar it's the poster child for syntactic diabetes, really.


"Syntactic sugar causes cancer of the semicolon." ― Alan J. Perlis


It does! Till you hit the code of someone who didn't really care about readability that much, and then you're in for an ugly mess.


For Rust, it is probably the try (?) operator. Fundamentally, it's just syntactic sugar for a match statement with an early return in the Error or None case, but it really improves the ergonomics of dealing with Result and Option types.


Interestingly, there was once a solid effort to add a try operator to Go. While the proposal was quite well received, upon closer inspection it was realized it would be essentially useless in the real world as, given how the rest of the language works, you almost never would want to simply early return with the value received. The data revealed that the vast majority of the code in the wild that the syntax sugar would replace returned something else.

So while it is indeed a nifty feature if the rest of the language is also designed for it, it's not something that is easily tacked on to a language that is not.


In Rust, when you use `?`, it includes a step to convert the error type of the expression it was used on into the (possibly different) error type that the current function returns. So if you need to map low-level errors to high-level ones in a consistent way, you'd just do it once when defining that error type.


Which too requires the language to have a 'the producer is always right' over a 'the customer is always right' design, which Rust does. It all works well when the language is designed for it, but it has to be there at the macro level. Definitely not a 'microfeature'.


> Which too requires the language to have a 'the producer is always right' over a 'the customer is always right' design, which Rust does.

What do you mean by those two quoted terms?


Is there a link to this discussion? This seems interesting. If I can’t make a call to a downstream service, or a file I’m trying to read doesn’t exist, or a s3 bucket 404’s, or almost any other “real world” error I can think of, the only (sane) way I can think of handling this is propagating the error down to the caller (and perhaps logging?) Do you mean to say that it is idiomatic in Go to handle errors by… doing something else?

Here's how I thought Go was written:

    value, err := some_fn()
    if err != nil {
        return err
    }
(The ? operator works here because you could do value := some_fn()? and remove the if-statement boilerplate.)

Do you mean that instead of “return err” Go idiomatically does something else?


> the only (sane) way I can think of handling this is propagating the error down to the caller

A big problem, among many, with doing that is that you leak implementation details out of the abstraction. If, to stick with your example, you have a function that helps you with reading files, the caller shouldn't care where the data is stored. Today it might be the local filesystem, tomorrow S3, and when you make that change nothing about the rest of the program should break.

But you can't count on the lower level functions using the same errors. As you suggest, a "file I’m trying to read doesn’t exist" error isn't represented as a "bucket 404", even though at a higher level they are the exact same thing. If you returned the "file I’m trying to read doesn’t exist" error straight as you got it from the file API, the caller is going to depend on that, and when you replace it with the S3 function that returns a "bucket 404" error, everything starts to break.

What you typically want to do is return a more generalized "not found" error that can remain stable regardless of specific implementation details. There are rare cases where you can get away with simply returning the value up the stack, but in the majority of cases you need to handle the error, either by doing something with it or returning a new error that is more useful to the caller. And, so, try becomes essentially unusable without the language taking a larger macro take on supporting such a feature.

Like the sibling comment points out, Rust does "from" conversion when using try (?) to try and avoid encountering the same fate.
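
The same idea translates to exception-based languages too. A sketch in Python terms (NotFoundError and the path here are made up for illustration):

    class NotFoundError(Exception):
        """Stable, implementation-agnostic error for callers to match on."""

    def read_document(name):
        try:
            with open(f"/data/{name}") as f:  # today: local filesystem
                return f.read()
        except FileNotFoundError as err:
            # translate the backend-specific error instead of leaking it;
            # an S3 backend would map its "bucket 404" to the same error
            raise NotFoundError(name) from err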


> A big problem, among many, with doing that is that you leak implementation details out of the abstraction. If, to stick with your example, you have a function that helps you with reading files, the caller shouldn't care where the data is stored. Today it might be the local filesystem, tomorrow S3, and when you make that change nothing about the rest of the program should break.

This would lead to a situation where a local call is treated as the same thing as a network call. Which is known to be a very bad design.

> Like the sibling comment points out, Rust does "from" conversion when using try (?) to try and avoid encountering the same fate.

Yeah, and it avoids all the hassle.

Why couldn't any language (and especially Go) just do the same?


> Which is known to be a very bad design.

If your abstraction leaks that the implementation is a local call, and then you try to change that later, unquestionably. Again, you need to avoid leaking implementation details, which is also why you can't just add a try operator and make it automatically useful. Any leak of any kind in your abstraction will make life miserable later. Don't let your abstraction leak.

> Yeah, and it avoids all the hassle.

All it does is move where the code is located, placing the onus on the producer "the producer is always right" instead of the caller "the customer is always right". You don't actually avoid anything, just change the perspective.

> Why couldn't any language (and especially Go) just do the same?

Perhaps it could, but it requires that the language take a more macro look at the problem. It is not a microfeature.


You have things a bit backwards. The try operator was not conceived until after sum types and traits were already stable parts of the language. The addition was simple, didn't really alter the language in a meaningful way, and is almost purely syntactic sugar meant as a quality-of-life improvement for users. (I.e. it hits pretty much every single point in the article's definition of a microfeature.)


Here's Rust's try (?) operator.[1] It seems sugar for handling short-circuiting return values.

A bit like Swift's try? which converts throws to nil.[2] Less so JavaScript's short-circuiting optional chaining `?.`[3], which I thought of first.

[1] https://doc.rust-lang.org/std/ops/trait.Try.html [2] https://docs.swift.org/swift-book/ReferenceManual/Expression... then /Try Operator/ - section anchors don't look stable. [3] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


Lua also allows you to choose the string delimiter. If your string contains "]]" you can delimit it with [=[ or [==[ instead. Any number of "=" so long as the opening and closing delimiters match.


And that's why all modern languages implement streams/string helpers/string builders. You do not want to write or manipulate strings using "+" (the concatenation symbol) directly in code because, in the modern Unicode world, it tends to become a point of failure for obscure bugs / a maintenance horror show.


String builders originated in languages with immutable strings making code using something like "foo += bar" in a loop very expensive due to the need to allocate a new string on every iteration. A string builder is basically a mutable string that can be built in-place efficiently and converted to a proper immutable string at the end. It is purely a performance thing, and there are no Unicode issues when concatenating valid Unicode strings (i.e. sequences of codepoints).
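
In Python, for instance, the usual stand-in for a string builder is collecting the parts in a list and joining once at the end:

    # conceptually quadratic: each += may copy the whole string so far
    s = ""
    for i in range(10_000):
        s += str(i)

    # linear: the list plays the role of the string builder
    parts = []
    for i in range(10_000):
        parts.append(str(i))
    s = "".join(parts)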


Note that some kind of string builder is necessary for efficient repeated concatenation, both in languages with immutable strings and in languages with 0-terminated strings.

Using strcat() repeatedly in C for example will mean that the string is being read over and over again to find the end, making an O(n) loop actually O(n²).


Right, but that's still a performance optimization and not anything like what OP is implying.


Yes yes, OP had some really strange ideas. I was just adding some extra info to GP's answer, not trying to contradict them.


I don't understand the link with the parent, nor why + would be bad, besides a) the language being naive about concatenation / allocation, and b) the language allowing / not differentiating between a char 'x' and an integer, because that would result in an integer addition instead of a concatenation.


If you manipulate strings in code using "+" (the string concatenation symbol) on user input, you'd be in a world of hurt where you either do a lot of regex (which is ugly and unmaintainable) or you limit the user input to known characters only (which would be a bad user experience, and later on your manager would ask you to lift that constraint anyway because they want to support a new feature). Therefore you use, for example, a stream and simply dump your user input into the stream buffer as it comes, and go with that stream in your code from that point on. This way you're also future-proof if your manager wants to support, for example, Chinese and/or Japanese keyboards.


Not OP, but I'm still confused. Is this assuming that the language is naive in its implementation of string concatenation so it screws up Unicode?

I'm just not sure what the syntax has to do with the semantics of string concatenation.


I’m even more confused now


Just so you know, Lua's concatenation operator is ".."


> [In Chapel] there’s the config keyword. If you write config var n=1, the compiler will automatically add a --n flag to the binary. As someone who 1) loves having configurable program, and 2) hates wrangling CLI libraries, a quick-and-dirty way to add single-variable flags seems like an obvious win. Letting people define configurable variables at their call site is incredibly valuable, even if you don't have compile-time support, and even if you're working on something not meant to be an isolated binary.

At my startup, one of our most beloved innovations is that you can write `resolve_config("foo", default="bar", request=request)` pretty much anywhere you'd normally hardcode a value or feature flag... and that's it.

The first time it's seen in any environment, it thread-safely inserts-if-not-present the default value into a key-value storage that's periodically replicated into in-memory dictionaries that live on each of our app servers. Any subsequent time it's accessed, it's a synchronous key-value lookup in memory, with barely any overhead. But we can also configure it in a UI without needing a code redeploy, and have feature flags and overrides set on a per-user or per-tenant basis.

Sometimes, you don't need language support if you have some clever distributed-systems thinking :)
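
A toy, single-process sketch of the core trick (the real system replicates the store across app servers, and the names here are illustrative):

    import threading

    _lock = threading.Lock()
    _store = {}  # stand-in for the replicated key-value storage

    def resolve_config(key, default=None, request=None):
        # request would drive per-user/per-tenant overrides; ignored here
        with _lock:
            # insert-if-not-present on first sight, cheap lookup afterwards
            return _store.setdefault(key, default)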


> The first time it's seen in any environment, it thread-safely inserts-if-not-present the default value into a key-value storage

That seems like a great way to get amazingly hard to replicate bugs or odd behaviours if different subsystems use different values for the default.


I would want that system to log those changes to whatever monitoring system is being used, or integrate with the deployment system as a "deploy", so that when some oncall person is trying to figure out why the entire fleet is pegging their CPU, they can trace it back to the flag change.


You'd just need to have a mutex lock on the values. That could be slow, but you probably don't want to have a config flag in a hot loop anyways. :)

In Nim you can use compile-time flags that let you set constants, so you avoid the problem:

    const myLibraryVersion {.intdefine.} = 3


> You'd just need to have a mutex lock on the values.

Oh no, they're saying that it's thread-safe, so that's not an issue. Rather that, depending on the order of initialisation, possibly of different systems entirely, you can have different initial states, because different systems or subsystems decided on the default value.


> Sometimes, you don't need language support if you have some clever distributed-systems thinking :)

I think you may have outwitted yourself here; I know what that looks like because I've done it so many times in the past :-)

I'm afraid your solution is not distributed-system safe, as a different bootup order of the nodes[1] in your system would result in a different config value for that key. And, at some point, your nodes are going to come up in a different order.

[1] Nodes == instances of your code that run.


I'm partial to for-else loops. They fit perfectly into languages with compound expressions (produce a value from a break in the loop or from the else block).

They don't come up that often, but when they do they're really the best solution.
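
For anyone who hasn't met them, Python's version runs the else block only when the loop finishes without hitting a break:

    def report_even(numbers):
        for n in numbers:
            if n % 2 == 0:
                print(f"found {n}")
                break
        else:
            # runs only if no break fired
            print("no even number found")

    report_even([1, 3, 5])  # no even number found
    report_even([1, 4, 5])  # found 4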


I hate the Python syntax using "else" for this, but I love the feature.

I've often wanted both a "then" and an "else" from both for and while loops. The "then" would be for a successful completion (no break), and the "else" would be for when the loop doesn't even run a single iteration.

But that didn't make it into our "language budget", unfortunately. It's easy to implement, but hard to argue for when it doesn't get used often


They also seem to be unknown to a lot of Python programmers unfortunately; I’ve had PRs rejected because a for-else loop was “unusual syntax” and therefore considered hard to maintain.


I used one recently in a script and they're definitely not that readable. It always takes some time to work out what "else" means in the context of a loop.

I feel like it would work better with a more explicit keyword, but I don't know which one. `nobreak`?


Agree with kebab-case.

> Most languages have multiline literals, but what makes the Lua version great is that the beginning and ending marks are different characters. This solves the infuriating “unnestable quotes” problem string literals have, and you don’t have to escape all your literal \s.

That paragraph also uses “nestable marks”.


Nestable comment syntax is also nice. At least some MLs (e.g. SML) have it, that I know of.

Indentation-sensitivity can also solve similar problems. (Indentation does not have to exclude requiring graphic termination. A formal language can require both. Or just a helpful tool.)

(Also agree with 'kebab-case', although the name is new to me and a bit weird.)


What makes C-style comments unnestable?


/* */ could potentially be nestable; they are just not defined as such.


They are defined to be non-nestable. Any appearance of */ terminates the comment.


> Nestable comment syntax is also nice. At least some MLs (eg SML) has it, that I know of.

Scala has it too. (But OK, Scala is a kind of ML).


FWIW, Raku and XPath support kebab-case. If you really mean subtraction, you have to add whitespace.


That works, but isn't necessary in Raku. As long as the right side of the hyphen does not start with an alphabetic character, you're good:

    my $a = 45; say $a-3; # 42


That's such a perl/raku thing, and I mean it in the best way possible.


I really like swift's simple way of collapsing try catch blocks down to a simple value|nil result.

    let data = try? aFuncThatThrows()
Sometimes I don't care about the reason for the exception and just want to know whether it succeeded or not.


Rust does this too with `Result::ok`


# Function Shorthand ####

I like how functions in js can be `arg => result`. In F# I have to do `fun arg -> result` with the `fun` keyword. It makes sense since `MyArgType -> MyResType` is a type signature in f#, but I feel like the compiler can just check if the arguments are references to types or are argument bindings.

# Multiline Lists/Arrays ####

I like how F# doesn't require delimiters for multiline lists.

So I can do `let myList = [1; 2; 3]` or

    let myList = [
      1
      2
      3
    ]
# Regex Literal ####

I like how in Crystal instead of doing `/my[regex]/` i can do `%r(my[regex])` where the parenthesis can be any brace type (like "(", "{", "<", "[") so I don't have to escape any characters.

# Argument Accessor Shorthand ####

In Crystal, you can use an ampersand to bind and access a property on an object, instead of writing the verbose form with a function.

So this

    ["a", "b"].join(",") { |s| s.upcase }
can be written as

    ["a", "b"].join(",", &.upcase)
If this were available in F#, for example, instead of

    ["a"; "b"] 
    |> List.map (fun s -> s.ToUpper())
    |> String.concat ","
I could do

    ["a"; "b"] 
    |> List.map &.ToUpper()
    |> String.concat ","


> I feel like the compiler can just check if the arguments are references to types or are argument bindings.

1. this is absolutely terrible because now you need feedback from the type checker to know how to parse the program

2. it is furthermore also ambiguous with function application, requiring arbitrary lookahead to disambiguate, also not a fun thing to do

JS gets away with it because the sigil was not previously used and it only requires a single lookahead to parse, as only single-parameter anonymous functions can have "bare" parameter lists.


This works fine in Scala. And no, they don't need feedback from the type-checker to parse a program. Also there is no ambiguity with application.

  val f: Any => String =
     any => any.toString


Yeah, then maybe just shorten `fun` to `f` or `fn` like Elixir does. I think even one char makes a difference over thousands of LoC.


In Haskell (and Elm) it's `\`, which is quite OK (and also a nod to the lambda symbol λ).

But yes, "fn" is quite nice (going all the way down to "f" is a bit much). And a few characters can definitely degrade the experience, especially as "u" and "n" are typed with the exact same finger.

Anonymous functions were definitely one of my least favorite features in Erlang, not because they don't work well but because their leading keyword is "fun" and there's an arrow between the (parenthesised) parameters and body and they also have a closing keyword "end":

    map(fun(X) -> 2 * X end, [1,2,3,4,5]).
That's a bit much.

But HoFs in general are quite awkward, as referring to a named function also requires the `fun` leading keyword, and requires specifying the arity, so

    map(fun double/1, [1,2,3,4,5]).
after having defined the function as

    double(X) -> 2 * X.
(as you can see Erlang would really rather you defined named functions).


It doesn't need feedback from the type-checker. The type is not required, only whether a symbol is a type symbol. So it is enough to have feedback from the lexical scope, which can be tracked during parsing without any type analysis. It's a compromise, but a useful and simple one. C has been doing it since the 70s ("typedef").


Re: the argument accessor shorthand, there seems to be a proposal for exactly that (using _ instead of &): https://github.com/fsharp/fslang-suggestions/issues/506#issu...


The function syntax I like even more is one with implicit arguments so you don’t have to name them, e.g.

  waiting = sum workers #(%.in_queue + %.in_flight)
Clojure has some syntax like this, though it isn't needed for the most obvious use-case (functions that extract fields), because keywords, which are usually used as map keys, are implicitly functions that look themselves up in their argument, e.g.

  (:bar { :foo 3, :bar 2 })  ; => 2


> In F# I have to do `fun arg -> result` with the `fun` keyword

meanwhile in C++:

  [&](auto arg) { return result; }
You don't know how good you have it.


You can have this in Scala.

  List("a", "b").map(_.toUpperCase).mkString(",")
It's the shorthand for:

  List("a", "b").map(elem => elem.toUpperCase).mkString(",")
(Which also shows the first feature proposed above :-))


Python's `with`

Function application (pipe) operators, e.g.

    a |> b # Call `b` with `a` as an argument
    b <| a # Same as above, reversed direction
Then you can do something like

    let x : Map = collect <|
      [a, b, c]
        |> map(entries)
        |> flatten


> Python's `with`

It's very common already: try-with-resources (Java), using (C#), bracket (Haskell), unwind-protect (Common Lisp), ... though in the latter two it's more of a building block.

Also a building block: languages with a convenient and "unrestricted" syntax for anonymous functions can just use that, e.g. Smalltalk, Ruby, ... in Ruby a "with" is usually just passing a block to the corresponding object's constructor:

    # python 
    with open(...) as f:
        ...

    # ruby
    File::open(...) do |f|
      ...
    end


'with' is cool, but it is annoying that it creates a new scope. And then you have ExitStack if your lifetimes do not neatly map to scopes. And AsyncExitStack and async with if you happen to be in an async function.
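
For reference, ExitStack looks roughly like this (the file names are made up):

    from contextlib import ExitStack

    with ExitStack() as stack:
        # resource lifetimes no longer need one lexical `with` each
        files = [stack.enter_context(open(n)) for n in ["a.txt", "b.txt"]]
        # ... use files ...
    # everything registered on the stack is closed here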

I prefer RAII.


I don't get it, how are |> and <| any different from parentheses?


Call/Pipe operators are just sugar for normal function calls. It's nice because adding or removing a call doesn't require balancing parentheses. It's helpful for writing stream or sequence/iterator based code and throwing debug utilities in the middle.

It's less of an issue if your language has UFCS or other postfix function call syntax like mentioned in this thread, but if you don't this is nice to have.


Chaining without nesting.


If you mean without closing parentheses, I think you can also do that in languages like Haskell with non-parenthetical function calls.


It also allows you to invert the order of the calls so they are written in the order they occur. Instead of paint(sand(cut(measure(wood)))), you can write wood | measure | cut | sand | paint, which is easier to read, especially if it splits over multiple lines or has additional arguments:

    paint(sand(cut(measure(wood, 12), 40, :WZ), 220), :red)

    wood | measure(12) | cut(40, :WZ) | sand(220) | paint(:red)

IIRC you can do that in Haskell as well, but I forget the name of the feature. Many OOP libraries have started to adopt a chained method call style similar to this, but it is nice to be able to do with any function.


In Haskell, if all the functions in the pipeline are pure, I'd probably write that as

    wood
    & measure 12
    & cut 40 WZ
    & sand 220
    & paint Red
See https://hackage.haskell.org/package/base-4.17.0.0/docs/Data-...


I'm not a Haskell user, but my experience with this in the Nix language is a bit mixed. It definitely works sometimes, but then you get a pileup of parenthesis nesting anyway, because the default is greedy and you have to control which functions get which arguments.


Don't $ signs work in Nix the way they work in Haskell?

E.g. https://mmhaskell.com/blog/2021/7/5/function-application-usi...


If so, I've never seen that style used or documented.


I was under the impression that Elixir introduced this.


Elixir has forward pipes, but didn't invent them.

For instance Racket and Clojure have threading macros, which are more flexible as they're just macros (Clojure's `->` is equivalent to Elixir's pipe operator, but `->>` will fill in the last parameter rather than the first, and `as->` lets you bind a name to choose where the value is inserted in each call).

Haskell lets anyone who wants to define their own pipe operator; historically you had to BYO, which wasn't exactly hard:

     (|>) = flip ($)
or

     x |> f = f x
would do (modulo fixity), but today it's provided by default as "(&)".


The earliest I've seen |> specifically in stdlib was in F#, which still predates Elixir by several years.


It was present in SML codebases earlier than F# (it's used everywhere in the Isabelle codebase, for example, and that's where I first came across it).


Some from Raku (formerly Perl 6) that I really like:

* sub MAIN:

    sub MAIN(Int $x, :$verbose) { }

generates a command line parser that expects an Integer plus an optional named switch --verbose

* It has named params (as seen above), and there are abbreviations: instead of thing => $thing you can write :$thing to avoid duplicating the name (:thing also exists, though it creates the pair "thing" => True, so Ruby lovers need to be careful :D )

* junctions for quick conditionals/validation: 0 <= all($x, $y, $z) <= 2 * pi

* This is probably debatable, but: if you use a * as a term, it will create a lambda for you, so *+2 is similar to sub ($x) { $x + 2 }


Phasers are brilliant too. So much nicer than all the clunky workarounds for "the first/last pass through this loop needs to be treated specially".


> If you look at something like numpy functions, so many of them share the exact same parameter definitions. What if you could write def log(standard-exp-params) instead of having to write them out every single time?

They're not actually written out every time; the issue is mostly documentary (and it would be nice if Python or Sphinx ever had a good solution for it). And numpy actually has a bunch of generators for that, e.g. https://github.com/numpy/numpy/blob/45bc13e6d922690eea43b9d8... handles filling in the common bits of documentation for the ufuncs.


I don't use Numpy but it sounds like they're describing *, ** operators.

    default_args = ('x', 'y', 'z')
    default_kwargs = {'p': 'p', 'q': 'q', 'r': 'r'}
    
    def printer(x, y, z, /, *, p, q, r):
        print(f'x={x} | y={y} | z={z} | p={p} | q={q} | r={r}')
  
    printer(*default_args, **default_kwargs)  # x=x | y=y | z=z | p=p | q=q | r=r


Yes but also that lacks most of the documentation so it's not great.

If you have multiple callables taking these parameters, documenting them is awkward; by default help/pydoc and sphinx will tell you that the parameters are `default_args` and `default_kwargs`, but that's not actually true: those are just intended as shortcuts / helpers.


> Quality-of-life features that aren’t too hard to add, and don’t meaningfully change a language in its absence. Often syntactic sugar

This is basically the premise of Project Coin, released in Java SE 7: https://openjdk.org/projects/coin/

    The goal of Project Coin is to determine what set of small language changes should be added to JDK 7. That list is:
    
    * Strings in switch
    * Binary integral literals and underscores in numeric literals
    * Multi-catch and more precise rethrow
    * Improved type inference for generic instance creation (diamond)
    * try-with-resources statement
    * Simplified varargs method invocation


I really liked the postfix and prefix notation in Mathematica. These three all mean the same:

    f[x]
    f@x
    x // f

It matches the flow of thought more naturally when hammering out a couple of one-liners.


How is that third one readable at all? I’d assume it meant integer division.


Mathematica uses a lot of syntactic sugar. You will find all Mathematica code unreadable until you've learned to read it; `//` is no exception.


I find Mathematica code unreadable, full stop.

It's convenient when writing, though.


I see Mathematica as a shell for math. You can write long programs or modules in it, sure, but very often you're simply typing up a couple of lines to check some computation or visualize an expression which you won't even save, in which case readability is secondary.


You're writing something and you decide you want to apply 'f' to it, so you type '// f' (instead of backspacing like a caveman). It's actually rather convenient.


Ah neat.


It's usually used with the formatting functions, so you have something like:

    <Some big expression here> // Column

Useful where the function in question is an "afterthought".


That's just a matter of habit. The exact symbol isn't important anyway.


Julia also has

    x |> f
as syntax sugar for

    f(x)
and it is useful for the same reason as in Mathematica.


(I don't know where the pipeline operator originated, but F# certainly had it before Julia, by the way.)


Strongly-typed units and unit literals (e.g. 3mL, 10gal, 15m / 3s = 5m/s)

AFAIK F# has these and that's about it


I see this in some Rust crates, as you can implement traits for built-in, primitive types.

For instance, the `embedded_time` crate lets you do

    200.microseconds()
    5.Hz()
https://docs.rs/embedded-time/latest/embedded_time/


They can be implemented as a library in C++ thanks to templates and user-defined literals, and there's a proposal to add them to the stdlib based on one of the existing implementations: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p19...


You can implement it in C# too with zero run time cost.


I don't see how, except for manually rolling a struct for every new combination of units. The C# generic system is not flexible enough to define things like "m / s / s = (m/s^2)".


There's a Java compiler extension, Manifold[1], that adds this to Java.

[1] https://github.com/manifold-systems/manifold/tree/master/man...


You can make that in Kotlin, e.g.: `3.mph` or `(-45..45).deg` for ranges.

with definition like:

    val Double.mph: Speed
      get() = Speed.mph(this)

    data class Speed(val ms: Double) {
      companion object {
        fun mph(mph: Double) = Speed(mph * 1609.344 / 3600.0)
      }
    }


Frink (which OP mentions for its datetime syntax) has the concept of units and unit conversions built into the core language - but that's about all it does! I'd love to have this feature in a general-purpose language.



Keyword and optional arguments (seen in e.g. Python, Bash with --flags, and OCaml) are my favorite language superpower. They make code more self-documenting and let you add optional behaviors to functions. This makes it really easy to build concise, highly usable APIs.
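
A small Python illustration of what that buys you (the names are made up):

    def connect(host, port=5432, *, timeout=30.0, retries=3):
        # the bare * forces timeout and retries to be passed by keyword
        return f"{host}:{port} timeout={timeout} retries={retries}"

    connect("db.local")               # defaults kick in
    connect("db.local", timeout=5.0)  # the call site documents itself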


In OCaml, they are a bit more subtle:

   let x = ... in
   some_function ~x
if the parameter has the same name on both sides (caller, callee) there's a syntactic shortcut. It's a small but noticeable force that pushes you towards more consistent naming.


I love Common Lisp's argument support, where you can also get a parameter that tells you whether an optional or keyword argument was supplied by the caller, instead of relying on the default value, for when you need to know. It's very useful for something like a patch function.
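
The closest Python idiom I know is a private sentinel default, which lets you distinguish "not supplied" from "supplied None" (a sketch):

    _UNSET = object()  # unique sentinel; can't collide with real values

    def patch(record, value=_UNSET):
        if value is _UNSET:
            return record        # caller didn't supply value at all
        record["value"] = value  # even an explicit None counts as supplied
        return record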


Toit makes a lot of use of this, and uses the bash syntax.


Perl's if and unless operators, which can also be postfixed, eg:

  die "can't be negative" unless $i >= 0;
Perl is full of usability features, like <> for reading input, inline literate coding annotations, implicit $_.


Can’t say this was the source but it harkens back to DEC BASIC-PLUS on the PDP.

It has the concept, I think, of “statement modifiers”.

You could do things like:

    S=S+A[I] IF A[I]>5 FOR I=1 TO N
It always read really well.


Similar in spirit to the `until` loop.


Swift and Gleam both have function argument labels and parameter name separation.

Example in gleam:

    // definition
    fn catalog(by list_of_qualities) {
      add_to_catalog(list_of_qualities)
    }

    //applying the function
    catalog(by: [Genre, Year])


WebGL's swizzling vector selectors.[1] Where v.x desugars as v[0], v.y as v[1]. Similarly for z and w. Also r,g,b,a. And they swizzle: v.rgb, v.xz, v.zx . So `v1.xy = v0.yx` reflects.

[1] https://www.w3.org/TR/WGSL/#vector-access-expr


Which leads to the question why you would use offset / index syntax instead of dot syntax in the first place.

Therefore I would prefer something like this to be the usual array access syntax:

  val chars = Array("a", "b", "c")
  val secondChar = chars.2 // as a shorthand for `chars.atIndex(2)`, or equivalently `chars.atOffset(1)`, maybe also with `chars..1` for the offset case
(Also we should stop calling the offset "index", and get a proper "atIndex" method.)


That applies to GLSL in general. I tried once to emulate that using C unions, but quickly realized that there's no solution for v.xz, v.xw, v.yw, etc.

You can write functions like xyz(v) and that works well enough. Even better if you have ufcs or methods.


I might add kebab-case to my current language project. From all the code I've written, I only found a handful of - operators not surrounded by spaces, so that ambiguity wouldn't bite me often.

Also, I wish the unary negation operator was more visually salient. `foo * -bar` is very different from `foo * bar`, but it's only a handful of pixels on the screen. I've thought about trying to render it as an em-dash or something. Didn't NASA lose a rocket over a spurious - sign?


Some languages use ~ for negation; spending it (and other common chars) on bitwise ops is a waste in most languages that don't specifically target bit twiddling.

That said, have you considered making this outright illegal without explicit parentheses? I actually wish that more languages would require that any sequence of operators has the same precedence throughout; i.e. a+b*c would also be illegal. It's always a pain to remember the exact precedence rules, especially since they're not consistent across PLs, so I'd prefer any expression that is ambiguous to be explicitly disambiguated.


APL uses ¯ for negative numbers. - is an ambivalent function (it can work as either a monadic or a dyadic function).


Code is symbolic and not like a (western) written language.

Kebab-case and snake_case may seem to read better if you look at code the way you'd read written text, but they read worse than camelCase when you look at code symbolically.

https://news.mit.edu/2020/brain-reading-computer-code-1215


I have a similar opinion on using the exclamation point for the NOT operator.

It can be very hard to spot sometimes, especially after an opening bracket - if(!something())

Of course it's generally better to try to rename functions or refactor to avoid the negation if possible.



There are coding fonts that make the "-" look more like a minus than a hyphen. I know that Iosevka and its variants do.


What I'd like to have brought back is basically the extended version of the numbers/kebab-case idea: ignore white space in constants and identifiers where possible, like Algol 68 did.

That means you can write your number as "1 000 000" and put it in the variable "one million", and then feed that to your function called "withdraw money".

Yes, sure, makes it harder to grep. Here's a nickel, get a better grep tool (or wrapper thereof).


I would add named function parameters to this list. So useful.


Clojure’s loop expression hits this spot for me. It sets a recursion point to which you can jump using any logic inside the body you want, as long as it is from tail position. It’s like a while loop turned into an expression. I haven’t encountered any other way to write iterative expressions whose number of iterations isn’t known at the top (like map and reduce).


Tail call optimization can get you that, too. If you've written Scheme and/or gone through SICP you might be familiar with this: you write a recursive function, with the recursive function call as the last thing the function does ('tail-recursion'), and the compiler/runtime is able to optimize those recursive calls out rather than consuming one stack frame of space per call ('tail call optimization'). Clojure has loop/recur at least partially because it doesn't support tail-call optimization.

See https://en.wikipedia.org/wiki/Tail_call for more. Or SICP might be a good resource. https://sarabander.github.io/sicp/html/1_002e2.xhtml


Interestingly, I almost prefer Clojure's `recur` semantically. Means you don't have to change the function name twice if you rename it, and it's hard to miss that you're recursing.


Those are cool properties. Another one is that you get a compilation error if your recursive call isn't in the tail position (and thus would actually grow the stack when you thought it didn't).

One thing I don't think you can do with loop/recur, though, is optimize more complicated bits of recursion than a single function that calls itself. I.e. imagine a recursive call pattern that goes like f -> g -> f -> g -> ...

(edit: I'm pretty sure this is why trampoline exists, though I've never really played with it... https://clojuredocs.org/clojure.core/trampoline)
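
The trampoline trick itself is tiny; a Python sketch of the f -> g -> f case, with made-up function names:

    def is_even(n):
        # return a thunk instead of calling is_odd directly
        return True if n == 0 else (lambda: is_odd(n - 1))

    def is_odd(n):
        return False if n == 0 else (lambda: is_even(n - 1))

    def trampoline(result):
        while callable(result):  # keep bouncing until a plain value appears
            result = result()
        return result

    print(trampoline(is_even(100_001)))  # False, with no deep recursion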


If you don't have to change a function's name twice when you rename it, that implies it is not called anywhere. :)


It's also nice to get an error when you `recur` from a non-tail position rather than the function just quietly becoming truly recursive.


If that's your worry then you can probably use the site's namesake. Though simple recursion is generally easy to spot.


Which site's namesake? Hacker News?


The Y combinator.


Oh is this some Common Lisp thing? Never done it.


It's much older, it's lambda calculus stuff. It's a way to implement recursion in a language which doesn't have recursive functions (but for some reason does have first-class functions).

However it allows making anonymous functions recurse as well.
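
In Python, which is strictly evaluated, you need the Z variant of Y, but it fits in a couple of lines:

    # Z combinator: the Y combinator adapted for strict evaluation
    Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

    fact = Z(lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1))
    print(fact(5))  # 120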


TCO also, unlike special syntax for direct tail recursion, works when the last call is not (directly) recursive (which supports indirect/mutual recursion, and also structures with deep call hierarchies that aren't necessarily recursive.)


Oh I didn’t know it was kind of a workaround. I do like the fact that loop is not a function though but an expression like if or case.

FWIW I think in Clojure you can use “recur” inside functions too to specifically indicate tail call recursion without relying on automatic optimization


> without relying on automatic optimization

I didn't think Clojure had any automatic optimization at all, due to the JVM not supporting it.


I think you mean automatic tail call optimization?

(JVM has quite a lot of automatic optimizations that Clojure enjoys automatically, and clojure itself also has some automatic optimizations).


> I haven’t encountered any other way to write iterative expressions whose number of iterations isn’t known at the top (like map and reduce).

`unfold`, Rust's `loop`, generators, working tail recursion elimination (the lack of which loop/recur is a workaround for)


Is that like using "continue" in most C-syntax languages?


Only if you could pass parameters to "continue" (which you can't).


for x in y: yield x

Job done


Also known as (in Python, that is):

  yield from y


They were providing a partial example; "yield from" does not actually do what the original poster asks about, it merely proxies the inner iterable.


> roughly three classes of language features [...] 3. Quality-of-life features that aren’t too hard to add

I'd regrettably add another class, quality-of-life features which you'd have hoped weren't too hard to add, but because of past choices, now are.

Examples: Adding JavaScript-like dots a.b.c for Julia Dict's a[:b][:c] would conflict with "wasn't intended to be public but has been" Dict implementation fields, like .count. And Ruby could have had { a,b | ... } instead of the less concise { |a,b| ... } for blocks, but for a yacc grammar conflict.


You can build method access in Ruby trivially enough, but you will forever be explaining to users why node.class isn't node[:class]. And now every method added to that Hash/Dict-like object is a breaking change, because someone somewhere could already be using that name to access a key.

The { a,b | } syntax is also still ambiguous with hash literals, unless you require that | to be there, and that looks like it gives the parser a whole lot of look-ahead work to do in order to distinguish hashes from lambdas.


PHP supports kebab-case variables:

    ${"variable-name"}=123;
Isn't it beautiful?


I agree that kebab variables aren't to my taste either, but I am partial to the notion of kebab-case keywords that I encountered in a JEP draft [0]. It suggests expanding the keyword vocabulary with a form that is otherwise invalid syntax, similar to how java treats module-info.java and package-info.java as valid files, but rejects any other hyphenated java class filename.

[0] https://openjdk.org/jeps/8223002


Pretty ugly. Scala does it like this:

    var `variable-name` = 123
    `variable-name` = 456
Looks much cleaner to me.


So does Python, I suppose:

    locals()["kebab-case"]=123


Perl, too. (Probably not a coincidence.)

You can also put a newline in a variable name if you really want. Or a 0 byte.

Here's a demo. I've used the debugger because its "X" command can print the true name of the variable:

    $ perl -d -e 1

    Loading DB routines from perl5db.pl version 1.60
    Editor support available.

    Enter h or 'h h' for help, or 'man perldebug' for more help.

    main::(-e:1):       1
      DB<1> ${"variable-name"} = 123;

      DB<2> ${"variable\nname"} = 456;

      DB<3> ${"variable\0name"} = 789;

      DB<4> X ~variable
    $variable^@name = 789
    $variable^Jname = 456
    $variable-name = 123


How many languages support kebab case with any Unicode dash that isn't the ASCII one? :)


Probably any language that supports Unicode identifiers.


F# allows arbitrary names within double-backtick identifiers. The following is a valid declaration, where I've used a hyphen, en-dash, and em-dash:

    let ``foo-bar–baz—quux`` = 3


Agda does, but it also supports a plain ascii hyphen in identifiers. It allows operator characters inside identifiers and requires spaces around operators otherwise (as proposed in the article). So you can use x-y as an identifier:

    x-y : ℤ → ℤ → ℤ
    x-y x y = x - y
The Agda community also heavily uses unicode characters. I've even seen a unicode colon used for a custom syntax because the ascii colon was unavailable.


> The Agda community also heavily uses unicode characters.

Wise move.

Finally, a language from the 21st century.

Still sticking to ASCII is madness. Especially as most people on this planet don't use ASCII as their native char set.


At the moment, the only one I know for sure is Agda.

I suspect Java would work as well; not sure about Go's Unicode variable naming.


play.golang wouldn't let me use figure (‒, U+2012), endash (–, U+2013) or emdash (—, U+2014) in a variable name directly.

But then the spec[1] says that only code points characterised as "Letter", an underscore, or characterised as "Number, decimal digit" are valid.

[1] https://go.dev/ref/spec#Identifiers


Coptic Small Letter Dialect-P Ni should work, looks like a hyphen.

Go Playground: https://go.dev/play/p/kxgOcEWsznz

https://www.compart.com/en/unicode/U+2CBB


Oh, neat catch!


I have long wondered why Ruby's symbols aren't in every language.


Because in most languages they're not useful. Symbols are solutions to problems, some of which are:

1. mutable strings (ruby)

2. and / or expensive strings (erlang, also non-global)

If you have immutable "dense" strings and interning, and you automatically intern program symbols (identifiers, string literals, etc...) then symbols give you very little.

And then there's the slightly brain-damaged case like JavaScript, where symbols are basically a way to get some level of namespacing: after the dark years of ubiquitous ad-hoc expansion, you're completely stuck, unable to add new program symbols to existing types, because you could break any page out there doing something stupid.


As the article covers, they are nice syntactically, regardless of those performance considerations. They fill a niche that in my experience actually turns out to be more common than string literals (though less common than strings as actual textual data).

I haven't written ruby (or any lisps) for awhile, and I miss symbols.


They exist in K/Q. A single-word identifier-shaped symbol begins with a backtick, or a multi-word symbol can be created with a backtick and double quotes. A sequence of symbols is a vector literal, and is stored compactly. For example:

    `apple
    `"cherry pie"
    `one`two`three
Many languages will intern string literals implicitly, or allow a programmer to explicitly intern a string; for example Java's "String.intern()".

The problem with string interning, especially for strings constructed at runtime, is that for the interning pool to be efficient it is very desirable for it to be append-only, and non-relocatable. A long-running program which generates new interned strings on the fly risks exhausting this pool or system memory.



> A long-running program which generates new interned strings on the fly risks exhausting this pool or system memory.

So does a long-running program which generates new symbols on the fly.


Ruby symbols are similar to keyword symbols in Common Lisp. There a keyword symbol is a symbol which evaluates to itself:

    CL-USER 69 > :a-keyword-symbol
    :A-KEYWORD-SYMBOL
Keywords with a similar name are identical:

    CL-USER 70 > (eq :a-keyword-symbol :a-keyword-symbol)
    T
One can't set keyword symbols to another value:

    CL-USER 71 > (setf :a-keyword-symbol 3)

    Error: Cannot setq :A-KEYWORD-SYMBOL -- it is a keyword.
They have certain features of normal symbols, like a property-list with keyword/value pairs.


Both Lisp and Erlang have them, so they're much older than Ruby.


Presumably Erlang inherited them from Prolog, too.


I don't remember Prolog much, but since Erlang inherited very much, you're probably right.


Personally, I found Ruby's symbols to be a source of bugs because they can easily get mixed up with strings. The article gives the example of dict[:employee_id]. But what happens if you serialize "dict" as JSON, then parse it again? The symbol :employee_id will be silently converted to "employee_id", which is treated as a different dict key from :employee_id. I found it was easy to lose track of whether a given dictionary is using the "keys are symbols" or the "keys are strings" convention, especially in larger codebases.


Yeah, symbols are terrible, and they lead to using Mashes or "Hashes with indifferent access" to attempt to allow both syntaxes. This helps with round-tripping to JSON and back and getting consistent access either way, but values are still not converted. And values shouldn't be symbolized from JSON, which means round-tripping through JSON typically converts symbols into strings.

It would be a lot easier if symbols had been just syntactic sugar for immutable frozen strings so that :foo == "foo".freeze == "foo" would be true.

And under the covers these days there is very little difference. It used to be that symbols were immutable and not garbage collected and fast. And that strings were mutable and garbage collected and slow.

These days symbols are immutable and garbage collected and fast and frozen strings are immutable and garbage collected and fast (and short mutable strings are even pretty fast).

Symbols as a totally different universe from Strings I would consider to be an antipattern in language design. They should just be syntactic sugar for frozen strings if your language doesn't already have frozen strings by default.


Always serialize the keys back to symbols.


In languages that are statically typed and support enums, symbols are not necessary IMO.


As a side note: Symbols were just removed from Scala 3. They were quite useless and can be replaced with strings without any downsides.


Is it for performance reasons?


Symbols in Ruby are meant to be more performant than strings, IIRC. If I have the symbol :a, then it's allocated once regardless of how many times it appears. As opposed to "a", which is reallocated every time.

I guess it's similar to Python having a single instance of small integers. PlayStation also experimented with caching small floats, which gave them some perf improvements too, but I think it wasn't as performant in all cases.
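
You can observe the CPython behaviour directly (it's an implementation detail, not a language guarantee):

    # CPython keeps ints in roughly -5..256 as singletons
    a, b = int("256"), int("256")
    print(a is b)  # True: both names point at the cached object

    x, y = int("1000"), int("1000")
    print(x is y)  # False: built at runtime, two separate objects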


Lua (and some other languages) intern strings, so all strings that are the same point to the same string instance. This gives the same benefits (plus string equality is just pointer equality) without a different type.
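
Python exposes the same mechanism opt-in via sys.intern:

    import sys

    # strings built at runtime are normally distinct objects...
    a = "key-" + str(1)
    b = "key-" + str(1)
    print(a is b)  # False (in general)

    # ...interning returns one canonical instance, so comparison
    # can be pointer equality
    print(sys.intern(a) is sys.intern(b))  # True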


There is a caveat in older Ruby versions that they aren't garbage collected, so they shouldn't be used for things like user input. Not a problem since 2.2 though.


Symbols can even improve performance. Replace them with integers at compile time like a global enum, and so the runtime only needs to compare integers instead of potentially lengthy (especially if UTF-16) strings.


> Replace them with integers at compile time like a global enum, and so the runtime only needs to compare integers instead of potentially lengthy (especially if UTF-16) strings.

All of those strings will be interned, and can thus be compared by identity. Which is an integer comparison.


Haskell has almost all of these.

> Instead of writing 10000500, you can write 10_000_500, or 1_00_00_500

https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/nume...

> Balanced string literals

https://hackage.haskell.org/package/raw-strings-qq-1.1/docs/...

> Generalized update syntax

Use Lens. `fileName %~ max 2`

> you can write the sequence 1, 2, … n-1 as 1..<n.

Yup. `[1,2..n-1]`. There's far more to it; you have access to almost a SQL-like sublanguage.

> Symbols

In Haskell you use a hash as a prefix instead of a colon.

Haskell sadly does not do automatic lifting, no extended parameter blocks, and no kebab-case.


Same goes for Scala.

You could likely build "automatic lifting" with macros.

"Extended parameter blocks" are just normal Scala method signatures.

You can use kebab-case (though with back-ticks).


>> Symbols

> In Haskell you use hash has a prefix instead of colon.

Can you give an example?


https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/over...

You can do all sorts of things with them. Use them like symbols in Scheme, say to name fields `get #name user` or to access database tables, etc.

But what's even more interesting is that the name is reflected up into the type. `get #name user` won't fail at runtime. Your database table name can be checked at compile time.


Ah, OverloadedLabels – I've only seen them in the `get #name user` use-case. I feel like Scheme/Lisp symbols are used quite a bit more generally, but maybe it's just not caught on yet in Haskell, also other features fill the same roles (e.g. in many lisps you can unquote a symbol and use it as the function of that name; people also often use them similarly to data constructors for pattern matching).


Balanced string literals (with some extra QoL) recently landed in C#, haven’t got the chance to use them yet, but they sound really nice in the right situation.


Comments Section

In Next Generation Shell I've experimented by adding

    section "arbitrary comment" {
      code here
    }

and this is staying in the language. It looks good. That's instead of

    # blah section - start
    code here
    # blah section - end

Later, since NGS knows about sections, I can potentially add section info to stack traces (also maybe logging and debugging messages). At the moment, it's just an aesthetic comment and (I think) an easily skip-able code section when reading.

Symbols

I've decided not to have symbols in NGS. My opinion (I assume not a popular one) is that all symbols together form one big enum, instead of multiple enums which would convey which values are acceptable at each point.


I sometimes do something like:

  var foo string
  { // Do stuff
      // ...
      foo = "..."
  }

  { // Other section...
  }
You can also split stuff up into sections, but for some kinds of functions, where you know the pieces will never be re-used and are intimately related, I find this clearer.

Of course, your language will need to have block scope, or at least blocks.


If it's just for comments, IIRC Lisp has docstrings - the very first expression in a Lisp function can be a string literal which gets compiled into the final executable as a docstring which can be retrieved at runtime.


You can do something similar in JavaScript:

    const love = "love";
    { // Section
      console.log(`What is ${love}`);
    }


Not the same ergonomics.

The language doesn't "understand" that it's a section (none of the opportunities listed in the original comment apply).


Actually, it does, though the only real use is with the `break` keyword for breaking out of a labelled block (or break / continue in a for loop): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

Things like stack traces don't track them, unfortunately.


C# has #region/endregion

They are foldable in IDEs


And the result is people writing 10k LOC files.

I'm not sure this is the right approach…


Ruby's Enumerable module is incredible and jam-packed with features like .tally, .any?, .take, .partition, etc.

But one of the things I love most is how seriously they take having their Hash be enumerable. I love being able to loop through any hash as easily as you would with an array


The "Expanded Parameters blocks" is kind of supported in Kotlin. You can write a function like this:

    /**
     * @param arg the string to garble
     */
    fun doThing(@NotEmpty arg: String = "default")
In this example, "arg" cannot be null, or else it would have to be declared with the type "String?". It obviously has a default value. It has an annotation that performs some validation, though admittedly that is a library and not a language feature. And it has its own documentation. I find this more concise than the PowerShell example.


As with almost all "Kotlin features", this is a direct rip-off from Scala, where the signature would look almost identical (besides the fun keyword).


> chained evaluators (if 2 <= x < 10)

Because C evaluates this to `((2 <= x) < 10)`, which is pretty much never what you want, D just makes it illegal. You'll have to add parentheses.
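
Python went the other way and defines chaining the way you'd read it on paper:

    x = 5
    print(2 <= x < 10)  # True: means (2 <= x) and (x < 10)
    # under C's rules this would be ((2 <= x) < 10), i.e. (0-or-1 < 10),
    # which is always true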


Comptime.

After having discovered Zig, I've been missing that feature in every other language.


Comptime is great, but I wouldn't classify it as a microfeature. It'd fall more in the second category the author defines.


Zig would be great if it took security more seriously.

As long as you end up with the same problems as in C/C++, it can be as great as it likes; it will stay a language of the past.


Compare to lisp macros?


For me, yes and no.

Sure, comptime is great, but I've also found it hard to reason about code with it. I prefer my comptime stuff separated out into its own section/file/whatever. With that small change, it becomes so much easier.

But yeah, still powerful and nice.


And error unions with corresponding semantics. And explicit casting requirements. And @TypeInfo. And no hidden allocations. And probably like 4 or 5 other things I'm not thinking of right now.


I'd like to see the "in" operator from SQL in C-style languages.

Something like:

  if (x in (null, 2, 3.14, foo(123))) {
    //
  }
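
Python's in already reads like the SQL version (with None standing in for null):

    def foo(n):
        return n * 2

    x = 3.14
    if x in (None, 2, 3.14, foo(123)):
        print("matched")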


Collection.contains()?

That's a pretty std. feature I guess.


JSON compatibility, to be able to copy and paste from JSON to valid nested dynamic arrays.

  a = {
    "a": "a",
    "b": {"b": 2} 
  }
Optional commas at the end of lines, so this is also valid

  a = {
    "a": "a"
    "b": {"b": 2} 
  }
and we are able to swap or append lines without editing the commas, or forgetting to do so and getting a syntax error. Mandatory commas on all lines would also do, but that gets in the way of JSON compatibility.

This must also be legal code and equivalent to the previous one

  a1 = {
    a: "a",
    b: {"b": 2} 
  }
  a == a1 # true
The developer decides when saving typing time is more important than JSON compatibility.

PS: a big yes to kebab-case too. That's in part CSS compatibility because CSS class names are often kebab cased.


I want more implicit typing in typed languages. Quite often the compiler knows exactly what type a function will return, but I still need to write it there. Sometimes it's easy ("int"), but what about HashSet<Immutable<Tuple<int,string>>>?

Typescript does it well. F# (completely statically typed) too…


Rust does a pretty decent job of this, particularly around functions that return collect().

But yeah, C++ has a ways to go on type inference.


Between `auto` and `decltype(auto)` it mostly does what I want, although the 2nd syntax is frightful.

The main case it doesn't handle is

  for (auto i = 0; i < foo.size(); i++)
where obviously I want i to match the return type of foo.size(), so a size_t rather than an int.


Nowadays even Java will do that, var list = yourFunctionReturningThatSet();


For Java, I used to write `list = yourFunctionReturningThatSet();` and then have the IDE fill in the type when it complains about the undefined variable.


Also on a function?

  var getDate() { return "no date"; }


That is often considered undesirable because it makes code less clear and compilation errors inscrutable.

For instance the rust developers consciously decided to remove that from the language, named functions must be fully typed.


Easter egg: you can use -> _ to ask the compiler what type it thinks the return type should be, given your body. Because it is only used for diagnostics it isn't fully featured and there are things it doesn't cope well with, but it is there and works most of the time.


Static type inference is "fully typed".

  def getDate = "no date"
That defines a method `getDate` with the type `() => String` in Scala.

The type is statically known, of course.

But it's recommended to use explicit return types for public methods. This helps prevent breaking a public API when refactoring the implementation of a method.


No, but I think that was a good choice. In local code this is great, but in an interface contract being explicit about data type is a virtue.


For interfaces this would obviously not work (there is no implementation). But for class members it would.

I think this concept works very well in typescript and F#.


I mean interface not as a language construct, but as a declaration of how a module (class, component, etc.) can be used. You want to see the type in such a declaration, not infer it from implementation details.


But Rust still allows declaring a return type of `impl Trait` instead of an explicit concrete type.


kebab-case

enkebab–case

emkebab—case

These use dash, en dash and em dash, respectively. Most languages that allow you to use a decent amount of Unicode in variable names probably will accept that kind of kebab–case.

Those dashes look pretty similar to each other in monospaced fonts but not indistinguishable, so it’s readable and not super confusing. Might work. Why not?


If unit testing makes the cut, then I think standardized documentation and doctests should be included also. Python (string literal at the beginning of a class/function), C# (structured comments) and Rust (doc attributes) are three different, valid ways of adding this to the language.
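
For the Python flavor, the doctest examples live in the docstring and the standard library pulls them out and runs them (a minimal sketch):

    def add(a, b):
        """Return the sum of a and b.

        >>> add(2, 3)
        5
        """
        return a + b

    if __name__ == "__main__":
        import doctest
        doctest.testmod()   # finds and runs every >>> example in this module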


D has had these for a long time.


Looks like these fall into the "structured comments" category, based on seeing /** */ and /// in a repository I found. Does it also have a story for doc tests?


Yes, tests in those comments are actually pulled out and run.


Just to pop in on kebab case:

Prolog has a weird third thing going on. Arithmetic only happens in specific contexts. i.e.

  A is 4 - 2, % arithmetic, technically is(A,-(4,2)), A is 2
  A #= 4 - 2, % arithmetic, A is 2
  A = 4-2, % non arithmetic, A is unified with the term -(4,2) which pretty prints as 4-2
  A = -2, % non arithmetic but A is the number -2 not a term -(2).
  A = 4-2, B is 4 + A, % a weird one A is the term -(4,2) but when it gets called in the context is(B,+(4,A)) it gets treated as the arithmetic '-' and B is 6
you can also kebab-case predicate names so

  l-h-t(L,H,T) :- L = [H|T].
  
  ?- l-h-t([1,2,3,4],H,T). %works as desired
  H = 1,
  T = [2, 3, 4]. 
you can't do it with variables though.


From Perl/CoffeeScript, I miss until/unless (i.e. while! and if!), especially combined with postfix conditionals, e.g.:

  thing.process() unless thing.cancelled()

  thing.doOne() until thing.queueIsEmpty()

  showAdminMenu() if user.isAdmin()
etc


Good observations. Actually more important than it looks, I think. All those little helpful things.

Also worth discussing: micro-misfeatures to be avoided when designing new languages. Maybe non-micro-misfeatures, ie the lack thereof, can be considered a microfeature. Like, for example, uniformity.

And I just have to trot out my favorite example: Java import statements do not allow keywords and numbers in package names. So we can't put our Java source code in folders named 'import', 'long', or in paths like '2023/01/'. Great. For no good-enough reason - the syntax would actually be cleaner with a separate package name syntax. (BTW, this could be fixed, I think.)


Ruby has anonymous functions defined like { |args| <whatever> }, and for no arguments you can drop the || entirely. Rust's |args| { <whatever> }, with || { <whatever> } mandatory for no args, removes the ambiguity in parsing.


More sugar for reflection:

  class Foo { 
    @Max(10) int bar;
    @NonNull String name();
  }

  var field = @Foo::bar;
  var max = @Foo::bar.Max;
  var method = @Foo::name;
  var foo = new Foo();
  var name = method(foo);
  var bar = foo.field;


Just use a dynamic language like JS if you like code like that.


Declaration of units for primitive numeric variables.

Example: <https://github.com/mchrisman/variables-with-units-language-p...>


Julia's Unitful[1] is actively pushing on physical units.

[1] https://github.com/PainterQubits/Unitful.jl



This is usually a style thing, not enforced syntax, and maybe a hot take, but I actually really like leading commas in comma-delimited lists (like Elm and Haskell). It makes changing the order of things really convenient.


Even better would be to get rid of this annoying syntax noise.

Commas in multi-line lists have no use besides making trouble in refactorings and diffs.


Agreed. Especially the more diff-driven development we do.


I wrote a small DSL for easy compilation to SQL that included some similar features. It included datetime literals, but they were specified as just a string prefixed with "d".

    d'2020-02-20'
I also tried to make it so that every comparison had both English and symbol representations, a range syntax, and an approximation/match comparison (e.g. "=~", "!~") which could work with both floating point numbers and strings properly.

I found this useful, and wish it was in more languages.


Comparing floats is not a great idea, though. That's a foot gun.
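
Right: exact `==` on floats is the footgun, and an approximate operator like "=~" would presumably desugar to a tolerance-based comparison, like the one in Python's stdlib:

    import math

    print(0.1 + 0.2 == 0.3)              # False: the classic trap
    print(math.isclose(0.1 + 0.2, 0.3))  # True: relative tolerance of 1e-09 by default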


I really just want every language to support operator overloading.


just allow nice infix functions :)

(which is allowing nice postfix ones with currying)

Scala does it pretty well. And nowadays the function names finally don't require a PhD in ancient Egyptian hieroglyph decoding.

So requiring alphanumeric names for functions is pretty important, IMHO. (Sure, it's okay if it's just a stern lint warning and the developer has to opt in. But searching for iteratee is easier than searching for \∆>> or whatever.)


Scala does a lot right here, but it's super annoying with its arbitrary limitations on how you can name your methods.

If I want a method called `U+1F602`¹ it's not the business of the language to judge that.

---

¹ the actual glyph, which gets filtered out here, also for no reason


I think it's okay for a language to help guide its users toward "better code" (of course Rust is the big one for this).

But I'm also a firm believer in providing escape hatches. So it should be just a toggle at the project/file/directory level to enable whatever behavior. And a very good language would require a human-readable explanation for these, so when the developer says

allowEmojisInCode = true "we decided to allow emojis to make our happy DSL, see emoji reference at https://..../...."

downstream users/readers of the code are in a much better position than with just 30000 lines of emojis :)


Scala has this escape hatch. But it's annoying. Why can I use arbitrary symbols (even with spaces and such) by adding backticks, but not regularly? It wouldn't make any difference, besides looking bad and needing extra keystrokes for absolutely no gain (it does not prevent bad code anyway!). I hate that kind of hand-holding! Like I said: it's not the business of a language to judge what kind of code is "good".

Scala does not even let most people in the world express code in their native language, and that in the age of Unicode! Sorry, but that's a little too much of a "we know better than you what's good for you" kind of thing.

Of course I know where this comes from: Scala is used broadly in education. There it's good to not let the students do all kinds of "madness".

But Scala is also mostly used by seasoned professionals in real-world settings. (Just have a look at the latest survey, found on the Scala website.) For a professional it's just extremely annoying when a language tries hard to know better than they do what "good code" should look like. The main thing about an expert programmer is that they know when it's OK to break the "rules". Needing to jump through arbitrary but completely useless hoops just to do that — when you know exactly what you're doing(!) — makes me mad sometimes.

I've considered forking the compiler more than once because of this. I hate that kind of "but we know better" behavior.


The backticks are more of an anti-feature than anything, and it's absolutely not a file or module/package level setting. (I think a lint-exclude at declaration site would also be okay.)

> It's not the business of a language to judge what kind of code is "good".

Well, yes, but no. Language design is just inseparably infused with judgement calls. Making things easy leads to them being used (as the backtick illustrates, making things hard reduces their usage, even if it sounds nice that it allows for special cases).

But as you imply the language has to be flexible and thus powerful enough to provide the option of a seriously different design trade off. (Because backticks are just a bad compromise. It's not really switching to a different design choice after all.)

> I hate such kind of "but we know better" behavior.

I think that implies too much intent on the Scala core team, unfortunately the reality is - probably - that historically someone wanted something, it got done somehow, and that's it. (In the particular case of backticks maybe Martin really had a strong opinion. Dunno. Probably you have looked into this at least a few times if you considered a fork :) )

... related to this, the recent discussion about Rust's GAT (generic associated types) feature is a very interesting case study in the intersection of language design and "project governance/management" (the reality of pragmatic compromises). A small team spent at least a year developing GAT support for the compiler, and then a bunch of people were asked to decide whether to merge it. And it's a very unenviable position, because of course the work was not perfect. So what to do? In the end, I think at least, the narrative that was comfortable for everyone involved was that "this is the best version we can have realistically, and yes it provides net positive value in this current state".

https://github.com/rust-lang/rust/pull/96709#issuecomment-12...


Yes, `matrix * vector` rather than `matrix.VecMult(vector)`
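
In languages with operator overloading this is just a method with a special name; a minimal Python sketch (Mat2 and Vec2 are made-up names, not a library):

    class Vec2:
        def __init__(self, x, y):
            self.x, self.y = x, y

    class Mat2:
        def __init__(self, a, b, c, d):
            self.a, self.b, self.c, self.d = a, b, c, d

        def __mul__(self, v):
            # `matrix * vector` dispatches here
            return Vec2(self.a * v.x + self.b * v.y,
                        self.c * v.x + self.d * v.y)

    v = Mat2(0, -1, 1, 0) * Vec2(1, 0)   # 90-degree rotation: (1, 0) -> (0, 1)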


We fought that war 25 years ago.


Nested JSON dictionary lookups with default fallback, for scenarios where any of the sub-dictionaries could be null or missing a key. Write parseDateTime(config["a"]["b"]["c"]) safely as a one-liner. I haven't seen any language solve this in a neat way, without catching index-error exceptions or ugly chaining of null-conditionals and empty dictionaries as fallback values.


It's not language-level, but I keep a Python class handy that wraps objects like that and has that behavior. You can also access items with config.a.b.c if desired (and if they're valid identifiers).
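
Something like this sketch (SafeDict is a hypothetical name, not a published library; a real version would also want a way to unwrap leaves):

    class SafeDict:
        """Wraps a dict so chained lookups never raise on missing keys."""

        def __init__(self, data=None):
            self._data = data if isinstance(data, dict) else {}

        def __getitem__(self, key):
            value = self._data.get(key)
            if value is None or isinstance(value, dict):
                # missing keys and sub-dicts both come back wrapped,
                # so config["a"]["b"]["c"] is safe at every step
                return SafeDict(value)
            return value

        __getattr__ = __getitem__   # allows config.a.b.c as well

        def __bool__(self):
            return bool(self._data)

    config = SafeDict({"a": {}})
    config["a"]["b"]["c"]    # an empty, falsy SafeDict instead of a KeyError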


Ruby’s symbols are undersold in terms of how useful they are and how they enable meta programming and DSL development. You can basically invent new language keywords with them.

    private def foo
      "bar"
    end
In this case, ‘def foo … end’ returns ‘:foo’, and ‘private’ is just another method that takes it as an argument and decorates the provided method. It’s not a special language keyword.
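
The closest Python spelling is a decorator, since `def` there is a statement rather than an expression returning a symbol; a rough analog (this `private` helper is hypothetical, not a builtin):

    def private(method):
        # an ordinary function, not a keyword: tag the method and hand it back
        method._private = True
        return method

    class Service:
        @private
        def foo(self):
            return "bar"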


About kebab-case identifiers.

What I always wanted is proper spaces! Designing a syntax where identifiers could have spaces (without backticks or anything like that) might be tricky, of course. But maybe it's not impossible.

All those space imitations, whether they're dashes, underscores or camels - they're just imitations. Nothing compares to real spaces.

If anything, underscores are the closest, if you ask me.


It's pretty easy, actually, and was common in early PL designs such as ALGOL. All you need to do is make a special syntax for keywords - https://en.wikipedia.org/wiki/Stropping_(syntax) - and then whitespace becomes completely redundant for tokenization purposes, and can be ignored altogether, or treated as part of the identifier in which it occurs.


> What I always wanted is proper spaces! Designing syntax where identifier could have spaces (without backticks or something like that) might be tricky of course. But may be it's not impossible.

What would this look like, though?[1] How would you solve the problem of adjacent identifiers vs a single identifier which has spaces?

Maybe having identifiers, and only identifiers, start with an uppercase letter, with no other uppercase in the rest of the identifier? Then a line like `if My sub routine()` is easily parsed as a call to a single identifier, and `My var type My var name;` is easily parsed as a declaration.

Looks harder to read than the usual camelCase, kebab-case and PascalCase identifiers though.

[1] I ask because I'd like to do something like this when writing a program that comes with its own language for the end user to use to enhance the program.


Well, I would imagine that it requires designing a language which ordinarily does not allow adjacent identifiers.

Of course keywords must not be allowed as part of identifiers (or there should be no keywords at all, as in Lisp).

Just an example off the top of my head that I didn't think much about:

    function print person (p: person) {
      var full name = p.first name + p.last name;
      if p.middle name != "" {
        full name += p.middle name
      }
      print line(full name)
      for c : p.subordinate person list {
        print person(c)
      }
    }
I think this syntax should be parseable with few restrictions (like: an identifier can't start with a keyword).


That's completely unreadable!

I can't "parse it" by looking at it, and a computer would have even more trouble doing so.

How do I tokenise something like `full name += p.middle name`?

Maybe as `(full) (name +=) (p.middle) (name)`? Or is it `(full) (name) (+= p.middle) (name)`?

How about `print person(c)`? Is it the call of the `print` function with `person(c)` as argument?


That's a terrible idea given the fact that code is processed by human brains in a symbolic way, and not like a (western) written language.

https://news.mit.edu/2020/brain-reading-computer-code-1215


I love all the sugar of Kotlin, but there is one thing I miss, and that is Python's slice indexing for pulling out part of an array.
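
For reference, the Python slicing being missed here (Kotlin gets you partway with slice() and subList(), but nothing this terse):

    xs = [0, 1, 2, 3, 4, 5]
    xs[1:4]      # [1, 2, 3]
    xs[:3]       # [0, 1, 2]
    xs[::2]      # [0, 2, 4]: every other element
    xs[::-1]     # [5, 4, 3, 2, 1, 0]: a reversed copy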


Kotlin is so close on so many things, but then keeps messing up.

arrayOf(1, 2, 3) - why not [1,2,3]?

emptyArray() - why not []?

mapOf("key" to "value") - why not {key => value}?

These are all solved problems. Why would they make up some harshly suboptimal syntax?


I am not that familiar with Kotlin, but these seem better than syntax primitives from a language-design perspective (I greatly recommend the "Growing a Language" talk by Guy Steele): these are ordinary functions that behave like the rest of the language, not an added "hack" with one-off syntax. If you were to use a concurrent hashmap implementation, you could no longer use the syntactic sugar; and writing against an interface rather than a concrete implementation is quite common in Java (which plays quite a big role in the design of Kotlin), e.g. having a List in the interface instead of ArrayList.


Interesting, I think the exact opposite :)


Because there are more than three collection types (Sets, Arrays, Maps) in rich static languages.

The ugly Kotlin syntax is of course just there to look different to Scala, where you would have:

  Array(1, 2, 3)
  Array.empty
  Map("key" -> "value")


Coming to Kotlin from Python, these always get on my nerves. My brain hasn't adjusted to needing to call a function. And 'X to Y' just doesn't click for me. I guess I'll get used to it eventually


`{key => value}` could just as well mean mutableMapOf(). Same for array.


I like Pyret's dot notation for accessing data associated with variants. From their docs:

8<------------

    data Animal:
      | elephant(name, weight)
      | tiger(name, stripes)
      | horse(name, races-won)
      ...
    end
    
    fun animal-name(a :: Animal):
      a.name
    end


> You can write # 2001-08-12 # to mean the date 2001-08-12, instead of writing something annoying like Date(2001, 8, 12)

I like this article but oh man dates just trigger me. Such a missed opportunity to use an unambiguous date example like 2001-08-13


Year-day-month would be a truly cursed assumption for a format


It would have been unambiguous in a world where we all agreed that both months and days start at 0 :)


Yeah, sure, because normal people count from zero…

Have you actually ever watched people counting things?

In a sane world, offset-based counting (zero-based) would never have surfaced (outside of very specific and rare circumstances).

Also, it's a shame that the C languages started calling the offset an "index" (and there isn't even a proper index operator!).


> It would have been unambiguous in a world where we all agreed that both months and days start at 0 :)

How so? Would 2021-08-12 mean month 08 or month 12? It doesn't matter if it starts from zero, it still looks ambiguous to me.


Also don't forget about year 0 (which doesn't exist and is a single point of failure for so many programs that deal with calculating time between now and a BC date).


If only anything with calendars was that clear-cut. ISO 8601 does have year zero.


Oh the joys of being a school administrator with a new international student whose birth date is given as 3/7/5.


You have students who are 1648 years old? :-)


local const variables like JavaScript has them. You can declare variables with "let" (mutable) or "const" (immutable). This is really great when reading code, because you never have to check if some code may change the variable at some point. And you usually declare most variables as const.

A lot of languages provide immutable variables only for class members or statics, but not for local variables.
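
Python, for instance, only approximates this for locals with a type-checker-level annotation, with nothing enforced at runtime (a sketch):

    from typing import Final

    retries: Final = 3
    retries = 5   # flagged by type checkers like mypy; Python itself won't stop you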


I really like Scala’s `val` vs `var`.


Okay, that would confuse me. Looks very similar and easy to overlook.


Syntax highlighting can help with distinctions like that.

You might prefer Nim's visually distinct "let" for immutable, "var" for mutable. "const" is also available and means resolved to constant at compile time, similar to Zig's "comptime".


it hurts that `let`, `var`, and `const` are all used in JS and mean completely different things. `let` even means the exact opposite


Actually not, as firstly you almost never use `var`s, and secondly if you do, most syntax highlighting rules will make them shine in bright color as something very exceptional.


Same in Kotlin


No surprise though as Kotlin is a cheap Scala rip off.


Most of these seem like anti-features to me. They add little expressivity and a whole lot of syntax. Maybe the Lisp crowd was right?


Omitting parens on a function call with one or no arguments.


This is my least favorite example of this kind of thing. This creates so much unnecessary confusion.


This means you can’t reference the function itself by name without some kind of quoting construct or other circumlocution; also, while the no-arg case might make sense as a special case, if you allow the one arg case, you might as well admit the general case.


The general case adds ambiguity which makes it more of a design choice, rather than an obviously good generalisation.

In the general case of 2 or more comma-separated arguments, is "func a, b" a call with two arguments, or a tuple containing a call with one argument? Feel the same about "x = (func a, b)"? What about when this appears in list syntax like "[x, y, func a, b]", or the argument list of another function like "obj.method(x, y, func a, b)"?

They are easily solvable with a design decision about precedence, but arguably the syntax is a little confusing or worth a warning in all but simple cases.


The single argument case works fine for Lua, but maybe the no-argument case wouldn't. Having to supply only the empty argument list seems counterintuitive, though.

Variable sigils would work but that would be even more annoying.


Kotlin has it, but only if the function accepts a function argument, e.g. `foo({println("hello")})` is the same as `foo {println("hello")}` (larger example in my other comment above).

You can also do it for a no-args function if it's a member (attached to a type), using a property with a custom getter, like this: `3.0.mph`, with this defined somewhere else:

    val Double.mph: Speed
      get() = Speed.mph(this)


Does anyone else disagree with kebab case over snake case?


i_used_to_think_snake_case_is_the_way_to_go_up_until_i_wrote_this_comment_out

to-prove-that-snake-case-is-more-readable-but-now-that-ive-written-it-i-might-be-changing-camps


You should check how it looks in a monospace font as well. Still team snake case here :p


> Instead of writing 10000500, you can write 10_000_500, or 1_00_00_500 if you’re Indian.

I hate this so much. It means I can't grep for a constant.


If you have constants repeated throughout the codebase, you can pull them into a constants file, and then you can go to that file, navigate to the definition of interest, and use your IDE of choice to find usages.

You'll also be able to give these constants semantically significant names, and comment next to them providing derivations or citations. And of course, if it's a mistaken or outdated value, you can change it one place and apply it everywhere.

Consider that, if you were debugging a problem with this constant, and the problem was caused by someone having made a typo in one of its usages (eg having typed 1000500 instead of 10000500, a mistake that's more difficult to make if you have better ways to format numbers [did you have to look back and forth to find the mistake? I did]) - your regex would fail to find it, even if there were no ambiguity about the format it was written in.


I'm in the same boat as GP. Typically, I grep such things when I am not familiar with the codebase, so I can't change where constants are defined, and I do not know where to find such a file.


For what it's worth when I dive into a new codebase, the first thing I do is try to guess what files exist and then find them and get a feel for structure. Constants are high in the list.

But there are many ways to skin a cat.


The problem is not finding where the constants are defined; it's finding out what the code does based on a hardware datasheet. The constants could be in a file, or could be scattered throughout; it doesn't matter at all.

You can also use the magic of base 16 to search for lexical subsets of a value to find where code uses things with the same mask, which are probably related. Extremely effective in reverse engineering a hardware device.


You haven't understood the problem at all. No wonder; few people do any sort of bare-metal programming. It's not about defining constants, or even writing code at all; it's about figuring out what an arbitrary piece of code is doing based on hardware datasheets. You search the piece of code you are analyzing for constants defined in the datasheet, to determine what it does or where it does specific things...


In my experience you usually can't grep most (embedded) code for such specific numbers anyway, because the assignments use bit-shift operators, set-macros, bitfields, binary literals, hex literals, non-hex numbers, splitting a 16-bit number into a 2-element 8-bit array, mixing up the endianness, etc. An IDE that can find all assignments whose right side has a numeric value of choice would cover more of these variants.


I'd gently suggest that if you wanted me to have that context when interpreting your statement, you could have provided it in your original statement or provided it now blamelessly, rather than framing this as a deficiency on my part that I failed to use my crystal ball to determine you were an embedded programmer. (I do very similar things at the application level, for what it's worth.)

I believe the solution I proposed remains viable in that context or for that usage. If I defined a constant for the magic memory address one writes to to configure the MMU, and you wanted to understand how I implemented context switching, you could navigate to my constant and find usages.

If that solution doesn't work for you, no worries, it was just a suggestion/observation.


Yes, exactly, now you need special tools (you mentioned IDEs) when previously you could have used grep.


You could also use CScope or grep for the constant's variable name. I don't consider an IDE to be particularly specialized, but you do you, I hate it when people tell me I'm using the wrong IDE, so I'm not going to tell you to use an IDE.

To be clear, I grep for things all the time, even though I use an IDE.


It's not about searching for specific values either. Sometimes you find some unknown value that's not reflected in the out of date or simply wrong datasheet. What does it do? Well, it's quite likely it does something related to other values that use the same mask, so you can try to figure out the mask by searching for lexical subsets of the value, which because of the magic of base 16 is an extremely effective strategy in finding related things. Much harder now with underscores.

And it's also about being able to search Google for a found constant, which indexes other people's code. There have been plenty of occasions where I found a newer version of proprietary driver code that the hardware manufacturer claimed it had either lost, or that didn't exist, or that it wouldn't provide to us, simply by searching for constants on Google...

But more often is to just find drivers from other operating systems that already support that device, or mailing lists or forums where other people try to reverse engineer the device.


For sure. I think there are some semantic/type-aware grep implementations.

I'm sorry this language feature creates frustration for you and interrupts your workflow.


Depends on the size of your codebase and how many people work on it. Once you have 20 years of code written full-time by 200 developers in your monolith, finding the constants file for a given subsystem that you've never seen before, out of 400 different subsystems' constants files, can become a legitimate and challenging pain.


Totally true. I don't know if IDEs commonly have a feature like, "find this value, regardless of how it's expressed" (even better yet, fuzzily, to catch typos), but I think that's the proper general solution. It's a good idea to consider a constants file earlyish, while everything still fits in your head.

I'd say in the case of such a sprawling system, make a constants module, and it can have different files for different topics. But keep them all together. Code style is an engineering tool you can use to prevent problems.

But I do understand this is cold comfort for those working on systems where the decisions around this were made 15 years ago, and there's no possibility of refactoring the constants. That's quite annoying.


Then you have a problem with having to update both the constants module and the service that depends on it to change the service, and you need to handle packaging and distributing the module. Maybe if you have a monorepo where those issues are moot...


I didn't mean to suggest the constants module was a separate, reusable module, but a component of the same piece of software. By "module" here I meant "directory which can contain multiple importable files," so you could namespace your constants (constants/rfc_abcd.xyz, constants/customer_limits.xyz, etc).

I'd rather copy-paste any reused constants to different projects to avoid coupling, unless there was some kind of compelling domain/project specific reason.


When your codebase reaches a certain size, you cease using the IDE to find code and instead start using specialized tools, such as Lucene indexes, so that grepping through code takes seconds rather than minutes. One of the downsides of this is that using a regex over an index is O(n), in comparison to the O(log n) of a normal index lookup.


Perhaps this special tooling can normalize numbers and other source code ambiguity.


It certainly seems possible to use a parser when building your index. It's a lot of work though.


This, so hard. A constant that replaces an ungreppable magic number like 5, or a constant for a precise number like the physical constants - sure. A constant that replaces a number like 404 or 500 that you want to be able to easily grep for across heterogeneous codebases? Pass.


Is that 500 the HTTP error or 500 the adhoc limit we put on the number of user uploads or a 500, representing half a kilobyte?


I'll take the momentary ambiguity in some cases (usually quickly resolved by line context and file name) over having to manually hunt down 5 different projects' inconsistently named constants to do 5 different greps any day of the week.


Out of interest, how often do you do this, and what are the semantics of the numbers you're grepping for? I literally can't remember a time I've ever tried to grep for a number.


When I was doing bare-metal programming I was doing this all the time, depending on what I was doing maybe even tens of times an hour. And it's not just source code either; these days even debugging tools print values this way, making it a total PITA to reverse engineer things, because you can't easily match values coming from different tools, or from the tool and the source code, etc.


I've done it for error codes but that was awfully cursed


But you need to search for both hex and dec codes for that


Yup which is still a one liner in grep


You can grep the snake case too in one grep

You can do almost anything in a grep


Well you can, you just need a more complicated regex. Same for '1e6' and such.


You already can't, because someone can write the constant in hex, or even (shudder) octal.


Or like this:

   const int WAIT_TIME_MICROSECONDS = 42 * 1000 * 1000; // 42 seconds


Or:

  0x2A * 1e6;
The solution is clearly to have language tooling with a find-constant tool which you give an expression and it parses all declarations in the source code looking for one which evaluates at compile time to the provided expression.
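
A sketch of what that tool could look like over Python source, using the stdlib ast module (find_constant is a made-up name; a real tool would fold many more node types):

    import ast

    def find_constant(source, target):
        """Print each constant (sub)expression in source that evaluates to target."""
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.Constant, ast.BinOp)):
                try:
                    # only expressions built purely from literals survive this eval
                    value = eval(ast.unparse(node), {"__builtins__": {}})
                except Exception:
                    continue
                if value == target:
                    print(node.lineno, ast.unparse(node))

    src = "WAIT_US = 42 * 1000 * 1000\nquota = 10_000_500"
    find_constant(src, 42_000_000)   # -> 1 42 * 1000 * 1000
    find_constant(src, 10000500)     # -> 2 10000500 (unparse drops the underscores)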


The funny way I’ve seen this go wrong in practice is a typo like:

  // we don’t want more than 100m because …
  quota = 1000_000_000
Where people assume the underscores are in the expected place.


This is why I like the underscores, it's easier for me to see the typo in:

    quota = 1000_000_000
than it is in:

    quota = 1000000000
And also when I'm typing the number, it's easier for me to be sure I got it right when I can count the zeros in groups of three. It's rare that I've needed this, but I've used it in Java a few times in my career.


When do you find yourself grepping for a constant (as opposed to its name)?


I explained down in this thread.


1_?0_?00_?0_?500 yes it sucks


That’s not enough — the underscores can appear anywhere, so you need to cater for 500000_0 and 5_0_0_0_0 etc. basically an optional underscore between each digit.


just grep for `[0-9_]+`


I would assume they're looking for a specific constant value.


  grep -E '[0-9_]+' | sed 's/_//g' | grep <my-constant>


That's kind of a handful to type every time :) I'd just do the first one and then scan for the number I'm interested in. Unless you have more than a page full of constants..


You can make it an alias. No need to type it every time!


Arrow function syntax


Great, introduce slang to languages, what could go wrong?


Slang itself is a language: https://www.jedsoft.org/slang/


> Second, what if parameter blocks were abstractable?

Sounds like a great way to make unreadable code

    #define STANDARD_EXP_PARAMS ...

    // I hit gotodef on func(...) and got here. What are the arguments?
    void func (STANDARD_EXP_PARAMS) {
       // ... long function body ...
       x = y; // What is the type of x? Is it an argument or a local? 
    }
I'd really be irritated if someone ever used this feature; optimizing for lines written is shaky territory to begin with, and when it's at an interface boundary it's not excusable. inb4 "use an IDE" - requiring an IDE to make code legible is dumb, and I'm an IDE shill!


You could say the same about a function call or a function that takes an object as an argument - who knows what code lies behind the impenetrable barrier of structured or object oriented programming?


That's actually a feature of encapsulation. The object (ideally) is used to abstract away those details so I don't need to know about them, only how to create the object. Encapsulation is a different feature than reuse.

The exception is in languages that use that syntax as a way to implement keyword arguments, but I would still ask people to destructure it in the parameter list or at the top of the function so I can see what the arguments are.

The point is that when I am looking at a function definition I need to know how to call it. As described all I see is obfuscation to save some keystrokes, not clean code.


But how would you know how to call it if it takes an object as an argument? How different would the process be for looking at a named standard set of arguments?



