I'm never going to understand the 'mathematical jargon' argument against Haskell (and similar). Yes, but this can be true in almost any language that allows abstractions. The math jargon is just a name for a concept, and I'd prefer math and programming shared terminology where appropriate rather than coming up with an entirely separate vocabulary. Is Functor really worse than Mappable? It's just a label for the abstraction, and you'll have to learn the underlying concept regardless, so you might as well share the same word.
The Haskell concepts are not quite the same as the mathematical concepts; they're somewhat restricted, and different aspects of them are emphasized (e.g. monads---programmers generally think of bind as the basic operation and join as derived, mathematicians generally think of join as the basic operation and bind as derived). So different names could be justified for that reason. It's also not just "math jargon" but "category theory jargon", which is worse than ordinary math jargon.
A monad is a data structure that holds one value, but that value may be of one of several types. It implements a `map` function which allows conversion to the same monad with a different set of types, e.g. there is some function `map` defined on it (a rough sketch follows below).
Two commonly used examples are the Optional and Result monads. An Optional is either a value of type T, or the empty type None. A Result is a value of type T or some error E.
The utility of monads is their ability to chain operations. With optionals, you can map the results of functions that may or may not return results without handling the None case explicitly. With Result types you may chain operations that return errors, and handle both the successes and cases where you need to convert errors.
Monads can be expressed in many languages, but they make the most sense in languages with algebraic data types (to express that a value is one of a set of possible types, aka union/sum types) and first-class functions (where functions can be passed as arguments; otherwise implementing `map` may be difficult).
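For illustration, here is a minimal Python sketch of the Optional idea described above; the `Maybe` class and its `map` method are toy names invented for this example, not a standard library API:

from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")

class Maybe(Generic[T]):
    """Toy Optional: either holds a value of type T, or holds nothing (None)."""
    def __init__(self, value: Optional[T]) -> None:
        self._value = value

    def map(self, f: Callable[[T], U]) -> "Maybe[U]":
        # Apply f only if there is a value; the None case is handled once, here.
        return Maybe(None) if self._value is None else Maybe(f(self._value))

# Chaining: no explicit None checks between the steps.
print(Maybe("42").map(int).map(lambda n: n * 2)._value)   # 84
print(Maybe(None).map(int).map(lambda n: n * 2)._value)   # None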
The mathematical parlance is really harmful to their adoption because as an abstraction, monads are stupidly easy to use.
I think you describe only a Functor, not a Monad. A Monad also has to have either a `bind` or a `flatmap` function, besides the `(f)map`.
Basically the point of a monad is to depend on the output of another monadic value of its kind — e.g. it can take an IO String and use the encapsulated String value to produce another IO X. This would be IO (IO X) without flatmap (a.k.a. bind), the function that can remove this double encapsulation.
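To make the double-encapsulation point concrete, here is a hedged Python sketch extending the toy `Maybe` idea from above; `flat_map` and `parse_int` are invented names, and this is an analogy to the IO example rather than actual Haskell:

from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")
U = TypeVar("U")

class Maybe(Generic[T]):
    def __init__(self, value: Optional[T]) -> None:
        self._value = value

    def map(self, f: Callable[[T], U]) -> "Maybe[U]":
        return Maybe(None) if self._value is None else Maybe(f(self._value))

    def flat_map(self, f: Callable[[T], "Maybe[U]"]) -> "Maybe[U]":
        # f already returns a Maybe, so don't wrap the result again (bind/flatmap).
        return Maybe(None) if self._value is None else f(self._value)

def parse_int(s: str) -> Maybe[int]:
    return Maybe(int(s)) if s.isdigit() else Maybe(None)

nested = Maybe("42").map(parse_int)       # Maybe[Maybe[int]] -- doubly wrapped
flat   = Maybe("42").flat_map(parse_int)  # Maybe[int]        -- extra layer removed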
Haskell/FP/Math purists would probably balk at the oversimplification. But introducing intimidating concepts by way of such oversimplification (without entirely missing the target) is precisely the point! Beginners just need a very rough and coarse grained overview, so they are able to gain some basic familiarity with the concept. That gives them some already-familiar conceptual knobs to hang further details on, reduces fear, and inspires curiosity to investigate further.
It's not hard to describe what monads are, but it's probably not going to convince anyone of its importance without an appropriate language. The monad per se is an extremely simple idea. In fact, monadic interfaces are ubiquitous:
I agree it can take some time to discover the similarities between these interfaces. But a monad is just that. It's so simple that we're blind to it.
Unfortunately, it requires a sufficiently advanced language to express the concept of Monad. You need at least higher-kinded types and means to describe behaviors of HKTs.
Can you understand or describe monads in a less powerful language, say Rust? Definitely!
But will you appreciate the power that monads give you in Rust? Probably not.
A monad isn't a concrete thing, a monad is a pattern that lots of different types adhere to. So it makes sense to extract it into an interface, in that operations that only use the monadicness can be implemented once for all types that are monads. It doesn't make much sense to study monads apart from the things that implement the interface, because there is not much to it. It's just an interface for the abstract notion of an ordered sequence of actions.
If you pick one, some people will see similarities, some people will get weirded out by abstraction leak, and a lot of people will have no idea what you're going on about.
If you pick none at all, well, you mostly just get the last two.
>I'm never going to understand the 'mathematical jargon' argument against Haskell (and similar). Yes, but this can be true in almost any language that allows abstractions.
Well, the author (and many others) wants them to allow fewer abstractions, then.
Seeing that many Unicode operators isn't a very Haskell thing. Are you thinking of Agda or APL? At least in PureScript, every infix operator must have a word name, and the operator itself is either a sequence of ASCII symbols or a single Unicode character. And even still, you see a massive amount of programmers abusing (and misusing) ligature fonts to make their code look more like math without embracing Unicode and the actual symbols. Why fight over the composition operator being `.` or `<<` or `<<<` when you can use the one from math, `∘`? In Vim, use the digraph `Ctrl+k` then `Ob` and now you have a `∘`.
Haskell is, however, fond of custom 'operators' (infix functions made of symbols). For example, take <$>. It's barely even shorter than fmap! And Haskell even has a special syntax if you want to make fmap infix. Is it really worth the characters to write
a <$> b
instead of (1 char longer)
fmap a b
or (3 chars longer)
a `fmap` b
I understand fmap, but <$> tripped me up for an embarrassingly long time when trying to read Haskell code. In any other language, I'd do a web search for the function I didn't understand, to learn more about it, but that doesn't work here because most search engines don't handle symbols well. You have to know to use Hoogle. But someone who's trying to teach themselves Haskell isn't just going to know that. Learning Haskell doesn't have to be harder than learning another language, but it does require inordinately more hand-holding.
I'm sure I'm not the only one. This is something I would call an "inessential weirdness"[0] of Haskell. There is a whole host of others— <>, $, and >>= are some from the Prelude alone; it becomes especially tricky when people don't use explicit or qualified imports.
Some of this is opinionated, and some is real. It's helpful to separate the two.
We've learned some things over the years.
* Null-terminated strings were a really bad idea.
* Arrays of ambiguous size were a really bad idea.
* Multiple inheritance just gets too confusing.
* Immutability is very useful, but not everything can be immutable.
* If thing A over here and thing B way over there have to agree, that has to be checked at compile time, or someday, someone will break that invariant during maintenance.
* Concurrency has to be addressed at the language level, not the OS level.
* Which thread owns what data is a big deal, both for safety and performance reasons, and languages need to address that. Some data may be shared, and that data needs to be identified. This becomes more important as we have more CPUs that sort of share memory.
* The jury is still out on "async". Much of the motivation is web services holding open sessions with a huge number of clients. This is a special case but a big market.
* Running code through a standard formatter regularly is a win.
* Generics can get too complicated all too easily.
* There are three big questions behind memory corruption: Who owns it, who can delete it, and who locks it? C addresses none of those issues, GC languages address the second, and Rust (and Erlang?) address all three. Future languages must address all three.
The biggest issue with C++ metaprogramming is not necessarily that it is that powerful (though I've concluded that C++ allows too much overloading of semantics by library authors, and they seem to really like that), but that the metaprogramming "language", if you can call it that, is just incredibly weird and like nothing else on Earth. There's an entire jargon that literally only exists for meta C++, and that should be kind of a huge red flag, besides the entire thing being an accident. It's truly a "weird machine" in the truest sense of the word that somehow became an industry standard.
I think it's "weird" because it fills this weird gap between shitty macros and generic programming but does none of the hard parts of either. The syntax itself is not nearly as strange as those semantics.
For example, if you look at what type traits are used for, it's essentially making up for the fact that C++ lacked semantics for compile-time checks of generic arguments, forcing anyone that needed to care about that to implement their own (often simple, special case) constraint solvers. Meanwhile, typeclasses in other languages wrote a generic constraint solver (ironic) and support the semantics to express those constraints on generic types, obviating the need for any kind of complex compile-time template logic hackery. Essentially the lack of semantics for a simple concept (yet difficult implementation behind it) forced the implementation of a simple extension to the same syntax that enabled really complex macro programming using the same templating engines.
It's no surprise that metaprogramming in C++ is exceptionally weird; it supports neither proper macros nor generic programming, yet half-implements both through templates to make up for the lack of either.
Right, D has fairly sane metaprogramming (just slap the static keyword on all compile-time stuff!) and somehow managed to make it compile fast, unlike C++.
Not to mention C++ templates are hideously slow to compile (if used indiscriminately, as they often are). Although C++ isn't the only language that struggles with build times. At any rate, it does seem possible to do generics without slow compile times (C# seems to do fine, for instance), so C++ might be giving generics a bad name on that count as well.
Fast builds would be high on the list of my own "dream language" features. Long developer iteration times are poison.
I don't think C++ overdid it per se, I just think that they really got some of the ergonomics wrong. SFINAE and the CRTP are examples of idioms that are possible due to the flexibility, but unintelligible to read (I'm sure there's some population that really disagrees with me. I'm not saying they're not useful, just not ergonomic).
They're slowly layering in those ergonomics, which will eventually be nice, though there'll always be a big pile of legacy.
Generics in C# are crazy good, both in writing and using them, all while the compiler checks the correctness, especially if combined with constraints. I've tried "generics" in other languages but they all fall short compared with C#'s way. Between reflection, generics, polymorphism, and dependency injection, you have a winning combo of code reuse. It can turn thousands of normal lines of code into a few hundred. I've surprised myself again and again with how useful it can be, all without getting lost in the abstractness of it. The people on the dotnet team who carefully thought it through and implemented it deserve a ton of recognition for this.
Eh, it can be good or bad. While generics (in C#) can make your code easier to use correctly, by making sure that everything is of the right type, and can give you a nice perf boost when using generic specialization with structs, they also have a tendency of infecting every bit of code they touch with more generics.
For example, if a function is generic over T, where parameter X of type T is constrained to be `ISomething`, all the functions it calls with X should use constrained generics as well; this can easily lead to an explosion of generic functions in your codebase. Many times, instead of making a function generic over T, it's easier to make parameter X just be the interface `ISomething` and be done with it.
Very good point. I typically use both approaches; sometimes using an interface as a method parameter just "feels" more natural, but other times a generic + type constraint is better. I usually draw the line when the infection you mention becomes too much, aka when 3 or 4 layers are now forced to use a generic because 1 layer needed it. So yes, very valid point. The key is to find balance so that the code is still maintainable 6 months from now, still readable + fast, while trying to reduce duplication where it makes sense.
I'd love to know if you write lots of code and/or have to maintain other people's hugely templated code. I find if you have to do lots of the second you quickly stop writing it.
I would still prefer that over copy pasting that shitty list implementation for the nth type, or whatever “convention” some C devs do to make it “generic”.
> Arrays of ambiguous size were a really bad idea.
When? And for whom? Having to make everything a fixed size array like in C is no good either, that's for sure. It leads to bad guesses and comments like "should be big enough".
I think he was referring to areas of contiguous memory (arrays) where the size of the area is separated from the pointer to that area. Nearly any operation on the array will require both pieces of information, and tons of C bugs come from making assumptions about the length of an array that aren't true.
So, better just to carry the length along with the pointer (Rust calls these "fat" pointers) and use that as the source of truth about the array length (for dynamically sized arrays, such as those created by malloc).
They're referring to dynamically-sized arrays that do not store their own length. In C, this would be a pointer to the first element of an array; the length of the array must be passed around independently to use it safely. Instead, they're advocating collections which store their own length, such as vectors in C++ or arrays in C# or Java. Personally, I believe there is a need for three different kinds of array types:
1. Array types of fixed length, ensured by the type system. This corresponds to arrays in C, C++, and Rust, and fixed-size buffers in C#.
2. Array types of dynamic length, in which the length is immutable after creation. This corresponds to arrays in C# and Java, and boxed slices in Rust.
3. Array types of dynamic length which can be resized at runtime. This corresponds to vectors in C++ and Rust, and lists in C#, Java, and just about every interpreted language.
Of course, for FFI purposes, it is often acceptable to pass arrays as a simple pointer + length pair, this being the common denominator of sequential arrays across low-level languages.
We also frequently need reference types to contiguous storage in addition to value types of contiguous storage. In C++ this is satisfied by std::span for both the static and dynamic cases.
True. In Rust those are regular slices, and in C# and Java all collections are owned by the runtime anyway. Something which several languages do lack, though, is a view of a contiguous subsequence, in the manner of an std::span; it would be nice to see those in more places.
I've also taken a look at Golang's slices, which have rather confusing ownership semantics. One can take a subslice of a slice, and its elements will track those of the original, but appending to that subslice will copy the values into an entirely new array. In fact, appending to the original slice past its capacity causes a reallocation, which can invalidate any preexisting subslices. This also occurs with C++ vectors and spans, if I am not mistaken. This is an area where I think Rust's borrow checker really shines; it prevents you from resizing a vector if there are any active slices, encouraging you to instead store a pair of indices or some other self-contained representation.
You cannot append to a std::span, or append to a vector-backed span through the span. You have to perform the append on the underlying vector. It is possible to perform an insertion into the middle of a std::vector through std::vector::insert.
If you can establish the precondition that the underlying vector will not re-allocate as a result of the append, then it is perfectly safe to perform such an append while holding a reference to one or more elements of the vector. Same thing for insertions: references to elements before the insertion point may still remain valid if you can establish the precondition that the vector will not be resized. In both cases, it is straightforward to establish the precondition through the reserve() member plus some foreknowledge of how much extra capacity the algorithm needs.
You can always construct a user-defined reference type which back-doors the borrow checker, such as by storing indexes instead of iterators as you mentioned. If the std::vector is reduced, then they are still just as invalid.
I think it's also okay to address memory corruption using tooling (like how sel4 proof checks c), the language might not have to address it, but it would be nice for the language to make it easy to address.
"Params: Function parameters must be named, but no need to repeat yourself, if the argument is named the same as the parameter (i.e. keyword arguments can be omitted). Inspired by JS object params, and Ruby."
I've grown fairly fond of JS implicitly named object parameters, and I always liked Smalltalk's named(-ish) parameters, and this seems like an interesting compromise. I'm not actually sure how Ruby works in this case? I thought its parameters were fairly normal. Are there other languages that do this?
But in the case of one-parameter functions this seems unnecessary, the function name often makes it clear exactly what the first parameter is. And if you are going to support Subject-Verb-Object phrasing (as suggested later) then the Subject is another implicit parameter that probably doesn't need naming.
Maybe another approach is that all functions take one unnamed argument, and there are structs with named members for any case when that single argument needs to contain more than one value. Which starts to feel like JS/TypeScript. The static typing to support this in TypeScript feels really complex to me (it's unclear if the language described uses static types).
OCaml gets all of this right already. Labels are optional (which is IMO better than the article suggestion). You can abbreviate to only the label if the parameter variable is the same as the label. You can omit the label for labelled parameters if you want, they just become unnamed positional parameters. And of course the main thing is type safety to help you get parameters right in many cases.
The ability to treat named arguments as unnamed is not necessarily a feature - it's too easy for the caller to mistakenly pass the wrong thing without a label, and it also makes it that much harder for the API owner to version it in the future. Python adopted a special syntax for named-only arguments for these reasons.
(For others, labels are optional in three senses: not all parameters need to be labelled, the compiler can emit a warning if a label is omitted but you may want this to be an error instead, and there are optional labelled arguments which may be omitted entirely in some cases. Labels are not optional in that the type system is unwilling to convert between labelled/unlabelled or optional/omitted/required parameters and the compiler mostly won’t add hidden coercions between these types)
A common case where the trivial-value syntactic sugar fails looks like:
foo ~some_descriptive_label:t.some_descriptive_label
bar ~something:!something
Or
baz ~the_option:config.the_option
(Why not just pass the config? Two reasons: it is nice to have the interface being explicit about what data is needed, and the function may come from a module that cannot depend on the definition of the config type.)
I'm surprisingly happy with IntelliJ / Java's behavior. Java is all positional parameters, but anytime the arguments aren't obvious, IntelliJ shows the parameter name with subtle syntax highlighting. It feels like the best of both worlds.
A ton of features on this list really require an IDE or rich program representation, or _something_ that's not just plain text. For instance using content hashes to refer to functions is only reasonable with IDE tooling support.
But a programming language is itself a tool. And if you need one tool to improve the other, that implies your tool (the programming language) is not good enough.
IMO, the only reason Java is still in use is its superb IDE and tooling, which compensate for terrible language design.
Vertical alignment. When you press the down arrow, you don't know what position your cursor will land on because it's not visually aligned with the character above. Kind of defeats the purpose of a monospace font (within that one line where it's applied).
> I've grown fairly fond of JS implicitly named object parameters, and I always liked Smalltalk's named(-ish) parameters, and this seems like an interesting compromise. I'm not actually sure how Ruby works in this case? I thought its parameters were fairly normal. Are there other languages that do this?
Ruby allows keyword arguments, which are very similar to object parameters in Javascript:
One thing that's missing in Ruby (that I quite like in JS) is the ability to rename keyword arguments in the named parameter list, like:
// javascript
const example = ({ max: maxNum = 123 }) => {
  /* ... */
};

# ruby
def example(max: 123)
  max_num = max # can't do this in the parameter list
  # ...
end
Honestly, this limitation is just about the only thing I think is missing from Ruby's parameter-list syntax.
> But in the case of one-parameter functions this seems unnecessary, the function name often makes it clear exactly what the first parameter is. And if you are going to support Subject-Verb-Object phrasing (as suggested later) then the Subject is another implicit parameter that probably doesn't need naming.
Maybe there could be a special sigil for positional arguments? I like the idea of gently discouraging them, but still making them possible. Maybe something like
// hypothetical language
function singleArgFunction(@arg0: number) { /* ... */ }
function mixedArgFunction(@arg0: number, foo: string, bar: string) { /* ... */ }
function standardFunction(alfa: number, bravo: string) {/* ... */}
singleArgFunction(0);
mixedArgFunction(1, foo: 'hello', bar: 'world');
standardFunction(alfa: 123, bravo: 'abc');
On your last point I've been toying with the idea of having even the function itself folded into that struct
the struct defines the unbound inputs as well as mapping those to different named outputs
one interesting aspect to play around with is to then have a commutative and associative application/evaluation operation, where structs (or environments really) are merged and evaluated by binding what can be bound
say
(a,b:1+a)(b,a:1,c:a+b,d:2)
would evaluate to
(a:1,b:2,c:3,d:2)
not saying it's a good idea - kind of falls down on scoping - but could be an interesting toy.
This makes me think about some sort of exec-with-dynamic-scope approach to functions, like:
func = {
  b = 1 + a
}
result = func {
  a = 1
  c = a + b
  d = 3
}
Where func {...} (two adjacent blocks) is like a composition and merges the two blocks. "=" has the mathematical meaning, not assignment but a statement of equality. The parameters are just the free variables, e.g. func has a parameter of a. Are external routines or functions also parameters? I'm inclined to say yes, and that it's interesting to treat them as such.
This is all very Mathematica/Wolfram-like, with concrete expressions with undefined terms, different than functions that are executed.
The result has no real sense of order, and so it implicitly has to be purely functional. I can imagine using this to build things-that-execute, where this code doesn't execute but defines an execution (like tensorflow). Or there's some outside layer that creates streams of input, and the "execution" is how it responds to those streams. That outside layer is where all the system integration happens. It would work well with React-style unidirectional data flow. But I think for any practical program you'd have to live in both worlds.
The latest Ruby version works as described in the «keyword arguments can be omitted» link.
The dream language is gradually static typed. Like TS, but the type system and inference would ideally be more like OCaml or ReScript. Hopefully avoiding some of the complexity of TS, and gaining more soundness.
I've known about the majority of these points since I learned Scheme in college back around 1995. My thoughts have solidified and remained mostly unchanged since around 2015 as I've watched web development eat itself.
For example: PHP is one of the only languages that got pass-by-value right, other than maybe Clojure. It even passes arrays by value, using copy-on-write internally to only make copies of elements if they're changed. Unfortunately that triumph was forgotten when classes were added, passed by reference of course, putting the onus on the developer to get things right, just like every other mediocre language.
Javascript got the poison pill of async, Ruby is Ruby, and I have so many disparaging things to say about mainstream frameworks that it's probably best I don't say anything at all.
My only (minor) disagreement with the article is that I do want reflection, so probably need some kind of type info at runtime, ideally with low or no storage overhead. If that's a dealbreaker, at least make it available during development or have a way to transpile that into a formal implementation, maybe something like Lisp and C macros.
PHP was one of the first languages that I learned, and I used it professionally for years. I agree with you re: pass-by-value; even to this day I find it annoying in other languages that I need to remember their particular quirks around that.
What would be the main motivation for reflection? I’d like to be convinced it is fundamentally necessary, or just worth it, given the potential for introducing footguns.
Oh I'm looking at the development side, being able to iterate quickly and having full visibility of the code. This fits in with stuff like aspect-oriented-programming (AOP) so that things like execution traces can be generated at runtime without modifying code.
I mostly work with the shell and scripting languages. I'm racing as fast as I can all day to get anything at all to work, then building on that to get to a solution. Most of the time, I can't really edit the code I'm working with and still maintain velocity, so can't do any kind of manual type annotation.
Also I question if there is really any merit to hiding the type of a variable at runtime. It feels like a power trip by the language, a form of security theater. I also view the "private" and "final" keywords with similar skepticism. If I can call typeof on an element in my own struct/class, then I get frustrated when I can't do that with anonymous data from a framework or in the REPL. It also frustrates me when I can't access the call stack or get information about the calling function. If there's a way to do it in C++ (no matter how ugly), then I must be able to do it in whatever language I'm using or my psyche views it as stifling and I lose motivation.
I guess I find the obsession with types today to be a form of pedantry. So much concern for only accepting a narrow range of shapes, but then holding that info close to the chest. It should be the opposite.
That's also the reason why I've gotten away from object-oriented (OO) programming. I've found that functional programming (FP) on JSON types (number, string, etc), and doing declarative programming in a data-driven way, usually runs circles around the boilerplate/interface yak shaving of OO.
The main place I've seen these be useful is in ORMs. Metaprogramming allows you to decouple the business logic from the plumbing of query construction and memoizing, while maintaining syntax that most people are comfortable with. Django ORM did a really good job with this.
JS Proxy classes also use reflection. Check out the immer library for an awesome example of how those can be used to make code easier to read and reason about.
Reflection should be used very rarely, but it's an amazing tool when it's the right tool
I like how Python's indentation scheme makes code look neater, but I find it scary in the following case:
for i in range(0, m):
    for j in range(0, n):
        doInner()
    doOuter()
A single tab here in the 4th line produces syntactically correct but logically incorrect code. I find this scary.
I guess it would be easy to just write your own preprocessor that requires curly braces everywhere and removes them for the python interpreter, but eh.
This example seems contrived. I used to swear by clean demarcations of expressions because of this logic, but in practice the return values of nested expressions are so rarely the same, and code is always reviewed before merge with at least some kind of test that would make this obvious, that the problem is almost unnoticeable.
For example:
for i in range(0, m):
    for j in range(0, n):
        ... inner ...
    ... outer ...
If `...inner...` is dedented then you will almost certainly get a compilation error since it is referencing `j`. If `j` is not referenced at all, you'll get a linter error saying "unused variable, j".
Additionally, you probably wouldn't want to write nested for-loops anyway, but something like flattening the inner loop into a return value.
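One hedged reading of "flattening the inner loop" in Python (whether this matches what was meant is a guess; `grid` and the sizes below are made up for illustration):

from itertools import product

m, n = 3, 4
grid = [[i * n + j for j in range(n)] for i in range(m)]

# Nested iteration written as a single loop over the cross product:
for i, j in product(range(m), range(n)):
    pass  # whatever the inner body would do with (i, j)

# Or the inner loop collapsed into a value per outer iteration:
row_sums = [sum(grid[i][j] for j in range(n)) for i in range(m)]
print(row_sums)  # [6, 22, 38]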
The problem usually isn’t …inner… being dedented, but …outer… being indented. Which usually won’t produce an error, since anything that could be in …outer… could also be in …inner… without error.
> Additionally, you probably wouldn’t want to write nested for-loops anyway, but something like flattening the inner loop into a return value.
Nested for-loops aren’t uncommon in real-world code. And, I…am not sure what you are saying here. Loops are imperative constructs, not return values.
Since Python scopes lexical variables to the function instead of the block, `j` is in scope for every line of `... outer ...`. So de-denting the last line of inner doesn't trigger a NameError, even for new bindings which were introduced within `... inner ...`.
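A small self-contained example of that scoping rule (standard Python behavior, nothing assumed beyond the parent's description):

def f(m, n):
    total = 0
    for i in range(m):
        for j in range(n):
            total += j
        # `j` is still bound here: Python scopes it to the function, not the block,
        # so a line accidentally dedented out of the inner loop raises no NameError.
        total += j
    return total

print(f(2, 3))  # 10, not an error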
Also you need white space diffs to be turned on to spot this change in code review, and this is particularly bad if indentation changed for another reason.
Tools like Prettier can automate this with curly-brace languages, so indentation (tabs or spaces?) stops being something you waste brain power on, similar to how code editors will match the current line's indent on the next line when you press enter even though it's just "press tab a few times". It adds up.
// just copied some code into this while block, oops I copied an extra space out front...
while (...) {
if (...) {
...}
}
// after Ctrl+S tied to autoformat
while (...) {
  if (...) {
    ...
  }
}
Do you never move code around? Especially moving things into and out of loops?
That was my experience in Python.
-> These statements need to be in this other loop.
-> Cut...paste
-> Reindent
-> Turn my brain to extreme paranoia, and double-check that I properly indented the first and last lines of the code I moved.
In braces-languages, it was just cut..paste..autoformat. If you missed a brace, you hear about it from the compiler.
Python is a great language in a lot of ways, but I caught myself making indentation mistakes twice a week. Significant indentation is, in my opinion, an error in language design.
I've never had a problem with this when moving code around. Cut/paste, indent entire block to match initial level, done. Where is the potential for errors? There really is none, because the brainbox keeps track of what I just pasted and tab/shift-tab moves that block as a unit. I find moving stuff around in braced languages a wee bit more work, actually, because you have to move the braces manually instead of shifting the block in and out, which is quicker.
Personally I've been way more annoyed when I've typed something up in the REPL and can't paste that over directly because of the >>> and ... but that's really a small tooling issue; a smarter editor would remove the REPL marks when pasting. Same for other REPL languages, most of them have PSn...
I normally select the whole block that I just pasted and then press Tab or Shift+Tab to indent/dedent the whole thing at once---no possibility of missing the last line then. This works in most editors.
They're similar in terms of how advanced they are. I find Nim preferable, personally, and it has a pretty excellent batteries-included standard library, as well as robust tooling and third-party library support. That said, I'm biased, and haven't used Crystal for a couple years. Anecdotally, I see more uptake of Nim than I do of Crystal, but again, I'm biased and it's likely because I'm looking for it, rather than it being a fact of the world.
Later on the author wants an autoformatter as part of the core language tooling, ala gofmt. I think with autoformatting the code ends up as readable as Python... the code can still be unreadable, but not because of misleading indentation! (Well, I still am offended by the vertical inefficiency of K&R style bracing, but presumably the language wouldn't copy that style...)
I don't mind indented code when writing or reading code. However I think indentation often becomes a pain when copying code from e.g. StackOverflow - the indentation is never quite right.
Hypothetically you're on a team with a bunch of junior devs.
You're doing a PR review; assuming this dev is sloppy, would you rather it be in C++ or Python?
I didn't like Python or Ruby at first, but now whenever I need to write a small tool, it's Python. Before Python I was using JavaScript, but I've fallen in love with how clean Python is
> You're doing a PR review; assuming this dev is sloppy, would you rather it be in C++ or Python?
I'd rather it be in Python because I haven't touched C++ with any seriousness since the 1990s.
But not because of formatting (in either case, I’d prefer a standardized code formatter be in use, which gets you a lot farther than Python’s whitespace sensitivities alone.)
In visual studio (not code), there is an extension to render braces as very small, almost invisible. So with that extension and standard IDE enforced indentation (depending on those braces you can barely see), C# code almost looks like python code.
I think GvR observed that beginning programmers often got confused when the indentation of their code didn't correspond to the syntactic nesting implied by brackets, etc. They found the indentation to be more salient and easier to interpret, so he designed the language to cater to that.
Sorry, I wasn't clear enough: ABC is not just some random language - it's one that GvR worked on before Python, and he specifically said in several interviews that it's where Python originated from. As you correctly surmised, ABC was a teaching language, hence the choice to go with indentation.
The problem is when auto formatters completely wreck the semantics of your code because of a missing or misplaced character. It has actually happened often and drives me nuts every time.
I prefer to write code the way I want/need then apply a formatter to unify my code with that of myself and colleagues.
Depends on what the formatter does when the code is syntactically invalid in the first place. It could just error out, but that might be annoying if the error is in an unrelated part of the file.
So a formatter might opt to format on a best-effort basis even in the presence of syntax problems, but then there's the risk of creating different semantics than intended (once the syntax problem is fixed).
One example I can remember was an empty if or for loop body where everything beyond it was raised into its scope.
I'm talking about the PyCharm formatter specifically here. This does also happen with braces sometimes, but it would've simply been a syntax error in most other languages, or just pulled up the single next line, not the entire following code.
While i normally prefer indentation to demark function scope, it has one huge drawback: When writing callbacks, especially chained promises, Python has no way to express the following in a convenient way.
Lambdas only allow a single expression. Defining the callback-function upfront leads to lots of boilerplate and reading the code out of order. Async reduces the need of callbacks but instead leads to colored function problem.
> While i normally prefer indentation to demark function scope, it has one huge drawback: When writing callbacks, especially chained promises, Python has no way to express the following in a convenient way.
Huh? Assuming an identical API structure, its syntax for exactly that would be:
foo().then(bar).then(print)
> Lambdas only allow a single expression.
Sure, but your problem code uses only single-expression lambdas with superfluous blocks (where, in fact, the lambdas are also superfluous), so it's literally the worst possible thing to say that Python can't express as cleanly.
Also, single-expression lambdas can handle a lot, because expressions are easy to combine to arbitrary levels; it's only multi-statement imperative blocks you can't do in a lambda, and Python has expression forms that cover lots of uses of imperative blocks already.
Sorry, I only used a single expression in the example for brevity. Normally they contain a lot more than one. I thought that would be obvious from the context; the form you presented would fall into the “define function upfront” category.
There are certainly some things about Python that make it easier to write hard-to-read code in it compared to Rust.
One example is that booleans in Python are considered integers (0 and 1) for all purposes, including arithmetic: True+True produces 2 with no warning whatsoever. Even in C++, where this is also legal, you usually also get a warning.
And conversely, Python is much more flexible when it comes to interpreting other things as booleans: on top of implicitly treating null pointers and integers as false like C does, Python does the same to empty strings and even collections. Then there's the part where "and" and "or" don't just return true/false, but rather the value of one of the operands, which needs not be a boolean.
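A few lines in a REPL illustrate those behaviors (this is standard Python semantics, shown here only for concreteness):

print(True + True)          # 2           -- bools are ints for arithmetic purposes
print(bool(""), bool([]))   # False False -- empty strings and collections are falsy
print(0 or "fallback")      # fallback    -- `or` returns an operand, not a bool
print("x" and 42)           # 42          -- `and` returns the last evaluated operand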
Sequence comprehensions are also a mighty tool for producing unreadable code if desired, especially if you nest them and/or use the conditional operator. This is more so in Python due to the unusually ordered syntax for both.
Here's a fizzbuzz code golf example that combines these techniques:
[f"{(not x % 3) * 'Fizz'}{(not x % 5) * 'Buzz'}" or x for x in range(1, 20)]
> Can you seriously say it's easier to write hard to read python than writing hard to read Rust or C++ ?
Dynamic typing and passing around **kwargs make this so trivial that often the documentation doesn't even help, and your IDE can't help you without executing the entire program.
Decorator over-use can lead to some difficult-to-read Python.
Doing just about anything imperative at import-time makes it hard to reason about what happens when. Even experienced Python programmers can get tripped up by the import system's order of evaluation.
I like indentation in certain languages. Like F#. Because for me, the combination of dynamic typing, statement-based code, and significant whitespace is just bad.
Enforcing indentation is generally a good thing, but it should be possible to switch it off or loosen it up when it ceases to be good. Which is impossible with a Python-like indentation-only syntax.
You should take a look at F#. While not faster than C necessarily, it beats Python in every aspect aside from the availability of certain libraries and can look quite similar to Python.
I was just talking about this with a coworker. I think the hardest part about creating a dream programming language is that dreams are subjective and everyone dreams of different things.
For example, I'd rather write code in a functional language than an OOP language, but others feel the other way around. Elixir is currently the closest language to my dreams, but my coworker hates working with it due to its functional nature. On the other hand, I'm not a big fan of Go (and would most likely choose Crystal over Go if I needed a compiled, single-binary something-or-other), but my coworker loves it.
I'm just glad that there are so many high-quality languages to choose from nowadays, and good paying jobs to go with them. That wasn't the case when I started my career 30+ years ago.
I wish there was a nicer way to integrate multiple languages together. Like if you could somehow transpile perfectly (or pick a really neat abstraction w/o optimizations) that preserves a general implementation.
Seems impossible but I wish there was a nice solution so people can use a variety of languages seamlessly without worrying about refactors and semantics. Almost like picking database views in Notion but for languages itself.
This is a great summary of the state of the world of programming languages, 2021. It goes through a number of language features that have taken off, and which have moved the space forward over the past few years. I think a language that addressed all these points (even if "address" means consciously doesn't support) would be a very interesting language!
For me, an interesting language development would be finding ways to make managing dependencies more... manageable. We've seen a rise in systems for packaging, distributing and consuming code written by others. However, still a significant part of maintaining systems is spending time on upgrading dependencies, updating application code as things are deprecated, responding to CVEs, etc.
I think there's interesting research and ideas to find in this area, that could lead to a productivity boost. One example idea (I'm sure there are better ones) would be allowing simultaneous use of multiple versions of a library. E.g. This dependency I'm consuming uses CommonLibrary-1.2, and this other dependency uses CommonLibrary-2.4, but they can both get along just fine without having to find some combination both dependencies can agree on.
yes! although you don’t even have to modify the code per se if you have «content-addressable code». See other comments in this thread that expand on it.
I think it would be cool if there were more options in the "programming language backends" space like LLVM. That way you could create custom programming languages that focus on the syntax/QoL features that fit your idea of a "dream" without having to worry about performance, since it uses a known, highly optimized backend.
There's QBE (https://c9x.me/compile). It's tiny compared to LLVM and only does the most impactful optimizations, but it's also much simpler to work with. A small but dedicated community of language builders has gathered around it.
JVM has done pretty well in that space over the years ... with the upside that you also get a huge library ecosystem for free which is less accessible with the LLVM approach.
> …when everything looks the same it is hard to tell things apart
This is so true. And it happens on both sides of the complexity.
In forth, it all looks the same because the basic philosophy is “here’s a molding machine, make your own lego.” You end up having to rebuild constantly, constructing what was constructed on what was constructed on what was constructed.
In syntax heavy languages you end up with lots of contextual overloading so you have to parse the environs of a token to figure what it’s actually for (e.g. []’s in Swift, is it a subscript operator of various kinds, an array, or a dictionary??).
I think there’s a sort of sweet spot: making enough generally usable and quickly recognizable pieces, but keeping the set small, a bag of generally useful pieces. And to top it off, it’s nice when the elements have a sort of harmony/consistency.
My dream programming language is imperative with global variables when I add new code and transpiles into functional with minimal shared state when I'm done with changing stuff and just want to read what happens.
The part of functional programming I hate the most is passing some new data through 10 levels of callstack to use it in a function that didn't need it previously. It should be automated.
> The part of functional programming I hate the most is passing some new data through 10 levels of callstack to use it in a function that didn't need it previously. It should be automated.
Probably, you may restructure your code, use curried functions, and you won't need to go through all 10 levels.
The transition from imperative to functional programming may look like untangling a ball of function calls, in order to get a simple and clean design.
I can structure the code cleanly; that's why I love functional programming. Although I mostly use Clojure, not SML-like languages, so I don't go overboard with currying. From my limited experience, currying makes the order of arguments matter A LOT, and then you have to refactor that often.
My problem is that I'm mostly writing games in my free time (the only time when I'm allowed to use functional languages), and writing games is mostly about quick iteration and testing many small changes. So previously I had the code to handle collisions only take potentially colliding objects as inputs and outputs. Now I want to add particle effects when things collide. Pass that particle system and return particles from the function.
Then I think - what if collisions cause the screen to "shake"? Again pass new data to the function. Then I decide it looks stupid and revert it.
Then I think - what if police reacted to collisions if they are near the police station? Again - pass some new data there.
I can do this, but it's a lot of refactoring. I'd prefer if I could just use a global variable and hit a button in IDE to refactor it to a nice functional code.
Currying is a way to do "dependency injection" in FP, so it lets you "capture" some arguments and not pass them each time. Sure, argument order matters, but you don't need to refactor that often.
I'm not familiar enough with Clojure, but in F# your sample with collision/shake/police may look like the snippet below,
where each function accepts gameState, runs some logic against it, and returns the modified state, so you don't need to pass a dozen arguments through the call stack.
(-> someGameState
    handleCollisions
    handleShake
    handlePolice
    ;; ...
    renderGameState)
;; ->> would work as well in this example cause it's both first and last argument
But this code is only pretending to be functional, because in practice every function can modify everything. So looking at this I have no idea where the particles were created - I have to look at every function anyway.
This guy doesn't seem to know what he actually wants. He wants the language to be "extensible" by "library authors" (?) but then later on he argues that the language should be locked down, thus hard to misuse.
Thanks for pointing out the inconsistency. What I meant was that the language core should be very constrained around composition of a few core primitives (self-hosting), but that it could be modified or built upon by others. So that it could evolve in multiple avenues of exploration, and gain from the competition. Where it would be up to the community to decide whether they want to use the constrained version(s) (suitable for large scale complex environments), which I prefer, or the bring-your-own syntax version(s) (suitable for small scale playful experimentation and research) which would inevitably appear. Inspirations here would be Lisp, Clojure and Racket.
What would be important is to facilitate simple language _merging_, due to all the divergence that would appear. Inspired by Git. So the community could easily find its way back together after a split (if their ideas and goals come back into alignment, and they have converged to an agreement on the features again).
>REPL-driven-development, or interactive-programming. Inspired by Clojure. But without having to leave your IDE to code in a console/prompt/terminal.
You don't have to leave the IDE to evaluate code in Clojure. It's common to spend most of your time in the IDE and only occasionally type at the REPL prompt.
Fascinating to me that we still have tremendous creative ecosystem of new languages appearing and yet it still seems people are yearning for their "dream" language and unsatisfied with the options. Is it a fundamental truth that we can never have the perfect language?
maybe not even with the exception of what you learnt in high school…? It was ambivalent then, and people even struggle with remembering and applying PEMDAS years later.. wouldn’t it be better - for humans & computers alike - if application of algebra was completely uniform, chronological, and predictable?
Something people often miss about operator precedence in math is that it very closely tracks implied grouping by spatial orientation of the operands in conventional notation. The "rules" are a formalization of an implied intuitive ordering of operations.
This intuition is pretty well lost in programming language operator precedence tables.
Good point. Smalltalk is a source of inspiration, though I am unaware of how it handles math expressions in particular.
The solution as I see it: You’d have to enforce using parentheses (in line with «explicit over implicit»), alternatively enforce an ordering of math expressions so that the operators with the highest precedence has to be placed first. This might even be beneficial in tail call optimization, since it minimizes what has to be kept in memory (for the compiler as well as the human reader).
> I am unaware of how it handles math expressions in particular.
It does not automatically do order of operations. You have to explicitly use parentheses, as you say. However, methods in Smalltalk tend to be quite short due to the nature of the system, so it tends to not be as messy as one might expect.
I don't think that enforcing an order of expressions would help at all with optimization, since it's just a syntactic requirement. You should end up with an equivalent abstract syntax tree, and as a result the same underlying machine code.
Could barely get through the first few paragraphs with that terrible layout. It mobile-optimized my laptop screen so that there were 4 words on each line. Gross.
True. That is indeed an implication, unless some other facility is available. The necessity of meta-programming, and if so, the extent of it, is one of the biggest uncertainties I have currently, which could be resolved at this time by input from someone more knowledgeable and more familiar with it.
Some of the related points from the article:
> Will likely need to be able to treat code-as-data. Might need compile-time macros.
> Meta-programming: No first-class macros (runtime), since it is a too powerful footgun. But should have compile-time macros.
One thing that always strikes me is that a lot of the "function arguments" mess is because destructuring/deserializing is slow.
If structuring/serializing were fast, a lot of the "data sharing" issues go out the window as you could copy everything.
Your function arguments instead become "I want a named dict/hash/struct. It should have these named members with these types. Any missing members get these defaults. If any members are still missing, please complain."
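A hedged Python approximation of that calling convention, using a dataclass as the named struct (the `ResizeArgs` name and its fields below are invented purely for illustration):

from dataclasses import dataclass

@dataclass
class ResizeArgs:
    width: int                # required: constructing without it "complains"
    height: int               # required
    keep_aspect: bool = True  # missing members get these defaults
    dpi: int = 72

def resize(args: ResizeArgs) -> None:
    print(args.width, args.height, args.keep_aspect, args.dpi)

resize(ResizeArgs(width=800, height=600))   # defaults fill in keep_aspect and dpi
# ResizeArgs(width=800)                     # TypeError: missing required 'height'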
The problem with this is that it assumes the programming language is the last word in the developer experience.
I am certainly a language person, and you cannot get anywhere with a bad language, but merely having a good language is not enough.
Working on github.com/obsidiansystems/obelisk/ we have very, very many things we would like to work on, but not super much of it involves "adding new features" to Haskell. And yet the libraries are so richly layered that, e.g., the frontend feels more different from regular effectful code in Haskell than regular effectful code in Haskell feels from that in other languages.
The one feature we would like in Haskell is arguably a "removing the weaknesses and restrictions that make additional features appear necessary" thing, which is something like https://github.com/conal/concat and https://arxiv.org/abs/1007.2885. This would be a way to make some categorical DSL stuff less annoying to write because you have to make it point-free.
This reminds me of some experiments that I have been having. I'm (slowly) learning what it means to implement languages through various sources, but one thing I've been testing and thinking about is designing a language from the top down. That is, basically creating a design for the language's syntax, semantics, environment, and libraries without concern how one would eventually implement it.
Sounds rad :D
I've been in a similar position. Learning from the higher end of things but beginning to pick up some clisp.
I've been toying with the idea of quantum mechanics in relation to scale-free architecture... wanting to read this book on quantum chemistry and group theory and eventually think from the bottom up, but in a way that could perhaps keep the "entanglements" to the priorities set on higher levels of the semantic stack...
I might be a bit far out here, but my goal is to eventually gather enough knowledgeable folks in a shared repository to at least begin a collection of discussions about what is desired. I started a GitHub repo with a collective approach and the forum is open for app if you're interested in collaboration.
The more of these "takes" that I can account for in designing a new lang the better, and as other commenters point out, the disagreements that would arise in collective development are where I think forks and merges can come in handy. Instead of docs as code, maybe more like code as docs? With atomicity and scalability? Am I just naive? I'll find out later.
https://github.com/holarchyclub/discussions/discussions
I think the convention is to only use the <expression><if…> form for a guard statement, i.e. `return if condition true|false`, which is perfectly readable.
A lot of language angst seems based on specific syntax issues, for people migrating/visiting from other programming “cultures” rather than community evolved style issues.
Good point. I agree. Most Ruby code uses `if <conditional> <expression> end`. Tried to rephrase it now so it gives the impression that `<expression> if <conditional>` is merely syntax that Ruby allows.
My dream language would be simply Typescript freed from the shackles of JavaScript (and thus simplified), with a sound type system, a standard library, decimals, runtime type checking, and compiling to machine code, WASM, and/or the .NET IL.
A few more detailed features that would be good:
- Higher-kinded types
- Generic objects
- Everything can be an expression, e.g., if and switch statements
- Dependent types? In that case the language can be made type-only
That’s attractive! To be fair, that dream is more likely one that will inevitably come true at some point, given the enormous base of JS/TS developers it will immediately be able to draw from, and the lower/zero knowledge barrier for them to learn it.
Don't like their (relatively) complex syntax and inferior tooling support. Elm also lacks generics. Lisp is out of the question, for obvious reasons. Haskell has difficulty with side effects.
I feel like the author is maybe making subtler points than I'm able to understand, or something. At one place they're arguing against inversion of control, and somewhere else they're praising Phoenix's Plugs, which are basically callbacks that take and return a connection.
They say at the end that some of these requirements might be conflicting, but then what's there to learn from the article? Everyone can list their pet-peeves with most languages/frameworks they've ever touched.
From my modest understanding of Plugs, they work like function calls, and in Phoenix you call them, so you can always follow the control flow. As opposed to Rails, where you write («magic») methods which you have to trust the framework will call at some point (IoC). I may be wrong, but that's at least what I meant.
To your second point:
On one hand the article serves as my own summary notes of what I’ve found desirable features (plus why), and maybe could serve as inspiration for a new language spec someday. Matz found it inspirational, at least: https://twitter.com/yukihiro_matz/status/1451548965019668489... The article links to many tremendous talks and writings, from various thought-leaders and pioneers, after all. I’ve just collated the principles and features I have become convinced are valuable.
On the other hand, since the context is language design, the purpose is to understand which features are generally considered good, so even a collection of favorite features or pet peeves can be a starting point, at least for shared discussion. The point is, furthermore, to find the good and avoid the bad, not to gripe over all the peculiar cases of bad, of course. Many of the points also have a brief justification or further reference, which can also be helpful and can be discussed (agreed with, or refuted).
I don’t have all the answers, and teasing out which points are actually conflicting, either from my oversight/inconsistency or from fundamental mutual exclusivity, is part of the discussion. A discussion that would especially benefit anyone the dream may resonate with. So I am grateful that you are pointing out any flaws or inconsistencies as you see them.
The two big improvements are that precompilation happens at package install time and in parallel. Also, the compiler has generally become significantly better optimized; there's still more work to do, but the difference compared to 1.5 is often around 3x.
I love these kinds of discussions, but I'm also sure it will tire me out because the ideas are likely stated either too specifically or too succinctly.
> Everything should be able to be encapsulated
Not only should it be optional (default public), but it should be encapsulated with an escape syntax (similar to Python's underscore or Go's package-level backdoors). This is specifically good for testing. Note that the article later also rules out runtime reflection.
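TypeScript today happens to illustrate both ends of that spectrum; a small sketch (class and member names made up for the example) of compile-time-only `private`, which a test can deliberately bypass, versus runtime-enforced `#` fields, which it cannot:

```typescript
class Counter {
  private count = 0; // compile-time privacy only
  #secret = 42;      // ES private field, enforced at runtime too

  increment(): void {
    this.count++;
  }
}

const c = new Counter();

// c.count;            // error: 'count' is private and only accessible within 'Counter'
(c as any).count = 10; // ...but a test can deliberately reach in via `as any`

// There is no such backdoor for #secret: runtime-enforced private fields are
// hard encapsulation, which is exactly what the comment argues should not be
// the only option.
```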
> No need to keep things in human memory for more than about 20 lines of code at a time. If extending that, then there should be a conceptual model
Not a language issue. This is what people think of when talking about a function. The problem is how functions have to physically be defined, which causes a lot of jumping around code to understand it. The expansion of function definitions inline, on demand, is a capability that only a LISP IDE has had, afaik.
> Content-addressable code: names of functions are simply a uniquely identifiable hash of their contents.
I've been a proponent of this since I started writing unit tests. There is no unit test to defend against "someone added a statement with a new side effect".
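A toy sketch of the idea (in the spirit of Unison, not a real implementation; assumes Node for the crypto import): derive a function's identity from a hash of its source, so the edit that sneaks in a side effect necessarily produces a new name.

```typescript
import { createHash } from "node:crypto";

// Hash the function's source text. Real systems hash a normalized AST rather
// than raw text, but the effect is the same: any edit changes the identity.
function contentAddress(fn: (...args: any[]) => unknown): string {
  return createHash("sha256").update(fn.toString()).digest("hex").slice(0, 12);
}

const add = (a: number, b: number) => a + b;
const addNoisy = (a: number, b: number) => {
  console.log("side effect!"); // the sneaky new statement
  return a + b;
};

console.log(contentAddress(add));      // one hash
console.log(contentAddress(addNoisy)); // a different hash, i.e. a different "name"
```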
> Since discipline doesn't scale.
That's wrong, as a statement. Discipline is, generally, the only thing that scales, and that's a Good Thing (tm). The examples given were not directly related (a language adding more ways to do things) and are not compelling.
> Perhaps the language should even restrict a function's scope only to what's sent in through its parameters. So no one can reference hidden inputs^
This kind of wish seems like silly idealism wrapped in ignorance of what smart people will do when a language restricts them too much. Statics are good to have and should be preferred for most operations (once you break functions into small functions, it's obvious), which is not the same thing.
Shortly thereafter it goes off the rails completely with lots of strange views:
> No magic / hidden control. Make Inversion of Control (IoC) hard/impossible(?).
Also, the cited reasoning for avoiding exceptions is not compelling; it's lumped in with specific Java functionality instead of general reasoning about the pattern.
The constant wish for immutability really isn't the panacea presented. The ability to remove items during iteration in Python is fantastic and preferred to swapping data into different structures to perform operations.
Some wishes I would add (rather than a big diff of what I think are missteps)
- No "new" keyword to create objects/instances. Object() creation should be trivially mockable. Inspired by Python.
- Normalize "Helpers", which are a structure to hold static properties and methods. An IDE should be able to analyze a function and recommend it become a static function.
- Comments should be separate from code, joined at IDE or joined via manual tooling. This would allow comments to span multiple lines/function and files. IDE could also alert when breaking changes are made. Pairs well with the Content-addressable code wish.
^This is an invitation to attach output buffers and input buffers (normally for UI) to transfer around information to static methods. I have read through this kind of code before.
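On the first wish: a hedged sketch of what "no `new`, trivially mockable creation" can look like even in current TypeScript is to route construction through an injectable factory. All names (Clock, SystemClock, makeGreeter) are invented for the example.

```typescript
interface Clock {
  now(): Date;
}

class SystemClock implements Clock {
  now(): Date {
    return new Date();
  }
}

// Creation goes through an injectable factory instead of a hard-coded `new`...
function makeGreeter(clockFactory: () => Clock = () => new SystemClock()) {
  const clock = clockFactory();
  return () => `It is ${clock.now().toISOString()}`;
}

// ...so a test can swap the factory without a mocking framework or reflection.
const fixed = new Date("2021-01-01T00:00:00Z");
const greetInTest = makeGreeter(() => ({ now: () => fixed }));
console.log(greetInTest()); // "It is 2021-01-01T00:00:00.000Z"
```

In Python the class itself is already a callable factory that tests can monkeypatch, which is presumably what "Inspired by Python" refers to.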
> The expansion of function definitions inline, on demand, is a capability that only a LISP IDE has had, afaik.
That is a great idea. Take the definition, splice it into the call site, rename local variables to match the caller. Interesting take on step into while running a debugger too.
Not really. What's a mess is having a single value NULL that means very different things in different contexts. For example, an unknown value is different from a missing value is different from an error value is different from an elided value. If you need to return all these kinds of values from a function, how would you distinguish between them?
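For instance, a discriminated union keeps those cases distinct and forces callers to handle each one (the variant names below are illustrative, not from any particular library):

```typescript
// A single NULL collapses all of these; a sum type keeps them apart.
type Lookup<T> =
  | { kind: "value"; value: T }
  | { kind: "missing" }              // the key simply isn't there
  | { kind: "unknown" }              // not yet computed or fetched
  | { kind: "elided" }               // deliberately omitted, e.g. redacted
  | { kind: "error"; error: Error }; // the lookup itself failed

function describe(r: Lookup<string>) {
  switch (r.kind) {
    case "value":   return `got ${r.value}`;
    case "missing": return "not present";
    case "unknown": return "not yet known";
    case "elided":  return "withheld";
    case "error":   return `failed: ${r.error.message}`;
  }
}
```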
Exactly six of these points are counter-inspired by Ruby, some of them very subjective ("No unless or other counter-intuitive-prone operators"), some duplicated (Ruby meta-programming is used twice as a reason). The author says they like Ruby but then uses the most characteristically Ruby things as counter-examples.
> Compiled, but also interpreted and/or incrementally compiled (for dev mode). Inspired by C++ and JS.
> Interpreted / incrementally compiled: So developer can write quick scripts and get fast feedback. Sacrifices runtime speed for compile-speed. Except it also needs quick startup/load time.
> Compiled: For production. Sacrifices compile-speed for runtime speed. Compiles to a binary. Inspired by Deno.
But compilation speed is not prioritised above all…:
> Readability and reasonability as top priority. Reduce dev mind cycles > reduce CPU cycles. Human-oriented and DX-oriented. Willing to sacrifice some performance, but not much, …
> Should always be able to be read top-to-bottom, left-to-right. No <expression> if <conditional> like in Ruby.
> But it should borrow some similarities from natural language (like its popular Subject-Verb-Object structure) to make adoption easier (more at-hand/intuitive).
(Quibble: natural language isn’t subject-verb-object, English is. Mostly. This is acknowledged later in the article.)
Subject-verb-object and verb-object look good at first (`thing.frobnicate(other)` and `frobnicate(thing)`), but actually lead you down a path incompatible with reading top-to-bottom and left-to-right. I’ve been steadily leaning in the direction of thinking that it would be wiser to have reading order follow execution order, which often means object-verb, even though that’s not how English works.
It starts with prefix keywords on statements: `return thing()` looks fine, but is actually deeply misleading: reading left to right, you’d expect it means you’re returning a value, but thing() might crash your program (abort, panic, exception, whatever).
It gets worse: return was a statement or diverging expression, but then you use a prefix keyword on a converging expression, such as the popular `await thing`. (This point is not about await specifically, but that type of keyword, for which await is the most popular example.) Before long, you’ve got `(await (await fetch(url)).json()).field` and you’re wondering whether you should finish your transition to Lisp. And so you give up on your elegant fluent interfaces where this keyword is involved and start putting things in meaningless variables more often, unnecessarily exposing yourself to one of the two hard things in computer science.
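Concretely, in TypeScript/JavaScript (`fetch` and `json` are the standard APIs; `url` and `field` are placeholders), the two styles that paragraph describes look like this:

```typescript
declare const url: string;

async function getField() {
  // Fluent, but reading order fights execution order: the innermost await runs first.
  const inline = (await (await fetch(url)).json()).field;

  // So in practice the chain gets broken into named steps,
  // even when the names carry no real information.
  const response = await fetch(url);
  const body = await response.json();
  const stepwise = body.field;

  return { inline, stepwise };
}
```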
Rust went with a suffix keyword for await, thing.await, and it composes much better—`fetch(url).await.json().await.field`.
I’m not sure where exactly the balance lies. Having control flow keywords at the start of the line is definitely good for grasping control flow at a glance; if you switch to suffix for them you do lose something. Suffix if/for/while all have mild readability issues of this kind. Suffix return would, I think, be a good idea. (Rust’s ? error-propagation/early-return operator is suffix; though again, return is always diverging whereas ? is not, and suffix matters more for converging expressions. But also, in Rust the prefix return keyword is not all that common due to the language’s expression orientation; if designing a fresh syntax for Rust, suffix break/continue/return would be very tempting.)
Prefix unary operators are also a common source of ordering pain, especially negation in conditionals. `if !some.fluent().interface()` is often painful. Didja know that in Rust you can write `if some.fluent().interface().not()` so long as you import std::ops::Not? I’ve been tempted a couple of times.
Conclusion: as usual, some design goals conflict and finding the right balance is hard.