> even in languages where a [...] } was needed, the indentation always predicted where it would have gone anyway.
> if I were to design a [...] language, it would push me toward indentation-sensitivity.
> That’s a notable shift from my prior perspective, where
> I felt that whitespace-insensitivity was preferable to indentation-sensitivity.
I'm starting to have the same change of perspective for a different reason: widespread use of formatters.
One of the drawbacks of whitespace sensitivity is the accidental mixing of distinct whitespace characters (e.g., tabs and spaces). Using a formatter eliminates this problem because all whitespace is normalized.
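As a rough sketch of that normalization step, here is a minimal line-by-line version, assuming a fixed tab width (the function name is illustrative; real formatters like black or gofmt re-print the code from a parse tree instead of patching lines):

```python
def normalize_indentation(source: str, tab_width: int = 4) -> str:
    """Collapse mixed tab/space prefixes into a single space convention."""
    out = []
    for line in source.splitlines():
        # expandtabs() moves each tab to the next tab stop, so mixed
        # tab/space prefixes collapse into one convention. Note it is
        # naive: it also touches tabs that appear after code.
        out.append(line.expandtabs(tab_width))
    return "\n".join(out)

print(normalize_indentation("if x:\n\tprint(x)"))
```

After normalization, the tab-indented body renders as four spaces, so tools downstream never see a mixture.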
I’m the creator of syntax suggest, the syntax error detection library that ships with Ruby 3.2.
I’ve switched my perspective on whitespace significance. It makes for shorter source code but comes with a signal-to-noise problem: if there’s a mistake in the indentation, it’s harder to determine intent. If instead you expect well-formatted and indented code (I love auto-formatters like cargo fmt) plus syntax to declare intent, then your tools can make better guesses and suggestions than if you only have one signal.
The other problem I’ve found working with Python is I often want to copy code and paste into my REPL but it’s usually inside of another context so I get an indentation error.
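That failure is reproducible without a REPL, since `compile` applies the same top-level indentation rule; `textwrap.dedent` is the usual workaround (a sketch of the failure mode, not the commenter's workflow):

```python
import textwrap

# Code copied from inside a function body keeps its leading indentation,
# and Python rejects indented code at the top level -- the same
# IndentationError a REPL paste produces.
pasted = "    total = 1 + 2\n    print(total)\n"

try:
    compile(pasted, "<paste>", "exec")
except IndentationError as err:
    print("rejected:", err.msg)

# Stripping the common leading whitespace makes it valid again.
compile(textwrap.dedent(pasted), "<paste>", "exec")
print("dedented version compiles")
```

This is also why IPython and newer CPython REPLs special-case pasted blocks: they dedent before feeding the code to the compiler.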
While I like significant whitespace, I LOVE tooling that understands my intent and helps me fix it. I’ll trade slightly more verbosity if it means better tooling. Though obviously I’m quite biased.
We still use a loose representation (text) for a semantically very rich data format (code). When you look at this objectively, that's madness.
We should finally use proper editors for our semantically rich data format instead of "dumb" text editors. (Which are by now actually not so dumb, and which do completely crazy things to re-extract the semantic information from a representation that inherently erased it when the code was flattened to loose text.)
We should imho eventually switch to "binary source code" and start using proper tools to edit this data.
What we do currently is like editing Excel sheets in a hex editor…
Binary source code (or at least something in the ballpark of XML tech) would also solve a lot of other problems. For example, the rendering of the data on screen would be independent from the actual data on disk. This means the bickering over "code style", and even partly over syntax, could finally end: the stuff on screen could be rendered with the help of some "style sheets", and everybody would get their preferred look & feel for their code. (We already do this partly with syntax highlighting; we "just" need to extend that concept.) Things like version control and dependency management could be built in. Proper indices for searching (and refactoring) code could also be added: things our editors compute today anyway, but in an ad hoc and very language-specific way. Compilation of such code would be faster, as none of the parsing would be needed; it would already have happened in the editor. (That too is done today, only we then throw the information away for no reason, so the compiler may regenerate it over and over on every run, instead of it ending up in a kind of database which could even be part of the "binary source code" format.)
I guess some innovative editor features nobody has even considered until now would also be enabled this way.
It's time for progress!
(Most likely, as almost always, someone will come up with references to very old systems that already did this because they were ahead of their time; let's see what new stuff I'm going to learn this time :-D).
Having done a fair bit of programming outside of text, text is not "loose" or less "rich" than the programs it is used to describe. It's actually far more explicit and easier to grok than alternative representations of programs. Problems like code style have been solved by linters and auto-formatters.
A lot of the ideas you're pitching have been pitched before and died for the same reason: parsing text is easy for humans and computers. Although I'd quibble with some minor points, like "compilation of such code would be faster". The major bottlenecks in a compiler are codegen, optimization, and linking compilation units. Even bad parsers are extremely fast.
The stuff on screen would still be (rich) text, of course. Only it would be a rendering of the actual underlying data structure.
Linters and auto-formatters didn't solve anything. They are just a constant PITA, and a source of arguments.
And to quibble a little bit: I said "faster". I did not say how much… Parsing alone is indeed very cheap by now.
But a major performance issue went unmentioned: type checking. In languages with advanced type systems, that's the most costly part. Putting all that information into a database and carrying it along with "the code" would make things significantly faster. It would be as if there were only incremental compilation of the few lines you're currently editing.
I've actually worked on this problem quite a bit and taken this approach, but it's not as obvious as you're making it out to be. I'm going to talk in generalizations about optimization, which is obviously risky, so take this with a grain of salt.
Type checking is usually dwarfed by macro expansion, import resolution, optimization, and code generation. There's very little I/O, not that much work to do, and it is pretty straightforward to cache (*). There's a trap to thinking that complex syntax and type systems are potential bottlenecks - it's true the compiler has to do more work, but it's still not that much compared to all the other time it has to waste doing mundane things.
An example is import resolution. In order to type check a compilation unit you need to type check its imports. In order to do that, the compilation unit and its dependencies need to be macro expanded. If the language supports some kind of scripted macro-programming, then you need to evaluate the macros and possibly cannot cache the results (performance pitfall #1). If the language does not support explicit imports, then the entire dependency graph may need to be scanned and checked to resolve the imports (performance pitfall #2).
This is a case where the simplest and most expressive language designs can have the absolute worst performance regardless of the compiler's design - the language definition forces the compiler to do a ton of mundane work, in particular I/O and cache invalidation. A central in-memory database holding a binary representation of the AST does not solve this problem and is not necessarily faster than redoing all of the work up front.
In my experience, something as mundane as "import x from y" has a much larger impact over the performance of a compiler than any type checker or parser, and incrementalism strategies are tightly coupled to the design of the language being compiled and not the storage layer. It has to be approached holistically for the language being compiled.
Features like "style sheets" for code are just linter and auto formatter config files. It may be useful to reuse the compiler internals for this, but in practice every tooling author eventually learns not to do that. You will need to modify the IR generated by the compiler, and it probably doesn't have an API that's anything close to what you need for that problem.
(*) Of course I'm glossing over type systems that allow higher-kinded dependent types, but those are going to be as painful to check as macro-heavy code, and they currently aren't that widely used.
You're talking about C++ issues here. Those are very specific to C++.
No language besides C/C++ has such a broken import and dependency mechanism. No language other than C++ has such broken "macros" as the template mechanism.
Also I was explicitly talking about advanced type systems. C/C++ have a very shallow and simple type system.
But OTOH C/C++ have the most advanced optimizers.
So the whole picture gets quite bent when looking at it from that perspective.
Scala, for example, has an IR called TASTy which can be stored along with the compile artifacts. It's the AST after type checking (Typed Abstract Syntax Tree). Recompiling from that is very fast, as it's (almost) only the relatively cheap code gen step.
Also incremental compilation works differently for a lot of languages because those languages have small and independent compilation units. The situation in C/C++ is again a very specific mess as there are really no modules (until now). Usually you don't need to potentially recompile the world on every code change.
> Features like "style sheets" for code are just linter and auto formatter config files.
No, they aren't.
If I "change the style of my code" the actual code gets changed!
In case I changed the style sheet used for rendering, only a local config value would get switched. The "binary code" on disk would stay exactly the same. (Still, I could then have "Python with braces" after the switch, if this is what I prefer to look at).
> It may be useful to reuse the compiler internals for this, but in practice every tooling author eventually learns not to do that.
Everybody involved in the "LSP revolution" would disagree, I think.
The whole point of LSPs is to reuse the already existing tools (usually the compiler) built by the language devs.
My entire point is that caching is hard and tightly coupled to the design of the language, and things like type checking are not really that slow compared to the other work a compiler has to do. Choosing an in-memory representation is not really a secret recipe for performance or utility; designing the way the cache gets updated and what it stores is.
I think the idea of a core language with multiple syntaxes is interesting, but I'd contest that the system to implement it is any faster.
> Everybody involved in the "LSP revolution" would disagree, I think.
I reached this conclusion after writing a language server and researching other language servers. The tide is changing to where the primary compilers for languages are query-oriented compilers like Roslyn that allow for this kind of thing, but they aren't nearly as common as batch compilers so language server authors find themselves writing everything from scratch. Interestingly though, big portions like type checkers can be reused if they have stable input/output IRs because they're not slow (however, compiler authors are loathe to expose those details or make stability guarantees there).
> references to very old systems that did already this
Here's one: Alice Pascal
> In a syntax directed editor, you edit a program, not a piece of text. The editor works directly on the program as a tree -- matching the syntax trees by which the language is structured. The units you work with are not lines and characters but terms, expressions, statements and blocks.
I know what source code is and I know what binary code is. But I’ve never heard of binary source code, and my Google search doesn’t turn up anything either.
Most code editors deal with abstract syntax trees for linting, etc. Think HTML DOM but to define the syntax relations of the code itself.
I'm assuming OP talks about saving something like that format directly, instead of text. Then information like 'this method is being called with these two parameters' isn't mixed with presentational info like whether you separated the parameters with newlines, or whether you use tabs or spaces. The downside being that it would not necessarily be immediately readable; the text editors would have to parse it and present it in a readable format, kinda like what happens now with, say, Excel sheets.
Even currently, code isn't "immediately readable". It shows up, but until the IDE has ingested the project it does not provide any features besides basic string editing, which is mostly useless.
So no new issue here.
But reading in a project would be much much faster if the IDE would just need to read a rich binary format!
> Even currently code isn't "immediately readable"
I read code in text editors (not full IDEs) all the time...
> it does not provide any features besides basic string editing, which is mostly useless.
I also do write code in text editors all the time. Not always - I do use IDEs sometimes. But at least half the time, I would say.
I use vim a lot, and all I want there is syntax highlighting. I would hate it if I was forced to use a heavy IDE every time I want to read/edit some source code. Also I like grepping into sources without having to run an IDE that can search through binary code.
Thank you for your hard work making my favorite programming language even better.
I completely agree with all your points on significant whitespace for programming languages. Where I think it makes a lot more sense though is for structured static data. My quality of life would be worsened without the likes of Haml, Sass and YAML.
I enjoy using Slim in my views, though I prefer SCSS to Sass. I think for many of those tools, either mis-indentations raise an error (like failing a YAML schema validation) or result in incorrect but observable behavior (content isn’t in a div as expected).
It really comes down to how the tool is supposed to be used and how failures are handled.
In general I think that’s the difference between a good experience and a great one: when we can have beautiful failures that help us learn.
> I completely agree with all your points on significant whitespace for programming languages. Where I think it makes a lot more sense though is for structured static data.
This makes no sense.
If the arguments against significant whitespace are valid for executable code they're also valid for markup.
If it's fine to use significant whitespace for markup / data literals even though the perceived drawbacks apply, then it would be equally fine for executable code.
You've got a point. I was mostly thinking about the "copy-pasting into the REPL" argument, which is a problem unique to executable code, at least in my experience.
Interestingly, my main quality-of-life improvement when writing Rust is being able to leave off the parentheses around the condition in an if statement (vs C/C++/PHP); I don't care about the curly braces for the body as much.
In Rust it’s clear when the logic of the “if” is over because it’s always followed by a curly and it’s pretty easy to detect if you missed the curly.
I'm not saying that more syntax is always better, but sometimes that extra bit that seems needless is helpful to clarify to the program: "yep, I meant what I wrote".
> If there’s a mistake in the indentation it’s harder to determine intent.
If there's an indentation mistake in an indentation sensitive language, then there's a mistake. Why does intent matter? Intent is lost because there is a mistake.
> One of the drawbacks of white-space sensitivity is the accidental mixing of distinct white-spaces (e.g., tabs and spaces)
I discovered something really surprising regarding others' awareness of whitespace. TL;DR: many juniors fundamentally misunderstand "tabs vs spaces", thinking that "tab" means the tab key rather than the tab character.
As a follow-up to migrating a bunch of repos and discovering every possible combination of "wrong" indentation, I asked what people's preferences were, my assumption being that the cause of this mess was different people working on the same repo with differing preferences. Only to find that this was not the case at all: the inconsistencies mostly came from the same individuals. And those individuals had a severe misunderstanding of what "tabs" vs "spaces" meant. Essentially, many people are either not aware of the tab character, or not aware that the tab key does not necessarily insert a tab character due to auto-indent features. So those people interpret "using tabs" as "using the tab keyboard key", and get confused about why anyone would be using the space bar.
I then realised that the source of this misunderstanding is the auto-indent defaults in modern text editors. Automatic, per-line indentation-type detection, combined with automatic space insertion and automatic cursor navigation over spaces, makes tabs and spaces indistinguishable unless you either turn these features off or render invisibles. This becomes a problem when chunks of code are pasted from somewhere else with a different indentation style, since they act as seeds of an indentation type that propagates and competes with adjacent lines' indentation. The worst combinations are competing space-based indentation levels, e.g. 2 vs 4 spaces: when these are adjacent, the automatic indentation behaviour cannot figure out which is correct, since 2*2 = 1*4, 4*4 = 2*8, etc. This can make it impossible to set e.g. a closing bracket onto the correct indent level. Rather than realise the auto-indent is insufficient, novices seem to just submit to the editor, thinking it is somehow broken.
So rather than enforce a tab/space/indentation level style guide: I said I didn't care what they preferred, so long as they turned on invisibles and made sure their code is self-consistent by manually selecting an indentation style/level. i.e I've ended up mandating specific text editor settings rather than a style guide.
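The self-consistency part of that check is small enough to sketch in a few lines (the function name and classification buckets are mine; it only inspects leading whitespace, so alignment inside lines is out of scope):

```python
import re

def indentation_styles(source: str) -> set:
    """Classify which indentation conventions appear in a file."""
    styles = set()
    for line in source.splitlines():
        indent = re.match(r"[ \t]*", line).group()
        if "\t" in indent and " " in indent:
            styles.add("mixed")   # tabs and spaces on a single line
        elif "\t" in indent:
            styles.add("tabs")
        elif indent:
            styles.add("spaces")
    return styles

# A file is self-consistent when only one style shows up.
print(sorted(indentation_styles("\tfoo\n    bar\n")))  # ['spaces', 'tabs']
```

A CI hook that fails when the set has more than one element enforces exactly the "pick a style, any style" policy described above.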
Let's face it: The new generation has no clue how computers work.
Not that this would be a big surprise. My grandma also didn't know how computers work. That's not some knowledge you get born with. You need to learn it.
But the new generation never learned about "real" computers. They're just consumers of consumer-grade entertainment and communication devices, until they come into contact with "real" computers for the first time at university or on the job. (So, exactly like the generation of the now-grandparents).
> The automatic, per-line indentation type detection, combined with automatic space insertion and automatic navigation over spaces when using cursor keys […]
That's absolutely crazy shit just to simulate the behavior of tabs with spaces!
If we just used tabs (like they were invented for, namely indentation) there wouldn't be any issue.
The editor vendors should stop all this craziness described above and we would be fine. Nobody prints their source code on a line printer any more, which was ever the only reason to use spaces for indentation. It's incredible how bullshit from the '70s still causes issues today, even though the actual reason for that bullshit can, at best and with luck, be found in a museum.
Based on my quite limited experience with Python the big drawback of indentation sensitivity is copy pasting will only work when your editor has great language support. If I copy a block of code from indentation level X and paste it to indentation level Y when I have explicit end-of-block tokens the worst thing that could happen is that I end up with badly indented code that I have to clean up manually. On the other hand, if indentation matters, and my editor messes up things, I am very likely to get code that looks good, probably compiles too, but does nothing like the original. I have to read through every line carefully and think about where the blocks were meant to end.
In the end I'm still not convinced that significant indentation is worth the trouble.
I've been writing python for 20 years, and I truly cannot think of an instance where pasting code was an issue. I use vim mostly without any plugins, not any fancy editor.
I've only used python a few years professionally, but seen it multiple times. With other languages I paste in code between some brackets, run the autoformatter and it looks as I want. With python I paste in some code, run the autoformatter, only to see because of spacing issues half the lines I pasted didn't get properly parsed to be inside the for loop or whatever.
Also makes it hard to copy a snippet to run inside a REPL or whatever.
In general just a huge amount of pain, and the "good thing" about how it forces well-formatted code I've never seen as an issue with auto-formatters and linters in other languages.
The "trick" is to copy and paste on the right block level. Just set the cursor appropriately (which the editor does for you if you use the feature to jump to block start or end).
Then everything just works.
(For that to work even better, the stupid feature of some editors to remove whitespace from all empty lines should be disabled. Then pressing enter or going to the next empty line will always position the cursor on the correct indentation level, and it's easy to paste code blocks anywhere, as even empty lines are correctly indented).
I've written a lot of Python in my life (but most of the time not as my main day-to-day language), so whenever I had to try something small in a REPL I did paste. I guess it would work to highlight a piece of code and run it externally with a shortcut, but I've never set that up, so I did indeed run into this problem whenever I was picking Python up again (for example, I find [x for x in ....] kinda horrible and have at times fat-fingered it when it's nested twice).
> If I copy a block of code from indentation level X and paste it to indentation level Y when I have explicit end-of-block tokens the worst thing that could happen is that I end up with badly indented code that I have to clean up manually.
That's exactly the same "clean up" you need to do if you did something wrong while copy & pasting code without the useless block delimiters.
The rest is made up, as someone already pointed out: exactly this story comes up every time, but in reality it never happens. (Besides when someone tries hard to get it wrong by all means, only to prove their made-up point).
Funny, I have the opposite opinion for the same reason as you - I'd rather just use an auto-formatter in a non-whitespace sensitive language and have the option to mark certain sections as needing custom indentation for readability. E.g. I've had situations where I'm calling into an API to programmatically build a nested GUI in Python, and the inability to indent my API calls to represent the structure of the nested UI was seriously frustrating. Also, trying to copy-paste code in Python while getting the indentation right is an exercise in severe frustration.
It is. You can parse something given how it's indented; that doesn't mean the resulting code is actually correct (from the logic point of view). And you cannot properly format it, because the logic is ambiguous.
Both of these are valid code. One (or both) of them is incorrect:
    if x > 0:
        function1()
        function2()
        function3()

    if x > 0:
        function1()
        function2()
    function3()
No formatter will be able to guess which one is correct. It will do a best-effort approximation. But good luck figuring out how to format a YAML file:
I don't understand your first example. They are both correct, but have different semantic meaning. A parser doesn't have to try to find incorrectness there, it simply has to infer the semantic meaning of both, which is... the second one calls function3() irrespective of x. Both programs should be expressible in any language hence I don't see why you are placing the onus on the compiler to tell you one is incorrect.
> They are both correct, but have different semantic meaning.
Exactly
> it simply has to infer the semantic meaning of both, which is
How exactly do you want the parser for the formatter to "infer semantic meaning"? It has no knowledge of the human requirements for this code. To the parser both are valid
> Both programs should be expressible in any language hence I don't see why you are placing the onus on the compiler to tell you one is incorrect.
Are we talking about "expressible in any language"? No.
Here's what I wrote: "Formatters will not save you from the many, many, many cases where indentation is simply ambiguous" in the context of formatters.
> the accidental mixing of distinct white-spaces (e.g., tabs and spaces). Using a formatter eliminates this problem because all white-spaces are normalized.
I've found an easier solution is to forbid tabs outside string literals. You can't do the same with spaces.
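One way to mechanize that rule, leaning on Python's own tokenizer (a sketch: the function name is made up, and it only works on syntactically valid Python):

```python
import io
import tokenize

def tabs_outside_strings(source: str) -> list:
    """Return (line, column) of each tab character not inside a string literal."""
    # Record the (row, col) spans covered by string tokens; tuple
    # comparison handles multi-line (triple-quoted) strings too.
    spans = [
        (tok.start, tok.end)
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.STRING
    ]
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line):
            if ch == "\t" and not any(s <= (lineno, col) < e for s, e in spans):
                hits.append((lineno, col))
    return hits

src = 'x = "a\tb"\nif x:\n\tprint(x)\n'
print(tabs_outside_strings(src))  # flags only the indentation tab on line 3
```

The tab inside the string literal is left alone, which is the whole point: you can ban tabs as indentation without corrupting data that legitimately contains them.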
Can you expand on this? Prepending spaces (or tabs) to strings is one of the most basic operations there is, so that can't be where the difficulty is. Keeping track of what the current indentation level should be also seems like something that any tool which outputs source code already does.
> Keeping track of what the current indentation level should be also seems like something that any tool which outputs source code already does.
The challenge you are refuting is about building said tools. Just because existing tools have pushed through this challenge doesn't mean it's not both tech debt for those existing tools and a challenge for any new tool in the space.
I think the best argument imo is that whitespace has to be properly maintained per line which adds computational cost and statefulness.
There is a whole range of programs that fit in between "output a single block string of text" and "yer a compiler, Harry!".
And maybe the right answer is to import a compiler's worth of tooling to output a bit of code. But to me that seems like needless overhead, since meaningful whitespace isn't really a positive to begin with.
You can set the tab display width to whatever you prefer; you cannot do the same for spaces.
From an accessibility point of view, tabs are more usable for people with sight problems.
If you have standardized indentation width, then the only difference is how the indentation is stored in the file. You can present either option in arbitrary way to the user. Doing n spaces to m character cells is as simple transformation as 1 tab to m character cells.
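Under that assumption (indentation is the only leading whitespace and the width is standardized), the transformation really is trivial in both directions; the helper names here are hypothetical:

```python
def spaces_to_tabs(line: str, width: int = 4) -> str:
    """Rewrite leading space indentation as tabs (one tab per level)."""
    body = line.lstrip(" ")
    levels = (len(line) - len(body)) // width
    return "\t" * levels + body

def tabs_to_spaces(line: str, width: int = 4) -> str:
    """Rewrite leading tab indentation as spaces."""
    body = line.lstrip("\t")
    return " " * ((len(line) - len(body)) * width) + body

# The two directions round-trip as long as no alignment spaces are mixed in.
assert tabs_to_spaces(spaces_to_tabs("        x = 1")) == "        x = 1"
```

The caveat baked into the assumption is real, though: spaces used for alignment (not in multiples of the width) survive `spaces_to_tabs` unconverted, so the round trip only holds for pure indentation.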
Could you tell how to set this up in say VSCode or IntelliJ?
AFAIK it works only one way: you can render tabs as an arbitrary number of spaces, but you can't render spaces as anything other than spaces. How would this even work? You would need semantic code analysis to make it happen…
No, it isn’t[1], not in most formatting styles, if we’re speaking about the usage where the user can adjust the tab width and have the formatting not fall apart. (Note this excludes e.g. Vim or Visual Studio configured to use tabs, so that may be part of the confusion.) The reason is that alignment still needs to be done with spaces:
Ah whoops, now I’m not sure. Either that comment was saying that with a standardized indentation width, having an option to adjust the indentation width does not matter, which is trivially true and not contradicted by mine but strange to say as a reply to a comment pointing out that option; or it was (implicitly?) saying that with a standardized indentation width in the storage format, you could recover the tabs from the spaces and adjust the result, which is contradicted by my comment: you can’t automatically distinguish indentation spaces and alignment spaces without at least language-syntax smarts, and while the former should be adjusted the latter mustn’t be.
I've used vim's conceal feature to do something like this before to read some ridiculous code that had 16 space indentations. You can use this for all sorts of fun things like making lambda: render as λ. That doesn't use any code analysis, it's just dumb rendering, not sure why you would need anything else?
> That doesn't use any code analysis, it's just dumb rendering, not sure why you would need anything else?
Because this would not work in the general case. Without code analysis you can't know which spaces are indentation and which aren't. Spaces are everywhere in code…
You could have some heuristic like "new-line followed by x spaces" but this wouldn't work correctly in all cases.
Thanks for the pointer to vim's conceal feature, btw! I'm looking for something like that, but I'm not using vi(m) (I'm too lazy to learn it, even after 20 years of desktop Linux). Now I know what VSCode extension to look for. Does something like that exist that works on the AST level? (May be even some external filter-like tool).
I feel like we're reliving the episode from Silicon Valley where Richard Hendricks broke up with his GF over tabs vs. spaces.
In 2023, it is technologically possible for the IDE repo to use tab characters and the git repo to use spaces, or vice versa. Also, in 2023, I don't care anymore.
> Somewhat surprisingly, indentation-sensitivity in languages like Python and Haskell had me writing cleaner-looking and more readable code by default.
Yeah, I've gone from hating this, to accepting it (after using Python), to wanting it if I were to design a language (after using Haskell/Purescript/Idris).
> Based on this experience alone, if I were to design a non-S-Expression-based language, it would push me toward indentation-sensitivity.
The catch is that it's a pain to parse. Haskell and Python do token injection. Idris has some manual checks (with lots of holes/exceptions) and explicitly passes around an indentation (in some places). A few others conflate parsing and lexing.
I've settled on having a `(layoutRow, layoutCol)` in the state and requiring `col > layoutCol || row == layoutRow && col == layoutCol`. This lets you setup the off-sides layout and still accept the first token at that column.
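Spelled out in Python (a toy rendering of that predicate; the lexer state handling around it is elided):

```python
# A token at (row, col) continues the current layout block if it is
# indented past the layout column, or if it is the very token that
# opened the layout (same row and column).
def continues_block(row, col, layout_row, layout_col):
    return col > layout_col or (row == layout_row and col == layout_col)

# Opening a layout at (row=1, col=4):
assert continues_block(1, 4, 1, 4)      # the opening token itself is accepted
assert continues_block(2, 6, 1, 4)      # deeper indentation continues the block
assert not continues_block(2, 4, 1, 4)  # same column on a later row: new item
assert not continues_block(3, 2, 1, 4)  # dedent: the block ends
```

The "same row" disjunct is what lets the first token sit exactly at the layout column instead of requiring it to be indented one step further.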
I used to not only agree with this sentiment but was an evangelist about it.
Then fixing a lot of bugs due to people (including myself) getting indentation wrong when refactoring python got me wondering.
Then language servers convinced me that indentation is best left to editors, and that the cost of braces, semicolons, etc. is tiny compared to the huge benefit of knowing with 100% certainty and at zero cognitive cost where a block starts and ends.
As I implied, I find the Haskell-like languages attractive. To the point that I am taking that approach in a toy language.
But you are right, the cost of it is pretty high - not just copy/paste, but it's tricky to parse, auto-formatting is harder, LSP support (changing code) is harder, renaming variables and functions is harder, etc. (Haskell-like languages are worse than Python because they don't always start a new indentation level on a new line, so renaming can change the indentation level.) So it's probably not worth it.
And I will add, like the sibling commenter, I like the auto-format on save of Go. When writing Go, I'll type whatever, often on one line, and let the formatter take care of making it pretty.
It's not just golang, it's your tooling assuming the syntax of the language you use allows it. I do "reformat on a hot key" in both Rust and Typescript, and use it constantly, way before I save, it helps me review and confirm my logic makes sense.
The other advantage is that it standardizes formatting for a given repo making diffs more relevant.
Ironically, I believe Python's PEP 8 is what started and normalised the idea of taking formatting decisions away from developers.
> compilation versus interpretation never made a significant difference in performance for my solutions (which tended to be dominated by algorithmic complexity concerns) ... Were I to implement a language, I think I’d favor having an interpreter first.
Constant factors matter. Ahead-of-time compilation (AOT) and just-in-time compilation (JIT) are a lot faster than interpretation [1,2], sometimes by a factor of >100 in the case of C vs pure Python [3]. This gap is noticeable for any program taking more than a few seconds to run. Except for performance-critical tasks, my preference is JIT-based languages without a manual compilation step, such as LuaJIT, JavaScript (V8/Node.js) and Dart. Nonetheless, these language implementations are much harder to develop than interpreters.
I am not familiar with Advent of Code. If all their problems take ~0.1s to solve in C, you are probably right. However, it is a stretch to make a high-level conclusion just based on small exercises, like this:
> There was never a time where rewriting in another language felt like the right way to get better performance.
Similarly, for small exercise problems, dynamic typing rarely hurts, but for large collaborative projects, static typing starts to show its advantages.
"Interpreter" here doesn’t refer to the internal evaluation method, only to the mode of operation: e.g. Ruby implementations that use a JIT, Java’s jshell; Python also has JIT-capable implementations.
> I’ve bounced back and forth between statically typed and dynamically typed languages my whole career.
>
> I’ve never settled dogmatically into any camp, and this experience didn’t change that.
I don't think any proponents of static typing would expect small problems like these to really show the benefits of static typing though. If you were trying to decide whether comments were worth it you would conclude from AoC that they are not, which is clearly the wrong conclusion in general.
The author does say they have extensive experience of Haskell and Standard ML so presumably they have experience with the enormous benefits of static typing at "real program" scale so it's a strange conclusion IMO.
Not so strange. Experience of people who write small programs all the time and people who work on big programs all the time will differ. Maybe the author worked on small programs half the time and big programs half the time.
> In particular, days where dense matrices were the primary data structure seemed to favor dynamically typed languages.
IIRC, if one pedantically types a typical NxM transformation matrix, one may need NM different explicit types. (or maybe this was just a problem when derivatives were involved?)
Which programming languages have a strong enough type system to do this? And does implementing type-level natural numbers have drawbacks, or is it just complicated and a lot of work?
The drawback of (1) is that it's hard to understand or extend, and the type errors are insane, and put Boost to shame. The drawback of (2) is that's just a macro. Of course (1) and (2) are not interoperable.
C++ templates are really underappreciated, because they are a single mechanism that gives you both powerful generics and macro-like codegen, while being easy to adopt gradually. (E.g. you can add concepts when it gets unwieldy, but you don't have to rewrite the whole thing.)
I also find that even though it is a language inside a language, it is much easier to keep track of, because it uses the same C++ language features instead of a special set of macro operators and semantics.
The drawback is that if you also have (parametric) polymorphism (and you do, unless you’re writing strict standard Pascal, in which it’s impossible to write libraries), your typechecker now needs to answer arbitrary questions of the form “are there natural numbers that make this equality true?” in order to tell whether something is well-typed. And that is of course impossible in general (per Gödel and Turing), so now it’s on you to supply the proofs.
So languages that do this seriously—Agda, Idris, ATS—are usually somewhat Turing-incomplete in order to sidestep this and related problems of too-expressive (sound) type systems and, though it’s perfectly possible to write useful programs in them, it requires a more disciplined and restricted approach than usual, and you’re going to have to spend some time proving things.
There are also languages—notably C++ and Haskell—that do this less seriously, in that their systems are capable of expressing naturals but are either unsound (allowing some programs that passed the typechecker to misbehave due to type-related problems), semidecidable (allowing the typechecker to hang on erroneous programs), too weak (not allowing you to prove equalities you need), or all of the above. On the upside, those are also the ones flexible enough that normal people can program in them (as long as they don’t try typing every length and size they see, then it becomes as painful as ever).
(Other responses have mentioned interesting type systems for dynamic languages, such as those in Racket and Common Lisp. I think those mostly fall into the “unsound” bucket above, due to runtime checks, but I don’t know much about them. Should be worth looking into as well.)
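To make the phantom-type trick concrete, here is a hedged Python sketch (all class and tag names are made up for illustration): the dimension tags exist only for a static checker such as mypy, which would reject a multiplication whose inner dimensions disagree, while at runtime it is just an ordinary nested-list multiply.

```python
from typing import Generic, TypeVar

Rows = TypeVar("Rows")
Cols = TypeVar("Cols")
Inner = TypeVar("Inner")

# Phantom dimension tags (hypothetical names, never instantiated).
class N2: ...
class N3: ...

class Matrix(Generic[Rows, Cols]):
    def __init__(self, data: list[list[float]]):
        self.data = data

    def matmul(self, other: "Matrix[Cols, Inner]") -> "Matrix[Rows, Inner]":
        # Plain triple-loop multiply; the tags only matter to the type checker.
        a, b = self.data, other.data
        return Matrix([[sum(a[i][k] * b[k][j] for k in range(len(b)))
                        for j in range(len(b[0]))] for i in range(len(a))])

m: "Matrix[N2, N3]" = Matrix([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
v: "Matrix[N3, N2]" = Matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(m.matmul(v).data)  # [[4.0, 5.0], [10.0, 11.0]]
```

The point is that `matmul` type-checks only when the inner dimension tags line up, at zero runtime cost; this is roughly the "less serious" encoding available in Haskell and C++ mentioned above, not full dependent typing.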
> So languages that do this seriously—Agda, Idris, ATS—are usually somewhat Turing-incomplete in order to sidestep this and related problems of too-expressive (sound) type systems
you mean that the _type systems_ of these languages are somewhat Turing-incomplete, not the languages themselves, right?
> you mean that the _type systems_ of these languages are somewhat Turing-incomplete, not the languages themselves, right?
Either is a possibility.
Agda and Idris have full dependent types: every value computation is allowed to be used as a type parameter, so all value computations are forced to terminate in a way apparent to the typechecker, and are consequently not Turing-complete. (As a sibling comment mentions, aside from hanging your compile when run, an infinite loop inhabits the empty type and is thus a proof of false; so if the compiler decides it’s uninterested in the details and trusts the loop to work instead of actually running it, things are actually worse.)
ATS I think has a Turing-incomplete proof language distinct from the Turing-complete program language, but I’ve never been able to get through its rather arcane documentation so I’m not sure. Frama-C or SPARK is one example where I’m certain that’s the case (the program language is C or Ada, respectively), except the proof language is “throw intermediate statements at the SAT solver and see what sticks”.
Both, but almost more importantly the languages themselves are all strictly terminating.
Long story short, these languages are based on the Curry-Howard isomorphism, which says that we can view type signatures as theorems and programs with those signatures as proofs of those theorems.
A program which runs forever is a proof of false (you could see it as saying “after infinity time I’ll have your answer I swear”). So these languages are designed to prevent that from being possible.
This is pretty misleading, isn't it? Idris and Agda, for example, have coinductive types, which you could in principle use to write a "coprogram" that provably never terminates, such as an operating system (where "terminate" really means "put myself into a clean state and then turn off the power"). See e.g. http://blog.sigfpe.com/2007/07/data-and-codata.html and the links therein; Conor McBride is constantly railing about this; see e.g. https://strathprints.strath.ac.uk/60166/1/McBride_LNCS2015_T... .
I explicitly ignored codata/coinduction as they are complex and rather niche. Even so, they come with their own notion of 'termination' (progress / wfo) and don't magically let you write random non-terminating programs.
Depending on what you want to achieve there are plenty of tricks and techniques to encode a divergent program in a sound dependent-type theory, but the best approach to use will depend on the problem statement.
The person you replied to didn’t say CL is statically typed. They said “powerful”.
While CL (the standard) doesn’t require compile-time type checking AFAIK, certain implementations, such as SBCL (probably the most popular non-commercial implementation), do offer compile-time type checking.
But the linked page was an interesting read. SBCL does indeed have real types; I didn't know that until now.
Compile-time type checking
You may provide type information for variables, function arguments etc. via `proclaim`, `declaim` and `declare`. However, similar to the `:type` slot option introduced in the CLOS section, the effects of type declarations are undefined in the Lisp standard and are implementation-specific. So there is no guarantee that the Lisp compiler will perform compile-time type checking.
However, it is possible, and SBCL is an implementation that does thorough type checking.
This has existed since the early 80s. The first language definition book appeared in 1984: Common Lisp the Language.
One could think of it as a gradual type system. Type declarations are optional and are used for several purposes. Many implementations use type declarations, and optionally type inference, as input for optimizing compilers. In the 80s the first implementation to add limited forms of static type inference and checking was CMUCL. SBCL is a later fork of CMUCL.
The challenge is to integrate dynamic and static types into a language which is often used interactively, including a lot of dynamic features where programs, or parts of the language itself, can change at runtime. Thus the type system has its problems and hasn't been improved since the 90s... Not many implementations have adopted the type-system implementation of CMUCL & SBCL.
Generally the SBCL compiler, even though it is based on 80s technology, is quite helpful when developing Lisp code.
    (deftype option (subtype)
      "The type of a value that is either nil or a value of SUBTYPE."
      `(or null ,subtype))

    (defun* get-latest-statement ((account account))
      "If any statements have been made on the ACCOUNT, returns the latest.
    Otherwise returns nil."
      (:returns (option statement))
      (car (last (account-statements account))))
>>> Somewhat surprisingly, indentation-sensitivity in languages like Python and Haskell had me writing cleaner-looking and more readable code by default.
Who would have thought!
It's as if that were the whole goal of this trick.
That's only because editors are not programmed to do the work.
An IDE with proper language support for a whitespace-aware language should be able to parse the AST figuring out the code structure from the indentation level; then it could reformat it at the proper indent level when you copy and paste it elsewhere, just like it does with a delimiter-based language.
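As a rough sketch of how such paste-reindentation could work (a hypothetical helper, not any particular editor's actual API), Python's own textwrap module already does the dedent half:

```python
import textwrap

def paste_at(snippet: str, target_indent: str) -> str:
    """Strip the snippet's common leading whitespace, then re-indent
    every non-blank line to the target level."""
    dedented = textwrap.dedent(snippet)
    return "".join(
        target_indent + line if line.strip() else line
        for line in dedented.splitlines(keepends=True)
    )

copied = "        if x < 0:\n            f()\n"
print(paste_at(copied, "    "))  # block re-homed at 4-space indentation
```

A real editor would take `target_indent` from the cursor position; the relative indentation inside the block is preserved, which is all the language requires.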
Of course it is decidable, the programming language has very strict rules to determine it. That's why you need language support in the editor, not just finding matching open/close delimiters.
> Of course it is decidable, the programming language has very strict rules to determine it.
No, it's not. The programming language grammar can reject things, but it cannot detect a certain class of mistakes which explicit terminators ("}" and so on) can, which is why programs like `indent` have existed for decades for other languages, but have yet to exist for Python, which is already 30 years old.
If no one has cracked that particular nut for 30 years, you can't with a straight face claim that it is obviously possible. It looks more impossible with each passing year.
If a program cannot determine where a code block starts and ends from indentation alone, then it is not correct Python; the language needs a deterministic process to determine which sequence of commands belongs to the same block, otherwise it couldn't compile.
If you're talking about failing to determine blocks which are NOT correct Python, then I don't have a problem with those failing to be correctly indented under copy/paste. The vast majority of use cases would still be well served.
We are talking about pasting code into other code, where both can fail to be correct at the time, and you may or may not want the output to be correct. Nonetheless, your intent can’t be determined, because there are multiple ambiguous “good” outputs: should insertion start at the current level and stay there? Did the copy come from a “dumb” medium that stripped leading whitespace, so the snippet should be re-indented first? Sure, you can look at the resulting code and fix it, but we all know that humans err a lot; this is something that works braindead-easily in non-whitespace-“aware” languages.
> If you're talking about failing to determine blocks which are NOT correct Python, then I don't have a problem with those failing to be correctly indented under copy/paste.
Then maybe you shouldn't have replied to parent who said:
>> Yes it forces clean indentation, but it gives the writer the burden to take care of it. That sucks. Let the computer do the work.
> The vast majority of use cases would still be well served.
Well, sure, because the burden for ensuring the code is correct falls onto the programmer, whereas with other languages we can just let the computer do that.
You can't let the computer ensure the correctness of code.
I replied to dismiss your assumption that adding delimiters such as '}' or 'END' will ensure the correctness of code in a way that code with indentation delimiters can't. It's similar to saying that, in a language where statements are not terminated by a semicolon, it's not decidable where statements end, so this forces the developer to decide on the correctness of statements. OF COURSE there are ways to define where a statement ends without adding a semicolon to each one, just like there are ways to decide where blocks end in an indentation-based programming language; they are embedded in the language syntax.
Heck, C is notorious for producing incorrect code for NOT using meaningful indentation and relying on delimiters instead. If you write:
    if (x>0)
        function(1);
    else
        function(2);
And then you expand the else block:
    if (x>0)
        function(1);
    else
        function(2);
        function(3);
Then function(3) will be compiled out of the if sentence, when it's clearly part of the else block.
> You can't let the computer ensure the correctness of code.
But I'm not letting it do that, I'm letting the computer autoformat it, which it can't do if the whitespace is the code.
> Then function(3) will be compiled out of the if sentence, when it's clearly part of the else block.
And yet, I don't have to worry about that error because the editor is able to simply autoformat it so I can see where the error is even when the compiler/interpreter thinks it's legal code!
That's the whole point - using whitespace as part of the code means that the IDE can not, and never will be able to, autoformat the code.
Your example is one of a burden that falls to the programmer in Python, but is taken care of by the computer in other languages.
These errors, which can be automatically found by the computer in other languages, have to be manually found by the programmer in Python. It's a clear disadvantage.
> But I'm not letting it do that, I'm letting the computer autoformat it, which it can't do if the whitespace is the code.
Of course it can do it. Indentation changes are parsed as delimiters, so it can treat them in the same way as if you put those delimiters yourself by hand.
> And yet, I don't have to worry about that error because the editor is able to simply autoformat it so I can see where the error is even when the compiler/interpreter thinks it's legal code!
Nothing prevents an IDE from highlighting that error through other means. Ever heard of secondary notation? You don't need explicit delimiters to apply it, when indentation changes work as delimiters themselves. Simply highlight the start and end of blocks, and you'll get the exact same benefits as with programming languages based on brackets.
> These errors, which can be automatically found by the computer in other languages,
So tell me, how would the computer find an error if you have unmatched open/closed brackets? Doesn't that mean that other computer languages have errors that can't exist in Python?
> Your example is one of a burden that falls to the programmer in Python, but is taken care off by the computer in other languages.
How does an error in C code fall to the programmer of Python?
BTW, do you understand Python block delimiters at all? Your complaint seems to come from not understanding how a Python developer sees indentation. Once you enter this mindset, indentation errors are no different than unmatched open/close brackets - which do exist in languages like C too.
> That's the whole point - using whitespace as part of the code means that the IDE can not, and never will be able to, autoformat the code.
No - it means that indentation changes are meaningful, so the state of the code before reformatting should be maintained after reformatting. As long as the reformat tool doesn't change the meaning of the Python code, it will autoformat code perfectly fine, as numerous autoformatters in existence prove.
Which is what I was referring to in my first comment - you need an IDE that understands Python syntax, not just one that blindly pairs open/close delimiters.
One more thing you should consider in your argumentation: in non-whitespace-sensitive languages the programmer can make fewer mistakes when pasting code.
It could be pasted at a completely wrong position, of course: before some other expression, or after some other expression where it should not be. Aside from that, however, the programmer cannot make a mistake by pasting it in a place that is between 2 expressions, because that is one potentially huge correct position to paste the code. It stretches from the end of one expression to the beginning of the other. Anywhere in between, the programmer can paste code.
In contrast, in Python and other whitespace-sensitive languages, one has to be very careful about where to paste. Only in the case of pasting at the semantically correct indentation level (basically one tiny position in the code) can the editor/IDE/tool know how to indent things. Of course humans make mistakes and paste maybe 1 space to the left or right, or even a whole indentation level off from the correct position, and then the tool cannot help you make your code do the right thing. It could be valid code, accepted by the compiler or interpreter, but still semantically wrong, doing something you did not want it to do.
The tool cannot fix the pasting at wrong position mistakes made by humans in all generality, because it does not know the actual intention and it can be ambiguous what the result should be.
I hope this makes a long discussion a bit clearer.
The origin text that you're cutting will have a precise syntax determining where the block starts and ends. Just copy it to the new position, and change the indentation level to that of the target position, maintaining the same block definition. You know, the same thing that would be done if there were start and end delimiters; because in Python, indentation changes ARE block delimiters, with a well-defined syntax.
It's not rocket science, just following the language indentation rules for defining blocks. If the language parser can do it, why not the text editor?
Indentation increasing or decreasing generates INDENT and DEDENT tokens, which are used as block delimiters.
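This is literally how CPython's grammar sees it, and the stdlib tokenize module exposes those tokens. A quick sketch:

```python
import io
import tokenize

# Tokenize a tiny program: entering the if-body emits INDENT,
# leaving it before g() emits DEDENT.
src = "if x > 0:\n    f()\ng()\n"
names = [tokenize.tok_name[tok.type]
         for tok in tokenize.generate_tokens(io.StringIO(src).readline)]
print(names)  # the list contains 'INDENT' and, later, 'DEDENT'
```

So a tool that consumes this token stream sees block boundaries just as explicitly as it would see '{' and '}' in a brace-delimited language.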
> Remember, we're talking about an IDE automatically figuring out proper indentation for something.
The IDE doesn't need to figure out whether the program is correct. Only has to treat code blocks as defined by the language syntax. In your linked example:
    if x > 0:
        function1()
        function2()
        function3()

    if x > 0:
        function1()
        function2()
    function3()
The IDE should treat the blocks as if defined with delimiters this way:
    if x > 0: {
        function1()
        function2()
        function3()
    }

    if x > 0: {
        function1()
        function2()
    }
    function3()
Because that's how Python will interpret them. So, the IDE would do exactly the same behavior with curly bracket delimiters and with changes in indentation, because in both cases there is a precise rule to define where blocks start and end.
I've just tried Spyder, and it seems to work that way.
With this code:
    indent(1)
    indent 2
        if x<0:
            function 1
            function 2
            function 3
if you select the if block starting the selection at the "if" (i.e. ignoring the whitespace at its left) up to function 3, and paste it right under indent(1), it produces the following:
    indent(1)
    if x<0:
        function 1
        function 2
        function 3
    indent 2
        if x<0:
            function 1
            function 2
            function 3
Yet if you select the whole line including the initial indentation whitespace and paste it at the same place under indent(1), then it doesn't change the block indentation:
    indent(1)
        if x<0:
            function 1
            function 2
            function 3
    indent 2
        if x<0:
            function 1
            function 2
            function 3
That looks sensible to me; if you're moving whole lines it will paste them unchanged, but if you move the cursor to select specifically a keyword and its context, it treats it as a code block, reformatting its indentation to that of the place where you are pasting it.
Whereas in a language that actually has blocks you don't need to make these arbitrary decisions and hunt for the exact way of copying a block of text.
> And this precise rule inevitably formats code incorrectly from the programmer's point of view in a significant amount of cases.
Only for programmers who are significantly unaware of how the Python language structures its code.
It's not that difficult really. And if you have problems visualizing them, you could use editors like Visual Studio Code, which highlights indentation so that you can see the start and end of blocks with colors, way easier than with start and end brackets.
Would you put your hand in the fire that I could correctly copy-paste that code from your HN comment as well? I’m not so sure, yet a Java fragment would surely work just fine.
OTOH, COBOL had this and then switched to format-free code; it's a 1960s thing, from when computers were slow and helping the compiler made compilation fast enough to be usable.
Indentation-sensitivity is so popular that only one language in the Top 20 programming languages has it. That's nearly 40 years after I saw my first indentation-sensitive language: occam.
Indentation-sensitivity seemed like a great idea in 1983. I'm not convinced today.
Not sure what list you’re referring to, but I would expect to find Python and Scala in it, maybe OCaml or Haskell depending on the criteria.
Personally I think it causes more problems in interpreted languages, a lot of those disappear with a compiler with thoughtful error messages like Scala.
I will give you that for highly specialized domains, those lists would look different (eg in some niche OCaml would be high while Rust would be low, or R high where C++ is lower down the list) , but it is telling that Scala comes in at #32 when compared to other languages that, I think, occupy a similar problem space.
Of course, Python is #1 on the TIOBE list (and has had a strong showing for years), which I ascribe to its overall productivity and wide application-domain fit. So, whether or not whitespace is objectively bad (for any language or only some languages), it hasn’t detracted much from Python’s popularity.
The original formulation of the offside rule made provision for handwritten entry. If/when computers get small enough to lose the keyboards and fast enough to reliably OCR, maybe that provision will be rejuvenated?
> Of existing languages, Racket (paired with Typed Racket) probably comes closest to what I feel is the sweet spot for language design, and it may explain why I seem to pick it more than most other languages when given a choice.
Racket is great, and I'm coming to appreciate it more as I spend my spare time with it. I didn't really start enjoying it until I stopped trying to pretend it was Python with delimiters. Expressions are fundamental to Racket in a way that goes way beyond syntax.
> Expressions are fundamental to Racket in a way that goes way beyond syntax.
This is typical of all (truly) functional languages, in my experience. They are expression-oriented rather than statement-oriented. You'll find similar behavior in, e.g., OCaml and Haskell. And since it was inspired by OCaml, you'll also find that Rust has an interesting expression-oriented approach to imperative programming, where they have essentially adopted the idea that "statements" are just unit-producing expressions that can be sequenced.
This rings especially true - "you can write Java in any language", or whatever sort of language shaped your thinking. Even after years of using different languages my prototyping code is often inherently imperative and I have to actively rewrite it to be functional.
eval is bad, but there are better alternatives which don't involve unsafe string concatenation. OCaml had (for a while) a nice system for constructing ASTs, eg:
    let ast = <:expr< 1 + 2 >>
would be the way to write the AST:
      +
     / \
    1   2
and you could compose those in various ways, eg:
    let ast2 = <:expr< $ast$ + 3 >>
Of course it was all safe at compile time. The main issue with it was it was quite complicated and terribly badly documented.
The whole idea behind eval is to provide textual input. Requiring an AST as input breaks the whole idea - you're just writing code directly in an awkward format.
eval comes from Lisp, and there, it's not textual input. You give it a quoted form (i.e. an AST) and an environment and it evaluates the form in that environment.
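Python's standard library offers the same kind of alternative: instead of concatenating strings for eval, you can build the AST directly with the ast module and compile that. A minimal sketch:

```python
import ast

# Construct the AST for `1 + 2` directly; no string concatenation involved.
expr = ast.Expression(
    body=ast.BinOp(left=ast.Constant(1), op=ast.Add(), right=ast.Constant(2))
)
ast.fix_missing_locations(expr)  # fill in line/column info the compiler needs
code = compile(expr, "<ast>", "eval")
print(eval(code))  # 3
```

Composing sub-expressions here means nesting AST nodes rather than splicing strings, so there is no injection risk and malformed programs fail at compile time rather than producing surprising code.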
A language that does not derive information from whitespace, doesn't mean the language is written without whitespace. Java, Go, Javascript, Rust, etc are all regularly written with whitespace, and have tools to enforce such formatting, but they don't derive information from it.
Also, Python has existed for decades and still there is little further adoption of indentation-sensitivity. It doesn't seem like a wave of indentation-sensitive languages will be coming any time soon.
> Java, Go, Javascript, Rust, etc are all regularly written with whitespace, and have tools to enforce such formatting, but they don't derive information from it.
Ah you reminded me. A curious phenomenon I've observed with Prettier in JS and fmt in Go is languages are moving to standardized whitespace, but as you said, not yet deriving information from it. I don't know enough about Java or Rust but I suspect they probably both have adopted a Prettier/fmt like convention where all code is formatted on save. So it seems like we are moving to a world where it will be a simple flip of a switch to then start having popular languages extract meaning from the whitespace.
> Also, Python has existed for decades and still there is little further adoption of indentation-sensitivity. It doesn't seem like a wave of indentation-sensitive languages will be coming any time soon.
I think it's coming big time this year. I think our Scroll (https://scroll.pub/) will catch fire and be the go-to language instead of Markdown by the end of the year. Then with the increasing success of TreeBase (powering PLDB and others) we will start to see JSON fall for config formats and document storage databases. A lot more will happen too; data vis will be a big one, but those 2 I'm reasonably certain of happening in 2023.
> Ah you reminded me. A curious phenomenon I've observed with Prettier in JS and fmt in Go is languages are moving to standardized whitespace, but as you said, not yet deriving information from it. I don't know enough about Java or Rust but I suspect they probably both have adopted a Prettier/fmt like convention where all code is formatted on save. So it seems like we are moving to a world where it will be a simple flip of a switch to then start having popular languages extract meaning from the whitespace.
But why would I want to extract meaning from whitespace if I have all the information I need from other mechanisms (eg. braces, brackets, keywords)?
You got this the wrong way around: you can then leave out the redundant information. As (I strongly hope) you won't leave out the formatting and start writing everything on a single line, the obvious thing to do then is to leave out the superfluous block delimiters.
Thank you for writing this article. Very interesting.
The dualities static vs dynamic typing, whitespace-insensitive vs sensitive, brackets vs no brackets, Pascal/Fortran style vs Algol style, interpreted vs compiled, dynamic vs lexical scope, AOT-compiled vs JIT-compiled, functional vs OOP are all choices whose ramifications a language must bear. I don't think it's obvious that one side of a duality is ALWAYS superior to the other in every scenario.
Obviously we would prefer a language that is expressive, easy to read and compiled for performance and avoids errors at compile time. Does this language exist?
And that is as easy to write as a scripting language such as Python or Ruby. Python is extremely fast to write scripts for.
I'm working on my own programming language. It resembles JavaScript, so it has C-style brackets and semicolons. In the interest of getting running fast, I compile to my own imaginary assembly language, executed by a switch-based interpreter. This means I can work on the frontend and backend of the language in parallel. My language is multithreaded and can communicate integers between threads. I am still designing compound struct sharing between threads, since this runs into the move-vs-copy and by-reference-vs-by-value semantics problem. If I send an object/struct/hash/list by copy, then the code is slow. If I do it by reference, I have memory management to worry about if I plan to rewrite my language in a compiled language and not rely on Java's garbage collection.
On eatonphil's discord we talk programming language design.
In my experience, type-related programming bugs are very rare (less than 2%). That's why I'm not in the least bit concerned with whether or not a language is statically-typed or dynamically-typed. I use both as needed. My favourite languages are Go and Smalltalk.
Smalltalk allows me to do live programming with great ease. This is a huge productivity amplifier. Smalltalk's productivity is much higher than that of any major language in use today.
Really? About 30% of the bugs we find in our entirely-Golang codebase are something that is trivially solved by a more advanced type system, and it was even higher back when I worked on Python.
Go is not statically typed, really. Consider using Haskell, if not Agda.
About "type-related programming bugs": I once endeavoured to write a cycle-accurate MIPS CPU simulation in Haskell. Most of the errors I uncovered were easily preventable with types: mark integers with their size in bits and you will never add an unaligned offset to the instruction pointer. Haskell did not have the proper types at the time, but later I used that trick quite successfully.
With types you can eliminate whole classes of logic errors. They just will not appear in front of the end user.
I have the opposite experience. Most of the bugs in my code are type-related issues, and once I fix those (usually with the help of the compiler, since I prefer statically-typed languages), my code usually runs correctly the first or second time. Throughout the 2022 AOC I only had 2 (out of 49) parts where I had a logic error that type checked but did not run properly, which I needed to debug. But in almost every part I had the compiler point out issues that I had to fix before the solutions would run at all.
Do type related programming bugs include null pointer errors? I also think this 2% stat is just complete bs. Anyway, this debate is over. Absolutely every top language designer has seen the merit of statically typed languages, if not for avoiding type errors then for making code refactorable.
Safe and sane refactoring is impossible even in mid sized code-bases without static types.
Dynamic languages are strictly only viable for throwaway code, or very small programs (where small means something up to two pages of code max, or so).
Sure it's not necessary when you produce throwaway code.
Most startups do exactly this.
That's even a valid strategy: concentrate on the business case only, and redo the code when you get successful and have the money to rewrite everything. (Of course, quite a few companies miss or have missed the right time to throw everything away and rewrite it. This later turns into a huge and very expensive PITA; some of the companies affected even started to build their own compilers and static checkers for the languages they use; just look at who pays for static type checkers for Python or Ruby.)
Either there's something missing between steps 1 and 2, or dynamic languages allow you to become a multi-billion dollar company just by writing two pages of code. I mean they are concise, but not that concise.
But it sounds like no amount of evidence will convince you. You have to invent weird terms like "sane refactoring" and "throwaway code" that do all the heavy lifting in your religion.
If you are presented with evidence that a lot of successful companies are doing refactoring with dynamic languages, you will say "it's not sane refactoring" (no true Scotsman) or say those companies are outliers (ignoring the fact that they are the overwhelming majority).
But all of that doesn't matter. Those companies are still more successful.
For me it’s never been about type bugs but rather about being able to work comfortably with the code. Some of us don’t have the greatest working memories; having the IDE offering support with types is a godsend. Python’s my prototyping language and I start adding types very early on, if not right from the start.
Depends on the language. When you have to resort to metaprogramming gymnastics, types really get in the way of writing something generic if the language doesn't provide a fully developed type system - and few do. Higher-kinded types, for example, are quite essential but not always available. In those cases I prefer dynamic typing and get things done fast, without the safety net. But then I remember hitting a lot of bugs in certain kinds of applications - maybe a fifth of the time, or so.
> However, for puzzles where the data structures were straightforward, dynamically typed languages had me moving faster out of the gate.
I don't understand this. I don't really encounter "straightforward" data structures that type inference can't handle on its own (or, at most, with explicit type annotations on just a few variables).
Not 100% sure what's meant in this specific example but as someone who did advent of code a lot, there are truly benefits of an untyped python dictionary over anything in Rust or Java or C++. You can just put None next to your integers without having to wrap in Optional or even worse, a struct.
It would surprise me if Rust isn't smart enough to infer that the dictionary's value type is `Option<i32>` (Rust's spelling of `Integer | None`). I haven't used Rust myself, but every language with type inference and algebraic data types that I've used makes this inference on its own.
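For what it's worth, Rust does make this inference, with the caveat that a `None` (or a `Some`) has to actually appear somewhere for the compiler to pick `Option`. A quick sketch (not from the thread):

```rust
use std::collections::HashMap;

fn main() {
    // No type annotations anywhere: the value type is inferred as
    // Option<i32> from the Some(...) and None inserted below.
    let mut m = HashMap::new();
    m.insert("answer", Some(42));
    m.insert("unknown", None);

    assert_eq!(m["answer"], Some(42));
    assert_eq!(m["unknown"], None);
    println!("{:?}", m.get("answer"));
}
```

So you still write `Some(42)` instead of a bare `42`, which is exactly the wrapping overhead the parent comment is complaining about - the inference is free, the tagging is not.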
You should check out Scheme with SRFI-119 or similar. It is a really elegant and noise-free way of writing S-expressions, unsure if it works with Racket or not though.
I also did a similar challenge this year, though I had to drop some of the more exotic languages that you kept in. Nice work.
Thanks for letting us tag along. What I'd like to hear are your thoughts on the Groucho Marx cigar question: laziness is useful every now and then, but in lazy languages like Haskell you pay for it everywhere. The same is true for dynamic dispatch: it's needed when working with values of a sum type, but like laziness it isn't needed wall-to-wall - yet in dynamically-typed languages, you pay for it everywhere.
Give me a call-by-value language with arrow and sum types and I can pay as I go.
[edit: added "arrow and"]
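One concrete reading of "pay as I go": in a call-by-value language, laziness can be an opt-in library feature rather than the default evaluation strategy. A Rust sketch using `std::cell::OnceCell` as a memoized thunk (the call counter exists only to demonstrate that the closure runs once):

```rust
use std::cell::{Cell, OnceCell};

fn main() {
    let calls = Cell::new(0);
    let thunk: OnceCell<u64> = OnceCell::new();
    let expensive = || {
        calls.set(calls.get() + 1);
        (1..=20u64).product() // stands in for a costly computation
    };

    // Nothing has been computed yet - we pay only if we force the thunk:
    assert_eq!(calls.get(), 0);

    let v1 = *thunk.get_or_init(expensive);
    let v2 = *thunk.get_or_init(expensive); // memoized: closure not rerun
    assert_eq!(v1, v2);
    assert_eq!(calls.get(), 1);
    println!("{v1}");
}
```

Here only this one value is lazy; everything else in the program keeps strict, predictable call-by-value costs - the inverse of Haskell, where strictness is what you must opt into.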
Sure, OCaml is great as are the other dialects of ML. I'm hoping that someone will weigh in with a compelling case for laziness everywhere or dynamic dispatch everywhere, other than pre-existing infrastructure, libraries, etc.
> hoping that someone will weigh in with a compelling case for laziness everywhere
How would that happen? You can hear it straight from the horse's mouth that laziness by default was a big mistake.
> dynamic dispatch everywhere
Indeed - a big mistake of most "OO languages".
Rust got this right. Static dispatch is the default (simple and fast!), but you can have dynamic dispatch where it makes sense. I hope this design will be copied everywhere in the future. Maybe some languages will even manage to get rid of their mistake and switch to the sane design (looking at you, Scala: just make type classes first-class, but keep OOP for the cases where it makes sense).
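For readers unfamiliar with the distinction, a small illustrative sketch of both dispatch styles in Rust (the `Shape` example is mine, not from the thread):

```rust
use std::f64::consts::PI;

trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square { fn area(&self) -> f64 { self.0 * self.0 } }
impl Shape for Circle { fn area(&self) -> f64 { PI * self.0 * self.0 } }

// Static dispatch (the default): monomorphized per concrete type,
// calls resolved at compile time, no vtable.
fn area_static(s: &impl Shape) -> f64 {
    s.area()
}

// Dynamic dispatch (opt-in via `dyn`): one vtable indirection per call,
// needed here because the collection is heterogeneous.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    assert_eq!(area_static(&Square(2.0)), 4.0);

    let shapes: Vec<Box<dyn Shape>> =
        vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    assert!((total_area(&shapes) - (4.0 + PI)).abs() < 1e-9);
    println!("{}", total_area(&shapes));
}
```

You pay the vtable cost only inside `total_area`, where it is genuinely needed; every `area_static` call compiles down to a direct (inlinable) call.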
> if I were to design a [...] language, it would push me toward indentation-sensitivity.
> That’s a notable shift from my prior perspective, where
> I felt that whitespace-insensitivity was preferrable to indentation-sensitivity.
I'm starting to have the same change of perspective for a different reason: the widespread use of formatters. One of the drawbacks of whitespace sensitivity is the accidental mixing of distinct whitespace characters (e.g., tabs and spaces). Using a formatter eliminates this problem because all whitespace is normalized.