I like the fact that textual includes are at the top of the list. What an abomination! Modules have been known and clearly superior since forever, both for correctness and for compile times. Every time I see macro abuse in C, or exploding compile times in C++ (which oddly doesn't even get mentioned in this section), I cry a little for the future of my profession.

Optional delimiters, default fall-through, weak typing - check, check, check. Good points, well argued. I've had to fix bugs caused by all of these.

I happen to agree on type first, disagree on single return (and it's a shame Zig doesn't seem to be on the radar especially because of this), but these are certainly discussions worth having and the OP provides a good starting point.
It's definitely a bit long, but well worth it.
What's not discussed is the layer below the language that so often ties us to it anyway: linkage. C also does not require anything that can be called a "runtime", although you usually get some in libc. These factors result in it being easier to write libraries in C that can be imported by FFI into more modern languages than the other way round.
I would love to see a new language, technology, something, take on this sort of problem. C works so well here because of the preprocessor's ability to strip out the bits & pieces that aren't necessary at the time.
We need something that will allow us to fall into the pit of success rather than succumb to the easy solution of using #ifdef for new, optional features.
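For concreteness, a minimal sketch of the #ifdef pattern in question, using a made-up FEATURE_X flag:

    #include <stdio.h>

    /* Hypothetical optional feature: compiled out entirely unless FEATURE_X
       is defined at build time (e.g. cc -DFEATURE_X ...). */
    #ifdef FEATURE_X
    static void feature_x_init(void) { puts("feature X enabled"); }
    #else
    /* Empty stub, so callers don't need their own #ifdefs. */
    static void feature_x_init(void) {}
    #endif

    int main(void) {
        feature_x_init();
        return 0;
    }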
I know for a fact that Rust can do this, though I couldn't speculate as to whether it would be better or worse than the equivalent done with the CPP (I am fairly convinced it would be harder to fuck up at least), and I would be very surprised if the other modern systemsy languages (D, zig, etc.) didn't offer similar things.
The Linux kernel is an amazing example of how to do ^^^that and still produce beautiful code. It uses a great number of function pointers to accomplish its encapsulation goals.
That need not be the case, and typically isn’t. If you do
(defvar *foo* #+sbcl(sb-sys:thing) #-sbcl(portable-thing))
I wouldn't claim that this is the end-all, be-all of code, but it does make for some reasonably compact platform-dependent-but-portable code.
This is a tremendous phrase.
Most likely reference.
"I happen to agree on type first, disagree on single return (and it's a shame Zig doesn't seem to be on the radar especially because of this)"
Can you clarify what you mean by this?
Yes, you can do that, but it's always going to be far more cumbersome than native multi-value return would be. Even in the simplest case (stack allocation in the caller), you have to define a new struct that has no other purpose, in a place visible to both caller and callee. You also have to refer to the status and value as struct members rather than simple variables. That's already worse than using a return value for the status and a reference parameter for the value.
I guess you could say that multi-value return is just syntactic sugar for more reference parameters. That's mostly true at the source level, but at the object level not quite. Using a reference parameter requires an extra stack operation (or register allocation) to push the parameter's address, in addition to whatever was done to allocate the parameter itself. Then the callee has to access it via pointer dereferences. With true multi-value return neither is necessary. The exact details are left as an easy exercise for any reader who knows how function calls work at that level. (BTW yes, interprocedural optimization can make that overhead go away, but a lot of code is written in C specifically to avoid relying on super-smart compilers and runtimes.)
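To make the comparison concrete, here is a minimal sketch of both idioms; parse_struct and parse_out are made-up names:

    #include <stdio.h>
    #include <stdlib.h>

    /* Struct return: needs a dedicated two-member type, visible to both
       caller and callee. */
    struct parse_result { int status; long value; };

    static struct parse_result parse_struct(const char *s) {
        struct parse_result r = { 0, 0 };
        char *end;
        r.value = strtol(s, &end, 10);
        if (end == s) r.status = -1;   /* nothing was parsed */
        return r;
    }

    /* Status return plus out parameter: no extra type, but the callee
       writes through a pointer. */
    static int parse_out(const char *s, long *value) {
        char *end;
        *value = strtol(s, &end, 10);
        return (end == s) ? -1 : 0;
    }

    int main(void) {
        struct parse_result r = parse_struct("42");
        if (r.status == 0) printf("struct: %ld\n", r.value);

        long v;
        if (parse_out("42", &v) == 0) printf("out param: %ld\n", v);
        return 0;
    }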
Multi-value return isn't a huge deal, though. I appreciate having it in Python, but in almost thirty years I rarely missed it in C. While I wouldn't necessarily oppose its addition, I think it's more in keeping with C's general spirit of simplicity to leave it out.
An interface devised just to unwind information from a two-member struct is beyond 'cumbersome'.
What non-object-oriented approach would you suggest instead?
That's not abuse, that's the most important pattern in C++ (RAII). You mustn't think of C++ "objects" as Java or C# or Smalltalk objects - they are not; they are deterministic resource managers before everything else.
As long as people persist in calling them objects, and use all of the other object-oriented concepts/terminology such as classes and inheritance, people will expect them to be objects. That's not unreasonable. It's absurd to look down your nose at people who are taking you at your word.
These other uses are hacks. If you want to be able to attach something to a scope that's great, actually it's a wonderful idea, but just be honest about it. Make scopes a first-class concept, give things names that reflect their scope-oriented meaning and usage. Python's "with" is a step in the right direction; even though the implementation uses objects, they're objects that implement a specific interface (not just creation/destruction) and their usage is distinct. That separation allows scope-based semantics to evolve independently of object-lifetime semantics, which are already muddled by things like lambdas, futures, and coroutines. Tying them together might be an important pattern in C++, but it's also a mistake. Not the first, not the last. Making mistakes mandatory has always been the C++ way.
We don't want to attach resources to scopes, we want to attach them to object lifetimes. That's why defer/with/unwind-protect are not alternatives to RAII. The lifetime of an object I pushed to a vector is not attached to any lexical scope in the program text, it is attached to the dynamic extent during which the object is alive. While a scope guard always destroys its resource at the end of a block, RAII allows the resource lifetime to be shortened, by consuming it inside the block, or prolonged, by moving it somewhere with a dynamic extent that outlives the end of the block.
Here's an example where defer solves nothing: if I ask Zig to shrink an ArrayList of strings, it drops the strings on the ground and leaks the memory, because Zig has no notion of the ArrayList owning its elements. You need to loop over the strings you are about to drop and call their destructors, which is literally the hard part, since the actual shrink method just assigns to the length field. The lack of destructors (Zig has no generic notion of a destructor) here impedes generic code, since what you do for strings is different from what you do for ints.
RAII guards are real objects, they contain real data (drop flags), and they make code safer and more generic. If you don't like RAII, show comparable solutions (which scope guards are not), don't just call it a hack and adduce philosophical notions of how OOP should work.
Is that the royal "we"? Because for people who aren't you, it's only true some of the time. Sure, true resource acquisition/release is tied to object lifetimes. That's almost a tautology. But that doesn't work e.g. for lock pseudo-objects, which very much are expected and meant to be associated with a scope. It just happens to work out because the object and scope lifetimes are usually the same, but it's still a semantic muddle and it does break for things like lambdas and coroutines.
> if I ask Zig to shrink an ArrayList of strings
That's a silly and irrelevant example, having more to do with ownership rules (which C++ makes a unique mess of) than with scopes vs. objects. Any Zig code anywhere that shrinks a list of strings had better handle freeing its (now non-) members. No, defer doesn't cover that case. Yes, destructors would, but this isn't an OO language. The obvious solution (same as in C) is to define a resize_string_list function. Again, what can you suggest in a non-OO language that's better?
> The lack of RAII here impedes generic code
You really don't want to get into a discussion about C++ and generics. Trust me on that. Yes, you need to do different things for strings and ints, but there are many ways besides C++'s unique interpretation of RAII (e.g. type introspection) to handle that.
> If you don't like RAII, show comparable solution
Done. Your turn. If you want to be constructive instead of just doctrinaire, tell us what you'd do without OO to address these situations better than existing solutions.
P.S. Also, what's with all the nonsense-word accounts in this thread taking offense at things said to jcelerier and responding with exactly the same points in exactly the same tone?
The nicest thing: you can throw in the exit method without the runtime falling over itself.
You can do a kind of RAII in D when you use structs instead of classes, because structs are stack-allocated and have a suitable lifetime.
After suffering badly formatted code in my early career, Python's approach seemed refreshing. But after suffering one too many bad merges where indentation is left mangled, I've concluded we need braces: you can automatically reformat everything instantly, without even thinking about it, which eliminates the merge ambiguity. gofmt is the better solution because it declares a strict representation that can be automatically enforced.

You could retort that better rebase/merge practices could alleviate some of these issues, but if that discipline could be enforced on arbitrary groups of humans, we wouldn't have cheered Python's forced indentation in the first place.
Or semantics-aware merging.
But I'm beginning to think you meant that you want a merge tool that doesn't mangle the whitespace in the first place. If the pre-merge whitespace is unmangled, and the merge tool understands Python whitespace, then at a minimum it should be able to flag ambiguous changes and ask for help.
Citation needed. Python's popularity seems to only be growing, and in all kinds of industries. It's hard to find an area Python hasn't touched yet.

Also, as you mentioned, it's pretty standard to have some sort of flake8 check as part of your CI, and it would certainly detect problems with bad indentation that would cause code issues.

If the basic flake8 checks pass, you should certainly have some unit / integration tests as part of your merge request...

Nowadays there are even tools like black [https://github.com/ambv/black] that do auto formatting just like gofmt.
Further, all the human conditions that resulted in bad code style before are still present and showing through in Python codebases: namely, organizations with loose standards or inadequate tooling, where language problems get magnified. This is remarkably evident in companies that used to be 100% C/C++ or Java shops and never quite figured out unit tests.
It does auto formatting, but it doesn't do auto formatting just like gofmt. An auto formatter can't do anything with this block:
if value < 0.3:
# do something
elif 0.3 <= value < 0.7:
# do something else
# do a third thing
* No bitcast operator. Your options are a) use the union trick; b) use the memcpy trick (a minimal sketch follows this list); or c) take an address, cast it to a different pointer type, and hope that your compiler gives you a pass on technically violating the standard here. Even C++ didn't get bitcast until C++20.
* No SIMD vector types. Of course, SIMD vectors are even more type-punning heavy than integers and floats, so you do need a good bitcast operator to get anywhere.
* Volatile and atomic are type qualifiers. These ought to be properties of the memory accesses; making them qualifiers on types obfuscates which memory accesses they apply to. If you look at the Linux kernel, it doesn't use volatile but instead uses a READ_ONCE macro that acts much like a volatile load.
* Bitfields are a mistake. They combine especially poorly with the vague properties of volatile. How many memory accesses are required in this program:
    struct {
        volatile unsigned a : 3;
        volatile unsigned b : 2;
    } foo;

    int main(void) { foo.a = 3; foo.b = 0; }
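As for the missing bitcast above, here is a minimal sketch of the memcpy trick (option b), with a made-up float_bits helper; compilers generally optimize the copy away:

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* Reinterpret the bits of a float as a uint32_t without violating
       strict aliasing; both types are 32 bits wide. */
    static uint32_t float_bits(float f) {
        uint32_t u;
        memcpy(&u, &f, sizeof u);
        return u;
    }

    int main(void) {
        printf("0x%08x\n", (unsigned) float_bits(1.0f));  /* prints 0x3f800000 */
        return 0;
    }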
Huh? What architecture are you using that provides SIMD without corresponding vector types?
In both cases, they're probably going to be confused about why an equality test a few lines later fails every once in a while.
The concept of divide vs. div & mod is one that a programmer eventually needs to understand. More importantly, not everything should be optimized for newbies. The context-driven / operator is appropriate in programming languages designed for experienced programmers.
Upon further thought... Couldn't the argument be rephrased as "integers are broken numbers"? It sounds silly, but from a newbie's perspective the same problem exists with "int f = 38.2 * 25;".
An example from C#: when I was a Unity developer, I frequently needed to figure out what the aspect ratio of the screen is. You'd think this would work:
    float aspect = Screen.width / Screen.height;

but Screen.width and Screen.height are both ints, so the division truncates before anything is converted to float (1920 / 1080 yields 1, not ~1.78).
Python does it correctly, and C-like languages should as well. Division used with two integers should always return a floating point number, and there should be a separate operator for integer division. It makes no sense the way it's done now.
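The same pitfall is easy to reproduce in C; a minimal sketch:

    #include <stdio.h>

    int main(void) {
        int width = 1920, height = 1080;

        float truncated = width / height;         /* integer division first: 1.0 */
        float intended  = (float) width / height; /* promote before dividing: ~1.78 */

        printf("%f vs %f\n", truncated, intended);
        return 0;
    }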
3 = x
Doesn't solve everything but it does help.
The real problem is using '=' as the assignment operator. I think this was a serious design flaw. Of course some languages use ':=' which is better. I prefer just ':'. I see many languages that use '=' in some contexts and ':' in other contexts. Members/properties/fields quite often get assigned using ':'. I say make it universal and reserve '=' for equality.
I'm not sure if that makes things better, but it could be worth a special mention.
I found that they both shared a philosophical simplicity (even if it only seemed that way with C, considering how much complexity you later learn about) and over a decade later I've not found the same philosophy in any other programming languages.
They all tend to be written with the goal of adding features that are supposed to make the programmer's life easier—and here's the distinction—rather than designing a language that is powerful but simple.
This of course has shortcomings of its own, but the trade-offs are ones that I typically seek.
The grammar for Lua is delightfully short, which seems to be a significant source of its beauty. https://www.lua.org/manual/5.1/manual.html#8
I'd love to be educated on similarly easy languages.
I'm not sure what the lingua franca of the future for software developers should look like, but I'd posit that it probably should be slightly more complicated than C or Lua in terms of looks. At least in terms of optional standard libraries provided for things like cache levels and GPU support.
Perhaps that's outside the scope of what a programming language should provide to users? I'm not sure. It seems like we sit on a lot of complexity and don't use it as efficiently as we could. Maybe the underlying virtual machines or compilers should be doing some of these things for us, as they already do to some extent, but with a broader reach.
Suppose in my program "hello" I import "foo"-- which depends on foo/buggy.dll-- and "bar"-- which depends on bar/buggy.dll.
foo/buggy.dll is a helper library that computes a correct value for "foo" but would crash if used with "bar."
bar/buggy.dll is foo/buggy.dll with a single late-night bug fix for the crasher which introduces a regression that will crash if used with "foo".
So does "contents ending up in that namespace" mean that "hello" will run without crashing/clashing?
1. "Single return and out parameters" should have a special mention for Haskell, since Haskell doesn't even have multiple input parameters!
2. Python has assignment expressions now.
Overall, it's a pretty good list of shortcomings of C, but I disagree with several of the points: (a) special-casing subtraction lexing to be whitespace sensitive is silly, and (b) integer division is essential whenever working with arrays or modular arithmetic, and converting types explicitly, like Rust mandates, is definitely the way to go. Who knows if I'd want a float, double, rational or currency type to be the output, anyway?
Probably what people want is a Number type providing integer bignums for those situations when you don't want to care, a set of "machine integer" types with controllable overflow handling, IEEE floating point, a Rational/Fraction so you can handle 1/3 correctly, a Money type with controllable rounding, and COBOL-style "picture" types.
Oh, and complex numbers, and matrix types of all of the above.
PEP 572? If so, that doesn't support multiple assignment at all.
Julia doesn't use ~ for logical not; it's used for bitwise not. (It does work as logical not, but only because Bool is a subtype of Integer; I've never seen that explicitly recommended.)
In shell, a variable can be unset, which is different from being set to the empty string.
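The same distinction shows up from C, for what it's worth: getenv() returns NULL for an unset variable and "" for one set to empty (SOME_VAR is a made-up name):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        const char *v = getenv("SOME_VAR");
        if (v == NULL)
            puts("unset");
        else if (v[0] == '\0')
            puts("set to the empty string");
        else
            printf("set to \"%s\"\n", v);
        return 0;
    }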
Make extensive use of these for maximum job security! /s
No. Just no.
This is several orders of magnitude worse than most of the rest of the list. (Which is mostly personal opinions.)
Is there a reason I'm not thinking of?
Almost all the attention went to superficial details that really don't make much difference. Your fingers and your eyes learn the gestures and signposts, and they fade into the background. (But modules and namespaces matter.)
What is left is whether you can express what you need to, at all. Whether, having expressed it, you can pack it up into a library that anybody can use without knowing all about how it was put there. Whether, finding a library, you can actually use it in your runtime environment without it costing more overhead than if you wrote your own, or demanding runtime concessions different from what you have committed to for your code or for other libraries you want to use. Obligate GC is death for interoperability.
The only languages that excel there are C++, Rust, and D. (I include Rust because it is well on its way, and will get there before long if its users can wean themselves off of ARC boxing.) None of the other languages are really even trying. It's tragic. Haskell and the MLs could be good at libraries, were it not for their obligate-GC problem. The other languages with big library ecosystems are slow, so overhead isn't noticed.
There has been more than enough time to come up with something to unseat C++. Part of the problem is that the main incubator for new languages has been academia, and academics won't even discuss a language that is not obligate-GC. We need a language that will be equally good at copying register values to and from ALUs and memory buses, driving vector pipelines, orchestrating legions of GPU cores, and wiring up FPGA subunits. (I have not seen an FPGA compatible with GC.) If we end up programming our FPGAs in C++, it will be the fault of everyone who failed to unseat it by making a better language than it.
On the other hand, C has so many goodies which ought to be done right and better in modern languages, but often are not:
1. Variadic functions like printf (a minimal sketch follows this list). It sucks to wrap arguments into a list just for this.
2. Setjmp/longjmp and nonlocal returns
3. Union data types
4. Conditional macro directives to compile debug statement versions when needed.
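To make item 1 concrete, a minimal sketch of a C variadic function; sum_ints is a made-up example:

    #include <stdarg.h>
    #include <stdio.h>

    /* Sums 'count' ints passed as trailing arguments. */
    static int sum_ints(int count, ...) {
        va_list ap;
        int total = 0;
        va_start(ap, count);
        for (int i = 0; i < count; i++)
            total += va_arg(ap, int);   /* pull each argument off in turn */
        va_end(ap);
        return total;
    }

    int main(void) {
        printf("%d\n", sum_ints(3, 1, 2, 3));   /* prints 6 */
        return 0;
    }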
It's easy to criticise C, or to patronise it by saying that it was good for its time, but the reality is that many of its features (or what they attempt) are futuristic even today.
That has to be the first time I've seen those features of C described as a positive. I'm genuinely curious; would you be willing to explain further?
void cblas_dgemm(const CBLAS_LAYOUT Layout,
const CBLAS_TRANSPOSE transa, const CBLAS_TRANSPOSE transb,
const MKL_INT m, const MKL_INT n, const MKL_INT k,
const double alpha, const double *a, const MKL_INT lda,
const double *b, const MKL_INT ldb,
const double beta,
double *c, const MKL_INT ldc);
int gsl_blas_dgemm(CBLAS_TRANSPOSE_t TransA, CBLAS_TRANSPOSE_t TransB,
                   double alpha,
                   const gsl_matrix * A,
                   const gsl_matrix * B,
                   double beta,
                   gsl_matrix * C);
(The c2 wiki is half-defunct. Long story.)
I guess that the author only saw the use of `use-package' or the `:use' option of `defpackage', but this is not necessary (and not generally used) to refer to other namespaces.
The actual use of `defpackage' is often quite close to how Clojure does it.
The symbols in the compiled file get read in the current package. The *package* variable is dynamically rebound, over the lifetime of the load, to its existing value, so that if the file happens to change it, that effect is undone when the load finishes.
The loaded file source can arrange for its bulk to be read in its own namespace, or it can be processed in the parent namespace, which is the best of both worlds.
When a file is compiled, then it's no longer textual inclusion: best of both worlds again.
This is all so reasonably designed that I copied the salient aspects of things like load and compile-file and all that jazz nearly as-is into TXR Lisp, which isn't an ANSI CL implementation and is free to do things differently.
I agree! I've stuck hard and fast to this rule since... I was programming Qbasic as a kid. Back then it was for different reasons, but the practice stuck with me as I learned new languages.
Though maybe I'm too deep down the rabbithole already and simply got used to it.
Rather than call this "monadic error handling", I'd just say that results are wrapped up so that errors can be distinguished from successful results. Usually that's done by wrapping in a list (or, if the language supports it, an "Optional"/"Maybe" type, which is just a list truncated to 1 element).
Adding this extra structure lets us distinguish things like "the query died" (an empty list) from "there was no match" (a list containing an empty list). If we'd used NULL to indicate failure, we wouldn't be able to distinguish between these situations (or indeed if there was a match, whose value happened to be NULL!).
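In a language without a built-in Optional, the same distinction can be sketched with a small tagged struct; in C, for instance (lookup_result and lookup are made-up names):

    #include <stdio.h>

    /* A made-up result wrapper: "found" distinguishes "no match" from a
       match whose value happens to be NULL. */
    struct lookup_result {
        int found;          /* 0 means the query matched nothing */
        const char *value;  /* may legitimately be NULL even when found */
    };

    static struct lookup_result lookup(const char *key) {
        if (key == NULL)
            return (struct lookup_result){ 0, NULL };  /* no match */
        return (struct lookup_result){ 1, NULL };      /* match, value is NULL */
    }

    int main(void) {
        struct lookup_result r = lookup("some-key");
        if (!r.found)
            puts("no match");
        else if (r.value == NULL)
            puts("matched, but the value is NULL");  /* distinguishable from "no match" */
        return 0;
    }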
Naively we might think this requires a lot of length-checking and unwrapping, but we can avoid that by using list operations that are (hopefully) familiar to every programmer, like "map", "concatenate" and "singleton".
It turns out that those operations form a monad, but it seems overly dramatic and confusing to name the approach using that terminology. Sure it's nice that we can abstract out this interface, but we don't need that much abstraction when our whole ecosystem is using a single, specific implementation like "Optional".
Incidentally, there's a really nice paper on this called "How to replace failure by a list of successes" ( https://rkrishnan.org/files/wadler-1985.pdf ), which shows how normal, non-truncated lists actually implement backtracking search (assuming our lists are lazily generated, e.g. like in Haskell or using an iterator).
Note that being "monadic" specifically means we're able to 'collapse' these lists, i.e. concatenate a list-of-lists into a list ("singleton" comes from a weaker notion called 'applicative', "map" comes from a weaker notion called 'functor'). Collapsing lists removes the distinctions that we introduced, since "concat([[]])" and "concat([])" are both "[]", making our result act more like NULL. So calling this approach "monadic" is actually emphasising the wrong part!
"Do you struggle to track down the source of NULL values in your code? With monadic error handling you can struggle to track down 'Nothing' values instead!"
The real improvement is from the non-monadic interface, like "map", which preserves these distinctions.
The difference is that ⊥ can't be observed as a value in a Haskell program. The moment you try to use one, your program will either crash or loop forever.
(The reason it is useful has to do with lazy evaluation)
And that is perfectly OK! Any "module system" sucks big time, creates more problems than it solves, and should be avoided whatever the cost. Textual includes are great, but of course they should not be used for silly module systems.
Module systems solve all the problems mentioned in the argument (and created by text-based imports).
They don't add anything bad on their own.
What exactly are those "more problems than it solves"?
Imagine you change code in an upstream module. Now the compiler has to recompile all downstream modules. In C and C++ this only happens if you change the header file.
(On the other hand modern development techniques emphasize tests so you might only recompile your module and the testsuite until all tests pass and only then recompile all modules, minimizing the impact.)
For a full recompilation you can parallelize C and C++ compilations much better than any module system I know.
The C language is not designed for building huge programs by accretion of modules. The idea is that you build many independent programs, and then you glue them together using scripts.
That doesn't exactly describe kernels or embedded systems, which are C's strongest bastions. Whether it's well designed for the purpose or not, whether modules are appropriate in that context or not, a significant majority of the C code out there (including most common web servers, databases, etc.) does not fit your description at all.
Building small programs and gluing together with scripts is great, but hardly relevant. You don't need includes or modules for that. What if you do need to build one large program, like those aforementioned web servers or databases, or (what I work on) a storage server? That's where the difference between textual inclusion and modules really comes into play, and modules are strictly better than includes in every way.
I think the problem here is that you're confusing modules with things built on top of modules - specifically package managers. A lot of the package managers out there are horrible and create more problems than they solve, but that has almost nothing to do with modules as a language construct.
Can you give a single concrete example of a problem? Because the above are all no-ops, semantically speaking...
>The idea is that you build many independent programs, and then you glue them together using scripts.
This is a non-starter for most use cases outside pipeable shell commands (which are not the only kind of programs people want to write).
People need, and write, and have written for decades, large programs in C, and programs in C which have from 10s to 100s of header files included (including recursively from included libs).
The idea you mention is valid, and is part of the Unix philosophy.
But it was never the idea that C should be used JUST for that.
In fact the first use of C was to write a whole operating system.
Must be nice to have one of those! Cortex-M* does not.