> ML didn’t have single-line comments, so same level of surprising limitation.
It is not quite clear to me why the lack of single-line comments is such a surprising limitation. After all, a single-line block comment can easily serve as a substitute. However, there is no straightforward workaround for the lack of nested block comments.
> I’ve never heard someone refer to C as “expressive”, but maybe it was in 1972 when compared to assembly.
I was thinking of Fortran in this context. For instance, Fortran 77 lacked function pointers and offered a limited set of control flow structures, along with cumbersome support for recursion. I know Fortran, with its native support for multidimensional arrays, excelled in numerical and scientific computing but C quickly became the preferred language for general purpose computing.
While very few today would consider C a pinnacle of expressiveness, when I was learning C, the landscape of mainstream programming languages was much more restricted. In fact, the preface to the first edition of K&R notes the following:
"In our experience, C has proven to be a pleasant, expressive and versatile language for a wide
variety of programs."
C, Pascal, etc. stood out as some of the few mainstream programming languages that offered a reasonable level of expressiveness. Of course, Lisp was exceptionally expressive in its own right, but it wasn't always the best fit for certain applications or environments.
> And what bearing does the comment syntax have on the expressiveness of a language?
Nothing at all. I agree. The expressiveness of C comes from its grammar, which the language parser handles. Support for nested comments, in the context of C, is a concern for the lexer, so indeed one does not directly influence the other. However, it is still curious that a language with such a sophisticated grammar and parser could not allocate a bit of its complexity budget to support nested comments in its lexer. This is a trivial matter, I know, but I still couldn't help but wonder about it.
But we did have dummy procedures, which covered one of the important use cases directly, and which could be abused to fake function/subroutine pointers stored in data.
Fair enough. From my perspective, lack of single line comments is a little surprising because most other languages had it at the time (1973, when ML was introduced). Lack of nested comments doesn’t seem surprising, because it isn’t an important feature for a language, and because most other languages did not have it at the time (1972, when C was introduced).
I can imagine both pro and con arguments for supporting nested comments, but regardless of what I think, C certainly could have added support for nested comments at any time, and hasn’t, which suggests that there isn’t sufficient need for it. That might be the entire explanation: not even worth a little complexity.
However, if a later C standard were to introduce nested comments, it would break the above program because then the following part of the program would be recognised as a comment:
/* /* Comment */
printf("hello */
The above text would be ignored. Then the compiler would encounter the following:
world");
This would lead to errors like undeclared identifier 'world', missing terminating " character, etc.
Given the neighboring thread where I just learned that the lexer runs before the preprocessor, I’m not sure that would be the outcome. There’s no reason to assume the comment terminator wouldn’t be ignored in strings. And even today, you can safely write printf(“hello // world\n”); without risking a compile error, right?
> Given the neighboring thread where I just learned that the lexer runs before the preprocessor, I’m not sure that would be the outcome.
That is precisely why nested comments would end up breaking the C89 code example I provided above. I elaborate this further below.
> There’s no reason to assume the comment terminator wouldn’t be ignored in strings.
There is no notion of "comment terminator in strings" in C. At any point of time, the lexer is reading either a string or a comment but never one within the other. For example, in C89, C99, etc., this is an invalid C program too:
In this case, we wouldn't say that the lexer is "honoring the comment terminator in a string" because, at the point the comment terminator '*/' is read, there is no active string. There is only a comment that looks like this:
/* Comment
printf("hello */
The double quotation mark within the comment is immaterial. It is simply part of the comment. Once the lexer has read the opening '/*', it looks for the terminating '*/'. This behaviour would hold even if future C standards were to allow nested comments, which is why nested comments would break the C89 example I mentioned in my earlier HN comment.
> And even today, you can safely write printf("hello // world\n"); without risking a compile error, right?
Right. But it is not clear what this has got to do with my concern that nested comments would break valid C89 programs. In this printf() example, we only have an ordinary string, so obviously this compiles fine. Once the lexer has read the opening quotation mark as the beginning of a string, it looks for an unescaped terminating quotation mark. So clearly, everything until the unescaped terminating quotation mark is a string!
Oh wow, I didn’t remember that, and I did start writing C before 99. I stand corrected. I guess that is a little surprising. ;)
Is true that many languages had single line comments? Maybe I’m forgetting more, but I remember everything else having single line comments… asm, basic, shell. I used Pascal in the 80s and apparently forgot it didn’t have line comments either?
Some C compilers supported it as an unofficial extension well before C99, so that could be why you didn't realise or don't remember. I think that included both Visual Studio (which was really a C++ compiler that could turn off the C++ bits) and GCC with GNU extensions enabled.
That's my recollection, that most languages had single line comments. Some had multi-line comments but C++ is the first I remember having syntaxes for both. That said, I'm not terribly familiar with pre-80s stuff.
It is not quite clear to me why the lack of single-line comments is such a surprising limitation. After all, a single-line block comment can easily serve as a substitute. However, there is no straightforward workaround for the lack of nested block comments.
> I’ve never heard someone refer to C as “expressive”, but maybe it was in 1972 when compared to assembly.
I was thinking of Fortran in this context. For instance, Fortran 77 lacked function pointers and offered a limited set of control flow structures, along with cumbersome support for recursion. I know Fortran, with its native support for multidimensional arrays, excelled in numerical and scientific computing but C quickly became the preferred language for general purpose computing.
While very few today would consider C a pinnacle of expressiveness, when I was learning C, the landscape of mainstream programming languages was much more restricted. In fact, the preface to the first edition of K&R notes the following:
"In our experience, C has proven to be a pleasant, expressive and versatile language for a wide variety of programs."
C, Pascal, etc. stood out as some of the few mainstream programming languages that offered a reasonable level of expressiveness. Of course, Lisp was exceptionally expressive in its own right, but it wasn't always the best fit for certain applications or environments.
> And what bearing does the comment syntax have on the expressiveness of a language?
Nothing at all. I agree. The expressiveness of C comes from its grammar, which the language parser handles. Support for nested comments, in the context of C, is a concern for the lexer, so indeed one does not directly influence the other. However, it is still curious that a language with such a sophisticated grammar and parser could not allocate a bit of its complexity budget to support nested comments in its lexer. This is a trivial matter, I know, but I still couldn't help but wonder about it.