I got the solution! Given that spaces in identifiers cannot be used because we n...

pavlov · 2024-05-21T11:38:47.000000Z

In Unicode there's the non-breaking space character. It's even easy to type, at least on Mac: just press Option+Spacebar.

throwaway482945 · 2024-05-21T12:16:57.000000Z

Would allowing spaces in identifiers even introduce any ambiguity in most languages? I think the only languages I've seen where it would matter are functional languages. e.g. I think it'd be possible to write a Python program using spaces instead of underscores, and be able to unambiguously parse it with a slightly modified parser?

lpribis · 2024-05-21T23:20:22.000000Z

The classic example of why whitespace in identifiers arguably causes more problems than it solves is https://www-users.york.ac.uk/~ss44/cyc/p/fbug.htm.

Since the space of valid syntax becomes so much larger, typos are more likely to result in valid but incorrect programs, especially in dynamic, interpreted languages like Python.

ReleaseCandidat · 2024-05-21T12:20:33.000000Z

> I think the only languages I've seen where it would matter are functional languages.

Yes, ML-style function application is a problem, and so is treating newlines as "normal" whitespace without having line separators (a.k.a. semicolons). Keywords used as infix operators are a problem too, as another post reminded me.
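To make the infix-keyword point concrete, here is a rough Python sketch. The helper name `could_be_spaced_identifier` is made up for illustration, and the rule it applies (a spaced name is only unambiguous if none of its words is a reserved keyword) is an assumption about how such a modified parser might work, not how any real parser does it.

```python
import keyword


def could_be_spaced_identifier(words: str) -> bool:
    """Return True if every space-separated word is a plain, non-keyword name.

    Hypothetical rule: a run of words can only be read as one identifier
    when none of the words is a reserved keyword such as `in`, `is`, or `not`.
    """
    parts = words.split(" ")
    return all(p.isidentifier() and not keyword.iskeyword(p) for p in parts)


print(could_be_spaced_identifier("item count"))   # True
print(could_be_spaced_identifier("x in items"))   # False: `in` is an operator
print(could_be_spaced_identifier("not found"))    # False: `not` is an operator
```

Under that assumption, `item count` could be a name, but `x in items` has to stay an expression, so any name containing a keyword as one of its words would need to be banned or disambiguated some other way.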

pwagland · 2024-05-21T19:04:19.000000Z

Just use NBSP instead.

It makes sense, you would never want your variable to be line wrapped anyway, right?

The challenge is that _some_ languages define whitespace as any Unicode whitespace, not just the space character.
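Python is one example of this, as far as its string methods go. A quick check, using only standard-library calls (the function `describe_nbsp_name` and the sample name are just for illustration):

```python
import unicodedata

NBSP = "\u00a0"  # U+00A0, the character Option+Spacebar produces on a Mac


def describe_nbsp_name(name: str) -> dict:
    """Report how Python's built-in string methods treat a name containing NBSP."""
    return {
        "is_identifier": name.isidentifier(),    # can it be used as a variable name?
        "whitespace_split": name.split(),        # split on any Unicode whitespace
        "ascii_space_split": name.split(" "),    # split on U+0020 only
    }


print(unicodedata.name(NBSP))   # NO-BREAK SPACE
print(NBSP.isspace())           # True
print(describe_nbsp_name(f"total{NBSP}count"))
```

So `str.split()` with no argument breaks the name apart at the NBSP, and `str.isidentifier()` rejects it, while splitting on a literal `" "` leaves it in one piece.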

Glacia · 2024-05-21T11:53:36.000000Z

I'm pretty sure ALGOL allowed spaces in identifiers. Probably some other old programming languages did too. For the most part, it's just a tradition at this point.

ReleaseCandidat · 2024-05-21T12:05:53.000000Z

Fortran up to 77 (well, technically still everything in fixed-form, a.k.a. punch-card-style source files) ignores spaces. And AFAIK there still exist both versions of e.g. "goto": "go to" and "goto" are both valid Fortran 90 and later.

This, and the fact that variables are allowed to be implicitly declared, led to the famous bug where

       DO 10 I = 1.100

declares the variable `DO10I` with a value of 1.1, instead of starting a loop from 1 to 100 that ends at the "statement label" 10:

       DO 10 I = 1,100
          SUM = SUM + I
    10 CONTINUE
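A toy Python sketch of why the two lines diverge, assuming a parser that simply deletes all spaces first. The function `classify_fixed_form` and its regexes are made up for illustration and cover only these two statement shapes, nothing like a real Fortran front end:

```python
import re


def classify_fixed_form(statement: str) -> str:
    """Classify a statement the way a space-ignoring parser might (toy version)."""
    squeezed = statement.replace(" ", "").upper()

    # DO <label> <var> = <start> , <end>: the comma is what marks a loop.
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", squeezed)
    if loop:
        label, var, start, end = loop.groups()
        return f"loop: label={label} var={var} from={start} to={end}"

    # Without a comma, the same characters read as <name> = <expression>.
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", squeezed)
    if assign:
        name, value = assign.groups()
        return f"assignment: {name} = {value}"

    return "unknown"


print(classify_fixed_form("DO 10 I = 1,100"))  # loop: label=10 var=I from=1 to=100
print(classify_fixed_form("DO 10 I = 1.100"))  # assignment: DO10I = 1.100
```

Once the spaces are gone, the only thing separating the loop from the assignment is a single `,` versus `.`.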

aragonite · 2024-05-21T12:08:12.000000Z

Also older versions of FORTRAN, according to Crockford at least :)

> It is good to have names containing multiple words, but there is little agreement on how to do that since spaces are not allowed inside of names. There is wun [sic] school that insists on the use of camel case, where the first letter of words are capitalized to indicate the word boundaries. There is another school that insists that _ underbar should be used in place of space to show the word boundaries. There is a third school that just runs all the words together, losing the word boundaries. The schools are unable to agree on the best practice. This argument has been going on for years and years and does not appear to be approaching any kind of consensus. That is because all of the schools are wrong.

> The correct answer is to use spaces to separate the words. Programming languages currently do not allow this because compilers in the 1950s had to run in a very small number of kilowords, and spaces in names were considered an unaffordable luxury. FORTRAN actually pulled it off, allowing names to contain spaces, but later languages did not follow that good example ... I am hoping that the next language does the right thing and allows names to contain spaces to improve readability.