.C as a file extension for C++ is not portable

Joker_vD · on May 13, 2021

Interestingly, the blog post doesn't mention the actual problem that's the source of the MSVC behaviour: not all filesystems are case-sensitive.

So if you care about your project being able to be checked out on Windows, please refrain from using the .C extension for C++ files. I've seen a toy language repo that used "something.s" for the test source file and "something.S" for part of the language runtime (or something like this, I don't quite remember), and checking it out on Windows was great fun: git reports that the working directory is dirty and, at the same time, that no files have changed.
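A rough sketch of how one might catch this before it bites a Windows checkout: group tracked paths by a case-folded key and report any group with more than one spelling. The function name and the sample paths are hypothetical, and `casefold()` is only a stand-in for whatever folding rules a given filesystem actually applies.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Return groups of paths that differ only by letter case."""
    groups = defaultdict(list)
    for path in paths:
        # casefold() approximates a case-insensitive filesystem's view;
        # real filesystems (NTFS, APFS, ...) use their own folding tables.
        groups[path.casefold()].append(path)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical file list, modelled on the toy-language repo described above.
tracked = ["src/something.s", "src/something.S", "README"]
print(find_case_collisions(tracked))
# [['src/something.S', 'src/something.s']]
```

In practice the list could come from the output of `git ls-files`, for example as a CI check.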

DSMan195276 · on May 13, 2021

That's actually an interesting one - for gcc at least `.S` indicates an assembly file that requires preprocessing via `cpp`, and `.s` does not. You could definitely wire it up such that a different extension indicates preprocessing (and there might be some others already), but the typical convention would run into this problem. Of course, you're not supposed to check in the preprocessed source, but you'd likely run into the problem at build time when it tries to create the preprocessed file with the same filename as the original, so it doesn't really solve the problem.

cesarb · on May 13, 2021

> but you'd likely run into the problem at build time when it tries to create the preprocessed file with the same filename as the original

Only if you use -save-temps, which is rarely used (it's there mostly for when you want to debug the preprocessed code). Otherwise, in the common case where you are going from the .S to the .o, the intermediate .s is saved somewhere in /tmp (without -pipe) or exists only in memory (with -pipe).

DSMan195276 · on May 13, 2021

Yeah that's a good point, I didn't really think about it too hard but usually you just go straight from the `.S` to the `.o` and more-or-less skip the intermediate.

kalleboo · on May 13, 2021

If you care about Windows you also need to watch out for filenames like aux, con, prn, etc

Joker_vD · on May 13, 2021

Yep, checking out NetBSD or Minix 3 on Windows runs into this exact problem with the directory "external/mit/xorg/lib/xcb-util/aux". Thankfully, no NetBSD or Minix users need X anyway, right? :)
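For what it's worth, a check for this is short. The sketch below assumes the commonly documented set of reserved DOS device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) and assumes that a name is still reserved when it carries an extension; the helper name is made up for illustration.

```python
# Commonly documented reserved device names on Windows (an assumption of this
# sketch; check the platform documentation for the authoritative list).
RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def reserved_components(path):
    """Return the path components whose base name is a reserved device name."""
    hits = []
    for part in path.replace("\\", "/").split("/"):
        # "aux.c" is treated like "aux": only the text before the first dot counts.
        stem = part.split(".", 1)[0].rstrip(" ").upper()
        if stem in RESERVED:
            hits.append(part)
    return hits


print(reserved_components("external/mit/xorg/lib/xcb-util/aux"))
# ['aux']
```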

nanis · on May 13, 2021

I've run into `.C` files a few times since the 90s. They were associated with projects managed by developers who were actively hostile to the notion that people might do development on Windows and they knew what they were doing.

Joker_vD · on May 13, 2021

Over the last 7 years I've seen many small or hobby projects on GitHub that started non-portable, and there would always be an issue/comment about "how do I compile it on my OS?". Now, it's completely anecdotal and recalled from memory, but the reaction of Windows-based developers to "doesn't compile on Linux" was generally "OK, I'll try to take a look", while the reaction of Linux-based developers to "doesn't compile on Windows" was generally "switch to Linux or try to compile with Cygwin but honestly, idgaf lol".

Have you ever seen a Windows program that tries to create an "etc" directory in the root of C: and put its config files there? I have, and it was "ported" from Linux.

SAI_Peregrinus · on May 13, 2021

Windows is expensive. Linux is free. Now that MS offers free trial Windows VM images for developers it's less of an issue, but traditionally it's been significantly cheaper for a Windows dev to set up a Linux system than for a Linux dev to set up a Windows system.

jolmg · on May 13, 2021

> Now, it's completely anecdotal and recalled from memory, but the reactions of Windows-based developers to "doesn't compile on Linux" was generally "OK, I'll try to take a look",

This is just my prejudice, I haven't actually checked, but I think most projects that have support for both Linux (or Unix-y OSes) and Windows generally start with Linux/Unix and then have Windows support added. I think the opposite is rare.

Maybe things changed? I think most Windows developers of, say, 10 or 15 years ago might not even have known what Linux was, or would have heard of it only as some obscure technology used by few. In other words, I would have expected a Windows developer to respond to "doesn't compile on Linux" the same as they would for e.g. "doesn't compile on Plan 9", or whatever is still very obscure relative to Windows nowadays.

nanis · on May 13, 2021

I used to post on test failures in Perl modules when building `perl` with MSVC on Windows. Just two[1] examples[2].

Also, there is this gem[3]:

> I still run into modules that try to create temporary files in the root directory of my C: drive. That usually happens due to the script clearing the environment and not saving temporary directory locations. This is an unfortunate interaction with File::Spec->tmpdir which defaults to trying to write to the root (hey, Windows 95 allowed it!) of the current drive if it can’t locate the customary directories. I think File::Spec->tmpdir ought to croak if the environment does not contain one of TMP, TEMP, or TMPDIR, instead of offering C:\system\temp or C:\temp or /tmp or / on Windows. Regardless of File::Spec’s behavior, scripts, modules, etc should not delete those environment variables.

[1]: https://www.nu42.com/2014/12/yeah-you-put-me-in-my-place-rea...

[2]: https://www.nu42.com/2014/11/fixing-hard-coded-file-path-in-...

[3]: https://www.nu42.com/2018/03/dont-complicate-things.html

rjsw · on May 13, 2021

They are used in CDE [1] and ET++ (an X11 toolkit that Erich Gamma worked on before Design Patterns).

[1] https://en.wikipedia.org/wiki/Common_Desktop_Environment

bit-hack · on May 13, 2021

I've honestly never seen .C used in the wild. I imagine anyone who's written a bit of portable code would immediately realize this is a bad idea.

jsrcout · on May 13, 2021

Haven't really seen it in the wild, but I see it every day in a very large proprietary codebase I work in.

captainmuon · on May 13, 2021

I only know it as an extension for "interpreted C++ scripts" as used by CERN's ROOT. It is considered good practice to make them compilable but the interpreter used to be very lenient (before they integrated it with clang) so that usually didn't happen.

Anyway I would never rely on the C compiler invoking the C++ compiler; I always write g++ or $CXX. I wonder if there is a downside to that.

kwk1 · on May 13, 2021

Just for an example, the OpenFOAM codebase has 'em.

andai · on May 13, 2021

This is because the Windows filesystems (in backwards compatibility with DOS) are not case sensitive, so the compiler doesn't distinguish between .C and .c

Someone · on May 13, 2021

Nowadays, they’re case-preserving, so the compiler _could_ distinguish the two.

Historically, though, lowercase characters weren’t allowed in file names (even though the on-disk format would have allowed it), so compilers couldn’t make the distinction. Given that using “.C” has fallen out of fashion (if it ever was fashionable), I don’t see much pressure to add that functionality, especially given that it might break compilation of C source code copied over from old times that uses .C.

karatinversion · on May 13, 2021

I've had similar fun on macOS, where IIRC the project had both an API.h and an api.h file.

jagged-chisel · on May 13, 2021

Is this something that happens because someone thinks C/C++ is one language? I often see this single mythical language referred to in job postings, blog posts from folks with a range of experience in the field, and even in neophytes who got terrible information from their Java instructor at university. But I don't think I've ever come across it in an actual codebase.

I have seen .cc for C++ and it annoys me, but seems rather common.

nanis · on May 13, 2021

>> Is this something that happens because someone thinks C/C++ is one language?

No[1]:

> C++ source files conventionally use one of the suffixes `.C`, `.cc`, `.cpp`, `.CPP`, `.c++`, `.cp`, or `.cxx`; C++ header files often use `.hh`, `.hpp`, `.H`, or (for shared template code) `.tcc`; and preprocessed C++ files use the suffix `.ii`. GCC recognizes files with these names and compiles them as C++ programs even if you call the compiler the same way as for compiling C programs (usually with the name gcc).

[1] https://gcc.gnu.org/onlinedocs/gcc/Invoking-G_002b_002b.html
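To make the clash concrete, here is a toy model of suffix-based language detection. The C++ suffix set is copied from the GCC text quoted above; the function and its `case_sensitive` switch are hypothetical and are not how GCC or MSVC are actually implemented. Folding the extension is just a stand-in for a tool that cannot, or chooses not to, tell `.C` from `.c`.

```python
import os

# C++ source suffixes as listed in the GCC documentation quoted above.
CXX_SUFFIXES = {".C", ".cc", ".cpp", ".CPP", ".c++", ".cp", ".cxx"}


def guess_language(filename, case_sensitive=True):
    """Guess 'c', 'c++' or 'unknown' from the file suffix alone."""
    ext = os.path.splitext(filename)[1]
    if not case_sensitive:
        # Model a tool that ignores case: ".C" and ".c" become the same thing.
        ext = ext.lower()
    if ext in CXX_SUFFIXES:
        return "c++"
    if ext == ".c":
        return "c"
    return "unknown"


print(guess_language("widget.C"))                        # c++
print(guess_language("widget.C", case_sensitive=False))  # c
```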

jagged-chisel · on May 14, 2021

This doesn’t speak to why the problem occurs. You didn’t even answer the question you quoted.

FearNotDaniel · on May 14, 2021

I don't work in C or C++ but I'm inferring from the context and other comments that it's actually because .c (lower case) is used to indicate C language code and .C (upper case) means C++ code. This then fails on Windows because the OS is not case-sensitive wrt filenames.

So to make the implied answer to the question explicit: no, it's because someone who knew the difference between C and C++ sought to distinguish between their source code files using a filename extension convention that becomes invisible on a different OS.

pjc50 · on May 13, 2021

They are incredibly similar, though. And they inter-link. It's easier to have one program that contains C and C++ code than it is to have one program that contains Python 2 and Python 3 code.

MaxBarraclough · on May 13, 2021

> They are incredibly similar, though.

Not even close. C++ is a vastly more complex language with a very different philosophy from that of C.

You might argue that C can very roughly be treated as a subset of C++, but this really is a very rough approximation. Which is to say, really, that it's wrong.

> And they inter-link

It's generally very easy to call C code from C++ code, yes. It is not easy to call C++ code from C.

pjc50 · on May 13, 2021

> C can very roughly be treated as a subset of C++

Well, 99% of the time this holds, and you can generally use the same tooling in different modes for both. And they share a preprocessor.

> It's generally very easy to call C code from C++ code, yes. It is not easy to call C++ code from C.

Significantly easier than almost any other pair of languages, though. Especially if you take a little care on the C++ side (extern "C") or use COM or similar.

Again, C is more compatible with C++ than Python 2 is with Python 3. But not the other way round.

MaxBarraclough · on May 13, 2021

> > It is not easy to call C++ code from C.

> Significantly easier than almost any other pair of languages, though.

I wouldn't say so. If you're using the features of C++ then you'd need to manually wrap it all to expose a C API.

Accessing C code from C++, Ada, Rust, Zig, Java, Python, C#, or just about any other language, would be easier than manually wrapping C++ to expose a C API.

(The LLVM compiler, written in C++, does this. It exposes a subset of its API as a C API. I get the impression it's no small task, or they'd expose the full LLVM library that way.)

> Especially if you take a little care on the C++ side (extern "C") or use COM or similar.

Agreed, but going the extern "C" route implies you're using C++ as a better C, rather than making full use of C++. I have to admit I don't know much about working with COM. It's vaguely like GObject, right? I imagine it must take quite a bit of work to expose a C++ API that way.

seba_dos1 · on May 13, 2021

"99% of the time" may be stretching it a bit when you have to prevent yourself from using modern and useful C features in order to get a C++ compiler to accept your codebase.

GrumpySloth · on May 13, 2021

> Is this something that happens because someone thinks C/C++ is one language?

Nobody thinks that. Everybody knows what it means. Picking on it is like picking on people referring to amd64 as x86.

pdpi · on May 13, 2021

Outside the world of people who do write C or C++, most people I talk to really do bucket the two together as if they were largely interchangeable or, at least, a lot more similar than they actually are.

GrumpySloth · on May 13, 2021

Bucketing them together is fine. There are technical and organizational reasons for that. When you're working with C++, you're almost guaranteed to need to deal with C as well, so jobs for "C/C++ developers" make perfect sense. There is also a certain level of expectation that future versions of both languages keep incompatibility between one another to a bare minimum.

I'd happily apply for a "C/Rust" job as well.

adrianmsmith · on May 13, 2021

Right. I mean, on one project I was working on, someone had to write some JavaScript and would always refer to that language as Java. He was a developer, literally using one of the languages, so should have known better. And Java and JavaScript have less in common than C and C++.

borodi · on May 13, 2021

People think that. amd64 and x86 today mean the same thing, C and C++ not quite. I've also seen HR people think that Java and JavaScript are the same thing.

captainmuon · on May 13, 2021

In the olden days around VS 6, Microsoft only had one "C/C++" compiler. Needless to say, it was not very standards-compliant, but C was still mostly a C++ subset, and the differences were anyway smaller than the deviations of MSVC from the C++ standard. So it made sense to just compile everything as C++.

bluGill · on May 13, 2021

AFAIK Microsoft has never had a C compiler. C is enough of a subset of C++ that you can use MSVC to build most programs, but MSVC doesn't officially support C to this day. C++ tries to bring in everything of the latest C standard, but sometimes that isn't possible, or C++ has a better way (better generally because of some issue that doesn't exist in C in the first place). The C standard committee is aware of C++, and tries not to break the ability of C++ to use new C standards, but nobody is perfect.

spacechild1 · on May 13, 2021

> AFAIK microsoft has never had a C compiler.

Well, MSVC compiles .c files as C, which I guess makes it also a C compiler? It supports most of C99 by now.

> MSVC doesn't officially support C to this day.

Oh, it does: https://docs.microsoft.com/en-us/cpp/build/walkthrough-compi...

kevin_thibedeau · on May 13, 2021

They had a standalone C compiler for kernel drivers.

Ballas · on May 13, 2021

Is that something people do? I have never seen a C++ file with a .c extension. I have seen plenty of C++ headers with .h extension, though.

rfoo · on May 13, 2021

I think it meant uppercase .C instead of lowercase .c as a C++ extension. Indeed it's pretty rare.

That being said, there are indeed projects using the lowercase .c extension for their "C++" files: gdb [1][2]

[1] https://github.com/bminor/binutils-gdb/blob/3e5fac07975a310c... [2] https://github.com/bminor/binutils-gdb/blob/master/gdbsuppor...

Ballas · on May 13, 2021

I see, thanks for clearing that up. (I have obviously also not seen the uppercase .C used before.)

jcranmer · on May 13, 2021

gcc does as well.

The use of .c for a C++ file is probably a project that decided, decades later, to switch from C to C++ and didn't want to go through the hassle of renaming all of their source code files to reflect the language switch.

hvdijk · on May 13, 2021

This is about C++ files with a .C extension, not a .c extension.

Semaphor · on May 13, 2021

Semi-related: There is now a way to set a case-sensitive flag for NTFS directories: `fsutil.exe file setCaseSensitiveInfo C:\folder enable`

Here’s an article about it: https://www.howtogeek.com/354220/how-to-enable-case-sensitiv...

jussij · on May 13, 2021

I would suspect using that option would only open up a bunch of other unexpected issues, only because Windows programs have been written expecting that case insensitive file system.

Flipping that setting might make the file system case sensitive, but those Windows applications would still be working as if they were running in a case insensitive world.

Semaphor · on May 13, 2021

Depends on the application I would think. If they do everything properly, they should let the file system handle it and not care themselves about case-sensitivity.

jussij · on May 13, 2021

The problems start whenever a program asks the user for a file name and then saves that raw user input.

For example, let's imagine some sort of batch processing application that runs user-defined scripts.

To define some batch process, the user provides the name of the script to be run, so they enter 'MyScript.py' even though the file lives on disk as 'myscript.py'.

That batch processing application then saves the user supplied file name 'as is' and everything works fine in default Windows file system mode.

However, once that file system option is flipped to make it case sensitive, the batch processing will suddenly fail.

As Windows applications don't have to deal with case sensitivity, they don't do things like check if the user has entered the name correctly in terms of case.

They just ask the file system if the file exists and likewise, Windows is not checking the case when it does that 'exists' check.
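One way an application could protect itself, sketched under the assumption that it is acceptable to list the directory: look up the spelling the file actually has on disk and store that instead of the raw user input. The helper name is hypothetical.

```python
import os
import tempfile


def find_on_disk_spelling(directory, name):
    """Return the directory entry matching `name` ignoring case, or None."""
    wanted = name.casefold()
    for entry in os.listdir(directory):
        # os.listdir reports names as stored, so this recovers the real case
        # even when the filesystem itself would accept any spelling.
        if entry.casefold() == wanted:
            return entry
    return None


with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "myscript.py"), "w").close()
    # The user typed 'MyScript.py'; save the on-disk spelling instead.
    print(find_on_disk_spelling(folder, "MyScript.py"))
    # myscript.py
```

Saving the returned name keeps the stored configuration valid whether or not the directory is later switched to case-sensitive mode.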

Semaphor · on May 13, 2021

> They just ask the file system if the file exists and likewise, Windows is not checking the case when it does that 'exists' check.

I haven’t tried it, but I would assume that’s exactly what happens. The file system, with the case-sensitive flag, should return "File does not exist".

Semaphor · on May 13, 2021

Yeah, I just tested this, works exactly as expected. After creating `Hallo.txt` and calling `File.Exists(@"C:\folder\hallo.txt")`, I get `false` with the attribute enabled, and `true` with it disabled

gpderetta · on May 13, 2021

The key being "properly". Many programs, especially games, fail under Wine because of case sensitivity and need workarounds.

Semaphor · on May 13, 2021

That’s a different problem though. It sounds like that’s programs calling Name.xyz when the file is called name.xyz

That’s probably also the reason why there is no direct way to enable case-sensitivity recursively ;)

rurban · on May 13, 2021

That was always possible using some obscure registry flag. Only people compiling unix projects used that in desperation.

Semaphor · on May 13, 2021

I looked it up. Assuming you are talking about the keys mentioned here [0], that is very different, as the `fsutil` command actually makes the normal Windows calls on NTFS behave with proper case sensitivity.

[0]: https://superuser.com/questions/266110/how-do-you-make-windo...

rurban · on May 14, 2021

No. I was referring to the Cygwin posix=1 mount option, and the 2 FILE_CASE_ flags of GetVolumeInformation(), which a minority used.

Before XP, flag 1 meant case-sensitive; now it means case-preserving, and flag 2 handles case-sensitive search.

Stackoverflow talks about NFS and something completely different.

beached_whale · on May 13, 2021

I had a fun time trying to figure out why a dependency project, my own, was not building via FetchProject in cmake. macOS, like Windows, is by default case-preserving but case-insensitive. My CMakeLists.txt file was named CMakelists.txt, and Linux ext4 is case-sensitive. CMake really shouldn't care, or should use all lowercase everywhere.

SAI_Peregrinus · on May 13, 2021

IMO filesystems should treat different characters as different. Case-insensitivity at the storage level is always a mistake. It's occasionally fine for search or display, but it's a lossy operation. Sadly legacy compatibility has kept this choice on a number of filesystems.

qalmakka · on May 13, 2021

Case insensitive filesystems are a horrible idea, because you have to take encodings into account and the Turkey test, which means that either your normalization is broken or something will be equal under en_US but not tr_TR.

Normalization should be done in the file manager or in a gui file picker, and not at the filesystem level.
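A small illustration of why the folding rule is not obvious, using the Turkish dotless ı (U+0131) and dotted İ (U+0130). Python's `str` methods are locale-independent, so this does not reproduce an en_US versus tr_TR switch; it only shows that two plausible "ignore case" rules already disagree with each other. The comparison helpers are hypothetical.

```python
def same_by_upper(a, b):
    """Compare names after upper-casing both."""
    return a.upper() == b.upper()


def same_by_lower(a, b):
    """Compare names after lower-casing both."""
    return a.lower() == b.lower()


dotless = "t\u0131tle.txt"  # 'tıtle.txt', with the Turkish dotless i
plain = "title.txt"

# Upper-casing maps both 'ı' and 'i' to 'I', so the names look identical...
print(same_by_upper(dotless, plain))  # True
# ...while lower-casing leaves 'ı' alone, so the names look different.
print(same_by_lower(dotless, plain))  # False

# Lower-casing the dotted capital İ yields two code points: 'i' + U+0307.
print(len("\u0130".lower()))  # 2
```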

SAI_Peregrinus · on May 13, 2021

Exactly. But Windows and MacOS were made with the US-centric view that letters of different case are identical (they're not, thankfully even ASCII didn't make that mistake), and now we're stuck with it due to large amounts of legacy software relying on that assumption.

beached_whale · on May 14, 2021

I agree, but I think that the tooling should be aware of case and handle it uniformly. On a case-sensitive system, the not-found error handling should look for names that differ only by case and warn that this might be the cause.

Ideally for the cmake tool, the filename should not have anything but lowercase letters too.
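Roughly what such a "did you mean" check could look like, as a sketch only: it works on a plain list of directory entries so it does not depend on the filesystem it runs on. The function names are hypothetical, the file names are illustrative, and this is not anything CMake actually does.

```python
def case_variants(name, entries):
    """Return entries that match `name` when case is ignored, but not exactly."""
    wanted = name.casefold()
    return sorted(e for e in entries if e != name and e.casefold() == wanted)


def not_found_message(name, entries):
    """Build a not-found error message, with a hint if only the case differs."""
    message = f"{name}: not found"
    near = case_variants(name, entries)
    if near:
        message += f" (did you mean {', '.join(near)}?)"
    return message


# Hypothetical directory listing with a mis-cased build file.
listing = ["CMakelists.txt", "README.md", "src"]
print(not_found_message("CMakeLists.txt", listing))
# CMakeLists.txt: not found (did you mean CMakelists.txt?)
```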

jmrm · on May 13, 2021

I don't see any good reason to use .C instead of .cpp, .Cpp, .cxx, .Cxx, or similar. Those two extra characters of information help compilers and other people a lot.

qalmakka · on May 13, 2021

I haven't seen a "big-C" file in years. Does any project that started after 1999 still use them? Anything I have ever read either uses .cpp, .cc, or more rarely .cxx.

rm445 · on May 13, 2021

It's mildly upsetting to me that the usual file extension for C++ isn't .c++, or for that matter .C++

Why shouldn't it be?

muststopmyths · on May 13, 2021

According to Wikipedia, "+" is a reserved (prohibited) character in old/classic FAT file names https://en.wikipedia.org/wiki/Filename#Filename_extensions

My recollection is that the first Unix C compilers (Cfront?) used ".cc" for C++ and Microsoft started using ".cpp", probably because of the above restriction.

Athas · on May 13, 2021

`.cpp` is very common. I never understood the point of `.C`; even ignoring issues with case sensitivity, it is too easy to visually confuse with `.c`, and doesn't even have much of a mnemonic connection to the name C++.

SloopJon · on May 13, 2021

Probably because "+" is a special character in most shells. You'd have to quote the filename every time you did anything with it.

Edit: morning brain fart. Actually, I guess "+" is not all that special.