> In error checks that detect “impossible” conditions, just abort. There is usually no point in printing any message. These checks indicate the existence of bugs. Whoever wants to fix the bugs will have to read the source code and run a debugger. So explain the problem with comments in the source.
But then the person RUNNING the program will only see this:
Abort trap: 6
And that's all the info you'll get from their bug report.
So please ignore this directive and always print a descriptive message, complete with file and line, and the values that led to the impossible situation. Then you can get helpful bug reports like:
BUG: flush.c:51: buff_offset (65535) must not be greater than 20!
Abort trap: 6
The core dump contains information for fault analysis.
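A minimal sketch of the kind of check that produces a report like the one above; the BUG_ON name and message format are mine, not from any GNU source:

    #include <stdio.h>
    #include <stdlib.h>

    /* Abort on an "impossible" condition, but say what happened first.
       Requires at least one argument after the format string (C99). */
    #define BUG_ON(cond, fmt, ...)                                \
        do {                                                      \
            if (cond) {                                           \
                fprintf(stderr, "BUG: %s:%d: " fmt "\n",          \
                        __FILE__, __LINE__, __VA_ARGS__);         \
                abort();  /* still dumps core for analysis */     \
            }                                                     \
        } while (0)

    /* e.g.: BUG_ON(buff_offset > 20,
                    "buff_offset (%u) must not be greater than 20!",
                    buff_offset); */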
There is another aspect: C and C++ are not memory-safe languages, so the bug may not be a purely logical one (i.e. it may be some kind of memory corruption). In these cases I actually prefer something like __builtin_trap instead of abort. Calling any code after an invalid invariant has been detected clobbers registers and may make it impossible to investigate the state at the time of the "crash". Some features of modern optimizing compilers make this even worse, such as abort being marked "noreturn".
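For comparison, a sketch of trapping in place (__builtin_trap is a real GCC/Clang builtin; the surrounding check mirrors the example above and is illustrative):

    /* Trap at the site of the failed invariant instead of calling
       abort(): no call sequence runs, so registers and stack stay
       intact for the debugger. */
    void check_offset(unsigned buff_offset)
    {
        if (buff_offset > 20)
            __builtin_trap();  /* emits e.g. ud2 on x86; the core
                                  points at the exact faulting site */
    }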
> When you want to use a language that gets compiled and runs at high speed, the best language to use is C. C++ is ok too, but please don’t make heavy use of templates. So is Java, if you compile it.
Back in the early days, this sentence was just "When you want to use a language that gets compiled and runs at high speed, the best language to use is C." -- full stop, with no concessions to C++ or Java.
So we switched from a path where all major desktop environments (OS/2, Mac, Windows, UNIX) were adopting C++ to a surge in C programming, as FOSS adoption started to gain steam.
So here we are now, about 30 years later, trying to fix the security inconveniences caused by this manifesto.
> So we switched from a path where all major desktop environments ... were adopting C++ to a surge in C programming, as FOSS adoption started to gain steam.
So you're saying FLOSS is responsible for the late-90s/early-2000s C++ hype slowly dying off?
> trying to fix the security inconveniences caused by this manifesto.
And you believe those projects chose C, solely because some random GNU document suggested they do?
If that document didn't exist they would have chosen what? C++98? Java? Ada?
Had GNU/Linux not taken off, then in that alternative universe Windows, BeOS, Mac OS, OS/2, and the commercial UNIXes (remember Motif++ and CORBA?) would have kept writing their software in C++, instead of caring about creating FOSS stuff in C.
GNOME vs KDE is a good example of that schism and language wars.
Having written software in the 90s: C++ was unusable for large-scale apps on commodity hardware. It was a neat toy. It had humongous compile times, and the runtime was suboptimal at best.
The choice was protracted language wankery with continuous (wrong) declarations of "Soon, the compiler will make it fast enough", or actually shipping software.
The balance started tipping in the early to mid-2000s. You could, if you were very careful, write decent-sized systems with good performance in C++ at that point, and the abstractions were starting to be worth it.
And I say that as somebody who enjoys C++, and has written code in it since the late 80s. Yes, on cfront. "Horses for courses" always has been, and always will be, the major driver for language adoption. That particular horse wasn't ready in the 90s.
> And I say that as somebody who enjoys C++, and has written code in it since the late 80s
As have I, and I say you don't seem to know what you are talking about. I was involved in writing very large Unix and Windows applications in the 1990s (starting in the late 80s), and had none of the problems you mention.
> Mac OS was a mix of Object Pascal (originally created by Apple), Assembly and C++.
That doesn't ring true at all.
Object Pascal was a rather short-lived project at Apple. It was only seriously used for the MacApp framework -- which was a separate product sold to application developers, not part of the core OS or development tools -- and was abandoned entirely during the PowerPC transition. Later versions of MacApp used C++.
The bits of source code I've seen for System 7 were primarily C and assembly, with some older code in (non-object) Pascal. I don't recall seeing any C++.
Yes, because while it isn't foolproof, due to copy-paste compatibility with C89, it at least offers better tooling for safer coding, provided one doesn't write "C with a C++ compiler".
Namely (a short sketch follows the list):
- proper string and vector types (most compilers allow enabling bounds checking anyway)
- stronger rules for type conversions
- reference types for parameters
- better tooling for immutable data structures
- memory allocation primitives, instead of getting sizeof wrong in a call to malloc()
- collection library instead of reinventing the wheel in each project
- RAII
- smart pointers
- templates instead of error prone macros
- namespacing (useful in large-scale projects, instead of prefix tricks)
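A loose sketch touching a few of the items above (the struct and names are mine; bounds-checked at() and make_unique are standard):

    #include <memory>
    #include <string>
    #include <vector>

    struct Packet { int id; std::string payload; };  // real string type

    int main() {
        std::vector<Packet> queue;            // library collection, not a
        queue.push_back({1, "hello"});        // hand-rolled linked list

        auto p = std::make_unique<Packet>();  // no malloc(sizeof ...) to get wrong
        p->payload = "world";

        return queue.at(0).id;  // at() is bounds-checked: throws, never corrupts
    }   // RAII: vector, string, and unique_ptr all clean up here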
I'd argue that without STL & C++98, C++ would've languished even longer. And with STL, it still took another 5 years for the compilers to be good enough.
> And with STL, it still took another 5 years for the compilers to be good enough.
This is understated, IMO. It wasn't really until 2004 or even later that we had high-quality support for C++98 in GCC. LLVM wasn't available yet. Heaven help you if you wanted to develop in C++ on OSX, since Apple's packaging of GCC was a total disaster. Step zero for developing C++ on OSX was "install GCC from FSF sources" for many years.
Even MSVC support was lagging. It wasn't until the Microsoft tools leadership got involved with C++11 that MSVC took standards support seriously. They were already prioritizing .NET in that timeframe.
Meanwhile, the big open-source desktop C++ libraries (Qt and WxWindows) still don't fully take advantage of the types and features in the standard library in 2021.
> Meanwhile, the big open-source desktop C++ libraries (Qt and WxWindows) still don't fully take advantage of the types and features in the standard library in 2021.
Of course not. C++ and shared libraries don't go along nicely. The STL is not really useful for cross-boundary interop, due to the fact that the C++ ABI is not stable.
Shipping libraries that leak STL types all over the place will only give you headaches.
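A sketch of the usual workaround: export a plain-C surface and keep the STL types internal (the widget names are hypothetical):

    #include <stddef.h>

    // Safe to export: plain C types, stable ABI across compilers/runtimes.
    extern "C" int widget_name(void *widget, char *buf, size_t buflen);

    // Risky to export: std::string's layout and allocator may differ
    // between the library's toolchain and the application's.
    // std::string widget_name(const Widget &w);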
The average C++ code base has as many segfaults as the average C code base.
Windows has more exploits than Linux. But you know all that, so I wonder why you keep making these statements, which are then upvoted by the "memory-safe" crowd.
I’m glad this doesn’t go unnoticed, and I share the observation. There seems to be quite some effort going into creating illusions of truth about C. Just a personal observation.
If you want to play that game, https://www.cvedetails.com/product/32238/Microsoft-Windows-1... shows Windows, indeed, having more CVEs, in spite of, AFAIK, not using C (substantially, at least). Of course, the real problem is that CVE counts may or may not mean anything when comparing systems with wildly different development models (FOSS/proprietary) used mostly in different areas (desktop / everything else).
I can't speak for the parent, but I've been using RAII and smart pointers in the 00s and it provided a lot of the benefits that got standardised with C++11.
RAII and smart pointers definitely were a thing in the 90s. I wrote lots of COM code using these techniques. According to Wikipedia, RAII was invented in 1984-89.
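A rough sketch of the kind of hand-rolled smart pointer that floated around 90s codebases (illustrative, not from any particular library):

    // Pre-standard RAII wrapper: deletes its pointee when it leaves scope.
    template <typename T>
    class scoped_ptr {
    public:
        explicit scoped_ptr(T *p) : p_(p) {}
        ~scoped_ptr() { delete p_; }          // RAII: release on scope exit
        T *operator->() const { return p_; }
        T &operator*() const { return *p_; }
    private:
        scoped_ptr(const scoped_ptr &);             // non-copyable,
        scoped_ptr &operator=(const scoped_ptr &);  // 90s style
        T *p_;
    };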
The average C++ codebase isn't from the 90s. On all the recent C++ polls the average language revision used is between C++14 and 17.
Besides, I'm pretty confident that more new C++ projects are created daily in 2021 than were created monthly at the peak of the 90s C++ craze. Just on GitHub, C++'s 6-7% share of repos means a few million recent C++ repos.
I've worked on several code bases that nominally are C++11 or 14. However they still contain a lot of code written by people still coding like it's the 90s.
We are discussing the sentence "The average C++ code base has as many segfaults as the average C code base.", and you said "You were using C++ >= 11 in the 90s/00s?", to which I answered that this was not the point, because the average C++ code base isn't from the 90s/00s.
> On all the recent C++ polls the average language revision used is between C++14 and 17.
Polls of hobbyist coders, or software houses? I would be surprised if most software houses had migrated to C++17 yet. TensorFlow is stuck on C++03, I think.
It does not mean that you're not using C++11. This macro is just a compatibility flag that lets your code work on old Linux distros that provide a C++11 compiler but did not want to rebuild their whole archive. It mainly means that std::string is implemented with copy-on-write instead of the small-buffer optimization.
In 1992, I was working on the Taligent project, probably the first major C++ operating system. (It failed.) I remember when the ARM (the Annotated C++ Reference Manual) came out: none of the compilers we had available could really do templates. Or namespaces.
Oh no... ROOT. It's a testament to the pure grit and gumption of thousands of poor undergraduates that particle physics can advance with this. Eons ago, I tried several times to help my then-girlfriend (you know how it goes: "hey, you have some kind of eng diploma?" Yes, SW engineering... "So you know C++?" Nobody 'knows' C++, but I can manage. "So here's what I'm trying to do, here are 3 other examples, please for the love of Wotan help") and I was baffled on how to do anything with it. I mean the core thing seems powerful enough, but trying to go off the beaten path (research, right?) was hugely frustrating... And I'd worked on 2 physics codebases of variable quality before. I didn't appear as competent as I'd hoped, and spent so much time helping, reading docs and code without understanding much of the design. This is the codebase that started my deep distrust of OOP, and especially OOP-as-a-mirror-of-the-real-world and inheritance-for-code-economy...
I worked with C++ & MFC in the mid-to-late 1990s, and smart pointers weren't an option. Maybe if you were doing something against MS's oddball APIs, but not in general programming, not even for mainstream MFC uses. And what was there was entirely non-idiomatic; it's like claiming C++ had garbage collection in the 1990s because you could bolt on Boehm's.
> So here we are now, about 30 years later, trying to fix the security inconveniences caused by this manifesto.
That sounds backwards to me. It wasn't a "manifesto" that caused that "surge", it was the actual software being written.
Free software beat the world. Free (system) software is overwhelmingly written in C. At least part of an argument like this needs to nod to the fact that free software written in C beat the world because... C was a better choice? Certainly it was in the 90's.
I've spent 10 years writing security critical C code. There's no problem writing secure code in C. You just have to stop being clever and prioritize security above "speed". Your code will probably be fast enough anyway.
If it's too slow, then you probably have an issue with which algorithm/data structure you chose and would have had the same issue in another language.
The biggest issue I have with C today is that you can't trust that your compiler actually generates code that is 1:1 with what you wrote. I'm not talking about UB here; your compiler can remove code/checks even when you don't invoke UB.
Then we have UB. I think UB should be removed from the spec completely. There probably was a point in time when leaving stuff as UB was the best option, but today speed is seldom the problem; correctness is.
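The canonical illustration of the complaint (a sketch; mainstream compilers at -O2 really do fold the first function to 0):

    #include <limits.h>

    /* Signed overflow is UB, so the optimizer may assume it never
       happens and delete this check entirely. */
    int will_overflow(int x) {
        return x + 1 < x;
    }

    /* A well-defined way to ask the same question: */
    int will_overflow_safe(int x) {
        return x == INT_MAX;
    }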
I no longer work as a C programmer, but I still love the language and I really enjoy writing C code. But I really would like to get rid of UB and have a compiler I can trust to generate code even when I turn on optimizations, and having optimizations off is not an option either, since it can generate broken code as well, so...
> having optimizations off is not an option either, since it can generate broken code as well, so...
Especially when you add this line, you're telling me that what you want is not to program in C but to program in a language which differs from C not in syntax but in semantics, and in otherwise vaguely defined terms at that [1]. And you're mad that compilers implement C instead of your not-C.
I find claims that you can safely write secure code in C hard to believe when you marry them with complaints about compilers not implementing not-C correctly. Especially given that virtually every new sanitizer and static analysis tool to find issues in C code manages to turn up issues in code that is rigorously tested to make sure it passes every known prior tool (e.g., SQLite).
[1] From prior experience, this tends to be best distilled as "the compiler must read the programmer's mind."
It is not possible to remove UB from the spec without turning everything into a PDP-11 emulator, which will sabotage performance on many platforms.
I also don’t believe that a single person on the planet can write a secure c program of meaningful complexity. Static analysis tooling has demonstrated that it isn’t up to the task of saving developers from themselves.
Written over many, many years by experts in the field, and the end result is at most somewhat complex, nowhere near the complexity of even the monolithic SPA of your favorite website.
I've got a strong background in formal verification. I do not believe that "formally verified" means "security bug free". In fact, I personally know researchers who have had vulns found in their formally verified code more than a decade after they completed the verification.
I agree, there are lots of outrageous claims out there about "provably secure software" which seem very dubious.
What are your thoughts on seL4? Is it really the breakthrough in the success of formal verification it is made out to be? Is there a way for users/administrators deploying it to verify the authors' claims themselves? Or is it too difficult? I am afraid it's the latter...
In 2004 Peter Gutmann, in his thesis/book, criticized the hype around the effectiveness of formal methods in computer security [1]. Has the situation changed?
> I've spent 10 years writing security critical C code. There's no problem writing secure code in C. You just have to stop being clever and prioritize security above "speed". Your code will probably be fast enough anyway.
Are there good examples of what you mean by this? From my own C++ experience, when dealing with C libraries and std::string types, I'll sometimes use the copying APIs [0] when passing around std::string::c_str(), because I find it easier than worrying about invalidating the returned pointer if the string is destructed or modified.
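A sketch of the hazard being described (the copying call here is POSIX strdup; any copying API works):

    #include <string.h>
    #include <string>

    const char *broken() {
        std::string s = "temp";
        return s.c_str();  // dangles: s is destroyed when broken() returns
    }

    char *copied() {
        std::string s = "temp";
        return strdup(s.c_str());  // the copy survives; caller must free() it
    }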
They might be thinking of cases like where C compilers can remove code that zeroes memory holding security-sensitive data. [0][1] The compiler is permitted to reason that this memory is about to be deallocated anyway, so the memset call can be elided.
I can't imagine why a C compiler would remove a non-trivial runtime check though (except undefined behaviour).
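A sketch of both the elision and one common countermeasure (the volatile-function-pointer trick; C11's Annex K also offers memset_s, where available):

    #include <string.h>

    void handle_secret(void) {
        char key[32];
        /* ... use key ... */
        memset(key, 0, sizeof key);  /* dead store: may be optimized away,
                                        since key is about to go out of scope */
    }

    /* A volatile function pointer the optimizer can't see through: */
    static void *(*const volatile memset_v)(void *, int, size_t) = memset;

    void handle_secret_scrubbed(void) {
        char key[32];
        /* ... use key ... */
        memset_v(key, 0, sizeof key);  /* this call survives optimization */
    }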
I'm more interested in the actual statistics for security vulnerabilities found in C vs. C++ programs, rather than theoretical benefits one language might have over the other.
I don't think there is a justification for the idea that 90s "C++" was going to be substantially different from C. You could call the C files C++ and be pretty much correct. Speculatively... the OSS community would have converged on the C part of C++. That is the well-understood part.
>>> Please don’t use “win” as an abbreviation for Microsoft Windows in GNU software or documentation. In hacker terminology, calling something a “win” is a form of praise. You’re free to praise Microsoft Windows on your own if you want, but please don’t do so in GNU packages. Please write “Windows” in full, or abbreviate it to “w.”
But they have removed some other guidance:
>>> Instead of abbreviating “Windows” to “win”, you can write it in full or abbreviate it to “woe” or “w”.
Encountering "woe32" in GNU things was what made me stop writing M$ as a teen. It was so embarrassingly petty that it made me entirely reconsider the idea of insulting nicknames for things I don't like.
This is exactly why I've never been on the GNU train: the radicality of RMS's writings and philosophies, paired with the subtly condescending and childish tone in all of the guides, licenses, articles, blog posts, etc.
Always interesting to see standards across different groups. The kernel's is also very interesting: very strict, with reasoning that not everyone may agree with, but it has proved to work well.
I think C is a beast to standardize in general due to its rogue history. Nowadays, when a language like Rust or Go is created, standards are released with the code via formatting tools that enforce consistency (which I'm all for).
The number of times I saw C code with inconsistencies within a single file (naming, spacing, you name it) from fellow students back when I was in college was insane.
Enforced code style standards are great, even if I don't agree with them. Go's public vs. private distinction is a good example. I'm really not a huge fan of using Pascal case vs. camel case to denote public and private (it's grown on me a bit, but I'm still not a fan), but I know the code is going to be reasonable (in looks) because Go is very picky about how code looks.
Unix utilities in the old days weren't all that great. One example I've posted about here before is how mv would refuse to move files across filesystem boundaries (because it's not a 'move', it's a 'copy and delete', so you had to use cp and rm instead).
Just a few weeks ago, I fixed a problem in an internal tool (that still also runs on an ancient Solaris 10 box) by switching from `awk` to `gawk`, since the former would briefly whine on stderr and quit whenever a line in its input would contain too many fields. A recent change had managed to break that undocumented barrier.
In this case there's at least no silent breakage involved, but the badly written shell script that called it did not bother to check for that condition, and a fair number of heads were scratched for a while as a consequence.
> Avoid arbitrary limits on the length or number of any data structure, including file names, lines, files, and symbols, by allocating all data structures dynamically. In most Unix utilities, “long lines are silently truncated”. This is not acceptable in a GNU utility.
... goes against MISRA C, which is certainly preferable in the domain I work in, embedded systems, because dynamic allocations all over the place are a recipe for CVEs.
A significant number of these CVEs are related to dynamic memory allocation (double-free, use-after-free).
Probably not all of them are the result of that piece of advice, and probably some of those memory allocations were necessary, but since this class of errors is common in C/C++, I believe it is really not a good idea to encourage people to point the gun right at their feet.
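A sketch of the embedded-flavored alternative: a fixed-capacity pool with an explicit failure path, i.e. exactly the "arbitrary limit" the GNU standard warns against (sizes and names are illustrative):

    #include <stddef.h>

    #define POOL_SLOTS 16
    #define SLOT_SIZE  256

    static unsigned char pool[POOL_SLOTS][SLOT_SIZE];
    static unsigned char used[POOL_SLOTS];

    void *pool_alloc(void)
    {
        for (size_t i = 0; i < POOL_SLOTS; i++)
            if (!used[i]) { used[i] = 1; return pool[i]; }
        return NULL;  /* exhaustion is explicit: no heap, no use-after-free */
    }

    void pool_free(void *p)
    {
        for (size_t i = 0; i < POOL_SLOTS; i++)
            if (p == pool[i]) { used[i] = 0; return; }
    }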
On a side note, please explain to me how this is end-user oriented in a system where the convention is that a program ends silently when everything went smoothly:
> In error checks that detect “impossible” conditions, just abort. There is usually no point in printing any message [...] Explain the problem with comments in the source.
If everything went smoothly, the program likely had some useful output (e.g. grep, awk, sed). If it failed, then I'd just run `coredumpctl gdb`? (And... abort isn't silent? Here's what I get if something aborts here: https://imgur.com/a/69eF73w)
Me too. I've never understood it. I just went back through the clang-format documentation, looked at some of the examples for the GNU style, and was reminded of how utterly unreadable the resulting code is.
I've not written a lot of Lisp (I understand it, I've just never had a need for it personally), so perhaps I'm ignorant, but I don't see how it's at all similar to Lisp. It just looks different for the sake of being different.
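For reference, GNU-style formatting (as clang-format's GNU preset approximates it) shapes code roughly like this:

    int
    gnu_style_max (int a, int b)
    {
      if (a > b)
        {
          return a;
        }
      else
        {
          return b;
        }
    }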
I never worked in a codebase that was formatted in this style, and it's surely weird and not at all what I am used to. But somehow, looking at the examples, I find it to be somewhat pleasant. That doesn't mean I would pick it as my standard, but I can see how one could.
I do prefer putting opening braces at the line end in all situations though (but that also rules out a lot of other styles).