... Except for the fact that printf/scanf use variadics, and the only reason they stopped being a constant source of crashes is that compilers started recognizing them and validating format strings / complaining when you pass a non-literal string as a format.
<format> is instead 100% typesafe. If you pass the wrong stuff it won't compile, and as {fmt} shows you can even validate formats at compile time using just `constexpr` and no compiler support.
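A minimal sketch of that difference, assuming a C++20 compiler with a working <format> (nothing project-specific):

#include <cstdio>
#include <format>

int main()
{
    int n = 42;
    // std::printf("%s\n", n);   // compiles (at best a -Wformat warning), garbage or a crash at runtime
    // std::format("{:s}", n);   // does not compile: the format string is checked as a constant expression
    std::puts(std::format("{}", n).c_str());  // OK, the argument type is deduced, no specifier needed
    return 0;
}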
As always, people are making more fuss about it than necessary. Code calling printf() with a constant format string literal is the class of code that you only have to run once to know it works. Many C++ programmers have always preferred printf() to e.g. std::cout because of its ergonomics. And they were right.
It's hard to take seriously people who try to talk down a pragmatic solution that's been around for probably 30-40 years.
I've definitely written bugs where the format specifier is wrong. Maybe you used %lu because a type happened to be unsigned long on your system, then you compile it for another system and it's subtly wrong because the type you're printing is typedef'd to unsigned long long there. Printing a uint32_t? I hope you're using the PRIu32 macro, otherwise that's also a bug.
This. I have corrected countless lines of code where people just used "%d" for everything, potentially opening a massive can of worms. `inttypes.h` is what you should be using 99% of the time, but it makes for very ugly format strings, so basically nobody uses it. Otherwise you should cast all of your integer params to (long long int) and use %lld, which sucks.
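For illustration, a minimal sketch of the two workarounds (the ugly-but-correct PRIu32/PRIu64 macros vs. casting everything to long long); variable names are made up:

#include <cinttypes>   // PRIu32, PRIu64 (also pulls in the fixed-width types)
#include <cstdio>

int main()
{
    std::uint32_t count = 7;
    std::uint64_t total = 9;

    // Correct but ugly:
    std::printf("count=%" PRIu32 " total=%" PRIu64 "\n", count, total);

    // The cast-everything-to-long-long workaround:
    std::printf("count=%llu total=%llu\n",
                (unsigned long long)count, (unsigned long long)total);
    return 0;
}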
Yes, this is annoying. Integer promotions can be annoying in general.
I'm often working with fixed size types, and still find myself using %d and %ld instead of PRIu32 etc most of the time, because it's simply easier to type. If unsure about portability issues it can help to cast the argument instead. But realistically it isn't much of an issue, printfs seem to be about 0.5% of my code, and >> 90% of them are purely diagnostic. I don't even recall the last time I had an issue there. I remember the compiler warning me a few times about mismatching integer sizes though.
I agree. Layers of legacy typedefs in the Win32 API always catch me off guard. Any large source base with lots of typedefs, it can be tricky to printf() without issue.
A common use case for cstdio/iostreams/std::format is logging. It's not at all uncommon to have many, many log statements that are rarely executed because they're mainly for debugging and therefore disabled most of the time. There you go, lots of rarely used and even more rarely 'tested' formatting constructs.
I don't want things to start blowing up just because I enabled some logging, so I'm going to do what I can to let the compiler find as many problems as possible at compile time.
So, how about it? I mean, I have code where that works exactly as expected, so I can "know it works" according to you, but I also have code where that blows up immediately, because it's complete nonsense. Which according to you shouldn't happen, but there it is.
I mean yes, I should have been more restrictive in my statement, but I'm sure you notice how we're veering more into the hypothetical / into programming language geek land. I had to look up %hhn because I've never used it.
(Have used %n years ago but noticed it's a fancy and unergonomic way to code anyway. In the few locations where printed characters have to be counted, just consider the return value of the format call!)
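For example (a minimal sketch, nothing project-specific):

#include <cstdio>

int main()
{
    // Instead of "%n", use the return value: the printf family returns the
    // number of characters written (or that would have been written, for snprintf).
    int written = std::printf("Hello %d\n", 42);
    std::printf("previous line was %d characters long\n", written);
    return 0;
}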
And btw. how is this a problem related to missing type checks with varargs? The only problem I see is that we don't know that those pointers are not null / the char-pointer doesn't point to a zero-terminated string. In other words, just the basic C level of type (un)safety.
Most issues with printf could be reported by static analysis, and modern compilers report them as warnings, which in my book must be immediately converted to errors. All other weird usages should be either banned or reviewed carefully, but they are very rare.
Also, std::iostream is riddled with bugs as well. Trying to print hex/dec in a consistent way is plain "impossible". Every time you print an int, you should in fact systematically specify the padding, alignment and mode (hex/dec), otherwise you can't know for sure what you are outputting.
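A small sketch of the stickiness being complained about here (nothing project-specific):

#include <iomanip>
#include <iostream>

int main()
{
    std::cout << std::hex << 255 << "\n";   // "ff"
    std::cout << 255 << "\n";               // still "ff" -- std::hex sticks around
    std::cout << std::dec << std::setw(4) << std::setfill('0') << 255 << "\n";  // "0255"
    std::cout << 255 << "\n";               // "255" -- setw() applied only once, setfill('0') sticks
    return 0;
}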
iostream _sucks_. I had to implement an iostream for zlib + DEFLATE in order to play ball with an iostream-based library, and I had to sweat blood and tears to make it work right, when a simple loop and a bit of templated code would have worked wonders compared to that sheer insanity of `gptr`, `pubsync`, ... The moment you notice that they have methods called "pubXXX" that call a protected "XXX" on the `basic_streambuf` class is the moment your soul leaves your body.
IOStreams is superbad, and thankfully <format> removes half of its usages, which were based on spamming `stringstream`s everywhere (stringstream is also very, very bad). It also inspired Java's InputStream nonsense, which has ruined the existence of countless developers over the last 30 years.
Sometimes the print statement is in untested sanity-checking error-case branches that don't have test coverage ("json parsing failed" or whatever). It's pretty annoying when those things throw, and not too uncommon.
Another case in C++ is if the value is templated. You don't always get test coverage for all the types, and a compile error is nice if the thing can't be handled.
"Type coverage" is pretty useful. Not a huge deal here I agree, but nice nonetheless.
Take as an example printf("%d\n", foo->x);. Assuming it compiles, but with no further context, what could break here at run-time? foo could be NULL. And the type of foo->x might not be an integer.
Let's assume you run the code once and observe that it works. What can you conclude? 1) foo was not NULL at least one time. Unfortunately, we don't know about all the other times. 2) foo->x is indeed an integer and the printf() format is always going to be fine -- it matches the arguments correctly. It's a bit like a delayed type check.
A lot of code is like that. Furthermore, a lot of that code -- if the structure is good -- will already have been tested after the program has been started up. Or it can be easily tested during development by running it just once.
I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
I'll even go as far as saying that it's easy to have errors slip in on refactors if there aren't good tests in place. But people are writing untyped Python or JavaScript programs, sometimes significant ones. Writing in those is like every line being a printf()!
But many people will go through great trouble to achieve an abstract goal of type safety, accepting pessimisations on other axes even when it is ultimately a bad tradeoff. People also like to bring up issues like this on HN as if it's the end of the world, when it's not nearly as big of an issue most of the time.
Another similar example like that are void pointers as callback context. It is possible to get it wrong, it absolutely happens. But from a pragmatic and ergonomic standpoint I still prefer them to e.g. abstract classes in a lot of cases due to being a good tradeoff when taking all axes into account.
> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.
A modern compile time type checked formatter would have prevented this mistake, you are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.
In fact C++ even shipped a pair of functions here. There's a compile time type checked formatter std::format, which almost everybody should use almost always (and which is what std::println calls), and there's also a runtime type checked formatter std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing, if you need one of those I have to admit nobody else has one today with equal ergonomics.
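A minimal sketch of the pair, assuming C++20 <format> support (the strings and values here are made up):

#include <format>
#include <iostream>
#include <string>

int main()
{
    // Compile-time checked: mismatched arguments are a compile error.
    std::cout << std::format("{} + {} = {}\n", 1, 2, 3);

    // Runtime checked: for format strings only known at runtime (e.g. loaded
    // translations); throws std::format_error on a mismatch.
    std::string user_fmt = "clicked {} times\n";
    int clicks = 5;
    std::cout << std::vformat(user_fmt, std::make_format_args(clicks));
    return 0;
}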
Thanks for the ad hominem, but let's put that into perspective.
My current project is a GUI prototype based on plain Win32/Direct3D/Direct2D/DirectWrite. It currently clocks in at just under 6 KLOC. These are all the format calls in there (used git grep):
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to Map() buffer");
fatal_f("Failed to compile shader!");
fatal_f("Failed to CreateBuffer(): %lx", err);
fatal_f("Failed to create blend state");
fatal_f("OOM");
fatal_f("Failed to register Window class");
fatal_f("Failed to CreateWindow()");
fatal_f("%s failed: error code %lx", what, hr);
msg_f("Shader compile error messages: %s", errors->GetBufferPointer());
msg_f("Failed to compile shader but there are no error messages. "
msg_f("HELLO: %d times clicked", count);
msg_f("Click %s", item->name.buf);
msg_f("Init text controller %p", this);
msg_f("DELETE");
msg_f("Refcount is now %d", m_refcount);
msg_f("Refcount is now %d", m_refcount);
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
fprintf(stderr, "FATAL ERROR: ");
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
snprintf(utext, sizeof utext, "Hello %d", ui->update_count);
snprintf(filepath, sizeof filepath, "%s%s",
int r = vsnprintf(m_buffer, sizeof m_buffer, fmt, ap);
int r = vsnprintf(text_store, sizeof text_store, fmt, ap);
snprintf(svg_filepath, sizeof svg_filepath, "%s", filepath);
That's theory and practice for you. The real world is a bit more nuanced.
Meanwhile I have 100 other, more significant problems to worry about than printf type safety. For example, how to get rid of the RAII based refcounting that I introduced but it wasn't exactly an improvement to my architecture.
But thanks for the suggestion to use std::format in that set of cases and std::vformat in these other situations. I'll put those on my stack of C++ features to work through when I have time for things like that. (Let's hope that when I get there, those aren't already superseded by something safer).
I've used `fmt` on *embedded* devices and it was never a performance issue, not even once (it's even arguably _faster_ than printf).
(OT: technically speaking, in C++ you shouldn't call `vfprintf` or other C library functions without prefixing them with `std::`, but that's a crusade I'm bound to lose - albeit `import std` will help a lot)
I noticed std::format and std::print aren't even available with my pretty up-to-date compilers (testing Debian bookworm gcc/clang right now). There is only https://github.com/fmtlib/fmt but it doesn't seem prepackaged for me. Have you actually used std::format_to_n? Did you go through the trouble of downloading it or are you using C++ package managers?
I'm often getting the impression that these "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
But I'm asking in earnest. Please also check out my benchmark in the sibling thread where I compared stringstream with stdio/snprintf build performance. Would in fact love to compare std::format_to_n, but can't be arsed to put in more time to get it running right now.
> my pretty up-to-date compilers
> testing Debian bookworm
Debian and up to date compilers - pick one. <format> support comes with GCC 13.x, which was released more than 3 months ago. MSVC has had it for years now; LLVM is still working on it AFAIK (but it works with GCC). `std::print` is a new addition in C++23, which hasn't been released yet.
> Did you go through the trouble of downloading it or are you using C++ package managers?
I don't know of many non-trivial programs in C or C++ that don't rely on third party libraries. The C standard library, in particular, is fairly limited and doesn't come with "batteries included".
In general I've been using {fmt} for the better part of the last 5 years, and it's trivial to embed in a project (it uses CMake, so it's as simple as adding a single line to a CMakeLists.txt). It has been shipped by most distributions for years now (see https://packages.debian.org/buster/libfmt-dev - it was already packaged in Debian buster), so you can just install it using your package manager and that's it.
{fmt} is also mostly implemented in its header, with a very small shared/static library that goes alongside it. It's one repository I always use in my C++ projects, together with Microsoft's GSL (for `not_null` and `finally`, mostly).
> "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.
No, I think that insecure code is insecure, period, no matter how much it is used or well known. Such focus on practicality over correctness was the reason why my C university professor was so set on continuing using old C string functions which were already well known back then to be a major cause of CVEs. That was, in my opinion, completely wrong.
This is especially true in this case: {fmt}/<format> are nicer to use than `sprintf`, are safer, support custom types and are also _faster_ because they are actually dispatched and verified at compile time. Heck, the standard itself basically just merged a subset of {fmt}'s functionality, so much so that I've recently sed-replaced 'fmt' with 'std' in some projects and they built the same with GCC's implementation. `std::print`, too, is just `fmt::print`, no more no less (with a few niceties removed, AFAIK).
> where I compared stringstream with stdio/snprintf build performance
String Streams (and IOStream in general) are a poorly designed concept, which have been the butt of the joke for years for their terrible performance. This is well known, and I'm honestly aghast any time I see anyone using them in place of {fmt}, which has been the de-facto string format library for C++ for the best part of the last decade (at least since 2018) and is better than `std::stringstream` in every conceivable way.
If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Not practical for me to add a second of compile time for each file that wants to print something.
> Debian and up to date compilers - pick one.
gcc 12.3 was released only a few months ago and is included. gcc 13.1, some 80 days old, doesn't seem to have made it. Not everybody is closely tracking upstream. Immediately jumping on each new train is not my thing (hence why Debian is fine), nor is it how software development is handled in the industry generally.
Even on godbolt / gcc 13.1 which I linked in the other post, <format> isn't available. Only {fmt} is available as an optional library.
> {fmt}/<format> are nicer to use than
I think otherwise, but maybe you enjoy brewing coffee on top of your desktop computer while waiting for the build to finish.
> _faster_ because they are actually dispatched at compile time
I don't actually want this unless I'm bottlenecked by the format string parsing. If I have one or two integer formats in my format string, the whole thing will already be bottlenecked by that. So "dispatching at compile time" is typically akin to minimizing the size of a truck when we should have designed a sports car. The thing about format strings and varargs is that they're in fact an efficient encoding of what you want to do. It's not worth emitting code for 2-5 function calls if a single one is enough.
If there is a speed problem, you need some wider optimization that the compiler can't help you with.
Apart from that, that compile time dispatching doesn't actually happen with fmtlib in the godbolt, not even at -O2. The format string is copied verbatim into the source. Which I like.
> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
>> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.
> Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?
Duh, I apologize for not even reading your statement completely. So I went on this page and it is exactly how I imagined. libc printf 0.91s, libfmt 0.74s. 20% speedup is not nothing, but won't help when there is an actual bottleneck. (In this case the general approach has to be changed).
Also compiled size is measurably larger even only with a few print statements in the binary. Compile time is f***** 8 times slower!
These are all numbers coming from their own page -- expect to have slightly different numbers for your own use cases.
For snprintf(), how do you ensure that your format string and variadic arguments will not cause a crash at runtime? The C++ version is compile-time type safe.
I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream. If you don't like std::string, you can probably write your own ostream that operates on a fixed-size char buffer. (Can any C++ experts comment on that idea?)
About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
> I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream.
I don't know about those specifically right now, but in general these things have huge compile time costs and are also generally less ergonomic IMO. [EDIT: cobbled together a working version and added it to my test below, see Version 0].
> About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?
Yes. It's a mouthful, and I'm worried not only about static checks but about other things too -- like readability of errors, include & link dependencies, compile performance, amount of compiled code (which is minimal in case of snprintf/varargs)... I would need to check out std::format_to_n() as suggested by the sibling commenter.
And hey -- snprintf has been available for easily 30+ years ... while the <print> and <format> headers that people make such a fuss about don't even seem to be available in gcc or clang on my fully updated Debian bookworm system. The reason is that those implementations aren't complete, even though <format> is C++20. The recommended way to get those headers is to grab https://github.com/fmtlib/fmt as an external library... Talk about the level of hype and lack of pragmatism that's going on around here. People are accusing each other of not using a standard library that isn't even implemented in compilers... And in all likelihood they haven't used the external library themselves, and given that this library is external it's not heavily tested and probably still contains bugs, maybe CRASHES and SECURITY EXPLOITS.
But let me test C++ features that actually exist:
#if VERSION == 0
#include <iostream>
#include <streambuf>
#include <cstdio>    // fwrite, stdout
#include <cstddef>   // size_t
struct membuf : std::streambuf
{
    membuf(char *p, size_t size)
    {
        setp(p, p + size);
    }
    size_t written() { return pptr() - pbase(); }
};
int main()
{
    char buffer[256];
    membuf sbuf(buffer, sizeof buffer);
    std::ostream out(&sbuf);
    out << "Hello " << 42 << "\n";
    fwrite(buffer, 1, sbuf.written(), stdout);
    return 0;
}
#elif VERSION == 1
#include <sstream>
#include <iostream>
void test(std::stringstream& os)
{
    os << "Hello " << 42 << "\n";
}
int main()
{
    std::stringstream os;
    test(os);
    std::cout << os.str();
    return 0;
}
#elif VERSION == 2
#include <stdio.h>
int test(char *buffer, int size)
{
    int r = snprintf(buffer, size, "Hello %d\n", 42);
    return r;
}
int main()
{
    char buffer[256];
    int len = test(buffer, sizeof buffer);
    fwrite(buffer, 1, len, stdout);
    return 0;
}
#endif
CT=compile time, LT=link time, TT=total time (CT+LT), PT=preproc time (gcc -E), PL=preprocessor output lines
Bench script:
# put -DVERSION=1 or -DVERSION=2 as cmdline arg
time clang++ -c "$@" -Wall -o test.o test.cpp
time clang++ -Wall -o test test.o
time clang++ "$@" -Wall -E -o test.preprocessed.txt test.cpp
wc -l test.preprocessed.txt
My clang version here is 14.0.6. I measured with g++ 12.2.0 as well and the results were similar (with only 50% of the link time for the snprintf-only version).
For such a trivial file, the difference is ABYSMAL. If we extrapolate to real programs, we can expect build times 5-10x longer from such a general change in programming style. Wait 10 seconds or wait 1 minute. For a small gain in safety, how much are you willing to lose? And how much does this lost time and these resources actually translate into working less on the robustness of the program, leaving more security problems (as well as other problems) in there?
And talking about lost run time performance, that is real too if you're not very careful.
> For snprintf(), how do you ensure that your format string and variadic arguments will not cause a crash at runtime? The C++ version is compile-time type safe.
Honestly I just don't ensure it perfectly -- beyond running them once as described. I write a lot of code that isn't fully proofed out from the beginning. Exploratory code. A few printfs are really not a concern in there, there are much bigger issues to work out.
I also absolutely do have some printfs that were quickly banged out but that are hidden in branches that have never actually run and might never happen -- they were meant for some condition that I'm not even sure is possible (this happens frequently when checking return values from complicated APIs for example).
The real "problem" isn't that there is a possibly wrong printf in that branch, but that the branch was never tested, and is likely to contain other, much worse bugs.
But the fact that the branch was never run also means I don't care as much about it, pragmatically speaking. Likely there is an abort() or similar at the end of the branch anyway. It's always important to put things into perspective like that -- which is something that seems often missing from C++ and similar cultures.
The more proofed out some code gets, the more scrutiny it should undergo obviously.
Apart from that, compilers do check printfs, and I usually get a warning/error when I make a mistake. But I might not get one if I write my own formatting wrappers and am too lazy to explicitly enable the checking.
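For reference, this is roughly what "explicitly enabling the checking" looks like with the GCC/Clang format attribute; the wrapper is a hypothetical reconstruction in the style of the fatal_f calls quoted earlier, not the actual code:

#include <cstdarg>
#include <cstdio>
#include <cstdlib>

// format(printf, 1, 2): argument 1 is the format string, checking starts at argument 2.
// With this attribute, -Wformat diagnoses mismatches just like it does for printf itself.
__attribute__((format(printf, 1, 2)))
static void fatal_f(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    std::fprintf(stderr, "FATAL ERROR: ");
    std::vfprintf(stderr, fmt, ap);
    std::fprintf(stderr, "\n");
    va_end(ap);
    std::abort();
}

int main()
{
    // fatal_f("Refcount is now %d", "oops");  // with the attribute: warning, %d expects int, got char*
    fatal_f("Refcount is now %d", 1);
    return 0;
}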
Again, you're mixing functions up. `std::print` is the equivalent of `std::fprintf`; the one you want for writing into arbitrary buffers is `std::format_to_n`, which IS a strictly better version of `snprintf`.
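A minimal sketch of that usage, assuming C++20 <format>; buffer size and format string are made up:

#include <cstdio>
#include <format>

int main()
{
    char buffer[256];
    // Writes at most the given number of characters, never overruns, and reports
    // the untruncated length in .size (much like snprintf's return value).
    auto result = std::format_to_n(buffer, sizeof buffer - 1, "Hello {}", 42);
    *result.out = '\0';  // format_to_n does not null-terminate on its own
    std::fwrite(buffer, 1, static_cast<std::size_t>(result.out - buffer), stdout);
    return 0;
}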
I'm using C/C++, which do provide a good level of type safety.
And no, types are absolutely not my problem. In fact, rigid type systems are a frequent source of practical problems, and they often shift the game to one where you're solving type puzzles -- instead of working on the actual functionality.
Come on, dude. C provides almost no type safety whatsoever. There’s no point in saying “C/C++” here, because you won’t adopt the C++ solutions for making your code actually typesafe.
Type safety is not a standardized term, and it is not binary. Being black and white about things is almost always bad. One needs to weigh and balance a large number of different concerns.
A lot of "modern C++" is terrible, terrible code precisely because of failing to find a balance.
For basic things, sure. It gets much, much worse when you have to deal with different encodings in an application that needs to format and print things.
There are widely used languages that don't have a standard at all; literally every single thing you can do in them is above and beyond any sort of standard.
I mean, the function for straight printing is puts; I don't know why people keep using the much more complicated printf in cases where no formatting is involved.
Edit: OK, I guess puts includes a newline, so you'd need to use fputs if you don't want that (although this example includes one). Still, both of those are much less complicated than printf!
Consistency. Having intermixed puts and printfs throughout the code looks pretty bad. Also, every compiler replaces printf of a literal ending with \n with a puts anyway.
It is a very natural feature. Especially when you are writing mathematical code, e.g. implementing different types of numbers: automatic differentiation, interval arithmetic, big ints, etc.
Overloading gives user-defined types the expressiveness of built-in types. Like all features, if it is used badly (e.g. when + is overloaded to an operation which can hardly be interpreted as addition) it makes things worse. But you can write bad code in any language, using any methodology.
It is a very natural feature, but it makes discovering what you can and can't do with a library really hard. Learning what is and isn't legal with math libraries that use a lot of them can be really tricky. For example, numpy code is really easy to read, which is fantastic, but figuring out how you're intended to do things from the documentation alone is quite difficult.
In my experience numpy has also been one of the worst numerics libraries to deal with. The main reason is that Python seems designed to be hostile to numerics. Loose typing, assumptive conversions, specific numeric types being hard to access, tedious array notation, etc. are all bad preconditions for a language which sadly seems to have become the prototyping standard in that area.
The moment you have a language actually designed for numerics all these things vanish. One of Julias core design aspects is multiple dispatch, including operator overloading and it works extremely well there.
I also don't see the point for discoverability at all. The documentation will list the overloads and the non-overloaded calls are exactly as discoverable as the others.
As someone who has written math libraries over and over again for the last 25 years (I wish I was joking, but it turns out it is something I'm good at [1]), I find that operator overloading works only for the simple cases, but that for performance and clarity, function names work best.
Function names let you clarify whether it is an outer product or an inner product (e.g. there are often different types of adds, multiplies, divides), and I cannot stand when someone maps cross product onto ^ (you can both exponentiate and cross some vectors, like quaternions, so why use the exponent operator for cross?) or dot product onto something else that doesn't make any sense. Also, operator overloading often doesn't make memory management clear; rather it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result. Lastly, explicit functions allow you to pass additional information, like how to handle various conditions: non-invertible, divide by zero, etc.
I find word-based functions more verbose but significantly less error prone, and also more performant (because of the full control over new object creation). Operator overloading is only good for very simple code, and even then people always push it too far, to the point where I cannot understand it.
> rather it relies on making new objects constantly, whereas with explicit functions, you can pass in an additional parameter that will take the result.
It's not the same if you need to allocate memory for the result. If you could pass the result in by reference, then you could (re)use a buffer which has already been allocated. The difference is massive in things like matrix calculations or image processing where you have an inner loop or a real-time stream repeating similar calculations.
Or you are working with a language like JavaScript where math primitives are GC objects and thus quite costly. In those languages if you do not reduce object creation via reuse in this way, it can be very slow.
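A rough sketch of the difference (names and sizes are purely illustrative):

#include <cstddef>
#include <vector>

// A toy vector type, purely for illustration.
struct Vec
{
    std::vector<double> data;
};

// Operator style: readable, but every call allocates a fresh result.
Vec operator+(const Vec &a, const Vec &b)
{
    Vec out;
    out.data.resize(a.data.size());
    for (std::size_t i = 0; i < a.data.size(); ++i)
        out.data[i] = a.data[i] + b.data[i];
    return out;
}

// Function style: the caller passes the destination, so one buffer can be reused.
void add(const Vec &a, const Vec &b, Vec &out)
{
    for (std::size_t i = 0; i < a.data.size(); ++i)
        out.data[i] = a.data[i] + b.data[i];
}

int main()
{
    Vec a{std::vector<double>(1024, 1.0)};
    Vec b{std::vector<double>(1024, 2.0)};
    Vec scratch{std::vector<double>(1024, 0.0)};

    for (int frame = 0; frame < 1000; ++frame) {
        add(a, b, scratch);   // no allocation inside the loop
        // Vec c = a + b;     // would allocate a new vector every iteration
    }
    return 0;
}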
Perhaps you're arguing that you ought to be able to name new operators (like Haskell) so that you can create a new operator for inner product instead of having to use '^' (typically used for exp or xor).
Alternatively, the main reason to use operators here is infix notation, so perhaps Haskell-like backticks.
I think languages like Julia make a strong case the other way. You can literally write algorithms that match the pseudocode in a paper. You have to be OK with Unicode in your source file, but for numeric stuff, I think it's a nice feature.
Yeah it's a tiny bit clumsier, and prefix notation takes some getting used to. But on the plus side we avoid all the too-clever travesties programmers have inflicted on us with bad operator overloading decisions! On the whole I think it's easily worth the trade.
Again, no thanks. I want mathematical notation and I simply won't use any language without operator overloading. Free functions for common mathematical operations are an abomination.
Then you should probably use a language that lets you write DSLs for any given domain, rather than abusing operator overloading which just happens to work for a few subdomains of mathematics (e.g., you can't use mathematical conventions for dot product multiplication in C++). Anyway, I've never seen any bugs because someone misunderstood what a `mul()` function does, but I've definitely seen bugs because they didn't know that an operator was overloaded (spooky action at a distance vibes).
Actually, I'm quite happy with what C++ has to offer :)
Yes, the * operator can be ambiguous in the context of classic vector math (although that is just a matter of documentation), but not so much with SIMD vectors, audio vectors, etc.
Again:
a) vec4 = (vec1 - vec2) * 0.5 + vec3 * 0.3;
or
b) vec4 = plus(mul(minus(vec1, vec2), 0.5), mul(vec3, 0.3));
Which one is more readable? That's pretty much the perfect use case for operator overloading.
Regarding the * operator, I think glm got it right: * is element-wise multiplication, making it consistent with the +,-,/ operators; dot-product and cross-product are done with dedicated free functions (glm::dot and glm::cross).
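Roughly, with glm (assuming the glm headers are available; the values are made up):

#include <glm/glm.hpp>

int main()
{
    glm::vec3 a(1.0f, 2.0f, 3.0f);
    glm::vec3 b(4.0f, 5.0f, 6.0f);

    glm::vec3 ew = a * b;             // element-wise, consistent with +, -, /
    float     d  = glm::dot(a, b);    // dot product via a dedicated free function
    glm::vec3 c  = glm::cross(a, b);  // cross product likewise
    (void)ew; (void)d; (void)c;
    return 0;
}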
One never writes such an expression in serious code. Even with move semantics and lazy evaluation proxies it is hard to avoid unnecessary copies. Explicit temporaries make code more readable and performant:
auto t = minus(vec1, vec2);
mul_by(t, 0.5/0.3);
add(t, vec3);
mul_by(t, 0.3);
v4 = std::move(t);
I think there may be a misunderstanding here regarding the use case. If the vectors are large and allocated on the heap/on an accelerator, then yes, writing out explicit temporaries may be faster. Of course, this does not preclude operator overloading at all: You could write the same code as auto t = vec1 - vec2; t *= 0.5/0.3; t += vec3; t *= 0.3;
However, if the operands are small (e.g. 2/3/4 element vectors are very common), then "unnecessary copies" or move semantics don't come into play at all. These are value types and the compiler would boil them down to the same assembly as the code you post above. Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever; however, code is much more readable, as it matches actual mathematical notation.
> Many modern C++ codebases in scientific computing, rendering, or the game industry make use of vector classes with operator overloading, with no performance drawbacks whatsoever
I guess these people are all not writing "serious code" :-p
TIL Box2D must not be serious code because it doesn't use copious amounts of explicit temporaries[0].
And just for the record, I'm very glad Erin Catto decided to use operator overloading in his code. It made it much easier for me to read and understand what the code was doing as opposed to it being overly verbose and noisy.
> One never writes such an expression in serious code.
Oh please, because you know exactly which kind of code I write? I'm pretty sure that with glm::vec3 the compiler can optimize this just fine. Also, "vec" could really be anything, it is just a placeholder.
That being said, if you need to break up your statements, you can do so with operators:
auto t = vec1 - vec2;
t *= 0.5/0.3;
t += vec3;
t *= 0.3;
Personally, I find this much more readable. But hey, apparently there are people who really prefer free functions. I accept that.
Of course, the compiler or an advanced IDE can know what your code means. If all your identifiers were random permutations of l and I: lIllI1lI, your IDE would not mind either, but the code would be horrific, don't you agree? The point of the OP is that overloaded operators (and functions) make it harder to reason about the code for a human that reads it. At least for some people. At the end, everything is "just" syntactic sugar, but it makes a significant difference.
Exactly. If you don't care that the code is unreadable and you can rely on every human viewing the code through an IDE with symbol resolution (and not say, online code review platforms) and remembering to use said symbol resolution to check every operator, then operator overloading is great!
If editors were to implement it, you could navigate to the corresponding overload implementation or even provide some hint text. Just like they do for other functions.
Yeah, we would need editors and code review tools to not only follow overloads to their functions but also highlight that the operator is overloaded in the first place. Of course, this is quite a lot more work than just not overloading things in the first place (particularly since the benefit of operator overloading is negligible).
Dealing with money is important, even if it's only a small part of mathematics. I'll focus on that.
Python's 'decimal' module uses overloaded operators so you can do things like:
from decimal import Decimal as D
tax_rate = D('0.0765')
subtotal = 0
for item in purchase:
subtotal += item.price * item.count # assume price is a Decimal
taxes = (subtotal * tax_rate).quantize(D('0.00'))
total = subtotal + taxes
Plus, there's support for different rounding modes and precision. In Python's case, something like "a / b" will look to a thread-specific context which specifies the appropriate settings:
>>> import decimal
>>> from decimal import localcontext, Decimal as D
>>> D(1) / D(8)
Decimal('0.125')
>>> with localcontext(prec=2):
... D(1) / D(8)
...
Decimal('0.12')
>>> with localcontext(prec=2, rounding=decimal.ROUND_CEILING):
... D(1) / D(8)
...
Decimal('0.13')
Laws can specify which settings to use, for examples, https://www.law.cornell.edu/cfr/text/40/1065.20 includes "Use the following rounding convention, which is consistent with ASTM E29 and NIST SP 811",
(1) If the first (left-most) digit to be removed is less than five, remove all the appropriate digits without changing the digits that remain. For example, 3.141593 rounded to the second decimal place is 3.14.
(2) If the first digit to be removed is greater than five, remove all the appropriate digits and increase the lowest-value remaining digit by one. For example, 3.141593 rounded to the fourth decimal place is 3.1416.
... (I've left out some lines)
(3) Divide the result in paragraph (a)(2) of this section by 5.5, and round
down to three decimal places to compute the fuel cost adjustment factor;
(4) Add the result in paragraph (a)(3) of this section to $1.91;
(5) Divide the result in paragraph (a)(4) of this section by 480;
(6) Round the result in paragraph (a)(5) of this section down to five decimal
places to compute the mileage rate.
There's probably laws which require multiple and different rounding modes in the calculation.
This means simply doing all of the calculations in scaled bigints or as fractions won't really work.
Now of course, you could indeed handle all of this with prefix functions and with explicit context in the function call, but it's going to be more verbose, and obscure the calculation you want to do. I mean, it's not seriously worse. Compare:
But it is worse. I also originally made a typo in the function-based API for line5 where I used "decimal_add" instead of "decimal_div" - the symbols "/" and "+" stand out more, and are less likely to be copy&pasted/auto-completed incorrectly.
If overloaded parameters - "spooky action at a distance vibes" - also aren't allowed, then this becomes more rather more complicated.
Why have operators at all? If that notation is good enough, then you might as well use it for the built-in types too. We're halfway to designing a Lisp!
Sorry but please don't take Eigen (https://eigen.tuxfamily.org) away from me. Can't speak for others, but the scientific code I work on would become unreadable like that.
You read the code. And unlike operator overloading, you know at a glance exactly which implementation to look at. There is no spooky action at a distance.
and to know which `plus` is being dispatched, you need to know the types of both arguments, exactly the same as if `plus` is named `__add__` in python or `operator+` in C++.
They're functions. Whatever they do, my code will go execute some code from whatever library implements them, which is what a function does. I just want to be able to rely on [] being an array subscript when I read some unfamiliar code. Is that too much to ask?
Can IDEs detect this and offer "go to implementation" on an overloaded operator these days? Because besides the surprise element of there even being code to debug hiding somewhere, not being able to quickly navigate to it is much worse in my opinion. And with infix operators, where you can't even be sure which operand the implementation belongs to, figuring it out can be a bit of a detective task.
Yes (I’m a numerical analysis researcher and wrote a handful of ubiquitous mathematical packages)! The implementation of even primitive types can vary considerably! It’s way too much to hide. Nevertheless, I understand I’m biased. :)
In engineering practice, we often start using math without first consulting numerical analysts. It takes a long time to identify and fix the inevitable issues, which eventually becomes a lesson we have to teach students and practicing engineers because the field has accumulated so much historical baggage from doing it the wrong way.
As an example, early device models for circuit simulation were not designed to be numerically differentiable, leading to serious numerical artifacts and performance issues. Now we have courses dedicated to designing such models, and numerical analysis is used and emphasized throughout.
Is there anything today that you look at and think "yeah, they're gonna need to fix that at some point"?
Vectors and perhaps matrices are about the only valid use case I have ever come across, so I agree with GP that it's not worth it. And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever. I feel sorry for the developers who had to figure out what that was about after I left...
Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should you use . for dot! How clever that would be!
> And that's speaking as someone who once implemented a path class with a subtraction operator that would return the relative path between two absolute ones. I thought I was very clever.
Haha! It's ok. The temptation to be clever with operators is too strong, few can resist before getting burned (or more usually, burning others!) at least once.
> Ah, matrices! Does * mean dot product, cross product, or the usually less useful matrix multiplication? Ooh, or maybe you should you use . for dot! How clever that would be!
Why the snark? The fact that you're free to make a bad choice does not imply that having a free choice must be bad. Obviously neither dot nor cross product should be *. It should be the Hadamard product or matrix multiplication. You can choose one convention for your code and be perfectly happy for it.
As a follow-up question: How do you feel about languages like Fortran and Matlab then? Is it actually a good thing that mathematics convenience features are relegated to a few math-oriented languages and kept away from all the others? (Or are the linear algebra types in these languages offensive as well?)
The benefits from operator overloading are "I can show this to a mathematician and it looks like what they're used to". The downsides lurk in the corners of whether it's actually doing what you think.
In C++34 we'll finally have a way of overloading the empty string operator, so that we can, at last, write AB for the product of matrices A and B. As God intended.
Overloading is orthogonal to the issue you're striking at: infix operators versus postfix function calls. Functions can be overloaded just like operators.
What if you could type the asterisk to multiply vectors, but then your editor of choice would replace it with the symbol that actually means vector multiplication?
Perhaps that idea falls apart once you realize you would need hundreds of symbols for just addition…
But what if those symbols were (automatically) imported and named at the beginning?
Perhaps it would be annoyingly inconsistent how in various files different symbols are used for the same operation…
The idea is, operator overloading is a convenience feature. Why not have that convenience as an option in an editor, without influencing the language? If you want scalar multiplication to look the same as vector multiplication, set it in your editor. If you want to insert scalar multiplication with the same key you insert vector multiplication, set it in the editor (to figure out which you mean, based on context, when you press that key).
Just to be clear, I'm not being a smartass, just considering this as an option and wondering if the HN crowd has some thoughts on this.
That said, in my experience over the decades, operator overloading has been one of the primary causes of bugs that are very hard to pin down, so I have come to hate it. It hides far too much.
The cost/benefit ratio of operator overloading is generally unfavorable in practice, in my experience. Which is not to say it shouldn't be used when it actually clarifies things! But those situations tend to be fairly niche.
Interestingly, where I work right now, using operator overloading is specifically prohibited. So I'm confident that my dislike of the practice is not just a personal quirk.
That's literally the only place where operator overloading makes any sense.
In other places they just monkey patch C++'s deficiencies as a language.
And they are confusing and error prone.
Nobody is pretending we will ever get rid of any C++ syntax. So the discussion is about a hypothetical language syntax that fits C++'s slot.
In that world, C++ would have N x M matrices as native value types in the language (as Fortran does), and those operators would be defined in the language spec for matrix types just as they are defined for standard number types at the moment.
Not making such assumptions about what is correct or proper use is why C++ is so successful: it doesn't make assumptions, it leaves that up to the project / community using it.
Go ahead, make a language that dictates a lot and makes strict assumptions; it will be deprecated or forced to open up before the end of the decade.
Note: this is why I think Python and Lisp are so popular; metaprogramming is very powerful and expressive.
The fact that C++ has so many ways of defining things is not the reason it's so popular. The reason is the enormous industry investment in the language tooling and ecosystem. IMO the language itself is the worst part of the ecosystem, but the other parts create a totality that is the best development ecosystem in the industry for my niche (graphics and geometry), including libraries, compilers, debuggers & profilers, etc.
Any language with the level of industrial support C++ has had would have grown to prominence. C++ came about at a judicious time in history, when "object orientation" was becoming the latest buzzword. And now we have ended up with gazillions of lines of C++ code.
It's a tragedy of our trade that two mongrels - C++ and JavaScript - came to be among the most prominent languages in it.
But the reason C++ fits so many industries, from embedded systems to high-level GUI libraries, is its flexibility. We see the end of the OOP trend, but C++ does not lock its users into one paradigm or another, so it will continue to be an industry standard even as the industry moves towards other programming paradigms.
To add: JavaScript really only has one industry it's used in. I think that says a bit about its versatility.
That's a good observation about C++ not making assumptions, it strikes me as true. C++ apparently doesn't even make assumptions about what the C++ filename extension is: .h, .hh, .hpp, .hxx, .C, .cc, .cpp, .cxx, .ixx, .cppm
> That's literally the only place where operator overloading makes any sense.
That may be true for C++ (I'll take your word for it), but not for all programming languages in general. For example, in C# it's fairly common to overload == and != to implement value equality for reference types (classes).
Of course, you should really only do this for immutable classes that are mostly just records of plain old data. And C# 9 introduced record classes, which is a more convenient way of defining such classes. But record classes still overload these operators themselves, so you don't have to do it manually.
Honestly that sort of thing always confused me when I worked in Java, C#, etc. I could never tell at a glance whether the operator was doing an identity comparison or a value comparison, and I definitely contributed a few bugs from this misunderstanding. In Go, which lacks operator overloading, we write `ptr1.Equals(ptr2)` for value comparisons and `ptr1 == ptr2` for pointer comparisons -- in either case, there is no ambiguity and IME fewer bugs.
Java's the same regarding == and .equals(), and when it's Java code written by devs who also work in other languages, it definitely still results in bugs, sometimes ones that go undiscovered for a remarkably long time (particularly if == happens to return the right result in most cases). Meaning/needing to compare references for string (and similar) types is exceedingly uncommon, yet it uses the more "natural" syntax for testing equality.
FWIW I can't remember working with a codebase where unexpected behavior due to operator overloading was a serious problem.
Operator overloading is a useful feature that saves a bunch of time and makes code way more readable.
You can quibble whether operator<<() is a good idea on streams and perhaps C++ takes the concept too far with operator,() but the basic idea makes a lot of sense.
string("hello ") + string("world");
complexNumber2 * complexNumber2;
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
someSmartPtr->methodOnWrapperClass();
The majority of time in professional codebases is not spent on typing but reading and understanding code.
"saves a bunch of time and makes code way more readable"
Not when everybody defines their own operators.
Note - we are discussing operator overloading, not operators as features in syntax. Operators at the syntax level make life a lot easier. But then everybody uses the exactly same operator semantics, not some weird per-project abstraction.
The lines of code you wrote as examples are not saving anyone's time, except when writing them, if you are a slow typist and lack proper IDE support for C++. If typing speed is an issue, get a better IDE, don't write terser code.
Code is read more often than written. Writing code that can be understood at a glance (by using common, well understood operators) optimizes for readability.
I think your argument is basically "people should not aggressively violate the implicit bonds of interfaces", which is true. But that goes for all interfaces, not just and not in particular those around operators.
We just have cases where it's common with operators because those are one of the few cases where we have lots of things that meet the interface and interact directly as opposed to hierarchically. The same kind of issue comes up with co/contravariant types and containers sometimes, but that's less often visible to end developers.
I tend to agree with this. I like operator overloading for mathematical constructs (like complex numbers, or even just for conversions of literal types. Imagine, for example, you have a gram type and a second type: if you said 1g / 1s you'd get 1gps, which seems reasonable).
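A toy sketch of that kind of unit type (all names invented for illustration):

// Hypothetical unit types, just to illustrate the "1g / 1s -> 1gps" idea.
struct Grams          { double value; };
struct Seconds        { double value; };
struct GramsPerSecond { double value; };

// Dividing a mass by a time yields a rate; the type system tracks the unit.
GramsPerSecond operator/(Grams g, Seconds s)
{
    return { g.value / s.value };
}

int main()
{
    GramsPerSecond rate = Grams{1.0} / Seconds{1.0};
    (void)rate;
    return 0;
}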
I don't like it in the example given
for (int i : std::views::iota(0, 6)
| std::views::filter(even)
| std::views::transform(square))
What benefit does this have over the Java-ish/Rust-ish version that looks like this
for (int i : std::views::iota(0, 6)
.filter(even)
.transform(square))
?
No deducing what `|` means, you are applying the filter then transform function against the view.
People don't use the same operator semantics. Is + commutative? Does it raise exceptions or sometimes return sentinels? What type conversions might == do?
And how exactly do you propose library authors should work with user-defined types? Operator overloading is what allows algorithms to be efficiently generalized across types.
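For instance, a sketch of the kind of generalization meant here (types and names purely illustrative):

#include <iostream>

// A generic accumulate: works for any T that defines operator+ --
// built-in numbers, a user-defined Complex, a Vec3, a Matrix, ...
template <typename Iter, typename T>
T sum(Iter first, Iter last, T init)
{
    for (; first != last; ++first)
        init = init + *first;
    return init;
}

struct Complex
{
    double re, im;
};

Complex operator+(Complex a, Complex b)
{
    return { a.re + b.re, a.im + b.im };
}

int main()
{
    double xs[] = { 1.0, 2.0, 3.0 };
    Complex zs[] = { {1, 2}, {3, 4} };

    std::cout << sum(xs, xs + 3, 0.0) << "\n";    // 6
    Complex z = sum(zs, zs + 2, Complex{0, 0});   // {4, 6}
    std::cout << z.re << "+" << z.im << "i\n";
    return 0;
}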
The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
I think what people should advocate is full DSL capabilities with some unambiguous gate syntax so people know precisely that `foo * bar` is not using the host language syntax. Overloading operators is ambiguous and vastly incomplete (everyone is holding up matrix math as the shining example for the utility of operator overloading and you can't even express dot product notation in C++!)--it's a hack at best.
> The code isn't readable (you can't even reliably tell at a glance what the operator does) and it takes negligibly longer to write "add()" rather than "+" in your program (yes, 'add()' is more keystrokes and thus takes longer to type, but most of your program isn't addition instructions).
Except now you replaced + with a name that tells you just as much/little as + does. So you made your program verbose for the sake of verbosity.
No, you’ve made your program “verbose” (by a handful of characters) for the sake of clarity—there is no longer ambiguity about what code runs (of course, this assumes you aren’t similarly overloading named functions, which should also be disallowed).
That was me, but I didn't provide example code that would require namespaces. I don't understand how your earlier comment makes sense in the context of this thread.
Unfortunately, like spicy peppers, everyone's definition of "too much" is different. Some people are eating ghost chili peppers just fine while others are struggling with ketchup.
All the things you wrote could be about as easily written & much more easily read without operator overloading. Operator overloading only allows programmers to feel "smart" for doing a "clever" thing, to the detriment of future readers.
string("hello ").append("world");
complexNumber2.mult(complexNumber2);
// wtf is even going on with this one in your example? have these people never heard of method chaining?
for(int i : std::views::iota(0,6).filter(even).transform(square))
(*smartPtr)->methodOnWrapperClass();
That's all about the same verbosity, it's much more clear to the reader even if they're unfamiliar with your codebase, and dropping operator overloading eliminates the "clever" option to do stupid crap like divide file path objects together.
Would you advocate getting rid of operators altogether?
3.times(2).plus(7)
Some things just lend themselves to being expressed in terms of simple operators.
(*smartPtr)->methodOnWrapperClass();
That is still using the overloaded SmartPtr<>::operator*() method.
I understand the viewpoint that operator overload is syntactic sugar for things that can easily be done another way, I just disagree that costs outweigh the benefits.
> Would you advocate getting rid of operators altogether?
Of course not. It makes sense for built-in types, as everyone reading the code can be assumed to know them.
> That is still using the overloaded SmartPtr<>::operator*() method.
Good catch ;)
> I just disagree that costs outweigh the benefits.
Yah, I think that's the disagreement. My feeling is there's a teeny, tiny handful of appropriate places for it (almost entirely math) and it opens up a pandora's box of terrible decisions that programmers clearly find irresistible.
As a good thing or a bad thing? I see a.equals(b) occasionally from the "first argument is magic" crowd, but 3.times is novel here. I'm really unsure what the order of operations is for that expression.
“Fried shrimp should be removed from the all-you-can-eat Chinese buffet because I can't help eating at least 20 of them in a single sitting and now I have stomach cramps”
The very first Hello World program anyone learning C++ will write uses the godawful iostream bitshifting operators! Not even the language's authors could help themselves eating 20 fried shrimp on the first day the buffet was open!
The iostream bitshift overload was one of the first features of C++ that I learned to despise. I'm very happy that there's an alternative in the new version.
How far are you prepared to take this stance, exactly? C has operators that are generic over both integral and floating point types. Was that a mistake? Did OCaml do it better?
For my part, I've been persuaded that generic operators like that are a net win for math-heavy code, especially vector and matrix math. Sure, C++ goes too far, but there are middle grounds that don't.
Having operators defined for value types within the language spec is a different thing than defining operator overloading for arbitrary struct and class types.
For numeric value types mathematical operators are the only sane option.
For arbitrary classes - not so much.
A sane language in the slot of C++ in the language ecosystem would not have operator overloading. It would have matrix types defined in the language spec with mathematical operators operating on them.
One part of the philosophy of the language maintainers is that they're somewhat humble about their designs in the standard library, and very much against breaking changes.
Some folks prefer absl's flat_hash_map over std::unordered_map for a hash table, and it's not great that you need to choose or risk having both in a codebase, but it _is_ nice that you can have your preferred hash table and use operator[] whichever you decide.
Python also has operator overloading, and people seem to like that numpy can exist using it. And container types. Weirdly doesn't cause much consternation compared to C++ (maybe because the criticisms of the latter come from C programmers?)
I've occasionally missed overloading in JS/TS though.
> It would have matrix types defined in the language spec with mathematical operators operating on them.
This is unfortunately impossible (IMO). The problem is that matrices have multiple operations that don't translate nicely like complex numbers do. If you want to be consistent, you have to pick and choose what A * B means, under which contexts, and when it is illegal (or what should happen on an error).
For complex numbers, there's only one definition of A * B that matters and no failure cases.
I fear there's not clean way to do matrix operations that won't make some community really irritated for choosing "wrong". (Physics, engineering, science, etc.)
Operator overloading is critical for building ergonomic frameworks.
The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
> Operator overloading is critical for building ergonomic frameworks.
> The modern web is built on overloading the . operator (e.g. ORMs like Rails and Django). We will never see a Tier-1 ORM in Golang simply because it lacks it.
As I said, there won't be a Tier-1 ORM in Go. Ent or Gorm are tier-2 at best. They can get the job done, but it ain't pretty.
Any advantages of Go (and there are many) are outweighed by the fact that you have to write and read 2x more code to be equally productive as Rails or Django.
That sounds like a good thing, having dealt with Hibernate in production. As a backend developer, I'm pretty happy with C++17 (and beyond), Go and Rust. All of them can be used in fairly explicit ways, which means debugging a problem is easy, and any performance issues are right there on the page. I want less magic, not more.
Magic is magic until it becomes understood, then it is science.
While I don't want junior programmers wielding the dark magic of operator overloading, I trust that the engineers behind Django are using it reasonably.
I'll byte: complex number and matrix support is bad in languages without operator overloading. Why should only the primitive types of the language be privileged with proper math notation?
Not having operator overloading is anti-human. To think so highly of yourself that there is no other thing that can properly be the subject of the field operators (or other basic operators) is the height of hubris. The compiler typically must handle the operators on certain types due to the compilation target's semantics, but in reality, there's nothing special about these 'built-ins'.
Operators like +, -, /, *, etc have meanings independent of integers and floats and to not allow these meanings to be expressed is sad.
I've heard many programmers express this sentiment and what they actually are attempting to argue is that having overloads of these operators that do not respect the corresponding group, ring, or field laws is confusing. This I agree with. Operators should mainly follow the proper abstract semantics.
BS. I thought that Java already demonstrated to the world how dumb it is to disallow operator overloading altogether.
Allowing ANY operator to be overloaded was dumb, like C++ did, where you could do batshit crazy stuff like overloading unary & (addressof) or the comma operator (!), or stuff like the assignment operator (that actually opens a parenthesis about how copy/move semantics in C++ are a total hack that completely goes OT compared to this).
Sensible operator overloading makes a lot of sense, especially when combined with traits that clearly define what operations are supported and disallow arbitrary code to define new operators on existing types. Rust does precisely that, and IMHO it works great and provides a much nicer experience than Java's verbose mess of method chaining.
I'm on your side, but only after many years of being on the other side. I used to think they were "graceful" and "minimalist", and refused to acknowledge they can be the source of many surprises.
The Google C++ style guide has a very nice overview. There are only two pros listed, and large number of cons. And this document is old by Internet (dog) years -- at least 10 years.
Consider the humble + operator. In most compiled languages -- even those that don't support operator overloading -- it is in fact overloaded. int + int. long + long. float + float. double + double. pointer + int. Would every language be better with it?
Built in operators don't always map 1-1 to CPU instructions so don't appeal to that authority. There are still plenty of CPUs -- old and new -- without multiplication, division, or floating point support.
You could argue that there is just one type (tensor) with some invalid operations between its values (e.g., when dimensions mismatch). Just like integer division by zero.
I disagree, it’s heavily abused but very useful for types where it’s obvious what the operation is (inherently mathematical types like vectors and matrices). I wrote a macro library for C that does vector/matrix math in prefix notation with _Generic overloads, and it’s still too clumsy to get used to.
> where it’s obvious what the operation is (inherently mathematical types like vectors and matrices)
Considering there are like 3 different types of matrix multiplication operations, I don't think it's obvious at all. Feels like you should either use a language with complete support for implementing custom DSLs (that can express the whole domain naturally) or eschew ambiguous operator overloading altogether (gaining consistency and quality at the expense of a few keystrokes).
I think we all know what someone means when they say “matrix multiplication”. Asserting that * could mean, say, the Hadamard product or the tensor product is a reach. In practice I have never seen it mean anything else for matrices.
DSLs just push the complexity away from the language into someone else’s problem in a way that has much higher sum complexity. You’re making authors of numerical libraries second-class citizens by doing so. For some languages that’s probably not a bad choice (Go is one example where I don’t feel the language is targeted at such use cases).
Also, the lack of a standard interface for things like addition, multiplication, etc. means that mathematical code becomes less composable between different libraries. Unless everyone standardizes on the same DSL, but I find this an unlikely proposition, given that DSLs are far more opinionated than mere operator overloads.
I've never understood why people complain a lot about `std::cout << "string"`. If the problem is that this operator is used for bit shifting, simply stop thinking that way (genius, I know): do you think of addition when you see `string + "concatenate"`? Operator overloading is awesome when used correctly, like everything in programming; constructing paths with / is sweet, and I find << with streams visually appealing and expressive, it's feeding data to the stdout/file/etc, same for `std::cin >> var`, data goes from the stdin to the variable.
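For anyone who hasn't run into these overloads, a small sketch of the three uses mentioned above (standard library only):

```
#include <filesystem>
#include <iostream>
#include <string>

int main() {
    // operator/ on std::filesystem::path appends a path component.
    std::filesystem::path cfg = std::filesystem::path("/etc") / "myapp" / "config.toml";
    std::cout << cfg << '\n';

    // operator+ on std::string concatenates.
    std::string greeting = std::string("Hello, ") + "world";

    // operator>> reads from the stream into the variable, operator<< feeds data out.
    int n = 0;
    std::cin >> n;
    std::cout << greeting << ' ' << n << '\n';
}
```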
> do you think of addition when you see `string + "concatenate"`
Yes. And it tortures me every time.
I religiously avoid string concatenation in Python for this very reason. It's not that "+" necessarily means addition; it's that it always means a commutative operation (to somebody who has learned some algebra). String concatenation is notoriously non-commutative, thus it is extremely disturbing to write it using a visibly commutative operator. Any other operator except "+" would be better. For example, a space, or a product, or a hyphen. Whatever. But please, not a commutative operator. It breaks my brain parser.
It's also one of the biggest sources of bugs when dealing with loose types around strings and numbers.
When it comes to languages that let you mix strings and numbers, Lua has it right. + always adds, and accepts numbers and strings that can cleanly convert to numbers. .. always concatenates, and accepts numbers and strings.
Aside from the syntax (I find it ugly, you find it visually appealing - it's subjective), iostreams are inefficient, awkward to customise, not thread safe (allows interleaving), and mix format and data (that one is also subjective).
I love me some operator overloading. I love / for filesystem separators, I love | for piping things. I don't like << and >> so much but that's just because of too many years of writing them everywhere.
In C++, with its templates, there are only a few alternatives:
1. Operator overloading
2. Operator desugaring (e.g. __subscript__(), which substitutes the intrinsic function for basic types, but can also be defined for user defined types)
3. Writing templates with weird adaptors for primitive types.
Given that its design goal was to embed C, there were already operators that worked with various and mixed types. Adding (+.), etc., would have been unacceptable to the users. So, I think in general, for this language, it was good but, unfortunately, iostream made people think you should overload the behavioral expectation, too.
Its design goal came out of the author's experience of having to downgrade himself from Simula to BCPL; he wasn't keen on repeating the same experience with C when he went out to write distributed computing applications at Bell Labs.
Bjarne has given a couple of interviews on the matter.
The alternative GP is advocating for is "none of the above." Meaning "operators are only defined for primitive types" which is perfectly fine when working with C.
If you wanted to use an abstraction over non primitive types for such things you would use a normal function.
Built in types are special. The compiler needs to define operator precedence and certain semantics (commutativity, associativity, overflow, etc) that cannot be expressed in the type system or enforced by the interface.
The fact that it makes syntax "nicer" for user defined types is at best subjective and at worst an anti-pattern because it leads to bad compiled code and confused programmers. Function calls are unambiguous and follow the same rules as other function calls, while operator overloading does not.
Whatever you save in avoiding having to write "sum = add(l, r)" is not worth allowing programmers to bitshift file handles[1], divide file path objects[2], or subscript objects to launch a process[3].
C++ "bitshifts" (yes, but it's just a symbol in this context) make no real difference to me. It was type safe when it wasn't easy to be back in the day, everyone knows exactly how they work - it's never shot me in the foot (whereas the real meat of the design has, independently of how the composition is expressed syntactically).
The others are more sus but you can make a bad API out of anything.
Rewriting (say) a load of calculations as a tree of sum pow exp etc. is just a huge burden - a codebase I work on has a formula that takes up about 6 lines, for example, output by Mathematica: a total pain to translate to function calls.
> The others are more sus but you can make a bad API out of anything.
Yahhhh but there's something about operator overloading that is like catnip to clever programmers. Ah ha, file paths have slashes, and the divide operator is a slash! Clearly I should use the divide operator to append paths segments together! I'm so clever! Ah ha, << looks kinda like an arrow I guess, so we can use it to, uh, pass data between objects, I guess. I'm so clever...?
It's irresistible. We have abused it and we must give it up for the good of all.
Operator overloading has been a feature of many languages dating back to the very concept of using operator notation in programming. I know of no language that has the * operator dedicated solely to a single type. Typically you have at least signed and unsigned overload, as well as various bit sizes (including larger than the machine word size), and floating point representations. Extending that to vectorial operations, arbitrary precision, and others only seem to make sense and to be going with the flow...
Most programming languages use infix notation for mathematical operations but Polish notation for function calls. This creates an inconsistency. In languages like LISP that entirely use Polish notation, the inconsistency does not exist.
One could argue that if a programming language has this inconsistency, then one should at least try to be consistent with one's notation, i.e. for mathematical operations use infix notation (operator overloading).
Agree. It looks fun when I am writing the code and re-inventing abstract algebra and category theory types for classifying cat pictures. However then at some point I have to read someone else's code, even my own code weeks later and then I start cursing operator overloading.
Operator overloading is one of the cornerstones of generic programming in C++. And perhaps it is a failure of imagination on my part, but it’s difficult to think of a more elegant approach.
If you just need a nice print: fmtlib is a really nice c++23 style implementation without needing c++23 compiler support. Highly recommend it. It’s simple. It’s fast.
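A minimal sketch of what using {fmt} directly looks like (assuming the library is installed and on your include path):

```
#include <fmt/core.h>

int main() {
    // Same format-string syntax that std::format/std::print standardized.
    fmt::print("Hello, {}! pi is roughly {:.2f}\n", "world", 3.14159);
}
```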
I think Barry under-estimates how long it will be before C++ programmers actually get the equivalent of #[derive(Debug)] and as a result the impact on ergonomics. But of course I might be wrong.
This works on my RHEL9-compatible system for a .c file (using gcc). The type specifier for main is implicitly `int`. You get some warnings about implicit types and implicit declarations, but you get a binary that when executed writes "Hello, world".
Is there now in C++, after all these years, something like Python's f-strings, or at least something that comes close? If not, I'll remain in my disappointed state about C++.
Slightly off topic, but I recently learned that implementing the opposite of what you've asked for, bitshift stdout in python, is only a few lines of code:
If people don’t have time to keep up with a language's updates (which, in case of C++, is currently _once_ every three years), then they don’t have time to complain about the lack of features, either. This one had the time to complain and just didn’t want to bother typing "c++ string formatting", which would have been fewer keystrokes than the comment complaining.
On DuckDuckGo, the very first result for "c++ string formatting" is the exact thing this person was complaining about.
That about wanting to change mind... it touches a string! Somehow the let-downs must have been too much for me at a certain point in time. That being said, I'm curious to find out right now. Edit... no such thing found as string interpolation in cpp, at least not in my first 4 search hits. I'll crawl back.
- template-function parameters (NTTPs with a function parameter syntax rather than a template parameter syntax, tentatively spelled as `void foo(template int constant){}` )
- Scalable reflection, in combination with expansion statements (likely in C++26, spelled `template for (auto& e : array) {}` ) which would allow you to write an arbitrary parser for the template string into real C++ code. Reflection targets C++29.
Syntax2 already supports string interpolation as a special compiler feature.
That type of interpolation is something most non-scripting languages don't have anyway, and it took Python several decades to get it; it has only had it for the last 5 years or so.
I should have said the "latest standard", not "spec", if we're being technical. But EVERY bit of official material is very clear about asserting that C++23 is still a preview/in-progress, not a standard. Saying otherwise is, strictly speaking, incorrect.
And quite frankly, what matters to devs is what tooling supports the specification without special configuration, and the answer is "basically none". Not a single compiler fully supports it.
fmt has been available for years and it works with ridiculously old compilers. It’s great to have it standardized but it’s not a new capability that C++ didn’t already have.
My guess is you never had to parachute into a project using operator overloading in strange, inconsistent, and undocumented ways with no original maintainers to show you the ropes.
I actually like operator overloading, but overloading the shift operators for I/O was still a mistake IMO. It's a mistake even if you ignore that it's a theoretical misuse (I/O and binary shifting have nothing to do with each other semantically). The operator precedence of the binary shift operators is just wrong for I/O.
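A small illustration of the precedence problem; the first line compiles on common compilers (usually with a warning) and silently does the wrong thing:

```
#include <iostream>

int main() {
    bool ok = true;
    // Intended: print "yes" or "no". Actual parse: (std::cout << ok) ? "yes" : "no";
    // because << binds tighter than ?:. This prints "1" and discards the strings.
    std::cout << ok ? "yes" : "no";

    // What you actually have to write:
    std::cout << (ok ? "yes" : "no") << '\n';
}
```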
First, includes either need to be wrapped in angle brackets (for files included from the include path passed to the compiler) or quotes (for paths relative to the current file).
Second, the whole standard library would be huge to pull in, so it is split into many headers, even for symbols in the top level of the std namespace.
Something I've learned recently is that the convention of when to use angle brackets and when to use quotes is not prescribed by the standard but instead is implementation-defined.
#include is a preprocessor directive that substitutes the text of a file in place. import declares that this translation unit should link to a specified module unit. Usually there would only be a single translation unit for the entire program in the latter case, which obsoletes IPO/LTO (except when you have statically linked libraries), and means internal linkage functions (everything that is template, inline, or constexpr) do not have to be redundantly recompiled. That also means there would be no distinction between inline variables and static variables. This obsoletes unity builds and precompiled headers.
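For readers who haven't seen modules yet, a minimal sketch of the import side (file extensions and build flags are compiler-specific; the module name `hello` and function `greet` are made up for the example):

```
// hello.cppm -- a module interface unit
module;                 // global module fragment: ordinary #includes go here
#include <cstdio>
export module hello;    // everything below is attached to module 'hello'

export void greet() { std::puts("Hello from a module"); }
```

```
// main.cpp
import hello;           // no textual inclusion; the compiler consumes the compiled module

int main() { greet(); }
```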
Some C++ person recently wrote to the GNU Make mailing list about some grotesque experiments for supporting C++ modules inside GNU Make whereby GNU Make would communicate with some C++ compiler process over sockets and generate dependencies.
Any decent language with modules needs no make utility in the first place. You tell the compiler to build/update the program. The compiler compiles (if necessary) the interface definitions the program references, and in that vein recursively updates the whole tree, compiling anything that has changed; then links the program.
I didn't need any make utility when working with Modula-2 in 1990!
I’ve always wondered why we use make in the first place. Is it really so hard to write a python script that keeps track of file time stamps in a JSON? gcc can even be invoked with certain flags to print header dependencies. Make is crusty, archaic, and overdesigned.
C++ has namespacing which makes sense because this language has an enormous amount of available 3rd party libraries and without name spacing you can't help stepping on each others toes.
There are two ways you might want to have this work anyway despite namespacing. One option would be that you just import the namespace and get all the stuff in it, this is popular in Java for example, however in C++ this is a bit fraught because while you can import a namespace, you actually get everything from that namespace and all containing namespaces.
Because the C++ standard library defines a huge variety of symbols, if you do this you get almost all of those symbols. Most words you might think of are in fact standard library symbols in C++, std::end, std::move, std::array, and so on. So if you imported all these symbols into your namespace it's easy to accidentally trip yourself, thus it's usual not to do so at all.
Another option would be to have some magic that introduces certain commonly used features, Rust's preludes do this, both generally (having namespaces for the explicit purpose of exporting names you'd often want together) and specifically ("the" standard library prelude for your Edition is automatically injected into all your Rust software by default). C++ could in principle do something like this but it does not.
The language is designed so that this is possible, although the current compiler does not do it. At one point, the compiler did all the file reading in parallel, but that was finally turned off because it did not significantly improve compile speed.
std::print comes from the <print> standard header. It's std::print rather than just print because, while you might want it in the global namespace, other people do not. For example, my code isn’t cli and doesn’t need to print to the cli, but perhaps I want to print to a printer or something else and have my own print function.
Leave un-namespaced identifiers to those that are declared in the current file and namespace everything else. If you really want, you’re free to add “using namespace std” or otherwise alias the namespace, but keeping standard library functions out of the global namespace as a default is a good thing! (In any language, not just C++)
> If you really want, you’re free to add “using namespace std”
You're free to, but I discourage the habit. It's more verbose to add the namespace:: prefix to symbols, but it sure does make it easier on the devs that have to work with the code later.
Oh, I’m completely with you on this and always prefix my namespaces. Occasionally I will alias long namespaces to short ones, but I never pull identifiers into the global scope and I really dislike when I see code online that does “using namespace” (unless it’s tightly scoped, at least). I’ve been prefixing std:: for years and won’t stop now, I like knowing where an identifier is coming from, which is extra important when you have multiple types of similar containers that you use for different performance characteristics (eg abseil, folly, immer versions of containers vs std containers)
C++ has been my main language for a very long time, but I've been a grumpy skeptic of C++ since around C++14 due to the language spec's total complexity. So I've mostly stuck with C++11.
But now that C++ has modules, concepts, etc., I'm starting to wonder if C++23 is worthwhile for new projects. I.e., the language-spec complexity is still there, but the new features might tip the balance.
I'd been thinking to walk away from C++ in favor of Rust for new projects. But now I might give C++23 a chance to prove itself.
C++ deserves its rep for complexity. But, it comes from a promise to avoid a version upgrade debacle in the style of Python 3. C++ promises that you will forever be able to interleave new-style code right into the middle of ancient, battle-tested old-style code.
To do that, it can only add features and never take them away. Instead, it adds features that deprecate the practice of PITA patterns that were common, necessary and difficult.
Like, SFINAE was necessary all over the place to make libraries "just work" the way users would expect. But, it is a PITA to write and a PITA to read. Now, constexpr if and auto return types can usually collapse all that scattered, implicit, templated pattern matching down to a few if statements. Adding those features technically made the standard more complicated. But, it made new code moving forward much simpler to understand.
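A small sketch of the kind of collapse described there: two enable_if-gated overloads versus a single function with constexpr if (the `describe` names are made up for the example):

```
#include <string>
#include <type_traits>

// Old style: SFINAE picks one of two overloads via enable_if.
template <typename T, std::enable_if_t<std::is_arithmetic_v<T>, int> = 0>
std::string describe_old(T) { return "number"; }

template <typename T, std::enable_if_t<!std::is_arithmetic_v<T>, int> = 0>
std::string describe_old(T) { return "something else"; }

// New style: one function, one if.
template <typename T>
std::string describe(T) {
    if constexpr (std::is_arithmetic_v<T>) return "number";
    else return "something else";
}
```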
Similarly: Before variadic templates, parameter packs and fold expressions, you had the hell of recursive templates. Auto lambdas make a lot of 1-off templates blend right into the middle of regular code. Deduction guides set up library writers to set you up to write
> But, it comes from a promise to avoid a version upgrade debacle in the style of Python 3.
There is a very wide middle ground between C++'s "Your horrific unsafe code from the 80s still compiles" and Python's "We changed the integer values common operations on strings return at runtime with absolutely no way to statically tell how to migrate the code".
In Dart, we moved to a sound static type system in 2.0, moved to non-nullable types in 2.13 (sound and defaulting to non-nullable!), and removed support for the pre-null safety type system in 3.0. We brought almost the entire ecosystem with us.
Granted, our userbase is much smaller than C++'s and the average age of a given Dart codebase is much younger.
But you can deprecate and remove old features without causing a decade of misery like Python did. You just need good language support for knowing which version of the language a given file is targeting, good static typing support, and good automated migration tooling. None of those is rocket science.
> But you can deprecate and remove old features without causing a decade of misery like Python did. You just need good language support for knowing which version of the language a given file is targeting, good static typing support, and good automated migration tooling. None of those is rocket science.
That's way easier to do in a naturally statically-typed language though.
Except there is only one implementation, with language design and implementation developed together.
C++, Java and .NET are now all living through "decade of misery like Python did", exactly because not everyone is jumping into the latest versions of the languages and ecosystem changes.
Isn't this a problem only if you manage your dependencies by source rather than "compiled"?
That is, in Java you may not be able to compile old code with new javac, but the class file format is still understood by the JVM. So your old .jar still works.
I believe it is the same in C++ with the .so. I don't know about .NET.
In Python however, I don't think the .pyc were compatible between Python2 and Python3.
The way the bytecode is executed, the standard library, possible changes in GC and JIT implementation, other 3rd party libraries.
For example, if the jar code has a call to Thread.stop() and is loaded in Java 11 or later, when it comes to actually call that method you will get a java.lang.NoSuchMethodError exception.
> To do that, it can only add features and never take them away. Instead, it adds features that deprecate the practice of PITA patterns that were common, necessary and difficult.
The result being that programmers have to learn every single one of those ways of doing things in order to read their coworkers' code. Give me Python 3 breaking changes any day.
When I learned C++, there was no auto. Range-based for loops didn't exist. The combination of those two were particularly nice because instead of writing:
for(std::vector<sometype<int>>::const_iterator it = my_container.cbegin(); it != my_container.cend(); ++it){
// use *it
}
You could use:
for(auto const& val : my_container) {
// use val
}
Which really paved the way for less hesitation around nesting templates and other awkward things.
Same goes for map with the introduction of structured bindings:
for(std::map<somekey, sometype<int>>::const_iterator it = my_map.cbegin(); it != my_map.cend(); ++it){
// use it->first, it->second
}
became
for(auto const& [key, val] : my_map) {
// use key, val
}
The introduction of std::variant made state a lot easier to reason about. Unions meant too much room for error, and cramming polymorphism into places it shouldn't have ever been was riddled with its own performance and memory issues.
std::optional was also a game changer in that regard.
constexpr was huge. It killed off a bunch of awkward template metaprogramming for something faster to compile and easier to deal with. I wasn't a part of the consteval discussions but from what I HEARD people say about it and what it actually WAS, I think it was castrated at some point along the way.
Concepts, also a big one.
Formatting (RIP iostreams) is much nicer than the alternative.
Ranges, coroutines, modules - they'll have their day too.
And we have a bunch of stuff to look forward to. Senders and receivers solves a lot of component interaction problems. Reflection will rid us of so much annoying provisioning of classes. Pattern matching and expression statements will be major usability boosts.
Ok, C++ doesn't move at the speed of newer languages, but for its size, complexity and all of the hurdles that it has on the way to standardisation, I think modernisations are many and often successful.
I would absolutely love a major break that fixed things we can't currently fix due to ABI risks, implementer resistance, etc. I'd love to rid us of the old ways™ and focus on safety the right way™ - by removing bad bits rather than just adding good bits and telling people to use them and follow a bunch of guidelines. I'd love better defaults.
There is so much work to do to make C++ great again puts on MCPPGA hat, but I can't agree that modernisations are few and rarely successful.
Simply listing the features of C++ makes it sound good until you actually try to use them and realize they don't quite compose and have little holes here and there, exactly because it's actually 3 languages from 3 different eras trying to mix together.
For example, we have std::variant, yay! So where's the pattern matching? The workarounds make me vomit at a glance.
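For contrast, here's roughly what visiting a std::variant looks like today, using the usual `overloaded` helper, which is itself not part of the standard library:

```
#include <cstdio>
#include <string>
#include <variant>

// The customary helper for building a visitor out of lambdas.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;  // guide not needed since C++20

int main() {
    std::variant<int, std::string> v = std::string("hi");
    std::visit(overloaded{
        [](int i)                 { std::printf("int: %d\n", i); },
        [](const std::string& s)  { std::printf("string: %s\n", s.c_str()); },
    }, v);
}
```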
Most serious C++ developers will tell you that C++14 was basically a bug fix for some oversight in the C++11 spec. You should probably use it if you can if that’s the standard you’re happy with.
C++ 17 is the sweet spot for me. It has most of C++ 11/14, with many new features I use a ton: constexpr, <charconv>, <string_view>, [[maybe_unused]], if constexpr (which replaces a lot of SFINAE), and more. (And this was considered a small release!?)
I guess I’m just right at home and therefore a bit reluctant to jump into C++20. That and the constantly changing “””correct way of doing things.”””
Really the cognitive load of modern Rust is no less than C++$RECENT in my experience. Both require a ton of prima-facie concepts before you can productively read code from other developers. Of en vogue languages, Zig is the only one that seems to view keeping the "metaphor flood" under control as a design goal, we'll see how things evolve.
But really, and I say this from the perspective of someone in the embedded world who still does most of his serious work in C and cares about code generation and linkage: I think the whole concept of these Extremely Heavy Systems Programming Languages is not long for this world. In modern systems, managed runtimes a-la Go/Swift/JVM/.NET and glue via crufty type-light environments like Python and Javascript are absolutely where the world has ended up.
And I increasingly get the feeling that those of us left arguing over which monstrosity to use in our shrinking domain are... kinda becoming the joke.
> Really the cognitive load of modern Rust is no less than C++$RECENT in my experience. Both require a ton of prima-facie concepts before you can productively read code from other developers.
Eh. I don't know if I agree. I've worked on a few large C++ codebases, and the cognitive load between Rust and C++ is incomparable.
The ownership/borrowing stuff is complexity you deal with implicitly in other systems languages; here it's just more explicit and semi-automated most of the time by the compiler.
In C++ the terse thing is never the correct thing. If I'm using metaphors: a Rust sentence, in C++ usually has to be expressed through one or more paragraphs. The type system is so loose you have to do "mental" expansions half the time (or run the build and pray for no errors, or at the very least that the error is somewhat comprehensible[1]).
There's some low-level stuff that can be ugly (for various definitions of ugly), but that's every language. The low-level bits of async are a bit weird, but once the concepts "click" it becomes fairly intuitive.
At least the ugly parts are cordoned behind library code for the most part, and rarely leak out.
I guess it could just boil down to familiarity, but it took me much less time to familiarise myself with Rust than it took me to familiarise myself with C++. We're talking months vs years to consider myself comfortable/adept at it. Although, maybe just some C++ or general programming wisdom transferred over?
[1]: This happens in Rust too, I must admit. But it's usually an odd situation that I've encountered once or twice with some very exotic types. In C++ it's the norm, and usually also reported with bizarre provenance
C++ errors are so bizarre that most of the time I use the compilation result as a simple binary value. Some part of one's mental neural network learns how to locate many errors simply through a "it doesn't work" signal.
IMO the cognitive load of Rust is different in practice just from basic nuts-and-bolts things like having an ecosystem of libraries that culturally emphasizes things like portability and heavy testing, a standard package manager, and an existent module system (C++26 fingers crossed). I dislike Cargo but it's incredibly effective in the sense that any Rust programmer can pick up another project and almost all of the tools instantly work with no change. I mean, Rust is frankly most popular in the same space Go/Swift et cetera are, services and application layer programming, where those conveniences aren't taken for granted. I see it used way more for web services and desktop/application services/command line/middleware libraries than I do for, like, device drivers or embedded kernels. Those are limited enough in number anyway.
Really, the ergonomics of both the language features and standard tooling meant it was always going to appeal to people and go places C++ would not, even if they in theory are both "systems languages" with large surface areas, and they overlap at this extreme end (kernels/embedded/firmware/etc) of the spectrum that little else fits into.
I don't write embedded software, but I see Rust breaking into many domains outside of systems and winning developer mind share everywhere. Seemingly fulfilling its promise of being able to target low-level and higher-level software domains (something Chris Lattner wanted for Swift, but it hasn't yet materialised on the low-level side, we'll see).
I'm in doubt of this statement. Rust has been here for 17 years, and its market share is still less than 1% (per Google).
It does have a lot of developers saying good words about it whenever there is a chance in recent years, but the reality is that it's hard for most developers to learn, and because of that it might remain a niche language for systems programming.
C++ is not standing still; I feel that since C++20 it has become a really interesting modern language, and I hope its memory safety concerns will be addressed over time at a faster pace. In fact, if you use rule-of-zero, RAII, smart pointers etc. correctly, your code can be very much memory-safe already.
> It does have a lot of developers saying good words about it whenever there is a chance in recent years, but the reality is that it's hard for most developers to learn, and because of that it might remain a niche language for systems programming.
On what basis is this "reality" based? Recent survey of 1000 developers by Google denotes "[the] ramp-up numbers are in line with the time we’ve seen for developers to adopt other languages"[1]
> In fact, if you use rule-of-zero, RAII, smart pointers etc. correctly, your code can be very much memory-safe already.
That's not what memory safe means. Memory safe means that even if you make a mistake, you cannot get a memory error, something that is true of Rust minus the `unsafe` superset. Even with what you describe, you can trivially get UB on C++ if you: keep a reference to an object, use a smart pointer after move (something that the compiler will not warn against), call .front on an empty vector, dereference an iterator that got invalidated by pushing in your vector, use a non thread-safe object in a multithreaded context. Rust will statically prevent a memory error from occurring in any of these situations.
If you're cautious, you may not get memory errors in C++, but that's not "memory safe".
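A few concrete examples of the kind of thing meant above; each commented-out line compiles without errors in "modern" style and is still undefined (or unspecified) behaviour:

```
#include <iostream>
#include <string>
#include <string_view>
#include <vector>

int main() {
    std::vector<int> v;
    // int x = v.front();            // UB: front() on an empty vector

    std::string s = "hello";
    std::string moved = std::move(s);
    // std::cout << s[4];            // s is in a valid but unspecified state

    std::string_view sv = std::string("temporary");
    // std::cout << sv;              // UB: the temporary string is already gone

    std::vector<int> w{1, 2, 3};
    auto it = w.begin();
    w.push_back(4);                  // may reallocate...
    // std::cout << *it;             // ...leaving 'it' dangling (UB)
}
```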
Also, as a C++ developer turned Rust developer, C++20 does nothing to capture my interest back. The module system is atrocious (orthogonality between modules and namespaces, way too complex for what it wants to achieve, module partitions, seriously?, poor tool support 3 years after the standard was published), there are no ergonomic sum types and pattern matching, no standard package manager (and the legacy compilation model based on textual inclusion makes build systems more complex than they could be), and no improvements on thread safety in sight (to be fair, you pretty much need your language to feature a borrow checker and be built around thread safety; it cannot be bolted on). I have little hope C++ is capable of addressing these issues in the next 6 years.
Actually if you look back to 8 and 16 bit home computers, with C, C++, Object Pascal, BASIC, Modula-2, full of inline Assembly, in many cases that being the biggest code surface, it is quite similar.
Nowadays C and C++ took the role of that inline Assembly, with higher level systems languages taking over the role of the former.
Sure. The point was more that Big System Engineering (in any particular domain, frankly including things like OS kernels), going forward, just won't be happening much in Rust or C++. And languages that large make for very poor "inline assembly", where C obviously does quite well and Zig seems to have a reasonable shot.
I mean, literally yesterday I had to stop and write a "device driver" (an MMIO firmware loader for an on board microcontroller) in Python by mapping /dev/mem. That's the kind of interface and development metaphor we'll be seeing in the longer term: kernel driver and interface responsibilities will shrink (they are already) in favor of component design done in pedestrian languages like Go or whatever that mere mortals will understand. There's no market for a big C++ or Rust kernel framework.
I think the future of these languages is largely as a target for code generation and transformation. Their legacy tooling-unfriendly designs is what's slowing this transition.
>I'd been thinking to walk away from C++ in favor of Rust for new projects. But now I might give C++23 a chance to prove itself.
Modules were a great step forward for C++, but one of the features I enjoy the most about Rust is that adding a library just works. Especially for single person projects, which are on the smaller side, it feels absolutely great having to do near zero build management.
I've always been aware of C++ (obviously!) but it seemed impenetrable to me coming from my experience with C# and JavaScript. So I was really pleasantly surprised when I tried it out a while ago using the C++20 standard: it felt entirely usable! But then I tried Rust, which felt like that plus a pile of even better stuff. It's been difficult to find a reason to go back since but I'm glad to see even more progress is being made.
There are a zillion reasons that C++ is still widely used (existing software, brand new software that relies heavily on existing libraries / game engines), so it's really nice to have lots of helpful features being added to the language and the standard library.
I'm ambivalent about Rust, but its best feature compared to C++ is a universal package manager and build system. I like vcpkg well enough, but it's not Cargo, it can't be.
The thing that really ticks me off about RUST is it has compiler optimizations that can’t be turned off. In C++, you can typically turn off all optimizations, including optimizing away unused variables, in most compilers. In RUST it’s part of the language specification that you cannot have an unused variable, which kills the language for me since I like having unused variables for debug while I’m stepping through the code.
> In RUST it’s part of the language specification that you cannot have an unused variable
I believe you're confusing Rust with Go. In Rust, an unused variable is just a warning, unless you explicitly asked the compiler to turn that warning (or all warnings) into an error.
Not true. Their debugger might display the value of the variables, which are unused outside the debugger. There are other options in Rust though, like logging. Create your "unused variable", and then use it in a debug log or dbg macro, etc.
If your development style demands uninitialized variables, you can set them to `= unimplemented!()` during development.
And as someone else said, if all you want is unused variables without warnings, you can say at the top of the root of the crate (`main.rs` or `lib.rs`): `#![allow(unused_variables)]`
It might be worth noting that unimplemented!() is a thin wrapper around panic!(), so if execution makes it to that line, your program will crash.
Rust actually does allow uninitialized variables, even in safe code. But it'll be a compiler error if you try to use them in any way before initializing them, so this is mostly just a curiosity: https://play.rust-lang.org/?version=stable&mode=debug&editio...
That would be even better, naturally there still exists the whole ecosystem issue.
In any case, taken to the extreme, the language won't matter any longer, beyond some druids with the knowledge to implement Star Trek like computing models.
The tradeoff is C++ is an amazing language for building a type system and zero overhead abstractions, where Rust is extremely limiting in that area (orphan rule, constant evaluating const-generics, no implicit pedagogically equivalent type conversions, users don't define what move semantics mean, terrible variadic programming, macros can't interact with CTFE or generics, functions/lambdas cannot be const-generics). Some of that will probably improve over time, though, although metaprogramming in Rust advances slower than C++ since 2014 and it has to play catch up already.
I learned C++ in the late 90s and didn't touch it until recently and all the new stuff basically melted my brain. It feels like a different language now.
Indeed. I was working in/on C++ back in the late 80s and the 90s and I decided a few years ago to do a project with it.
I took the approach of learning it as a brand new language that I knew nothing about, and especially avoiding thinking about what I already knew about C++ and especially C. The result was a language that yes, is a pleasure to work in.
Of course some context could not be ignored: there are about six ways to do many things because of back compatibility. But that back compatibility is why so many people use it. I write in one style, and just know the others for reading other people's code.
Yeah, that's me, I used C++ professionally throughout the 90s and early 2000s, then switched to C# (and C for embedded). Unfortunately, the C++ code bases I work with are equally old so I also have to look for a brand-new project to re-learn modern C++.
As someone who learned JavaScript in the late 90s I feel the same way sometimes! If I'd gone away from the ecosystem and returned recently I think it would feel extremely alien.
Hopefully clang++ can catch up to the new standard faster, as clangd the LSP is used in so many intellisense tools that depend on the clang++ implementation. Even though C++23 will compile fine with the newest g++, clangd will still complain about all that new syntax, probably for a few years ahead; at the moment quite a few C++20 features are still unsupported by clangd (which depends on clang++).
Alternatively, gcc could ship its own C/C++ LSP, but that is not the case.
Hoping compilers will get their C++20 modules implementation working well enough that we'll get C++23 standard library modules soon.
As an outside observer, it seems like progress is happening in spurts, but it feels kinda slow.
Last time I checked (which was roughly a year ago), even a toy project with modules slowed down IntelliSense to the point of it being unusable and having to be disabled. Do you know if it's better now?
For people who are proficient in other languages but haven't touched C++ since CS036 in undergrad, some 15+ years ago, what would be the best way to learn what "modern C++", both the code and the dev stack, looks like?
I'm a little over 10 years out from writing C++ professionally and I found this cheat sheet[0] useful. Basically if you have an inkling of the concept you're looking for, just search on that cheat sheet to find the relevant new C++ thing. Specifically for me, we used Boost for smart pointers which are now part of the stdlib, and threads are now part of the stdlib as well.
I don't really learn stuff in a structured way so this might not be helpful at all, but a youtube walk got me into watching CPPCon talks (https://www.youtube.com/@CppCon) and I found them generally useful for getting an idea of what's going on in C++ and what all the magic words to research are.
When a bunch of people talk about weird gibberish like SFINAE it becomes easy to figure out it's something you should search for on wikipedia (https://en.wikipedia.org/wiki/SFINAE). note: SFINAE is simply a way to make overloading work better by failing gracefully if an overload fails to compile.
There's a series of talks called Back to Basics that seems to have quite a few talks every year where they discuss C++ features in general like move semantics or how testing works, etc. There have also been talks from the creators of CMake or the guys working on the recently added ranges proposal, so it does cover tooling as well.
I also quite enjoy following Jason Turner's C++ weekly series (https://www.youtube.com/@cppweekly) which also has quite a few episodes that are dedicated to new C++ features or new ways of handling old problems made available by the new features. They're generally quite short, each episode on average is 12 minutes.
Just looking down the list of videos I see this is also kind of a direct response to your question, from only 8 months ago.
https://youtu.be/VpqwCDSfgz0 [ C++ Weekly - Ep 348 - A Modern C++ Quick Start Tutorial - 90 Topics in 20 Minutes ]
For experimenting:
https://gcc.godbolt.org/ is a tool called compiler explorer, which is a really good place to experiment with toy code.
It's a site where you can write some code and let it execute, to see the results, as well as see what ASM it compiles down to for various compilers.
That last feature really helped me figure out whether the compiler really does pick up an optimisation I'm trying out. (and it's how I got really impressed by how powerful constexpr is (that's one of the new keywords))
For References:
Generally the https://en.cppreference.com site is a really well maintained wiki that has good explanations of every concept and standard library feature.
It sticks to the standard and has use examples and is heavily interlinked, as well as some concept pages to give an overview of certain topics like overloading, templates, etc. (they also have a SFINAE article https://en.cppreference.com/w/cpp/language/sfinae)
Deducing this seems like a drastic change to the language, not a minor incremental one. People will be doing CRTP with it without realizing or fully appreciating the consequences now.
When I was doing C++, one of my interview questions was an open-ended one: "std::cout << "Hello world!" << endl;" What exactly is this doing? Let's dive into how it works.
You touch on kind of a lot here, and it's pretty esoteric even to devs with 3-5 years experience: functors, operator overloading, namespaces, passing by reference to make the << chaining work; there is a lot you really have to know that is non-obvious. You can even jump into inheritance, templates and such if you really want to dive deep.
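A toy sketch of the chaining mechanics (a made-up `ToyStream`, not the real iostreams machinery): operator<< returns a reference to the stream, so `out << a << b` parses as `(out << a) << b`:

```
#include <cstdio>
#include <string_view>

struct ToyStream {
    // Each overload writes its argument and returns *this so calls can chain.
    ToyStream& operator<<(std::string_view s) { std::fwrite(s.data(), 1, s.size(), stdout); return *this; }
    ToyStream& operator<<(int i) { std::printf("%d", i); return *this; }
};

int main() {
    ToyStream out;
    out << "Hello " << 42 << "\n";
}
```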
I thought this was normal after doing C++ for ~11 years, but when I finally broke out of the Stockholm syndrome and started a job working in other languages, I find it absurd how complex, or maybe a better way to put it is, how much language specific trivia you need to learn before you can even begin to unravel how a basic hello world! program really works.
My favourite C++ question is asking what std::move does. It's amazing the knots people twist themselves into explaining how it supposedly works, when in reality it's just a type cast to an r-value so that the move constructor/assignment gets called instead of the copy one.
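A sketch of roughly what std::move boils down to (simplified; the real one is spelled slightly differently but does the same thing): nothing happens at runtime, only the value category changes:

```
#include <type_traits>

// A simplified stand-in for std::move: cast the argument to an rvalue reference
// so that overload resolution picks move constructors/assignments.
template <typename T>
constexpr std::remove_reference_t<T>&& my_move(T&& t) noexcept {
    return static_cast<std::remove_reference_t<T>&&>(t);
}
```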
If they do, there's no harm done. The new this inference stuff is just a brevity enhancement, yes?
You have to understand that it's a Sisyphean struggle to get people to use modern C++ features at all. You still see people passing around std::string instances by value and manually calling new and delete. They're ignorant of variants and think "final" is an advanced feature. I'm happy if they get as far as understanding even how to use CRTP.
There's a vast gulf in skill between the expert C++ practitioner who appreciates a blog post like the one linked and one who's been dragooned without training into writing some enterprise C++ thing. The latter relates to C++ as an inscrutable Lovecraftian god who might eat him slightly later if he makes the right cultist noises and does the right syntax dance.
There is yet another thing with deducing this: no more pointers-to-members:
```
class A {
    int f(this A& self, float b);
};
```
The type of &A::f is int (*)(A&, float), not int (A::*)(float).
This is huge for template metaprogramming around function deduction if you stick to it, because pointers-to-members generate a lot of template instantiations to cover all cases.
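A minimal example of that, assuming a C++23 compiler with deducing-this support (class and function names made up):

```
class A {
public:
    int f(this A& self, float b) { return static_cast<int>(b); }
};

int main() {
    // With an explicit object parameter, &A::f is an ordinary function pointer,
    // not a pointer-to-member.
    int (*fp)(A&, float) = &A::f;
    A a;
    return fp(a, 1.5f);
}
```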
But then you'll break if someone supplies a "real" member function. Is this even a big deal with std::invoke, which lets us treat regular functions and PMFs uniformly?
std::invoke, I am guessing, could itself instantiate quite a bit of code? Not sure though. But it is a template signature. And it is not the only problem with pointers to members: when doing signature introspection you need a ton of combinatoric template specializations. Getting rid of a family of instantiations by sticking to deducing this seems like an attractive approach.
I wouldn't call it harmless. The feature might just be brevity enhancement for CRTP (and I think it's incredibly well-designed for that), but the advertisement/education around it usually just mentions constness-agnosticism and code deduplication as the use cases, which are precisely the wrong cases for deducing this. CRTP was never the solution for those and that hasn't changed with the syntax -- because both of these have effects on code semantics, not just syntax. But I will bet most people using it will use it for those purposes, rather than as a briefer syntax for when they would've otherwise used CRTP.
It feels a lot like the push_back -> emplace_back thing, where the feature has important use cases, but plenty of people will use it merely because they're mistaught that it's the "modern way" to do things without being told it's a footgun that bypasses existing safety features in the language, and it becomes impossible to hold back the flood. And they get away with it the majority of the time, but every now and then they end up introducing subtle bugs that are painful to debug.
But hey it's the shiny modern hammer, so obviously you can't just let it sit there. People can't just see a modern hammer and let you watch it sit there until you actually need it. You have to use it (and people will tell you to use it during code reviews) or everyone will look down on you for being so old-fashioned and anti-"progress".
Probably that's also partly because they are being dumped into a large sprawling codebase already full of C++98 idioms. Even if you point them to the "newer sections" that are more modern, they will fall back to what they are working with all the time.
std::expected and a monadic interface for std::optional are very welcome changes. I've ended up with funky utility types to accomplish much the same thing in a couple projects, so an official interface is definitely nifty.
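A small sketch of both features in C++23 (the `parse_int` helper is made up for the example):

```
#include <charconv>
#include <expected>
#include <optional>
#include <print>
#include <string_view>
#include <system_error>

// std::expected carries either a value or an error code.
std::expected<int, std::errc> parse_int(std::string_view s) {
    int v{};
    auto res = std::from_chars(s.data(), s.data() + s.size(), v);
    if (res.ec != std::errc{}) return std::unexpected(res.ec);
    return v;
}

int main() {
    // std::optional gained and_then/transform/or_else in C++23.
    std::optional<int> n = 21;
    std::println("{}", n.transform([](int v) { return v * 2; }).value_or(-1));  // 42

    // std::expected has the same monadic interface.
    std::println("{}", parse_int("123").transform([](int v) { return v + 1; }).value_or(-1));  // 124
}
```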
I remember reading that clang was finally shipping std as a module, albeit experimentally. So this ought to be an interesting couple of years for C++ -- though I suppose it remains to be seen to what degree the ecosystem will keep up with all these changes vs using headers/exceptions/the traditional ways of doing things.
Does C++ yet have something similar to async/await?
I have a medium sized JavaScript codebase that uses this (http://github.com/bhouston/behave-graph) and I could really use a port to C++ so that it can be integrated with the USD project, who has expressed interest.
I couldn't find an equivalent to async/await in my searches so I am fearful that porting this across to C++ is non-trivial.
But I highly doubt you need anything like async/await for this kind of application. In fact, I'd go as far as to say async/await is almost never needed except for building networked services with truly massive performance demands.
If you genuinely have massive performance demands, stay far away from coroutines. For whatever reason the approach C++ took makes them incredibly memory hungry and inefficient.
You can look into asio or boost::asio (same thing more or less). These libraries add pretty good async behavior. Of course, you need to know a lot of C++ to understand and use them properly, and they're very verbose.
If you need to do things in parallel for performance, you may be better off spinning up a threadpool (like asio's thread_pool) and using a threadsafe queue or similar (i.e. asio's asio::post()) to pass messages and get results.
Since you can do it with JS, you likely don't have massive perf requirements, and you just need to write it in simple, synchronous C++ and you may find it to be well fast enough.
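A minimal sketch of the thread-pool approach mentioned above, assuming standalone Asio (with Boost.Asio the same names live in the boost::asio namespace and headers):

```
#include <asio/post.hpp>
#include <asio/thread_pool.hpp>
#include <atomic>
#include <cstdio>

int main() {
    asio::thread_pool pool(4);                      // four worker threads
    std::atomic<int> sum{0};

    for (int i = 1; i <= 100; ++i)
        asio::post(pool, [i, &sum] { sum += i; });  // queue work on the pool

    pool.join();                                    // wait for all posted work
    std::printf("%d\n", sum.load());                // 5050
}
```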
We also have coroutines, they're async/await. Haven't tried them though.
What are some personal projects you'd use C++ for? I have a hard time imagining any use cases for myself where it makes sense to use C++ over another language.
As someone who has never used c++, it seems like it's primarily useful when latency and performance really matters and the cost of that is ease of use when it comes to developer quality of life/productivity.
When creating things where such things matter, or when you need lower-level access to the platform, since there is a tight connection between the C and C++ compilers and the platform they (were) built for.
I'd like to play with many gadgets, small embedded toys. Let's say it is a custom USB mouse. You may want to create a data structure that represents 1:1 whatever data you're pushing to the wire, and know the exact amount of time it takes you to push it over the wire.
C++ lets me abstract things better than C. There is a level of control in C++ where you get to decide, down to the minute detail, how the compiler builds your software. With a C++ compiler you can tell the compiler to optimize one function, while not the other one in the same source file. Yes, the standard language abstracts the underlying machine and it will surprise you, since compilers obey it very well if you don't tell them otherwise. But all of the C++ compilers, including the niche ones, also let you pierce through those abstractions via special attributes and linker directives. So you can have almost-assembly code next to very abstracted and crazy-optimized code.
C++ doesn't care how you build software; it is up to you to decide which kinds of libraries, which libraries, what subarchitecture, and what optimization level to use for each library. These sound crazy, but the current literature and the internet are full of detailed documentation on how to do these things, since somebody at some point needed to do them to debug a particle accelerator or the latest PCIe bus or something, and it is somehow useful for your stupid USB mouse.
Rust as a language (syntax, rules) is better. Rust articulates better, it makes a lot of sense. However, to appeal to newcomers and people who look for different sa(n/f)er ways to do things, Rust generally limits you. You need to codify certain "features", or you need build.rs files. You need to find underdocumented or "nightly" features. Whatever you come up with would be better formally defined. However, going against the grain of Rust and Cargo creates abominations.
It's still, after learning Rust, Go, C#, Pascal, and other languages, the language that "feels" best to me. So I'll use it for hobby/personal projects, because the other languages get in my way in weird ways a little too much.
Can you share some example hobby projects? I'd like to get into using C++, however I'm having a hard time thinking of things where it makes sense to use it. Most things I'd be interested in have a Java/JavaScript/Python SDK and C++ is nowhere to be found or mentioned.
You can take a look at my GitHub[0] or GitLab[1] for some projects I've done.
Notably a hobby HTTP server, a compiler, and an emulator are projects that I particularly enjoy(ed) and for which C++ is a good language choice, generally.
kv-api is a kv store with a REST api, for example, which is more unusual to write in C++.
I'm excited for more usability coming to C++. It's still far from the writability of Python. At the same time the Mojo language is going the other way: creating a system language out of Python.
iostreams are horrible to use, especially if you want to format your output and they blow up compilation to a mythical degree. People go to great lengths to avoid them and I agree with them.
print/println are based on the fmt library (https://github.com/fmtlib/fmt), which had its first release in 2015, so it's roughly as old as Rust afaik. It's mainly inspired by Python formatting.
Having per-type formatters is just a logical conclusion of having type-safe formatting.
iostreams are for all kinds of io (which includes stdin/stdout), while fmt is entirely dedicated to formatting strings. Those things are related, but not the same. cout and cerr and such will probably be entirely superseded by print/println (I hope), but that doesn't make iostreams generally redundant.
println adds a newline and you want to be able to choose, so there is print and println.
Most of those benefits apply to std::format, which was already introduced in C++20. But having formatted a string, you will often want to output it somewhere. You could do `std::cerr << std::format(....)`, but that just invites weird mixes of std::format and iostream-based formatting. I look at print/println as partially just a convenience function combining the existing std::format functionality with outputting to something. Not sure if the standard permits it, but print could also allow skipping the temporary heap-allocated std::string which std::format returns, by writing directly to the output stream buffer.
In C++20, if you wanted to print using std::format-style formatting (or a variant of it), your options were:
```
std::cout << std::format("{} {}", arg1, arg2); // not the best syntax but less bad than everything else, slightly inefficient due to the temp string

std::string tmp = std::format("{} {}", arg1, arg2);
std::fwrite(tmp.c_str(), 1, tmp.length(), stdout); // uglier, and still uses a temporary string

std::format_to(std::ostream_iterator<char>(std::cout), "{} {}", arg1, arg2); // avoids the temp string, but what kind of monstrosity is this
```
But doesn't std::format-style formatting make the formatting part of ostream redundant? It somewhat does. I guess that's why there are three kinds of print overloads (sketched after the list):
* accepting ostream as first argument
* accepting c style "FILE*" as first output argument
* no output stream argument, implicitly outputting to stdout
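A small sketch of the three overload families (C++23; <print>, with the ostream-taking overload declared in <ostream>):
```
#include <cstdio>
#include <iostream>
#include <print>

int main() {
    std::print(std::cout, "to an ostream: {}\n", 1); // ostream& overload
    std::print(stdout, "to a FILE*: {}\n", 2);       // FILE* overload
    std::print("to stdout implicitly: {}\n", 3);     // default overload
    std::println("and println appends the newline: {}", 4);
}
```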
One potential reason to use the ostream-based versions instead of the FILE* ones, even though the formatting part of ostream is somewhat redundant, is RAII-based resource management. If you use the FILE*-based versions, you either have to remember to close the FILE* handle manually, which, just like manual new/delete or malloc/free calls, is something modern C++ tries to move away from, or you have to create your own RAII wrapper around C-style FILE* handles (a sketch follows).
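A minimal sketch of such a wrapper, using std::unique_ptr with a custom deleter; the FileCloser/unique_file names are made up for illustration, and std::print(FILE*, ...) is C++23:
```
#include <cstdio>
#include <memory>
#include <print>

struct FileCloser {
    void operator()(std::FILE* f) const { if (f) std::fclose(f); }
};
using unique_file = std::unique_ptr<std::FILE, FileCloser>;

int main() {
    unique_file f{std::fopen("log.txt", "w")};
    if (f) std::print(f.get(), "value = {}\n", 42); // FILE* overload of std::print
    // std::fclose runs automatically when f goes out of scope
}
```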
An alternative would have been introducing a new kind of output stream API that only concerns itself with outputting a buffered stream of bytes and skips the formatting part of ostream, but that would have introduced a different kind of redundancy: three kinds of API for opening and writing to a file. Allowing an ostream to be passed to print, while discouraging use of its << operator parts, seems like the simpler solution.
One more concern is how the version of print that outputs to stdout, without taking a FILE* or ostream handle, interacts with any buffering done by the pre-existing printf and std::cout APIs, and with synchronization with input streams. The same problem already existed for interactions between printf and std::cout. For the most part, if you don't mix them, it doesn't matter much, but if you care, the standard defines exactly how they are synchronized. The new print presumably either adds a third case to this or is defined as being equivalent to one of the previous two.
After reading the docs a bit more, it seems std::basic_streambuf/std::filebuf did already exist. The new print API might have used that as an abstraction over output streams without some of the ostream formatting baggage. I have only seen them mentioned as implementation details of the ostream machinery; I've never seen anyone use them directly.
There was also std::format_to in C++20, which avoids the temporary string that std::format returns. I guess they could have extended that with additional overloads so that it functions more like std::print. But if new functions had to be defined anyway, they might as well be called something more familiar to people coming from other languages, like print, instead of having format_to(stdout, "formatstring", arg1, arg2);. Currently std::format_to writes to an output iterator (example below).
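A small sketch of std::format_to going through an output iterator (C++20), appending into an existing buffer instead of materializing a temporary string:
```
#include <format>
#include <iterator>
#include <string>

int main() {
    std::string out = "result: ";
    // Appends directly into 'out' via back_inserter, no intermediate std::string.
    std::format_to(std::back_inserter(out), "{} + {} = {}", 1, 2, 3);
    // out == "result: 1 + 2 = 3"
}
```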
So to summarize why print exists:
- combine the std::format machinery introduced in C++20 with outputting somewhere, with cleaner syntax than doing it by hand.
- a sane-looking hello world for beginners and for people coming from other programming languages; this is probably also why println exists.
- (maybe) avoid some heap allocations caused by the temporary string that std::format returns.
- avoid the confusion of mixing two formatting approaches that `std::cout << std::format()` implies.
- avoid the C-style manual resource management that would be required for `FILE* f = fopen(...); auto tmp = std::format(...); fputs(tmp.c_str(), f)`-style code.
iostreams required only early C++ features and in return gave you:
1. Type safe string formatting. You'd get a compile error if the value you tried to write didn't support it.
2. User-defined formatting support for user defined types. You could make instances of your own class support being written directly to a stream. Sort of like how some other languages let you override `toString()`.
3. No runtime dispatch for #2. Unlike Java, C#, etc. the user-defined formatting code for your type is called statically with no overhead of virtual dispatch.
Using operator overloading to achieve all of those was really clever and led to a very powerful, efficient, and flexible system. It's also just unfortunately pretty verbose and kind of weird.
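A minimal sketch of points 1-3: a made-up user-defined type becomes streamable by overloading operator<<, and the overload is resolved statically, with no virtual dispatch:
```
#include <iostream>

struct Vec2 { int x, y; };

// Found by overload resolution at compile time; no toString()-style virtual call.
std::ostream& operator<<(std::ostream& os, const Vec2& v) {
    return os << '(' << v.x << ", " << v.y << ')';
}

int main() {
    Vec2 v{3, 4};
    std::cout << "v = " << v << '\n'; // prints "v = (3, 4)"
    // Writing a type with no matching operator<< is a compile error, not a runtime one.
}
```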
I could be wrong as I was young and not yet in the field, but my impression has always been that sometime in the 80s/90s as the whole "networking, world wide web, wowie!" moment happened, there was this idea that "maybe on a local computer everything is files, but on the network everything is streams. Hey, maybe everything is streams!?" and C++ just happened to be creating itself in that zeitgeist, trying to look modern and future-thinking, so somebody decided to see what would happen if all the i/o was "stream-native".
IDK, it'll probably make more sense in another 15 years as we clear away the cruft of all the things that tried to bring "cloud native" paradigms into a space where they didn't really fit...
I think it is simpler and more technical than that.
The big thing is that they wanted type-safe IO. Not like printf, where you can print an integer with %s, the compiler won't have a problem with that, and it will crash at runtime.
Reusing the bit-shift operators for IO is actually quite clever. If you have operator overloading, you get type-safe IO for free. Remember, C++ came out in the 80s as a superset of C; these technical considerations mattered. std::println doesn't look like much, but it actually involves quite significant metaprogramming magic to work as intended, which is why it took so long to appear.
> Reusing bit shift operators for IO is quite clever actually
It's a miserable trap. Operators should mean one particular thing, because of the Principle of Least Surprise. The reader who sees A + B should be assured we're adding A and B together; maybe they're matrices, or 3D volumes, or something quite different, but they must be things you'd add together, or else + is not a sensible operator to use.
When you don't obey this requirement, precedence will bite you, as it does for streams. Because you forgot: these operations still get bit-shift precedence. Even though you're thinking of it as formatted streaming, to the grammar it's just a bit shift, and it binds exactly where a bit shift would (see the sketch below).
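A sketch of the kind of surprise meant here; compilers will usually warn about the last line, but it compiles:
```
#include <iostream>

int main() {
    int flags = 0b1010;
    // Intended: print (flags & 0b0010). Parsed as (std::cout << flags) & 0b0010,
    // because << keeps its bit-shift precedence, which is higher than &.
    // This one at least fails to compile, since an ostream can't be ANDed with an int:
    // std::cout << flags & 0b0010;

    bool ok = true;
    // Parsed as (std::cout << ok) ? "yes" : "no"; prints "1" and throws away both strings.
    std::cout << ok ? "yes" : "no";
}
```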
Streams look like something dumb you'd do to show off your new operator overloading feature, because that is in fact what they are. It should have existed like Mara's whichever_compiles! macro for Rust - as a live-fire warning, "Ha, this is possible but for God's sake never use it" - but instead it was adopted into the C++ standard library.
People have investigated better solutions than iostreams for decades. But creating a near perfect interface is difficult in C++ because you have almost infinite control over the compile time and runtime characteristics of it. It has to compile fast, it has to be extensible (even the formatting syntax is interchangeable and programmable here), it has to give formatting errors at compile time, and it has to run nearly optimally. Victor Zverovich cracked the code, but only after Meta, Boost, and many others put tons of work into their own attempts. Imo, std::format is the most complex part of the standard library so far in terms of its implementation.
Iostream predates parameter packs, which is why they use << for a variadic API.
>Just like Pascal/Modula-2, except 30 years later :-D
Now if they could fix the CaSe SeNSiTivity bug, get rid of macros, get some better strings, and maybe use @ to get an address, and ^ to point to things.... they'd be on to something.
Especially getting rid of macros, I hate them.
Most C code looks like line noise, maybe they could then use something like Begin/End to denote blocks instead of abusing comment markers {} ;-)