The title is slightly click-bait-y: C++20 doesn't have named parameters. It talks about how to fake them with structs.
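For readers who haven't seen the trick, here is a minimal sketch of the struct-based fake using C++20 designated initializers. The struct `WindowOptions` and the function `describe_window` are made up for illustration and are not taken from the article.

```cpp
#include <cassert>
#include <string>

// Hypothetical options struct; every member has a default value.
struct WindowOptions {
    int width = 640;
    int height = 480;
    bool fullscreen = false;
    std::string title = "untitled";
};

// The function takes the whole bundle as a single parameter.
std::string describe_window(const WindowOptions& opts) {
    return opts.title + " " + std::to_string(opts.width) + "x" +
           std::to_string(opts.height) +
           (opts.fullscreen ? " fullscreen" : " windowed");
}

// Call sites name only the members they care about; the rest keep their
// defaults. Designators must appear in declaration order.
void example() {
    assert(describe_window({.width = 800, .title = "demo"}) ==
           "demo 800x480 windowed");
    assert(describe_window({}) == "untitled 640x480 windowed");
}
```

Note that this only works because `describe_window` was written to take a struct, which is exactly the limitation discussed next.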
It's a pretty convincing fake, but it has one problem: the person who wrote the function needs to have written it that way. What I wish more languages had (C++, Java, etc.) is the following:
Suppose I write some function that takes way too many parameters:
void foo(char a, short b, long c, float d, double e){}
What I want, as a caller, is for the language to let me get a struct corresponding to the argument list, something like this:
foo::args whatever;
whatever.e = 3.14;
// fill in the rest
And then add a bit of magic syntax so we can "unpack" that struct to actually call foo. (Extending this proposal to handle varargs is left as an exercise to the reader, because I have no idea.)
- either the struct is replicated at the function call (not really efficient, but the semantics are straightforward);
- or the struct is used to directly prepare the stack for the function call (more efficient, but the semantics must be defined).
One of the issues of the second approach is that if the structure/variable is a classical one, what should happen if we use the variable after the call?
If we have to read the values inserted before the call then we have to keep a copy of the structure in the stack (so it's the same as the first approach).
Or we can consider that the variable no longer exists in the scope.
Which corresponds to the ownership concept in Rust for example. (but, to my knowledge, this does not exist in C++)
What is the best option? Or are there other options?
Another question: what should sizeof(foo::args) return?
This mostly solves the unpacking problem. Thus, there are only two problems left:
1. I need to handcraft the tuple type myself, and make sure it matches the function declaration. (In other words, I don't want to manually write tuple<char, short>. I want to write foo::arg_tuple.)
2. This provides no support for named parameters. Tuples in C++ are numbered, not named.
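A sketch of both points, assuming the unpacking being referred to is `std::apply` (the parent comment isn't quoted here, so that is a guess). The trait `arg_tuple` is a hypothetical helper, not a standard facility, and `foo` is given a return value only so the call is observable.

```cpp
#include <cassert>
#include <tuple>
#include <type_traits>

// Same parameter list as the foo above, but returning the sum.
double foo(char a, short b, long c, float d, double e) {
    return a + b + c + d + e;
}

// Hypothetical helper: derive the tuple type from a function pointer type,
// so nobody has to write tuple<char, short, ...> by hand.
template <class F>
struct arg_tuple;

template <class R, class... Args>
struct arg_tuple<R (*)(Args...)> {
    using type = std::tuple<Args...>;
};

template <auto Fn>
using arg_tuple_t = typename arg_tuple<decltype(Fn)>::type;

static_assert(std::is_same_v<arg_tuple_t<&foo>,
                             std::tuple<char, short, long, float, double>>);

double example() {
    arg_tuple_t<&foo> args{};      // value-initialized: every element is zero
    std::get<2>(args) = 2;         // still positional, not named
    std::get<4>(args) = 3.5;
    return std::apply(foo, args);  // unpack the tuple into the call
}
```

This takes care of problem 1 for plain functions (it does not handle overload sets, member functions, or varargs), but problem 2 remains: the elements are addressed by index.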
Having programmed in languages with named parameters (Smalltalk, ObjC and Common Lisp) I’ve found them cumbersome.
Also they can lead to a poor pattern which is functions/methods that take too many parameters.
They also tend to make generics harder to understand as the possible parameters can vary. Of course you can make the same problem in C++ via, say, `auto multiply(int x, int y)` and `auto multiply(double x, double y, enum rounding rp = rounding::nearest)`
In general they impose the verbosity (in keyboarding and reading) in the routine cases but don’t free you from having to look up the argument semantics of lesser-used functions.
Side point: I used “keyboarding” because the obvious word, “typing” has a homonym which made the sentence confusing!
Hmm. Why do you think they would lead to misuse via functions that take too many parameters?
I understand how this is a problem but I’ve seen more than one function in C that takes 4 or more parameters.
Methinks it’s more a product of being an inexperienced coder and not being able to model a domain appropriately. The huge benefit? If you do need 4 parameters you can at least read them / understand them if they are named!
tends not to attract the same scrutiny for some reason. On the other hand, removing named parameters doesn't actually fix this (see jasode's example at https://news.ycombinator.com/item?id=24401913 so I don't need another dozen lines), so it's not clear that this is a good argument against named parameters.
In my opinion, the first is a terrible interface because I have no idea what any of the parameters do, not because there are so many. The second is a good interface to me because all of the values are labeled. I don't see any problem with having this many parameters when all of those parameters are relevant (see: Vulkan). Named parameters often imply default values too, which means you wouldn't have to specify them all.
I basically emulate this in other languages, like Rust, with an “options” struct that has default values.
Well, my example was a small case of this problem. It’s not unreasonable that when operating on floating point, different algorithms can need different ways of determining which FP number to choose when the result cannot be precisely represented. And the example I gave is innocuous...though it’s likely an underlying function called by functions that implement higher level semantics.
And in fact in that case using a keyword isn’t that different from binding the meaning into the function name (`divide_rounding_down` and so on).
But in my experience I saw a pernicious pattern: that the function name would essentially become the common entry point for a large number of divergent functions `divide(dividend: x, divisor: y, truncate: true, underflow_handling: ufh::throw, negative_permitted: true)` and so on (contrived example for explanatory purposes).
In addition, positional arguments are simply easier to read; as keyword arguments are not ordered, you’re essentially parsing the arguments, seeking the important ones etc. You’re increasing the cognitive overhead of the common case while only marginally assisting the uncommon case. It’s like reading while hearing the sound of the words in your head: quite possible, but significantly slower than having the meaning enter directly.
>Also they can lead to a poor pattern which is functions/methods that take too many parameters.
>In addition, positional arguments are simply easier to read; as keyword arguments are not ordered, you’re essentially parsing the arguments, seeking the important ones etc.
Microsoft's raw Win32 API SDK is based on the C language, and even though that language doesn't have named parameters, that did not discourage functions designed with lots of parameters that are difficult to mentally parse.
To make such code self-documenting, this is how I typically write Win32 API calls to make the code readable:
If the language had named parameters, I wonder if Microsoft would have used them so programmers wouldn't have to sprinkle extra comments like that. (Comments are also brittle because they're not real tokens that the compiler would check.) Without the comments, it looks like this:
All those naked NULLs and TRUE params are very hard to parse unless you've memorized the positions.
I'm not recommending that we need the verbosity of Objective-C named parameters but I don't see how mentally relying on position is less cognitive load.
>I don’t recommend a large number of parameters in any paradigm. Your case is a good example.
Sure, I understand that too many parameters is to be avoided and can point to some underlying poor architecture design. However, _if_ there's irreducible complexity that calls for 5+ parameters to minimally specify behavior, what are the better syntax options than named parameters?
If you say "create a struct with members and pass the struct instead", then you're just shifting the "named params" to the "struct member names".
This causes the extra verbose syntax of defining one-time-use struct variables to pass as a function argument. In contrast, named parameters in the function can act as "anonymous unnamed struct" which is cleaner.
Using structs as function parameters is more useful when the structs can also be re-used elsewhere in multiple places (e.g. 3d points {x, y, z}).
Not a minor nit: in Objective-C, keyword arguments are ordered. Every function call reads like a sentence and the keyword argument itself can be a phrase like:
Kotlin has a nice convention with named and variable arguments. If the types are all different, it's not ambiguous and the compiler can figure it out, but you're free to name them at the call site anyway. I really don't see how we've suffered the C convention for 40 years when we could have had this instead.
One thing to be aware of is that named arguments will become part of the public interface, which you can't change without breaking everyone's code. I believe Swift solves this by having internal and external names for arguments.
It’s also critical in Python to make heavy use of metaprogramming with decorators, so the bundling / unbundling operators combined with named parameters aren’t just for ease of reading or annotating call sites, but rather they open up a huge range of behavior that can only work if decorators can easily manipulate parameters of the functions they are decorating.
For example, suppose you have a function foo and within foo you call another function baz, where baz has a very large range of customization parameters. If you want to memoize calls to foo with a decorator, you can just pass kwargs for any “pass through” customizations of baz. The memoizing decorator can flatten kwargs out into a tuple along with any positional args, so you get call signature memoization without needing to restrict or list out all possible args of baz. In fact you can make this perfectly generic over the function being memoized so it doesn’t matter if you change baz to bar one day, with a whole new signature.
You can combine this with the inspect module and with functools.wraps to ensure full and complete docstrings and introspected signatures as well, so that putting kwargs all over doesn’t hurt readability, remove info from docstrings or help messages, or make it hard to read.
> It’s also critical in Python to make heavy use of metaprogramming with decorators
Interesting use case, thanks for sharing! Out of curiosity, what are some scenarios where it's "critical" to write code this way? I mostly interact with small pytorch / numpy codebases that don't grow large enough for this sort of thing.
Pytest as a unit test framework is a great example. Some other good ones include the contextmanager decorator from contextlib, the retrying.retry decorator, gradual type enforcement decorators, and decorators that control web app routing and HTTP / REST parameters, as in Flask.
Even in hardcore numerical computing this way of working has huge wins over traditional OOP designs or plain flat function designs like C. For example see numba’s jit and autojit decorators that also work this way.
For cases when you want to do it yourself, think of functions that wrap complex matplotlib or seaborn plotting utilities that rely on tons of optional parameters via kwargs. I often find it super helpful to do this when wrapping pandas functions too, which also involve tons of kwargs customization.
FWIW I work professionally in deep learning and other ML. I think scientific computing in Python in particular highlights the importance of this.
But those are non-trivial or cumbersome in their own way, and often people don't bother for their own internal functions. Named parameters neatly sidestep the issue entirely, by rolling the "local constant" right into the function call itself.
Like consider these two python snippets:
['Ford', 'BMW', 'Volvo'].sort(True)
vs.
['Ford', 'BMW', 'Volvo'].sort(reverse=True)
The latter is much more readable than the former, and there wasn't any "too many parameters" issue. It's a way to help create self-documenting code that's also compiler enforced. And it absolutely frees the reader up from needing to go lookup what the boolean parameter on "sort" means.
Although I agree in general, I think that most of the time this should be solved on the API level (like your link suggests). This is because although you 'can' name your parameters, there is nothing forcing you to. This means that many people forget or are too lazy, so you still end up with a large part of your code not being as readable as it could be.
For the above you'd have an 'rsort' or 'reversesort' function instead (or if your language supports it, some fluent kind of sort().reverse()). If you have a lot of options, some of them boolean, you might send in an options struct or have an enum as mentioned in the article.
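As a rough C++ sketch of the enum variant of that suggestion: the names `SortOrder` and `sort_names` are invented here, and this is only one of several ways to shape such an API.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical enum replacing a bare boolean 'reverse' flag.
enum class SortOrder { Ascending, Descending };

// The call site has to spell out the order, so there is no naked 'true'.
void sort_names(std::vector<std::string>& names,
                SortOrder order = SortOrder::Ascending) {
    if (order == SortOrder::Descending) {
        std::sort(names.begin(), names.end(), std::greater<>{});
    } else {
        std::sort(names.begin(), names.end());
    }
}

void example() {
    std::vector<std::string> cars{"Ford", "BMW", "Volvo"};
    sort_names(cars, SortOrder::Descending);
    assert((cars == std::vector<std::string>{"Volvo", "Ford", "BMW"}));
}
```

Unlike an optional keyword, the enum is enforced by the type system: `sort_names(cars, true)` does not compile because `bool` does not convert to a scoped enum.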
On the Windows team at Microsoft where I work, we work around the lack of actual named parameters by putting parameter names in C-style comments. To illustrate, given a hypothetical list class with a sort method like Python's, with a boolean "reverse" parameter, we'd call that method like this:
list.sort(true /* reverse */);
This convention is very handy for certain Win32 functions that take lots of parameters, such as CreateProcess or CreateWindowEx.
Yup, that's the API design tweak that Qt recommends as well. But it's a lot more boilerplate, and that sort of boilerplate basically never happens for internal-only methods. Named parameters are basically the middle-ground solution for the "quick" hacks that have a tendency to be rather permanent hacks.
I found it quite useful & more natural to read in Objective-C/Swift & in those you typically are writing in an IDE that auto-completes all the names. I've also not generally found them cumbersome in Python where I code primarily in VIM without auto-completion. YMMV
Yes, I’ve used it in Ruby and now I find myself using it a lot in Dart. Also, the Flutter widgets use it and I find it much easier to figure out which arguments to fill in with my IDE instead of always jumping over to the documentation.
Unfortunately in many cases this will cause all of the parameters to be passed via the stack rather than in registers. It’s a meaningful penalty for the kind of people who would willingly select C++ as a language today (rather than, say, Python or Java).
From my experience with C99: if the struct size is 16 bytes or less, the values will be packed into registers. But true, as soon as the struct grows bigger, the entire content will be passed on the stack.
IMHO this sort of named arguments "easter egg" is fine for big "option bag structs" passed to functions that are not performance critical (or for small struct <= 16 bytes). For other cases, regular args are better.
OTOH it seems like passing normal args also spills over into the stack very soon (only 4 registers used?):
The main difference seems to be that normal arguments "spill over", while passing structs by value puts everything on the stack once the threshold size is passed.
Six of the parameters use registers here. The `lea` is putting the sum of the first two into `eax`, then the rest accumulate more adds into `eax`. Only the last 3 parameters are reading from stack memory (the `dword ptr [*]` stuff).
In any case you don't get to actually see the passing at the callsite because the result `45` is inlined into `main()`. :)
Thanks for the clarification (also to archgoon). To your last point, I should have used -O1 instead of -O2, this also makes it clear what happens on the callsite.
You don’t need to lower the optimization level to see what happens at the call site: you can turn the function ‘blub’ into a function prototype. The optimizer cannot in-line your code if it isn’t available.
Another approach is to use a `volatile` function pointer: https://www.godbolt.org/z/e6GM93 -- That way you get to keep other optimizations from `-O3` without inlining the function call itself.
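A sketch of the `volatile` function pointer idea. The body of `blub` is a reconstruction from the description above (nine int parameters summing to 45), not a copy of the linked code, and `blub_ptr` and `call_blub` are names chosen here.

```cpp
#include <cassert>

// Reconstructed stand-in for 'blub': nine int parameters, so on x86-64 Linux
// six travel in registers and three on the stack.
int blub(int a, int b, int c, int d, int e, int f, int g, int h, int i) {
    return a + b + c + d + e + f + g + h + i;
}

// Because the pointer object is volatile, the compiler has to load it at the
// call and cannot assume it still points at blub, so the call stays an
// indirect call even at -O3.
int (*volatile blub_ptr)(int, int, int, int, int, int, int, int, int) = blub;

int call_blub() {
    return blub_ptr(1, 2, 3, 4, 5, 6, 7, 8, 9);
}
```

In the disassembly of `call_blub` you should then see the argument setup followed by an indirect call instead of a folded constant.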
Six arguments use registers (and rax for return). The first two in your example are in the lea instruction. Using rdi, rsi, rdx, rcx, r8, r9 and then the stack is the calling convention used on x64 linux. 32-bit (-m32 in godbolt) linux would use the stack exclusively.
> Using rdi, rsi, rdx, rcx, r8, r9 and then the stack is the calling convention used on x64 linux
It's more complicated than that - floats/doubles and structs of them are put in SSE registers instead. Mixed structs are packed into integer registers (!) if they are small enough.
And in C++, classes with a non-trivial destructor etc. are always passed on the stack.
If it's a prvalue there is no copying involved. Here's an example where one of the 'parameters' has an 'expensive' copy constructor but it isn't invoked at all: https://godbolt.org/z/94ehvx
However, if you use an lvalue expression then yes, the copy constructor is invoked (in `bar` here), and you can avoid that by using an xvalue expression (in `baz`, using `std::move`): https://godbolt.org/z/MnKnWM
Basically it doesn't depend on being passed through a struct as much as on the parameter itself. You'd see the same behavior if the `C` struct was passed as a regular parameter.
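A self-contained sketch of the same three cases, for anyone who doesn't want to follow the links. `C`, `Params` and `take` are stand-ins written for this comment and only approximate the linked examples; the counters replace the 'expensive' copy.

```cpp
#include <cassert>
#include <utility>

// Stand-in for a type with an expensive copy; counts special member calls.
struct C {
    static inline int copies = 0;
    static inline int moves = 0;
    C() = default;
    C(const C&) { ++copies; }
    C(C&&) noexcept { ++moves; }
};

// Hypothetical parameter struct holding a C by value.
struct Params {
    int n = 0;
    C c;
};

void take(Params) {}

void example() {
    // prvalue: the member is initialized in place, no copy and no move.
    take({.n = 1, .c = C{}});
    assert(C::copies == 0 && C::moves == 0);

    // lvalue: the copy constructor runs once.
    C existing;
    take({.n = 2, .c = existing});
    assert(C::copies == 1 && C::moves == 0);

    // xvalue: std::move selects the move constructor instead.
    take({.n = 3, .c = std::move(existing)});
    assert(C::copies == 1 && C::moves == 1);
}
```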
Only 16 bytes are passed in registers except for float types:
> If the size of the aggregate exceeds two eightbytes and the first eight-byte isn’t SSE or any other eightbyte isn’t SSEUP, the whole argument is passed in memory.
I'd expect scalar replacement of aggregates (SROA) would eliminate the use of a struct like this in some cases, but there are definitely limits, especially if not compiling with LTO enabled, since optimizations across compilation units will be limited. Honestly doesn't seem worth the cost.
Using registers to pass structs is an ABI thing and doesn't need LTO, it happens during compilation (see godbolt.org links on the rest of this thread).
I guess that's because the struct size is below 16 bytes (see my sister comment). Adding two more ints to the struct places the entire content on the stack.
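To make the threshold concrete, a small sketch with two made-up structs on either side of the 16-byte limit. The static_asserts only check sizes and assume a 4-byte int; whether the arguments actually land in registers depends on the ABI and has to be checked in the disassembly.

```cpp
#include <cassert>

static_assert(sizeof(int) == 4, "sketch assumes a 4-byte int");

// Four ints: 16 bytes, i.e. two eightbytes. Under the x86-64 System V rules
// quoted above this is still eligible for register passing.
struct Small { int a, b, c, d; };
static_assert(sizeof(Small) == 16);

// Two more ints: 24 bytes, more than two eightbytes, so the whole struct is
// passed in memory.
struct Large { int a, b, c, d, e, f; };
static_assert(sizeof(Large) == 24);

int sum_small(Small s) { return s.a + s.b + s.c + s.d; }
int sum_large(Large s) { return s.a + s.b + s.c + s.d + s.e + s.f; }
```

Compiling both functions without inlining and comparing how `s` is read (registers versus stack-relative loads) should show the difference being described.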
Everything that's not a struct that you would pass as a parameter (int, pointer, char, etc.) fits in a register, and there are registers reserved for function parameters in the ISA.
Structs with more than one field don't fit into a register.
This is my favorite feature of Objective-C / Smalltalk.
It makes everything so readable. I can come back to my code a few weeks later and I don’t really need to look up function signatures.
In particular, my favorite thing about this in Objective-C is that the C functions have a different way of being called so you can easily distinguish between the differences in paradigms.
I wonder what other languages force all calls to be via named parameters. Would love to have another option :)
This is quite interesting, a very elegant struct hack. Dlang is working on an RFC for implementing this feature right into the language, which would probably make it the first systems language to implement this without structs.
Is there a way to have initialization fail at compile time if all members aren't included? So that if new members are added, all call sites become invalid until updated?
I don't think you'd want to. One of the primary usages of this kind of thing would be like when you need to pass in some massive struct of options to some dumb API function (you know the ones!), and then you'd absolutely want default values for all the things you don't want to specify.
Also, some might argue that what you're describing (call sites not becoming invalid when you update the function) is a feature, not a bug. This way, you can add options to your functions and still have everything compile like it should.
I wouldn't personally argue that though, I think it leads to bad practice where functions take a bazillion options and they become horrible tech debt. I'm working in a mixed C++ and Lua codebase, and the Lua codebase is littered with functions that take, like, 9 different arguments that change the behavior of the function. When I asked one of my colleagues about this style, he said it was one of his favorite features of Lua, that you can add arguments to a function to change the behavior in certain cases but keep the original functionality (guarded with default values or if's or whatever). It's made much of the code a living nightmare!
So, I guess, in the process of writing this comment I've come over to your side :) say no to adding arguments with default values!
I don't mind it for things like pyplot where you have tons of style options hidden away like that.
But I'd definitely like to be able to say this one isn't optional or implicitly default constructed (or worse uninitialized, maybe that is already precluded by any amount of brace initialization, or maybe not if the class is POD, I'll have to check).
Someone suggested below deleting default constructors, you could make a template class that did that and forwarded other constructors, but I'd still like to be able to allow explicit default initialization in the ones that aren't considered optional.
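A rough sketch of that wrapper idea. `Required` and `ServerOptions` are hypothetical names, and this version forwards only a single converting constructor instead of every constructor of `T`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical wrapper: no default constructor, so a member of this type
// cannot be left out of a braced initializer.
template <class T>
struct Required {
    T value;
    Required() = delete;
    Required(T v) : value(std::move(v)) {}
};

static_assert(!std::is_default_constructible_v<Required<int>>);

// Still an aggregate: 'port' is mandatory, 'retries' is optional.
struct ServerOptions {
    Required<int> port;
    int retries = 3;
};

int example() {
    ServerOptions opts{.port = 8080};
    // ServerOptions bad{.retries = 5};  // does not compile: 'port' would
    //                                   // need the deleted default constructor
    return opts.port.value + opts.retries;
}
```

Explicit default initialization is still possible by spelling it out, e.g. `.port = Required<int>{int{}}`, which at least makes the choice visible at the call site.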