Auto Type Deduction in C++ Range-Based For Loops (petrzemek.net)
61 points by jupp0r on Sept 2, 2016 | 27 comments

I don't like that at all.

Sure if you consider:

for (auto& p : wordCount) {
    // ... word: p.first, count: p.second
}

for (auto& [word, count] : wordCount) { // C++1z
    // ...
}

seems like an improvement because you immediately know the semantics of the two parts of the pair. But you have no idea what "word" or "count" is. In this example, with words like "word" or "count", the semantics somehow encode type information, but most of the time you deal with user-defined types where that information is a) not clear and b) not as trivial.

To me this whole "auto" thing seems misunderstood. People use it to make their lives "easier" (think lazy) but not simpler. The type system is there to help you, and type information is of utmost importance and should be available as close to the usage as possible without destroying readability, so you don't have to look it up somewhere else.

using wcPair = std::pair<const std::string, int>;

for (wcPair& p : wcMap) {
  std::cout << p.first << " : " << p.second << std::endl;
}

seems reasonable enough. The loop itself can be read without clutter, you can infer what's supposed to happen without knowing much stuff and even if you want a deeper understanding of what's going on then the information you need is right above.

Doesn't that seem much better? oO

...and if you want a language without a type system don't use a language with a type system.

>...and if you want a language without a type system don't use a language with a type system.

I wasn't the one who downvoted you, but I'm guessing it happened because your advice is misguided. It shows that you're confusing the ceremony of a human bashing extra keystrokes to repeat annotations everywhere with the concept of static typing, which enables compiler correctness checking. The keyword "auto" enables the separation of those 2 concepts. Type inference and type deduction are the technologies that give you "types" without the tedious "ceremony".

  > if you consider:
  > for (auto& p : wordCount) {
  > ...
  > But you have no idea what "word" or "count" *is*.
If you know what wordCount is, you'll know what types word and count are.
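A minimal sketch of that point, assuming a hypothetical `wordCount` of type `std::map<std::string, int>` (the thread never states its actual type). The type deduced by `auto&` is fixed at compile time and can be checked with a `static_assert`:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical container type; the thread does not say what wordCount is.
using WordCount = std::map<std::string, int>;

// Sums the counts while checking, at compile time, what `auto&` deduced.
int total_count(WordCount& wordCount) {
    int total = 0;
    for (auto& p : wordCount) {
        // `auto&` deduces the map's value_type; nothing here is dynamically typed.
        static_assert(
            std::is_same<decltype(p), std::pair<const std::string, int>&>::value,
            "p is a reference to the map's value_type");
        total += p.second;
    }
    return total;
}
```

If the declared type of `wordCount` changes, the `static_assert` fails to compile, which is the same static checking an explicit annotation would give.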

I remember back when I was still a very new programmer I somehow got the syntax for an iterator-based loop wrong ("for (std::vector<int>::iterator itor = v.begin(); itor != v.end(); ++itor)"). I don't remember my mistake. I probably missed a const somewhere. The compiler told me that "std::vector<int>::iterator" did not match the type of v.begin(), but it wouldn't tell me what would match. Even as a beginning programmer, I knew the compiler knew, but all it would say was "you got it wrong."

And, of course, I didn't care what the type of v.begin() was. All I cared about was whether I could iterate over the elements of v. And I knew that v contained ints.
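A small sketch of the situation described, assuming the mistake really was a missing const (the commenter is only guessing at that). On a const vector, `begin()` returns a `const_iterator`, so spelling out `iterator` fails while `auto` just takes whatever comes back:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Sums the elements of a const vector using an iterator-based loop.
int sum_elements(const std::vector<int>& v) {
    // On a const vector, begin() returns const_iterator, not iterator.
    static_assert(
        std::is_same<decltype(v.begin()), std::vector<int>::const_iterator>::value,
        "begin() on a const vector yields const_iterator");

    int total = 0;
    // Declaring `it` as std::vector<int>::iterator would not compile here;
    // `auto` adopts the type that begin() actually returns.
    for (auto it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}
```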

Many programmers learned about strong typing ( https://en.wikipedia.org/wiki/Strong_and_weak_typing ) and static typing ( https://en.wikipedia.org/wiki/Type_system#STATIC ) in languages that require manifest typing ( https://en.wikipedia.org/wiki/Manifest_typing ), so they often don't realize those are three distinct ideas. You can write strongly typed programs in Haskell, OCaml, Go and Rust without many type annotations.

The experience of other language communities is: you'll get used to it. When C# introduced its version of this with the 'var' keyword many felt that code would become less clear but, in general, context gives you what you need. If it doesn't that often means that there are other problems.

I don't know if this is true in your case, but often people who object haven't spent much time working in dynamically typed languages. The readability issue is orthogonal to typing and it tends to be fine.

The purpose of the type system in C++ is partly to encode semantics about the usage of a particular construct. For example, we might declare three or four types which all support the increment() function (or operator). The traditional way to do this is to create an interface and have the four concrete types inherit from the interface.

What auto says is "I don't care what type this value is. I just want it to support the semantics that I'm about to describe in this block." If you change the return type of the value in the auto expression, as long as the new return type supports the same semantics as the original return type, you don't need to touch that function after the refactor.

The compiler still complains when it can't find a way to make the return value support the semantics you're asking for. This is an improvement, but only in cases where you genuinely don't care what type it is, but only that it supports iteration. You still have to use auto with care, like every other keyword, but it does significantly improve maintainability in code bases where traditionally you would mechanically key in the same type information multiple times throughout the declaration of a class and its usage. Typedef and using statements solve similar problems, but they don't solve exactly the same problem.
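A hedged sketch of that refactoring argument. The loader functions and `total_from` are hypothetical names invented for illustration; the point is only that the body using `auto` does not change when the returned container type does:

```cpp
#include <cassert>
#include <list>
#include <vector>

// Two hypothetical data sources with different return types, standing in
// for "before" and "after" a refactor.
std::vector<int> load_as_vector() { return {1, 2, 3}; }
std::list<int> load_as_list() { return {1, 2, 3}; }

// The body only asks that the result supports iteration over values that
// can be added to an int. It never names the container type.
template <typename Loader>
int total_from(Loader load) {
    auto values = load();            // whatever the loader returns
    int total = 0;
    for (const auto& v : values) {   // compiles for any iterable range
        total += v;
    }
    return total;
}
```

If a loader returned something that is not iterable, the range-based for loop would fail to compile, which is the compiler complaint mentioned above.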

For a good example of why this is a wonderful thing, try to determine what return type you should declare for an STL iterator, or try using Boost while avoiding the auto keyword.

I'm currently working on a code base that's just upgrading to MSVC 10.0, and seeing std::vector::value_type(?)::const_iterator everywhere is killing me.

With regard to your example, in any decent IDE you can mouse over "auto" and get the type information. However p.first and p.second are obscure and hard to read no matter what your tools do.

I don't get why people seem to think type inference equates to a hatred of static typing. Usually if I'm reading code I want as little clutter as possible, and if I need to know the type it's trivial to find it. I still like that I can know the type of something ahead of time, but I don't need it in my face constantly.

Static typing doesn't always require type annotations. Type inference is 1970's era technology.

There's definitely a way to abuse it. But if, for instance the type is shown a couple lines up, it's not too bad. E.g. if the for loop is iterating over a var passed into a function.

I use const auto on occasion, when the element type is inexpensive to copy. If I'm iterating over a vector of ints, why would I prefer const auto& over const auto? Assuming the compiler does not try to optimize the const auto&, const auto should be faster than accessing a value through a reference.

Using const auto& over const auto makes your code more robust. For example, what if you later decide to change the type of items in the vector from ints to something that is more expensive to copy? You would need to track all uses of the vector and add an ampersand there. And as for the access speed, YMMV but compilers are generally good at optimizations and will drop the reference when it would be faster to just copy the items.
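To make the cost difference concrete, here is a sketch with a hypothetical element type, `Tracked`, that counts its own copies. It is not from the article; it only illustrates what happens when the element type stops being cheap to copy:

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    int value = 0;
    Tracked() = default;
    Tracked(const Tracked& other) : value(other.value) { ++copies; }
    Tracked& operator=(const Tracked&) = default;
};
int Tracked::copies = 0;

// Returns the number of copies made by iterating with `const auto`.
int copies_with_const_auto(const std::vector<Tracked>& items) {
    Tracked::copies = 0;
    for (const auto item : items) {   // one copy per element
        (void)item;
    }
    return Tracked::copies;
}

// Returns the number of copies made by iterating with `const auto&`.
int copies_with_const_auto_ref(const std::vector<Tracked>& items) {
    Tracked::copies = 0;
    for (const auto& item : items) {  // binds to the element, no copy
        (void)item;
    }
    return Tracked::copies;
}
```

For a vector of three elements the first function reports three copies and the second reports none. For plain ints the difference is unlikely to matter in practice.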

Good point about robustness, it makes sense in the general case, but if I have a container of integers, I think the chance that I replace int with a more expensive type while not having to modify the rest of the loop is pretty unlikely. Chances are the more expensive type would no longer behave like an int. I think there are cases where the robustness approach shines, and cases where it's pretty safe to just use a const copy and a reference would just add noise. Imagine a simple function where I initialize an array of integers, then iterate over them. const auto or just const int makes sense, const auto& requires more explanation and invoking what if's that are unlikely to ever surface.

But be careful about the lifetime of the object referenced! Just because it's 'const' doesn't mean it can't go out of scope.

Local const reference prolongs the object lifetime[1]. There shouldn't be any problems whenever you use it in a for loop.

[1]: https://herbsutter.com/2008/01/01/gotw-88-a-candidate-for-th...

There definitely can be problems: the reference may escape the for loop, and the reference may be invalidated by a modification of the container (even for something "stable" like ordered_map, the referent may be removed).

If a pointer/reference escapes the loop you'll have problems whether the thing to the right of the colon is a reference or a copy. If it's a copy, the escaped reference will definitely be invalid once that iteration completes. If it's a reference, it will remain valid in more situations.

Yes, I believe that's what I was saying, but your comment sounds like it is disagreeing. Could you clarify?

Oh, okay, then. I thought you were saying that that was a problem specific to iterating by reference.

References prolong the lifetime of temporaries. It wouldn't work in this case:

vector<int> v =...;
const auto& x = v[59];
v.clear(); // x is dangling
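A sketch contrasting the two situations without actually triggering the dangling access. `make_greeting` and both function names are hypothetical and exist only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper that returns a temporary.
std::string make_greeting() { return "hello"; }

// Binding a temporary to a local const reference extends the temporary's
// lifetime to that of the reference, so this is safe.
std::size_t length_of_temporary() {
    const std::string& s = make_greeting();
    return s.size();
}

// A reference to a container element gets no such protection: it dangles
// once the container destroys the element. Copying first avoids that.
int read_then_clear(std::vector<int>& v, std::size_t index) {
    const auto x = v[index];  // a copy, not a reference
    v.clear();                // a `const auto&` taken above would now dangle
    return x;                 // still valid because x owns its value
}
```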

Huh, I always thought that there is an optimization to always choose the most consty and referency option possible. So if you write `auto` but don't modify the item in the loop, it may choose to use a reference / pointer under the hood, as if you wrote `auto &` or `const auto &`. But it can't strip away the `&`.

The reason is that I imagined C++ can substitute anything for auto, so it can make `auto` into a pointer to Foo, but it can't turn `auto &` into `Foo`. (In a first pass, of course due to the as-if rule the compiler can do whatever it wants when you are not looking.)

In the example the author recommends against, `for (const auto x : range)`, I would think it means:

- const: don't let me modify x (enforce const-correctness at compile time)

- Make a copy or reference, whatever is faster.

Particularly, I think/thought: `const` does not tell the compiler to make a copy. The lack of `&` allows the compiler to make a copy.

I'm sure this interpretation is not 100% what's in the standard, but an intuition I developed from observation, so probably some of it is wrong.

auto doesn't mean "make a copy or reference, whatever is faster". auto first and foremost means "make a copy". Then sometimes compilers can do a trick called copy-elision and you might be working on a reference. But generally assume it's going to be a copy.

auto& means only make a reference.
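The observable difference can be sketched as follows; the function names are made up for illustration. Writes through a plain `auto` loop variable land on the copy, while writes through `auto&` reach the container:

```cpp
#include <cassert>
#include <vector>

// `auto` copies each element, so the writes never reach the vector.
std::vector<int> double_with_auto(std::vector<int> v) {
    for (auto x : v) {
        x *= 2;  // modifies the local copy only
    }
    return v;
}

// `auto&` binds to each element, so the writes modify the vector.
std::vector<int> double_with_auto_ref(std::vector<int> v) {
    for (auto& x : v) {
        x *= 2;  // modifies the element in place
    }
    return v;
}
```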

Wouldn't one want to use 'f(std::forward(x))' with 'auto&&' ?

I think the example with function application is a bad one here - moving the element potentially leaves it in an undefined state. It would be a better example if the code did something with x instead of calling a function on it.

So I don't think you'd want to forward it. Note that you also need the element's type to forward it, which isn't possible in general with the example's signature.

This has been slightly confusing to me, but the syntax auto&& might not mean an rvalue reference. According to the post, it means a universal reference. I wish they would have made a different syntax. For example, with templates, my understanding is:

void somefunction(int && i) // i is an rvalue reference

template<typename T> void somefunction2(T && i) // i is a universal reference

It might be a similar thing with auto&&. So I would tentatively agree that std::forward should be used.

Yes in this case it's a universal reference, I'm aware. Universal references work with auto&&. What std::forward does is the following:

- if you put in an rvalue reference, it's like std::move

- if you put in anything else, it does nothing.
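A sketch of those two cases, using a hypothetical overload set `category` to report what arrives. Note that `std::forward` needs an explicit template argument; with `auto&&` the usual idiom is `std::forward<decltype(x)>(x)`:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload set that reports the value category it received.
std::string category(const int&) { return "lvalue"; }
std::string category(int&&) { return "rvalue"; }

// x binds to an lvalue, so decltype(x) is int& and forward passes an lvalue.
std::string forward_bound_to_lvalue() {
    int n = 1;
    auto&& x = n;
    return category(std::forward<decltype(x)>(x));
}

// x binds to a temporary, so decltype(x) is int&& and forward acts like std::move.
std::string forward_bound_to_rvalue() {
    auto&& x = 1;
    return category(std::forward<decltype(x)>(x));
}

// Without forward, the named variable x is always an lvalue expression.
std::string no_forward_bound_to_rvalue() {
    auto&& x = 1;
    return category(x);
}
```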

So my point about the article's formulation ("when you want to modify elements in the range in generic code") and its example remains.

I have improved the example in the article to make the code less confusing. Now, a value is assigned to each element in the range and there is no function call.
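The revised example is not quoted in this thread, so the following is only a guess at its shape, with `fill_range` as a hypothetical name. It shows why `auto&&` is the form that works in generic code: `std::vector<bool>` hands out proxy objects, which `auto&` cannot bind to:

```cpp
#include <cassert>
#include <vector>

// Assigns `value` to every element of `range`.
template <typename Range, typename Value>
void fill_range(Range& range, const Value& value) {
    // `auto&&` binds both to real references (std::vector<int>) and to the
    // temporary proxy objects returned by std::vector<bool> iterators.
    // With `auto&` the std::vector<bool> case would not compile.
    for (auto&& x : range) {
        x = value;
    }
}
```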

Thanks, that's clearer.
