
Q: Why can I access an out-of-scope C++ var? A: So you rent a hotel room... - sutro
http://stackoverflow.com/questions/6441218/c-local-variable-can-be-accessed-outside-its-scope/6445794#6445794
======
yaakov34
I say this as a C++ programmer (C++ is essentially the only choice in the
domain in which I currently work), but this really underscores how
unreasonable C++ is for writing secure/high reliability applications. On the
one hand, the compiler will curse you out for trying to use a non-const
iterator in a const function (sheesh, what kind of an idiot are you, anyway?).
On the other hand, you can read and write to every bit of memory allocated to
your application, and it will stand by and do nothing. I think the number of
applications which had buffer overflows at least at some point is
statistically indistinguishable from 100%.

Try this hotel analogy: you go to a hotel in which you once stayed, and tell
them that you're going to dump every bit of possessions and furnishings in all
the rooms outside on the street, rifle through them, set them on fire, then
photograph all their guests in the nude and distribute the pictures. Most
hotels will object. C++ won't.

~~~
yaakov34
It seems from some of the comments down the thread that the idea of writing
over memory in C++ is new to many people. Let me introduce another couple of
concepts.

You don't have to stop with writing over the variables. You can write over the
return address of a function, which is stored on the same stack as the
variables. If you write some user-supplied buffer to memory, and you didn't
carefully make sure that the memory is allocated for this purpose, the
(malicious) user can supply you with data that will set the function's return
address to something in his buffer, and the function will go on executing his
code. That's called a buffer overflow vulnerability, and it's been well known
and exploited for decades; half the patches you see coming down the pipeline
used to be for this.

Now, there are good ways to protect against buffer overflow vulnerabilities.
If you make the stack and data non-executable, and the pages containing code
non-modifiable, then seemingly there is no place for the attacker to place his
code and get it to run. Modern processors support this in hardware.

Except that some people came up with something called return-oriented
programming. This works by finding little pieces in your code that do
something simple like increment a register or write a byte of data, followed
by a "return" instruction. These tiny pieces of code are called "gadgets". You
can usually find enough gadgets to make a Turing-complete language. Then you
write a compiler which transforms your code into a stream of calls to the
gadgets. Now all you need is one overwrite of the return address on the stack,
and you can start playing the application like a piano, without ever executing
data or modifying code. There are now tools to automate all this.

I don't want to reveal anything about where I work, but we write applications
which need to be reliable and secure. We have regular meetings to discuss
vulnerabilities and security techniques. At the meeting where we discussed
return-oriented programming, when we asked what we could do to protect from
it, the consensus answer was "You can't. Don't make any buffer overruns." The
problem is it's not clear that anyone ever wrote a program free from those.

Since hotel analogies seem to be so popular, here you go:
<http://en.wikipedia.org/wiki/Psycho_(film)>

~~~
yaakov34
One last addition - I hope I'm not overstaying my welcome with all these
little lectures on security. It used to be that a buffer overrun typically
resulted from carelessness, like

    
    
       char name[100]; // Who has a name longer than 100 characters?
       printf("What is your name? ");
       scanf("%s", name); // unbounded %s: longer input overruns name
    

It turns out that the same kind of people who have names like "Robert'); DROP
TABLE Students; --" sometimes have 150-character names with a return address
and a few bytes of malicious code. That's something that you handle kind of
like you sanitize your inputs. You don't pour user input into a limited space.
Unbounded scanf is widely discouraged now, and this kind of vuln is becoming
rare.
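
A minimal sketch of the "don't pour user input into a limited space" rule, assuming the input arrives on a `FILE *`. The helper name `read_name` is hypothetical, made up for illustration. The point is only that `fgets` is told the buffer size, so it cannot write past the end:

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into `buf`, writing at most
   `size` bytes including the terminating NUL. Returns 0 on success, -1 on
   EOF, read error, or an unusable size. Input that doesn't fit is cut off;
   the remainder stays in the stream. */
int read_name(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL);
    if (size == 0 || size > INT_MAX)
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';  /* drop the trailing newline, if any */
    return 0;
}
```

With `char name[100];` the call would be `read_name(stdin, name, sizeof name)`. A 150-character "name" then gets truncated instead of landing on the return address.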

But as I explain a couple of comments down, a buffer overflow doesn't have to
be this simple. You can get a bad pointer in a structure anywhere in your
program, and when you use the structure, you will overwrite the stack. It's
very hard to catch that kind of vuln - basically, it's a race with you (and
white hats) vs. black hats. This is why I welcome the switch from C/C++ to
managed code which is happening now, but very, very slowly.

~~~
ma2rten
I don't think you are overstaying your welcome. Actually, I learned a lot from
your posts. Thanks.

~~~
anonymous246
Off topic/meta: In the days when comment scores used to be shown, I could do
the HN-equivalent of saying "hear, hear!" [or as the Internet says it: "here,
here" :)] by upvoting a "thank you" note like the above. Now, if I _really_
want to let you know that I too liked your comment, I have to post a "me too!"
comment like this one. Anyway, thanks. TIL about "return oriented
programming".

------
solutionyogi
Eric Lippert is a prolific writer and is amazing at explaining anything
related to computer programming. His blog is a MUST read for every .NET
developer: <http://blogs.msdn.com/b/ericlippert/>

I also follow Eric's activity on SO here:

<http://stackoverflow.com/users/88656/eric-lippert?tab=activity>

Some of his answers which I liked:

<http://stackoverflow.com/questions/2704652/monad-in-plain-english-for-the-oop-programmer-with-no-fp-background/2704795#2704795>

<http://stackoverflow.com/questions/921180/how-can-i-ensure-that-a-division-of-integers-is-always-rounded-up/926806#926806>

<http://stackoverflow.com/questions/5032081/coding-style-assignments-inside-expressions/5032287#5032287>

He is on HN as well:

<http://news.ycombinator.com/threads?id=ericlippert>

~~~
jseliger
Heads up: the two links at the top go to the same source:

_Eric Lippert is a prolific writer and is amazing at explaining anything
related to computer programming. His blog is a MUST read for every .NET
developer: <http://blogs.msdn.com/b/ericlippert/>_

_I also follow Eric's activity on SO here:_

_<http://blogs.msdn.com/b/ericlippert/>_

~~~
solutionyogi
I don't know who downvoted you, you are correct. I fixed up the links.

------
codexon
I'll probably get downvoted, but I think elaborate real-life analogies like
this are more likely to confuse novices than help them.

A better explanation in my opinion is to draw a diagram of a stack and show
that returning from a function just decreases a pointer. C++ programs don't
scrub the top of the stack once you finish a function because it is a waste of
time.

If another function was called before the 2nd call to foo(), then the variable
on the stack would be overwritten.
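
To make that concrete without relying on undefined behavior, here is a toy model: an explicit array stands in for the stack, and "returning" only moves an index. The names (`toy_stack`, `push_frame`, `pop_frame`) are made up for illustration, and a real call stack is managed by the compiler and differs in detail.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_SIZE 16

/* Toy model of a call stack: slots below `toy_top` are "live". */
static int toy_stack[TOY_STACK_SIZE];
static size_t toy_top = 0;

/* "Call": reserve one slot for a local variable and initialize it. */
int *push_frame(int value)
{
    assert(toy_top < TOY_STACK_SIZE);
    toy_stack[toy_top] = value;
    return &toy_stack[toy_top++];
}

/* "Return": just move the index back. Nothing is scrubbed. */
void pop_frame(void)
{
    assert(toy_top > 0);
    toy_top--;
}
```

In this model, after `int *p = push_frame(42); pop_frame();` the slot behind `p` still reads 42. Once another "call" such as `push_frame(43)` reuses the slot, `*p` reads 43 instead.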

~~~
olavk
The analogy is supposed to explain the concept of _undefined behavior_ , not
the details of memory management.

If you are used to safe languages, the concept of undefined behavior might be
confusing. In those languages, an operation is either allowed or not allowed,
and if it is allowed it has well-defined behavior. In C a number of operations
do not have well-defined results, but are still technically possible to
perform, or might be possible depending on circumstances. But you shouldn't
use them. The analogy is supposed to explain that.

~~~
sliverstorm
undefined behavior is fun. All this stuff gets a lot more interesting in
microcontrollers, where security is not an issue and you sometimes have to
manage memory directly and store data between power cycles.

------
jamesgeck0
I had a roommate in college who would write code like this. I (and several
other people) tried to explain to him that he couldn't necessarily count on
the pattern working all the time. His response was to say, "But it works
here!" and continue abusing C++'s undefined behavior.

I stopped giving him help with his CS coursework in fairly short order.

------
city41
Eric Lippert was one of the highlights of working at Microsoft. He was (and
I'm sure, still is) very active on the internal C# mailing list. His answers
are always very entertaining to read. He has a smugness that is deserved and
not irritating (mostly :)), and inside the walls of MS he turns that smugness
up quite a bit higher.

~~~
ericlippert
I'm glad you enjoyed my answers; however, I am never intending to come across
as smug. It is difficult to get across subtleties in an impoverished text-only
medium; it has certainly been my experience that _instructive and constructive
criticism_ can easily be read as smugness. It's also been my experience that
trying to deliberately make it sound less smug often just turns it into
sounding like condescension. Text-based communication of technical issues is a
hard problem and I do struggle with finding the right tone. Thanks for the
feedback.

~~~
city41
I don't think "smug" is quite the correct term. I meant nothing negative with
my comment (and was teasing with the "mostly"). You do have an "air of
authority" about your answers, but that authority is clearly earned. I enjoyed
your answers on the C# mailing list quite a bit.

------
briandoll
Great analogy and well written. We often see attempts at this style of
explanation, but we usually fail at one or both of those qualities.

While I wish I saw more of this style of teaching in technical publications,
it's admittedly rarely done well enough to convey the point.

------
dools
I'm not sure I agree _entirely_ with this explanation. He emphasises that this
is unsafe behaviour that is protected against in safer languages - I've never
heard it said that the use of pointers should be considered unsafe and there
are also pointers in other languages that exclude things like multiple-
inheritance and have garbage collection (which could be considered safer
languages).

I like the room key analogy (almost) but I don't think it needs to be stolen
and I don't think all the talk about policies and contracts is necessary at
all.

It could simply say the thing you're returning is not the local variable. It
is a pointer.

I have a box and instead of giving you the box I give you a key to the box.
Anytime you want to use the contents of the box you must come over to the box
and open it, but you can't take the box with you. If you want to put something
in the box you just walk over and put it in - but here's the rub: you have no
guarantees that you're the only one with a key. I could give out the key to a
bunch of people so you don't know what will be in the box and who owns it.

Use with caution!

~~~
stoney
I think he means that pointers are unsafe in the same way that dynamite is
unsafe. They both have legitimate uses, but you need to be very careful when
you are using them. (Rather than unsafe in the sense of don't use them at
all).

I like your box analogy, but I don't think it goes far enough - with C++
pointers it is not always ok to put something in the box. If the pointer is
out of date or invalid there is a chance that writing to it will make the OS
kill the entire process.

Languages with garbage collection don't tend to have that kind of problem,
since whatever is at the memory location will stay at that location until
everyone surrenders their pointers.
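
A rough sketch of that "stays until everyone surrenders their pointers" idea, using plain reference counting in C as a stand-in. Real garbage collectors generally work differently (most trace reachable objects rather than count references), and `box`, `box_new`, `box_retain`, and `box_release` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted box holding one int. */
typedef struct {
    int refs;   /* how many keys are currently handed out */
    int value;
} box;

/* Create a box with one key outstanding. Returns NULL if malloc fails. */
box *box_new(int value)
{
    box *b = malloc(sizeof *b);
    if (b != NULL) {
        b->refs = 1;
        b->value = value;
    }
    return b;
}

/* Hand out another key to the same box. */
box *box_retain(box *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
    return b;
}

/* Surrender a key. Returns 1 if this was the last key and the box was
   freed, 0 if other keys are still outstanding. */
int box_release(box *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs == 0) {
        free(b);
        return 1;
    }
    return 0;
}
```

The contents stay put while any key is outstanding. The catch in C is that nothing forces callers to go through `box_retain` and `box_release`, which is exactly the discipline a managed language takes off your hands.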

~~~
maranas
I agree with you for the most part. C/C++ lets you do what you want,
presumably because you know what you're doing. But I don't agree that pointers
are evil, and should be avoided. If you want to do any practical programming
with C/C++ at all, you have to learn how to dynamically allocate memory and
use pointers. Generally, pointers work as advertised. It's the cases where
they work when they shouldn't that are the problem. In these cases a compiler
usually warns you, so you are still covered. So in C/C++, just because it
works, doesn't mean your code is right.

~~~
nl
_But I don't agree that pointers are evil, and should be avoided._

No one is saying that at all. Pointers aren't evil, but they are dangerous.

Perhaps a better analogy is a sharp chef's knife. In the right hands it's an
effective and efficient tool that lets you do things quicker and just as
safely as any other tool.

In the wrong hands it is dangerous to the person using it and to those around
them.

There are numerous other examples: welding torches, motorbikes, explosives etc
etc.

~~~
stoney
Exactly. I didn't mean to imply that pointers are evil or should be avoided.
That was supposed to be the point of my dynamite analogy, but I guess the
comment made elsewhere on this topic about the inadequacies of analogies holds
true here.

So, for the avoidance of doubt, I believe: pointers are awesome, powerful
tools and you can do some great things in C/C++ using them and I sometimes
miss them (a little bit) when using other languages. But you can also do some
terrible things with them - and I have done some spectacularly bad things with
them in the past. But that doesn't mean they are bad - it just means that I am
reckless.

~~~
maranas
^ Ditto. I thought pointers were being shown in a negative light at first, but
now I see the point. In my opinion too, manual memory management is a very
important part of developing highly scalable applications, but should only be
done if absolutely needed - which is fortunately rarely the case nowadays as
most people just develop for the Web, and can afford to throw money at the
problem. But for embedded systems where resources are limited and inputs are
very limited, it is still very useful. In the early years, it was C + inline
assembly for further optimization - where you try your best to avoid assembly.
I guess now, it's a dynamic/interpreted language, plus C/C++ for further
optimization, and avoid C/C++ like the plague as much as possible.

------
wheels
Local variables in C and C++ are just put on a stack (i.e. _the stack_ ) and
its behavior is pretty predictable. For example:

    
    
      #include <stdio.h>
      
      static int *foo(void)
      {
          int i = 42;
          return &i;
      }
      
      static int *bar(void)
      {
          int i = 43;
          return &i;
      }
      
      int main(void)
      {
          int *f = foo();
          /* with next line commented out prints 42, with it, 43 */
          bar();
          int i = *f;
          printf("%i\n", i);
          return 0;
      }
    

Until you've made another function call that reuses that space on the stack
you'll almost certainly still have valid values when accessing those local
variables by address. Things are a bit vaguer in C++ where those variables may
be objects which have destructors which have already been called.

Edit: Made it predictable, per caf's (correct) comment by adding:

    
    
      int i = *f;
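
For contrast, a sketch of two well-defined ways to get the value out. The function names are mine, for illustration: either return the value itself, or have the caller supply the storage so it outlives the call.

```c
#include <assert.h>
#include <stddef.h>

/* Return by value: the caller gets its own copy of the int. */
int foo_by_value(void)
{
    int i = 42;
    return i;
}

/* Out-parameter: the caller owns the storage, so the pointer is still
   valid after the function returns. */
void foo_into(int *out)
{
    assert(out != NULL);
    *out = 42;
}
```

Neither version depends on what happens to the callee's stack frame afterwards, so an intervening call like `bar()` can't change the result.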

~~~
alain94040
In embedded systems where interrupts are handled from user space (or the code
you show is in the kernel), then it is guaranteed that your stack will be
obliterated if an interrupt occurs at the exact right time, so your value will
not be there.

This is not academic: I have debugged code that relied on that behavior, and
would fail once in a million runs. Very frustrating to figure out.

~~~
bodyfour
Same in userland if you're using POSIX signals (and you haven't explicitly
configured a separate sigstack)

All it takes is the user resizing their xterm at JUST the right moment and
the SIGWINCH handler will happily step on your out-of-scope stack objects.

------
thewisedude
A good analogy no doubt! But you should be careful when using analogies. Every
behavior of the analogy might not be applicable to the original problem.

Analogies can only help make people "understand" something. Normal brains try
to relate things to what they have learnt so far. So analogies seem to help.
Analogies in themselves are not scientific explanations obviously.

A scientific/rational explanation to the question posed by the user would be:
C++ makes no promise about the behavior of the program when an out-of-scope
variable is used. Obviously this explanation, a simple one, is much less
dramatic.

------
ConceitedCode
Title is a bit misleading... but still a great answer.

------
stevetjoa
200 upvotes in nine hours?! Is that a Stack Overflow record?

~~~
leon_
no, I guess many people started to understand this behavior just now.
(something you don't meet when all you do is javascript and ruby/python)

~~~
DrJokepu
Maybe I'm an elitist but I find it rather scary that people write code for a
living without understanding concepts such as this. It's easy to forget what a
huge service StackOverflow provides to the developer community by spreading
knowledge in an easily accessible and entertaining fashion.

~~~
roryokane
If those people who write code for a living never program in C++ or other
languages with pointers, instead using languages like Java and Python, I
wouldn't expect them to know what happens when you mess around with pointers
in C++. If you don't use pointers, and your language doesn't even support
pointers, why care about the behavior of pointers? Understanding the simpler
concept of "references" is good enough.

------
skrebbel
if you can always legally access the book, which will for sure be available to
you in that drawer, then the hotel key you stole must be a closure.

