Bah, those are all well-known. What value does the following program return? int...

_kst_ · 2024-09-26T23:01:19 1727391679

It returns 2.

The only reason that might be surprising is that the "return *p;" statement refers to the value of an object at a point (textually) before its definition. But the lifetime of the object named "v" begins on entry to the innermost compound statement enclosing its definition -- in this case the body of "main".

Space for "v" is allocated on entry to "main". It's initialized to 1 when its definition is reached. The "return *p;" statement appears before the definition of "v" in the program source, but is executed after its definition was reached at run time, and within its lifetime.

It's important to remember that scope and lifetime are two different things. The scope of an identifier is the region of program text in which the identifier is visible; for "v" it extends from the definition to the closing "}". The lifetime of an object is the time span during execution in which it exists; for "v" it extends from the time when execution reaches the opening "{" to the time when execution reaches the closing "}". Formally, storage for "v" is allocated at the beginning of its lifetime and deallocated at the end of its lifetime. Compilers can and do optimize allocation and deallocation, as long as the visible behavior is consistent.
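To make the scope/lifetime distinction concrete, here is a minimal sketch. It is not the program from the parent comment (which is truncated above); the function name `read_before_definition` and the labels are made up for illustration. It only has the same shape: a `return *p;` that appears textually before the definition of `v` but executes after it.

```c
#include <stddef.h>

/* Hypothetical illustration, not the parent comment's program. */
int read_before_definition(void) {
    int *p = NULL;
    goto define;        /* skip forward; v is already alive, but not yet initialized */
use:
    return *p;          /* textually before v's definition, executed after it */
define:;                /* empty statement so the label is valid before C23 */
    int v = 1;          /* v's scope starts here; its lifetime started at the "{" */
    p = &v;
    *p += 1;            /* v is now 2 */
    goto use;           /* leaves v's scope, but not its lifetime */
}
```

The backward `goto` leaves the scope of `v` without leaving the block, so the object still exists when `*p` is read.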

Aside: If "v" were a VLA (variable length array, introduced in C99, made optional in C11) its lifetime would begin when execution reaches its definition.

shultays · 2024-09-27T00:42:27 1727397747

Can't it reuse v's memory for other things before v is defined? Say there is "int a = 4;" at the beginning of main that is no longer used when it reaches "int v = 1;", can't a & v share the same memory location?

_kst_ · 2024-09-27T03:36:06 1727408166

A compiler can reuse memory as much as it likes -- but only if the visible behavior of the program is consistent with the language requirements.

If you write:

    {
        int n = 42;
        printf("%d\n", n);
    }

in the abstract machine, `sizeof (int)` bytes are allocated on entry to the block and deallocated on exit, but a compiler can legally replace the entire block with `puts("42")` and not allocate any memory for `n`.

Memory for objects defined in nested blocks is logically allocated on entry to the block, but compilers commonly merge the allocation into the function entry code. Even so, objects in parallel blocks can certainly share memory:

    int main(void) {
        {
            int a;
        }
        {
            int b; // might share memory with a
        }
    }

Logically, memory for `a` is allocated on entry to the first inner block, and memory for `b` on entry to the second inner block. Compilers will typically allocate all the memory on entry to `main`, but can use the same address for `a` and `b`.
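If you want to see what your compiler does, something like the following sketch can check it. The function name is made up, and it assumes `uintptr_t` is available (it is an optional type, though nearly every implementation provides it). Each address is converted to an integer while its object is still alive, so the comparison afterwards doesn't touch a dead pointer. Either result is permitted, and it can change with optimization level.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if a and b got the same address, else 0.
   The language allows either outcome. */
int parallel_blocks_share_address(void) {
    uintptr_t addr_a, addr_b;
    {
        int a = 1;
        addr_a = (uintptr_t)(void *)&a;   /* converted while a is alive */
    }
    {
        int b = 2;
        addr_b = (uintptr_t)(void *)&b;   /* converted while b is alive */
    }
    return addr_a == addr_b;
}
```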

mananaysiempre · 2024-09-27T01:07:35 1727399255

As written, without introducing VLAs or additional blocks, no. C23 §6.2.4(5–6):

> An object whose identifier is declared with no linkage and without the storage-class specifier `static` has automatic storage duration [...].

> For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial representation of the object is indeterminate. If an initialization is specified for the object and it is not specified with `constexpr`, it is performed each time the declaration or compound literal is reached in the execution of the block [...]; otherwise, the representation of the object becomes indeterminate each time the declaration is reached.

That is, a local variable is live from the moment the block that contains its declaration is entered (however and wherever that happens) until it is left (ditto), but is initialized or, for lack of a better word, uninitialized each time execution passes that declaration (however many times that happens, including none). This is despite the fact that at compile time the variable’s name is not in scope until the = introducing its initializer (or the place where such a = would go if there isn’t one). Modulo its smaller feature set, C89 §6.1.2.4(3) stipulates the same.
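A small sketch of the "initialized each time execution passes the declaration" part, with a made-up function name. The backward `goto` stays inside the block, so `v` is the same object throughout, but its initializer runs again on every pass and overwrites whatever was stored before.

```c
/* Hypothetical illustration: v is re-initialized on every pass. */
int sum_of_reinitialized_v(void) {
    int passes = 0;
    int sum = 0;
again:;                 /* empty statement so the label is valid before C23 */
    int v = 1;          /* runs each time execution reaches this declaration */
    sum += v;           /* v is always 1 here */
    v = 100;            /* lost on the next pass, when v is set to 1 again */
    if (++passes < 3)
        goto again;
    return sum;         /* three passes, 1 each */
}
```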

In addition to GGP’s deliberately confusing example, this permits the much more reasonable and C89-compatible

  switch (x) {
      int i, j;
  
  case 1:
      /* use i and j */
      break;
  
  case 2:
      /* use i and j */
      break;
  }
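Filled in as a complete function (the name `classify` and the values are mine, purely for illustration), that pattern might look like the sketch below. Note that `i` has to be assigned in each case before use: control jumps straight to a `case` label, so an initializer on the declaration would never execute.

```c
/* Hypothetical illustration of a declaration shared by all cases. */
int classify(int x) {
    switch (x) {
        int i;          /* alive for the whole switch body, never initialized here */

    case 1:
        i = 10;         /* assign before use */
        return i + 1;

    case 2:
        i = 20;
        return i + 2;

    default:
        return 0;
    }
}
```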

The only exception is locals of variably modified type (e.g. variable-length arrays), whose declarations you can’t jump over on pain of undefined behaviour.

No wonder basically every C compiler allocates a single stack frame at function entry.

sweeter · 2024-09-26T22:25:09 1727389509

Is it 2? I'm not exactly sure though. I'm interested in hearing the logic

_kst_ · 2024-09-27T03:36:47 1727408207

See my comment above:

https://news.ycombinator.com/item?id=41664474

tylerhou · 2024-09-26T22:48:36 1727390916

gcc, msvc, and clang all produce code that exits with code 2: https://godbolt.org/z/WEYjns85Y