A little-known C feature: Static array indices in parameter declarations (2013) (hamberg.no)
259 points by fanf2 on Dec 29, 2018 | 60 comments



There is something personally poetic (is that a thing? it's poetic to me, but maybe not to you?) about a C feature I was unaware of despite working in the language for 25 years being yet another way in which the keyword "static" is overloaded.

It's as if they go around looking for places in the grammar they can unambiguously stick "static" and then back-rationalize a meaning for it.


If you like that, there is yet another seldom-seen feature of C99, the use of '*' in an array declarator:

    echo 'int my_func(int i, int arr[*]);' > main.c
    gcc -c -std=c99 main.c
    -- * observe that there are no errors * -- 
See 6.7.5.2 Array declarators for details http://port70.net/~nsz/c/c99/n1256.html

Edit: I can't figure out how to escape an asterisk, so the above might not read correctly, but I tried.


If the character following a * is whitespace it will show up as an asterisk without implication of special formatting.

    In code blocks *asterisks* don't need any special
    escaping as they always show up "as is".


What's the point of this? It seems like it's the same as not including the asterisk in the first place.


It seems this only really matters in function prototype declarations that contain multidimensional arrays. Compare:

    int foo(char [][*]);
    int bar(char [][]);
The second declaration should give an error.

I'm getting this from just reading the spec now, but it looks like the reason has to do with "incomplete" versus "complete" types.

Something like 'int arr[]' is an "incomplete type", i.e. an array of ints with unspecified size. While 'int arr[*]' is a complete type for an array with variable length. I'm guessing that array elements must have a complete type, which is why the above snippet gives an error.
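
A minimal sketch of how this plays out, assuming a compiler with C99 variable-length array support. The name sum_grid is made up for illustration. The prototype uses [*] for the inner dimension, and the definition then has to name the real size:

```c
#include <stddef.h>

/* Prototype only: [*] marks the inner dimension as "variable length,
   size not given here". It is legal in a prototype, not in a definition. */
long sum_grid(size_t rows, size_t cols, int grid[][*]);

/* The definition has to name the real inner dimension. */
long sum_grid(size_t rows, size_t cols, int grid[][cols])
{
    long total = 0;
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            total += grid[r][c];
    return total;
}
```

A caller would pass an ordinary 2D array, e.g. sum_grid(2, 3, g) for an int g[2][3].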


If I understood correctly, this will be removed in C2x.


The size of any upper dimension must be known for the array subscript operator to work; using * allows you to bypass that in function prototypes.


Having written a parser for C I can tell you that parameter type declarations are an absolute bear to parse. Wow.


> It's as if they go around looking for places in the grammar they can unambiguously stick "static" and then back-rationalize a meaning for it.

I would think that that's exactly what they do, in order to avoid breaking existing code by introducing any more reserved words.


If only they'd thought of using a special prefix for reserved words, e.g. _.


Those are the words that are part of the language itself! The _ prefix is used for variables.


I know. I mean if they differentiated keywords like static with _ (and didn't allow it as a variable prefix) then they could have an infinite number of non-clashing language keywords...


That's how it works for most new (from C99 onward) keywords; inline and restrict in C99 are the exceptions.

The actual keywords are defined with a leading underscore and a capital letter. The standard reserves all such identifiers for implementation, and so no valid conforming program should contain any such.

Thus, you get keywords like _Bool and _Complex. And then, to make them look nice, there are headers like <stdbool.h> which basically just do #define bool _Bool etc.
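
A small sketch of the two spellings side by side. The function names are invented for illustration:

```c
#include <stdbool.h>

/* _Bool is the actual keyword; it is usable without any header. */
_Bool is_even_raw(int n)
{
    return n % 2 == 0;
}

/* With <stdbool.h>, bool, true and false are available as nicer spellings. */
bool is_even(int n)
{
    return is_even_raw(n) ? true : false;
}
```

Both functions have the same return type; only the spelling differs.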


They're trying to avoid introducing new keywords or funky punctuation, both worthy goals.


They are likely to remove this in the next C standard (or may have already), because a declaration with a static size is “compatible” with a declaration with no static size, and thus you can get unsoundness.

So your compiler should accept (though the semantics of static suggest this program would be wrong):

    void foo(int a[static 3]);
    void foo(int a[]);
According to the spec, this is fine, but when you come to define the function foo, should it respect the ‘static’ annotation? The spec doesn’t say, and doing static analysis of subsumption for the expressions after ‘static’ is much more complex to keep sound than C prefers in its specification.
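
To make the compatibility point concrete, here is a sketch with a made-up function sum_first3. All three declarators are adjusted to the same pointer type, so a conforming compiler accepts them together, though some compilers may warn about the mismatched bounds:

```c
/* Three spellings of the same parameter type: each is adjusted to int *. */
int sum_first3(int a[static 3]);
int sum_first3(int a[]);

/* The definition carries no "static 3" at all, yet it is compatible
   with both declarations above. */
int sum_first3(int *a)
{
    return a[0] + a[1] + a[2];
}
```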


Try as I might I can't get gcc to produce warnings for this (invocation: gcc --std=c99 -Wall -Werror -Wpedantic -Wnonnull). I can get clang to do so however.

I didn't expect to get any compiler warnings for code that passes NULL to a [static 1] parameter, so I wasn't surprised when this code compiled silently:

    #include <stdio.h>
    
    void end(int foo[static 1]) {
        printf("%d\n", *foo);
    }

    void middle(int *foo) {
        end(foo);
    }

    int main() {
        middle(NULL);
        return 0;
    }
But I was a bit disappointed that the following compiled without warning as well:

    #include <stdio.h>

    void end(int foo[static 1]) {
        printf("%d\n", *foo);
    }

    int main() {
        int *foo = NULL;
        end(foo);
        return 0;
    }


GCC has an extension __attribute__((nonnull)) which is probably better if you want to ensure that NULL is not passed as a function parameter. https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attribute... According to this document it also allows the compiler to make extra optimizations: http://rachid.koucha.free.fr/tech_corner/nonnull_gcc_attribu...
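
A rough sketch of how the attribute might be used. ATTR_NONNULL and count_char are hypothetical names; the guard makes the macro expand to nothing on compilers that do not define __GNUC__:

```c
#include <stddef.h>

/* Hypothetical wrapper macro: use the GCC/Clang attribute when available. */
#if defined(__GNUC__)
#define ATTR_NONNULL __attribute__((nonnull))
#else
#define ATTR_NONNULL
#endif

/* With no argument list, nonnull applies to every pointer parameter. */
ATTR_NONNULL size_t count_char(const char *s, char c);

size_t count_char(const char *s, char c)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (*s == c)
            n++;
    return n;
}
```

As with [static 1], whether a warning fires depends on the compiler being able to see that the argument is null.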


That's because the compiler would have to track at compile time which value the "foo" has.


That's what I was afraid of. The feature is "incomplete" in my eyes, because one would expect the compiler to be "smart enough" to catch this case. A feature like this that has an unclear perimeter is of little use.

Well, I guess that this feature has a precise definition in the C spec, but one should note that the purpose of half of the warnings the compiler emits (guesstimate) is to mitigate imperfect knowledge of the C spec. A typical example is operator precedence: who can remember 17 levels of precedence? One can remember the obvious cases, and one can remember the less obvious cases if one uses them often enough, but for anything beyond that (and when in doubt while trying to figure out where a bug comes from), one just says "fack it" and puts parentheses everywhere. Smalltalk, APL, Lisp, and Forth were quite right to sacrifice natural expression notation for something more useful than being kind to beginners. An unnatural but simpler model is better than a natural but more complex model, because programmers are not compilers. This is a fact that compiler/language designers seem to forget sometimes.

So it's no surprise to me that this feature is little known: it's not reliable.


IIRC GCC doesn't warn you but Clang does. Looks like the decade-old rule of thumb "use Clang for development and GCC for production" still applies.


Seems cool, but how is it useful in practice? In 15 years I've never written a C function taking a fixed array size at all. I always write my functions with a dynamic size argument like

    void foo(char *data, int len)
Perhaps cryptography functions with 256-bit outputs might be able to use this, but even when the array size is fixed, most crypto libraries (like OpenSSL) tend to have a size argument anyway, to encourage the caller to think about what they're doing, and for the library to check that 32 bytes are actually available, for example.


I would agree. IMO, I would argue that if you are in that situation, you should just wrap the array in a `struct`. It will work the same way, but now nobody can get the size wrong - the `struct` will always have an array of the correct length inside it. And as a bonus you can give it a good name.
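
A sketch of what that could look like for the 256-bit case mentioned above. The type and function names are invented for illustration:

```c
#include <stdbool.h>
#include <string.h>

/* The length lives in the type, so callers cannot pass a short buffer. */
struct digest256 {
    unsigned char bytes[32];
};

/* Compare two digests byte for byte. */
bool digest256_equal(const struct digest256 *a, const struct digest256 *b)
{
    return memcmp(a->bytes, b->bytes, sizeof a->bytes) == 0;
}
```

Inside the function, sizeof a->bytes is still 32, since the member is a real array and not a decayed pointer.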


I like using pointer-to-array types for this purpose as (like an array in struct) the array's length is encoded in the type, and thus likewise allows the compiler to warn if an incompatible array is provided. e.g.

   void f(char (*n)[255]);
   char array[6];
   f(&array);
warns of "incompatible pointer types passing 'char (* )[6]' to parameter of type 'char (* )[255]'"

This won't produce a diagnostic for f(NULL) like "static" does, but does have two properties that might be considered benefits:

1) The length is exact rather than a minimum.

2) The type of "* n" is still char[255], whereas a char[static 255] parameter is still a decayed pointer-to-char. Thus with the former sizeof(* n) behaves as expected inside of "f", yielding 255.

These are true of the array-in-struct method as well.
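
A short sketch of property 2, using a made-up function clear_buffer. Because the parameter is a pointer to the whole array, sizeof *n reports the full array size:

```c
#include <stddef.h>
#include <string.h>

/* n points at an entire char[255], so sizeof *n is 255 here. */
size_t clear_buffer(char (*n)[255])
{
    memset(*n, 0, sizeof *n);
    return sizeof *n;
}
```

A caller passes the address of the array, e.g. clear_buffer(&buf) for a char buf[255].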


This is a fantastic technique that didn't occur to me for an embedded project that I was developing in C and where passing arrays of known size was a frequent thing. Also it was desirable to save space by avoiding the padding that an "array+size" struct would contain.

Thanks for commenting this.


But remember structure padding, which the compiler is free to add in the middle (and end) of a struct (though there are pragmas and/or attributes to pack structs). IOW, an array and a struct are not necessarily the same in memory.


That's completely true; however, in practice any extra padding shouldn't matter unless you're trying to do some funny business or are severely memory constrained. If all you really needed was a fixed-size array, then it should work exactly the same, even if it technically uses a byte or two more. (Edit: And of course, while I think you already know this, the compiler can't add padding inside the array, so the array itself will still be exactly the same memory-wise.)

With that, I wouldn't expect that any padding would be added since the array should already have the same alignment as the `struct` anyway - but compilers are known to do weird things from time to time, so it's definitely not impossible.


The array in a struct is still going to be contiguous in memory. All you might change is alignment.
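
One way to check this on a given platform rather than guess, sketched with a hypothetical struct vec3. The first member always sits at offset 0; any extra bytes can only come after the array:

```c
#include <stddef.h>

/* A struct whose only member is a fixed-size array. */
struct vec3 {
    double v[3];
};

/* Bytes of padding the compiler added after the array (often 0). */
size_t vec3_trailing_padding(void)
{
    return sizeof(struct vec3) - sizeof(double[3]);
}
```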


Here's one example of where I've used this:

    void wlr_matrix_multiply(float mat[static 9], const float a[static 9],
    	const float b[static 9]);
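
For readers who want to see a body behind such a signature, here is a hypothetical sketch. It is not the library's implementation; mat3_multiply and the row-major layout are assumptions made for illustration:

```c
#include <string.h>

/* Hypothetical row-major 3x3 multiply: out = a * b.
   [static 9] documents that each pointer must address at least 9 floats. */
void mat3_multiply(float out[static 9], const float a[static 9],
                   const float b[static 9])
{
    float tmp[9]; /* scratch space so out may alias a or b */
    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            float sum = 0.0f;
            for (int k = 0; k < 3; k++)
                sum += a[row * 3 + k] * b[k * 3 + col];
            tmp[row * 3 + col] = sum;
        }
    }
    memcpy(out, tmp, sizeof tmp);
}
```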


As this is valid, but only useful as documentation:

    void foo(const int n, const char data[n])
Is this also valid?

    void foo(const int n, const char data[static n])


The `n` may be any assignment-expression, and as a result values like `n` appear to be allowed by the standard.

I've used this in code compiled with gcc and clang. Both accept this construct, but I'm not sure how effectively any compiler is able to use this information to emit warnings/errors.
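
A minimal sketch of the construct, with an invented function sum_n. The length parameter has to be declared before the array parameter that refers to it:

```c
/* data must point to at least n ints; n is an earlier parameter. */
long sum_n(const int n, const int data[static n])
{
    long total = 0;
    for (int i = 0; i < n; i++)
        total += data[i];
    return total;
}
```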


Not only do the opportunities to use it seem to be limited, but the opportunities for the compiler to statically check that the constraint is respected seem limited, both in the case of the caller and in the implementation of the function.


This is not primarily about static checking. This is mostly a guarantee to the implementation so it has information at the target function so it can generate better code without having to resort to derivations or complex data flows (to optimize, basically).


If you have ever written a function that accepts a pointer to a single value as an argument, with null pointer not being valid input, then declaring it as a static 1-element array should make things just a little bit better (mainly because the compiler can assume no null).


You could define a macro NOTNULL for this, to make it clearer to readers.

    void foo(char bar[static 1])
vs

    void foo(NOTNULL(char* bar))
Though I'm not sure whether this can be expressed as a macro, since you need to remove the "*"?


How about this:

  #define NOTNULL 
  
  void foo(NOTNULL char *bar)
If you're using Objective-C, then nonnull is your friend.


The point is to get [static 1] in the expansion of the macro, so that the compiler is aware too, not just the coder.
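
One possible way around the "*" problem, as a sketch only: pass the element type and the name as separate macro arguments. This two-argument NOTNULL and the function my_strlen are hypothetical:

```c
#include <stddef.h>

/* Hypothetical two-argument form: type and name are passed separately so
   the macro can emit an array declarator instead of a pointer. */
#define NOTNULL(type, name) type name[static 1]

/* Expands to: size_t my_strlen(const char s[static 1]) */
size_t my_strlen(NOTNULL(const char, s))
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return len;
}
```

The cost is that the call site no longer looks like an ordinary declaration, which may or may not be clearer to readers.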


Ah. I was wondering why this was difficult; thanks for the clarification!


The clang nullability annotations can be used in C/C++ as extensions: https://clang.llvm.org/docs/AttributeReference.html#nullabil... You could then conditionally define your NOTNULL using them or not. The compiler will then complain about incorrect usage.


See here:

https://news.ycombinator.com/item?id=18787595

The compiler didn't complain when passed a pointer set to NULL (OK, not the same as passing NULL directly).


You didn't read the article, it seems. static in this context doesn't mean a fixed array size; it means at least as many elements as indicated. For example, strlen(const char *src) could be declared as strlen(const char src[static 1]).


you can however use

    void foo(char data[static 0], int len)
with 0 or 1 to signal at least zero/one elements are required (for example), and the pointer must be non-null.


The standard says "If the expression is a constant expression, it shall have a value greater than zero." Clang gives me a warning that "static 0" has no effect. NULL is valid as a pointer to a 0-length array, so that makes sense that it wouldn't function as a non-NULL assertion.


When doing geometry, it is common to pass around double[3] or double[3][3] types.


Here's an example of "use in anger", https://gitlab.com/jjg/lcrp/blob/master/lcrp.c#L159 -- the algorithm used needs the array to have at least 3 elements, generally there will be more. It seems useful that there is a mechanism to warn the future me that this limit should be observed.


    void SetColour (const double rgb[static 3])


Nice try, but a struct would be more appropriate.


Sometimes, but not always. ex: a big piece of RGBRGBRGB memory that you iterate. ex: https://www.vtk.org/doc/nightly/html/classvtkProperty.html#a...
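
A sketch of the packed-buffer case, with made-up names. The per-pixel function takes rgb[static 3], and the caller walks a flat RGBRGBRGB buffer without copying into structs:

```c
#include <stddef.h>

/* Works on any three consecutive doubles. */
double channel_sum(const double rgb[static 3])
{
    return rgb[0] + rgb[1] + rgb[2];
}

/* pixels holds pixel_count triples laid out back to back. */
double buffer_sum(const double *pixels, size_t pixel_count)
{
    double total = 0.0;
    for (size_t i = 0; i < pixel_count; i++)
        total += channel_sum(pixels + 3 * i);
    return total;
}
```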


That feels like something the compiler should fix, not the programmer. Then again, this is C where simplicity is preferred over costless abstraction.


What about arrays of a struct with a double[3] member?


Sometimes there are fixed arrays. E.g. an MD5 hash is 16 bytes, classic Mac OS file/creator types were 4 bytes, and so on.


> nice

Ok, so, in addition to ‘static’ meaning either ‘local static’ or ‘private’, turns out we also have ‘at least.’ While I kinda understand the need to overload keywords, this still rubs me the wrong way every time I see it. Well, I guess, if this is OK in a natural language (for a word to have several meanings; and in mathematics, too), it must be OK in a programming language...


C's "static" - a Short Token whose meaning is Always Taken In Context.


I once proposed to extend that to allow array size checking.[1]

The trouble with C static array sizes is that the compiler doesn't do much of anything with the information.

[1] http://www.animats.com/papers/languages/safearraysforc43.pdf


Hardware solutions like Solaris with SPARC ADI are the only workable way, if ISO C is not willing to actually change the language.

However the other CPU vendors and OS developers don't seem to be in a hurry to provide similar security support.


There's also Intel MPX: https://intel-mpx.github.io/design/


Intel has deprecated MPX and will remove it from newer CPU generations.

Likewise GCC 9 will be dropping support.

Only ARM and Google (via Android) are currently serious about this kind of feature support, but it will still take a couple of years until it becomes widespread.


I wouldn't call another overloading of the static keyword a nice feature. Useful, maybe, but ugly given how overused the keyword is.


Am I the only one that loves C but hates C99?


It's hard to love C over C99 with its declaration before statements rule.


I totally agree with you. It's all about C11!



