
Lambdas for C – sort of - fogus
https://hackaday.com/2019/09/11/lambdas-for-c-sort-of/
======
dragontamer
The "hackaday" blog focuses on cool things that are typically (but not
necessarily) impractical. He isn't suggesting that this "lambda" be used.
Instead, this is a stealth blog-post about "__anon", as far as I can tell.

Which is really what hackaday is about: finding weird features in
hardware/compilers/etc. etc. and using them in some manner. There's a whole
lot of obscure features of GCC that are being touched upon in this blogpost
(nested functions, whatever is going on with $__anon$, etc. etc.). I can't say
that I can figure out exactly what is going on yet, but it's kind of exciting
to see all of these features get used at once.

[https://github.com/wd5gnr/clambda/blob/master/clambda2.c](https://github.com/wd5gnr/clambda/blob/master/clambda2.c)

EDIT: Unfortunately, it just segfaults for me at the moment.

    
    
        $ gcc --std=gnu99 clambda2.c
        $ ./a.out
        Segmentation fault (core dumped)
        $ gcc --version
        gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
    

This is Ubuntu on Windows, but I doubt that would make a difference.

~~~
quietbritishjim
> whatever is going on with $__anon

The "lambda$__anon$" identifier is just the name of the local function; it
could just as well have been "elephant" or anything else. The first line
defines the nested function:

    
    
        {
            double elephant (double x){ return x/3; }
    

And the second line references that same identifier:

    
    
            &elephant;
        }
    

Normally an expression statement that doesn't include an assignment or a
function call is legal but doesn't do anything. But as the article mentions,
GCC uses it as the value of the whole block.

The commenters seem to have identified the undefined behaviour here: the
resulting value is a pointer to a function that's only valid within the block
but is being used outside it.

~~~
ramshorns
What do the dollar signs do? If they're really just part of the identifier it
doesn't seem necessary to make sure the compiler supports it, rather than just
use a more normal name like elephant.

~~~
gpderetta
It is just a way to uglify symbols to make collisions with surrounding code
less likely. C macros are not hygienic.

------
kazinator
I'm not convinced that GCC defines the behavior of this, because the trick
relies on defining a local function in a block scope, and then allowing it to
escape from that block scope:

    
    
       {
         rettype foo(args ...) { ... }
         foo;
       }
    

GCC local functions are "downward funarg only", as far as I know. This would
definitely be wrong:

    
    
       {
         int local = 42;
         rettype foo(args ...) { ... reference local ... }
         foo;
       }
    

then, when _foo_ is called, _local_ no longer exists, which is bad news. The
lambda macro doesn't do this (the block doesn't extend the environment;
nothing is captured from there), and so maybe works by fluke.

Another thing to note is that pointers to GCC local functions work via
trampolines: pieces of executable machine code installed into the stack. When
you use GCC local functions, the linker has to mark the executable with a bit
which says "allow
stacks to be executable". The default in most distros is non-executable
stacks, which guards against stack overflow exploits.

(Speaking of trampolines, I'm not sure about the effective scope of those. If
we lift a pointer to a local function inside a block, requiring a trampoline,
and then that block terminates, is that trampoline scoped to the block or the
function? If it's scoped to the function, won't it be overwritten if we
execute that logic multiple times? If the trampoline is scoped to the block,
then the invocation of foo is using an out-of-scope trampoline.)

~~~
ndesaulniers
There are quite a few GNU C extensions with unspecified behavior for edge
cases. Source: have implemented and debugged/fixed some in Clang.

~~~
kazinator
By the way, I compiled and ran the program (Ubuntu 18.04, x86_64) with various
optimization options and whatnot, such as -fstack-protector. It runs cleanly
under Valgrind.

~~~
iforgotpassword
Valgrind is pretty bad at detecting stack corruption, or at least was a couple
years ago. Did you try -fsanitize=address too?

------
cryptonector
So, this doesn't work because the scope of the statement-expression _is_ the
scope of the local function, so to use the function outside that scope (as TFA
shows) is UB.

C w/ GCC's local functions extensions is just not enough for lambda
expressions. You have to declare the local function earlier than (and in scope
of) the use site.

For example, an expression like this:

    
    
      float x = add_fns(1,
                        lambda(float,(float x),{ return 2*x; }),
                        lambda(float,(float x),{ return 3*x; }));
    

may well assign 6.0 to x rather than 5.0 because the first lambda gets
overwritten on the stack with the second. That's if it works at all -- after
all, we have UB here, and this could just summon cthulhu or anything else.

~~~
DSMan195276
There actually appears to be a `gcc` bug here: `gcc` doesn't warn if you
return the address of a local function, even though it's clearly bogus usage
due to it being implemented via a trampoline on the stack.

Interesting note, some quick testing shows that if the local function doesn't
require any variables from the outside scope, it will actually be stored in
the `.text` segment, which would allow this to work in a defined way. That
said, I view this as just an implementation detail that you can't rely on, as
the docs don't mention this and only talk about trampolines. It's also super
easy to mess up, obviously.

~~~
cryptonector
Good points all around.

------
pjmlp
Apparently the author forgot to look into clang's blocks language extension.

[https://clang.llvm.org/docs/BlockLanguageSpec.html](https://clang.llvm.org/docs/BlockLanguageSpec.html)

------
basementcat
There are a variety of ways to use lambdas in C, each uniquely horrifying.

[https://codegolf.stackexchange.com/questions/2203/tips-for-g...](https://codegolf.stackexchange.com/questions/2203/tips-for-golfing-in-c/104999#104999)

~~~
eyegor
Hmm, I wonder if you couldn't wrap all those horrible approaches in a unified
header macro with

    
    
      #ifdef __GNUC__, 
      #ifdef __clang__,
      etc.

------
mpfundstein
I still wonder why C still doesn’t have lambdas in the standard. I understand
it’s quite a slow-moving language, but they would make programming in it much
nicer (see C++11).

Are there any underlying ‘issues’ with lambdas, I wonder?

~~~
Gibbon1
Having kicked this around, I think there are three problems.

One: Compiler development is driven by the C++ standards committee. And they
all hate C and wish it would die already. More to the point, things you would
do to make C a better, more powerful language are orthogonal to the direction
C++ is being pushed.

Two: Being tied to C++ also means being tied to the same ABI as C++. And
improvements to the C language probably would need some extensions to the ABI.

Three: I can't wrap my head around this but a lot of people are extremely
hostile to attempts to extend and improve C.

~~~
jcranmer
> One: Compiler development is driven by the C++ standards committee. And they
> all hate C and wish it would die already.

The latter statement is not true. But it is true that most of the evolution of
C/C++ is driven by the C++ committee, with the C committee mostly adapting
features from C++ and very little innovation in C being adapted for C++. (As
one C++ committee member confided to me, the C committee does have a bit of a
tendency to completely screw things up when the C++ committee liaisons leave
the room). But there is still coordination and cooperation between the
committees--for example, the recent proposals to replace the current EH model
in C++ include a coordinating proposal to modify the C ABI to provide access
to a Result-esque exception model.

> Two: Being tied to C++ also means being tied to the same ABI as C++. And
> improvements to the C language probably would need some extensions to the
> ABI.

The C ABI desperately needs extensions anyways, especially because it is the
de facto platform ABI and languages usually only support FFI features using
the C ABI. The biggest missing features here are SIMD vector support and
multiple return value support.

~~~
Gibbon1
I apologize for the slight against the C++ standard people.

I do like your comment about the ABI needing to be extended to improve FFI
features. I feel that way too. Also think that a clean (non clunky) method for
FFI is exactly what C has needed for a long time.

------
lgeorget
> However, it seems like if it compiles it ought to work and — mostly — it
> does.

I'm taking this out of context of course but that looks like a very dangerous
assumption to make...

