
Partially Applied Functions in C - mmphosis
http://tia.mat.br/blog/html/2013/07/20/partial_functions_in_c.html
======
haberman
Interesting hack. This is roughly equivalent to writing a dynamic code
generator (JIT) for generating tiny little trampolines, except it's
architecture-independent. It achieves portability by patching up some existing
code (for an unknown instruction set) rather than generating code from scratch
(which would require you to understand the instruction set).

It would need some changes to be truly architecture-independent; for example,
it should use memcmp() instead of this idiom, which can raise SIGBUS on some
platforms due to the unaligned access:

    
    
        if (*((intptr_t *)code) == look) {
    

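A sketch of the memcmp() alternative (the function name `find_magic` is mine, not from the article): scanning byte by byte finds the magic value at any alignment, and memcmp() is defined to compare as unsigned chars, so no unaligned word access is forced.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan `len` bytes starting at `code` for the byte pattern of `look`.
 * memcmp() compares as unsigned chars, so no unaligned load, so no SIGBUS. */
static unsigned char *find_magic(unsigned char *code, size_t len, uintptr_t look) {
    for (size_t i = 0; i + sizeof(look) <= len; i++)
        if (memcmp(code + i, &look, sizeof(look)) == 0)
            return code + i;
    return NULL;
}
```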
I used a similar hack once when I wanted to create an ELF object file at
runtime, which is necessary to use the GDB JIT interface
([http://sourceware.org/gdb/onlinedocs/gdb/JIT-
Interface.html](http://sourceware.org/gdb/onlinedocs/gdb/JIT-Interface.html)).
I didn't want to implement the whole ELF format myself, but I needed this ELF
file to point to some JIT code that is located at an address that is only
known at runtime. I achieved this by generating the ELF object file with gcc
at build time, but used a similarly distinctive constant for the address so I
could patch it up at runtime.
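That build-time-constant-plus-runtime-patch trick can be sketched generically (illustrative names are mine, not haberman's actual code): search the blob for the distinctive constant and overwrite it in place once the real address is known.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replace the first occurrence of `magic` inside `blob` with `real`.
 * Returns 1 on success, 0 if the magic pattern was not found. */
static int patch_magic(unsigned char *blob, size_t len,
                       uintptr_t magic, uintptr_t real) {
    for (size_t i = 0; i + sizeof(magic) <= len; i++) {
        if (memcmp(blob + i, &magic, sizeof(magic)) == 0) {
            memcpy(blob + i, &real, sizeof(real));
            return 1;
        }
    }
    return 0;
}
```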

If you wanted to do this through straight-up codegen (instead of patching)
this would be really easy with DynASM, and you could use my tutorial here as a
guide:

[http://blog.reverberate.org/2012/12/hello-jit-world-joy-
of-s...](http://blog.reverberate.org/2012/12/hello-jit-world-joy-of-simple-
jits.html)

~~~
thedigitalengel
It isn't truly arch-independent, as it assumes PARAMETER_CONSTANT and
FUNCTION_CONSTANT will be stored as direct immediates in the generated code.
On some archs, for instance, 0xFEEDBEEF might be too big a constant, and the
compiler would then be forced to move it into a register (or a stack slot) in
parts.

Edit: and of course, you run the risk that on some archs, 0xFEEDBEEF is
actually a valid encoding for some instruction. :)

~~~
Someone
For examples of CPUs that behave differently, look at RISC CPUs. 32-bit
PowerPC, for example, would translate an immediate long load into an immediate
'load short into the high word, zeroing the low word' plus a signed immediate
addition (it would load $DEAE first, then add -$4111 to get $DEADBEEF).
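That sign-correction arithmetic can be checked directly (a worked example in C, not actual PowerPC code): since 0xBEEF is negative as a signed 16-bit value, the high halfword is pre-incremented to 0xDEAE so that the addition lands on 0xDEADBEEF.

```c
#include <stdint.h>

/* Recombine a constant from PowerPC-style lis/addi halves:
 * shift the high halfword up, then add the sign-extended low halfword. */
static uint32_t rebuild(uint16_t hi, int16_t lo) {
    return ((uint32_t)hi << 16) + (uint32_t)(int32_t)lo;
}
```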

The list of problems is way longer, by the way. This code makes assumptions
about pointer size (I don't think it will run on x64 with common ABIs).

There also is no guarantee that function pointers point to the memory where
the function's code can be found (there could be a trampoline in-between, or a
devious compiler writer could encrypt function addresses).

Neither is there a guarantee that functions defined sequentially get addresses
that are laid out sequentially (there is no portable way to figure out the
size of a function in bytes).

Finally, I don't think there is a guarantee that one can read a function's
code (its memory could be marked 'execute only').

I guess those more familiar with The C standard will find more portability
issues.

~~~
saurik
ARM (quite common these days) will sometimes do this as well, as the immediate
value of an instruction has to fit inside the fixed-width instruction, and the
instruction itself is only 32 bits (some of which are required for the
opcode). While it is slightly more likely you will see a PC-relative load,
that requires an extra data fetch; so, instead, a "movt" (move top)
instruction is used to set the upper 16 bits of a register after first
setting the lower 16 bits. This requires just as much space as the load plus
separate data literal, and runs entirely out of the instruction cache.

~~~
raverbashing
What I saw an ARM compiler doing once is putting the 0xDEADBEEF into a data
segment (in the 'text' area, I think) and then loading it from there.

Like loading a string, except that since it's only 4 bytes it can be loaded
in one step.

~~~
saurik
Not certain if you are disagreeing, so to clarify: you can either do a load
like that or two moves. The load requires one instruction word and one data
word, so two words. The moves require two instruction words, so also two
words. The load however, requires a data fetch. This fetch has to come from
somewhere nearby, as the load instruction cannot offset very far, and so is
going to be in the same segment as the code (called "text", for historic
reasons: this has nothing to do with strings, which will be stored in the
"data" segment, or sometimes even a "strings" segment), and most of the time
will be directly after (or, for long functions, inside) of the code for that
function. This is a trade off: moves are fast (cycle counts of instructions
are not all the same, so "single step" doesn't mean much), have better
guarantees that they will always have the same result (code changing has much
heavier penalties than data changing), and might use up an asynchronous memory
access that other parts of your code might be saturating (although I honestly
am not certain if ARM does this like other architectures do). It thereby comes
down to circumstance, configuration, and the whims of the people who wrote
your compiler as to whether you will get mov+movt or a single ldr pc,.X+data.

~~~
raverbashing
I'm not disagreeing, just mentioning

"It thereby comes down to circumstance, configuration, and the whims of the
people who wrote your compiler as to whether you will get mov+movt or a single
ldr pc,.X+data."

Yes, this sums it nicely

------
majke
I thought this is what trampolines are:

[http://en.wikipedia.org/wiki/Trampoline_(computing)](http://en.wikipedia.org/wiki/Trampoline_\(computing\))

[http://gcc.gnu.org/onlinedocs/gcc/Nested-
Functions.html](http://gcc.gnu.org/onlinedocs/gcc/Nested-Functions.html)

FFcall can be used to do similar things:
[http://www.gnu.org/software/libffcall/](http://www.gnu.org/software/libffcall/)

It has four subprojects:

* avcall - build a C argument list incrementally and call a C function on it.

* vacall - C functions called with variable arguments

* trampoline - closures as first-class C functions

* callback - closures with variable arguments as first-class C functions

[update] I originally used an outdated link to [http://www.haible.de/bruno/packages-
ffcall.html](http://www.haible.de/bruno/packages-ffcall.html)

~~~
thedigitalengel
On llvm: [http://llvm.org/docs/LangRef.html#trampoline-
intrinsics](http://llvm.org/docs/LangRef.html#trampoline-intrinsics)

------
spc476
I use TCC for this type of thing. TCC is a small, fast C99 compiler that is
also a library (so it can be embedded in an application). It can compile code
into memory which can then be called (in fact, the same code can be re-
compiled multiple times, each time resulting in new code).

On the down side, it only supports x86, x86-64, and ARM architectures.

~~~
mmphosis
[http://bellard.org/tcc/](http://bellard.org/tcc/)

------
zbowling
Trampolines do this.

A good example is Objective-C, where this happens all the time when you get a
block's IMP. The closure from the block is wrapped in a new trampoline so that
you can use the block as any normal IMP, with the closure's state stored
correctly.

You can actually do this really easily and in a cross platform way with
libffi.

What is interesting is that on a secured platform, you might not be able to
mark a page as both executable and writable, so this isn't always possible. On
iOS we had to do an amazing hack with a .S file and the vm_remap call on
Darwin. Libffi uses a cute technique as well.

------
comex
I still wish Apple blocks were standardized in C - like C++ lambdas, but they
have a real type and can be kept alive after the containing function has
ended, and they come with block versions of the various C standard library
functions that take a function and argument (including atexit, though not
signal).

~~~
saurik
You can keep a C++11 lambda alive after the stack frame disappears using copy
semantics (the = modifier), which is what blocks are doing anyway; C++11,
however, gives you more control, as whether a copy or a reference is taken is
a property of the lambda, not the variable (as it is when using the __block
modifiers). Also, not having a nameable type is a non-issue, due to templates,
auto, and std::function (which is, in a way, the standard type you would
actually use to deal with a lambda). If anything, lambdas have more type, as
the type of a Block argument is erased in the type signature, even though it
is required for the calling convention.

~~~
comex
> You can keep a C++11 lambda alive after the stack frame disappears using
> copy semantics (the = modifier), which is what blocks are doing anyway

Ah, yes. Blocks allow you to keep _mutable_ data alive without specifically
moving them to the heap; in any case, simply having them is more important
than those differences. I am not sure what you mean about types being erased.

~~~
saurik
C++11 supports mutable lambdas. The only thing blocks seem to be able to do
that C++11 lambdas do not is that multiple blocks are able to share mutable
data (which at least makes the decision to make __block a storage class not
stupid).

As for the type erasure, a block is pragmatically an instance of NSBlock,
which is represented as "?" in the type signature and is morally equivalent to
an object id. There is therefore an implicit conversion from blocks of all
types to id.

Meanwhile, even in C++, there is an implicit conversion from id to blocks of
any type signature. As blocks are objects and objects often need to be
smuggled around, this leads to messy and non-obvious type conversions in
programs that use them.

From a language design perspective, this fails to take into consideration the
history of Objective-C; while object type casting in Objective-C is sloppy,
selectors are supposed to fairly uniquely map to a specific type signature.

The reason this is important is that if you have a random object, you want to
be able to send it an arbitrary message and not be concerned you got the type
signature wrong; the worst that can happen is that the object doesn't
understand.

However, blocks mix together the concept of invocation with selectors both
physically (you can even send it an invoke message) and conceptually (as you
can think of the call as a special selection syntax) and yet break this type
guarantee.

Even worse, blocks didn't even have their type encoded _at all_ when they were
first designed, so in addition to this being awkward at compile time, it was
even fundamentally impossible to catch this kind of type error at runtime.

Thankfully, they fixed that (although blocks compiled with older compilers, a
category in which I believe Apple these days simply includes all of gcc, won't
have this information), but they did not also solve any of the other type
issues.

~~~
comex
> C++11 supports mutable lambdas. The only thing blocks seem to be able to do
> that C++11 lambdas do not is that multiple blocks are able to share mutable
> data

Blocks are also a bit simpler if you want to continue to mutate the same
variable both inside and outside of a lambda (but then have the variable stay
alive after the outer function exits).

> As for the type erasure, a block is pragmatically an instance of NSBlock,
> which is represented as "?" in the type signature and is morally equivalent
> to an object id. There is therefore an implicit conversion from blocks of
> all types to id.

All objects are also represented with the same character, and all function
pointers are ^?. This may be unfortunate, but isn't really about blocks; in
pure C, of course, they are only erased inasmuch as all types are erased.

> The reason this is important is that if you have a random object, you want
> to be able to send it an arbitrary message and not be concerned you got the
> type signature wrong; the worst that can happen is that the object doesn't
> understand.

How do you mean? If you have a random `id`, or possibly an object casted to
the wrong object type (isn't that the equivalent scenario, since you're
talking about implicit casting from id to blocks of any type signature?) and
send it a message for a selector it implements but with different types, the
call will go through and probably crash, same as if you cast a block to the
wrong type.

~~~
saurik
> ...and send it a message for a selector it implements but with different
> types...

I am not certain how to best correct the misconception here. Yes: if you do
this, it will crash. However, as I stated, the history of this language sets
us up so that this should not happen: a selector should uniquely identify a
calling signature to the best of the abilities of the programmers involved
(and generally it will by accident for anything but very short names).

The reality is that this is actually not just convenient but somewhat
required, as the compiler has to select a specific calling convention at
compile time; so if you use an object in a situation where the selector's type
signature is even remotely ambiguous across the libraries you are working
with, you get a compiler diagnostic.

Again, with blocks, this all breaks: you can't even get a warning, because the
user has to cast it first to a specific calling convention, and it isn't
obvious in the code, because the cast is allowed to be implicit. The chance of
an overlap is high, because the conceptual selector's name is effectively
nothing. Objective-C mitigates this problem elsewhere: not here.

> All objects are also represented with the same character, and all function
> pointers are ^?. This may be unfortunate, but isn't really about blocks; in
> pure C, of course, they are only erased inasmuch as all types are erased.

A) It is not relevant that the class of an object is erased, because what's
important is the type of a message implementation: you don't break the type
system by having an object of the wrong class, only by sending a message with
the wrong calling convention. It doesn't matter at all, therefore, whether
objects are all "erased": there wasn't information there to begin with.

B) It is not relevant that C function pointers have their type erased, because
they aren't also objects: the whole point of objects in Objective-C is that
they are dynamically typed (back to point A: their type is irrelevant), so you
are expected to use them as arguments to things like withObject: and then let
them implicitly cast back on the other side. Again: selectors are king.

C) It is not relevant that in pure C types are "erased" as C does not have a
runtime, so I'm not even certain how one could determine the difference
between the types being erased and the types not being erased. The erasure
wasn't even the core problem, remember: the erasure only makes it impossible
to have runtime fixes for the actual problem, which is the compile-time
selector slop.

~~~
comex
> However, as I stated, the history of this language sets us up so that this
> should not happen: a selector should uniquely identify a calling signature
> to the best of the abilities of the programmers involved (and generally it
> will by accident for anything but very short names).

I see what you're saying... screwing up an id cast will _usually_ not cause
low-level indeterminacy (even though it might), but screwing up a block cast
always will. This seems like a pretty minor issue to me, especially as blocks
usually need not be passed as ids in the first place, and Objective-C is
hardly a safe language in other respects (at least historically, with
retain/release).

------
mihai_ionic
C++14 version: [https://dpaste.de/a8g9D/](https://dpaste.de/a8g9D/)

You obviously can't pass this to C code, but it's a portable and extremely
efficient solution if you are working purely within a C++ code-base.

~~~
panic
I don't think "curry" is quite the right name for that function. For a
function

    
    
        int f(int, int, int);
    

the curried version of f,

    
    
        auto f_curried = curry(f);
    

must be able to take multiple arguments one-at-a-time, so

    
    
        f_curried(x)(y)(z) == f(x, y, z)
    

The code at [https://dpaste.de/a8g9D/](https://dpaste.de/a8g9D/) just
partially applies the function to one set of arguments, which isn't enough --
you have to keep partially applying until you get them all.

~~~
mihai_ionic
I knew I should have just called it bind.

As a bonus, an actual implementation of curry:
[https://dpaste.de/qGWcq/](https://dpaste.de/qGWcq/)

------
mappu
I used the same approach:
[https://gist.github.com/anonymous/3154939](https://gist.github.com/anonymous/3154939)

------
huhtenberg
Sure, a good hack, upvoted.

Though, not to state the obvious: if you find yourself in need of a context
for an atexit() callback, then you are doing something wrong. Similarly, if a
library function accepts just a function pointer, then it's either a global
callback or a mistake in the library's design.

------
chacham15
Can someone explain to me why you would do this very hacky thing instead of:

    
    
        void *global_ptr;
        void callback(void) {
            realcallback(global_ptr);
        }
        int main() {
            global_ptr = whatever;
            atexit(callback);
            return 0;
        }

~~~
to3m
Let's assume that realcallback has no use other than to be a callback
function. Then, this suggestion is not much better than simply having
realcallback use global_ptr directly.

Now say you wanted to have one invocation of realcallback with one argument,
and another invocation with another. Then it makes sense for realcallback to
exist - but you would still need two callback functions (and two globals). And
if you needed three different invocations, you'd need three, and so on. This
is all easy enough to do, but not extensible in any data-driven fashion.
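The by-hand version just described can be sketched with a macro that stamps out one global and one thunk per context (names here are mine, purely illustrative); every new context value still costs a new compile-time definition, which is exactly the non-extensible limitation at issue.

```c
static const char *last_ctx;

static void realcallback(void *ctx) {
    last_ctx = (const char *)ctx;
}

/* One global and one thunk per distinct context value: the macro
 * stamps out a zero-argument function that atexit() can accept. */
#define DEFINE_THUNK(name)                       \
    static void *name##_ctx;                     \
    static void name(void) { realcallback(name##_ctx); }

DEFINE_THUNK(thunk_a)
DEFINE_THUNK(thunk_b)
```

Each thunk is then registered by setting its slot and calling `atexit(thunk_a)`, but the set of available thunks is frozen at compile time.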

The hack generates a separate thunk for each value, but at runtime, so you
have as many callbacks as you have context arguments, and without needing to
write them by hand ahead of time.

(Fixing the calling code is a much better solution, where available. You
virtually always need a context parameter, and when you don't, it's easy
enough to ignore. "Solving" the problem with this sort of ugly hack is just
making more work for yourself and for others.)

~~~
thedigitalengel
For multiple functions taking different arguments you could exploit atexit's
specified calling order

    
    
        data_t *global;
        
        int main() {
          atexit(callback);
          atexit(set_global_to_x);
          atexit(callback);
          atexit(set_global_to_y);
        }
    

Since functions are called in reverse order of their installation using
atexit, you end up with two calls to callback; one with global set to y and
one with global set to x.
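Filled out with minimal definitions (mine, for illustration), the trick looks like this; since atexit() handlers only fire at process exit, `install` merely registers them, and exit() later pops the stack in reverse: y is set, callback records it, then x is set and callback records that.

```c
#include <stdlib.h>

static const char *global;
static const char *seen[2];
static int n_seen;

static void callback(void)        { seen[n_seen++] = global; }
static void set_global_to_x(void) { global = "x"; }
static void set_global_to_y(void) { global = "y"; }

/* Handlers run LIFO at exit(): set_global_to_y, callback,
 * set_global_to_x, callback. */
static void install(void) {
    atexit(callback);
    atexit(set_global_to_x);
    atexit(callback);
    atexit(set_global_to_y);
}
```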

------
dschiptsov
I might be wrong, but the concept of partial application implies the existence
of lexical closures and an environment model of evaluation, so a partially
applied function becomes a closure over its argument. This is how currying is
useful. This article, then, describes some nice hack, but I can't see how it
is related to partial application as we know it.
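The closure-over-an-argument idea can be emulated explicitly in portable C (an illustrative sketch with my own names, not the article's technique): a "closure" is just a struct pairing the function with its captured argument, at the cost of the caller having to pass the struct around instead of a bare function pointer.

```c
/* A partially applied two-argument function, represented as an
 * explicit environment rather than patched machine code. */
typedef struct {
    int (*fn)(int, int);
    int captured;           /* the argument bound at creation time */
} partial_t;

static partial_t partial(int (*fn)(int, int), int first) {
    partial_t p = { fn, first };
    return p;
}

static int apply(partial_t p, int second) {
    return p.fn(p.captured, second);
}

static int add(int a, int b) { return a + b; }
```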

------
im3w1l
Good thing he patches the function constant before the parameter constant, or
there could be a denial-of-service attack if the function argument could be
chosen (to abad1dea) by an attacker.

