Using a union still has the going-through-memory problem, though. A couple of years back I spent a day or two trying to coax clang into doing the efficient thing with a union, when I could have gotten the job done in 30 minutes with raw assembly (alas...).
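For anyone who hasn't seen the pattern, here is a minimal sketch of what union punning looks like. The comment above doesn't say what was being punned, so the float-to-bits case, the `float_bits` type, and both function names are assumptions for illustration only. As written, the source describes a store followed by a load; whether the compiler collapses that into a single register move depends on the target and optimization level.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 value on this target. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical type: lets the same 4 bytes be read as float or integer. */
typedef union {
    float    f;
    uint32_t u;
} float_bits;

/* Reinterpret through the union: store as float, load as integer. */
uint32_t bits_via_union(float x) {
    float_bits b;
    b.f = x;
    return b.u;
}

/* The same reinterpretation written with memcpy, for comparison. */
uint32_t bits_via_memcpy(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}
```

Both versions express the same thing at the source level, so the only way to know what you actually got is to look at the generated code.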
RE: Return values, you'd be surprised. You can't assume you'll get properly optimized code for something like that in WebAssembly, for example, despite the fact that you're using an industry-standard compiler (clang) and runtime (V8 or SpiderMonkey).
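To make that concrete, here is a sketch of the kind of function in question, assuming "return values" means returning a small struct by value. The `divmod_result` type and `divmod_u32` function are made up for illustration. As far as I know, clang's default C ABI for wasm32 hands a two-field struct like this back through a hidden pointer into linear memory rather than as two wasm values, unless multi-value returns are enabled, so the caller ends up reloading both fields from memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-field result type, small enough to fit in two registers. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} divmod_result;

/* Returns quotient and remainder together, by value. */
divmod_result divmod_u32(uint32_t a, uint32_t b) {
    assert(b != 0);  /* precondition: division by zero is undefined */
    divmod_result r;
    r.quot = a / b;
    r.rem  = a % b;
    return r;
}
```

On a native target you would typically expect both fields to come back in registers; on wasm it is worth disassembling the output to check before relying on it.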