Basically, it's a Linux variant that launches a separate kernel on each ISA island (think: a set of cores with the same ISA). The kernels are linked together with a custom messaging layer and a page coherency protocol to create a single system image for applications. On top of that, it provides a compiler and a set of runtime tools so developers can write code like they would for a traditional single-ISA SMP machine while still taking advantage of the different ISAs in the system.
The short story is that there are performance and power benefits to be had, but only if you can support quick and efficient migration between architectures. I'm not at liberty to say right now exactly how that works (we're still publishing the work), but suffice it to say: we've hooked up an x86-64 machine and an aarch64 machine, made them look like a single system, and migrated applications between the two. It's pretty cool to watch processes move back and forth between architectures :-)
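For anyone wondering what a "page coherency protocol" between kernels even means, here's a toy model. To be clear, this is not how the real system does it: `PageDirectory`, `fault`, and the message counting are all made up for illustration, and the "kernels" are just integer ids. It only sketches the generic single-writer/multiple-reader idea, where a write fault invalidates every other copy of the page and a read fault downgrades the current writer.

```cpp
#include <cstdint>
#include <map>
#include <set>

// Hypothetical toy model, not the real protocol.
enum class Access { Read, Write };

struct PageState {
    int owner = -1;         // kernel holding the writable copy, -1 if none
    std::set<int> readers;  // kernels holding read-only copies
};

class PageDirectory {
public:
    // Handles a page fault from `kernel` and returns how many messages
    // (fetches, downgrades, invalidations) a real system would have to send.
    int fault(int kernel, std::uint64_t page, Access access) {
        PageState &st = pages_[page];
        if (access == Access::Read) {
            if (st.owner == kernel || st.readers.count(kernel) > 0) {
                return 0;  // already has a valid copy
            }
            int messages = 0;
            if (st.owner != -1) {
                // Downgrade the current writer to a reader and fetch its copy.
                st.readers.insert(st.owner);
                st.owner = -1;
                messages = 1;
            } else if (!st.readers.empty()) {
                messages = 1;  // fetch a copy from an existing reader
            }
            st.readers.insert(kernel);
            return messages;
        }
        // Write access: this kernel must end up as the only holder.
        if (st.owner == kernel) {
            return 0;
        }
        int messages = (st.owner != -1) ? 1 : 0;
        for (int r : st.readers) {
            if (r != kernel) {
                ++messages;  // one invalidation per remote read-only copy
            }
        }
        st.readers.clear();
        st.owner = kernel;
        return messages;
    }

private:
    std::map<std::uint64_t, PageState> pages_;
};
```

The point of counting messages is that every one of them crosses the inter-kernel link, so pages that bounce between a writer on one island and readers on another get expensive quickly.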
I spent a semester working with a Xilinx SoC, and the experience was enlightening. My computer engineering friends were very comfortable with gate description diagrams, debugging with input/output wires, and waiting literal hours between test cycles. I was the only software engineer in the room, and all I could do was ask myself how anyone could be OK with this awful tooling situation. It really befuddled me. I was especially frustrated with the high-level synthesis tools, which take C++ and convert it into a functioning hardware description (alleviating the need to rewrite business logic in VHDL or Verilog). They would take well-formed C++ code with a simple API and give a pretty good hardware description (sometimes with better perf than a handwritten equivalent, with a little optimizing), but fail to generate a corresponding API for it on the associated CPU for anything beyond simple register access (despite starting with what was likely the desired software API)! IMO, FPGA tooling could use a lot of TLC, but maybe I just had a bad experience.
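To make the complaint concrete, here's a rough sketch of the gap. Everything in it is hypothetical: the `dot` kernel, the `DotRegs` register layout, and `dot_on_fpga` are invented for illustration and don't match any particular vendor's generated output. The "hardware" is faked in software so the snippet runs on an ordinary machine.

```cpp
#include <array>
#include <cstdint>

constexpr int kN = 4;  // fixed size: HLS tools generally want static bounds

// The "business logic" as you might hand it to an HLS tool: plain C++,
// no heap, no recursion. A vendor pipelining pragma would usually go
// inside the loop; it is omitted here to keep the code portable.
std::int32_t dot(const std::int32_t a[kN], const std::int32_t b[kN]) {
    std::int32_t acc = 0;
    for (int i = 0; i < kN; ++i) {
        acc += a[i] * b[i];
    }
    return acc;
}

// Hypothetical register map for the synthesized block (made up).
struct DotRegs {
    std::int32_t a[kN];
    std::int32_t b[kN];
    std::int32_t start;   // write 1 to kick off a computation
    std::int32_t result;  // valid once start has been cleared
};

// Software stand-in for the FPGA fabric reacting to the start bit.
void fake_hardware_step(DotRegs &regs) {
    if (regs.start != 0) {
        regs.result = dot(regs.a, regs.b);
        regs.start = 0;
    }
}

// The kind of CPU-side wrapper one would hope to get for free:
// same shape as dot(), with the register poking hidden inside.
std::int32_t dot_on_fpga(DotRegs &regs,
                         const std::array<std::int32_t, kN> &a,
                         const std::array<std::int32_t, kN> &b) {
    for (int i = 0; i < kN; ++i) {
        regs.a[i] = a[i];
        regs.b[i] = b[i];
    }
    regs.start = 1;
    fake_hardware_step(regs);  // real code would poll a done/idle bit instead
    return regs.result;
}
```

On a real SoC, `DotRegs` would sit behind a memory-mapped region and the caller would have to deal with cache maintenance and polling or interrupts, which is exactly the glue that ends up written by hand.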
Some vendors even provide an OpenCL API/SDK with which you can express your algos at a higher level than VHDL.
FPGAs are awesome :)