Btw, plain Elias-Fano (EF) is quite efficient to decode, performance-wise, even on GPUs. I wanted to do partitioned Elias-Fano (PEF) too, but that seemed a bit more involved and I didn't have the time. Here's a GPU implementation for graph problems if anyone is interested: https://github.com/pgera/efg. I also used folly on the encoding side, and it works great.
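
For anyone who hasn't seen plain EF before, here's a minimal CPU-side sketch in Python of the textbook encode/decode. It is not how efg or folly implement it: the names `ef_encode` and `ef_decode` are made up for illustration, and the upper bits live in a Python int rather than a packed bit array with a select structure.

```python
from math import floor, log2

def ef_encode(values, universe):
    """Encode a non-decreasing list of ints in [0, universe)."""
    n = len(values)
    # Low bits per element: floor(log2(universe / n)), clamped at 0.
    low_bits = max(0, floor(log2(universe / n))) if n else 0
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]
    # Upper part: element i sets bit (v >> low_bits) + i, so positions
    # are strictly increasing even when values repeat.
    high = 0
    for i, v in enumerate(values):
        high |= 1 << ((v >> low_bits) + i)
    return low_bits, lows, high

def ef_decode(low_bits, lows, high):
    """Sequentially decode everything produced by ef_encode."""
    out = []
    pos = 0  # current bit position in the upper part
    i = 0    # index of the next element to emit
    while high:
        if high & 1:
            # The i-th set bit sits at position (high part) + i.
            out.append(((pos - i) << low_bits) | lows[i])
            i += 1
        high >>= 1
        pos += 1
    return out

values = [2, 3, 5, 7, 11, 13, 24]
encoded = ef_encode(values, universe=25)
decoded = ef_decode(*encoded)
```

Decoding is just a scan over set bits plus a fixed-width lookup in `lows`, which is roughly why it stays cheap.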