Makes most sense to compare Impala and Spark architecturally. Ibis will eventual... | Hacker News


wesm on July 20, 2015 | on: Ibis: Scaling the Python Data Experience

Makes most sense to compare Impala and Spark architecturally. Ibis will eventually integrate with Spark. We've been focusing on Impala integration for reasons cited here: http://blog.cloudera.com/blog/2015/07/getting-started-with-i...

In particular, we're working on byte-level shared-memory integration with Impala (which is implemented in C++ with LLVM runtime codegen — the project's tech lead, Marcel Kornacker, was the tech lead for Google F1's query engine) to run user-defined logic without data serialization / memory usage overhead. This also opens up Python's HPC / scientific computing stack and existing data libraries to be run in a Hadoop setting without Python-JVM interoperability issues.
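To make the zero-copy idea above concrete, here is a minimal sketch using only Python's standard library. It is not Ibis or Impala code and says nothing about how that integration is actually built: the float64 column layout, `apply_udf_in_place`, and `demo` are hypothetical names invented for illustration, and both sides run in one process for simplicity. It only shows user-defined logic reading and writing a shared buffer directly, with no serialization step in between.

```python
from multiprocessing import shared_memory

FLOAT64_SIZE = 8  # bytes per value in this hypothetical column layout


def apply_udf_in_place(segment_name, n_values, udf):
    """Attach to an existing segment and apply udf to each float64 in place."""
    shm = shared_memory.SharedMemory(name=segment_name)
    try:
        # cast() reinterprets the raw bytes as doubles; no copy is made.
        with shm.buf[: n_values * FLOAT64_SIZE].cast("d") as column:
            for i in range(n_values):
                column[i] = udf(column[i])
    finally:
        shm.close()  # detach only; the producer still owns the segment


def demo():
    values = [1.0, 2.0, 3.0]
    nbytes = len(values) * FLOAT64_SIZE
    # Stand-in for the query engine: allocate a segment and fill one column.
    producer = shared_memory.SharedMemory(create=True, size=nbytes)
    try:
        with producer.buf[:nbytes].cast("d") as column:
            for i, value in enumerate(values):
                column[i] = value

        # Stand-in for the Python side: run user logic on the same bytes.
        apply_udf_in_place(producer.name, len(values), lambda x: x * x)

        # The producer sees the updated values without any data transfer.
        with producer.buf[:nbytes].cast("d") as column:
            return column.tolist()
    finally:
        producer.close()
        producer.unlink()  # free the segment once everyone is done


if __name__ == "__main__":
    print(demo())
```

The views are released before `close()` because a shared-memory segment cannot be closed while a `memoryview` into it is still alive.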

infinite8s on July 20, 2015

Are you planning on leveraging Numba, or will this be a new way to generate LLVM IR from Python?

Lofkin on July 20, 2015

I was wondering this also

perone on July 20, 2015

Now I get it. Thanks for the explanation, Wes; it sounds very interesting indeed. Congratulations on the project.
