- Standard ANSI SQL syntax, including all the basic features you'd expect from a SQL engine (aggregations, joins, etc.) and more advanced features like analytic window functions, common table expressions (WITH), and approximate distinct counts and percentiles.
- It's extensible. The open source code base includes a connector for Hive, but we also have some custom connectors for internal data stores at Facebook. We're working on a connector for HBase, too.
- In comparison to Hive, it's very fast and efficient. For our workloads it's at least 10x more CPU-efficient. Airbnb is using it and has had a similar experience.
- Most importantly, Presto has been battle-tested. It's been in production at Facebook since January, and it's used by 1,000 employees who run 30,000 queries every day. We've hit every edge case you can imagine.
In general, you should be able to write your query in the simplest and most readable way, and Presto should execute it efficiently. We already have the start of an advanced optimizer that supports equality inference, full predicate move-around, etc. This means that you don't need to write redundant predicates everywhere as is required with some query engines.
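To make the equality inference idea concrete, here is a toy sketch in Python. It is not Presto's optimizer code, just an illustration of the general technique: columns linked by join equalities are grouped into equivalence classes, and a constant filter on one column is propagated to the others. The function name `infer_predicates` and the column names are made up for this example, and contradictory constants are not handled.

```python
def infer_predicates(equalities, constant_predicates):
    """Derive extra `column = constant` predicates implied by column equalities.

    equalities: list of (column, column) pairs, e.g. from join conditions.
    constant_predicates: dict mapping a column to the constant it must equal.
    Returns a dict of newly implied predicates (toy model, not Presto code).
    """
    parent = {}

    def find(col):
        # Union-find lookup with path halving.
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]
            col = parent[col]
        return col

    # Group columns that are known to be equal into equivalence classes.
    for left, right in equalities:
        parent[find(left)] = find(right)

    # Record the constant attached to each equivalence class.
    class_value = {}
    for col, value in constant_predicates.items():
        class_value[find(col)] = value

    # Every other column in such a class inherits the constant.
    inferred = {}
    for col in list(parent):
        root = find(col)
        if root in class_value and col not in constant_predicates:
            inferred[col] = class_value[root]
    return inferred


# Hypothetical query: orders JOIN customers ON orders.customer_id = customers.id
# WHERE customers.id = 42
extra = infer_predicates(
    equalities=[("orders.customer_id", "customers.id")],
    constant_predicates={"customers.id": 42},
)
print(extra)
```

In this example the filter on `customers.id` also applies to `orders.customer_id`, so an engine that does this kind of inference can filter the `orders` side early without you writing the second predicate by hand.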
Also, if you are familiar with PostgreSQL, you should feel right at home using Presto. When making decisions about things not covered by ANSI SQL, the first thing we look at is "what does PostgreSQL do?"
That said, earlier this year, during a hackathon, we built a prototype connector that could split a query and push down the relevant parts to a distributed database that supports simple aggregations. It would take more work to clean this up and integrate it, so if a lot of people are interested in this we can prioritize that.
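For readers unfamiliar with the idea of pushing part of an aggregation down to the data store, here is a rough sketch in Python. It does not reflect how the prototype connector actually worked; it only shows the general pattern, assuming each shard can compute a simple per-group SUM and COUNT and the query engine merges the partial results. All names (`shard_partial_aggregate`, `merge_partials`) and the sample data are hypothetical.

```python
def shard_partial_aggregate(rows):
    """Pushed-down part: runs inside one shard, returns per-group (sum, count)."""
    partial = {}
    for key, value in rows:
        total, count = partial.get(key, (0, 0))
        partial[key] = (total + value, count + 1)
    return partial


def merge_partials(partials):
    """Engine-side part: combines partial results from all shards."""
    merged = {}
    for partial in partials:
        for key, (total, count) in partial.items():
            t, c = merged.get(key, (0, 0))
            merged[key] = (t + total, c + count)
    # AVG cannot be merged directly, so it is derived from SUM and COUNT.
    return {
        key: {"sum": t, "count": c, "avg": t / c}
        for key, (t, c) in merged.items()
    }


# Two hypothetical shards holding (region, amount) rows.
shards = [
    [("us", 10), ("eu", 4)],
    [("us", 30)],
]
result = merge_partials(shard_partial_aggregate(rows) for rows in shards)
print(result)
```

The point of the split is that only one small row per group leaves each shard, instead of every raw row being shipped to the query engine.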