In the world of big data and analytics, query speed makes a huge difference.
But speed isn’t the whole picture. Industry experts are now rightly expanding their focus to consider the “quality and completeness” of SQL on Hadoop implementations.
“Quality” means looking at speed along with overall query effectiveness, which has a major impact on a company’s ability to maximize its benefits from big data analytics. That, in turn, can directly affect profit, scale, complexity, developer productivity, and operational costs. Poor technical decisions in this area ultimately waste time and money.
One macro-level observation was clear from these new tests: hands down, HAWQ beats Apache Hive™ and Impala on query speed and effectiveness. This short bullet list highlights the outcomes of the technical details explained further below:
- HAWQ runs the set of TPC-DS queries in roughly half the wall-clock time that Apache Hive™ and Impala take.
- HAWQ provides a 344% speed improvement against Apache Hive™ and 454% against Impala, measured by comparing the geometric mean across the TPC-DS queries.
- Importantly, Apache Hive™ and Impala do not support all 99 of the standard TPC-DS queries. These queries represent the minimum market requirements, and the two engines fail on roughly one third of them, while HAWQ runs 100% of them natively.
- Apache Hive™ and Impala also lack key performance-related features, making work harder and approaches less flexible for data scientists and analysts.
- SQL-related workarounds were not necessary with HAWQ at all, so test and development iterations were much quicker and ran into fewer issues.
- HAWQ also provides more robust query partitioning and performs significantly better on certain classes of workloads—BI roll-ups, predictive analytics, and machine learning—the most critical use cases for enterprise reporting and big data analytics.
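For readers unfamiliar with the geometric-mean comparison mentioned in the list above, here is a minimal sketch of how such a figure can be computed. The function names and the sample runtimes are hypothetical and are not taken from the benchmark; the exact method used to derive the percentages quoted above is not stated here, so treat this only as an illustration of the general technique. It also assumes the comparison is restricted to queries that both engines complete.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))

def speedup(baseline_times, candidate_times):
    """Ratio of geometric means; a result above 1 means the candidate is faster."""
    return geometric_mean(baseline_times) / geometric_mean(candidate_times)

# Hypothetical per-query runtimes in seconds (illustrative only, NOT benchmark data).
engine_a = [2.0, 8.0, 4.0]   # geometric mean is about 4 seconds
engine_b = [1.0, 2.0, 4.0]   # geometric mean is about 2 seconds

ratio = speedup(engine_a, engine_b)
print(f"engine_b is roughly {ratio:.2f}x faster by geometric mean")
```

The geometric mean is a common choice for this kind of summary because a single very long query cannot dominate the result the way it would with an arithmetic mean of runtimes.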
Editor’s Note: ©2015 Pivotal Software, Inc. All rights reserved. Pivotal HAWQ is a trademark and/or registered trademark of Pivotal Software, Inc. in the United States and/or other countries. Apache, Apache Hive, and Apache Tez are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.