I currently use Power BI Desktop to connect to Apache Hive through a Hortonworks Hive ODBC connection. I pass SQL-like statements to Hive from Power BI so that they are processed on the server, and Hive returns the results.
My issue is that returning the data to Power BI is extremely slow. For instance, it takes up to an hour to return a "table" of about 151M records in Power BI. When I query Hive from a database management tool such as DBeaver, I can get around this by running the queries through the Tez engine, with the statement below:
set hive.execution.engine=tez;
More on the Tez engine here.
Running these statements through the Tez engine takes about 1/100th of the time. (BTW: DBeaver connects to Hive through JDBC drivers, which Power BI does not yet appear to support.)
Is there a way to force Power BI to run queries through the Tez engine?
It can be done in the DSN config. Open your ODBC configuration, go to Advanced, then Server Side Properties, click Add, and enter hive.execution.engine as the key and tez as the value. Then click OK on each dialog to save. Worked for me: it took an hour-long query down to 12 minutes. Not perfect, but way more palatable.
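For anyone who would rather not click through the DSN dialog, the same setting can probably be expressed in a DSN-less connection string. Below is a rough Python sketch that only builds such a string. It assumes the driver follows the Simba-style convention of an SSP_ prefix for server-side properties; the driver name, the Host and Port keys, the host hive.example.com, and the helper build_hive_connection_string are all placeholders for illustration, so check them against your own driver's documentation.

```python
def build_hive_connection_string(host, port=10000,
                                 driver="Hortonworks Hive ODBC Driver",
                                 server_side_properties=None):
    """Assemble a DSN-less ODBC connection string (hypothetical helper).

    Assumption: the driver reads server-side properties from keys
    prefixed with "SSP_". Driver name and key names are placeholders.
    """
    parts = [
        ("Driver", "{" + driver + "}"),
        ("Host", host),
        ("Port", str(port)),
    ]
    # Each server-side property becomes an SSP_-prefixed key/value pair.
    for key, value in (server_side_properties or {}).items():
        parts.append(("SSP_" + key, value))
    return ";".join(f"{k}={v}" for k, v in parts)


# Same key and value as entered in the Server Side Properties dialog.
conn_str = build_hive_connection_string(
    "hive.example.com",  # placeholder host
    server_side_properties={"hive.execution.engine": "tez"},
)
print(conn_str)
```

If your driver accepts it, a string like this could go into the connection string field under the advanced options of Power BI's ODBC connector, instead of editing the DSN itself.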
Hi,
This is not a solution to your question, but I am trying to connect to Hive on Tez using DBeaver. Since you were able to connect, could you please help?
Thanks in advance!
Manas