Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
dave4667
New Member

Slow Query on Gateway but Fast on Database

Hello,

 

Our company has a server dedicated for Power BI On-Premises Data Gateway with 20 vCPU and 64 GB RAM. We have a little problem where Power BI Gateway performs refresh queries very slow especially to Hive tables. There's this one semantic model that queries to the Hive tables, the problem is, when I click refresh on the semantic model, it can take up to 15 minutes with a query limit of 50 rows. But when we check from the Hive side, the query only takes 3 seconds. From the Gateway perspective, the CPU usage never goes above 15% and memory usage rarely goes above 30%. So where is the bottleneck and how do I troubleshoot this? The problem only occurs to Hive. We're not getting these issues with Oracle, PostgreSQL, and MySQL databases.

 

Thank You

1 ACCEPTED SOLUTION
Zanqueta
Solution Sage
Solution Sage

Hi @dave4667,

 

This behaviour is relatively common when working with Hive via the On-Premises Data Gateway, and it is usually not related to CPU or memory usage but rather to network latency, data serialisation, and lack of efficient query folding. Let us analyse the main points:

Why is the query fast in Hive but slow through the Gateway?

  1. The Gateway does not execute the query directly: It acts as a proxy, receiving data from Hive and sending it to the Power BI service.
    Even if the query returns only 50 rows, Hive may send additional metadata or process via ODBC/JDBC, which introduces overhead.
  2. Hive ODBC/JDBC driver performance: Some drivers are not optimised for small operations and can have high initialisation latency.
  3. Partial or no query folding: If Power Query applies transformations that are not fully folded to Hive, the Gateway may process part of the logic locally.
  4. Network between Gateway and Hive: Even with sufficient CPU and RAM, bandwidth or latency can be the bottleneck.

How to diagnose and improve

  1. Check the connection mode:
    • Are you using Import or DirectQuery?
    • In Import mode, the Gateway must load the data into the Power BI service, which can be slow if compression or conversion is involved.
  2. Enable Query Diagnostics in Power BI Desktop:
  3. Test with a simple query in Power Query:
    • For example, SELECT * LIMIT 50 without transformations. If it remains slow, the issue is likely with the driver or network.
  4. Update the Hive ODBC/JDBC driver:
    • Older drivers often have performance issues.
  5. Adjust Gateway settings for timeouts and concurrent connections:
    • In Gateway Settings, increase the maximum number of concurrent connections for Hive.
  6. Check compression and serialisation:
    • Hive may send data in Avro/Parquet format, but the Gateway converts it internally, which can be slow.

Best practices

  • Where possible, create optimised views in Hive to reduce transformations in Power Query.
  • Consider using DirectQuery to avoid large data transfers, but test whether performance improves.
  • Install the Gateway as close as possible to the Hive cluster to minimise network latency.

 

If this response was helpful in any way, I’d gladly accept a 👍much like the joy of seeing a DAX measure work first time without needing another FILTER.

Please mark it as the correct solution. It helps other community members find their way faster (and saves them from another endless loop 🌀.

View solution in original post

4 REPLIES 4
v-priyankata
Community Support
Community Support

Hi @dave4667 

Thank you for reaching out to the Microsoft Fabric Forum Community.

@R1k91 @Zanqueta Thanks for your inputs

I hope the information provided by users was helpful. If you still have questions, please don't hesitate to reach out to the community.

 

Hi @dave4667 

Hope everything’s going smoothly on your end. I wanted to check if the issue got sorted. if you have any other issues please reach community.

R1k91
Super User
Super User

what kind of power query has been implemented in the semantic model? 

I mean, is it zero/low transformation power query and all the transformations happen on Hive side or there're many transformations?
in the second case have you checked if transformations are folding to the source?

if it's not folding, transformations happen on the gateway side and they can be slow.


--
Riccardo Perico
BI Architect @ Lucient Italia | Microsoft MVP

Blog | GitHub

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Zanqueta
Solution Sage
Solution Sage

Hi @dave4667,

 

This behaviour is relatively common when working with Hive via the On-Premises Data Gateway, and it is usually not related to CPU or memory usage but rather to network latency, data serialisation, and lack of efficient query folding. Let us analyse the main points:

Why is the query fast in Hive but slow through the Gateway?

  1. The Gateway does not execute the query directly: It acts as a proxy, receiving data from Hive and sending it to the Power BI service.
    Even if the query returns only 50 rows, Hive may send additional metadata or process via ODBC/JDBC, which introduces overhead.
  2. Hive ODBC/JDBC driver performance: Some drivers are not optimised for small operations and can have high initialisation latency.
  3. Partial or no query folding: If Power Query applies transformations that are not fully folded to Hive, the Gateway may process part of the logic locally.
  4. Network between Gateway and Hive: Even with sufficient CPU and RAM, bandwidth or latency can be the bottleneck.

How to diagnose and improve

  1. Check the connection mode:
    • Are you using Import or DirectQuery?
    • In Import mode, the Gateway must load the data into the Power BI service, which can be slow if compression or conversion is involved.
  2. Enable Query Diagnostics in Power BI Desktop:
  3. Test with a simple query in Power Query:
    • For example, SELECT * LIMIT 50 without transformations. If it remains slow, the issue is likely with the driver or network.
  4. Update the Hive ODBC/JDBC driver:
    • Older drivers often have performance issues.
  5. Adjust Gateway settings for timeouts and concurrent connections:
    • In Gateway Settings, increase the maximum number of concurrent connections for Hive.
  6. Check compression and serialisation:
    • Hive may send data in Avro/Parquet format, but the Gateway converts it internally, which can be slow.

Best practices

  • Where possible, create optimised views in Hive to reduce transformations in Power Query.
  • Consider using DirectQuery to avoid large data transfers, but test whether performance improves.
  • Install the Gateway as close as possible to the Hive cluster to minimise network latency.

 

If this response was helpful in any way, I’d gladly accept a 👍much like the joy of seeing a DAX measure work first time without needing another FILTER.

Please mark it as the correct solution. It helps other community members find their way faster (and saves them from another endless loop 🌀.

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.