Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
KailashPareek
Helper I
Helper I

Serious performance issues and capacity consumption on lakehouse SQL endpoint

We have F8 capacity. There is table with 850 billion rows (about 10 small columns). It is price data and partitioned on DATE/HOUR. On an average, there are about 30 million rows for a DATE+HOUR. A SQL like

select * from prices where date='2024-08-06' and HOUR=10 and MINUTE=10 should return <1 million rows. But it keeps going and does not return any value even in 10 minutes. We expect it to do partition pruning and read only 25 million rows and return data for one minute.

 

If capacity is paused and restarted, the same SQL returns data in <1 Min but consumes almost all capacity and throttles any other operations. Question is why reading only one partition of 25 million rows should consume all F8 capacity.

 

Anyone used SQL end point for such data? It appears that partition pruning is not working and something is wrong in SQL endpoint?

6 REPLIES 6
KailashPareek
Helper I
Helper I

The issue is somewhat more complex and a ticket is raised with MS Support and demonstrated. The SQL with DATE/HOUR (Parition keys) in WHERE for one hour takes 30-40 secs in Spark SQL but does not complete in SQL Endpoint/Power BI or SSMS. It is almost clear that for whatever reasons, SQL endpoint is not using partition pruning as expected.

Anonymous
Not applicable

HI @KailashPareek,

Have you got any responses from MS team? Perhaps you can share them here to help other users who faced the similar issues.

Regards,

Xiaoxin Sheng

THuss
Frequent Visitor

Are you querying the SQL endpoint from the browser frontend? If yes, have you tried it with a tool like SSMS as well?

Anonymous
Not applicable

Hi @KailashPareek,

It seems like you work with huge amount of records, have you tried to directly execute these query on the original table?

For the query executed on the partitioned table but not get the expected result as your wish, I suppose they may include some of other processings. (they may spend resources to initialize the table data and schema to optimization on following operations)

Also capacity throttling mechanism will also optimizing the resource spend to make them smoothly so that one executor highly frequency and longtime occupancy too many resources.

For pause and resume the fabric capacity, I'd like to suggest you take a look the following documents if they help for your scenarios:

Pause and resume your capacity - Microsoft Fabric | Microsoft Learn

Understand your Fabric capacity throttling - Microsoft Fabric | Microsoft Learn

Regards,

Xiaoxin Sheng

yes, it is big table. The query is select * from <table> where DATE='2021-06-04' and HOUR=10. The partition key is DATE/HOUR. The data for an hour is about 30 million. The same query takes some 20-30 seconds in Spark SQL and does not work in any reasoanable time in SQL Endpoint.

 

SparkSQL is using partition pruning. Is thete any way to check if SQL Endpoint is also doing partition pruning?

Anonymous
Not applicable

Hi @KailashPareek,

Perhaps you can take a look at the following document to monitor and trace the query processing:

Monitoring in Fabric Data Warehouse overview - Microsoft Fabric | Microsoft Learn

Regards,

Xiaoxin Sheng

Helpful resources

Announcements
Fabric July 2025 Monthly Update Carousel

Fabric Monthly Update - July 2025

Check out the July 2025 Fabric update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.