KailashPareek
Helper I

Working with a large data set (say, 1 trillion rows) in a Fabric Lakehouse and the SQL Analytics endpoint

Hi There

 

We have a Delta table of about 1 trillion rows on an F8 capacity. The table is partitioned on DATE+HOUR and has only 10 columns. Everything works fine in Spark (both PySpark and Spark SQL). But when we run a simple query like select * from table where DATE=<date> and HOUR=<hour> in the SQL Analytics endpoint, it runs forever and returns nothing even after a couple of hours.
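For reference, here is a minimal PySpark sketch of the setup described above, i.e. a Delta table partitioned on DATE + HOUR and the kind of filtered query involved. Table, column, path names, and values are hypothetical, not the actual ones from this workspace:

```python
# Minimal sketch: a Lakehouse Delta table partitioned by DATE and HOUR, written from Spark
# and then queried with a partition filter. All names and values here are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Write the table partitioned on DATE + HOUR (illustrative only).
df = spark.read.parquet("Files/raw/events")   # hypothetical source location
(df.write
   .format("delta")
   .partitionBy("DATE", "HOUR")
   .mode("overwrite")
   .saveAsTable("events"))                    # hypothetical table name

# The kind of query that runs fine in Spark but hangs in the SQL Analytics endpoint.
spark.sql("""
    SELECT *
    FROM events
    WHERE DATE = '2024-01-01' AND HOUR = 13
""").show(10)
```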

 

We expect it to read only the Parquet files for that date+hour partition (2-3 files, <500 MB, roughly 30 million rows). Instead it seems to be trying to read everything rather than using the partition information. MS Support has not been helpful so far.
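As a sanity check on the Spark side, the physical plan can confirm whether the predicate is eligible for partition pruning. This is only a sketch with hypothetical names, and it shows what Spark does with the filter, not how the SQL Analytics endpoint plans the same query:

```python
# Sketch: inspect the Spark physical plan for the filtered query and look for the DATE/HOUR
# predicates in the scan node (typically listed under PartitionFilters). Hypothetical names.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

q = spark.sql("SELECT * FROM events WHERE DATE = '2024-01-01' AND HOUR = 13")
q.explain(mode="formatted")   # check that only the DATE/HOUR partition is selected
```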

 

The question is: is anyone in the forum working with data at this scale, and what exactly has their experience been with partitioned tables and the SQL Analytics endpoint at such volumes?

 

We see the same behaviour in Power BI, where we are able to connect only via DirectQuery, not Direct Lake, due to the size limitations.

 

Kailash Pareek 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @KailashPareek,

Have you tried upgrading the SKU of the capacity? AFAIK, each level has its own configuration, resources, and limitations.

Since you are working with a Lakehouse semantic model and the SQL endpoint on a very large number of records, I think a higher-level SKU should provide more resources for processing them.

What is Power BI Premium? - Power BI | Microsoft Learn

| SKU | Max memory (GB) | Max concurrent DirectQuery connections (per semantic model) | Max DirectQuery parallelism | Live connection (per second) | Max memory per query (GB) | Model refresh parallelism | Direct Lake rows per table (in millions) | Max Direct Lake model size on OneLake (GB) |
|---|---|---|---|---|---|---|---|---|
| F2 | 3 | 5 | 1 | 2 | 1 | 1 | 300 | 10 |
| F4 | 3 | 5 | 1 | 2 | 1 | 2 | 300 | 10 |
| F8 | 3 | 10 | 1 | 3.75 | 1 | 5 | 300 | 10 |
| F16 | 5 | 10 | 1 | 7.5 | 2 | 10 | 300 | 20 |
| F32 | 10 | 10 | 1 | 15 | 5 | 20 | 300 | 40 |
| F64 | 25 | 50 | 8 | 30 | 10 | 40 | 1,500 | Unlimited |
| F128 | 50 | 75 | 12 | 60 | 10 | 80 | 3,000 | Unlimited |
| F256 | 100 | 100 | 16 | 120 | 10 | 160 | 6,000 | Unlimited |
| F512 | 200 | 200 | 20 | 240 | 20 | 320 | 12,000 | Unlimited |
| F1024 | 400 | 200 | 24 | 480 | 40 | 640 | 24,000 | Unlimited |
| F2048 | 400 | 200 |  | 960 | 40 | 1,280 | 24,000 | Unlimited |

Regards,

Xiaoxin Sheng


2 REPLIES
AndyDDC
Super User

Hi @KailashPareek, there's a performance guidelines document; hopefully you'll find some insight there:

 

SQL analytics endpoint performance considerations - Microsoft Fabric | Microsoft Learn
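One piece of guidance that commonly applies at this scale is keeping the number of Parquet files per partition low, since the endpoint discovers files through the table's Delta metadata. Below is a minimal sketch of compacting a single partition with Delta's OPTIMIZE command from a Fabric Spark notebook; the table name and partition value are hypothetical:

```python
# Sketch: compact small Parquet files in one partition of the Delta table.
# "events" is a hypothetical table name; on a table this large you would normally
# restrict OPTIMIZE to recent partitions with a WHERE clause rather than run it table-wide.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("OPTIMIZE events WHERE DATE = '2024-01-01'")

# Fabric's Spark runtime also supports applying V-Order during compaction; check the
# current Fabric docs for the exact syntax on your runtime (assumption):
# spark.sql("OPTIMIZE events VORDER")
```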

