
mkj1213
Helper I

1 Trillion row table in Fabric and Power BI.

In the article querying-one-trillion-rows-of-data-with-powerbi-and-azure-databricks, Kyle Hale successfully tested combining Azure Databricks with Power BI.

We followed the same steps for a table with around 1 Trillion rows:

  • The table was created in a lakehouse by the following code:
from delta.tables import DeltaTable
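# Create a partitioned Delta table; DATE, HOUR, MINUTE and SECOND are
# generated columns derived from UPDATE_TIMESTAMP, and the table is
# physically partitioned by DATE and HOUR.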
DeltaTable.create(spark)\
    .tableName("FACT_TABLE")\
    .addColumn("ID", "STRING")\
    .addColumn("TIC", "STRING")\
    .addColumn("UPDATE_TIMESTAMP", "TIMESTAMP")\
    .addColumn("SRC", "STRING")\
    .addColumn("PRICE", "DOUBLE")\
    .addColumn("QUANTITY", "DOUBLE")\
    .addColumn("TYPE", "STRING")\
    .addColumn("DATE", "DATE", generatedAlwaysAs="CAST(UPDATE_TIMESTAMP AS DATE)")\
    .addColumn("HOUR", "INT", generatedAlwaysAs="HOUR(UPDATE_TIMESTAMP)")\
    .addColumn("MINUTE", "INT", generatedAlwaysAs="MINUTE(UPDATE_TIMESTAMP)")\
    .addColumn("SECOND", "INT", generatedAlwaysAs="SECOND(UPDATE_TIMESTAMP)")\
    .partitionedBy("DATE", "HOUR")\
    .execute()
  • Then the table was populated by writing a PySpark DataFrame from a notebook (a sketch follows this list).
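For reference, the load looked roughly like the sketch below (df_src here is a placeholder for our source DataFrame, not the actual load code; only the base columns are written, since Delta computes the generated DATE, HOUR, MINUTE and SECOND columns on write):

# Sketch with an assumed source DataFrame df_src holding the base columns;
# the generated partition columns are derived automatically on write.
df_src.select("ID", "TIC", "UPDATE_TIMESTAMP", "SRC", "PRICE", "QUANTITY", "TYPE") \
    .write.format("delta") \
    .mode("append") \
    .saveAsTable("FACT_TABLE")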

In Power BI we used DirectQuery with dynamic M parameters (Date, Hour and Minute) to limit the number of records fetched by a query. For one hour we have around 30 million rows, or around 0.5 million rows per minute, when counting with:

df = spark.sql("SELECT * FROM LH.FACT_TABLE WHERE DATE = '2023-06-10' AND HOUR = 11 AND MINUTE = 12")
df.count()

from a notebook. The result is returned quickly (in less than 4 minutes).
When we try to run the same SQL query from the SQL endpoint of the lakehouse, the result is never returned, even after waiting for several hours, as if the SQL query is not doing any partition pruning.
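On the Spark side, at least, pruning can be confirmed from the notebook by inspecting the physical plan (a quick sanity check only; it says nothing about what the SQL endpoint does):

# Sanity check: with pruning, the scan node's PartitionFilters should
# list DATE and HOUR; MINUTE is applied as an ordinary data filter.
df = spark.sql("SELECT * FROM LH.FACT_TABLE WHERE DATE = '2023-06-10' AND HOUR = 11 AND MINUTE = 12")
df.explain(mode="formatted")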

Since we chose to connect Power BI through the SQL endpoint, we are unable to report on that table.

 

Is anyone out there dealing with 1 trillion row tables in Power BI? If so, were you able to report on the table, or are you facing the same problem as me?

Thanks

 

2 REPLIES
v-stephen-msft
Community Support

Hi @mkj1213,

 

Thanks for reaching out to our community.

You could check partition pruning: please ensure that the SQL endpoint is correctly configured and supports it, and try explicitly filtering on the partition columns in your SQL query so that it can take advantage of the partitions.

Also consider using aggregate queries or precomputed views to reduce the amount of data being queried. For example, create daily or hourly aggregate tables and query these aggregates in Power BI.
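For example, something along these lines could be run from a notebook to materialize an hourly aggregate (the table and measure names here are just illustrative):

# Sketch: precompute an hourly aggregate as a Delta table and report on
# it in Power BI instead of the trillion-row fact table.
spark.sql("""
    CREATE OR REPLACE TABLE FACT_HOURLY AS
    SELECT DATE, HOUR, TIC,
           COUNT(*)      AS ROW_CNT,
           SUM(QUANTITY) AS TOTAL_QUANTITY,
           AVG(PRICE)    AS AVG_PRICE
    FROM FACT_TABLE
    GROUP BY DATE, HOUR, TIC
""")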

Hope it helps.

 

Best Regards,

Stephen Tao

 

If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.

 

mkj1213
Helper I

Thanks for your reply Stephen,

1- How am I supposed to check whether the SQL endpoint is correctly configured and supports partition pruning? We are using the SQL endpoint that Fabric provides when a lakehouse is created.
2- We had already tried querying using only the partition columns, and I am not getting any data at all.

3- We had already created daily and hourly aggregate tables.

Thanks in Advance
