I have a dimensional model in a star schema with 5 dimension tables and 2 large fact tables holding about 100 million rows.
Unfortunately, I cannot aggregate these facts, because the required metrics use DISTINCTCOUNT and must be displayed from different perspectives according to the dimension tables selected by the end users.
Since DISTINCTCOUNT measures are non-additive and cannot simply be summed, the only solution was to load all 100 million rows into Power BI using import mode.
This led to relatively poor performance, and I'd like to explore possible improvements for large fact tables like these.
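For context, a minimal T-SQL sketch of why a distinct count cannot be rolled up (FactSales and its columns are hypothetical stand-ins for my model):

```sql
-- Distinct customers per day.
SELECT OrderDate, COUNT(DISTINCT CustomerKey) AS DailyDistinctCustomers
FROM dbo.FactSales
GROUP BY OrderDate;

-- Distinct customers for a whole month: a customer active on several days
-- is counted once, so this is NOT the sum of the daily values above. The
-- measure is non-additive and cannot be pre-summed from daily aggregates.
SELECT COUNT(DISTINCT CustomerKey) AS MonthlyDistinctCustomers
FROM dbo.FactSales
WHERE OrderDate >= '2023-01-01' AND OrderDate < '2023-02-01';
```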
Some things I have already done:
- Designed the model as a dimensional star schema
- Created surrogate keys (SKs) as BIGINT in all tables
- Eliminated unnecessary columns
- Assigned appropriate data types to all columns
For each day of the month, I have approximately 1 million records.
Based on these needs, are there more improvements I could apply?
Hi @afonsofeliciano ,
Here are a few additional suggestions that may help:
1. Use partitioning: Partitioning your fact table lets the database engine access only the relevant partitions when executing a query. Partitioning on a date column would align well with your volume of approximately 1 million records per day (see the sketch after this list).
2. Use columnstore indexes: Columnstore indexes store data in a column-wise format, which is more efficient for aggregations over large fact tables. A clustered columnstore index on the fact table takes advantage of this (also shown in the sketch below).
3. Use DirectQuery mode: If your fact table is too large to import into Power BI, consider DirectQuery mode instead, which queries the data source directly without importing the data into Power BI. It comes with trade-offs, such as slower per-query performance and less flexibility in terms of visualizations.
4. Optimize your queries: Use filters to limit the amount of data being queried, pre-aggregate data before it reaches the visuals, and use query profiling tools to identify and tune slow-performing queries. For distinct counts specifically, only certain pre-aggregations remain valid; a safe variant is sketched after this list.
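As a concrete illustration of points 1 and 2, here is a minimal T-SQL sketch. All names (FactSales, OrderDate, PF_Date, PS_Date, CCI_FactSales) are hypothetical and the boundary list is abbreviated; adapt them to your own schema and date range:

```sql
-- Monthly partition function and scheme keyed on the fact table's date column.
-- RANGE RIGHT means each boundary value starts a new partition.
CREATE PARTITION FUNCTION PF_Date (DATE)
AS RANGE RIGHT FOR VALUES ('2023-01-01', '2023-02-01', '2023-03-01');

CREATE PARTITION SCHEME PS_Date
AS PARTITION PF_Date ALL TO ([PRIMARY]);

-- Clustered columnstore index built on the partition scheme: queries that
-- filter on OrderDate read only the matching partitions, and the columnar
-- storage compresses well and speeds up large aggregations.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales
ON PS_Date (OrderDate);
```

With roughly 1 million rows per day, monthly partitions hold around 30 million rows each, comfortably above the ~1 million-row rowgroup size that columnstore compression works with.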
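On point 4: most pre-aggregations break DISTINCTCOUNT, but one that stays exact is collapsing the fact to one row per combination of the keys you actually report on, since removing duplicate rows never changes a distinct count. A sketch, again with hypothetical table and column names:

```sql
-- Reduce the fact to one row per (date, dimension keys, CustomerKey).
-- COUNT(DISTINCT CustomerKey) over this table matches the raw fact for any
-- slice by these keys, because duplicate rows never affect a distinct count.
SELECT
    OrderDate,
    ProductKey,
    StoreKey,
    CustomerKey,
    SUM(SalesAmount) AS SalesAmount  -- additive measures can still be summed
INTO dbo.FactSales_Reduced
FROM dbo.FactSales
GROUP BY OrderDate, ProductKey, StoreKey, CustomerKey;
```

This only shrinks the import if the raw fact contains many rows per key combination; if every row is already unique at that grain, it buys nothing.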
Best regards,
Community Support Team_yanjiang
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
@afonsofeliciano , I hope you are on Premium; you could run a POC on Microsoft Fabric.
On Premium you can enable it for users (Pro/PPU users can use the 60-day trial).
Microsoft Fabric capacities are available for purchase: https://blog.fabric.microsoft.com/en-us/blog/announcing-microsoft-fabric-capacities-are-available-fo...
What is Microsoft Fabric and How it Works
https://www.youtube.com/watch?v=p-v0I5S-ybs&list=PLPaNVDMhUXGYU97pdqwoaociLdwyDRn39
But is there something that helps with millions of rows when importing data into Power BI? I cannot sum distinct-count metrics because they are non-additive, and the volume is too large for Power BI Premium, even using incremental refresh.