We are a very large company working with very large data volumes: primarily a fact table with more than 800 million records and 350+ columns, amounting to around 80-90 GB of data. The model will also contain a few dimension tables.
For this scenario, is Fabric capacity a good fit for Power BI performance? Please suggest solutions as soon as possible.
Thanks,
Ajit
Hi @AjitSingh100488 ,
Yes, Microsoft Fabric is capable of handling that much data with higher SKUs: an F128 supports semantic models up to roughly 50 GB and an F256 up to roughly 100 GB. But that is expensive, so the first step is to optimize the data model and ask whether that much data really needs to be loaded for BI:
1. Take only the required columns.
2. Filter the warehouse data down to just what the reports need.
3. Aggregate the data on a particular key element, e.g. sales amount per vendor or region (a sketch follows this list).
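As a rough illustration of points 1-3, here is a minimal Power Query (M) sketch. The server, database, table, and column names are hypothetical placeholders, not anything from your environment:

```
// Power Query (M) sketch: keep only needed columns, filter rows, and aggregate.
// "FactSales" and all column names here are hypothetical placeholders.
let
    Source = Sql.Database("yourserver.database.windows.net", "SalesDW"),
    FactSales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],

    // 1. Take only the columns the visuals actually use
    KeepColumns = Table.SelectColumns(
        FactSales, {"OrderDate", "VendorKey", "RegionKey", "SalesAmount"}),

    // 2. Filter out rows the reports never show (this folds to the warehouse)
    RecentRows = Table.SelectRows(
        KeepColumns, each [OrderDate] >= #date(2023, 1, 1)),

    // 3. Aggregate to the reporting grain, e.g. daily sales per vendor/region
    Aggregated = Table.Group(
        RecentRows,
        {"OrderDate", "VendorKey", "RegionKey"},
        {{"SalesAmount", each List.Sum([SalesAmount]), type nullable number}})
in
    Aggregated
```

Because all three steps can fold back to the warehouse, only the reduced, aggregated result ever reaches Power BI.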
Then, based on the data model, decide between DirectQuery and import mode.
A few points to note: it is not a good idea to bring a large volume of data into Power BI, and in DirectQuery mode each query is restricted to returning at most 1 million rows.
If this post helps, please accept this as a solution. Appreciate your kudos.
Thanks,
Pallavi
Hi @pallavi_r ,
Thanks for your response and time on my question; your answer helped me understand this situation much better.
Thanks,
Ajit
No worries @AjitSingh100488. If you have any further questions about large data volumes, please raise them here.
Thanks,
Pallavi
Hi @pallavi_r ,
Greetings!
I have three questions here on Power BI:
1. Can you please explain in detail the parameters for import mode, and how many types are available in Power BI? Basically, can I use a parameter to restrict the number of rows to maybe a few thousand in Desktop, and once published, refresh it on the service with millions of records to test import issues?
2. What are the standard limitations of an import-mode semantic model?
3. How do I resolve the error in the image below?
Please help me with these questions.
Thanks,
Ajit.
Hi @pallavi_r ,
Can you please help me check performance-optimization techniques for a semantic model with large data volumes (a fact table with more than 800 million records and 350+ columns, amounting to around 80-90 GB of data; the model will also contain a few dimension tables) using Fabric capacity?
Please share your suggestions.
Thanks,
Ajit
Hi @AjitSingh100488 ,
Here are a few tips for assessing the actual data model size and working on performance. There is more to it; I am sharing a few good links now and will share more afterwards for your reference.
A. How to assess your model size (since you estimate 80-90 GB):
1. Evaluate the Power BI model size for one year of data, say 2024, using Power BI Desktop.
2. Filter the fact and dimension table rows to year 2024 only (a filter sketch follows this list).
3. Take only the columns you really need in the visualizations.
4. Remove rows that are not required.
5. Check the size of the Power BI file after importing the data.
6. If one year of data is still too large to load in Power BI Desktop, repeat the exercise with 6 months.
7. From the above steps, extrapolate what the actual size of your model would be for all years.
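For step 2, a minimal M sketch of the year filter; the query name FactSales and the OrderDate column are hypothetical placeholders:

```
// Power Query (M) sketch: restrict a fact table to one year for size testing.
// "FactSales" and "OrderDate" are hypothetical placeholders.
let
    Source = FactSales,
    // Keep only 2024 rows so the resulting .pbix reflects a single year of data
    OneYear = Table.SelectRows(Source, each Date.Year([OrderDate]) = 2024)
in
    OneYear
```

Apply the same kind of filter to the dimension tables where it makes sense, then check the saved file size.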
B. Tips for model optimization:
1. As mentioned, reduce the number of columns and rows to only the required ones.
2. Decide how much historical data the dashboard really needs. There may be 5-10 years of data, but consider whether the business gets any value from insights derived from 5-year-old data.
3. If the fact table has a huge number of rows, decrease the granularity with a group-by. Aggregate the data; see whether aggregation can be done on daily records, e.g. sales generated by vendor per region, state, etc.
4. If daily aggregation does not help and there are still too many records, decide how frequently the data needs to be refreshed (daily, weekly, monthly?) and whether it can be aggregated on a weekly or monthly basis.
5. Column cardinality: round off decimals instead of storing full precision, and remove milliseconds by rounding DateTime values.
6. Try to do all column derivations and calculations in the backend, before bringing the data into Power BI. That way the derived or calculated columns arrive ready-made, and Power BI visuals are not burdened with calculations at load time.
7. Use variables in DAX calculations with multiple steps: https://learn.microsoft.com/en-us/training/modules/optimize-model-power-bi/3-variables
8. If you have chosen import mode, set up incremental refresh (a parameter sketch follows this list).
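For point 8, here is a minimal sketch of the filter that incremental refresh requires. The RangeStart and RangeEnd parameter names are mandated by the feature itself; the table and column names are hypothetical placeholders:

```
// Power Query (M) sketch: the refresh-window filter required by incremental refresh.
// RangeStart and RangeEnd must be defined as DateTime parameters in Power Query;
// "FactSales" and "OrderDateTime" are hypothetical placeholders.
let
    Source = FactSales,
    // Incremental refresh replaces only the rows inside this window;
    // the filter should fold to the source so each refresh stays cheap
    Windowed = Table.SelectRows(
        Source, each [OrderDateTime] >= RangeStart and [OrderDateTime] < RangeEnd)
in
    Windowed
```

After publishing, configure the incremental refresh policy on the table in Power BI Desktop so the service only refreshes the recent partitions instead of the whole 800-million-row table.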
Post-optimization, check the size and decide on the capacity / Fabric SKU.
Total memory consumed on the service ≈ dataset size + refresh overhead + report interactions + concurrent user/dataset activities.
A few useful links:
https://data-mozart.com/how-to-reduce-your-power-bi-model-size-by-90/
https://learn.microsoft.com/en-us/power-bi/guidance/power-bi-optimization
If this post helps, please accept this as a solution. Appreciate your kudos.
Thanks,
Pallavi
Hi @pallavi_r ,
Thanks for your quick response.
Can you please explain what query folding is and how we can make the best use of it to improve performance?
Thanks,
Ajit
Hi @AjitSingh100488 ,
Query folding is a must, not a choice, for a heavy dataset.
Query folding is a Power Query optimization in which the defined set of transformation steps (filter, sort, aggregate, etc.) is translated into the data source's native query language and executed at the source, rather than bringing the data into Power Query and processing it separately.
This matters especially for refreshing heavy datasets: a refresh that takes 40-60 minutes can drop to just a few minutes once query folding happens end to end.
Query folding is supported only for certain data sources, such as SQL Server, Oracle, and other relational databases.
You can easily check whether folding succeeded: in the Power Query editor, right-click the last step in the Applied Steps pane of Query Settings and see whether View Native Query is enabled (an example of folding-friendly steps follows).
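For illustration, a minimal sketch of steps that fold to SQL Server; the server, database, table, and column names are hypothetical placeholders:

```
// Power Query (M) sketch: steps that fold into a single native SQL query.
// All names below are hypothetical placeholders.
let
    Source = Sql.Database("yourserver.database.windows.net", "SalesDW"),
    Orders = Source{[Schema = "dbo", Item = "Orders"]}[Data],
    // Both steps fold: they become the WHERE clause and SELECT list of one
    // native query, so the source does the work instead of the Power Query engine
    Filtered = Table.SelectRows(Orders, each [OrderDate] >= #date(2024, 1, 1)),
    Trimmed = Table.SelectColumns(Filtered, {"OrderID", "OrderDate", "Amount"})
in
    Trimmed
```

If View Native Query is greyed out on the last step, a preceding step has broken folding, and everything after that step runs in the Power Query engine instead of at the source.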
If this post helps, please accept this as a solution. Appreciate your kudos.
Thanks,
Pallavi
@pallavi_r You will also want to emphasize that for query folding to succeed, the data source has to be capable of and ready to shoulder the additional load. There are many scenarios where query folding actually makes things worse, up to and including crashing the source. In some cases simple table spools are the better choice, even for large tables.
Hi @lbendlin /@pallavi_r -
Greeting!!
I am trying to connect to a Databricks server which is purely on the cloud. I have tried both DirectQuery and import mode for a table of approximately 2 GB, but it throws me an error (attached here).
Please help to resolve it, if possible.
Thanks,
Ajit.
If you have a Pro license you can open a Pro ticket at https://admin.powerplatform.microsoft.com/newsupportticket/powerbi
Otherwise you can raise an issue at https://community.fabric.microsoft.com/t5/Issues/idb-p/Issues .
Hi @AjitSingh100488 ,
Can you also please tell me approximately how many read-only users and how many Pro users will be using the application?
Thanks,
Pallavi
Hi @pallavi_r ,
Around 900 to 1,000 active users are currently using TS, which is supposed to be the base for Power BI.
Thanks,
Ajit.
This is a forum where users help users, time permitting. For urgent requests, contact a Microsoft partner near you.