Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
Needn
Frequent Visitor

What is the mechanism of data compression in PBIX file?

Hello,

 

I'm doing a test about the data compression ratio of PBIX file.

I've created 2 tables with the same schema in SQL Server, wide_fact1 and wide_fact2 and import them into Power BI desktop.

The difference is only the order of data between the two.

table1_vs_table2.png

 

Then I used VertiPaq Analyzer in DAX studio to analyze data size, got the same total size value.

wide_table1.png

wide_table2.png

But large difference in PBIX file as following.

pbix_size.png

 

I want to know what happened.

I've posted a github discussion, I'm appreciated @marcorusso  gave me some good idea.

See here https://github.com/DaxStudio/DaxStudio/discussions/541

But my question has not been resolved, that's why I post this topic here.

 

Please feel free to download PBIX files if you want

https://neednlab.blob.core.windows.net/public/wide_table1.pbix
https://neednlab.blob.core.windows.net/public/wide_table2.pbix

 

Thanks

6 REPLIES 6
negi007
Community Champion
Community Champion

@Needn there are various tricks and techniques you can use to reduce your datasize. I hope you would have done some simple things like

1. Not loading tables in powerquery which are being used only for support or aggregation purpose

2. Removing columns not being used

3. Using integer instead of text wherever possible

4. Creating one date table for all all time intelligence calculation instead of having it in multiple tables

5. Importing data at aggregate level if possible

 

You may refer to below links for help

https://docs.microsoft.com/en-us/power-bi/guidance/import-modeling-data-reduction

https://blog.pragmaticworks.com/power-bi-performance-tips-and-techniques

 




Did I answer your question? Mark my post as a solution!
Appreciate your Kudos



Proud to be a Super User!


Follow me on linkedin

Needn
Frequent Visitor

Yes, I've read power bi docs about the data reduction techniques and it's helpful.

But in this topic, I just want to focus on the question: why there is a big difference of compression ratio between the 2 PBIX files. Thanks

negi007
Community Champion
Community Champion

@Needn I can't download PBIX file. there seems to be an error with the link. Please check and share the correct link.




Did I answer your question? Mark my post as a solution!
Appreciate your Kudos



Proud to be a Super User!


Follow me on linkedin

Needn
Frequent Visitor

oops. please try again @negi007 

negi007
Community Champion
Community Champion

@Needn Yes, now I can access both the files. I did lot of reasearch on both the files and try to keep the settings same however it did not result in any change in the size of the file. I even open both the files in the dax editor to compare table and other properites. but everythinng looks same for both the files.

 

You can try reading below blog as well it will give you some information about various information about the pbix file. It might take you in the right direction

 

https://www.fourmoo.com/2017/05/02/what-makes-up-a-power-bi-desktop-pbix-file/

 

However, I was able to reduce the size of the file1. To reduce the size of the file 1 this is what I did. I ziped both file 1 and file 2 just by chaning the extension of the file from pbix to zip. After you rename the file to zip file and when you open it you get below information. As you can see DataModel is the one having bigger size then others. 

 

negi007_5-1612610638415.png

So next I replaced the DataModel file from my File2 to File1. After this I changed the extension of the file back to pbix. Below is the final size of both the files

 

negi007_4-1612610612265.png

 

 

However, to your question why files sizes are different I still could not find out the reason. May be there are few properties in the powerquery window that is determining the file size or may be not. Let's hope someone will be able to find out the reason.

 

Cheers for asking interesting query.

 

 

 




Did I answer your question? Mark my post as a solution!
Appreciate your Kudos



Proud to be a Super User!


Follow me on linkedin

Needn
Frequent Visitor

Thanks for your idea. I guess the difference of file size is due to a further X9 compression. As mentiond by Marco Russo, it is not compressed at VertiPaq level. @negi007 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.