pennyhoho117
Helper IV

Data Compression Procedure

According to Data reduction techniques for Import modeling - Power BI | Microsoft Learn:

 

"Import models are loaded with data that's compressed and optimized, and then stored to disk by the VertiPaq storage engine. When source data is loaded into memory, it's possible to achieve 10x compression, and so it's reasonable to expect that 10 GB of source data can compress to about 1 GB in size. Further, when persisted to disk an extra 20% reduction can be achieved."

 

Questions:

1. When an Import mode report is published to the Power BI service, how is the data compressed and optimized?

2. After the data has been compressed and optimized, is the 10x compression then achieved by VertiPaq?

3. So is the complete process: publish report > compress and optimize > achieve 10x compression by VertiPaq?

4. When a report has already been published to the Power BI service and a scheduled refresh runs, is the query result compressed and optimized by the data gateway and then stored in the Power BI service?

 

 

1 ACCEPTED SOLUTION
powerbidev123
Solution Sage

Hi @pennyhoho117 

1. When an import mode report is published to Power BI Service, how is the data being compressed and optimized?

When a report in Import Mode is published to Power BI Service, the underlying data model (powered by VertiPaq) is highly compressed and optimized using columnar storage and dictionary encoding. The key optimizations include:

  • Columnar Storage: Data is stored in a column-wise manner, reducing redundancy.
  • Encoding & Dictionary Compression: Repetitive values are stored as dictionary references rather than full values.
  • Run-Length Encoding (RLE): Sequences of repeated values are replaced with a single value and a count.
  • Bit-Packing & Data Type Optimization: Efficient storage of numeric values using the smallest required data types.
This compression and optimization happen in Power BI Desktop, before the report is published to the Power BI Service.
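To make the techniques above concrete, here is a minimal, illustrative Python sketch of dictionary encoding and run-length encoding applied to a single column. This is a toy model only; VertiPaq's actual storage format is proprietary and far more sophisticated (segment-level encoding choices, bit-packing, and so on).

```python
# Toy sketch of two VertiPaq-style column compressions (illustrative only).

def dictionary_encode(column):
    """Store each distinct value once; the column itself becomes small integer IDs."""
    dictionary = {}
    ids = []
    for value in column:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        ids.append(dictionary[value])
    return dictionary, ids

def run_length_encode(ids):
    """Collapse runs of repeated IDs into (id, count) pairs."""
    runs = []
    for i in ids:
        if runs and runs[-1][0] == i:
            runs[-1][1] += 1
        else:
            runs.append([i, 1])
    return [tuple(r) for r in runs]

column = ["Red", "Red", "Red", "Blue", "Blue", "Red"]
dictionary, ids = dictionary_encode(column)
print(dictionary)              # {'Red': 0, 'Blue': 1}
print(run_length_encode(ids))  # [(0, 3), (1, 2), (0, 1)]
```

Note how six string values collapse into a two-entry dictionary plus three short runs; the same idea scales to millions of rows when a column repeats values heavily.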

2. After it has been compressed and optimized, does VertiPaq achieve a 10x compression?

Yes, typically VertiPaq can achieve 10x compression, but the actual ratio depends on:

  • The structure of your data (number of unique values, data types, etc.).
  • The presence of high cardinality columns (e.g., unique IDs reduce compression efficiency).
  • How efficiently dictionary encoding and RLE can be applied.
Some datasets may achieve higher compression (20x-100x), while others with many unique values may see less.
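As a rough illustration of the cardinality point, the back-of-the-envelope estimator below (a hypothetical helper, not any real VertiPaq API) compares the approximate dictionary-encoded size of a low-cardinality column with a high-cardinality one: with only four distinct values, 2-bit IDs suffice, while a column of unique IDs gains almost nothing because the dictionary must store every value anyway.

```python
def dict_encoded_bytes(column, id_bits):
    """Very rough size estimate for a dictionary-encoded column:
    each distinct value stored once, plus a fixed-width ID per row."""
    distinct = set(column)
    dictionary_bytes = sum(len(str(v)) for v in distinct)
    ids_bytes = len(column) * id_bits // 8
    return dictionary_bytes + ids_bytes

# 1M rows, 4 distinct region names: 2-bit IDs are enough.
low_card = ["North", "South", "East", "West"] * 250_000
# 1M rows, all values unique: ~20-bit IDs, and the dictionary
# is as large as the raw column, so little is saved.
high_card = [f"ID-{i:07d}" for i in range(1_000_000)]

print(dict_encoded_bytes(low_card, id_bits=2))    # ~250 KB
print(dict_encoded_bytes(high_card, id_bits=20))  # ~12.5 MB
```

This is why removing or splitting high-cardinality columns (unique IDs, timestamps at second precision, etc.) is one of the most effective ways to shrink an Import model.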

3. Is the complete process: Publish report > Compressed and optimized > Achieve 10x compression by VertiPaq?

Not exactly; the correct sequence is:

  1. Data Import & Model Creation: Data is loaded into Power BI Desktop.
  2. Compression & Optimization (VertiPaq Engine): Data is compressed and optimized within Power BI Desktop.
  3. Publish to Power BI Service: The optimized .pbix file (including the compressed model) is uploaded to Power BI Service.
  4. Storage in Power BI Service: The compressed model is stored in memory when loaded.

4. How does a published report refresh in Power BI Service? Does the Data Gateway compress and optimize data?

When a report is already published to Power BI Service and a Scheduled Refresh runs:

  1. Data Source Query Execution: Power BI queries the source (via a Data Gateway if the source is on-premises).
  2. Data Compression (via VertiPaq) in Power BI Service:
    • The refreshed data is re-compressed and optimized using the same VertiPaq techniques.
  3. Storage in Power BI Service: The newly compressed data replaces the old dataset in Power BI Service.

🔹 The Data Gateway does not perform compression—it only facilitates query execution and data transfer. The actual compression happens inside Power BI Service, just like it does in Power BI Desktop.


8 REPLIES
v-nmadadi-msft
Community Support

Hi @pennyhoho117,

As we haven't heard back from you, we wanted to kindly follow up and check whether the solution provided by the community members worked for your issue. If our response addressed it, please mark it as the accepted solution and give it a Kudos if you found it helpful.

 

Thanks and regards

v-nmadadi-msft
Community Support

Hi @pennyhoho117 ,

I wanted to check whether you have had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If our response has addressed your query, please accept it as the solution and give it a Kudos so other members can find it easily.


Thank you.

v-nmadadi-msft
Community Support

Hi @pennyhoho117,

May I ask whether you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the answer faster.

Thank you.

v-nmadadi-msft
Community Support

Hi @pennyhoho117  ,
Thanks for reaching out to the Microsoft fabric community forum.
Thanks @powerbidev123 for such a detailed and thorough solution. In addition to their points, I would like to mention the following.

When you publish a Power BI Desktop file to the Power BI service, you publish the data in the model to your Power BI workspace. The same is true for any reports you created in Report view: you'll see a new semantic model with the same name, along with any reports, in your workspace navigator. Publishing from Power BI Desktop has the same effect as using Get Data in Power BI to connect to and upload a Power BI Desktop file. This means that if you import data into Desktop and the compressed model is about 1 GB, the semantic model in the service will be about the same size after publishing.

For the third question, please note that the correct order, as mentioned by @powerbidev123, is: import data > compress, optimize, and achieve ~10x compression with VertiPaq > publish.

 


If you find this post helpful, please mark it as the accepted solution and consider giving a Kudos.
Thanks and Regards

So according to point 3, the VertiPaq engine is installed together with Power BI Desktop?

So the size of the .pbix file is already compressed and optimized, and it cannot exceed the 1 GB limit, right?

But the .pbix includes the visualizations (Report view), so do the visualizations also count toward the 1 GB limit?

What if the .pbix exceeds the 1 GB limit?

 

Hi @pennyhoho117  ,
Thanks for reaching out to the Microsoft fabric community forum.

VertiPaq works in the background to boost the performance of Power BI reports.
Visualizations do count toward the 1 GB limit, but the report visuals themselves typically take up much less space than the data model.
If the .pbix file exceeds the 1 GB limit, you have to optimize the data model to reduce its size, or switch to DirectQuery mode.

If you find this post helpful, please mark it as the accepted solution and consider giving a Kudos.
Thanks and Regards

