Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount.
Register now
Hi All.
I have a dataset of 2.9M rows which, after unpivoting and merging with one table, grows to roughly 246M rows. After that I have to remove duplicates from the table (based on Col1, Col2, Col3 … Col7). When I then Close & Apply, the dataset loads for 3-4 hours and then fails with a "buffer memory out of space" error.
Could anyone please help me out on this?
Possible solutions tried till now:
1. Took a subset of 75,000 rows, applied all the steps, and confirmed everything worked. After getting satisfactory results I published the dashboard to the Power BI service, renamed the 2.9M-row file to match the 75,000-row file name, and refreshed it in the service. Result: this didn't work, as the refresh does not pick up the 2.9M rows.
2. Tried joining the tables instead, but that didn't work out either (many-to-one join, but the corresponding values are unique, so the row counts don't match).
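As a side note on why the row count explodes: unpivoting multiplies the row count by the number of value columns, which is exactly how 2.9M rows can balloon to ~246M. A minimal sketch in pandas (hypothetical column names; `melt` is pandas' unpivot):

```python
import pandas as pd

# Tiny stand-in table: 3 key columns and 4 value columns.
df = pd.DataFrame({
    "Col1": [1, 2], "Col2": ["a", "b"], "Col3": ["x", "y"],
    "V1": [10, 20], "V2": [30, 40], "V3": [50, 60], "V4": [70, 80],
})

# Each of the 4 value columns becomes its own row per input row.
long_df = df.melt(id_vars=["Col1", "Col2", "Col3"],
                  var_name="Attribute", value_name="Value")

print(len(df), "rows before unpivot")   # 2
print(len(long_df), "rows after")       # 2 * 4 = 8
```

The same multiplication applies at scale: 2.9M rows times ~85 value columns lands in the 246M range, so anything filtered out *before* the unpivot saves that multiple in memory.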
Hi @Ayushman1997 ,
When a dataset expands from 2.9M rows to 246M rows after unpivoting and merging, the "buffer memory out of space" error comes from excessive memory consumption in Power Query: the unpivot drastically multiplies the row count, the merge strains resources further, and removing duplicates on such a massive result adds still more load, so Power BI runs out of memory after 3-4 hours of processing. A few things to consider:
- Reduce the data size before merging: filter out unnecessary rows and columns as early as possible, which can significantly reduce memory usage and improve query performance.
- Avoid unpivoting in Power Query where you can; preferably do it at the source level using SQL or an ETL tool.
- Use Table.Buffer() on the smaller table to optimize performance.
- If merging is necessary, ensure the joining columns are indexed at the source for faster execution.
- Instead of performing all transformations in Power Query, pre-aggregate the data, use joins at the source level, or split the dataset into smaller tables while maintaining relationships in the Power BI model.
let
    // Load the source table (placeholder step: replace with your actual
    // connector, e.g. Sql.Database or Excel.Workbook)
    SourceTable = YourLargeSource,
    // Keep only the columns needed downstream, before unpivoting
    SelectedColumns = Table.SelectColumns(SourceTable, {"KeyColumn", "Col1", "Col2", "Col3", "Col4", "Col5", "Col6", "Col7", "ValueColumn"}),
    // Unpivot everything except the key columns
    UnpivotedTable = Table.UnpivotOtherColumns(SelectedColumns, {"KeyColumn", "Col1", "Col2", "Col3"}, "Attribute", "Value"),
    // Buffer the smaller table once so the merge does not re-evaluate it
    SmallTable = Table.Buffer(YourSmallSource),
    // Perform the merge as a left outer join on the key column
    MergedTable = Table.NestedJoin(UnpivotedTable, "KeyColumn", SmallTable, "KeyColumn", "MergedData", JoinKind.LeftOuter),
    // Expand only the columns you actually need from the merged table
    ExpandedTable = Table.ExpandTableColumn(MergedTable, "MergedData", {"NewColumn1", "NewColumn2"}),
    // Remove duplicates on the key columns only, rather than comparing whole rows
    DeduplicatedTable = Table.Distinct(ExpandedTable, {"Col1", "Col2", "Col3", "Attribute"})
in
    DeduplicatedTable
This optimized query improves performance by removing unneeded columns before unpivoting, buffering the smaller table before the merge, and de-duplicating only on the key columns, all of which helps prevent memory issues.
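If it helps to prototype the transformation logic outside Power BI before committing to a long refresh, the same pipeline can be sketched in pandas (hypothetical table and column names; `merge`/`drop_duplicates` play the roles of Table.NestedJoin and Table.Distinct):

```python
import pandas as pd

# Stand-in fact table: rows 1 and 2 are deliberate duplicates.
fact = pd.DataFrame({
    "KeyColumn": ["k1", "k1", "k2"],
    "Col1": [1, 1, 2], "Col2": ["a", "a", "b"], "Col3": ["x", "x", "y"],
    "V1": [10, 10, 30], "V2": [20, 20, 40],
})
# Small lookup ("dimension") table to merge in.
dim = pd.DataFrame({"KeyColumn": ["k1", "k2"], "NewColumn1": ["p", "q"]})

# 1) Keep only the columns needed downstream.
fact = fact[["KeyColumn", "Col1", "Col2", "Col3", "V1", "V2"]]
# 2) Unpivot the value columns.
long_df = fact.melt(id_vars=["KeyColumn", "Col1", "Col2", "Col3"],
                    var_name="Attribute", value_name="Value")
# 3) Left-join the small table.
merged = long_df.merge(dim, on="KeyColumn", how="left")
# 4) De-duplicate on the key columns only, like Table.Distinct with a column list.
deduped = merged.drop_duplicates(subset=["Col1", "Col2", "Col3", "Attribute"])

print(len(merged), len(deduped))  # 6 4
```

Running this on a sample quickly confirms whether the dedup keys are right before the full 246M-row refresh is attempted.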
Hi there!
- Instead of unpivoting first, filter out unnecessary columns and aggregate before you unpivot; you can use Remove Columns in Power Query for this.
- You could also try using incremental refresh. Enable Incremental Refresh in the Power BI service and filter data dynamically in Power Query, e.g.: Table.SelectRows(Source, each [Date] >= Date.AddMonths(Date.From(DateTime.LocalNow()), -12)). Don't forget to set up the partitioning (the RangeStart/RangeEnd parameters) so that Power BI loads data in chunks instead of one large load.
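The rolling 12-month window that the filter step above produces can be prototyped in pandas (a hedged sketch with a hypothetical "Date" column and a fixed "today" so the result is deterministic):

```python
import pandas as pd

now = pd.Timestamp("2025-03-01")          # fixed "today" for reproducibility
cutoff = now - pd.DateOffset(months=12)   # 12 months back, like Date.AddMonths(..., -12)

df = pd.DataFrame({
    "Date": pd.to_datetime(["2023-01-15", "2024-06-01", "2025-02-20"]),
    "Value": [1, 2, 3],
})

# Keep only rows inside the rolling 12-month window.
recent = df[df["Date"] >= cutoff]
print(recent["Value"].tolist())  # [2, 3] -- the 2023 row falls outside the window
```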
Hope this helps!
😁😁
All unnecessary columns are deleted before unpivoting. The data covers one year and it really does contain 2.9M rows minimum 😅. Also, incremental or not, they want the full year of data to be shown, so I guess there's no point in adding incremental refresh; with scheduled refresh it should be fine.
Hi @Ayushman1997,
Thank you for reaching out to the Microsoft Fabric Community Forum.
After thoroughly reviewing the details you provided, here are a few workarounds that may help resolve the issue. Please follow these steps:
If this post helps, then please give us ‘Kudos’ and consider Accept it as a solution to help the other members find it more quickly.
Best Regards.