Ayushman1997
Frequent Visitor

Power BI Desktop

#QuestionForGroup
Hi All.

I have a dataset of 2.9M rows which, after unpivoting and merging with one other table, becomes roughly 246M rows. After that I have to remove duplicates from the table (based on Col1, Col2, Col3 …. Col7). After this step, when I close and apply, the dataset loads for 3-4 hours and then an error pops up: “buffer memory out of space”.

Could anyone please help me out on this?

Possible solutions tried till now:
1. Took a dataset of 75,000 rows, performed all the steps, and checked that everything worked. After getting satisfactory results, I uploaded the dashboard to Power BI Service, renamed the 2.9M-row file to match the 75,000-row file name, and refreshed it in Power BI Service. Result: this didn’t work, as it is not taking the 2.9M rows.

2. Tried joining the tables instead, but that doesn’t work either (many-to-one join, but the corresponding values are unique), so the counts don’t match.

 

Note: ALL UNNECESSARY COLUMNS ARE REMOVED 


16 REPLIES
rohit1991
Super User

Hi @Ayushman1997,

 

Here’s a breakdown of what usually works best in real-world scenarios:

 

1. Always filter and trim before unpivoting or merging. Don’t wait until after the unpivot to remove unnecessary rows/columns; do it as the very first step. If you can, filter at the SQL/source level before Power BI even touches the data.

 

2. Unpivot only the columns you actually need. The fewer columns you unpivot, the less Power Query has to process and the less memory is used.

 

3. When merging, always buffer the smaller table. Use Table.Buffer() on the smaller table before the merge. Don’t buffer massive tables; it can actually make things worse.

 

4. Remove duplicates as early as possible. If you need to deduplicate, do it before merging. If your source supports it (SQL, Python, etc.), do the deduplication outside of Power BI.

 

5. Use DirectQuery or Dataflows for really large data. If your source is a SQL DB, see if DirectQuery is an option (it avoids loading everything into memory). Or use a Power BI Dataflow if your org allows it; it’s more scalable than desktop Power Query for big crunches.

 

6. Disable load for intermediate queries in Power Query. Right-click on queries that are just intermediate steps and untick “Enable Load”; this can save a ton of RAM.

 

7. Try splitting your transformation into stages. If possible, pre-process your data into smaller, “staged” files or tables. Sometimes breaking one huge import into smaller parts solves the problem.

 

8. Consider incremental refresh or partitioning. If your report will keep growing, look at incremental refresh to load data in chunks (incremental refresh also works on Pro; custom partitioning via the XMLA endpoint requires Premium/Fabric capacity).
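
For reference, incremental refresh filters in Power Query are built on two reserved DateTime parameters, RangeStart and RangeEnd, which Power BI substitutes per partition. A minimal sketch of the filter step, assuming your table has a DateTime column named [Date]:

// RangeStart and RangeEnd must be defined as DateTime parameters in Power Query
FilteredByRange = Table.SelectRows(Source, each [Date] >= RangeStart and [Date] < RangeEnd)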

 

Example M code (pattern):

let
    // Source stands in for your real source step (Sql.Database, Csv.Document, ...)
    Source = YourSource,
    // Filter at the very start
    Filtered = Table.SelectRows(Source, each [Col1] <> null and [Col2] <> null),
    // Keep Col1/Col2 fixed and unpivot only the remaining needed columns
    Unpivoted = Table.UnpivotOtherColumns(Filtered, {"Col1", "Col2"}, "Attribute", "Value"),
    // Buffer only the small table before the merge
    SmallTableBuffered = Table.Buffer(OtherSmallTable),
    // Merge
    Merged = Table.NestedJoin(Unpivoted, "Key", SmallTableBuffered, "Key", "NewData", JoinKind.LeftOuter),
    // Expand & deduplicate
    Expanded = Table.ExpandTableColumn(Merged, "NewData", {"NewCol"}),
    Deduped = Table.Distinct(Expanded)
in
    Deduped

 

Bonus Tips: Close all other heavy apps (Chrome, Teams, etc.) when you’re applying steps. If you have access to a machine with more RAM, try running Power BI Desktop there. Sometimes, it’s honestly faster to do some “pre-crunching” in Python, R, or even Excel, and import the cleaned file into Power BI.


Did it work? ✔ Give a Kudo • Mark as Solution – help others too!

I have checked via an ETL approach and the resulting output takes up 70 GB, which when exported as CSV comes to 22 GB. And when I load the 22 GB file directly, it takes 3 hours.

freginier
Super User

Hi there!

 

- Instead of unpivoting first, filter out unnecessary columns and aggregate before unpivoting; you can use Remove Columns in Power Query for this.

 

- You could also try using incremental refresh. You can enable incremental refresh in Power BI Service and then filter data dynamically in Power Query with a formula like: Table.SelectRows(Source, each [Date] >= Date.AddMonths(Date.From(DateTime.LocalNow()), -12)). Don't forget to set up the partitioning so that Power BI loads data in chunks instead of one large load.

 

Hope this helps!

😁😁

All unnecessary columns are deleted before unpivoting. The data covers 1 year and contains at least 2.9M rows 😅. Also, incremental or not, they want the full year of data shown, so I don't see the point of adding incremental refresh; scheduled refresh should be fine.

v-kpoloju-msft
Community Support

Hi @Ayushman1997,

Thank you for reaching out to the Microsoft Fabric Community Forum.


After thoroughly reviewing the details you provided, here are a few workarounds that may help resolve the issue. Please follow these steps:

 

  • Please ensure that only the necessary columns are included in your dataset and apply filters to minimize the data being loaded. Instead of importing all data into Power BI, use DirectQuery mode to query data directly from the source, which can help manage large datasets more efficiently.
    DirectQuery model guidance in Power BI Desktop - Power BI | Microsoft Learn
  • Increase the maximum memory allowed for Power BI Desktop by navigating to File > Options and Settings > Options > Data Load and increasing the cache memory.
  • Utilize Power BI’s Data Limit feature to set a maximum threshold for data loaded into a report. This helps prevent performance issues and focuses on the most relevant data. Implement incremental refresh to load only new or changed data, rather than refreshing the entire dataset.
  • Write efficient DAX queries and avoid row-by-row calculations to enhance performance. Create aggregated tables to reduce the data processed at runtime (see the sketch below). Consider splitting the large dataset into smaller, more manageable datasets.
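
As a minimal illustration of the aggregated-table idea, a Group By step in Power Query can pre-summarize data before it reaches the model. This sketch assumes hypothetical grouping columns Col1/Col2 and a numeric Value column:

// Collapse detail rows to one row per Col1/Col2 combination
Aggregated = Table.Group(
    Source,
    {"Col1", "Col2"},
    {{"TotalValue", each List.Sum([Value]), type nullable number}}
)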

If this post helps, then please give us ‘Kudos’ and consider accepting it as a solution to help other members find it more quickly.

 

Best Regards.

Hi @Ayushman1997,

 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the solution faster.

 

Thank you.

Hi @Ayushman1997,


I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.


Thank you.

The issue is not solved yet. It's not working even after applying the suggestions from the comments.

Hi @Ayushman1997,
After thoroughly reviewing the details you provided, I have identified alternative workarounds that may help resolve the issue. Please follow these steps:

 

  • Unpivoting can significantly increase the row count. To optimize, filter out unnecessary columns early, before unpivoting, so that only essential ones remain. If your data source is SQL Server or Azure SQL, perform the unpivot operation in SQL before importing into Power BI.
  • Deduplication at the Power Query level can be time-consuming. It is better to deduplicate data in SQL prior to loading it into Power BI. If you must use Power Query, deduplicate as early as possible to reduce the data processed in later steps.
  • Alternatively, use a Group By approach rather than the Remove Duplicates function: group the data by columns Col1 to Col7 and use an aggregation function like Min or Max to deduplicate while optimizing performance (see the sketch after this list).
  • Instead of processing the whole 246M rows, please generate aggregated tables at different levels of granularity. Use summary tables for reporting and keep detailed data in a different dataset.
  • Ensure both tables are sorted before merging. Merging unsorted large tables is expensive. Instead of merging, use relationships in the data model to connect tables.
  • If possible, use a data warehouse to improve performance and manage large datasets efficiently. For Azure SQL Database scenarios, consider moving data to Azure Synapse Analytics and using Direct Query from there. This approach can improve query performance and reduce the load on Power BI by taking advantage of Synapse's scalable processing.
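
To illustrate the Group By deduplication from the list above: grouping on the seven key columns and aggregating each remaining column (for example with Max) keeps exactly one row per key. A minimal sketch, assuming one hypothetical remaining column [OtherCol]:

// One surviving row per Col1..Col7 combination; Max picks a single value per group
Deduped = Table.Group(
    Source,
    {"Col1", "Col2", "Col3", "Col4", "Col5", "Col6", "Col7"},
    {{"OtherCol", each List.Max([OtherCol]), type any}}
)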

If this post helps, then please give us ‘Kudos’ and consider accepting it as a solution to help other members find it more quickly.

 

Best Regards.

Hi @Ayushman1997,

 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the solution faster.

 

Thank you.

Hi @Ayushman1997,


I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.


Thank you.

Hey, hi. No, the solution is not working. No other solution has worked so far, and I have escalated it to the Power BI/Microsoft teams. Once I get a solution, I will post it. As of now, the memory issue is due to the high number of records.

Hi @Ayushman1997,

 

Thank you for sharing your update. We understand that you are encountering a memory issue due to the large number of records in Power BI. Since you have already escalated this to the relevant teams, we appreciate your patience as they investigate further.

In the meantime, you might consider optimizing your dataset by:

  • Using aggregations to reduce data volume.
  • Implementing incremental refresh to load only necessary data.
  • Filtering out unnecessary columns and rows before loading into Power BI (see the sketch below).
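
As a minimal sketch of the third bullet (with hypothetical column names), both the column and row filters can sit at the very top of the query so later steps process less data:

// Keep only required columns, then drop rows that will never be reported on
Trimmed = Table.SelectColumns(Source, {"Col1", "Col2", "Col3"}),
Filtered = Table.SelectRows(Trimmed, each [Col1] <> null)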

Also, please go through the solved thread below for a better understanding:
Solved: Memory run out issues in power bi desktop - Microsoft Fabric Community

 

If this post helps, then please give us ‘Kudos’ and consider accepting it as a solution to help other members find it more quickly.

 

Best Regards


Hi @Ayushman1997,

 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the solution faster.

 

Thank you.

Hi @Ayushman1997,

I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.


Thank you.

 

Hi @Ayushman1997,


I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please Accept it as a solution and give it a 'Kudos' so others can find it easily.


Thank you.
