I have a self-join in one of my Power Query Dataflow Gen2 tables, which I use to determine gaps between rent conditions. The output of this table is written to a Lakehouse data destination.
Refreshing the Dataflow Gen2 takes 6 hours and fails with a "Mashup Error".
Detailed error message -> 'Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: Evaluation ran out of memory and can't continue. Details: '. Error code: EvaluationOutOfMemoryError.
1. I tried writing the intermediate result to a table and referencing that table to apply the rest of the transformations. The DF2 refresh still failed.
2. When I apply a filter and keep only one cost centre in the table, the DF2 refresh runs successfully.
Has anyone experienced this issue before? I know a Power Query merge is not recommended here. Can you please let me know if there are any alternative ways to achieve this without using a self-join?
Thanks,
Smrithi
Solved!
Hi @_Smrithi ,
As we haven't heard back from you, we wanted to kindly follow up to check whether the solution provided by the community members worked for you. If our responses addressed your issue, please mark the reply as 'Accept as Solution' and click 'Yes' if you found it helpful.
Thanks and regards
Hi @_Smrithi
I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.
Hi @_Smrithi
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution, so other community members with similar problems can find the answer faster.
Thank you.
Hi @_Smrithi ,
Very difficult to help based on the information provided.
What source(s) are you using? What other transformations are you doing in the query? How many rows are in your table? What gateway are you using? What other refreshes are scheduled at the same time? Etc.
Please have a read through this and help us to help you:
https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/4381...
Pete
Proud to be a Datanaut!
Hi @BA_Pete,
Thanks for the response. I have documented my requirements below. Could you take a look and suggest an approach?
Note: My dataflow is in Dataflow Gen2.
Requirement:
I have a table with date fields "Valid From" and "Valid To".
I aim to find the gap between the "Valid From" of the current record and the "Valid To" of the previous record.
To achieve this, I am creating indices on the table and self-joining the table on those indices.
Because of the self-join, the dataflow takes too long to refresh and sometimes the refresh fails.
Can you please suggest an alternative to the self-join that achieves the same result?
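For comparison outside Power Query, the index-and-self-join approach described above can be sketched in Python/pandas. The column names and sample dates below are hypothetical, and the sketch assumes the rows are already sorted by validity period:

```python
import pandas as pd

# Hypothetical sample mirroring the Valid From / Valid To fields
df = pd.DataFrame({
    "Key": [1, 2, 3, 4],
    "Valid From": pd.to_datetime(["1/1/2025", "1/7/2025", "1/15/2025", "1/25/2025"]),
    "Valid To":   pd.to_datetime(["1/5/2025", "1/10/2025", "1/20/2025", "1/30/2025"]),
})

# Index-and-self-join approach: add an index, then merge the table
# onto itself offset by one so each row can see the previous row.
df["Index"] = range(len(df))
prev = df[["Index", "Valid To"]].rename(
    columns={"Index": "NextIndex", "Valid To": "PrevValidTo"})
prev["NextIndex"] = prev["NextIndex"] + 1  # align each Valid To with the next row
joined = df.merge(prev, left_on="Index", right_on="NextIndex", how="left")

# Gap between current Valid From and previous Valid To
joined["Gap (days)"] = (joined["Valid From"] - joined["PrevValidTo"]).dt.days
```

The merge step is what gets expensive at scale, since the engine has to match every row of the table against an offset copy of itself.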
Please see my answers to your questions:
What source(s) are you using? My source is files from a Lakehouse.
What other transformations are you doing in the query? I have grouping, indexing, custom calculated columns, and the self-join. I have tested the performance of each step in the transformation; the step that causes the performance issue is the self-join.
How many rows are in your table? About 5 million.
What gateway are you using? None; I am connecting to a Lakehouse in the same workspace as my dataflow.
What other refreshes are scheduled at the same time? I run this dataflow on an ad-hoc basis when no other dataflow refreshes are happening.
Thanks,
Smrithi
Cool. Thanks for the update @_Smrithi.
It does seem strange to be getting this error on only 5M rows. Up to the point that you add the indices for the self-join, is the query maintaining folding, i.e. is 'View Data Source Query' lit up and selectable when you right-click the query step before adding the first index? The more of the query you can get the source to process (assuming a SQL source) before doing your transformation, the lower the memory load should be.
There are other ways to achieve the same result which should avoid the huge amount of table scanning that a merge does. Here's one method that you can try:
let
    // Sample table with ID, Valid From and Valid To columns (data compressed in-line)
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlTSUTIw1AciIwMjYyjHDMKJ1UHImyHLW6LKGxroAxFU3thQ39AIKh8LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, #"Valid From" = _t, #"Valid To" = _t]),
    // Break the table into a list of column-value lists
    Cols = Table.ToColumns(Source),
    // Rebuild the table with one extra column: the Valid To values shifted
    // down one row, with null in the first position. No merge/join required.
    addPrevValidTo = Table.FromColumns(Cols & {{null} & List.RemoveLastN(List.Last(Cols),1)}, Table.ColumnNames(Source) & {"PrevValidTo"})
in
    addPrevValidTo
The example code above takes a table of ID / Valid From / Valid To rows and returns the same table with an added PrevValidTo column holding the previous row's Valid To (null for the first row).
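For readers more comfortable outside Power Query, the same shift trick can be sketched in Python/pandas. The sample data is hypothetical, and the sketch assumes rows are already sorted by validity period:

```python
import pandas as pd

# Hypothetical sample mirroring the ID / Valid From / Valid To columns
df = pd.DataFrame({
    "ID": ["1", "2", "3", "4"],
    "Valid From": pd.to_datetime(["1/1/2025", "1/7/2025", "1/15/2025", "1/25/2025"]),
    "Valid To":   pd.to_datetime(["1/5/2025", "1/10/2025", "1/20/2025", "1/30/2025"]),
})

# Shift Valid To down one row: each row now carries the previous row's
# Valid To, with null in the first row. This is the same effect as the
# Table.FromColumns trick above, with no join at all.
df["PrevValidTo"] = df["Valid To"].shift(1)
```

A shift is a single positional pass over the column, which is why it avoids the table scanning that a merge incurs.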
Pete
Proud to be a Datanaut!
@BA_Pete Thanks for providing the solution. Could you let me know whether the query still works if the "ID" field in the table below has different values?
Key | Valid From | Valid To
---|---|---
1 | 1/1/2025 | 1/5/2025
2 | 1/7/2025 | 1/10/2025
3 | 1/15/2025 | 1/20/2025
4 | 1/25/2025 | 1/30/2025
Thank you,
Smrithi
Hi @_Smrithi ,
Thanks for reaching out to the Microsoft Fabric community forum.
If you're asking whether the query works with a different ID, then yes, I have tested it, and it functions correctly.
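As a quick illustration that the ID values themselves don't affect the result (the shift is purely positional), here is a sketch in Python/pandas using the sample dates from the table above with arbitrary IDs:

```python
import pandas as pd

# Same dates as the sample table above, with arbitrary non-sequential IDs
df = pd.DataFrame({
    "ID": ["A", "X", "Q", "Z"],
    "Valid From": pd.to_datetime(["1/1/2025", "1/7/2025", "1/15/2025", "1/25/2025"]),
    "Valid To":   pd.to_datetime(["1/5/2025", "1/10/2025", "1/20/2025", "1/30/2025"]),
})

# The shift only looks at row position, never at the ID column
df["PrevValidTo"] = df["Valid To"].shift(1)
df["Gap (days)"] = (df["Valid From"] - df["PrevValidTo"]).dt.days
# Gaps are 2, 5 and 5 days regardless of what the IDs are
```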
Hope the workaround provided by @BA_Pete was helpful.
If you find this post helpful, please mark it as 'Accept as Solution' and consider giving a kudos.
Thanks and Regards