Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
Hi all,
I have the following problem. I have 2 tables, which have the following approximate format:
Table 1 (selection of columns)
Email Transaction ID Date Payment processor Status
john@gmail 0001 01/02/2018 PaymentYellow Paid
john@gmail cd_03A 01/03/2018 PaymentBlack Paid
john@gmail cd_03B 01/04/2018 PaymentBlack Paid
john@gmail cd_03B 01/04/2018 PaymentBlack Refund
Table 2 (all columns):
Email Transaction ID Date Payment processor Transaction number
john@gmail 0001 01/02/2018 PaymentYellow 0
john@gmail cd_03A 01/03/2018 PaymentBlack 1
john@gmail cd_03B 01/04/2018 PaymentBlack 2
The transaction number in table 2 is correctly calculated after the following indexing:
#"Grouped Rows" = Table.Group(#"Removed Duplicates", {"Email"}, {{"Count", each Table.AddIndexColumn(_,"Index2",0,1), type table}}),
#"Expanded Count" = Table.ExpandTableColumn(#"Grouped Rows", "Count", {"PaymentProcessor", "Transaction Id", "Date", "Index2"}, {"PaymentProcessor", "Transaction ID", "Date", "Transaction number"}),
The problem happens after the merge. After merging based on any combination of keys, trimmed, cleaned, transformed all to text, the transaction number, in around 25% of cases is not correctly transfered from Table 2 to Table 1 (left outer merge, with table 1 being on the left - I've tried multiple join options). No duplicate rows, most numbers get transported, but a significant part of them get incorrectly transported, with no obvious (to me) pattern, until now.
Please help, I've been losing a lot of sleep over this.
The expected result would be:
Table 1 (after merge and expand)
Email Transaction ID Date Payment processor Status Transaction number
john@gmail 0001 01/02/2018 PaymentYellow Paid 0
john@gmail cd_03A 01/03/2018 PaymentBlack Paid 1
john@gmail cd_03B 01/04/2018 PaymentBlack Paid 2
john@gmail cd_03B 01/04/2018 PaymentBlack Refund 2
You need to add Table.Buffer to the last step in the Query you are going to merge later into another Query (the query where the expend goes wrong). Same as how you need to Table.Buffer if you want to remove duplicates and always keep the 1st item after a sort.
TL;DR - Table.Buffer() fixed issue
Also had this issue. Completely incorrect join results after merge (Left outer), with seemingly no sense as to how it was processing join and delivering expanded columns (this was joining on multiple columns)
Attempted numerous fixes: data types, clean & trim, recreating entire structures from scratch - spent 24hrs on this blinkin issue
As above - added Table.Buffer() to the table which formed the right side of the join (this table was gettings its data from a source table which contained a Group By and index). It worked!
Thanks all for tip
I have found a solution which is much, much faster than using Table.Buffer. I got the idea from here:
https://docs.microsoft.com/en-us/power-query/commonissues#preserving-sort
This can be done after joins or inside groupings.
It doesn't have to be a sort, it can also be something else like adding an index column. There are several M table functions that tell the engine to handle a table in a sorted manner. Below examples of what I did. This performs much faster than using Table.Buffer.
Fix data order for a join. Fix inside the NestedJoin function:
Note: you have to experiment which side of the join you have to fix.
= Table.NestedJoin(Table.Sort(Source, {"Index", Order.Ascending}), {"Index"}, Table1, {"Index"}, "Table1Fixed", JoinKind.LeftOuter)
= Table.NestedJoin(Source, {"Index"}, Table.Sort(Table1, {"Index", Order.Ascending}), {"Index"}, "Table1Fixed", JoinKind.LeftOuter)
= Table.NestedJoin(Table.Sort(Source, {"Index", Order.Ascending}), {"Index"}, Table.Sort(Table1, {"Index", Order.Ascending}), {"Index"}, "Table1Fixed", JoinKind.LeftOuter)
Fix data order for a grouping. Fix inside the Group function:
Note: here I have to sort both the table the AddIndexColumn function and the FirstN function.
= Table.Group(Table2, {"w", "x", "y"}, {{"GroupTable", each Table.AddIndexColumn(Table.Sort(_, {"Index", Order.Ascending}), "FirstValueInColumnZ", Table.FirstN(Table.Sort(_, {"Index", Order.Ascending}), 1)[z]{0}, 1), type table}})
This has also been a huge paint point for me.
I think the Table.Buffer trick has worked but it really slows down performance for me.
I may end up doing this in an excel and just pasting the results back into Power BI and moving forward from there. Very disappointing.
I had a similar issue. What fixed it for me was adding in a dummy "replace values" step before I expanded the merged table out.
I've recently encounterd a similar problem. Through my own testing, the error (where values are switched) appears to be triggered when a merge is expanded AFTER a grouping event has been conducted in Power Query.
I went so far as to limit my data fields to just 1 row and found the error still occurring. While the error is observed in the merge/expand step - the symptom appears to be triggered when merges are placed after a grouping event.
(The grouping event is not pictured - and additional transformations were conducted between the grouping event and this pictured merge) The errors impacted a subset of rows (less than 10%). Below I've singled out one of the changed rows to try and problem solve what might be causing it.
I can reproduce this readily and it seems like a pretty big issue. I've been using Power BI since 2015.
I have this problem as well using Power BI Desktop May 2019 Report Server. Group By before merge => random expand results for whole number/integer data type.
any solution?
Hi, try adding an Index column after the "Group by". This seems to work for me.
I got a satisfactory result after including a Table.Buffer() before the SORT().
I have the same problem. All data are wrong.
@nazdravanul Did you find a solution to this? I think I am having the same problem? When I click on a table to preview before expanding the data is correct, but it changes after the expand?
Hi,
Please share the link from where i can download your PBI file.
Hi @nazdravanul,
Assuming that you have added a index column, then you only need to select Table 1 and click Merge Queries like below.
The detail steps you could have a reference of my attachment.
Best Regards,
Cherry
Thank you for the reply. I've been using PowerBI for years, I know how to use the UI to do an Expand. That is not my problem. After expand part of the data is incorrectly ported from Table 2 to Table 1 - see details in my post.
Hi @nazdravanul,
I'm not very clear abourt your issue based on your description and I cannot reproduce your issue in my side. Could you share a screenshot about your issue?
If it is convenient, could you share a dummy pbix file which can reproduce this issue, so that we can help further investigate on it? You can upload it to OneDrive or Dropbox and post the link here. Do mask sensitive data before uploading.)
Best Regards,
Cherry
User | Count |
---|---|
128 | |
108 | |
100 | |
65 | |
62 |
User | Count |
---|---|
137 | |
115 | |
102 | |
71 | |
61 |