Good morning,
I think I have spotted a bug in the way the Remove Duplicates function works in Power Query. If so, I would like to ask for it to be fixed.
Very quickly, see the example below. I would expect that removing duplicates from column A would give the first two lines, but it does not. Somehow the rows considered refer to the Source step and not the last one.
https://learn.microsoft.com/en-us/powerquery-m/table-distinct
For example, you can't assume that the first row with a unique set of column values will remain, and rows further down in the table will be removed. If you want the duplicate removal to behave predictably, first buffer the table using Table.Buffer.
You should use Group By on the first two instead and take the min/max date to force the behaviour. Table.Buffer will also work, but it will stop query folding.
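A minimal sketch of both options, assuming a hypothetical table with a key column A and a Date column (the column names and sample rows are made up for illustration; the original example is not reproduced here):

```powerquery
let
    // Hypothetical sample data: A is the column to de-duplicate on
    Source = Table.FromRecords({
        [A = "x", Date = #date(2024, 1, 1)],
        [A = "x", Date = #date(2024, 3, 1)],
        [A = "y", Date = #date(2024, 2, 1)]
    }),

    // Option 1: group on A and keep the latest Date per key.
    // Which row survives is decided explicitly by List.Max.
    Grouped = Table.Group(
        Source,
        {"A"},
        {{"Date", each List.Max([Date]), type date}}
    ),

    // Option 2: sort, buffer the sorted table in memory, then remove duplicates.
    // Table.Buffer fixes the row order but stops query folding.
    Sorted = Table.Sort(Source, {{"Date", Order.Descending}}),
    Buffered = Table.Buffer(Sorted),
    Distinct = Table.Distinct(Buffered, {"A"})
in
    Grouped
```

With this sample data, Grouped should contain one row per value of A with its latest Date. Change the final expression to Distinct to try the buffered variant instead.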
Thanks, Deku, for the solution, which I do appreciate.
In my view it is still somewhat misleading for a user, as I would expect the first row found to be retained.
Weird. Not ideal, but you could use Table.Group() with List.Max() as a quick fix until this is fixed.
Yes, this is exactly what I did 👍. I used a grouping that takes the max, but this is a workaround.
I just wanted to report the bug so that it can be fixed.