Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi everyone,
I've tried to solve this issue on my own and then looking for an already solved post related to this but I wasn't able to find it, sorry in advanced if it has been already solved.
The issue is that I have a table where the key code field is duplicated in some rows and I want to keep just one row per key code with all the values related to this specific row. For each duplicated key code, which row do I want to keep? I want to keep the row that was first created (info stored in the column data). Attending to the last point, just eliminating duplicates in the query editor is not an option.
To come up with a solution I'm exploring two ways:
I hope I explained correctly, if you need any extra explanation please ask.
Any tip will be more than welcome.
Thanks a lot for your time!
Solved! Go to Solution.
Sorry, yes - the sorting order must have been ascending.
The "Remove-Duplicates"-step will retain the first rows, so they must be on top.
It could be that you don't get the correct result after changing the sort-order due to the buffering in the cache. You can try emptying the cache, then it should work. But this highlights the danger on relying to a sort order without using an explicit buffer in your code. Power BI isn't good at keeping sort-orders, so you have to edit your code unfortunately:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgWTRgYQyhhKg0lDCMdcKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}}), #"Sorted Rows" = Table.Buffer(Table.Sort(#"Changed Type",{{"Column1", Order.Ascending}})) in #"Sorted Rows"
So you need to "wrap" the sorting-step into a Table.Buffer(....) before applying the Remove-Duplicates step.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi everyone,
Trying to be a bit more specific here is an example of the table origin and the result that I want to get:
In red colour those rows that have to be deleted because I want to keep the row whith an earliest date (the last duplicates share the same date, in these cases I want to keep the first register on the table).
Thanks a lot again!
You have to do this in the query-editor:
1) Sort your table on key-EAC and fecha (descending).
2) Check columns: key-EAC and fecha and remove duplicates (either right-click your mouse or: Home -> Reduce Rows -> Remove Rows -> Remove Duplicates.
This will keep the first occurences of the duplicated rows (where only the selected columns "count" for the dups) and remove the others.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Thanks a lot for your asnwer Imke,
I don't understand the behaviour of the remove duplicates execution, but I tried what you proposed and I didn't get the result I'm looking for.
In the next picture you can see that performing the steps you mentioned and finally ordering "fecha" descending, the result isn't the correct, changind the type of ordering in "fecha" to ascending doesn't make any change in the result after removing duplicates:
Please, let me know if you see anything incorrect or you have any other proposal.
Thanks a lot again for your time.
P.D.: Now I'm working on solution based on an R script but may (should!) be an easier way to solve it
Sorry, yes - the sorting order must have been ascending.
The "Remove-Duplicates"-step will retain the first rows, so they must be on top.
It could be that you don't get the correct result after changing the sort-order due to the buffering in the cache. You can try emptying the cache, then it should work. But this highlights the danger on relying to a sort order without using an explicit buffer in your code. Power BI isn't good at keeping sort-orders, so you have to edit your code unfortunately:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgWTRgYQyhhKg0lDCMdcKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", Int64.Type}}), #"Sorted Rows" = Table.Buffer(Table.Sort(#"Changed Type",{{"Column1", Order.Ascending}})) in #"Sorted Rows"
So you need to "wrap" the sorting-step into a Table.Buffer(....) before applying the Remove-Duplicates step.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
I didnt know about having to write in Table.Buffer the sorting step. I found out only just a while ago that, although I have already sorted the rows, the remove duplicate function did not retain the topmost row.
You should file a bug-report then.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hello Imke,
It works perfectly now! I really appreciate your help and the way you explained the last point about buffering.
Thanks for sharing your knowledge.
Kind regards, juanca.
User | Count |
---|---|
77 | |
76 | |
44 | |
30 | |
26 |
User | Count |
---|---|
97 | |
91 | |
52 | |
47 | |
46 |