This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. We're covering it all. You won't want to miss it.
Learn moreDid you hear? There's a new SQL AI Developer certification (DP-800). Start preparing now and be one of the first to get certified. Register now
I am building a dataset/model from an Excel workbook prepared by others and split into multiple worksheets. The worksheets are split into lists of providers and their counts of (1) established visits and (2) new visits and the same but where providers have fewer than 26 (3) established visits or (4) new visits. Also, I am getting the provider ids from yet another excel workbook.
Anywho, in combining the queries so that I would have all Established visits in one query and all New visits in another query each with the provider id (i.e. NPI), the fact that there are two providers with the same name has caused a problem where I have duplicate values in the NPI column but the corresponding rows are not duplicates. See image:
So, I tried setting up a custom column to create duplicates that I could then remove, like so:
each if ([NPI]="1760810293" and [taxonomy_key]="14" and [group_key]="2") then "1871035881"
else if ([NPI]="1871035881" and [taxonomy_key]="54" and [group_key]="95") then "1760810293"
else [NPI])As you can see in the image above, that didn't work.
Essentially, I need to get rid of rows 1 and 3. Note that the above image is filtered to show the two NPIs of interest. There are a total of 697 rows, should be 695. Help! What should I do?
@cathoms
I am not sure if the order and the pattern of the actual records are similar to what you have shown in the sample. I merged all the columns except NPI and applied "Remove Duplicates" which basically keeps the first occurrence and removes the rest. Check the following code after pasting it in a blank query on the Advanced Editor.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjCxNFbSUTI0ARJGQGxibAYkLZRidaKVTC0sDAnImYLkLE1BqgwtQErAklBDsUjGAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [A = _t, B = _t, C = _t, D = _t, E = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"A", Int64.Type}, {"B", Int64.Type}, {"C", Int64.Type}, {"D", Int64.Type}, {"E", Int64.Type}}),
#"Inserted Merged Column" = Table.AddColumn(#"Changed Type", "Merged", each Text.Combine({Text.From([B], "en-US"), Text.From([C], "en-US"), Text.From([D], "en-US"), Text.From([E], "en-US")}, "|"), type text),
#"Removed Duplicates" = Table.Distinct(#"Inserted Merged Column", {"Merged"})
in
#"Removed Duplicates"
⭕ Subscribe and learn Power BI from these videos
⚪ Website ⚪ LinkedIn ⚪ PBI User Group
So basically you are creating a new column based on all the other columns, then removing duplicates based on that new column. What you are showing gets rid of the wrong rows. However, if remove duplicates keeps the first occurence, would it work if I sort descending?
Also, could I just remove rows based on all the other columns without the need to create a new merged column?
Check out the April 2026 Power BI update to learn about new features.
Sign up to receive a private message when registration opens and key events begin.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.