Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
iceparrot
Advocate II
Advocate II

Still duplicates in Power BI Desktop after removing them via Power Query

Hey guys, 

I'm facing a really akward problem. I imported data into Power BI and identified that there are duplicates.

I tried removing them. It works in Power Query Editor and I don't see any duplicates there. However, in my report I still see some duplicates. I compared the rows which contain duplicates and there are no hidden signs or special characters except from '-', '_', or ' '. No matter there are these signs, removing duplicates also works for other rows, which contain the same signs.

 

Some Reference Data where Power BI has Problems with:

Basic Talk
2022_KW48_DE_1122_Play_Live Event_ABC TBD

... and where Power BI does not have any problems with:

1236_TEST_CHINA
2&1 Datadriven @ Campaign Session

 

I tried the following methods (as also suggested here) to get rid of the duplicates and identify the issue why I still have them in the report, but none of them worked to remove the duplicates and don't switch everything to upper- or lowercase:

  1. Applied Remove Duplicates in Power Query Editor
    This removes the duplicates in Power Query Editor but when I save and switch back to my report I still have some duplicates. However, most are gone.

  2. I applied Trim and Clean before the Remove Duplicates step in Power Query
    There is no change to 1. 

  3. Applied Uppercase before Remove Duplicates in Power Query
    This works at least. However, then I do not have the original way of spelling, which would be great.

  4. Limit the table in Power Query on the Rows which still show Duplicates in the Report
    This is not practical at all, as I need the whole set of data, just without duplicates. I tried it to see if I still would get duplicates in my report. But, in this case the report also does not show any duplicates.

 

If I do a DistinctCount on the column containing the duplicates every row returns 1. Using a regular Count still shows there are duplicates. Also, removing the duplicates in an EXCEL-file or through a SELECT DISTINCT in SQL works without any problems.

 

Did anyone already face the same issue and can help me on how to fix this?

 

2 REPLIES 2
Ashish_Mathur
Super User
Super User

Hi,

Share some data to work with and clearly show the problem there.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
v-xinruzhu-msft
Community Support
Community Support

Hi @iceparrot 

From your message,does it mean that you have duplicate fields in your table, but they are either uppercase and some are lowercase.

If it is the situation above, you can try to the following code in "Advanced Editor" in power query as an example:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckoszkxWCEnMyVaK1YlWMjIwMor3DjexiHdxjTc0BHICchIr430yy1IVXMtS80riHZ2cFUKcXEhQXZKUAladBLaqBGxVLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each [Column1]),
    #"Uppercased Text" = Table.TransformColumns(#"Added Custom",{{"Column1", Text.Upper, type text}}),
    #"Removed Duplicates1" = Table.Distinct(#"Uppercased Text", {"Column1"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Column1"})
in
    #"Removed Columns"

If the information above cannot meet your requirement, can you provide some sample data or pbix file?

 

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.