Anonymous

"Remove Duplicates" doesn't remove all duplicates

Dear all,

 

I have a table with just one column, and I tried to remove the duplicates in that column via Power Query. However, once I loaded it into the dashboard, Count and Count (Distinct) give me different numbers, although I expected them to be the same.

 

 

Best regards,

Eric

15 REPLIES

Hey @Anonymous

 

I already wrote a post in February about how differently Power Query and Power Pivot understand duplicates here, but @ImkeF's idea of using Comparer.OrdinalIgnoreCase is great and simple.

 

 

Thanks a lot 🙂

ImkeF
Super User

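This will happen when the values differ only in letter case (e.g. "EMPLOYER" and "employer"): Remove Duplicates in Power Query is case sensitive and keeps both, while the data model treats them as the same value. Please check whether this article helps: http://www.thebiccountant.com/2015/08/17/create-a-dimension-table-with-power-query-avoid-the-bug/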

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Well, if your table just consists of one column, you can actually use this formula:

 

Table.ExpandListColumn(#table({"ColumnName"}, {{List.Distinct(Source[ColumnName], Comparer.OrdinalIgnoreCase)}}), "ColumnName")

 

It's a bit of a bugger, because the only way I found to use Comparer.OrdinalIgnoreCase (which compares text while ignoring case) was to use it on a list. So if anyone has an idea how to make this a bit smarter, you're more than welcome 🙂
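For anyone who wants to try the list-based approach end to end, here is a minimal sketch as a complete query. The sample values and the step names are made up for illustration, and it uses Table.FromColumns instead of the #table/Table.ExpandListColumn combination above to turn the list back into a table; "ColumnName" stands in for your real column.

```powerquery
let
    // Hypothetical sample data: three spellings of the same value plus one other value.
    Source = #table({"ColumnName"}, {{"Employer"}, {"EMPLOYER"}, {"employer"}, {"Worker"}}),
    // De-duplicate the column as a list, comparing values without regard to case.
    DistinctValues = List.Distinct(Source[ColumnName], Comparer.OrdinalIgnoreCase),
    // Turn the de-duplicated list back into a one-column table.
    Result = Table.FromColumns({DistinctValues}, {"ColumnName"})
in
    Result
```

Paste it into a blank query in the Advanced Editor and swap the Source step for your own table to test it.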

 

http://www.thebiccountant.com/2016/10/27/tame-case-sensitivity-power-query-powerbi/

Imke Feldmann (The BIccountant)


So if you want a distinct of all columns in the table, it's pretty easy:

 

Table.Distinct(Table, Comparer.OrdinalIgnoreCase)

 

 

Still need to figure out how to handle column-selection in it.

Imke Feldmann (The BIccountant)


KHorseman
Community Champion

@ImkeF try:

 

Table.Distinct(Table, {"ColumnName", Comparer.OrdinalIgnoreCase})




Did I answer your question? Mark my post as a solution!

Proud to be a Super User!




@KHorseman

Totally awesome!! Thank you so much!

 

And this is how it looks with multiple columns:

 

= Table.Distinct(Table.FromRecords({[A="one", B=1, C=2], [A="ONe", B=1, C=3]}), {{"A", Comparer.OrdinalIgnoreCase}, {"B", Comparer.OrdinalIgnoreCase}} )

Imke Feldmann (The BIccountant)


KHorseman
Community Champion

@ImkeF that's cool, so you could potentially mix-and-match case sensitivity? Like Column1 ignores case, Column2 doesn't? I didn't test far enough to try anything like that. I just noticed that the second argument in Table.Distinct is a list by default if you let the query editor generate the code for you, so I tried adding Comparer.OrdinalIgnoreCase to the list.









@KHorseman Haven't even thought of that, but computer says "yes"  🙂

 

Table.Distinct(Table.FromRecords({[A="one", B=1, C=2], [A="ONe", B=1, C=3]}), {{"A", Comparer.Ordinal}, {"B", Comparer.OrdinalIgnoreCase}} )
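To see the effect of mixing comparers, a small sketch like the following puts the two variants from this thread side by side and returns the row count of each. The step names and the result record are only illustrative. With this sample data I would expect the case-insensitive variant to collapse the two rows into one and the case-sensitive variant to keep both, but please check it on your own machine.

```powerquery
let
    // Same sample table as in the posts above.
    Source = Table.FromRecords({[A="one", B=1, C=2], [A="ONe", B=1, C=3]}),
    // Column A compared without regard to case.
    IgnoreCaseOnA = Table.Distinct(Source, {{"A", Comparer.OrdinalIgnoreCase}, {"B", Comparer.OrdinalIgnoreCase}}),
    // Column A compared case-sensitively, column B unchanged.
    CaseSensitiveOnA = Table.Distinct(Source, {{"A", Comparer.Ordinal}, {"B", Comparer.OrdinalIgnoreCase}}),
    // Return both row counts in one record for easy comparison.
    Result = [
        RowsIgnoringCaseOnA = Table.RowCount(IgnoreCaseOnA),
        RowsCaseSensitiveOnA = Table.RowCount(CaseSensitiveOnA)
    ]
in
    Result
```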

Imke Feldmann (The BIccountant)


KHorseman
Community Champion

@ImkeF nice. Thanks for sharing this. I never would have even noticed this comparer function otherwise.









Greg_Deckler
Super User

How many rows do you have? I have seen one other user reporting this and that user had millions of rows. I would report this as an Issue. https://ideas.powerbi.com/forums/360879-issues

 

Any chance you can post a link to the data so that this issue can be recreated?



Anonymous

Hi smoupre,

 

Yes, I have millions of rows in the database. My apologies, I cannot post the data.

 

I have posted this issue at the link you mentioned. Hopefully they come out with something more convenient.

 

@KHorseman and @ImkeF my data is not case sensitive, yet this still happens. I tried your code just in case, but the results are the same.

 

 

@Anonymous your data source may not be case sensitive, but if the columns in question contain letters then Power BI will be case sensitive about them. But I do also like that non-printable character idea @ImkeF.









@Anonymous another thing you can try is to trim & clean the column before the remove-duplicates step. Maybe there are some issues with non-printable characters or something similar:

 

[Screenshot: PBI_TrimClean.png]
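In M, the trim & clean idea could be sketched roughly like this. The sample values, the step names, and the column name "ColumnName" are assumptions for illustration; the Trim and Clean buttons in the Query Editor generate similar Text.Trim and Text.Clean steps.

```powerquery
let
    // Hypothetical sample data: the same value, once with a trailing space
    // and once with a trailing line feed.
    Source = #table({"ColumnName"}, {{"abc"}, {"abc "}, {"abc#(lf)"}}),
    // Trim leading/trailing whitespace, then strip non-printable control characters.
    Cleaned = Table.TransformColumns(Source, {{"ColumnName", each Text.Clean(Text.Trim(_)), type text}}),
    // Remove duplicates only after the values have been normalized.
    Deduplicated = Table.Distinct(Cleaned, {"ColumnName"})
in
    Deduplicated
```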

Imke Feldmann (The BIccountant)


Anonymous

@ImkeF and @KHorseman my apologies for the late reply.

 

I tried @ImkeF's method and it still doesn't work. However, I tried the "Group By" function in the "Transform" tab and it works.

 

It's just an extra step.
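The exact Group By step was not posted, so the following is only a guess at what it might look like in M: group on the column, then keep just the grouping column. The sample data, the step names, and the "Count" aggregation are assumptions for illustration.

```powerquery
let
    // Hypothetical sample data with one repeated value.
    Source = #table({"ColumnName"}, {{"abc"}, {"abc"}, {"xyz"}}),
    // Group By on the column yields one row per distinct value, plus a row count.
    Grouped = Table.Group(Source, {"ColumnName"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    // Keep only the grouping column to get the de-duplicated list of values.
    Result = Table.SelectColumns(Grouped, {"ColumnName"})
in
    Result
```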

Anonymous

The problem is that Power BI handles data in two different ways in two different situations.

 

1. Remove duplicates in Query Editor - it IS case sensitive, e.g. "EMPLOYER" and "employer" are two different strings (they are not duplicates)

2. Building a relation - it IS NOT case sensitive, e.g. "EMPLOYER" and "employer" are the same string (they are duplicates), therefore I can't build a relation

 

Microsoft, please fix this "feature", it is really annoying. Work with data one way throughout the application.
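A quick way to check whether this mismatch affects a given column is to compare the row counts after a default Remove Duplicates and after a case-insensitive one, as in the sketch below. The sample values, step names, and "ColumnName" are placeholders; the Table.Distinct syntax is the one @KHorseman posted earlier in this thread. If the two counts differ, the column contains values that differ only by case.

```powerquery
let
    // Hypothetical sample data matching the example above.
    Source = #table({"ColumnName"}, {{"EMPLOYER"}, {"employer"}, {"worker"}}),
    // What the Remove Duplicates button does by default (case sensitive).
    DefaultDistinct = Table.Distinct(Source),
    // Case-insensitive variant, closer to how relationships treat the values.
    IgnoreCaseDistinct = Table.Distinct(Source, {"ColumnName", Comparer.OrdinalIgnoreCase}),
    // Return both row counts; a difference points to case-only duplicates.
    Result = [
        RowsAfterDefaultDistinct = Table.RowCount(DefaultDistinct),
        RowsAfterIgnoreCaseDistinct = Table.RowCount(IgnoreCaseDistinct)
    ]
in
    Result
```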
