Somu_Mohan
New Member

Remove Duplicates not working in power query

Hello,

 

I am not able to remove duplicates from an Insured Name column. I have already tried Clean and Trim. Because I was not able to create relationships in DAX, I loaded the result to Excel to check, and as far as I can see this is not a case-sensitivity issue. Surprisingly, if I copy this result into a new table and load it as a new query, the duplicates are removed. A normal Excel COUNTIF gave me a count of 2 in Column1, meaning Excel also sees these values as duplicates. Why is Power Query having an issue here?

 

Somu_Mohan_0-1685433848203.png

 

 

2 REPLIES
Somu_Mohan
New Member

Hi Stephen,

 

Thanks for your support. The attached screenshot was the result of a Remove Duplicates action (where Insured Name was the only column present). Initially I had 600+ insured names. "Column1" is an Excel column where I applied COUNTIF, which returned 2. Because this Insured Name column is produced by transforming, cleaning, and then referencing a big dataset, the steps you suggested in part 2 were taking too long. Unfortunately, that was not feasible for me.

 

The solution that worked: I applied the steps below to my master data.

Somu_Mohan_0-1685956583306.png

What I am not able to understand is this: when the result with duplicates goes in as a new query, the duplicates are removed perfectly, so why not in the first instance? And why does uppercasing make a difference?
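One possible explanation for the uppercase effect, offered as an assumption rather than a confirmed diagnosis of this dataset: Power Query compares text case-sensitively by default, while Excel's COUNTIF ignores case, so two names that differ only in letter case can survive Remove Duplicates and still be counted as 2 in Excel. The Python sketch below only illustrates that logic; the names and the `distinct` helper are hypothetical, not taken from the original data.

```python
# Illustrative data only: hypothetical names that differ just by letter case.
names = ["Acme Ltd", "ACME LTD", "Beta Corp"]


def distinct(values, key=lambda v: v):
    """Keep the first occurrence of each value, comparing by key(value)."""
    seen = set()
    result = []
    for v in values:
        k = key(v)
        if k not in seen:
            seen.add(k)
            result.append(v)
    return result


# Exact (case-sensitive) comparison: both spellings are kept.
case_sensitive = distinct(names)

# Comparison after uppercasing: the two spellings collapse into one.
case_insensitive = distinct(names, key=str.upper)

print(case_sensitive)
print(case_insensitive)
```

In Power Query itself, uppercasing the column before Remove Duplicates has the same effect. Passing `Comparer.OrdinalIgnoreCase` in the criteria of `Table.Distinct` is another way to compare without regard to case while leaving the text unchanged.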

v-stephen-msft
Community Support

Hi @Somu_Mohan ,

 

When there are only two columns in your table, you can remove duplicates successfully.

vstephenmsft_0-1685583656087.png

2.png

vstephenmsft_1-1685583828009.png

With several columns selected, Remove Duplicates removes a row only when the entire row is duplicated, not when a single column contains a repeated value.

For example, suppose you want to remove the rows with duplicate values in the Insured Name column of the following data.

vstephenmsft_2-1685583851664.png
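The difference between whole-row and single-column de-duplication can be sketched as follows. This is a minimal Python illustration of the logic, not Power Query code; the `Policy` column, the sample rows, and the `distinct_rows` helper are assumptions made up for the example.

```python
# Hypothetical sample: the same insured name appears with two different policies.
rows = [
    {"Insured Name": "Acme Ltd", "Policy": "P-001"},
    {"Insured Name": "Acme Ltd", "Policy": "P-002"},
    {"Insured Name": "Beta Corp", "Policy": "P-003"},
]


def distinct_rows(table, columns):
    """Keep the first row for each distinct combination of the given columns."""
    seen = set()
    kept = []
    for row in table:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Comparing every column: no two rows are identical, so nothing is removed.
by_whole_row = distinct_rows(rows, ["Insured Name", "Policy"])

# Comparing only Insured Name: the second "Acme Ltd" row is dropped.
by_name_only = distinct_rows(rows, ["Insured Name"])

print(len(by_whole_row), len(by_name_only))
```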

Here's the solution.

1. Group by.

vstephenmsft_3-1685583970746.png

vstephenmsft_4-1685584070100.png

vstephenmsft_5-1685584096995.png

2. Add an index column.

vstephenmsft_6-1685584129965.png

 

vstephenmsft_7-1685584136206.png

3. Expand it and filter on index 1.

vstephenmsft_8-1685584145471.png

vstephenmsft_9-1685584178503.png

vstephenmsft_10-1685584185984.png
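The three steps above (group by, add an index within each group, expand and keep index 1) can be sketched in Python to show the logic. This is an illustration under assumptions, not the Power Query steps from the screenshots: the sample rows and the `Policy` column are hypothetical, and the index is assumed to start at 1.

```python
# Hypothetical sample: the same insured name appears with two different policies.
rows = [
    {"Insured Name": "Acme Ltd", "Policy": "P-001"},
    {"Insured Name": "Beta Corp", "Policy": "P-003"},
    {"Insured Name": "Acme Ltd", "Policy": "P-002"},
]

# Step 1: group the rows by Insured Name (dicts preserve first-seen order).
groups = {}
for row in rows:
    groups.setdefault(row["Insured Name"], []).append(row)

# Step 2: add an index column inside each group, starting at 1.
# Step 3a: expand the groups back into a flat table.
expanded = []
for name, members in groups.items():
    for index, row in enumerate(members, start=1):
        expanded.append({**row, "Index": index})

# Step 3b: keep only the first row of each group.
first_per_name = [r for r in expanded if r["Index"] == 1]

for r in first_per_name:
    print(r["Insured Name"], r["Policy"])
```

Grouping touches every row of the table, which is consistent with the report elsewhere in this thread that the approach was slow on a large, heavily transformed dataset.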

 

Best Regards,

Stephen Tao

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 
