Somu_Mohan
New Member

Remove Duplicates not working in Power Query

Hello,

I am not able to remove duplicates from an Insured Name column. I have already tried Clean and Trim. Because I could not create relationships in the model, I loaded the result to Excel to check, and as far as I can see the problem is not caused by case sensitivity. Surprisingly, if I copy this result into a new table and load it as a new query, Remove Duplicates works. A normal Excel COUNTIF returns 2 in Column1, meaning Excel also sees these values as duplicates. Why is Power Query having an issue here?

Somu_Mohan_0-1685433848203.png

2 REPLIES
Somu_Mohan
New Member

Hi Stephen,

Thanks for your support. The attached screenshot was originally produced by a Remove Duplicates action on a table where Insured Name was the only column. Initially I had 600+ insured names. "Column1" is an Excel column where I applied COUNTIF, which returned 2. Because this Insured Name column is produced by transforming, cleaning, and then referencing a big dataset, the steps you suggested in part 2 were taking too long, so unfortunately they were not feasible for me.

The solution that worked was to apply the steps below to my master data.

Somu_Mohan_0-1685956583306.png

What I am not able to understand is why the result that still contains duplicates is deduplicated perfectly when it goes in as a new query. Why not in the first instance? And why does uppercasing make a difference?
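On the uppercase question, one possible explanation (an assumption, not a diagnosis of this particular dataset) is that Power Query compares text case-sensitively by default, while Excel's COUNTIF ignores case. Values that differ only in case would then count as duplicates in Excel but survive Remove Duplicates in Power Query. The sketch below uses a hypothetical sample table to show how normalising the column before deduplicating collapses such values; the step names are illustrative and are not taken from the screenshot above.

```powerquery
let
    // Hypothetical sample: values that differ only in case or stray whitespace
    Source = #table(
        type table [#"Insured Name" = text],
        {{"Acme Ltd"}, {"ACME LTD"}, {"Acme Ltd "}}
    ),
    // Normalise: strip control characters, trim outer spaces, then uppercase
    Normalised = Table.TransformColumns(
        Source,
        {{"Insured Name", each Text.Upper(Text.Trim(Text.Clean(_))), type text}}
    ),
    // Table.Distinct compares text case-sensitively by default,
    // so these rows only collapse after normalisation
    Deduplicated = Table.Distinct(Normalised, {"Insured Name"})
in
    Deduplicated
```

Without the Normalised step, Table.Distinct would treat all three sample values as different and keep every row.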

v-stephen-msft
Community Support

Hi @Somu_Mohan,

When there are only two columns in your table, you can remove duplicates successfully.

vstephenmsft_0-1685583656087.png

2.png

vstephenmsft_1-1685583828009.png

When it is applied to the whole table, Remove Duplicates drops a row only when the entire row matches another row, not when just one column contains a repeated value.
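The difference between whole-row and single-column comparison can be sketched in M with Table.Distinct, the function that Remove Duplicates generates. The sample table and the Policy column are hypothetical and only there for illustration.

```powerquery
let
    // Hypothetical sample: the name repeats, but the second column differs
    Source = #table(
        type table [#"Insured Name" = text, Policy = text],
        {{"Acme Ltd", "P-001"}, {"Acme Ltd", "P-002"}, {"Beta Inc", "P-003"}}
    ),
    // No column list: a row is dropped only if the whole row matches another
    WholeRow = Table.Distinct(Source),
    // Column list: rows are compared on Insured Name alone
    ByName = Table.Distinct(Source, {"Insured Name"}),
    // Return both row counts side by side for comparison
    Result = [
        WholeRowCount = Table.RowCount(WholeRow),
        ByNameCount = Table.RowCount(ByName)
    ]
in
    Result
```

With this sample, the whole-row version keeps all three rows, while the version restricted to Insured Name keeps two.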

For example, suppose you want to remove the rows with duplicate values in the Insured Name column of the following data.

vstephenmsft_2-1685583851664.png

Here's the solution.

1. Group by.

vstephenmsft_3-1685583970746.png

vstephenmsft_4-1685584070100.png

vstephenmsft_5-1685584096995.png

2. Add an index column.

vstephenmsft_6-1685584129965.png

 

vstephenmsft_7-1685584136206.png

3. Expand the table and keep only the rows where the index is 1.

vstephenmsft_8-1685584145471.png

vstephenmsft_9-1685584178503.png

vstephenmsft_10-1685584185984.png
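For readers who cannot see the screenshots, the three steps above could look roughly like this in M. This is a minimal sketch on a hypothetical sample table; the Policy, Rows, and Index names are assumptions, and the steps shown in the screenshots may differ in detail.

```powerquery
let
    // Hypothetical sample; column names other than Insured Name are assumptions
    Source = #table(
        type table [#"Insured Name" = text, Policy = text],
        {{"Acme Ltd", "P-001"}, {"Acme Ltd", "P-002"}, {"Beta Inc", "P-003"}}
    ),
    // 1. Group by Insured Name, keeping each group's rows as a nested table
    Grouped = Table.Group(Source, {"Insured Name"}, {{"Rows", each _, type table}}),
    // 2. Add a 1-based index column inside every nested table
    Indexed = Table.TransformColumns(
        Grouped,
        {{"Rows", each Table.AddIndexColumn(_, "Index", 1, 1)}}
    ),
    // 3. Expand the nested tables, then keep only the rows where Index = 1
    Expanded = Table.ExpandTableColumn(Indexed, "Rows", {"Policy", "Index"}),
    FirstPerName = Table.SelectRows(Expanded, each [Index] = 1)
in
    FirstPerName
```

This keeps the first row for each Insured Name along with its other columns. On a large dataset the grouping step can be slow, which is worth weighing before adopting this approach.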

 

Best Regards,

Stephen Tao

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 
