Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
boa
Helper I
Helper I

Removing duplicates but keeping the records with the most information in

Hello

 

I need some help with cleaning my data. I have a lot of duplicates in my table. I want to keep the record which contains the most data.

 

This is how my table looks like:

boa_0-1673450269864.png

Records with the same 'key intervention' are duplicates. In the perfect scenario the records have no 'null' in 'time call', 'time arrival' and 'time end'. I want to keep the records in blue and remove the other records. 

 

Help is really appreciated!

1 ACCEPTED SOLUTION
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

4 REPLIES 4
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Thank you for your help and excellent explanation! 🙏

sudhav
Helper V
Helper V

go to transform data and goto power query editor and there select time arrival column click on down arrow and deselect null values, then you will get.

Hello sudhav

 

If I deselect "null" values in the 'time arrival' column (=remove empty?) I remove also records that I want to keep... If the arrival time is empty and the end time is not empty/or empty, I want to keep it as well

 

boa_0-1673469662698.png

 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.