Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
boa
Helper I
Helper I

Removing duplicates but keeping the records with the most information in

Hello

 

I need some help with cleaning my data. I have a lot of duplicates in my table. I want to keep the record which contains the most data.

 

This is how my table looks like:

boa_0-1673450269864.png

Records with the same 'key intervention' are duplicates. In the perfect scenario the records have no 'null' in 'time call', 'time arrival' and 'time end'. I want to keep the records in blue and remove the other records. 

 

Help is really appreciated!

1 ACCEPTED SOLUTION
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

4 REPLIES 4
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Thank you for your help and excellent explanation! 🙏

sudhav
Helper V
Helper V

go to transform data and goto power query editor and there select time arrival column click on down arrow and deselect null values, then you will get.

Hello sudhav

 

If I deselect "null" values in the 'time arrival' column (=remove empty?) I remove also records that I want to keep... If the arrival time is empty and the end time is not empty/or empty, I want to keep it as well

 

boa_0-1673469662698.png

 

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

AugPowerBI_Carousel

Power BI Monthly Update - August 2024

Check out the August 2024 Power BI update to learn about new features.

August Carousel

Fabric Community Update - August 2024

Find out what's new and trending in the Fabric Community.