cancel
Showing results for 
Search instead for 
Did you mean: 

Fabric is Generally Available. Browse Fabric Presentations. Work towards your Fabric certification with the Cloud Skills Challenge.

Reply
boa
Helper I
Helper I

Removing duplicates but keeping the records with the most information in

Hello

 

I need some help with cleaning my data. I have a lot of duplicates in my table. I want to keep the record which contains the most data.

 

This is how my table looks like:

boa_0-1673450269864.png

Records with the same 'key intervention' are duplicates. In the perfect scenario the records have no 'null' in 'time call', 'time arrival' and 'time end'. I want to keep the records in blue and remove the other records. 

 

Help is really appreciated!

1 ACCEPTED SOLUTION
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

4 REPLIES 4
v-yangliu-msft
Community Support
Community Support

Hi  @boa ,

 

Here are the steps you can follow:

1. Go to Power Query – Transform – Group By – Advanced – Set as follows.

vyangliumsft_0-1673516826340.png

2. Select the new column – click the expand flag -- check [time call], [time arrival], [time end].

vyangliumsft_1-1673516826342.png

3. Add Column – Custom Column – Enter the following code.

if [Count] = [time call] then 1 else 0

vyangliumsft_2-1673516826360.png

4. Select [Custom] – select only 1.

vyangliumsft_3-1673516826361.png

5. Check [Count], [Custom] – Remove Columns.

vyangliumsft_4-1673516826363.png

6. Result:

vyangliumsft_5-1673516826364.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Thank you for your help and excellent explanation! 🙏

sudhav
Helper V
Helper V

go to transform data and goto power query editor and there select time arrival column click on down arrow and deselect null values, then you will get.

Hello sudhav

 

If I deselect "null" values in the 'time arrival' column (=remove empty?) I remove also records that I want to keep... If the arrival time is empty and the end time is not empty/or empty, I want to keep it as well

 

boa_0-1673469662698.png

 

Helpful resources

Announcements
PBI November 2023 Update Carousel

Power BI Monthly Update - November 2023

Check out the November 2023 Power BI update to learn about new features.

Community News

Fabric Community News unified experience

Read the latest Fabric Community announcements, including updates on Power BI, Synapse, Data Factory and Data Activator.

Power BI Fabric Summit Carousel

The largest Power BI and Fabric virtual conference

130+ sessions, 130+ speakers, Product managers, MVPs, and experts. All about Power BI and Fabric. Attend online or watch the recordings.

Top Solution Authors
Top Kudoed Authors