Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
Neseren
Frequent Visitor

Remove duplicates based on multiple criteria

Hi

I have a pretty complicated task that i'm not able to crack myself. So i hope you guys can help me out?

I have a dataset consisting of batch, results, lower tolerance and upper tolerance.

The batch can sometimes be duplicated. I need basically to remove the duplicates on the batches. The issue is to set up the rules to remove the right duplicate(s).

The rules are as follows:

- Only one unique batch

- If a unique batch is below or above the tolerance, the value must be kept

- If a duplicate consists of one value within the tolerances and one or more values outside the tolerances. Then the value inside the tolerances shall be kept.

- If a duplicate consists of values only within the tolerances. Then the lowest value within the tolerances shall be kept.

- If a duplicate consists of values only outside the tolerances. Then the value closest to the tolerances must be kept.

 

Neseren_0-1701172898915.png

 

Hope you guys can help me.

Thanks. 🙂

2 ACCEPTED SOLUTIONS

Thanks a lot!

You just solved my problem. Appreciate it 🙂

View solution in original post

16 REPLIES 16
Ahmedx
Super User
Super User

to know how to do this watch my video

https://1drv.ms/v/s!AiUZ0Ws7G26RjAp9lOi4-mJWZfuH?e=szbLe1

Thanks for this!

However there is still a problem. This was also one of my original sollutions.

 

In the batch ending with 7021 i have the values 8 and 3.

With your example the value to be kept is 3, but this is outside the tolerances. I need 8 to be the value to be kept in this case.

I added another two lines to specify what i need:

Neseren_0-1701177335464.png

In this case, i want the row with the value 4 because it is closest to the tolerances. With your example it will give me 2.

Does it make sense?

ok, pls try again

(x)=> if List.Max(x[Result]) > List.Max(x[Lower tolerance]) then List.Min(x[Result]) else List.Max(x[Result])

 

 https://1drv.ms/v/s!AiUZ0Ws7G26RjAt0CFj38gO46CJW?e=QvIaTP

Thanks once again!

You are so close to make it right.

Unfortunately the batches ending on 7021 and 1021 gives me the number 3 in both cases.

I need in only these two cases to get the number 8 and 6.

The reason is, that the result min is outside the tolerance and i therefore need the highest numbers, because it is within the tolerances.

 

Additional info:

Also if, lets say i have a result above the tolerance and another beyond the tolerance, i need the result closest to the tolerance gap to be kept.

 

For example:

Neseren_0-1701240957377.png

Here i need 4 and 11 to be kept.

 

Hope you can make it? 🙂

pls try

 

7011 and 3011 is now wrong.

I want the numbers 6 i both cases, because it is the value within the tolerance.

Neseren_0-1701244464399.png

 

try again

 

You're getting really close now.

Only error is 3331. It needs to be result: 4 instead of 13.

Reason is that 4 is closer to the tolerance than 13 is.

Neseren_0-1701246928054.png

 

try

 

and is 1 closer to 5 than 11?

Screenshot_2.png

In this case i would like to keep the result: 11.

The tolerance is 5-10.

The distance from 1 to 5 is 4.

The distance from 11 to 10 is 1.

Therefore i would like 11 to be the kept result.

and try this

 

Thanks a lot!

You just solved my problem. Appreciate it 🙂

At first sight it seems like 99 and 100 both works.

Can you explain the difference?
I will validate it later today to be sure 🙂

in this case what should happen and why

Screenshot_1.png

In this case i would like 8 to be the kept result.

Reason is that 8 is between the tolerance 5-10 and 3 is outside.

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.