Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
Ian_XYZ
Frequent Visitor

Identify offsetting entries based on multiple criteria

Hi there. I have a data set in which a subset are opposing entries but to different "customers" (example data below). I need to identify in my data set which rows have the following criteria:

  • Customer is "P001";
  • Data within the "Location", "Product" & "Description" columns match a different row;
  • The different row has a Customer code starting with "T"; and
  • The Values are the same, however in the "P001" Customer row it's negative, and in the "T" customer row it's positive (offsetting amounts)

Ideally I'd be able to include a label such as "Cleared" in a new column ("Comments" column) for these specific rows.

 

LocationProductCustomerShipping methodShipping TimeValueDescriptionComments
CaliforniaAppleT150Truck1 day100Box Order 
MexicoOrangeT200Truck2 day200Single Order 
ArgentinaPearT250Truck2 day300Box Order 
MexicoOrangeT300Car1 day250Single Order 
CaliforniaAppleP001Car3 day-100Box OrderCleared
CaliforniaPearT200Car3 day200Single Order 
ArgentinaPearP001Truck1 day-300Box OrderCleared
CaliforniaAppleT400Car2 day-100Box Order 
ArgentinaPearP001Truck1 day-100Box Order 

 

I've managed to complete the above using a set of Merges to find duplicates using an absolute value column but it's completely slowed down my Query (I have a few other required Merges). I'm wondering if there is a formula which would help me with the above, I can't seem to find or figure it out.

 

Thank you for the help in advance, I do really appreciate it.

10 REPLIES 10
dufoq3
Super User
Super User

Hi @Ian_XYZ, another solution:

Result

dufoq3_0-1762270823729.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lZHNCoMwEIRfRXJWyE99AOu5KLQ38RA0DaEhkaUF+/Y1xkpaQ6mngSUz386maVDJtbpaMIqjFBXDoMWkF5JjJ/DobpOSpOdPp9hNj3ZMKugFoDZt0EmMqrPTuAJu5GymODTTxeynZ2WkFoG/ACnMXRmHrwWHOSCPBbD/6P5ZOSe9F/eBG3a0e40xWQPYEpDFqn/Y190DPNtdfYF/Hz6LdY//3CHg01/b76Bv7O0L", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Location = _t, Product = _t, Customer = _t, #"Shipping method" = _t, #"Shipping Time" = _t, Value = _t, Description = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Value", type number}}),
    CheckCols = List.Buffer({"Location", "Product", "Description", "Value"}),
    FilteredP001 = Table.SelectRows(ChangedType, each ([Customer] = "P001")),
    Ad_AbsValue = Table.AddColumn(FilteredP001, "Abs Value", each Number.Abs([Value]), type number),
    MergedQueries1 = Table.NestedJoin(Ad_AbsValue, {"Location", "Product", "Description", "Abs Value"}, ChangedType, CheckCols, "Merged", JoinKind.LeftOuter),
    MatchedRows = Table.SelectColumns(Table.SelectRows(MergedQueries1, each not Table.IsEmpty([Merged])), CheckCols),
    MergedQueries2 = Table.NestedJoin(ChangedType, CheckCols, MatchedRows, CheckCols, "Merged", JoinKind.LeftOuter),
    Ad_Comments = Table.RemoveColumns(Table.AddColumn(MergedQueries2, "Comments", each if [Customer] = "P001" and not Table.IsEmpty([Merged]) then "Cleared" else null, type text), {"Merged"})
in
    Ad_Comments

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

wdx223_Daniel
Super User
Super User

wdx223_Daniel_0-1761809033625.png

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Custom1 = let
                 fx=(x)=>Text.Format("#[Location]|#[Product]#[Value]#[Description]",x&[Value=Number.Abs(x[Value])])
              in
                Table.Combine(
                              Table.Group(
                                          Source,
                                          {"Location","Product","Value","Description"},
                                          {"n",(tbl)=>Table.AddColumn(
                                                                      tbl,
                                                                      "Comments",
                                                                      (x)=>if Text.StartsWith(x[Customer],"P") and List.Select(tbl[Customer],each Text.StartsWith(_,"T"))<>{} then "Cleared" else null
                                                                     )
                                          },
                                          1,
                                          (x,y)=>Value.Compare(fx(x),fx(y))
                                         )[n]
                              )
in
    Custom1

 

v-dineshya
Community Support
Community Support

Hi @Ian_XYZ ,

Thank you for reaching out to the Microsoft Community Forum.

 

Hi @lbendlin , @Poojara_D12  and @vojtechsima ,Thank you for your prompt responses.

 

Hi @Ian_XYZ ,  Could you please try the proposed solutions shared by @lbendlin , @Poojara_D12  and @vojtechsima  ? Let us know if you’re still facing the same issue we’ll be happy to assist you further.

 

Regards,

Dinesh

Hi, yes thank you @vojtechsima @Poojara_D12 @lbendlin @v-dineshya  I'll be diving into the proposed solutions today and will provide an update, I truly appreciate everyones support!

Hi @Ian_XYZ ,

Please Provide ETA (Estimated Time for Arriaval) for the response to this thread.

 

Regards,

Dinesh

Thanks Dinesh. I've been pulled in a few different directions so won't be able to finalize for another week or so. Happy to mark it as done given I should be able to sort something and appreciate the support everyones provided.

Hi @Ian_XYZ ,

Thank you for the update. As mentioned in your previous response,  as mark the thread as done. Based on your request we are closing this thread, If you have any issues regarding Power BI and Fabric , create a new thread in Fabric community. We are happy to assit you. Thank you

 

Home - Microsoft Fabric Community

 

Regards,

Dinesh

 

vojtechsima
Super User
Super User

@Ian_XYZ ,

Hey, I tried to cook something, idk if it will be faster, but I would play with Buffer.

I did merge one time, but I didn't expand any of these and accessed only individual columns. Merge itself it's really that bad if you have good sources and you don'T over do it. The performance issues will happen when using List Contains on unbuffered stuff and other scanning functions. You can theoretically play with HashedJoin as well.

 

https://learn.microsoft.com/en-us/powerquery-m/joinalgorithm-type

 

I wrote an article about different types of merges and lookups, feel free to read it:

https://www.vojtechsima.com/post/why-is-power-query-list-contains-slow-faster-lookup-alternatives

 

Here's the solution you may try:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZHNasMwDIBfJfjcgpOsD9B6x40GWnYJORhbTc38h+NA9/aznTSEJoVuJ8my+Pgk1TX6MIx6YTTaoMoZ3jMfMtJ33ihwIT1dhbVCt5kCfzV8XjkLBeH9RWUf4zt0zAk7wohRCrTvULOpEaFSXIzTgoafvbUy9p/zHY7B9ew7xDzj9CdGHKsHc8uOjieFLDE+4SaYCc+jo7pNgALPAcUIGKqnYCjhgbF3bXASOmpUQF2C7NYg5esWQytJtPsQA3TVYXUXFcb5BClHyHa5CiKDNfAlaJpmJlP+ayGjyuNZtsuNPJWZLvw2symeT/VnlyUCNc0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"(blank)" = _t, #"(blank).1" = _t, #"(blank).2" = _t, #"(blank).3" = _t, #"(blank).4" = _t, #"(blank).5" = _t, #"(blank).6" = _t, #"(blank).7" = _t]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    changeTypes = Table.TransformColumnTypes(#"Promoted Headers",{{"Location", type text}, {"Product", type text}, {"Customer", type text}, {"Shipping method", type text}, {"Shipping Time", type text}, {"Value", Int64.Type}, {"Description", type text}, {"Comments", type text}}),
    
    // changeTypes is your last steps, change it to your last step

    // transformations
    addReversedValue = Table.AddColumn(changeTypes, "ValueReversed", each [Value]*-1, type number),
    addIndex = Table.AddIndexColumn(addReversedValue, "Index", 0, 1, Int64.Type),
    buffer = Table.Buffer(addIndex),
    columnNamesOrigin = {"Location", "Product", "Description", "ValueReversed"},
    
    findP001 = Table.SelectRows(buffer, each [Customer]="P001"),
    columnNamesP001 = {"Location", "Product", "Description", "Value"},

    findMatchisRows = Table.AddJoinColumn( buffer, columnNamesOrigin,  findP001, columnNamesP001, "isMatching"),
    getIndexOfCleared = List.Buffer( List.Combine( List.Transform( Table.SelectRows(findMatchisRows, each not Table.IsEmpty([isMatching]) and Text.StartsWith([Customer], "T"))[isMatching], each Table.Column(_, "Index")) ) ),
    
    addComments = Table.AddColumn( buffer, "Comments2", each if List.Contains(getIndexOfCleared, [Index]) then "Cleared" else null, type text )
in
    addComments

What I do, addReversedValue so you can later merge on it, I add an index column, Buffer it, define merging columns, (you don't need extra variables for it, but I thought it's cleaner), filter the P001 rows, then merge those rows with the criteria you need and then access those merged rows that have matching value and take the index from it and buffer it. Then add new column to the original buffered table with lookup that if it's present add 'cleared'.

Poojara_D12
Super User
Super User

Hi @Ian_XYZ 

In this scenario, you have entries where customer P001 has negative values that should offset matching positive values belonging to customers whose IDs start with "T". The match needs to be based on the columns Location, Product, and Description, with the Value being the same magnitude but opposite sign. While you already achieved this using multiple merge operations in Power Query, the performance has degraded due to the number of joins. A more efficient approach is to add a helper column that creates a matching key and then use a single merge or a conditional check instead of multiple merges. For example, you can create a new column combining Location & Product & Description & ABS(Value) as a "match key", filter P001 rows, and check if a corresponding "T*" customer row exists with the same key but opposite sign. Then, create a Comments column that returns "Cleared" if the rule matches, otherwise blank. Alternatively, this logic can be computed in DAX after loading the table, using a calculated column to check for an offsetting T-customer row by searching the table with the same key and reversed value using functions like CALCULATE, FILTER, and COUNTROWS. The idea is to reduce join operations and instead rely on faster column operations and conditional lookups, which keeps the query refresh responsive even when working with large datasets.

 

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a "Kudos"

Kind Regards,
Poojara - Proud to be a Super User
Data Analyst | MSBI Developer | Power BI Consultant
Consider Subscribing my YouTube for Beginners/Advance Concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS
lbendlin
Super User
Super User

- Read about Table.Buffer for better overall performance

- Read about Table.AddKey to speed up merges

- Generally, try to avoid merges as much as possible, let the data model do the work instead (in memory!)

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors