Re: Identify offsetting entries based on multiple ...

Ian_XYZ · ‎10-28-2025

Hi there. I have a data set in which a subset are opposing entries but to different "customers" (example data below). I need to identify in my data set which rows have the following criteria:

Customer is "P001";
Data within the "Location", "Product" & "Description" columns match a different row;
The different row has a Customer code starting with "T"; and
The Values are the same, however in the "P001" Customer row it's negative, and in the "T" customer row it's positive (offsetting amounts)

Ideally I'd be able to include a label such as "Cleared" in a new column ("Comments" column) for these specific rows.

Location	Product	Customer	Shipping method	Shipping Time	Value	Description	Comments
California	Apple	T150	Truck	1 day	100	Box Order
Mexico	Orange	T200	Truck	2 day	200	Single Order
Argentina	Pear	T250	Truck	2 day	300	Box Order
Mexico	Orange	T300	Car	1 day	250	Single Order
California	Apple	P001	Car	3 day	-100	Box Order	Cleared
California	Pear	T200	Car	3 day	200	Single Order
Argentina	Pear	P001	Truck	1 day	-300	Box Order	Cleared
California	Apple	T400	Car	2 day	-100	Box Order
Argentina	Pear	P001	Truck	1 day	-100	Box Order

I've managed to complete the above using a set of Merges to find duplicates using an absolute value column but it's completely slowed down my Query (I have a few other required Merges). I'm wondering if there is a formula which would help me with the above, I can't seem to find or figure it out.

Thank you for the help in advance, I do really appreciate it.

dufoq3

Hi @Ian_XYZ, another solution:

Result

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lZHNCoMwEIRfRXJWyE99AOu5KLQ38RA0DaEhkaUF+/Y1xkpaQ6mngSUz386maVDJtbpaMIqjFBXDoMWkF5JjJ/DobpOSpOdPp9hNj3ZMKugFoDZt0EmMqrPTuAJu5GymODTTxeynZ2WkFoG/ACnMXRmHrwWHOSCPBbD/6P5ZOSe9F/eBG3a0e40xWQPYEpDFqn/Y190DPNtdfYF/Hz6LdY//3CHg01/b76Bv7O0L", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Location = _t, Product = _t, Customer = _t, #"Shipping method" = _t, #"Shipping Time" = _t, Value = _t, Description = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"Value", type number}}),
    CheckCols = List.Buffer({"Location", "Product", "Description", "Value"}),
    FilteredP001 = Table.SelectRows(ChangedType, each ([Customer] = "P001")),
    Ad_AbsValue = Table.AddColumn(FilteredP001, "Abs Value", each Number.Abs([Value]), type number),
    MergedQueries1 = Table.NestedJoin(Ad_AbsValue, {"Location", "Product", "Description", "Abs Value"}, ChangedType, CheckCols, "Merged", JoinKind.LeftOuter),
    MatchedRows = Table.SelectColumns(Table.SelectRows(MergedQueries1, each not Table.IsEmpty([Merged])), CheckCols),
    MergedQueries2 = Table.NestedJoin(ChangedType, CheckCols, MatchedRows, CheckCols, "Merged", JoinKind.LeftOuter),
    Ad_Comments = Table.RemoveColumns(Table.AddColumn(MergedQueries2, "Comments", each if [Customer] = "P001" and not Table.IsEmpty([Merged]) then "Cleared" else null, type text), {"Merged"})
in
    Ad_Comments

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

wdx223_Daniel · ‎10-30-2025

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Custom1 = let
                 fx=(x)=>Text.Format("#[Location]|#[Product]#[Value]#[Description]",x&[Value=Number.Abs(x[Value])])
              in
                Table.Combine(
                              Table.Group(
                                          Source,
                                          {"Location","Product","Value","Description"},
                                          {"n",(tbl)=>Table.AddColumn(
                                                                      tbl,
                                                                      "Comments",
                                                                      (x)=>if Text.StartsWith(x[Customer],"P") and List.Select(tbl[Customer],each Text.StartsWith(_,"T"))<>{} then "Cleared" else null
                                                                     )
                                          },
                                          1,
                                          (x,y)=>Value.Compare(fx(x),fx(y))
                                         )[n]
                              )
in
    Custom1

v-dineshya · ‎10-29-2025

Hi @Ian_XYZ ,

Thank you for reaching out to the Microsoft Community Forum.

Hi @lbendlin , @Poojara_D12 and @vojtechsima ,Thank you for your prompt responses.

Hi @Ian_XYZ , Could you please try the proposed solutions shared by @lbendlin , @Poojara_D12 and @vojtechsima ? Let us know if you’re still facing the same issue we’ll be happy to assist you further.

Regards,

Dinesh

Ian_XYZ · ‎10-29-2025

Hi, yes thank you @vojtechsima @Poojara_D12 @lbendlin @v-dineshya I'll be diving into the proposed solutions today and will provide an update, I truly appreciate everyones support!

v-dineshya

Hi @Ian_XYZ ,

Please Provide ETA (Estimated Time for Arriaval) for the response to this thread.

Regards,

Dinesh

Ian_XYZ

Thanks Dinesh. I've been pulled in a few different directions so won't be able to finalize for another week or so. Happy to mark it as done given I should be able to sort something and appreciate the support everyones provided.

v-dineshya

Hi @Ian_XYZ ,

Thank you for the update. As mentioned in your previous response, as mark the thread as done. Based on your request we are closing this thread, If you have any issues regarding Power BI and Fabric , create a new thread in Fabric community. We are happy to assit you. Thank you

Home - Microsoft Fabric Community

Regards,

Dinesh

vojtechsima · ‎10-29-2025

@Ian_XYZ ,

Hey, I tried to cook something, idk if it will be faster, but I would play with Buffer.

I did merge one time, but I didn't expand any of these and accessed only individual columns. Merge itself it's really that bad if you have good sources and you don'T over do it. The performance issues will happen when using List Contains on unbuffered stuff and other scanning functions. You can theoretically play with HashedJoin as well.

https://learn.microsoft.com/en-us/powerquery-m/joinalgorithm-type

I wrote an article about different types of merges and lookups, feel free to read it:

https://www.vojtechsima.com/post/why-is-power-query-list-contains-slow-faster-lookup-alternatives

Here's the solution you may try:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZHNasMwDIBfJfjcgpOsD9B6x40GWnYJORhbTc38h+NA9/aznTSEJoVuJ8my+Pgk1TX6MIx6YTTaoMoZ3jMfMtJ33ihwIT1dhbVCt5kCfzV8XjkLBeH9RWUf4zt0zAk7wohRCrTvULOpEaFSXIzTgoafvbUy9p/zHY7B9ew7xDzj9CdGHKsHc8uOjieFLDE+4SaYCc+jo7pNgALPAcUIGKqnYCjhgbF3bXASOmpUQF2C7NYg5esWQytJtPsQA3TVYXUXFcb5BClHyHa5CiKDNfAlaJpmJlP+ayGjyuNZtsuNPJWZLvw2symeT/VnlyUCNc0v", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"(blank)" = _t, #"(blank).1" = _t, #"(blank).2" = _t, #"(blank).3" = _t, #"(blank).4" = _t, #"(blank).5" = _t, #"(blank).6" = _t, #"(blank).7" = _t]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    changeTypes = Table.TransformColumnTypes(#"Promoted Headers",{{"Location", type text}, {"Product", type text}, {"Customer", type text}, {"Shipping method", type text}, {"Shipping Time", type text}, {"Value", Int64.Type}, {"Description", type text}, {"Comments", type text}}),
    
    // changeTypes is your last steps, change it to your last step

    // transformations
    addReversedValue = Table.AddColumn(changeTypes, "ValueReversed", each [Value]*-1, type number),
    addIndex = Table.AddIndexColumn(addReversedValue, "Index", 0, 1, Int64.Type),
    buffer = Table.Buffer(addIndex),
    columnNamesOrigin = {"Location", "Product", "Description", "ValueReversed"},
    
    findP001 = Table.SelectRows(buffer, each [Customer]="P001"),
    columnNamesP001 = {"Location", "Product", "Description", "Value"},

    findMatchisRows = Table.AddJoinColumn( buffer, columnNamesOrigin,  findP001, columnNamesP001, "isMatching"),
    getIndexOfCleared = List.Buffer( List.Combine( List.Transform( Table.SelectRows(findMatchisRows, each not Table.IsEmpty([isMatching]) and Text.StartsWith([Customer], "T"))[isMatching], each Table.Column(_, "Index")) ) ),
    
    addComments = Table.AddColumn( buffer, "Comments2", each if List.Contains(getIndexOfCleared, [Index]) then "Cleared" else null, type text )
in
    addComments

What I do, addReversedValue so you can later merge on it, I add an index column, Buffer it, define merging columns, (you don't need extra variables for it, but I thought it's cleaner), filter the P001 rows, then merge those rows with the criteria you need and then access those merged rows that have matching value and take the index from it and buffer it. Then add new column to the original buffered table with lookup that if it's present add 'cleared'.

Poojara_D12 · ‎10-28-2025

Hi @Ian_XYZ

In this scenario, you have entries where customer P001 has negative values that should offset matching positive values belonging to customers whose IDs start with "T". The match needs to be based on the columns Location, Product, and Description, with the Value being the same magnitude but opposite sign. While you already achieved this using multiple merge operations in Power Query, the performance has degraded due to the number of joins. A more efficient approach is to add a helper column that creates a matching key and then use a single merge or a conditional check instead of multiple merges. For example, you can create a new column combining Location & Product & Description & ABS(Value) as a "match key", filter P001 rows, and check if a corresponding "T*" customer row exists with the same key but opposite sign. Then, create a Comments column that returns "Cleared" if the rule matches, otherwise blank. Alternatively, this logic can be computed in DAX after loading the table, using a calculated column to check for an offsetting T-customer row by searching the table with the same key and reversed value using functions like CALCULATE, FILTER, and COUNTROWS. The idea is to reduce join operations and instead rely on faster column operations and conditional lookups, which keeps the query refresh responsive even when working with large datasets.

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a "Kudos"

Kind Regards,
Poojara - Proud to be a Super User
Data Analyst | MSBI Developer | Power BI Consultant
Consider Subscribing my YouTube for Beginners/Advance Concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS

lbendlin · ‎10-28-2025

- Read about Table.Buffer for better overall performance

- Read about Table.AddKey to speed up merges

- Generally, try to avoid merges as much as possible, let the data model do the work instead (in memory!)

Identify offsetting entries based on multiple criteria

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026

Fabric Data Days starts November 4th!

Identify offsetting entries based on multiple criteria

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026