Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi everyone,
I have some data that I will need help from you Power Query experts to transform. The data looks like the below table (Data Input). Column Ax is matched with Column Bx, Ay with By, Az with Bz.. I want to remove anything that are in the A Columns when it matches perfectly with their counterpart in Column B.
I am thinking the iterative function / code should have the following logic:
1. If value in Column Ax = Column Bx, then replace the value with blank (see Data2 row input + output)
2. If they partially match, only the matching one is replaced with blank (see Data4 row input + output)
3. If they do not match, leave them all behind (see Data3 row input + output)
Data Input:
| Column Ax | Column Ay | Column Az | Column Bx | Column By | Column Bz | |
| Data1 | Ax1 | Ay1 | Ax1 | |||
| Data2 | Ay2 | Az2 | Ay2 | Az2 | ||
| Data3 | Ax3 | Az3 | ||||
| Data4 | Ax4 | Ay4 By4 | Az4 | Ay4 | Bz4 | |
| Data5 | Ax5 | Ay5 By5 | Ay5 By5 | Bz6 |
Data Output:
| Column Ax | Column Ay | Column Az | Column Bx | Column By | Column Bz | |
| Data1 | Ay1 | Ax1 | ||||
| Data2 | Ay2 | Az2 | ||||
| Data3 | Ax3 | Az3 | ||||
| Data4 | Ax4 | By4 | Az4 | Ay4 | Bz4 | |
| Data5 | Ax5 | Ay5 By5 | Bz6 |
Currently, I am replacing value for each column in separate steps, so I have 3 different steps for each columns:
= Table.ReplaceValue(Table,each [Column Bx],null,Replacer.ReplaceValue,{"Column Ax"})
= Table.ReplaceValue(PrevStep,each [Column By],null,Replacer.ReplaceValue,{"Column Ay"})
= Table.ReplaceValue(PrevStep2,each [Column Bz],null,Replacer.ReplaceValue,{"Column Az"})
Would it be possible for me to have it only in one step and reiterate through the columns? I am able to get the column headers name for both the A and B side of things, but not sure how I can reiterate dynamically for both sides at once (Ax & Bx, then Ay & By instead of Ax & By).
Hope I was clear at the explanation! Will edit the post again if I need to clarify anything.
Here is a bit of a mismash of AlienSx's solution and ronrsnfld's solution. Table.ToList converts each row to a list and applies the function supplied in second arg, similar to AlienSx. In the function we are splitting text, removing from A matching in B, then re-combining, similar to ronrsnfld. And buffering and iterating through each row's list is relatively faster, although it probably won't be noticable unless you are dealing with non-small data (100k+ rows in my testing).
let
fix_func = (row as list, optional row_headers as number) as list => [
row_b = List.Buffer( row ),
row_header_count = row_headers ?? 1,
row_data_setSize = (List.Count(row_b)-row_header_count)/2, // assume two data column sets of equal length
row_header = List.FirstN( row_b, row_header_count ),
row_data_setB = List.LastN( row_b, row_data_setSize ),
row_data_setA_removeB = List.Generate(
()=>0, each _ < row_data_setSize, each _ + 1,
each [
item_a = row_b{row_header_count+_},
item_b = row_b{row_header_count+row_data_setSize+_},
removematch =
if item_a = null or item_b = null
then item_a
else Text.Combine(
List.RemoveMatchingItems(
Text.Split( item_a,"#(lf)" ),
Text.Split( item_b,"#(lf)" )
),
"#(lf)"
)
] [removematch]
),
row_output = row_header & row_data_setA_removeB & row_data_setB
] [row_output],
Source = Sample,
FixRows = Table.ToList(Source, fix_func ),
ToTable = Table.FromRows( FixRows, Value.Type(Source) )
in
ToTable
To test, I used your same sample data, but I also added in the following row to validate case where lines of A are a proper subset of lines of B.
Added row in sample
| (blank) | Column Ax | Column Ay | Column Az | Column Bx | Column By | Column Bz |
| Data6 | Ax6 | Ax6 Bx6 |
Target output (I'm assuming)
| (blank) | Column Ax | Column Ay | Column Az | Column Bx | Column By | Column Bz |
| Data6 | Ax6 Bx6 |
Here is the output with the sample and code
edit: updated row_data_setA_removeB step in fix_func to handle nulls
Here's another method that seems to work a bit faster.
It uses List.Accumulate to iterate the Table.ReplaceFunction to check each pair of columns.
Note that I use List.RemoveMatchingItems function. As written, it will only remove if there is a "perfect" match, as you wrote. If you want to make the match case-insensitive, or include other imperfections, you can include the optional Equation Criteria.
However, if your actual layout is different, and/or if you have other columns in your actual data that are not relevant to your problem. That algorithm will need to be changed.
Please read the code comments, explore the applied steps, and ask if you don't understand something.
let
//Change next line to reflect your actual data source
Source =Table,
/*Written to accomodate any number of columns, but assumes they are all type text
If you have other columns in your real data that are of a different type, this
will need modification*/
#"Changed Type" = Table.TransformColumnTypes(Source,
List.Transform(Table.ColumnNames(Source), each {_, type nullable text})),
/*If your real data has more columns that are not to be checked
this algorithm to get the column pairs will need to be changed*/
#"Columns" = [a=List.Skip(Table.ColumnNames(Source)),
b=List.Split(a,List.Count(a)/2),
c=List.Zip(b)][c],
#"Replace Matches" = List.Accumulate(
#"Columns",
#"Changed Type",
(s,c)=> Table.ReplaceValue(
s,
each Record.Field(_,c{0}),
each Record.Field(_,c{1}),
(x,y,z) as nullable text=> [a=Text.Split(y,"#(lf)"),
b=Text.Split(z,"#(lf)"),
c=List.RemoveMatchingItems(a,b),
d=Text.Combine(c,"#(lf)")][d],
{c{0}}
)
)
in
#"Replace Matches"
Original
Results
Hi @ronrsnfld ,
First of all - Thank you so much for your input! Apologies for the late response, it took me a little bit to understand how to implement your code in my file, as I am not that code savvy.
The example data/table that I've put in the original post is a simplified version of the actual data. I have about 155 columns in my data that are not part of the "coupled columns" and 23 pair of the "coupled columns". All of the 23 pair "coupled columns" have text as their data type. The rest of the 155 columns are either Int64.Type, text, date, or number. The primary key of the table is an Int64.Type.
Additionally, the "coupled columns" are actually in two separate tables, but we can easily merge them as they use the same primary key. I merged them through left join, so there will be cases like Data3, where Table A (with columns Ax, Ay, and Az) do not have an accompanying pair in Table B (columns Bx, By, and Bz).
I've modified your code to take the above facts into account:
let
Source = Table.NestedJoin(TableA, {"PrimaryKey"}, TableB, {"PrimaryKey"}, "TableB", JoinKind.LeftOuter),
Expand = Table.ExpandTableColumn(Source, "BColumns", TableBHeaders),
FinalSource = Table.SelectColumns(Expand, //List of Relevant Columns),
Columns = List.Zip({TableAHeaders, TableBHeaders}),
#"Replace Matches" = List.Accumulate(
Columns,
FinalSource,
(s,c)=> Table.ReplaceValue(
s,
each Record.Field(_,c{0}),
each Record.Field(_,c{1}),
(x,y,z) as nullable text=> [a=Text.Split(y,"#(lf)"),
b=Text.Split(z,"#(lf)"),
c=List.RemoveMatchingItems(a,b),
d=Text.Combine(c,"#(lf)")][d],
{c{0}}
)
)
in
#"Replace Matches"
Your approach is what I had in mind when I first encountered this problem. I had figured out how to list.zip the headers but was not competent enough to understand how to use List.Accumulate and to build my own replacement function. I tried removing your change type step and the code seems to be working just fine without that step, but let me know of any edge cases I need to be aware of - so that I understand why we need to keep that step.
The only part that I can't figure out when modifying your code is which line I need to modify when I want the blanks to come up as "null" instead of blanks. As I wrote in the other comment, I can certainly do another step where I null all the blanks, but wondering if I can just write it in the same step instead.
Appreciate any help moving forward!
I ran some tests performing the replacement of blanks with nulls within the Table.ReplaceValue function.
You can do it within that function by replacing the "d=" line as below:
d=if List.IsEmpty(c) then null else Text.Combine(c,"#(lf)")
However, this method takes more time than merely replacing the blanks with nulls when you have finished processing the table.
A difference, of course, is that replacing it within the Table.ReplaceValue function will replace only those values that were removed because they match; whereas doing it afterwards will replace ALL of the blanks with nulls, whether they were in the original data or not.
In general, Power Query works better if it knows the data types it is working on. But I don't believe that removing the #"Changed Type" step in my code will cause any problems. If you have many, many rows in your data, you could check to see if explicitly setting just the coupled columns to type text before the #"Replace Matches" steps makes a difference in execution time. I suspect it will not.
And yes, with a more complex replacement line in the #"Replace Matches" step you could wind up with nulls instead of blanks in the replaced items. But I don't have time this morning to look at it further. And I don't know which method would be more efficient (replace in that step or replace all the blanks with nulls as a final step). I'll take a look at it later.
Hi @isg
Following up to confirm if the earlier responses addressed your query. If not, please share your questions and we’ll assist further...
Hi @v-aatheeque ,
Yes! Sorry for the belated response. I was able to successfully implement all three solutions in my file. The only last step for me to do is to run all three of them and compare the runtime duration. Once that is done and I have the fastest code, will put it as the accepted solution. Planning to do this tomorrow afternoon.
Thanks for checking in again. Apologies, I know this topic has been open for a while.
EDIT 12/22: @v-aatheeque Unfortunately, I did not get the chance to compare the run times for the three queries this past Friday. I will have to provide an update next week as I won't be able to run the queries during the holiday break. Sorry again for the much delayed results.
Hi @isg
Have you had a chance to look through the responses shared earlier? If anything is still unclear, we’ll be happy to provide additional support.
Hi @isg
Following up to confirm if the earlier responses addressed your query. If not, please share your questions and we’ll assist further.
Hi! Sorry for the late response. I will still need to look into this solution. Was away for Thanksgiving this past week and didn't get the chance to test this. Will report back when I've successfully implemented it.
Hi @isg
Just checking in to see if you had a chance to try out the suggested solution after the holidays.
Whenever you’re ready, please share an update, we’ll be happy to continue supporting you.
Hi @v-aatheeque !
I checked one of the solutions and will check this one maybe later today / tomorrow. I'm a bit crunched on time, as other responsibilities demand my attention, and understanding the code solution + implementing it takes a little bit of time for me because I'm not as code savvy. I definitely will want to implement the best + fastest running code though so I will definitely check them out one by one.
Thanks again and appreciate your patience.
let
replacer = (text01, text02) => Text.Replace(text01 ?? "", text02 ?? "", ""),
fxr = (lst) => (
(data) => {lst{0}} & List.Transform(List.Zip(data), (x) => replacer(x{0}, x{1})) & data{1}
)(List.Split(List.Skip(lst), (List.Count(lst) - 1) / 2)),
Source = Excel.CurrentWorkbook(){[Name="Table4"]}[Content],
ttl = Table.ToList(Source, fxr),
result = #table(Value.Type(Source), ttl)
in
result
Hi @AlienSx ,
First of all - thank you so much for your input. Second of all - I apologize in advance because I'm not as M savvy as most people here are so I'd probably will need some hand holding to understand the code.
While I attempted to implement your code, I realized that I did not mention in my original post that:
I implemented your code and modified the "Source" line to the table, and it seems to work! However, several things that I was wondering:
Both of these I can definitely just implement as separate steps in the code, but wondering if we could modify the functions instead.
Thanks again for your help and sorry for the late response!
hello, @isg
1. Other columns: you don't need to remove other columns. E.g. use Table.CombineColumnsToRecord to combine all other columns (including primary key) into single column, make sure that this new column goes first, apply table transformation and then expand that column using Table.ExpandRecordColumn.
2. Modify replacer function so that it checks if text01 is null or text01 = text02 (result must be null), text02 is null (result must be text01) otherwise result must be Text.Replace(text01, text02, "")
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!