The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hello,
I have a table as per below. I need to remove duplicates so that one number of PO stays with each couple VBELN/POS, it does not really matter in which order.
I imagine this need sto be recursive, I am still not very proficient with it.... I appreciate your help!
Kind regards
Valeria
VBELN | POS | PO |
80001 | 10 | 123456 |
80002 | 20 | 123456 |
80001 | 10 | 654321 |
80002 | 20 | 654321 |
Result: | ||
VBELN | POS | PO |
80001 | 10 | 123456 |
80002 | 20 | 654321 |
Solved! Go to Solution.
Hi, @ValeriaBreve this works on your test data but I am not sure about real life data...
let
Source = your_table,
groups = Table.Group(Source, {"PO"}, {{"VP", each List.Zip({[VBELN], [POS]})}}),
lst = List.Buffer(groups[VP]),
vp_txf =
List.Accumulate(
lst,
{},
(s, c) => [a = List.Difference(c, s), b = s & {List.First(a)}][b]
),
z = Table.FromColumns({groups[PO], vp_txf}, {"PO", "VP"}),
extract = Table.TransformColumns(z, {"VP", each Text.Combine(List.Transform(_, Text.From), "@@@"), type text}),
split = Table.SplitColumn(extract, "VP", Splitter.SplitTextByDelimiter("@@@", QuoteStyle.Csv), {"VBELN", "POS"})
in
split
@ray_aramburo hello, it does not really matter, as long as they unique.... thanks!
All of the lines these lines are unique considering that for each PO they have a different VBELN and POS in each row.
VBELN | POS | PO |
80001 | 10 | 123456 |
80002 | 20 | 123456 |
80001 | 10 | 654321 |
80002 | 20 | 654321 |
If you are expecting as a result
VBELN | POS | PO |
80001 | 10 | 123456 |
80002 | 20 | 654321 |
What is your expectation for
VBELN | POS | PO |
80002 | 20 | 123456 |
80001 | 10 | 654321 |
Proud to be a Super User!
Hello @ray_aramburo , I understand your confusion, I made a mistake in the mock data!!!! Sorry about it. This is how it should look:
VBELN | POS | PO |
80001 | 10 | 123456 |
80001 | 20 | 123456 |
80001 | 10 | 654321 |
80001 | 20 | 654321 |
Result: | ||
VBELN | POS | PO |
80001 | 10 | 123456 |
80001 | 20 | 654321 |
And then it does not matter which PO goes to which POS, as long as there is one PO only for each POS and that they are unique.
Thanks!
I think table is still the same but understanding what you are trying to achieve, go to Power Query, select columns VBELN, POS and PO (Ctrl + Click) go to Home -> Reduce Rows -> Remove Rows
Over there, just click on Remove Duplicates, that should make it work. (Note: if you have additional columns, I'd suggest you to reference the original query and just work with the 3 columns you mentioned)
Proud to be a Super User!
@ray_aramburo Hello, no it does not unfortunately, because then I can remove duplicates from both VBELN and POS, but then the same PO will appear. I need something recursive to say that when a certain PO is used for a given VBELN, it should not be used any longer for the same VBELN... and then for the given VBELN keep distinct POSNR. thanks!
How would you determine that logic?
Proud to be a Super User!
The logic is that there should be no more than 1 line for a given concatenation of VBELN and POS, and that for each of these there should be a different PO (does not matter which one from the table). Thanks!
Could you please share a sample of your whole data, removing any sensitive information?
Proud to be a Super User!
Hi Ray, well it is what I posted:
I have:
VBELN | POS | PO |
80001 | 10 | 123456 |
80001 | 20 | 123456 |
80001 | 10 | 654321 |
80001 | 20 | 654321 |
(Many different VBELN and POs, but the structure is this)
and the results I am expecting:
VBELN | POS | PO |
80001 | 10 | 123456 |
80001 | 20 | 654321 |
or:
VBELN | POS | PO |
80001 | 10 | 654321 |
80001 | 20 | 123456 |
(it does not matter which one)
Thanks!
Hi, @ValeriaBreve this works on your test data but I am not sure about real life data...
let
Source = your_table,
groups = Table.Group(Source, {"PO"}, {{"VP", each List.Zip({[VBELN], [POS]})}}),
lst = List.Buffer(groups[VP]),
vp_txf =
List.Accumulate(
lst,
{},
(s, c) => [a = List.Difference(c, s), b = s & {List.First(a)}][b]
),
z = Table.FromColumns({groups[PO], vp_txf}, {"PO", "VP"}),
extract = Table.TransformColumns(z, {"VP", each Text.Combine(List.Transform(_, Text.From), "@@@"), type text}),
split = Table.SplitColumn(extract, "VP", Splitter.SplitTextByDelimiter("@@@", QuoteStyle.Csv), {"VBELN", "POS"})
in
split
@AlienSx This works great, thank you! I still have difficulties with List.Accumulate. Would you mind explaining me the logic of the List.Accumulate step iny our code? Thanks again!
@ValeriaBreve List.Accumulate goes over the list of VBELN/POS pairs associated with each PO. One by one. Variable "c" is your current item in line while "s" is what you have accumulated so far. List.Difference (c, s) seeks all VBELN/POS pairs (from current "row") we have never seen before and adds first pair found to the accumulator vairable "s". Then goes next step with updated "s".
That's kind of a random and arbitrary logic, but you can try just selecting the PO column and removing duplicates. What Power Query usually does is just takes the first occurrence of the PO and removes the rest.
Proud to be a Super User!
How would you determine which VBELNPOS goes with which PO?
Proud to be a Super User!