Solved: Remove duplicates

ValeriaBreve · ‎06-29-2023

Hello,

I have a table as per below. I need to remove duplicates so that each VBELN/POS pair keeps exactly one PO number; it does not really matter which one.

I imagine this needs to be recursive, and I am still not very proficient with recursion. I appreciate your help!

Kind regards

Valeria

VBELN	POS	PO
80001	10	123456
80002	20	123456
80001	10	654321
80002	20	654321

Result:

VBELN	POS	PO
80001	10	123456
80002	20	654321

ValeriaBreve · ‎06-29-2023

@ray_aramburo hello, it does not really matter, as long as they are unique. Thanks!

ray_aramburo · ‎06-29-2023

All of these lines are unique, considering that for each PO there is a different VBELN and POS in each row.

VBELN	POS	PO
80001	10	123456
80002	20	123456
80001	10	654321
80002	20	654321

If you are expecting as a result

VBELN	POS	PO
80001	10	123456
80002	20	654321

What is your expectation for

VBELN	POS	PO
80002	20	123456
80001	10	654321

Did I answer your question? Give your kudos and mark my post as a solution!

Proud to be a Super User!

ValeriaBreve · ‎06-29-2023

Hello @ray_aramburo , I understand your confusion, I made a mistake in the mock data!!!! Sorry about it. This is how it should look:

VBELN	POS	PO
80001	10	123456
80001	20	123456
80001	10	654321
80001	20	654321

Result:

VBELN	POS	PO
80001	10	123456
80001	20	654321

And then it does not matter which PO goes to which POS, as long as there is only one PO for each POS and the POs are unique.

Thanks!

ray_aramburo · ‎06-29-2023

I think the table is still the same, but understanding what you are trying to achieve: go to Power Query, select the columns VBELN, POS and PO (Ctrl + Click), then go to Home -> Reduce Rows -> Remove Rows.

Over there, just click on Remove Duplicates; that should make it work. (Note: if you have additional columns, I'd suggest you reference the original query and work with just the 3 columns you mentioned.)
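
For reference, Remove Duplicates on those three columns should correspond to a Table.Distinct step along these lines. This is a minimal sketch on the corrected sample data; the inline #table and the step names are assumptions for illustration only:

```powerquery
let
    // Corrected sample data from the thread (typed inline for illustration)
    Source = #table(
        type table [VBELN = Int64.Type, POS = Int64.Type, PO = Int64.Type],
        {
            {80001, 10, 123456},
            {80001, 20, 123456},
            {80001, 10, 654321},
            {80001, 20, 654321}
        }
    ),
    // Compare rows on all three columns and drop exact repeats
    Deduped = Table.Distinct(Source, {"VBELN", "POS", "PO"})
in
    Deduped
```

On this data all four rows remain, because no row repeats in all three columns at once.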

ValeriaBreve · ‎06-29-2023

@ray_aramburo Hello, no, unfortunately it does not: I can remove duplicates on both VBELN and POS, but then the same PO will appear more than once. I need something recursive to say that once a certain PO is used for a given VBELN, it should not be used again for the same VBELN, and then keep distinct POS values for that VBELN. Thanks!

ray_aramburo · ‎06-29-2023

How would you determine that logic?

ValeriaBreve · ‎06-29-2023

The logic is that there should be no more than 1 line for a given concatenation of VBELN and POS, and that for each of these there should be a different PO (does not matter which one from the table). Thanks!
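
That rule can be written down as a check on a candidate result. A rough sketch, assuming "a different PO" means unique within a VBELN (as in the earlier post); the step names are made up for illustration:

```powerquery
let
    // One of the two results the thread accepts for the sample data
    Result = #table(
        {"VBELN", "POS", "PO"},
        {
            {80001, 10, 123456},
            {80001, 20, 654321}
        }
    ),
    Rows = Table.RowCount(Result),
    // No VBELN/POS pair appears on more than one line
    OneRowPerPair = Table.RowCount(Table.Distinct(Result, {"VBELN", "POS"})) = Rows,
    // No PO is used twice within the same VBELN (assumed reading of the rule)
    OneUsePerPO = Table.RowCount(Table.Distinct(Result, {"VBELN", "PO"})) = Rows,
    IsValid = OneRowPerPair and OneUsePerPO
in
    IsValid
```

For the result above this evaluates to true; the original four-row table fails the first condition.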

ray_aramburo · ‎06-29-2023

Could you please share a sample of your whole data, removing any sensitive information?

ValeriaBreve · ‎06-29-2023

Hi Ray, well it is what I posted:

I have:

VBELN	POS	PO
80001	10	123456
80001	20	123456
80001	10	654321
80001	20	654321

(Many different VBELN and POs, but the structure is this)

and the results I am expecting:

VBELN	POS	PO
80001	10	123456
80001	20	654321

or:

VBELN	POS	PO
80001	10	654321
80001	20	123456

(it does not matter which one)

Thanks!

AlienSx · ‎06-30-2023

Hi, @ValeriaBreve this works on your test data but I am not sure about real life data...

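let
    Source = your_table,
    // One row per PO; VP holds that PO's list of {VBELN, POS} pairs
    groups = Table.Group(Source, {"PO"}, {{"VP", each List.Zip({[VBELN], [POS]})}}),
    lst = List.Buffer(groups[VP]),
    // Walk the POs in order, giving each one the first pair no earlier PO has taken
    vp_txf = 
        List.Accumulate(
            lst,
            {},
            // s = pairs assigned so far, c = pairs of the current PO;
            // if every pair in c is already taken, List.First returns null
            (s, c) => [a = List.Difference(c, s), b = s & {List.First(a)}][b]
        ),
    z = Table.FromColumns({groups[PO], vp_txf}, {"PO", "VP"}),
    // Join each pair into text, then split it back into VBELN and POS (both end up as text)
    extract = Table.TransformColumns(z, {"VP", each Text.Combine(List.Transform(_, Text.From), "@@@"), type text}),
    split = Table.SplitColumn(extract, "VP", Splitter.SplitTextByDelimiter("@@@", QuoteStyle.Csv), {"VBELN", "POS"})
in
    split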

ValeriaBreve · ‎06-30-2023

@AlienSx This works great, thank you! I still have difficulties with List.Accumulate. Would you mind explaining the logic of the List.Accumulate step in your code to me? Thanks again!

AlienSx · ‎06-30-2023

@ValeriaBreve List.Accumulate goes over the lists of VBELN/POS pairs associated with each PO, one by one. Variable "c" is the current item (the pairs of the current PO), while "s" is what you have accumulated so far (the pairs already assigned). List.Difference(c, s) finds all VBELN/POS pairs from the current "row" that we have never seen before, and the first pair found is added to the accumulator variable "s". Then it goes to the next step with the updated "s".
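
To make the walk concrete, here is the same accumulation traced on the corrected sample data. It is only an illustration: the list is typed in by hand instead of coming from Table.Group, and the names unseen, next and picked are invented for readability.

```powerquery
let
    // VBELN/POS pairs per PO, as the grouping step would produce them
    lst = {
        {{80001, 10}, {80001, 20}},   // pairs for PO 123456
        {{80001, 10}, {80001, 20}}    // pairs for PO 654321
    },
    picked = List.Accumulate(
        lst,
        {},   // s starts empty: no pair has been assigned yet
        (s, c) =>
            let
                unseen = List.Difference(c, s),   // pairs of the current PO not yet assigned
                next = s & {List.First(unseen)}   // assign the first unseen pair (null if none is left)
            in
                next
    )
    // step 1: s = {},            unseen = {{80001, 10}, {80001, 20}}, next = {{80001, 10}}
    // step 2: s = {{80001, 10}}, unseen = {{80001, 20}},              next = {{80001, 10}, {80001, 20}}
in
    picked
```

The result has one entry per PO, in the same order as the POs, which is why it can be placed next to the PO column afterwards. If a PO has no unseen pair left, its entry is null, which is one case where real data may behave differently from the test data.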

ValeriaBreve · ‎06-30-2023

@AlienSx clear - thank you so much!

ray_aramburo · ‎06-30-2023

That's kind of a random and arbitrary logic, but you can try just selecting the PO column and removing duplicates. What Power Query usually does is keep the first occurrence of each PO and remove the rest.
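
As a sketch of what that suggestion amounts to on the corrected sample data (the inline table and step names are assumptions for illustration):

```powerquery
let
    Source = #table(
        {"VBELN", "POS", "PO"},
        {
            {80001, 10, 123456},
            {80001, 20, 123456},
            {80001, 10, 654321},
            {80001, 20, 654321}
        }
    ),
    // Keep one row per PO value; only the PO column is compared
    OnePerPO = Table.Distinct(Source, {"PO"})
in
    OnePerPO
```

If the first row of each PO is the one kept, both surviving rows carry VBELN 80001 / POS 10, so the VBELN/POS pair is still duplicated and the requirement is not met.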

ray_aramburo · ‎06-29-2023

How would you determine which VBELN/POS goes with which PO?
