Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
amethyst_tang_2
Frequent Visitor

Customise order of removal duplicates in power query

Hi, I am looking for ways to customise the removal of the duplicates in power query, any idea how? 

I basically have 3 rows and only keep the middle row. 

1 ACCEPTED SOLUTION
Ahmedx
Super User
Super User

pls try this

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WstAxVtJRMrMAEpamloYWhkCGuamZAQgoxerAFJhbElBgYYpHgQlQyBSuwAiiwBBNgZkBAQXmOBTEAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"External-Rating" = _t, #" lnternal_Score" = _t, Votes = _t, Gross = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"External-Rating", type number}, {" lnternal_Score", Int64.Type}, {"Votes", Int64.Type}, {"Gross", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"External-Rating", "Votes", "Gross"},
 {{"Count", (x)=> Table.FromRecords({ Table.Sort(x,{{"Votes", Order.Ascending}}){Number.RoundUp( Table.RowCount(x)/2,0)-1}})

}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"Count"}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Removed Other Columns", "Count", {"External-Rating", " lnternal_Score", "Votes", "Gross"}, {"External-Rating", " lnternal_Score", "Votes", "Gross"})
in
    #"Expanded Count"

View solution in original post

12 REPLIES 12
Ahmedx
Super User
Super User

pls try this

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WstAxVtJRMrMAEpamloYWhkCGuamZAQgoxerAFJhbElBgYYpHgQlQyBSuwAiiwBBNgZkBAQXmOBTEAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"External-Rating" = _t, #" lnternal_Score" = _t, Votes = _t, Gross = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"External-Rating", type number}, {" lnternal_Score", Int64.Type}, {"Votes", Int64.Type}, {"Gross", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"External-Rating", "Votes", "Gross"},
 {{"Count", (x)=> Table.FromRecords({ Table.Sort(x,{{"Votes", Order.Ascending}}){Number.RoundUp( Table.RowCount(x)/2,0)-1}})

}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"Count"}),
    #"Expanded Count" = Table.ExpandTableColumn(#"Removed Other Columns", "Count", {"External-Rating", " lnternal_Score", "Votes", "Gross"}, {"External-Rating", " lnternal_Score", "Votes", "Gross"})
in
    #"Expanded Count"

to know how to do this watch my video

(x)=> Table.FromRecords({ Table.Sort(x,{{"Votes", Order.Ascending}}){Number.RoundUp( Table.RowCount(x)/2,0)-1}})

https://1drv.ms/v/s!AiUZ0Ws7G26RjHukM5Cp232otYo_?e=yLwqxO

if there are always 3 rows then you can do it like this

(x)=> Table.FromRecords({ Table.Sort(x,{{"Votes", Order.Ascending}}){1}})

and try this

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WstAxVtJRMrMAEpamloYWhkCGuamZAQgoxerAFJhbElBgYYpHgQlQyBSuwAiiwBBNgZkBAQXmOBTEAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"External-Rating" = _t, lnternal_Score = _t, Votes = _t, Gross = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"External-Rating", type number}, {"lnternal_Score", Int64.Type}, {"Votes", Int64.Type}, {"Gross", Int64.Type}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"External-Rating", "Votes", "Gross"}, {{"lnternal_Score",
 (x)=> List.Median(x[lnternal_Score])
}})
in
    #"Grouped Rows"

 

hi do you have a simpler version of this?

and try this

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WstAxVtJRMrMAEpamloYWhkCGuamZAQgoxerAFJhbElBgYYpHgQlQyBSuwAiiwBBNgZkBAQXmOBTEAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"External-Rating" = _t, #" lnternal_Score" = _t, Votes = _t, Gross = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"External-Rating", type number}, {" lnternal_Score", Int64.Type}, {"Votes", Int64.Type}, {"Gross", Int64.Type}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"External-Rating", Order.Ascending}, {" lnternal_Score", Order.Ascending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type),
    #"Inserted Modulo" = Table.AddColumn(#"Added Index", "Modulo", each Number.Mod([Index], 3), type number),
    #"Filtered Rows" = Table.SelectRows(#"Inserted Modulo", each ([Modulo] = 2)),
    #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index", "Modulo"})
in
    #"Removed Columns"

 

lukiz84
Memorable Member
Memorable Member

how is the "middle" row defined? there must be some difference to the other rows?

amethyst_tang_2_0-1705931070339.png

Hi this is how the data looks like, everything are the same across all columns except for the internal score, where they are different. Hence, i am trying to use the middle row as the media and remove the top and bottom.

lukiz84
Memorable Member
Memorable Member

Sort your data accordingly, call Table.Buffer, then remove duplicates

amethyst_tang_2_0-1705930910440.png

Hi this is how the data looks like, everything are the same across all columns except for the internal score, where they are different. Hence, i am trying to use the middle row as the media and remove the top and bottom.

hi, would like to keep only the middle row (aka median), and have tried table.buffer before, but seems to be getting the bottom row for the results, any solution on how to change to median?

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.