Solved: How to Remove Duplicate Rows While Keeping a Speci...

GeorgeSmithson · ‎03-10-2022

Hi All,

I am extremely new to Power Query, and as a result do not know what is possible/not possible/not feasible.

I have some data which I am in the process of transforming to match a specific requirement.

The first step in the process was splitting a colum that had multiple lines of data in, to their own separate single line rows.

As a result, I now have multiple rows for each individual line as I would like.

Now I want to know if it is possible to remove the duplicate data from certain rows, whilst only keeping the distinct data from one column in those rows.

So for example, after splitting into separate rows, i'm left with the data circled in red.

But I want to get rid of the bit circled in blue, keeping the bits highlighted in yellow.

Is this something that is possible, or are there better alternatives to what I want to achieve?

v-yingjl · ‎03-14-2022

Hi @GeorgeSmithson ,

You can create an Index column based on each Id and replace values, like this query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlbSUTIyMDLUt9A3NAKyDY2MDU3NgIyICKVYnWglEyR5kHBKWkYKkIqMBMuaIsmaA9lmpqmlqUA6KgosDTYRJm8JZAf4K7gHKxgYmICMciRCjRMRapyJUONChBpXItS4EaHGXSk2FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Id = _t, Date = _t, Reference = _t, Notes = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"Data", each Table.AddIndexColumn(_,"Index",1,1), type table [Id=nullable number, Date=nullable date, Reference=nullable text, Notes=nullable text,Index = nullable number]}}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Data",each [Id],each if [Index] <> 1 then null else [Id],Replacer.ReplaceValue,{"Id"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Date],each if [Index] <> 1 then null else [Date],Replacer.ReplaceValue,{"Date"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",each [Reference],each if [Index] <> 1 then null else [Reference],Replacer.ReplaceValue,{"Reference"}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Value2",{"Index"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}})
in
    #"Changed Type1"

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

v-yingjl · ‎03-14-2022

Hi @GeorgeSmithson ,

You can create an Index column based on each Id and replace values, like this query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlbSUTIyMDLUt9A3NAKyDY2MDU3NgIyICKVYnWglEyR5kHBKWkYKkIqMBMuaIsmaA9lmpqmlqUA6KgosDTYRJm8JZAf4K7gHKxgYmICMciRCjRMRapyJUONChBpXItS4EaHGXSk2FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Id = _t, Date = _t, Reference = _t, Notes = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"Data", each Table.AddIndexColumn(_,"Index",1,1), type table [Id=nullable number, Date=nullable date, Reference=nullable text, Notes=nullable text,Index = nullable number]}}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Data",each [Id],each if [Index] <> 1 then null else [Id],Replacer.ReplaceValue,{"Id"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Date],each if [Index] <> 1 then null else [Date],Replacer.ReplaceValue,{"Date"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",each [Reference],each if [Index] <> 1 then null else [Reference],Replacer.ReplaceValue,{"Reference"}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Value2",{"Index"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}})
in
    #"Changed Type1"

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BA_Pete · ‎03-11-2022

Hi @GeorgeSmithson ,

Sorry if I've misunderstood, but it sounds like you want to keep the first [Reference] row value for the group, but want to remove all the duplicates to null for the remainder of rows in that group that have a unique [Notes] value against them?

If so, this isn't something you want to do in Power Query. The data should be kept as-is, as this will make your life significantly easier further down the road.

If you want to display your data in that way, i.e. just a single [Reference] value against a number of [Notes] values, then you would use the Matrix visual to get the format you want. If you are exporting to Excel, then you would use a Pivot Table to get your desired format.

As an aside: this looks like an XY Problem. You may find you get a better solution if you can explain exactly what you are trying to achieve overall.

Pete

Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!

Vijay_A_Verma · ‎03-10-2022

Below is a solution prepared which you can download

https://1drv.ms/x/s!Akd5y6ruJhvhuRfeoDvdpgCpUxSQ?e=7kRfrn

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"id"}, {{"Custom", each _, type table [id=number, Date=datetime, Reference=text, Notes=number]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom.1", each Table.AddIndexColumn([Custom],"Index")),
    #"Expanded Custom.1" = Table.ExpandTableColumn(#"Added Custom", "Custom.1", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom.1",{"Custom"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "id1", each if [Index]=0 then [id] else null),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Date1", each if [Index]=0 then [Date] else null),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Ref1", each if [Index]=0 then [Reference] else null),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom3",{"id", "Date", "Reference", "Index"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns1",{"id1", "Date1", "Ref1", "Notes"}),
    #"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"id1", "id"}, {"Date1", "Date"}, {"Ref1", "Ref"}})
in
    #"Renamed Columns"

How to Remove Duplicate Rows While Keeping a Specific Column

Helpful resources

Power BI Monthly Update - September 2025

Fabric Community Update - August 2025

FabCon is coming to Atlanta

How to Remove Duplicate Rows While Keeping a Specific Column

Helpful resources

Power BI Monthly Update - September 2025

Fabric Community Update - August 2025