Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.
Hi All,
I am extremely new to Power Query, and as a result do not know what is possible/not possible/not feasible.
I have some data which I am in the process of transforming to match a specific requirement.
The first step in the process was splitting a colum that had multiple lines of data in, to their own separate single line rows.
As a result, I now have multiple rows for each individual line as I would like.
Now I want to know if it is possible to remove the duplicate data from certain rows, whilst only keeping the distinct data from one column in those rows.
So for example, after splitting into separate rows, i'm left with the data circled in red.
But I want to get rid of the bit circled in blue, keeping the bits highlighted in yellow.
Is this something that is possible, or are there better alternatives to what I want to achieve?
Solved! Go to Solution.
Hi @GeorgeSmithson ,
You can create an Index column based on each Id and replace values, like this query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlbSUTIyMDLUt9A3NAKyDY2MDU3NgIyICKVYnWglEyR5kHBKWkYKkIqMBMuaIsmaA9lmpqmlqUA6KgosDTYRJm8JZAf4K7gHKxgYmICMciRCjRMRapyJUONChBpXItS4EaHGXSk2FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Id = _t, Date = _t, Reference = _t, Notes = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"Data", each Table.AddIndexColumn(_,"Index",1,1), type table [Id=nullable number, Date=nullable date, Reference=nullable text, Notes=nullable text,Index = nullable number]}}),
#"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Data",each [Id],each if [Index] <> 1 then null else [Id],Replacer.ReplaceValue,{"Id"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Date],each if [Index] <> 1 then null else [Date],Replacer.ReplaceValue,{"Date"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",each [Reference],each if [Index] <> 1 then null else [Reference],Replacer.ReplaceValue,{"Reference"}),
#"Removed Columns" = Table.RemoveColumns(#"Replaced Value2",{"Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}})
in
#"Changed Type1"
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @GeorgeSmithson ,
You can create an Index column based on each Id and replace values, like this query:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlbSUTIyMDLUt9A3NAKyDY2MDU3NgIyICKVYnWglEyR5kHBKWkYKkIqMBMuaIsmaA9lmpqmlqUA6KgosDTYRJm8JZAf4K7gHKxgYmICMciRCjRMRapyJUONChBpXItS4EaHGXSk2FgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Id = _t, Date = _t, Reference = _t, Notes = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Id"}, {{"Data", each Table.AddIndexColumn(_,"Index",1,1), type table [Id=nullable number, Date=nullable date, Reference=nullable text, Notes=nullable text,Index = nullable number]}}),
#"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Data",each [Id],each if [Index] <> 1 then null else [Id],Replacer.ReplaceValue,{"Id"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",each [Date],each if [Index] <> 1 then null else [Date],Replacer.ReplaceValue,{"Date"}),
#"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1",each [Reference],each if [Index] <> 1 then null else [Reference],Replacer.ReplaceValue,{"Reference"}),
#"Removed Columns" = Table.RemoveColumns(#"Replaced Value2",{"Index"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"Id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}})
in
#"Changed Type1"
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @GeorgeSmithson ,
Sorry if I've misunderstood, but it sounds like you want to keep the first [Reference] row value for the group, but want to remove all the duplicates to null for the remainder of rows in that group that have a unique [Notes] value against them?
If so, this isn't something you want to do in Power Query. The data should be kept as-is, as this will make your life significantly easier further down the road.
If you want to display your data in that way, i.e. just a single [Reference] value against a number of [Notes] values, then you would use the Matrix visual to get the format you want. If you are exporting to Excel, then you would use a Pivot Table to get your desired format.
As an aside: this looks like an XY Problem. You may find you get a better solution if you can explain exactly what you are trying to achieve overall.
Pete
Proud to be a Datanaut!
Below is a solution prepared which you can download
https://1drv.ms/x/s!Akd5y6ruJhvhuRfeoDvdpgCpUxSQ?e=7kRfrn
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"Date", type date}, {"Reference", type text}, {"Notes", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"id"}, {{"Custom", each _, type table [id=number, Date=datetime, Reference=text, Notes=number]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom.1", each Table.AddIndexColumn([Custom],"Index")),
#"Expanded Custom.1" = Table.ExpandTableColumn(#"Added Custom", "Custom.1", {"Date", "Reference", "Notes", "Index"}, {"Date", "Reference", "Notes", "Index"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Custom.1",{"Custom"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "id1", each if [Index]=0 then [id] else null),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Date1", each if [Index]=0 then [Date] else null),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "Ref1", each if [Index]=0 then [Reference] else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom3",{"id", "Date", "Reference", "Index"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns1",{"id1", "Date1", "Ref1", "Notes"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"id1", "id"}, {"Date1", "Date"}, {"Ref1", "Ref"}})
in
#"Renamed Columns"