Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Preparing for a certification exam? Ask exam experts all your questions on May 15th. Register now.

Reply
Hakon
Regular Visitor

Power query, create new columns based on duplicates and date

Hi,

 

I am looking to create a column that says if there are any duplicates in the same date, but if there is a duplicate with a newer date then I want to get a value indicating that there is a new version of the duplicate in another column. I'm not going to remove duplicates, just highlight them. Have tried with DAX, but it cannot distinguish whether it is uppercase or lowercase. It reads M3J and M3j as the same name.

 

Hakon_0-1708425338895.png

 

1 ACCEPTED SOLUTION

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjTVMzDUMzIwMlHSUTK0jExLKUoxKXIuqygtS1EwCC7wNfZSitUhRl0WkepKiFNn4kykeVD3mRNSp0KkOkMi1RFrLyRcjAz0DIyIMI+wOqB5sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, GUID = _t]),
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"Date", type date}, {"GUID", type text}},"de"),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"GUID"}, {{"maxd", each List.Max([Date]), type nullable date}, {"mind", each List.Min([Date]), type nullable date}, {"ct", each Table.RowCount(_), Int64.Type}, {"rows", each _, type table [Date=nullable date, GUID=nullable text, Index=number]}}),
    #"Expanded rows" = Table.ExpandTableColumn(#"Grouped Rows", "rows", {"Date", "Index"}, {"Date", "Index"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded rows",{"Date", "GUID", "maxd", "mind", "ct", "Index"}),
    #"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"Index", Order.Ascending}}),
    #"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Duplicates", (k)=> if Table.RowCount(Table.SelectRows(#"Sorted Rows",each [Date]=k[Date] and [GUID]=k[GUID]))>1 then "Duplicate" else ""),
    #"Added Custom" = Table.AddColumn(#"Added Custom1", "New/Old", each if [ct] = 1 or [Date]=[maxd] then "New" else "Old"),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Date", "GUID", "Duplicates", "New/Old"})
in
    #"Removed Other Columns"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

View solution in original post

4 REPLIES 4
lbendlin
Super User
Super User

Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).

Do not include sensitive information or anything not related to the issue or question.

If you are unsure how to upload data please refer to https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...

Please show the expected outcome based on the sample data you provided.

Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...

This is my source data:

 

DateGUID
15.01.202419Yfdrd4rCvxuvd 0SpM3J
15.01.202419Yfdrd4rCvxuvd 0SpM3j
15.01.202419Yfdrd4rCvxuvd 0SpM3t
15.01.202419Yfdrd4rCvxuvd 0SpM4C
15.01.202419Yfdrd4rCvxuvd 0SpM3J
17.01.202419Yfdrd4rCvxuvd 0SpM3$
17.01.202419Yfdrd4rCvxuvd 0SpM31
17.01.202419Yfdrd4rCvxuvd 0SpM3J
17.01.202419Yfdrd4rCvxuvd 0SpM3j
20.02.202419Yfdrd4rCvxuvd 0SpM3J
20.02.202419Yfdrd4rCvxuvd 0SpM3j

 

This is what I want the outcome to be:

 

DateGUIDDuplicate/UniqueNew/Old
15.01.202419Yfdrd4rCvxuvd 0SpM3JDuplicateOld
15.01.202419Yfdrd4rCvxuvd 0SpM3jUniqueOld
15.01.202419Yfdrd4rCvxuvd 0SpM3tUniqueNew
15.01.202419Yfdrd4rCvxuvd 0SpM4CUniqueNew
15.01.202419Yfdrd4rCvxuvd 0SpM3JDuplicateOld
17.01.202419Yfdrd4rCvxuvd 0SpM3$UniqueNew
17.01.202419Yfdrd4rCvxuvd 0SpM31UniqueNew
17.01.202419Yfdrd4rCvxuvd 0SpM3JUniqueOld
17.01.202419Yfdrd4rCvxuvd 0SpM3jUniqueOld
20.02.202419Yfdrd4rCvxuvd 0SpM3JUniqueNew
20.02.202419Yfdrd4rCvxuvd 0SpM3jUniqueNew

 

I know I can group by and select the colums Date and GUID and see if there is more then 1 of the same GUID in the same date, but I have a problem with saying who GUID is new and old since DAX dont seperate upper and lowercase, and I dont know query so good.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjTVMzDUMzIwMlHSUTK0jExLKUoxKXIuqygtS1EwCC7wNfZSitUhRl0WkepKiFNn4kykeVD3mRNSp0KkOkMi1RFrLyRcjAz0DIyIMI+wOqB5sQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, GUID = _t]),
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"Date", type date}, {"GUID", type text}},"de"),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"GUID"}, {{"maxd", each List.Max([Date]), type nullable date}, {"mind", each List.Min([Date]), type nullable date}, {"ct", each Table.RowCount(_), Int64.Type}, {"rows", each _, type table [Date=nullable date, GUID=nullable text, Index=number]}}),
    #"Expanded rows" = Table.ExpandTableColumn(#"Grouped Rows", "rows", {"Date", "Index"}, {"Date", "Index"}),
    #"Reordered Columns" = Table.ReorderColumns(#"Expanded rows",{"Date", "GUID", "maxd", "mind", "ct", "Index"}),
    #"Sorted Rows" = Table.Sort(#"Reordered Columns",{{"Index", Order.Ascending}}),
    #"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Duplicates", (k)=> if Table.RowCount(Table.SelectRows(#"Sorted Rows",each [Date]=k[Date] and [GUID]=k[GUID]))>1 then "Duplicate" else ""),
    #"Added Custom" = Table.AddColumn(#"Added Custom1", "New/Old", each if [ct] = 1 or [Date]=[maxd] then "New" else "Old"),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Date", "GUID", "Duplicates", "New/Old"})
in
    #"Removed Other Columns"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

Thanks, worked perfectly!

Helpful resources

Announcements
PBIApril_Carousel

Power BI Monthly Update - April 2025

Check out the April 2025 Power BI update to learn about new features.

Notebook Gallery Carousel1

NEW! Community Notebooks Gallery

Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.

April2025 Carousel

Fabric Community Update - April 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors