Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
moonpie
Frequent Visitor

Query Editor-Duplicate Values in A column,, trying to find the way to flag duplicates

So I have searched the forums on duplicates, and I have found some good solutions on grouping duplicates, but not exactly what I need. 

 

So let us say that I have the following table (below is fake data I created that more or less shows how my real data is, couldn't put real data in for privacy/security issues).

 

The EmployeeID field does contain duplicate values.

RequestIDEmployeeIDRequestType
1333333Access
2111111Access
3222222Service
4333333Access
5222222Access
6444444Service
7333333Service

 

I want to create a new column with a value of either original or duplicate.

 

If an ID is unique it will have the value of Original.

 

If an ID appears more than once, only the first duplicate ID will the value of Original and the rest of the duplicate with have a value of Duplicate.

 

The table below shows what I am trying to do a lot better.  

RequestIDEmployeeIDRequestTypeFlag
1333333AccessOriginal
2111111AccessOriginal
3222222ServiceOriginal
4333333AccessDuplicate
5222222AccessDuplicate
6444444ServiceOriginal
7333333ServiceDuplicate


Now the reason I want to do this is because I need to create 4 reports

  • Matrix table of only Original counts
  • Matixx table of both duplicate and original counts
  • Regular table that corresponds to the first bulleted item, but with all columns
  • Regular table that corresponds to the second bulleted item but with all columns

I know this might be confusing, or it seems unnecessary, however this is how they are asking me to do it at my work

 

Any help is greatly appreciated.

 

1 ACCEPTED SOLUTION
dax
Community Support
Community Support

Hi moonpie, 

It seems that you want to create a column based on condition in Table, right? If so,  you could try to use below M code to achieve this goal

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGAyDDMTk5tbhYKVYnWskIyDUEA1RxkDIjMAAyglOLyjKTU8ESJjgMMkXWgCRuBuSagAGaQebIBsElYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [RequestID = _t, EmployeeID = _t, RequestType = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"RequestID", Int64.Type}, {"EmployeeID", Int64.Type}, {"RequestType", type text}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"RequestID", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index1", {"EmployeeID"}, {{"group", each _, type table [RequestID=number, EmployeeID=number, RequestType=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([group],"index",1,1)),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"RequestID", "RequestType", "index"}, {"Custom.RequestID", "Custom.RequestType", "Custom.index"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"group"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each if [Custom.index]=1 then "Original" else "Duplicate"),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Custom.index"})
in
    #"Removed Columns1"

Best Regards,
Zoe Zhi

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

3 REPLIES 3

Hi @dax ,

 

I am really struggling to adapt this for my scenario. 

 

I have created a concatenated colum  to identify duplicates and I have sorted the data based on the concatenated row. Now I want to create a duplicate flag (1 if duplicate and 0 if not duplicate).  Here is an example of my data, the table is called api_data program_4378 and the column is called Concatenate (text format)

 

KatieFarrand12_0-1694440951715.png

Warm regards,

Katie

dax
Community Support
Community Support

Hi moonpie, 

It seems that you want to create a column based on condition in Table, right? If so,  you could try to use below M code to achieve this goal

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGAyDDMTk5tbhYKVYnWskIyDUEA1RxkDIjMAAyglOLyjKTU8ESJjgMMkXWgCRuBuSagAGaQebIBsElYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [RequestID = _t, EmployeeID = _t, RequestType = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"RequestID", Int64.Type}, {"EmployeeID", Int64.Type}, {"RequestType", type text}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"RequestID", Order.Ascending}}),
    #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
    #"Grouped Rows" = Table.Group(#"Added Index1", {"EmployeeID"}, {{"group", each _, type table [RequestID=number, EmployeeID=number, RequestType=text]}}),
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([group],"index",1,1)),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"RequestID", "RequestType", "index"}, {"Custom.RequestID", "Custom.RequestType", "Custom.index"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"group"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each if [Custom.index]=1 then "Original" else "Duplicate"),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Custom.index"})
in
    #"Removed Columns1"

Best Regards,
Zoe Zhi

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

moonpie
Frequent Visitor

Hi Zoe,

 

Thank you very much for your help! It worked.

 

-moonpie

 

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.