Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
vjnvinod
Impactful Individual
Impactful Individual

Need to create a calculate coloumn to identify duplicates lines

Hi,

 

I have a table, with so many rows and coloumns,

i need to create a calculate coloumn, to segregate "Original and Duplicate entries"

Certain coloumns

 

GlobalService

DunsNumber

OpportunityID

Opportunity

PipelineOpen_constant

 

if the above coloumns , for the lines(Rows) are equal, then mark these as duplicate, except 1line(which is original)

that means if there are 3 lines same, mark 2 of them as duplicate and 1 as original.

 

let me know if this is possible to do and also i want a table to be created for those entries.

 

1 ACCEPTED SOLUTION
v-frfei-msft
Community Support
Community Support

Hi @vjnvinod ,

 

I have created a sample for your reference, please check the following steps as below.

 

1. Insert an index column in power query. M code for your reference.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlSK1YGRSUhkMg4yBUymgsk0pdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [catgeory = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"catgeory", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1)
in
    #"Added Index"

 

2. Close and apply, then wen can get the excepted result by a calculated column.

Column =
VAR inde = 'Table'[Index]
VAR previous = inde - 1
RETURN
    IF (
        CALCULATE (
            MAX ( 'Table'[catgeory] ),
            FILTER ( 'Table', 'Table'[Index] = previous )
        ) = 'Table'[catgeory],
        "Duplicated",
        "Original"
    )

Capture.PNG

 

Pbix as attached.

 

 

Community Support Team _ Frank
If this post helps, then please consider Accept it as the solution to help the others find it more quickly.

View solution in original post

2 REPLIES 2
v-frfei-msft
Community Support
Community Support

Hi @vjnvinod ,

 

I have created a sample for your reference, please check the following steps as below.

 

1. Insert an index column in power query. M code for your reference.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSlSK1YGRSUhkMg4yBUymgsk0pdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [catgeory = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"catgeory", type text}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1)
in
    #"Added Index"

 

2. Close and apply, then wen can get the excepted result by a calculated column.

Column =
VAR inde = 'Table'[Index]
VAR previous = inde - 1
RETURN
    IF (
        CALCULATE (
            MAX ( 'Table'[catgeory] ),
            FILTER ( 'Table', 'Table'[Index] = previous )
        ) = 'Table'[catgeory],
        "Duplicated",
        "Original"
    )

Capture.PNG

 

Pbix as attached.

 

 

Community Support Team _ Frank
If this post helps, then please consider Accept it as the solution to help the others find it more quickly.
danextian
Super User
Super User

Hi @vjnvinod ,

 

This is achievable.

 

  • First create a calculated column to concatenate all these columns.

 

Concatenate =
//you may also use "&" instead of COMBINEVALUES() to concatenate text strings
COMBINEVALUES (
    " ",
    'Table'[GlobalService],
    'Table'[DunsNumber],
    'Table'[OpportunityID],
    'Table'[Opportunity],
    'Table'[PipelineOpen_constant]
)
​

 

 

  • Now, there should some kind of a column to identify which among the duplicate concatenated values come first like datetimestamp, etc. If there isn't, to to the Query Editor (Designer Home Tab >> Transform Data) -->Add Column tab -->Index Column and then apply the changes. Create another calculated column:
    index.png
    KeepThisRow =
    CALCULATE (
        MIN ( 'Table'[Index/DateTimeStamp] ),
        ALLEXCEPT ( 'Table', 'Table'[Concatenate] )
    ) = 'Table'[Index/DateTimeStamp]
    ​

     

Another way to do this, you don't want  to keep all other rows except the first one based on several columns is do Remove Duplicates in Power Query which is very similar to Excel.  In Power Query, select all the columns in your list. Right click and then select Remove Duplicates.

remove duplicates.png

 

 

 





Dane Belarmino | Microsoft MVP | Proud to be a Super User!

Did I answer your question? Mark my post as a solution!


"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.

Helpful resources

Announcements
July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors