Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Stelles3000
Regular Visitor

Identify duplicate rows

Hello togehter,

 

Table1

Cost CenterGL AccountDocNoAmount
25250015455506DE452K450.11
25250015455506DE477B450.11
25250405456000DE333C83.45

 

I want to identify dublicate rows with the following rule:

 

Cost Center EQUALS Cost Center

GL Account EQUALS GL Account

Amount EQUALS Amount

DocNo UNEQUALS DocNo

 

The result should be:

Cost CenterGL AccountDocNoAmount
25250015455506DE452K450.11
25250015455506DE477B450.11

 

1 ACCEPTED SOLUTION

But, you can convert a text inside  in measure to a number using the VALUE function

View solution in original post

10 REPLIES 10
Ahmedx
Super User
Super User

MFelix
Super User
Super User

Hi @Stelles3000 ,

 

This depends on how you want to get this but I see 2 different options using DAX or Power Query, but then also depends if you want to have a new table or a column you can do the following:;

 

Power Query:

  • Create a referenced table and do a group by the  3 columns and add a count of the rows and add a row of all values:

MFelix_1-1678473077024.png

  • Filter the values different from 1 and expand~the doc number:

MFelix_2-1678473118754.pngMFelix_3-1678473131095.png

  • Delete the column count

MFelix_4-1678473157811.png

 

Option 2 is to create the same steps but within the table you have and then do a merge to get the count and flag those value has repetead:

MFelix_5-1678473395157.png

Code below (but have some steps that can be changed to be more efficient)

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("dcqxDYAwDATAXVxH0Tv2J9RAKkaIvP8aWNBCd8WtJY2NgEoROkn01Dmd7Uo4UVUlyu8bY/94jvd1AM8zsyOxWXVKxA0=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Cost Center" = _t, #"GL Account" = _t, DocNo = _t, Amount = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Cost Center", Int64.Type}, {"GL Account", Int64.Type}, {"DocNo", type text}, {"Amount", type number}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Cost Center", "GL Account", "Amount"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"Values", each _, type table [Cost Center=nullable number, GL Account=nullable number, DocNo=nullable text, Amount=nullable number]}}),
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([Count] = 2)),
    #"Expanded Values" = Table.ExpandTableColumn(#"Filtered Rows", "Values", {"DocNo"}, {"DocNo"}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Cost Center", "GL Account", "Amount", "DocNo"}, #"Expanded Values", {"Cost Center", "GL Account", "Amount", "DocNo"}, "Expanded Values", JoinKind.LeftOuter),
    #"Expanded Expanded Values" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Values", {"Count"}, {"Count"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Expanded Values",{{"Count", "Flag"}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns",{{"Flag", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type1", each [Flag] , each if [Flag] <> null then "Duplicated" else null,Replacer.ReplaceValue,{"Flag"})
in
    #"Replaced Value"

 

DAX:

New Table use the code below:

Duplicated Values Table =
VAR temptable =
    SUMMARIZE (
        'Table',
        'Table'[GL Account],
        'Table'[Cost Center],
        'Table'[Amount],
        "Duplicated", IF ( COUNTROWS ( 'Table' ) > 1, "Duplicated" )
    )
RETURN
    FILTER ( NATURALLEFTOUTERJOIN ( 'Table', temptable ), [Duplicated] <> BLANK () )

MFelix_6-1678473801383.png

 

 

Column:

Flag =
IF (
    COUNTROWS (
        FILTER (
            'Table',
            'Table'[Cost Center] = EARLIER ( 'Table'[Cost Center] )
                && 'Table'[GL Account] = EARLIER ( 'Table'[GL Account] )
                && 'Table'[Amount] = EARLIER ( 'Table'[Amount] )
        )
    ) > 1,
    "Duplicated"
)

MFelix_7-1678473930311.png

 

If you just want to present this on a visualization best option is to create a measure similar to this:

Flag Metric = 
    COUNTROWS (
        FILTER (
            ALLSELECTED('Table'[Cost Center], 'Table'[GL Account], 'Table'[Amount], 'Table'[DocNo]),
            'Table'[Cost Center] = SELECTEDVALUE( 'Table'[Cost Center] )
                && 'Table'[GL Account] = SELECTEDVALUE(  'Table'[GL Account] )
                && 'Table'[Amount] = SELECTEDVALUE(  'Table'[Amount] )
        )
    ) 

 

Now use it has a filter on your table visualization.

MFelix_8-1678474169464.png

 

 

 

 

 

 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Thx for your reply. The part DocNo is unequal DocNo is important. This should not be the same.

Hi @Stelles3000

 

I identify with duplicate but the doc number is different in all the codes if you see I'm only comparing gl account, cost center and amount. The result is giving the same has you present on the post two lines and different doc numbers.

 

Isn't that the result you want? 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Yes, but the DocNo must be different, so it should not equal. There are examples with identical values, I dont want these. 

@Stelles3000 ,

 

Sorry for the additional questions, so when you have a docnum equal you don't want to count the repetead ones?

 

Can you send an example of data where you have that type of information (duplicated doc numbers) and expected result? On the first example you only had 3 lines so it's diffficult to duplicate your data and get accurate results.

 

Thank you in advance.


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



There could be more than 5000 rows in different order. 

 

Cost CenterGL Account DocNoAmountFlag
25250015455506DE452K450.11Dublicated
25250405456000DE333C83.45 
25250015455506DE477B450.11Dublicated
25250025455506DFG56B36.9 
25250035455506DFKB9050.01 
25250045455506DFKB9035.04 
25250025455506DFG56B36.9 

 

This Column "Flag" should be the result. I am looking for a pattern with specific rules.  

 

But what If the Cost Center or the GL Account is not a number. It is a text field. 

But, you can convert a text inside  in measure to a number using the VALUE function

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

Top Kudoed Authors