Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply

How to remove duplicate rows in calculated table based on condition

Hello All,

I have below calculated table..can you please let me know how to remove duplicate ID rows in below table. 

For example 123 ID is duplicated in below table and I need to remove 123 ID row with 10/19/2022 date where link column is empty and keep 123 ID with link value present for 10/24/2022 date.

 

we would like to remove duplicate id's based on above condition

IDlinkDate
123 10/19/2022
456https://www.wikipedia.org 
789  
123https://www.wikipedia.org10/24/2022
568  
897  

 

thank you!

 

regards,

Ashwini

2 ACCEPTED SOLUTIONS
v-yangliu-msft
Community Support
Community Support

Hi  @ashuaswinireddy ,

 

Here are the steps you can follow:

1. In Power query. Add Column – Index Column – From 1.

vyangliumsft_0-1668673146062.png

2. Create calculated column.

Rank =
RANKX(FILTER(ALL('Table'),'Table'[ID]=EARLIER('Table'[ID])),[Index],,ASC)
Flag =
var _maxdate=MAXX(FILTER(ALL('Table'),
'Table'[ID]=EARLIER('Table'[ID])),[Date])
var _count=COUNTX(FILTER(ALL('Table'),'Table'[ID]=EARLIER('Table'[ID])),[ID])
return
IF(
    _count=1&&[Rank]=1,
    1,
    IF(
        _count >1&&'Table'[Date]=_maxdate,1,0)
)

vyangliumsft_1-1668673146064.png

3. Create calculated table.

Table 2 =
var _table1=
FILTER('Table',[Flag]=1)
return
SUMMARIZE(
    _table1,[ID],[link],[Date])

4. Result:

vyangliumsft_2-1668673146067.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

mariussve1
Super User
Super User

Hi again,

Sorry for the late answer, but you can actually solve this in one DAX expression and one step:

Calculated table =
VAR __Table =
    SUMMARIZE (
        'Table',
        'Table'[ID],
        'Table'[Link],
        "RankColumn", ISBLANK ( 'Table'[Link] ) * 1
    )
VAR __Rank =
    SUMMARIZE (
        __Table,
        [ID],
        [Link],
        [RankColumn],
        "Rank", RANKX (
                    FILTER(
                        __Table,
                        [ID] = EARLIER ( [ID] )
                    ),
                    [RankColumn],,ASC
                )
    )
VAR __Filter =
    MINX ( __Rank, [Rank] )
VAR __Result =
    FILTER ( __Rank, [Rank] = __Filter )
RETURN
    SUMMARIZE (
        __Result,
        [ID],
        [Link]
    )


Br
Marius

Br
Marius
BI Fabrikken
www.bifabrikken.no

View solution in original post

8 REPLIES 8
mariussve1
Super User
Super User

Hi again,

Sorry for the late answer, but you can actually solve this in one DAX expression and one step:

Calculated table =
VAR __Table =
    SUMMARIZE (
        'Table',
        'Table'[ID],
        'Table'[Link],
        "RankColumn", ISBLANK ( 'Table'[Link] ) * 1
    )
VAR __Rank =
    SUMMARIZE (
        __Table,
        [ID],
        [Link],
        [RankColumn],
        "Rank", RANKX (
                    FILTER(
                        __Table,
                        [ID] = EARLIER ( [ID] )
                    ),
                    [RankColumn],,ASC
                )
    )
VAR __Filter =
    MINX ( __Rank, [Rank] )
VAR __Result =
    FILTER ( __Rank, [Rank] = __Filter )
RETURN
    SUMMARIZE (
        __Result,
        [ID],
        [Link]
    )


Br
Marius

Br
Marius
BI Fabrikken
www.bifabrikken.no

I had a similar issue with duplicate Document Number - 1 with Blank Transmittal No and 1 with valid Transmittal No. I used this method to remove the blank Tr no! I like this 1 DAX query to build the desired table! Many thanks. 🙂

Thank you for your time and response!

v-yangliu-msft
Community Support
Community Support

Hi  @ashuaswinireddy ,

 

Here are the steps you can follow:

1. In Power query. Add Column – Index Column – From 1.

vyangliumsft_0-1668673146062.png

2. Create calculated column.

Rank =
RANKX(FILTER(ALL('Table'),'Table'[ID]=EARLIER('Table'[ID])),[Index],,ASC)
Flag =
var _maxdate=MAXX(FILTER(ALL('Table'),
'Table'[ID]=EARLIER('Table'[ID])),[Date])
var _count=COUNTX(FILTER(ALL('Table'),'Table'[ID]=EARLIER('Table'[ID])),[ID])
return
IF(
    _count=1&&[Rank]=1,
    1,
    IF(
        _count >1&&'Table'[Date]=_maxdate,1,0)
)

vyangliumsft_1-1668673146064.png

3. Create calculated table.

Table 2 =
var _table1=
FILTER('Table',[Flag]=1)
return
SUMMARIZE(
    _table1,[ID],[link],[Date])

4. Result:

vyangliumsft_2-1668673146067.png

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Thank you for your time and response!

 

I  have another scenario where link is available for the previous date but not for the latest date.

For example- ID 897 link is available on 10/18/2022 and not on 10/26/2022. In this case I need to select ID 897 with link available row on 10/18/2022. Can you please let me know how to handle this scenario.

IDlinkDate
123 10/19/2022
456https://www.wikipedia.org 
789  
123https://www.wikipedia.org10/24/2022
568  
897 https://www.youtube.com 10/18/2022
897 10/26/2022
   

 

Once again thank you for your time!

 

Regards,

Ashwini

AllisonKennedy
Super User
Super User

@ashuaswinireddy 

 

If you want to remove the rows completely from the report, it's more efficient to do this in Power Query than DAX.

 

If you just want to keep the non blank links, then click 'Transform Data' to open power query. Filter the Link column to exclude null and blank values, then Close and Apply changes. 

 

If you want to remove duplicates and keep only the latest date, then Sort by Date in Power Query first. You'll need to add Table.Buffer to ensure the sort stays for the next step as per this answer: Solved: How do I buffer my table in Power Query? - Microsoft Power BI Community

Then right click on the ID column and remove duplicates. Finally, close and Apply.


Please @mention me in your reply if you want a response.

Copying DAX from this post? Click here for a hack to quickly replace it with your own table names

Has this post solved your problem? Please Accept as Solution so that others can find it quickly and to let the community know your problem has been solved.
If you found this post helpful, please give Kudos C

I work as a Microsoft trainer and consultant, specialising in Power BI and Power Query.
www.excelwithallison.com

mariussve1
Super User
Super User

Hi,

 

If you want to keep certain rows duplicated on the ID you need to use rank function:

https://dax.guide/rankx/

If you need more help, Please let me know How to rank (wich rules do you want to use to keep the correct row)

 

Br

Marius


Br
Marius
BI Fabrikken
www.bifabrikken.no

Thank you for your response!
 
I used below formula to create above mentioned table. can you please let me knopw how to use Rankx in this formula to filter out duplicate values for each id
 
table = SUMMARIZE('Links table',' Links table'[id],'Links table'[link],"rtc link date",MAX(' Links table'[date])
 
thank you!

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.