Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
markpendlebury
Frequent Visitor

Remove Rows in Dataset based on most recent ID and email address

Hi everyone hope someone can help me.....I'm pretty new to PowerBI but loving how flexible it is. However, as anything new I've come up against something I think should be pretty simple.

 

I have a dataset that I'm brining into Power Bi (it's an SQL table)

 

The table has an incremental ID and an email address amongst many others. For example....

 

ID, Email

1, 123@xyz.com

2, 345@abc.com

3, 123@xyz.com

4, abc@123.com

5, dce@456.com

6, abc@123.com

 

I need to remove all the rows from the dataset that aren't the latest ones for each email.

 

The resulting table would then ONLY include ;

 

ID, Email

2, 345@abc.com

3, 123@xyz.com

5, dce@456.com

6, abc@123.com

 

I've tried many ways to do this and using the GROUP function and MAX as a condition which is returning the correct ID but I need to actually remove them completely.

 

I'm returning all rows below so that I can cherry pick which columns I want once I've (hopefully) removed the rows I don't want.

 

latestguaranteeid.PNG

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

I need to remove the rows as I need the email address to be unique but ensure it's the most recent record so that I can join it to another table. 

 

Thanks in advance, Mark.

 

 

1 ACCEPTED SOLUTION
Phil_Seamark
Microsoft Employee
Microsoft Employee

Hi @markpendlebury

 

I think this calculated table might be getting close.  Just replace my Table2 with the name of your table

 

Table 3 = SELECTCOLUMNS(
            FILTER(
                CROSSJOIN(
                    SELECTCOLUMNS( 
                        SUMMARIZECOLUMNS(
                            'Table2'[Email],
                            "MAX ID",
                            MAX('Table2'[ID])
                            ),
                            "MAX Email",[Email],
                            "MAX ID",[MAX ID]
                          ) ,
            'Table2'
            ),
            'Table2'[Email]=[MAX Email] && 
            Table2[ID] = [MAX ID]
            ),
            "ID",[ID],
            "Email" , [MAX Email]
            )

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

View solution in original post

5 REPLIES 5
v-chuncz-msft
Community Support
Community Support

@markpendlebury,

 

You could modify the code in Advanced Editor.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0MnaoqKzSS87PVYrViVYyAooZm5g6JCYlw8WMsagzAYoB1TgAxeFipkCxlORUBxNTM7iYGbq6WAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Email = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Email", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Email"}, {{"Data", each _, type table}}),
    Transformed = Table.TransformColumns(#"Grouped Rows", {{"Data", each Table.AddIndexColumn(Table.Sort(_, {{"ID", Order.Descending}}), "Rank", 1, 1)}}),
    #"Expanded IDs" = Table.ExpandTableColumn(Transformed, "Data", {"ID", "Rank"}, {"ID", "Rank"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded IDs", each ([Rank] = 1)),
    #"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"ID", Order.Ascending}})
in
    #"Sorted Rows"
Community Support Team _ Sam Zha
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

@v-chuncz-msft Thanks for the answer. In the end I didn't attempt this option as I wasn't sure how or which elements to incorporate and I got the other solution working.

 

I do potentially like this option as it would solve the problem without creating a new table.

Phil_Seamark
Microsoft Employee
Microsoft Employee

Hi @markpendlebury

 

I think this calculated table might be getting close.  Just replace my Table2 with the name of your table

 

Table 3 = SELECTCOLUMNS(
            FILTER(
                CROSSJOIN(
                    SELECTCOLUMNS( 
                        SUMMARIZECOLUMNS(
                            'Table2'[Email],
                            "MAX ID",
                            MAX('Table2'[ID])
                            ),
                            "MAX Email",[Email],
                            "MAX ID",[MAX ID]
                          ) ,
            'Table2'
            ),
            'Table2'[Email]=[MAX Email] && 
            Table2[ID] = [MAX ID]
            ),
            "ID",[ID],
            "Email" , [MAX Email]
            )

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

@Phil_Seamark Thanks for this answer. I thought I might need to create this as a Table and this appears to work perfectly.

I did need to add my additional fields but that was pretty simple. Thanks

 

My original source table was 200,000 rows so I've had to restrict the incoming dataset (which is ok) as the refresh was timing out the Power BI Gateway but I think I've got that working now.

 

Thanks

Or you could go the calculated column way and have a column that shows a 1 or 0 depending if the row is the latest.

 

Just add the following calcualted column and then you can filter on it in the Report, Page or Visual filters.

 

Is Latest = IF( CALCULATE(
                COUNTROWS('Table2'),
                FILTER(
                    ALL(Table2),
                    'Table2'[ID] > EARLIER('Table2'[ID]) 
                    && 'Table2'[Email] = EARLIER('Table2'[Email])
                    )
                    )+0=0,1,0) 

To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.