March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi everyone hope someone can help me.....I'm pretty new to PowerBI but loving how flexible it is. However, as anything new I've come up against something I think should be pretty simple.
I have a dataset that I'm brining into Power Bi (it's an SQL table)
The table has an incremental ID and an email address amongst many others. For example....
ID, Email
1, 123@xyz.com
2, 345@abc.com
3, 123@xyz.com
4, abc@123.com
5, dce@456.com
6, abc@123.com
I need to remove all the rows from the dataset that aren't the latest ones for each email.
The resulting table would then ONLY include ;
ID, Email
2, 345@abc.com
3, 123@xyz.com
5, dce@456.com
6, abc@123.com
I've tried many ways to do this and using the GROUP function and MAX as a condition which is returning the correct ID but I need to actually remove them completely.
I'm returning all rows below so that I can cherry pick which columns I want once I've (hopefully) removed the rows I don't want.
I need to remove the rows as I need the email address to be unique but ensure it's the most recent record so that I can join it to another table.
Thanks in advance, Mark.
Solved! Go to Solution.
I think this calculated table might be getting close. Just replace my Table2 with the name of your table
Table 3 = SELECTCOLUMNS( FILTER( CROSSJOIN( SELECTCOLUMNS( SUMMARIZECOLUMNS( 'Table2'[Email], "MAX ID", MAX('Table2'[ID]) ), "MAX Email",[Email], "MAX ID",[MAX ID] ) , 'Table2' ), 'Table2'[Email]=[MAX Email] && Table2[ID] = [MAX ID] ), "ID",[ID], "Email" , [MAX Email] )
You could modify the code in Advanced Editor.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0MnaoqKzSS87PVYrViVYyAooZm5g6JCYlw8WMsagzAYoB1TgAxeFipkCxlORUBxNTM7iYGbq6WAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Email = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Email", type text}}), #"Grouped Rows" = Table.Group(#"Changed Type", {"Email"}, {{"Data", each _, type table}}), Transformed = Table.TransformColumns(#"Grouped Rows", {{"Data", each Table.AddIndexColumn(Table.Sort(_, {{"ID", Order.Descending}}), "Rank", 1, 1)}}), #"Expanded IDs" = Table.ExpandTableColumn(Transformed, "Data", {"ID", "Rank"}, {"ID", "Rank"}), #"Filtered Rows" = Table.SelectRows(#"Expanded IDs", each ([Rank] = 1)), #"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"ID", Order.Ascending}}) in #"Sorted Rows"
@v-chuncz-msft Thanks for the answer. In the end I didn't attempt this option as I wasn't sure how or which elements to incorporate and I got the other solution working.
I do potentially like this option as it would solve the problem without creating a new table.
I think this calculated table might be getting close. Just replace my Table2 with the name of your table
Table 3 = SELECTCOLUMNS( FILTER( CROSSJOIN( SELECTCOLUMNS( SUMMARIZECOLUMNS( 'Table2'[Email], "MAX ID", MAX('Table2'[ID]) ), "MAX Email",[Email], "MAX ID",[MAX ID] ) , 'Table2' ), 'Table2'[Email]=[MAX Email] && Table2[ID] = [MAX ID] ), "ID",[ID], "Email" , [MAX Email] )
@Phil_Seamark Thanks for this answer. I thought I might need to create this as a Table and this appears to work perfectly.
I did need to add my additional fields but that was pretty simple. Thanks
My original source table was 200,000 rows so I've had to restrict the incoming dataset (which is ok) as the refresh was timing out the Power BI Gateway but I think I've got that working now.
Thanks
Or you could go the calculated column way and have a column that shows a 1 or 0 depending if the row is the latest.
Just add the following calcualted column and then you can filter on it in the Report, Page or Visual filters.
Is Latest = IF( CALCULATE( COUNTROWS('Table2'), FILTER( ALL(Table2), 'Table2'[ID] > EARLIER('Table2'[ID]) && 'Table2'[Email] = EARLIER('Table2'[Email]) ) )+0=0,1,0)
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
128 | |
90 | |
75 | |
56 | |
53 |
User | Count |
---|---|
200 | |
104 | |
101 | |
67 | |
55 |