Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hi everyone hope someone can help me.....I'm pretty new to PowerBI but loving how flexible it is. However, as anything new I've come up against something I think should be pretty simple.
I have a dataset that I'm brining into Power Bi (it's an SQL table)
The table has an incremental ID and an email address amongst many others. For example....
ID, Email
1, 123@xyz.com
2, 345@abc.com
3, 123@xyz.com
4, abc@123.com
5, dce@456.com
6, abc@123.com
I need to remove all the rows from the dataset that aren't the latest ones for each email.
The resulting table would then ONLY include ;
ID, Email
2, 345@abc.com
3, 123@xyz.com
5, dce@456.com
6, abc@123.com
I've tried many ways to do this and using the GROUP function and MAX as a condition which is returning the correct ID but I need to actually remove them completely.
I'm returning all rows below so that I can cherry pick which columns I want once I've (hopefully) removed the rows I don't want.
I need to remove the rows as I need the email address to be unique but ensure it's the most recent record so that I can join it to another table.
Thanks in advance, Mark.
Solved! Go to Solution.
I think this calculated table might be getting close. Just replace my Table2 with the name of your table
Table 3 = SELECTCOLUMNS( FILTER( CROSSJOIN( SELECTCOLUMNS( SUMMARIZECOLUMNS( 'Table2'[Email], "MAX ID", MAX('Table2'[ID]) ), "MAX Email",[Email], "MAX ID",[MAX ID] ) , 'Table2' ), 'Table2'[Email]=[MAX Email] && Table2[ID] = [MAX ID] ), "ID",[ID], "Email" , [MAX Email] )
You could modify the code in Advanced Editor.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0MnaoqKzSS87PVYrViVYyAooZm5g6JCYlw8WMsagzAYoB1TgAxeFipkCxlORUBxNTM7iYGbq6WAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Email = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Email", type text}}), #"Grouped Rows" = Table.Group(#"Changed Type", {"Email"}, {{"Data", each _, type table}}), Transformed = Table.TransformColumns(#"Grouped Rows", {{"Data", each Table.AddIndexColumn(Table.Sort(_, {{"ID", Order.Descending}}), "Rank", 1, 1)}}), #"Expanded IDs" = Table.ExpandTableColumn(Transformed, "Data", {"ID", "Rank"}, {"ID", "Rank"}), #"Filtered Rows" = Table.SelectRows(#"Expanded IDs", each ([Rank] = 1)), #"Sorted Rows" = Table.Sort(#"Filtered Rows",{{"ID", Order.Ascending}}) in #"Sorted Rows"
@v-chuncz-msft Thanks for the answer. In the end I didn't attempt this option as I wasn't sure how or which elements to incorporate and I got the other solution working.
I do potentially like this option as it would solve the problem without creating a new table.
I think this calculated table might be getting close. Just replace my Table2 with the name of your table
Table 3 = SELECTCOLUMNS( FILTER( CROSSJOIN( SELECTCOLUMNS( SUMMARIZECOLUMNS( 'Table2'[Email], "MAX ID", MAX('Table2'[ID]) ), "MAX Email",[Email], "MAX ID",[MAX ID] ) , 'Table2' ), 'Table2'[Email]=[MAX Email] && Table2[ID] = [MAX ID] ), "ID",[ID], "Email" , [MAX Email] )
@Phil_Seamark Thanks for this answer. I thought I might need to create this as a Table and this appears to work perfectly.
I did need to add my additional fields but that was pretty simple. Thanks
My original source table was 200,000 rows so I've had to restrict the incoming dataset (which is ok) as the refresh was timing out the Power BI Gateway but I think I've got that working now.
Thanks
Or you could go the calculated column way and have a column that shows a 1 or 0 depending if the row is the latest.
Just add the following calcualted column and then you can filter on it in the Report, Page or Visual filters.
Is Latest = IF( CALCULATE( COUNTROWS('Table2'), FILTER( ALL(Table2), 'Table2'[ID] > EARLIER('Table2'[ID]) && 'Table2'[Email] = EARLIER('Table2'[Email]) ) )+0=0,1,0)
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.
If you love stickers, then you will definitely want to check out our Community Sticker Challenge!
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
144 | |
75 | |
63 | |
51 | |
48 |
User | Count |
---|---|
204 | |
86 | |
64 | |
59 | |
56 |