March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
I've got this table that contains ID's of calls, where most of the calls have sub ID's. I'm looking for a solution where I can only preserve the ID's with the highest index, so I can see the end status of every call. So I only want the yellow ones ...
Is this possible?
Solved! Go to Solution.
Since, you have already sorted Index column (if not sort on Index), See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RY9BCoMwFETvkrWL6rSaLm21eoaKDD1E72/mJzCLgcfjE/KOIw331KVZY+A/nV2xD1vhr9rRdrSdbCfbbJttnwVfGgOrxa3gW2Ng/QN62963Q8FFY2C7hS18q6BVY2CzCvpoDGwvKGjTCLdBQbtGuA3Z1m1Q0FcjWtt5AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Index = _t, ID1 = _t, ID2 = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index.1", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Select", each try if [ID1]<>#"Added Index"[ID1]{[Index.1]+1} then "Y" else "N" otherwise "Y"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Select] = "Y")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index.1", "Select"})
in
#"Removed Columns"
Since, you have already sorted Index column (if not sort on Index), See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("RY9BCoMwFETvkrWL6rSaLm21eoaKDD1E72/mJzCLgcfjE/KOIw331KVZY+A/nV2xD1vhr9rRdrSdbCfbbJttnwVfGgOrxa3gW2Ng/QN62963Q8FFY2C7hS18q6BVY2CzCvpoDGwvKGjTCLdBQbtGuA3Z1m1Q0FcjWtt5AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Index = _t, ID1 = _t, ID2 = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index.1", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(#"Added Index", "Select", each try if [ID1]<>#"Added Index"[ID1]{[Index.1]+1} then "Y" else "N" otherwise "Y"),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([Select] = "Y")),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Index.1", "Select"})
in
#"Removed Columns"
the solution work fine on my smalle testset, but it takes to much processing power when I try to apply this to the real dataset. After 1 hour of refreshing only 350 rowes were loaded. Without this code it takes less dan 1 minute ...
Below are 2 approaches tested on 1 million records. First approach based on using buffered list to filter took 160 seconds whereas second approach was based on @AlexisOlson 's approach of using Records (for Max) in the column. The second approach is faster and took 100 seconds for 1 million records.
Excel file (contains VBA code for timing queries) used for this testing with both approaches can be downloaded from - https://1drv.ms/x/s!Akd5y6ruJhvhuXoDyaB0q3-Gj7QA?e=m2SvxC
Edit - There is also an Inner Merge approach which is also quite fast. This is approach 3 below.
Approach 1
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type number}}),
List = List.Buffer(#"Grouped Rows"[Max]),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each (List.Contains(List,[Index])))
in
#"Filtered Rows"
Approach 2
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type nullable number}, {"All", each Table.Max(_, "Index"), type record}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"ID1", "Max"}),
#"Expanded All" = Table.ExpandRecordColumn(#"Removed Columns", "All", {"Index", "ID1", "ID2", "Status"}, {"Index", "ID1", "ID2", "Status"})
in
#"Expanded All"
Approach 3
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Index", Int64.Type}, {"ID1", type text}, {"ID2", type text}, {"Status", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID1"}, {{"Max", each List.Max([Index]), type nullable number}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Index"}, #"Grouped Rows", {"Max"}, "Removed Columns", JoinKind.Inner),
#"Removed Columns1" = Table.RemoveColumns(#"Merged Queries",{"Removed Columns"})
in
#"Removed Columns1"
@Vijay_A_Verma thnx
I can not open or download the excel file. I get the message the file is to big ....
Download this zip file https://1drv.ms/u/s!Akd5y6ruJhvhuX4Rq3hq9KLVP3lH?e=R0gm7S
Yeah, using table lookup logic inside of custom columns tends to be highly inefficient when the table is very large.
I'd recommend trying the last couple of methods I describe in my blog post here:
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.