March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi All,
I'm trying to identify the modified date of SharePoint list items (when a status was first set to "Done").
Therefore, I'm loading the current SP list and extend the list with version history attributes using custom function.
After extending the relevant fields from the added history table, I need to sort, buffer and remove duplicates.
Unfortunately, it takes 5-10 minutes to complete the load, although it is only about 1,500 rows.
It feels like the performance issues starts with sorting the rows - highlighted in red.
Any idea on how I can improve the performance?
let
Quelle = SharePoint.Tables("https://xxxxxxx/", [Implementation=null, ApiVersion=15]),
#"Gefilterte Zeilen" = Table.SelectRows(Quelle, each ([Title] = "Cutover Activities")),
#"Entfernte Spalten" = Table.RemoveColumns(#"Gefilterte Zeilen",{"Id"}),
#"Erweiterte Items" = Table.ExpandTableColumn(#"Entfernte Spalten", "Items", {"ID", "ParentList"}, {"ID", "ParentList"}),
#"Erweiterte ParentList" = Table.ExpandRecordColumn(#"Erweiterte Items", "ParentList", {"ParentWebUrl"}, {"ParentWebUrl"}),
#"Geänderter Typ" = Table.TransformColumnTypes(#"Erweiterte ParentList",{{"ParentWebUrl", type text}, {"ID", type number}}),
#"Spalte nach Trennzeichen teilen" = Table.SplitColumn(#"Geänderter Typ", "ParentWebUrl", Splitter.SplitTextByEachDelimiter({"/"}, QuoteStyle.Csv, true), {"ParentWebUrl.1", "ParentWebUrl.2"}),
#"Geänderter Typ1" = Table.TransformColumnTypes(#"Spalte nach Trennzeichen teilen",{{"ParentWebUrl.1", type text}, {"ParentWebUrl.2", type text}}),
#"Entfernte Spalten1" = Table.RemoveColumns(#"Geänderter Typ1",{"ParentWebUrl.1"}),
#"Umbenannte Spalten" = Table.RenameColumns(#"Entfernte Spalten1",{{"ParentWebUrl.2", "SharePointSiteName"}, {"Title", "SharePointListName"}, {"ID", "ItemID"}}),
#"Aufgerufene benutzerdefinierte Funktion" = Table.AddColumn(#"Umbenannte Spalten", "VersionHistory", each CustomVersionHistory([SharePointSiteName], [SharePointListName], [ItemID])),
#"Erweiterte VersionHistory" = Table.ExpandTableColumn(#"Aufgerufene benutzerdefinierte Funktion", "VersionHistory", {"properties"}, {"VersionHistory.properties"}),
#"Erweiterte VersionHistory.properties" = Table.ExpandTableColumn(#"Erweiterte VersionHistory", "VersionHistory.properties", {"VersionLabel", "field_x005f_12", "Modified"},
{"VersionHistory.properties.VersionLabel", "VersionHistory.properties.field_x005f_12", "VersionHistory.properties.Modified"}),
#"Erweiterte VersionHistory.properties.Modified" = Table.ExpandTableColumn(#"Erweiterte VersionHistory.properties", "VersionHistory.properties.Modified", {"Element:Text"}, {"VersionHistory.properties.Modified.Element:Text"}),
#"Geänderter Typ2" = Table.TransformColumnTypes(#"Erweiterte VersionHistory.properties.Modified",{{"VersionHistory.properties.Modified.Element:Text", type datetime}, {"VersionHistory.properties.VersionLabel", type number}}),
#"Gefilterte Zeilen1" = Table.SelectRows(#"Geänderter Typ2", each ([VersionHistory.properties.field_x005f_12] = "Done")),
#"Sortierte Zeilen" = Table.Sort(#"Gefilterte Zeilen1",{{"VersionHistory.properties.VersionLabel", Order.Ascending}}),
Benutzerdefiniert1 = Table.Buffer(#"Sortierte Zeilen"),
#"Entfernte Duplikate" = Table.Distinct(Benutzerdefiniert1, {"ItemID"})
in
#"Entfernte Duplikate"
Thank you so much for your support!
David
Do Not Sort. (If you can)
Sorting is breaking folding.
Power BI doesn't care if Power Query sorted the data. It's wasted effort.
Unfortunately, sorting is essential part in my approach to identify the first version where the status was modified.
In the end, I'm only interested in the modified date of items when the status field was set from xxx to "Done" for the first time (as in the current list, no actual dates are maintained).
As I'm loading all item versions, I then do the following steps:
- Filter for only for status "Done"
- Sort version lable asc
- Buffer the table after sorting to keep sorting
- Remove duplicates on item id
At the end, I just join the date via ID to the current list and lable it as "actual completion date".
Appreciate any tips on improved approaches as I'm fairly new to PowerBI and right now my full load took about 40min.
Instead of doing this in Power Query, use the ODATA connector to filter the results directly in SharePoint.
No idea if this will be any faster though - also depends on the indexes in your SharePoint list.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.