The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
In the last few months I have been answering dozens of users’ questions. And there were at least three or four of you with different use cases but the same nominator: I have a table with two columns From and To. And for every distinct value of the column From I want to see all nodes which I can reach by traversing the edges From --> To. This problem has a name: transitive closure.
What is a transitive closure?
The omniscient Wikipedia says to transitive closure: “In mathematics, the transitive closure of a binary relation R on a set X is the smallest relation on X that contains R and is transitive.
For example, if X is a set of airports and xRy means "there is a direct flight from airport x to airport y" (for x and y in X), then the transitive closure of R on X is the relation R+ such that x R+ y means "it is possible to fly from x to y in one or more flights". Informally, the transitive closure gives you the set of all places you can get to from any starting place.”
A picture is worth a thousand words:
What we want to achieve is shown on the following image:
The code
Let’s start with our data sample from the previous illustration. It is a table with two columns From and To. Every row is a directed edge from From to To.
It will be recursion, again, I know, but I don’t know any other way to solve this problem in PowerQuery. But before we come to a recursive function, let us create key-value pairs where key is a node and value is a list of all nodes, which you can reach directly from the key. The result won’t be a table but a record instead. (Why do I do that? It will be much faster – you can read more about this performance boost in my recent post Lookup Table in Power Query.)
// get all direct descendants as a record (key-value pairs From ==> To) DirectDescendantsRecord = List.Accumulate( FromDistinctList, [], (state, current) => state & Expression.Evaluate( "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current] ) ),
This fast access to all direct descendants of a node will be very helpful in the next step, in the recursive function fnTransitiveRelationList. This function gets a list of all nodes which we should visit and the newly created record with key-value pairs. It goes recursively through all the nodes until the list is empty. As a result, you get all nodes which you can visit from the current node.
// get recursively list of all descendants fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list => let result = if List.IsEmpty(toBeDoneList) then {} else let newToBeDoneList = List.RemoveItems( // combine all lists together List.Combine( // get a list of direct descendants for every value List.Transform( toBeDoneList, each Record.FieldOrDefault(directDescendantsRecord, _, {}) ) ), toBeDoneList ) in List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)}) in result,
And that’s all. The last part is only a call of the recursive function.
// get recursive all descendants of current value TransitiveClosure = Table.AddColumn( TableFromList, "AllDescendantRelations", (_) => let allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord), removeItself = List.RemoveItems(allRelations, {[From]}), toText = Text.Combine(removeItself, ";") in toText, type list )
And the whole code in one piece.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTJSitWBsIzBLCMgywTMMgayLMAsUyDLDMwyA7LM4SygbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]), ChangedType = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}}), // buffer the source table BufferedTable = Table.Buffer(ChangedType), // create a distinct sorted list of From values FromDistinctList = List.Sort(List.Distinct(Table.Column(BufferedTable, "From"))), // get all direct descendants as a record (key-value pairs From ==> To) DirectDescendantsRecord = List.Accumulate( FromDistinctList, [], (state, current) => state & Expression.Evaluate( "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current] ) ), // create a table of distinct From values TableFromList = Table.FromList(FromDistinctList, Splitter.SplitByNothing(), {"From"}), // get recursively list of all descendants fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list => let result = if List.IsEmpty(toBeDoneList) then {} else let newToBeDoneList = List.RemoveItems( // combine all lists together List.Combine( // get a list of direct descendants for every value List.Transform( toBeDoneList, each Record.FieldOrDefault(directDescendantsRecord, _, {}) ) ), toBeDoneList ) in List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)}) in result, // get recursive all descendants of current value TransitiveClosure = Table.AddColumn( TableFromList, "AllDescendantRelations", (_) => let allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord), removeItself = List.RemoveItems(allRelations, {[From]}), toText = Text.Combine(removeItself, ";") in toText, type list ) in TransitiveClosure
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.