Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Nolock

Transitive Closure in Power Query

In the last few months I have been answering dozens of users’ questions. And there were at least three or four of you with different use cases but the same nominator: I have a table with two columns From and To. And for every distinct value of the column From I want to see all nodes which I can reach by traversing the edges From --> To. This problem has a name: transitive closure.

 

What is a transitive closure?

The omniscient Wikipedia says to transitive closure: “In mathematics, the transitive closure of a binary relation R on a set X is the smallest relation on X that contains R and is transitive.
For example, if X is a set of airports and xRy means "there is a direct flight from airport x to airport y" (for x and y in X), then the transitive closure of R on X is the relation R+ such that x R+ y means "it is possible to fly from x to y in one or more flights". Informally, the transitive closure gives you the set of all places you can get to from any starting place.”

A picture is worth a thousand words:

 

1.png

 

What we want to achieve is shown on the following image:

23.png

 

The code

Let’s start with our data sample from the previous illustration. It is a table with two columns From and To. Every row is a directed edge from From to To.

It will be recursion, again, I know, but I don’t know any other way to solve this problem in PowerQuery. But before we come to a recursive function, let us create key-value pairs where key is a node and value is a list of all nodes, which you can reach directly from the key. The result won’t be a table but a record instead. (Why do I do that? It will be much faster – you can read more about this performance boost in my recent post Lookup Table in Power Query.)

 

// get all direct descendants as a record (key-value pairs From ==> To)
DirectDescendantsRecord = List.Accumulate(
    FromDistinctList,
    [],
    (state, current) =>
        state & 
        Expression.Evaluate(
            "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", 
            [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current]
        )
),

 

This fast access to all direct descendants of a node will be very helpful in the next step, in the recursive function fnTransitiveRelationList. This function gets a list of all nodes which we should visit and the newly created record with key-value pairs. It goes recursively through all the nodes until the list is empty. As a result, you get all nodes which you can visit from the current node.

 

// get recursively list of all descendants
fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
    let
        result = 
            if List.IsEmpty(toBeDoneList) then
                {}
            else
                let
                    newToBeDoneList = List.RemoveItems(
                        // combine all lists together
                        List.Combine(
                            // get a list of direct descendants for every value
                            List.Transform(
                                toBeDoneList, 
                                each Record.FieldOrDefault(directDescendantsRecord, _, {})
                            )
                        ),
                        toBeDoneList
                    )
                in
                    List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
    in
        result,

 

And that’s all. The last part is only a call of the recursive function.

 

// get recursive all descendants of current value
TransitiveClosure = Table.AddColumn(
    TableFromList, 
    "AllDescendantRelations", 
    (_) =>
        let 
            allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord),
            removeItself = List.RemoveItems(allRelations, {[From]}),
            toText = Text.Combine(removeItself, ";")
        in
            toText,
    type list
)

And the whole code in one piece.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTJSitWBsIzBLCMgywTMMgayLMAsUyDLDMwyA7LM4SygbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]),
    ChangedType = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}}),

    // buffer the source table
    BufferedTable = Table.Buffer(ChangedType),

    // create a distinct sorted list of From values
    FromDistinctList = List.Sort(List.Distinct(Table.Column(BufferedTable, "From"))),

    // get all direct descendants as a record (key-value pairs From ==> To)
    DirectDescendantsRecord = List.Accumulate(
        FromDistinctList,
        [],
        (state, current) =>
            state & 
            Expression.Evaluate(
                "[" & current & "= Table.Column(Table.SelectRows(BufferedTable, each [From] = current), ""To"")]", 
                [Table.Column = Table.Column, Table.SelectRows = Table.SelectRows, BufferedTable = BufferedTable, current = current]
            )
    ),

    // create a table of distinct From values
    TableFromList = Table.FromList(FromDistinctList, Splitter.SplitByNothing(), {"From"}),

    // get recursively list of all descendants
    fnTransitiveRelationList = (toBeDoneList as list, directDescendantsRecord as record) as list =>
        let
            result = 
                if List.IsEmpty(toBeDoneList) then
                    {}
                else
                    let
                        newToBeDoneList = List.RemoveItems(
                            // combine all lists together
                            List.Combine(
                                // get a list of direct descendants for every value
                                List.Transform(
                                    toBeDoneList, 
                                    each Record.FieldOrDefault(directDescendantsRecord, _, {})
                                )
                            ),
                            toBeDoneList
                        )
                    in
                        List.Union({toBeDoneList, newToBeDoneList, @fnTransitiveRelationList(newToBeDoneList, directDescendantsRecord)})
        in
            result,

    // get recursive all descendants of current value
    TransitiveClosure = Table.AddColumn(
        TableFromList, 
        "AllDescendantRelations", 
        (_) =>
            let 
                allRelations = fnTransitiveRelationList({[From]}, DirectDescendantsRecord),
                removeItself = List.RemoveItems(allRelations, {[From]}),
                toText = Text.Combine(removeItself, ";")
            in
                toText,
        type list
    )
in
    TransitiveClosure

 

Comments