Re: Recursive function goes in never ending loop

2020HB59567 · ‎11-23-2023

Hello everybody

I am working on a recursive function to build a hierarchy from Active Directory. Execution does not halt. I am not able to understand the mistake that I am committing. Please help

let
    Source = ActiveDirectory.Domains("go.johnsoncontrols.com"),
    Go.JohnsonControls.com = Source{[Domain="Go.JohnsonControls.com"]}[#"Object Categories"],
    user1 = Go.JohnsonControls.com{[Category="user"]}[Objects],
    finaltable= Table.FromRecords({[top.cn="",top.directReports.cn=""]}),
    #"Removed Columns" = Table.RemoveColumns(user1,{"user", "organizationalPerson", "person", "jciUser", "msExchOmaUser", "msExchBaseClass", "msExchIMRecipient", "msExchCertificateInformation", "msExchMultiMediaUser", "msExchMailStorage", "msExchCustomAttributes", "shadowAccount", "posixAccount", "msDS-CloudExtensions", "securityPrincipal", "mailRecipient", "distinguishedName"}),
    #"Expanded top" = Table.ExpandRecordColumn(#"Removed Columns", "top", {"cn", "directReports"}, {"top.cn", "top.directReports"}),
    somestep=Table.SelectRows(#"Expanded top",each [top.directReports]<>null),
    
    getreportees=(adtable as table, idlist as list, n as number )  =>
    let
        
        #"Filtered Rows1" = Table.SelectRows(adtable, each List.Contains(idlist,[top.cn]) ),
        #"Expanded top.directReports" = Table.ExpandListColumn(#"Filtered Rows1", "top.directReports"),
        #"Expanded top.directReports1" = Table.ExpandRecordColumn(#"Expanded top.directReports", "top.directReports", {"cn", "ProvStatus"}, {"top.directReports.cn","top.directReports.ProvStatus"}),
      #"Filtered Rows" = Table.SelectRows(#"Expanded top.directReports1", each [top.directReports.ProvStatus] = "active" or [top.directReports.ProvStatus] = "Active"),
      
      finaltable = 
      if Table.IsEmpty(#"Filtered Rows") then finaltable 
      else Table.Combine({finaltable, Table.RemoveColumns(#"Filtered Rows",{"top.directReports.ProvStatus","displayName"})}),

      y=
      if Table.IsEmpty(#"Filtered Rows") and n = 0 
      then #"Filtered Rows"   
      else if Table.IsEmpty(#"Filtered Rows") and n > 0  then finaltable 
      else @getreportees(somestep,#"Filtered Rows"[top.directReports.cn],n+1)
         
    in
      y, luck=getreportees(somestep,{"cparabn"},0)
      

in
luck

spinfuzer · ‎11-29-2023

Are you sure this hierarchy does not have any weird issues such as a circular relationship?

Could 1 report to 2, 2 report to 3, and 3 report to 1? Build that as a condition to end your loop just in case. If you find any person that is already on the list, end your loop. You could also just stop it at n equal to a little over the max number of levels you think you should have and see if there may be some issues.

See example.

let
    Source = Binary.FromText("W3siRW1wbG95ZWUgSUQiOjEsIk1hbmFnZXIgSUQiOjJ9LHsiRW1wbG95ZWUgSUQiOjIsIk1hbmFnZXIgSUQiOjN9LHsiRW1wbG95ZWUgSUQiOjMsIk1hbmFnZXIgSUQiOjR9LHsiRW1wbG95ZWUgSUQiOjQsIk1hbmFnZXIgSUQiOjV9LHsiRW1wbG95ZWUgSUQiOjUsIk1hbmFnZXIgSUQiOjF9LHsiRW1wbG95ZWUgSUQiOjYsIk1hbmFnZXIgSUQiOjEwfSx7IkVtcGxveWVlIElEIjo3LCJNYW5hZ2VyIElEIjoxMH0seyJFbXBsb3llZSBJRCI6OCwiTWFuYWdlciBJRCI6MTB9LHsiRW1wbG95ZWUgSUQiOjksIk1hbmFnZXIgSUQiOjEwfSx7IkVtcGxveWVlIElEIjoxMCwiTWFuYWdlciBJRCI6bnVsbH1d"),
    #"Imported JSON" = Json.Document(Source,1252),
    #"Converted to Table" = Table.FromList(#"Imported JSON", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    Expand = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Employee ID", "Manager ID"}, {"Employee ID", "Manager ID"}),
    priorStep = Table.AddColumn(Expand, "tree", each {[Employee ID]}),
    build_tree = (tbl as table) =>
        let
            update_tree = 
                Table.TransformColumns(
                    tbl, 
                    {
                        "tree", 
                        each _ & Table.SelectRows(tbl, (T) => T[Employee ID] = List.Last(_))[Manager ID]}),
            loop =
                if update_tree = tbl or List.Transform(update_tree[tree], each List.Distinct(_)) = tbl[tree] 
                then update_tree
                else @build_tree(update_tree)
        in
            loop,
    final = build_tree(priorStep),
    #"Extracted Values" = Table.TransformColumns(final, {"tree", each Text.Combine(List.Transform(_, Text.From), ","), type text})
    
in
    #"Extracted Values"

2020HB59567 · ‎11-29-2023

Hello spinfuzer

I used your code(modified the source to AD as input). Your code works fine. Unfortunately the AD implementation in my org is convoluted and the response is unacceptable. I was able to part about 2000 records in 2 hours. Hence I will not be able to use your code.

Thank you for the code though. I learnt something new!

spinfuzer · ‎11-30-2023

~~Table.Join with the LeftHash algorithm seems to work a lot faster.~~

Actually NestedJoin works faster, had to correct the query.

(c_list as list, p_list as list) =>
let
    Source = 
    Table.AddIndexColumn(
        Table.TransformColumnTypes(
            Table.FromRows(
                List.Zip({c_list,p_list}), 
                {"c", "p"}
            ),
            {
                {"c", type text},
                {"p", type text}
            }
        )
        , "path_index_column"
    ),
    parent_table = Table.AddKey(Table.Buffer(Table.SelectRows(Source, each [p] <> null)), {"c"},true),
    add_path_col = Table.RemoveColumns(Table.AddColumn(Source,"path",each {[c]}), "p"),
    fnPath = 
        (tbl as table) => 
        let
            //join = Table.Join(tbl,"c",Table.PrefixColumns(parent_table,"n"),"n.c", JoinKind.LeftOuter, 3),
            //remove_c = Table.RemoveColumns(join, {"c","n.c", "n.path_index_column"}),
            //before_update_path = Table.RenameColumns(remove_c, {{"n.p", "c"}, {"path","old_path"}}),
            join = Table.NestedJoin(tbl,"c",parent_table,"c", "merged", JoinKind.LeftOuter),
            remove_c = Table.RemoveColumns(join, {"c"}),
            before_update_path = Table.RenameColumns(Table.ExpandTableColumn(remove_c, "merged", {"p"}, {"c"}),{"path","old_path"}),
            update_path = Table.Buffer(Table.AddColumn(before_update_path, "path", each if [c] <> null then [old_path] & {[c]} else [old_path])),
            //update_path = Table.ReplaceValue(before_update_path, each [path], each if [c] <> null then [path] & {[c]} else [path], Replacer.ReplaceValue, {"path"}),
            new_child = update_path[c],
            old_child = join[c],
            final = 
                if List.NonNullCount(new_child) < List.NonNullCount(old_child)
                    then @ fnPath(Table.RemoveColumns(update_path, "old_path"))
                else if List.NonNullCount(new_child) = 0
                    then                     
                        Table.SelectColumns(
                            Table.TransformColumns(
                                Table.AddColumn(
                                    Table.SelectColumns(update_path,{"path_index_column","path"}),
                                    "child",
                                    each [path]{0}
                                ), 
                                {"path", each Text.Combine(_, ",")}
                            ),
                            {"path_index_column","child","path"}
                        )
                else if 
                    List.Transform(
                        Table.SelectRows(update_path, each [c] <> null)[path], 
                        List.Distinct
                    ) 
                    <> 
                    List.Transform(
                        Table.SelectRows(update_path, each [c] <> null)[old_path],
                        List.Distinct
                    )
                // the condition above checks for circular relationships 
                    then @ fnPath(Table.RemoveColumns(update_path, "old_path"))
                else 
                    Table.SelectColumns(
                        Table.TransformColumns(
                            Table.AddColumn(
                                Table.SelectColumns(update_path,{"path_index_column","path"}),
                                "child",
                                each [path]{0}
                            ), 
                            {"path", each Text.Combine(_, ",")}
                        ),
                        {"path_index_column","child","path"}
                    )
                    
        in
            final,
    output = fnPath(add_path_col)
//        Table.FromRows(
 //           List.Zip({c_list,fnPath(add_path_col)}),
  //          {"child_column","path_column"}
  //      )
in
    output

lbendlin · ‎11-30-2023

cannot confirm. Left Join performs worse than Nested Join

I also tried Chris Webb's suggestion on the inner join with sort merge (mode 2) but that is underwhelming as well

spinfuzer · ‎11-30-2023

Testing with 123K+ rows with up to 5 levels on a customer account hierarchy. I am joining up to 5 times maximum instead of doing selectrows 123x5 times.

lbendlin · ‎11-30-2023

challenge accepted 🙂 I am pretty sure our hierarchy is more ragged than yours (plus a couple more levels)

spinfuzer · ‎11-30-2023

Testing on a dummy set right now.

The dataset probably looks something like this:

let
    rows = 100000,
    gap = 10000,
    Source = Table.FromColumns({{1 .. rows},{gap .. (rows-gap)}&List.Repeat({null},gap)},{"EN","MN"}),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"EN", type text}, {"MN", type text}})
in
    #"Changed Type"

lbendlin · ‎11-30-2023

Here is an approach that seems to work rather well (56 seconds for 100K records with up to 9 levels). The join type doesn't seem to matter as it goes against the buffer anyway.

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server",
        "database",
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table",
          CreateNavigationProperties = false
        ]
      )
    ),
    {"EN"},
    true
  ),
  // Group by manager
  Collapse = (tbl) =>
    Table.ExpandTableColumn(
      Table.NestedJoin(
        Table.RenameColumns(
          Table.Group(
            tbl,
            {"MN"},
            {{"Rows", each _, type table [EN = nullable text, Rows = table, MN = nullable text]}}
          ),
          {{"MN", "EN"}}
        ),
        {"EN"},
        Source,
        {"EN"},
        "Join",
        JoinKind.LeftOuter
      ),
      "Join",
      {"MN"}
    ),
  // Group until no more row difference  
  CollapseAll = (tbl) =>
    let
      s = Collapse(tbl)
    in
      if Table.RowCount(tbl) = Table.RowCount(s) then s else @CollapseAll(s),
  Step1 = CollapseAll(Source),
  #"Added Custom" = Table.AddColumn(Step1, "Path", each ""),
  // construct reverse path
  Expand = (tbl) =>
    Table.ReplaceValue(
      Table.ExpandTableColumn(
        Table.SelectColumns(tbl, {"Rows", "Path"}),
        "Rows",
        {"EN", "Rows", "MN"}
      ),
      each [Path],
      each
        if [MN] = [EN] then
          ""
        else if [Path] = "" then
          [MN] & "|" & [EN]
        else
          [Path] & "|" & [EN],
      Replacer.ReplaceValue,
      {"Path"}
    ),
  // Expand until no more nested tables
  ExpandAll = (tbl) =>
    let
      e = Expand(tbl)
    in
      if e[Rows]{0} = null then e else @ExpandAll(e),
  Step2 = ExpandAll(#"Added Custom")
in
  Step2

It goes up and down the hierarchy. On the way up it groups by manager (to avoid duplication of effort) and on the way down it assembles the Path variable.

spinfuzer · ‎12-01-2023

It looks like I wrote my query wrong with my NestedJoin (was refering an old table which put me in an infinite loop). NestedJoin does perform better when I made the correction.

Interesting approach with the group by.

lbendlin · ‎11-28-2023

I am thinking about a different approach. Instead of recalculating the hierarchy for every single user it might be sufficient to create filtered intermediate tables. So let's say you have 100 employees with 10 managers and 2 VPs you should be able to reduce the processing by first calculate the lineage for the 10 managers etc rather than the 100 employees.

lbendlin · ‎11-25-2023

looks like your

#"Filtered Rows"

table is never empty?

2020HB59567 · ‎11-26-2023

Thank you Ibendlin. I checked. If I pass a user who does not any employee reporting to him/her, the table #"Filtered Rows" is empty. Which means after a few passes, the query should halt.

lbendlin · ‎11-27-2023

will you be able to provide some sample data including ragged branches and orphans?

Alternatively you could do this in DAX with the built-in PATH function.

2020HB59567 · ‎11-28-2023

Hi Ibendlin

I will not be able to share the data unfortunately. I have been mulling over the recursive function over last couple of days. I have narrowed down the problem to this line

else @getreportees(somestep,#"Filtered Rows"[top.directReports.cn],n+1)

I am passing the list of all managers who have reporting. I am doing this everytime the recursive function is called. I dont know the details of M. when I pass "somestep" does the parser invoke the connection to AD and execute all the steps prior to somestep?

lbendlin · ‎11-28-2023

This might be of interest to you Replicating PATH function of DAX in Power Query / M – Power BI & Excel are better together (excelnac...

It's a sound theoretical approach but hits some severe performance issues with real world data.

M Language Operators - PowerQuery M | Microsoft Learn

A cyclic value has an infinite expansion when applying structural recursion. The semantics of M makes no special accommodations for such infinite expansions—an attempt to compare cyclic values for equality, for instance, will typically run out of resources and terminate exceptionally.

2020HB59567 · ‎11-28-2023

I used the PATH function as per your suggestion. However the query executes grinds to a halt. It parsed 8000 rows over 8 hours. So this wont work for me 😞

lbendlin · ‎11-29-2023

I did some tests with real world data (100K employees, up to 9 hierarchy layers, as ragged as it gets). Both the recursive function and the version based on List.Generate perform abysmally .

Recursive:

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server", 
        "db", 
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table", 
          CreateNavigationProperties = false
        ]
      )
    ), 
    {"EN"}, 
    true
  ), 
  Path = (c, pt) =>
    let
      p = Table.SelectRows(Source, each [EN] = c)[MN]{0}
    in
      if p = c then pt else @ Path(p, p & "|" & pt), 
  #"Added Custom" = Table.AddColumn(Source, "Path", each Path([MN], [MN] & "|" & [EN])) 

in
  #"Added Custom"

List.Generate:

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server", 
        "db", 
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table", 
          CreateNavigationProperties = false
        ]
      )
    ), 
    {"EN"}, 
    true
  ), 
  #"Added Custom" = Table.AddColumn(Source,"Path",
    each Text.Combine(
      List.Reverse({[EN]} & List.Generate(
            () => [c = [EN], p = [MN]],
            each [c] <> [p],
            each [c = [p], p = Table.SelectRows(Source, (k) => k[EN] = [p])[MN]{0}],
            each [p]
          )
      ),"|"),type text)
in
    #"Added Custom"

No idea where else to apply a buffer to make it at least a little faster.

Still thinking through the idea to store and reuse prior results.

2020HB59567 · ‎11-29-2023

Thank you Ibendlin. I agree with you that this seems to be a power query issue. Earlier we had a DotNet code connecting to active directory and the program used to return results in seconds.

Recursive function goes in never ending loop

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026

FabCon is coming to Atlanta

Recursive function goes in never ending loop

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026