Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
2020HB59567
Frequent Visitor

Recursive function goes in never ending loop

Hello everybody

I am working on a recursive function to build a hierarchy from Active Directory. Execution does not halt. I am not able to understand the mistake that I am committing. Please help

 

 

 

 

let
    Source = ActiveDirectory.Domains("go.johnsoncontrols.com"),
    Go.JohnsonControls.com = Source{[Domain="Go.JohnsonControls.com"]}[#"Object Categories"],
    user1 = Go.JohnsonControls.com{[Category="user"]}[Objects],
    finaltable= Table.FromRecords({[top.cn="",top.directReports.cn=""]}),
    #"Removed Columns" = Table.RemoveColumns(user1,{"user", "organizationalPerson", "person", "jciUser", "msExchOmaUser", "msExchBaseClass", "msExchIMRecipient", "msExchCertificateInformation", "msExchMultiMediaUser", "msExchMailStorage", "msExchCustomAttributes", "shadowAccount", "posixAccount", "msDS-CloudExtensions", "securityPrincipal", "mailRecipient", "distinguishedName"}),
    #"Expanded top" = Table.ExpandRecordColumn(#"Removed Columns", "top", {"cn", "directReports"}, {"top.cn", "top.directReports"}),
    somestep=Table.SelectRows(#"Expanded top",each [top.directReports]<>null),
    
    getreportees=(adtable as table, idlist as list, n as number )  =>
    let
        
        #"Filtered Rows1" = Table.SelectRows(adtable, each List.Contains(idlist,[top.cn]) ),
        #"Expanded top.directReports" = Table.ExpandListColumn(#"Filtered Rows1", "top.directReports"),
        #"Expanded top.directReports1" = Table.ExpandRecordColumn(#"Expanded top.directReports", "top.directReports", {"cn", "ProvStatus"}, {"top.directReports.cn","top.directReports.ProvStatus"}),
      #"Filtered Rows" = Table.SelectRows(#"Expanded top.directReports1", each [top.directReports.ProvStatus] = "active" or [top.directReports.ProvStatus] = "Active"),
      
      finaltable = 
      if Table.IsEmpty(#"Filtered Rows") then finaltable 
      else Table.Combine({finaltable, Table.RemoveColumns(#"Filtered Rows",{"top.directReports.ProvStatus","displayName"})}),

      y=
      if Table.IsEmpty(#"Filtered Rows") and n = 0 
      then #"Filtered Rows"   
      else if Table.IsEmpty(#"Filtered Rows") and n > 0  then finaltable 
      else @getreportees(somestep,#"Filtered Rows"[top.directReports.cn],n+1)
         
    in
      y, luck=getreportees(somestep,{"cparabn"},0)
      

in
luck

 

 

 

18 REPLIES 18
spinfuzer
Solution Sage
Solution Sage

Are you sure this hierarchy does not have any weird issues such as a circular relationship?

Could 1 report to 2, 2 report to 3, and 3 report to 1?  Build that as a condition to end your loop just in case.  If you find any person that is already on the list, end your loop.  You could also just stop it at n equal to a little over the max number of levels you think you should have and see if there may be some issues.

 

See example.

 

spinfuzer_0-1701248994894.png

 

 

 

 

 

let
    Source = Binary.FromText("W3siRW1wbG95ZWUgSUQiOjEsIk1hbmFnZXIgSUQiOjJ9LHsiRW1wbG95ZWUgSUQiOjIsIk1hbmFnZXIgSUQiOjN9LHsiRW1wbG95ZWUgSUQiOjMsIk1hbmFnZXIgSUQiOjR9LHsiRW1wbG95ZWUgSUQiOjQsIk1hbmFnZXIgSUQiOjV9LHsiRW1wbG95ZWUgSUQiOjUsIk1hbmFnZXIgSUQiOjF9LHsiRW1wbG95ZWUgSUQiOjYsIk1hbmFnZXIgSUQiOjEwfSx7IkVtcGxveWVlIElEIjo3LCJNYW5hZ2VyIElEIjoxMH0seyJFbXBsb3llZSBJRCI6OCwiTWFuYWdlciBJRCI6MTB9LHsiRW1wbG95ZWUgSUQiOjksIk1hbmFnZXIgSUQiOjEwfSx7IkVtcGxveWVlIElEIjoxMCwiTWFuYWdlciBJRCI6bnVsbH1d"),
    #"Imported JSON" = Json.Document(Source,1252),
    #"Converted to Table" = Table.FromList(#"Imported JSON", Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    Expand = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Employee ID", "Manager ID"}, {"Employee ID", "Manager ID"}),
    priorStep = Table.AddColumn(Expand, "tree", each {[Employee ID]}),
    build_tree = (tbl as table) =>
        let
            update_tree = 
                Table.TransformColumns(
                    tbl, 
                    {
                        "tree", 
                        each _ & Table.SelectRows(tbl, (T) => T[Employee ID] = List.Last(_))[Manager ID]}),
            loop =
                if update_tree = tbl or List.Transform(update_tree[tree], each List.Distinct(_)) = tbl[tree] 
                then update_tree
                else @build_tree(update_tree)
        in
            loop,
    final = build_tree(priorStep),
    #"Extracted Values" = Table.TransformColumns(final, {"tree", each Text.Combine(List.Transform(_, Text.From), ","), type text})
    
in
    #"Extracted Values"

 

 

 

 

 

 

 

 

Hello spinfuzer

I used your code(modified the source to AD as input). Your code works fine. Unfortunately the AD implementation in my org is convoluted and the response is unacceptable. I was able to part about 2000 records in 2 hours. Hence I will not be able to use your code.

Thank you for the code though. I learnt something new!

Table.Join with the LeftHash algorithm seems to work a lot faster.

 

Actually NestedJoin works faster, had to correct the query.

 

 

(c_list as list, p_list as list) =>
let
    Source = 
    Table.AddIndexColumn(
        Table.TransformColumnTypes(
            Table.FromRows(
                List.Zip({c_list,p_list}), 
                {"c", "p"}
            ),
            {
                {"c", type text},
                {"p", type text}
            }
        )
        , "path_index_column"
    ),
    parent_table = Table.AddKey(Table.Buffer(Table.SelectRows(Source, each [p] <> null)), {"c"},true),
    add_path_col = Table.RemoveColumns(Table.AddColumn(Source,"path",each {[c]}), "p"),
    fnPath = 
        (tbl as table) => 
        let
            //join = Table.Join(tbl,"c",Table.PrefixColumns(parent_table,"n"),"n.c", JoinKind.LeftOuter, 3),
            //remove_c = Table.RemoveColumns(join, {"c","n.c", "n.path_index_column"}),
            //before_update_path = Table.RenameColumns(remove_c, {{"n.p", "c"}, {"path","old_path"}}),
            join = Table.NestedJoin(tbl,"c",parent_table,"c", "merged", JoinKind.LeftOuter),
            remove_c = Table.RemoveColumns(join, {"c"}),
            before_update_path = Table.RenameColumns(Table.ExpandTableColumn(remove_c, "merged", {"p"}, {"c"}),{"path","old_path"}),
            update_path = Table.Buffer(Table.AddColumn(before_update_path, "path", each if [c] <> null then [old_path] & {[c]} else [old_path])),
            //update_path = Table.ReplaceValue(before_update_path, each [path], each if [c] <> null then [path] & {[c]} else [path], Replacer.ReplaceValue, {"path"}),
            new_child = update_path[c],
            old_child = join[c],
            final = 
                if List.NonNullCount(new_child) < List.NonNullCount(old_child)
                    then @ fnPath(Table.RemoveColumns(update_path, "old_path"))
                else if List.NonNullCount(new_child) = 0
                    then                     
                        Table.SelectColumns(
                            Table.TransformColumns(
                                Table.AddColumn(
                                    Table.SelectColumns(update_path,{"path_index_column","path"}),
                                    "child",
                                    each [path]{0}
                                ), 
                                {"path", each Text.Combine(_, ",")}
                            ),
                            {"path_index_column","child","path"}
                        )
                else if 
                    List.Transform(
                        Table.SelectRows(update_path, each [c] <> null)[path], 
                        List.Distinct
                    ) 
                    <> 
                    List.Transform(
                        Table.SelectRows(update_path, each [c] <> null)[old_path],
                        List.Distinct
                    )
                // the condition above checks for circular relationships 
                    then @ fnPath(Table.RemoveColumns(update_path, "old_path"))
                else 
                    Table.SelectColumns(
                        Table.TransformColumns(
                            Table.AddColumn(
                                Table.SelectColumns(update_path,{"path_index_column","path"}),
                                "child",
                                each [path]{0}
                            ), 
                            {"path", each Text.Combine(_, ",")}
                        ),
                        {"path_index_column","child","path"}
                    )
                    
        in
            final,
    output = fnPath(add_path_col)
//        Table.FromRows(
 //           List.Zip({c_list,fnPath(add_path_col)}),
  //          {"child_column","path_column"}
  //      )
in
    output

 

 

 

 

 

 

spinfuzer_0-1701345038080.png

 

cannot confirm.  Left Join performs worse than Nested Join

 

lbendlin_0-1701357119034.png

I also tried Chris Webb's suggestion on the inner join with sort merge (mode 2) but that is underwhelming as well

 

lbendlin_1-1701357977003.png

 

Testing with 123K+ rows with up to 5 levels on a customer account hierarchy.  I am joining up to 5 times maximum instead of doing selectrows 123x5 times.

 

 

spinfuzer_0-1701366137226.png

 

challenge accepted 🙂  I am pretty sure our hierarchy is more ragged than yours (plus a couple more levels)

Testing on a dummy set right now.

 

The dataset probably looks something like this:

 

let
    rows = 100000,
    gap = 10000,
    Source = Table.FromColumns({{1 .. rows},{gap .. (rows-gap)}&List.Repeat({null},gap)},{"EN","MN"}),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"EN", type text}, {"MN", type text}})
in
    #"Changed Type"

spinfuzer_0-1701370196292.png

 

Here is an approach that seems to work rather well (56 seconds for 100K records with up to 9 levels). The join type doesn't seem to matter as it goes against the buffer anyway.

 

 

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server",
        "database",
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table",
          CreateNavigationProperties = false
        ]
      )
    ),
    {"EN"},
    true
  ),
  // Group by manager
  Collapse = (tbl) =>
    Table.ExpandTableColumn(
      Table.NestedJoin(
        Table.RenameColumns(
          Table.Group(
            tbl,
            {"MN"},
            {{"Rows", each _, type table [EN = nullable text, Rows = table, MN = nullable text]}}
          ),
          {{"MN", "EN"}}
        ),
        {"EN"},
        Source,
        {"EN"},
        "Join",
        JoinKind.LeftOuter
      ),
      "Join",
      {"MN"}
    ),
  // Group until no more row difference  
  CollapseAll = (tbl) =>
    let
      s = Collapse(tbl)
    in
      if Table.RowCount(tbl) = Table.RowCount(s) then s else @CollapseAll(s),
  Step1 = CollapseAll(Source),
  #"Added Custom" = Table.AddColumn(Step1, "Path", each ""),
  // construct reverse path
  Expand = (tbl) =>
    Table.ReplaceValue(
      Table.ExpandTableColumn(
        Table.SelectColumns(tbl, {"Rows", "Path"}),
        "Rows",
        {"EN", "Rows", "MN"}
      ),
      each [Path],
      each
        if [MN] = [EN] then
          ""
        else if [Path] = "" then
          [MN] & "|" & [EN]
        else
          [Path] & "|" & [EN],
      Replacer.ReplaceValue,
      {"Path"}
    ),
  // Expand until no more nested tables
  ExpandAll = (tbl) =>
    let
      e = Expand(tbl)
    in
      if e[Rows]{0} = null then e else @ExpandAll(e),
  Step2 = ExpandAll(#"Added Custom")
in
  Step2

 

 

It goes up and down the hierarchy. On the way up it groups by manager (to avoid duplication of effort) and on the way down it assembles the Path variable.

 

 

It looks like I wrote my query wrong with my NestedJoin (was refering an old table which put me in an infinite loop).  NestedJoin does perform better when I made the correction.

 

 

Interesting approach with the group by.

 

 

 

lbendlin
Super User
Super User

I am thinking about a different approach. Instead of recalculating the hierarchy for every single user it might be sufficient to create filtered intermediate tables.  So let's say you have 100 employees with 10 managers and 2 VPs you should be able to reduce the processing by first calculate the lineage for the 10 managers etc rather than the 100 employees.

lbendlin
Super User
Super User

looks like your 

#"Filtered Rows"

table is never empty?

Thank you Ibendlin. I checked. If I pass a user who does not any employee reporting to him/her, the table #"Filtered Rows" is empty. Which means after a few passes, the query should halt. 

will you be able to provide some sample data including ragged branches and orphans?

Alternatively you could do this in DAX with the built-in PATH function.

Hi Ibendlin

I will not be able to share the data unfortunately. I have been mulling over the recursive function over last couple of days. I have narrowed down the problem to  this line

 

else @getreportees(somestep,#"Filtered Rows"[top.directReports.cn],n+1)

I am passing the list of all managers who have reporting. I am doing this everytime the recursive function is called. I dont know the details of M. when I pass "somestep" does the parser invoke the connection to AD and execute all the steps prior to somestep?

This might be of interest to you Replicating PATH function of DAX in Power Query / M – Power BI & Excel are better together (excelnac...

 

It's a sound theoretical approach but hits some severe performance issues with real world data.

M Language Operators - PowerQuery M | Microsoft Learn

A cyclic value has an infinite expansion when applying structural recursion. The semantics of M makes no special accommodations for such infinite expansions—an attempt to compare cyclic values for equality, for instance, will typically run out of resources and terminate exceptionally.

I used the PATH function as per your suggestion. However the query executes grinds to a halt. It parsed 8000 rows over 8 hours. So this wont work for me 😞

I did some tests with real world data (100K employees, up to 9 hierarchy layers, as ragged as it gets).  Both the recursive function and the version based on List.Generate perform abysmally .

 

Recursive:

 

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server", 
        "db", 
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table", 
          CreateNavigationProperties = false
        ]
      )
    ), 
    {"EN"}, 
    true
  ), 
  Path = (c, pt) =>
    let
      p = Table.SelectRows(Source, each [EN] = c)[MN]{0}
    in
      if p = c then pt else @ Path(p, p & "|" & pt), 
  #"Added Custom" = Table.AddColumn(Source, "Path", each Path([MN], [MN] & "|" & [EN])) 

in
  #"Added Custom"

 

List.Generate:

 

let
  Source = Table.AddKey(
    Table.Buffer(
      Sql.Database(
        "server", 
        "db", 
        [
          Query = "select employeenumber [EN],manageremployeenumber [MN] from table", 
          CreateNavigationProperties = false
        ]
      )
    ), 
    {"EN"}, 
    true
  ), 
  #"Added Custom" = Table.AddColumn(Source,"Path",
    each Text.Combine(
      List.Reverse({[EN]} & List.Generate(
            () => [c = [EN], p = [MN]],
            each [c] <> [p],
            each [c = [p], p = Table.SelectRows(Source, (k) => k[EN] = [p])[MN]{0}],
            each [p]
          )
      ),"|"),type text)
in
    #"Added Custom"

 

No idea where else to apply a buffer to make it at least a little faster.

 

Still thinking through the idea to store and reuse prior results.

Thank you Ibendlin. I agree with you that this seems to be a power query issue. Earlier we had a DotNet code connecting to active directory and the program used to return results in seconds. 

 

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors