I am migrating an Alteryx workflow that uses an iterative macro to a Dataflow Gen2 equivalent, using a Delta Lake table from the lakehouse as the data source.
The function below carries out the required transformations on the table; the table's structure (column names, number of columns, column order, and row count) remains unchanged. The transformation only affects the underlying data. Here is the function:
Suppose the name of this function is fxMyFunc
(tbl as table) =>
let
    Source = tbl,
    // Running on-hand quantity, first pass: row 1 keeps its value; every other
    // row adds its transaction quantity to the previous row's quantity from the input table.
    AddedCustom = Table.TransformColumnTypes(
        Table.AddColumn(
            Source,
            "On-Hand_Inventory_Quantity_Inter",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Quantity"]
                else
                    [Transaction_Quantity__NOS_]
                        + Source[#"On-Hand_Inventory_Quantity"]{[RecordID] - 2}
        ),
        {{"On-Hand_Inventory_Quantity_Inter", type number}}
    ),
    RemovedCols = Table.RemoveColumns(AddedCustom, {"On-Hand_Inventory_Quantity"}),
    // Second pass: for rows after the first, rebuild the on-hand quantity by
    // summing the intermediate values from row 2 through the current row.
    AddedCustom1 = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols,
            "On-Hand_Inventory_Quantity",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Quantity_Inter"]
                else
                    List.Sum(
                        List.Range(RemovedCols[#"On-Hand_Inventory_Quantity_Inter"], 1, [RecordID] - 1)
                    )
        ),
        {{"On-Hand_Inventory_Quantity", type number}}
    ),
    RemovedCols1 = Table.RemoveColumns(
        AddedCustom1,
        {"On-Hand_Inventory_Quantity_Inter", "Inbound/Outbound_Value_Adjustment"}
    ),
    // Transfer-out ("TOUT") rows get their value from the prior row's unit cost.
    AddedCustom2 = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols1,
            "Inbound/Outbound_Value_Adjustment",
            each
                if [Type_of_Entry__RECPT_SUB_TYP_CD_] = "TOUT" then
                    RemovedCols1[Unit_Cost]{[RecordID] - 2} * [Transaction_Quantity__NOS_]
                else
                    [#"Inbound/Outbound_Value"]
        ),
        {{"Inbound/Outbound_Value_Adjustment", type number}}
    ),
    // Running on-hand value, mirroring the quantity logic above.
    AddedCustom3 = Table.TransformColumnTypes(
        Table.AddColumn(
            AddedCustom2,
            "On-Hand_Inventory_Value_Inter",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Value"]
                else
                    [#"Inbound/Outbound_Value_Adjustment"]
                        + AddedCustom2[#"On-Hand_Inventory_Value"]{[RecordID] - 2}
        ),
        {{"On-Hand_Inventory_Value_Inter", type number}}
    ),
    RemovedCols2 = Table.RemoveColumns(AddedCustom3, {"On-Hand_Inventory_Value"}),
    #"Added custom 4" = Table.TransformColumnTypes(
        Table.AddColumn(
            RemovedCols2,
            "On-Hand_Inventory_Value",
            each
                if [RecordID] = 1 then
                    [#"On-Hand_Inventory_Value_Inter"]
                else
                    List.Sum(List.Range(RemovedCols2[#"On-Hand_Inventory_Value_Inter"], 1, [RecordID] - 1))
        ),
        {{"On-Hand_Inventory_Value", type number}}
    ),
    #"Removed columns 3" = Table.RemoveColumns(
        #"Added custom 4",
        {"On-Hand_Inventory_Value_Inter", "Unit_Cost"}
    ),
    // Recompute unit cost from the updated value and quantity.
    #"Added custom" = Table.TransformColumnTypes(
        Table.AddColumn(
            #"Removed columns 3",
            "Unit_Cost",
            each
                if [#"On-Hand_Inventory_Quantity"] = 0 then
                    0
                else
                    [#"On-Hand_Inventory_Value"] / [#"On-Hand_Inventory_Quantity"]
        ),
        {{"Unit_Cost", type number}}
    ),
    #"Rounded off" = Table.TransformColumns(
        #"Added custom",
        {{"Unit_Cost", each Number.Round(_, 2), type number}}
    ),
    #"Reordered columns" = Table.ReorderColumns(
        #"Rounded off",
        {
            "RecordID",
            "Posted_Date",
            "Type_of_Entry__RECPT_SUB_TYP_CD_",
            "Transaction_Quantity__NOS_",
            "Inbound/Outbound_Value",
            "Inbound/Outbound_Value_Adjustment",
            "Unit_Cost",
            "On-Hand_Inventory_Quantity",
            "On-Hand_Inventory_Value"
        }
    )
in
    #"Reordered columns"
The iterative macro in Alteryx repeats the transformation 57 times, corresponding to the maximum value of the [RecordID] column, and completes within seconds.
I attempted to replicate this process with the List.Generate() function in Power Query. I'm running the transformation 5 times for testing, although I ultimately need to run it 57 times. Below is the M query for this operation:
Suppose the name of this function is OutputQuery
let
    Source = List.Generate(
        () => [x = 1, y = fxMyFunc(BaseTable)],
        each [x] <= 5,
        each [x = [x] + 1, y = fxMyFunc([y])],
        each [y]
    )
in
    Source
Accessing Source{0} in the Navigation step takes approximately 17 to 18 seconds. However, accessing Source{1} can take up to 10 minutes, after which Dataflow Gen2 indicates that the evaluation has stopped. It is unclear why there is such a significant time difference just to access the next item, Source{1}.
Here is the query for BaseTable:
let
    Source = Lakehouse.Contents(null){[workspaceId = "xxxxxx-xxx-xxxx-xxx"]}[Data]{[lakehouseId = "xxxx-xxxx-xxx-xxx-xx-xx"]}[Data],
    Navigation = Source{[Id = "MyTableName", ItemKind = "Table"]}[Data]
in
    Navigation
To provide context on the dataset: it is only 57 rows and 9 columns. You can refer to fxMyFunc for the specific column names and the transformations being applied.
I have also tried plain recursion instead of List.Generate() (roughly like the sketch below), but it did not help either. Could you assist me with this? I am not sure whether I am making a mistake or running into an optimization issue. Your guidance would be appreciated.
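Sketch of the recursive attempt (the helper name fxRepeat is illustrative only; M needs the @ prefix for the self-reference inside a let):
let
    // Hypothetical recursive equivalent: apply fxMyFunc n times to the table.
    fxRepeat = (t as table, n as number) as table =>
        if n <= 0 then t else @fxRepeat(fxMyFunc(t), n - 1)
in
    fxRepeat(BaseTable, 57)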
Thanks,
Udit
Instead of wrapping each of your Table.AddColumn calls in Table.TransformColumnTypes, you can just end the Table.AddColumn call with ", type number". In other words, define the column type as you create the column; see the sketch below.
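For example, applied to the first step of fxMyFunc (reusing the step names from the question), a minimal sketch would be:
// The optional fourth argument of Table.AddColumn ascribes the column type,
// so the separate Table.TransformColumnTypes wrapper goes away.
AddedCustom = Table.AddColumn(
    Source,
    "On-Hand_Inventory_Quantity_Inter",
    each
        if [RecordID] = 1 then
            [#"On-Hand_Inventory_Quantity"]
        else
            [Transaction_Quantity__NOS_] + Source[#"On-Hand_Inventory_Quantity"]{[RecordID] - 2},
    type number
),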
That being said, there's a lot of work going on for each of these custom columns.
--Nate
Thank you, @lbendlin, for your response to this topic.
I understand the pitfalls of recursion. Initially, I attempted it, but soon realized it was not efficient for even 57 rows of data, which led me to choose List.Generate(). Although I have used List.Buffer() in some instances, I never fully explored it because there was no necessity. However, it appears this is the situation where it might be required.
Here is a sample dataset with 57 rows for your reference (this is the Delta Lake table, the BaseTable):
I would greatly value your assistance.
Thanks,
Udit
Table.Buffer seems to work just fine
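For example (a minimal sketch; the exact placement is an assumption, since the thread doesn't show the final code), buffer the table each time fxMyFunc returns, so the positional row lookups inside the function read from memory instead of going back to the lakehouse:
let
    Source = List.Generate(
        () => [x = 1, y = Table.Buffer(fxMyFunc(BaseTable))],
        each [x] <= 57,
        each [x = [x] + 1, y = Table.Buffer(fxMyFunc([y]))],
        each [y]
    )
in
    Source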
Thanks @lbendlin
I had not envisioned the solution being so straightforward. The capabilities of Table.Buffer() and List.Buffer() appear to surpass my expectations. I am eager to understand how they work and why they are so much more efficient than standard operations.
Could you share resources such as blogs or videos to help me understand these functions?
Thanks,
Udit
You can assume that Power Query runs "on disk", interacting with storage media for each iteration. In contrast, Power BI (DAX) for the most part runs "in memory", i.e. much faster. Table.Buffer loads data into memory for faster re-use.
Recursion carries (literally carries) a penalty of having to lug your results around. Try to avoid it if possible.
Read about the buffering functions like Table.Buffer and List.Buffer and evaluate whether you can apply them to your scenario, as in the sketch below.
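For illustration, here is List.Buffer applied inside fxMyFunc. This is a hypothetical rework of the AddedCustom1 step, not code from this thread; it buffers the intermediate column once so each row's List.Range scan reads from an in-memory list:
BufferedInter = List.Buffer(RemovedCols[#"On-Hand_Inventory_Quantity_Inter"]),
// The running sum now scans the buffered list instead of reaching
// back through the RemovedCols step for every row.
AddedCustom1 = Table.AddColumn(
    RemovedCols,
    "On-Hand_Inventory_Quantity",
    each
        if [RecordID] = 1 then
            [#"On-Hand_Inventory_Quantity_Inter"]
        else
            List.Sum(List.Range(BufferedInter, 1, [RecordID] - 1)),
    type number
),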
Can you provide a sample of the Delta Lake table?