Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
scottdk
Frequent Visitor

Efficient way to conditionally transform nested tables with different structures.

I am looking for more efficient Power Query code to transform nested tables in a column where the structure differs depending on the record.

In this simplified example, there are three types of records with nested tables that each require different transformations. Page001, Table004 and Table006. (Data is from PDF files)

Main_Table

scottdk_0-1717659519772.png

 

 

 

 

 

 

I would like to use Table.TransformColumns. I can't find a way to access the field [Id] from MAIN_TABLE for conditionally processing [Data].  I have been like a monkey at a typewriter writing M code for this hoping some combination will work.

 

eg

 

// Code does not work.

METHOD_3 = Table.TransformColumns( MAIN_TABLE, { "Data", 
   if MAIN_TABLE[Id] = "Table004" then
      Table.Skip(_, each [Column1] <> "Date")
   else if MAIN_TABLE[Id] = "Table006" then 
      [
         FirstRow       = Table.FirstN( _, 1 ),
         ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_)),
         RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1), 
         PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
      ][PromoteHeaders]
   else
      _
} )

// Error: The name '_' wasn't recognized.  Make sure it's spelled correctly.

 

I can perform "conditional" transformations on the nested tables by splitting the table for each transformation type, or using Table.AddColumn to create a duplicate of the Data column (new discovery since I started writing up this question). I'm a bit averse to constantly adding and deleting columns for every transformation I do, and I'd like to find a way to do this more efficiently with Table.TransformColumns instead of Table.AddColumn.

 

Method 1:

  1. Split Main_Table into three tables. 
  2. Perform Transformations on each type of nested table. There are many more transformations than in the code block.
  3. Append the results of the three transformations back together.

 

   /////////////////////////////////////////////////////////
   // METHOD_1 : Deconstruct Table, process, and reconstruct
   /////////////////////////////////////////////////////////

   // eg No Transformations on Page001 at this level (Extracting headers)
   Page001 = Table.SelectRows( MAIN_TABLE, each [Id] = "Page001" ),

   // Transformations on Table004
   Table004 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table004" ),
   Table004_Transformed = 
      Table.TransformColumns(Table004, { "Data", each 
         [
            Data = _,   
            // Example Transformations on nested tables in column "Data"
            Table004_SkipRows = Table.Skip(Data, each [Column1] <> "Date")
            /// etc etc
         ][Table004_SkipRows] }),


   // Alternative Transformations on Table006
   Table006 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table006" ),
   Table006_Transformed = 
      Table.TransformColumns(Table006, { "Data", each 
         [
            Data = _,   
            // Example Transformations on nested tables in column "Data"
            FirstRow       = Table.FirstN( Data, 1 ),
            ReplaceRow1LF  = Table.ReplaceValue(FirstRow, "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(Data)),
            RecombinedRows = ReplaceRow1LF & Table.Skip( Data, 1), 
            PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
                ][PromoteHeaders] }),
                

   // Combine the three tables back together
   CombineData = Page001 & Table004_Transformed & Table006_Transformed,
   // Sort to restore original order
   METHOD_1 = Table.Sort(CombineData,{{"File", Order.Ascending}, {"Id", Order.Ascending}}),

 

Method 2:

Since starting to write up this question, I discovered I can use Table.AddColumn which is much more code efficient. I do, however, prefer not to be adding and deleting columns each time I want to do a transformation if I can apply the same logic inside Table.TransformColumns.

 

   //////////////////////////////////////////////////////////////////////
   // METHOD_2 : Insert a new column and transform conditionally
   //////////////////////////////////////////////////////////////////////

   Method2_AddColumn = Table.AddColumn( MAIN_TABLE, "Transformed Data", each

      if       [Id] = "Table004" then
         Table.Skip([Data], each [Column1] <> "Date")

      else if  [Id] = "Table006" then 
         [
            FirstRow       = Table.FirstN( [Data], 1 ),
            ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames([Data]) ),
            RecombinedRows = ReplaceRow1LF & Table.Skip( [Data], 1), 
            PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
         ][PromoteHeaders]

      // No transformations at this level on Page001. Return as is.
      else [Data] 
   )[[File], [Id], [Kind], [Transformed Data] ],

   // Rename the new column
   METHOD_2 = Table.RenameColumns(Method2_AddColumn,{"Transformed Data","Data"}),

 

Sample Tables and code.

 

let
   file1_page1 = Table.FromRows(
      {
         {"ABC",  null,                "11,725",   "555 111 222"},
         {null,   null,                null,       "STATEMENT: 01/03/24 to 31/03/24"},
         {"TXT",  null,                "Fraggle",  "Bank"},
         {null,   "ACCOUNT: 123-456",  null,       null}
      },
      {"Column1",       "Column2",     "Column3",  "Column4"}
   ),

   file1_table4 = Table.FromRows(
      {
         {"Some",       "Random",      "Text"},
         {"More",       "Random",      "Text"},
         {"Date",       "Transaction", null},
         {"Processed",  "Details",     "Amount"},
         {"01//03//24", "CAFE",        "$5.00"},
         {"04//03//24", "FUEL",        "$999.00"}
      },
      {"Column1",       "Column2",     "Column3"}
   ),

   file2_page1 = Table.FromRows(
      {
         {"1,234",   null,                               "ABC",      null,       "$0.42"},
         {null,      "STATEMENT: 01/04/24 to 30/04/24",  null,       null,       null},
         {null,      null,                               null,       null,       "ACCOUNT: 123-456"},
         {null,      null,                               null,       null,       null}
      },
      {  "Column1", "Column2",                           "Column3", "Column4",   "Column5"}
   ),

   file2_table4 = Table.FromRows(
      {
         {"Other",      "Random",      "Text"},
         {"More",       "Random",      "FRAGGLES"},
         {"XXX",        "YYY",         "ZZZ"},
         {"Date",       "Transaction", null},
         {"Processed",  "Details",     "Amount"},
         {"01//04//24", "CAFE",        "$5.00"},
         {"01//04//24", "FUEL",        "$999.00"}
      },
      {"Column1",       "Column2",     "Column3"}
   ),

   file2_table6 = Table.FromRows(
      {
         {"Date#(lf)Processed",  "Transaction#(lf)Details",    "Amount"},
         {"03//04//24",          "Tellytubby Costume Outlet",  "$500.00"},
         {"03//04//24",          "Lucha Libre Lessons",        "$199.00"},
         {"03//04//24",          "Bathtubs-R-US",              "$1000.00"},
         {"03//04//24",          "Bulk Jello Supplies",        "$250.00"},
         {"10//04//24",          "Home and Bathroom Repairs",  "$10,000.00"}
      },
      {"Column1",                "Column2",                 "Column3"}
   ),

   MAIN_TABLE = Table.FromRows(
      {
         {"File1",   "Page001",  "Page",  file1_page1},
         {"File1",   "Table004", "Table", file1_table4},
         {"File2",   "Page001",  "Page",  file2_page1},
         {"File2",   "Table004", "Table", file2_table4},
         {"File2",   "Table006", "Table", file2_table6}
      },
      {  "File",     "Id",       "Kind", "Data"}
   ),


   /////////////////////////////////////////////////////////
   // METHOD_1 : Deconstruct Table, process, and reconstruct
   /////////////////////////////////////////////////////////

   // eg No Transformations on Page001
   Page001 = Table.SelectRows( MAIN_TABLE, each [Id] = "Page001" ),

   // Transformations on Table004
   Table004 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table004" ),
   Table004_Transformed = 
      Table.TransformColumns(Table004, { "Data", each 
         [
            Data = _,   
            // Example Transformations on nested tables in column "Data"
            Table004_SkipRows        = Table.Skip(Data, each [Column1] <> "Date")
            /// etc etc
         ][Table004_SkipRows] }),

   // Alternative Transformations on Table006
   Table006 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table006" ),
   Table006_Transformed = 
      Table.TransformColumns(Table006, { "Data", each 
         [
            Data = _,   
            // Example Transformations on nested tables in column "Data"
            FirstRow       = Table.FirstN( Data, 1 ),
            ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(Data) ),
            RecombinedRows = ReplaceRow1LF & Table.Skip( Data, 1), 
            PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
                ][PromoteHeaders] }),

   // Combine the three tables back together
   CombineData = Page001 & Table004_Transformed & Table006_Transformed,
   // Sort to restore original order
   METHOD_1 = Table.Sort(CombineData,{{"File", Order.Ascending}, {"Id", Order.Ascending}}),

   //////////////////////////////////////////////////////////////////////
   // METHOD_2 : Insert a new column and transform conditionally
   //////////////////////////////////////////////////////////////////////

   Method2_AddColumn = Table.AddColumn( MAIN_TABLE, "Transformed Data", each

      if       [Id] = "Table004" then
         Table.Skip([Data], each [Column1] <> "Date")

      else if  [Id] = "Table006" then 
         [
            FirstRow       = Table.FirstN( [Data], 1 ),
            ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames([Data]) ),
            RecombinedRows = ReplaceRow1LF & Table.Skip( [Data], 1), 
            PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
         ][PromoteHeaders]

      else [Data] 
   )[[File], [Id], [Kind], [Transformed Data] ],

   // Rename the new column
   METHOD_2 = Table.RenameColumns(Method2_AddColumn,{"Transformed Data","Data"}),

   /////////////////////////////////////////////////////////
   // Compare METHOD_1 and METHOD_2 outputs
   /////////////////////////////////////////////////////////

   CompareOutput = Table.FromRows( 
      {
         {"METHOD_1",   METHOD_1},
         {"METHOD_2",   METHOD_2},
         {"COMPARE",    if METHOD_1 = METHOD_2 then "SAME" else "DIFFERENT"}
      },
      {  "Method",     "Result"}
   )

in
    CompareOutput

 

 

 

 

 

 

 

 

 

 

 

1 ACCEPTED SOLUTION
scottdk
Frequent Visitor

After a couple of days messing around with this on and off before posting my question I have found a method that doesn't involve adding and removing columns thanks to expanding on this post https://stackoverflow.com/questions/31548135/power-query-transform-a-column-based-on-another-column 

Code-wise I think METHOD_2 is a little easier to grasp, but the purist in me likes METHOD_3 not adding extra columns to clean up.

Writing up the whole question with simpler examples than my real-world PDF mess gave me some insight and clarity on how all this works. 

 

//////////////////////////////////////////////////////////////////////
// METHOD_3 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////

METHOD_3 = Table.FromRecords(Table.TransformRows( MAIN_TABLE, (r) =>  Record.TransformFields( r, {
{"Data", each 
   if r[Id] = "Table004" then 
      Table.Skip(_, each [Column1] <> "Date")

   else if r[Id] = "Table006" then
      [  FirstRow       = Table.FirstN( _, 1 ),
         ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_) ),
         RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1), 
         PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
      ][PromoteHeaders]

   else _
}}))),

 

View solution in original post

2 REPLIES 2
scottdk
Frequent Visitor

After a couple of days messing around with this on and off before posting my question I have found a method that doesn't involve adding and removing columns thanks to expanding on this post https://stackoverflow.com/questions/31548135/power-query-transform-a-column-based-on-another-column 

Code-wise I think METHOD_2 is a little easier to grasp, but the purist in me likes METHOD_3 not adding extra columns to clean up.

Writing up the whole question with simpler examples than my real-world PDF mess gave me some insight and clarity on how all this works. 

 

//////////////////////////////////////////////////////////////////////
// METHOD_3 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////

METHOD_3 = Table.FromRecords(Table.TransformRows( MAIN_TABLE, (r) =>  Record.TransformFields( r, {
{"Data", each 
   if r[Id] = "Table004" then 
      Table.Skip(_, each [Column1] <> "Date")

   else if r[Id] = "Table006" then
      [  FirstRow       = Table.FirstN( _, 1 ),
         ReplaceRow1LF  = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_) ),
         RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1), 
         PromoteHeaders = Table.PromoteHeaders( RecombinedRows ) 
      ][PromoteHeaders]

   else _
}}))),

 

Hi @scottdk 

It is glad that you find the good solution, you can mark it as a solution so that more users can refer it.

 

 

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.