Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
I am looking for more efficient Power Query code to transform nested tables in a column where the structure differs depending on the record.
In this simplified example, there are three types of records with nested tables that each require different transformations. Page001, Table004 and Table006. (Data is from PDF files)
Main_Table
I would like to use Table.TransformColumns. I can't find a way to access the field [Id] from MAIN_TABLE for conditionally processing [Data]. I have been like a monkey at a typewriter writing M code for this hoping some combination will work.
eg
// Code does not work.
METHOD_3 = Table.TransformColumns( MAIN_TABLE, { "Data",
if MAIN_TABLE[Id] = "Table004" then
Table.Skip(_, each [Column1] <> "Date")
else if MAIN_TABLE[Id] = "Table006" then
[
FirstRow = Table.FirstN( _, 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_)),
RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders]
else
_
} )
// Error: The name '_' wasn't recognized. Make sure it's spelled correctly.
I can perform "conditional" transformations on the nested tables by splitting the table for each transformation type, or using Table.AddColumn to create a duplicate of the Data column (new discovery since I started writing up this question). I'm a bit averse to constantly adding and deleting columns for every transformation I do, and I'd like to find a way to do this more efficiently with Table.TransformColumns instead of Table.AddColumn.
Method 1:
/////////////////////////////////////////////////////////
// METHOD_1 : Deconstruct Table, process, and reconstruct
/////////////////////////////////////////////////////////
// eg No Transformations on Page001 at this level (Extracting headers)
Page001 = Table.SelectRows( MAIN_TABLE, each [Id] = "Page001" ),
// Transformations on Table004
Table004 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table004" ),
Table004_Transformed =
Table.TransformColumns(Table004, { "Data", each
[
Data = _,
// Example Transformations on nested tables in column "Data"
Table004_SkipRows = Table.Skip(Data, each [Column1] <> "Date")
/// etc etc
][Table004_SkipRows] }),
// Alternative Transformations on Table006
Table006 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table006" ),
Table006_Transformed =
Table.TransformColumns(Table006, { "Data", each
[
Data = _,
// Example Transformations on nested tables in column "Data"
FirstRow = Table.FirstN( Data, 1 ),
ReplaceRow1LF = Table.ReplaceValue(FirstRow, "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(Data)),
RecombinedRows = ReplaceRow1LF & Table.Skip( Data, 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders] }),
// Combine the three tables back together
CombineData = Page001 & Table004_Transformed & Table006_Transformed,
// Sort to restore original order
METHOD_1 = Table.Sort(CombineData,{{"File", Order.Ascending}, {"Id", Order.Ascending}}),
Method 2:
Since starting to write up this question, I discovered I can use Table.AddColumn which is much more code efficient. I do, however, prefer not to be adding and deleting columns each time I want to do a transformation if I can apply the same logic inside Table.TransformColumns.
//////////////////////////////////////////////////////////////////////
// METHOD_2 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////
Method2_AddColumn = Table.AddColumn( MAIN_TABLE, "Transformed Data", each
if [Id] = "Table004" then
Table.Skip([Data], each [Column1] <> "Date")
else if [Id] = "Table006" then
[
FirstRow = Table.FirstN( [Data], 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames([Data]) ),
RecombinedRows = ReplaceRow1LF & Table.Skip( [Data], 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders]
// No transformations at this level on Page001. Return as is.
else [Data]
)[[File], [Id], [Kind], [Transformed Data] ],
// Rename the new column
METHOD_2 = Table.RenameColumns(Method2_AddColumn,{"Transformed Data","Data"}),
Sample Tables and code.
let
file1_page1 = Table.FromRows(
{
{"ABC", null, "11,725", "555 111 222"},
{null, null, null, "STATEMENT: 01/03/24 to 31/03/24"},
{"TXT", null, "Fraggle", "Bank"},
{null, "ACCOUNT: 123-456", null, null}
},
{"Column1", "Column2", "Column3", "Column4"}
),
file1_table4 = Table.FromRows(
{
{"Some", "Random", "Text"},
{"More", "Random", "Text"},
{"Date", "Transaction", null},
{"Processed", "Details", "Amount"},
{"01//03//24", "CAFE", "$5.00"},
{"04//03//24", "FUEL", "$999.00"}
},
{"Column1", "Column2", "Column3"}
),
file2_page1 = Table.FromRows(
{
{"1,234", null, "ABC", null, "$0.42"},
{null, "STATEMENT: 01/04/24 to 30/04/24", null, null, null},
{null, null, null, null, "ACCOUNT: 123-456"},
{null, null, null, null, null}
},
{ "Column1", "Column2", "Column3", "Column4", "Column5"}
),
file2_table4 = Table.FromRows(
{
{"Other", "Random", "Text"},
{"More", "Random", "FRAGGLES"},
{"XXX", "YYY", "ZZZ"},
{"Date", "Transaction", null},
{"Processed", "Details", "Amount"},
{"01//04//24", "CAFE", "$5.00"},
{"01//04//24", "FUEL", "$999.00"}
},
{"Column1", "Column2", "Column3"}
),
file2_table6 = Table.FromRows(
{
{"Date#(lf)Processed", "Transaction#(lf)Details", "Amount"},
{"03//04//24", "Tellytubby Costume Outlet", "$500.00"},
{"03//04//24", "Lucha Libre Lessons", "$199.00"},
{"03//04//24", "Bathtubs-R-US", "$1000.00"},
{"03//04//24", "Bulk Jello Supplies", "$250.00"},
{"10//04//24", "Home and Bathroom Repairs", "$10,000.00"}
},
{"Column1", "Column2", "Column3"}
),
MAIN_TABLE = Table.FromRows(
{
{"File1", "Page001", "Page", file1_page1},
{"File1", "Table004", "Table", file1_table4},
{"File2", "Page001", "Page", file2_page1},
{"File2", "Table004", "Table", file2_table4},
{"File2", "Table006", "Table", file2_table6}
},
{ "File", "Id", "Kind", "Data"}
),
/////////////////////////////////////////////////////////
// METHOD_1 : Deconstruct Table, process, and reconstruct
/////////////////////////////////////////////////////////
// eg No Transformations on Page001
Page001 = Table.SelectRows( MAIN_TABLE, each [Id] = "Page001" ),
// Transformations on Table004
Table004 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table004" ),
Table004_Transformed =
Table.TransformColumns(Table004, { "Data", each
[
Data = _,
// Example Transformations on nested tables in column "Data"
Table004_SkipRows = Table.Skip(Data, each [Column1] <> "Date")
/// etc etc
][Table004_SkipRows] }),
// Alternative Transformations on Table006
Table006 = Table.SelectRows( MAIN_TABLE, each [Id] = "Table006" ),
Table006_Transformed =
Table.TransformColumns(Table006, { "Data", each
[
Data = _,
// Example Transformations on nested tables in column "Data"
FirstRow = Table.FirstN( Data, 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(Data) ),
RecombinedRows = ReplaceRow1LF & Table.Skip( Data, 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders] }),
// Combine the three tables back together
CombineData = Page001 & Table004_Transformed & Table006_Transformed,
// Sort to restore original order
METHOD_1 = Table.Sort(CombineData,{{"File", Order.Ascending}, {"Id", Order.Ascending}}),
//////////////////////////////////////////////////////////////////////
// METHOD_2 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////
Method2_AddColumn = Table.AddColumn( MAIN_TABLE, "Transformed Data", each
if [Id] = "Table004" then
Table.Skip([Data], each [Column1] <> "Date")
else if [Id] = "Table006" then
[
FirstRow = Table.FirstN( [Data], 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames([Data]) ),
RecombinedRows = ReplaceRow1LF & Table.Skip( [Data], 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders]
else [Data]
)[[File], [Id], [Kind], [Transformed Data] ],
// Rename the new column
METHOD_2 = Table.RenameColumns(Method2_AddColumn,{"Transformed Data","Data"}),
/////////////////////////////////////////////////////////
// Compare METHOD_1 and METHOD_2 outputs
/////////////////////////////////////////////////////////
CompareOutput = Table.FromRows(
{
{"METHOD_1", METHOD_1},
{"METHOD_2", METHOD_2},
{"COMPARE", if METHOD_1 = METHOD_2 then "SAME" else "DIFFERENT"}
},
{ "Method", "Result"}
)
in
CompareOutput
Solved! Go to Solution.
After a couple of days messing around with this on and off before posting my question I have found a method that doesn't involve adding and removing columns thanks to expanding on this post https://stackoverflow.com/questions/31548135/power-query-transform-a-column-based-on-another-column
Code-wise I think METHOD_2 is a little easier to grasp, but the purist in me likes METHOD_3 not adding extra columns to clean up.
Writing up the whole question with simpler examples than my real-world PDF mess gave me some insight and clarity on how all this works.
//////////////////////////////////////////////////////////////////////
// METHOD_3 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////
METHOD_3 = Table.FromRecords(Table.TransformRows( MAIN_TABLE, (r) => Record.TransformFields( r, {
{"Data", each
if r[Id] = "Table004" then
Table.Skip(_, each [Column1] <> "Date")
else if r[Id] = "Table006" then
[ FirstRow = Table.FirstN( _, 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_) ),
RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders]
else _
}}))),
After a couple of days messing around with this on and off before posting my question I have found a method that doesn't involve adding and removing columns thanks to expanding on this post https://stackoverflow.com/questions/31548135/power-query-transform-a-column-based-on-another-column
Code-wise I think METHOD_2 is a little easier to grasp, but the purist in me likes METHOD_3 not adding extra columns to clean up.
Writing up the whole question with simpler examples than my real-world PDF mess gave me some insight and clarity on how all this works.
//////////////////////////////////////////////////////////////////////
// METHOD_3 : Insert a new column and transform conditionally
//////////////////////////////////////////////////////////////////////
METHOD_3 = Table.FromRecords(Table.TransformRows( MAIN_TABLE, (r) => Record.TransformFields( r, {
{"Data", each
if r[Id] = "Table004" then
Table.Skip(_, each [Column1] <> "Date")
else if r[Id] = "Table006" then
[ FirstRow = Table.FirstN( _, 1 ),
ReplaceRow1LF = Table.ReplaceValue( FirstRow , "#(lf)", " ", Replacer.ReplaceText, Table.ColumnNames(_) ),
RecombinedRows = ReplaceRow1LF & Table.Skip( _, 1),
PromoteHeaders = Table.PromoteHeaders( RecombinedRows )
][PromoteHeaders]
else _
}}))),
Hi @scottdk
It is glad that you find the good solution, you can mark it as a solution so that more users can refer it.
Best Regards!
Yolo Zhu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
144 | |
87 | |
65 | |
50 | |
45 |
User | Count |
---|---|
217 | |
88 | |
81 | |
65 | |
56 |