Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
JP_Curious
Frequent Visitor

Custom function works until I try to invoke it on a column of nested tables.

Background:

I'm trying to read hundreds of PDF pages of accounting general ledger detail reports into a clean data table. Power query's ability to parse this data is suboptimal. It seems to read the PDF from left to right, so report column names that are left-aligned end up with their own column while the values that are right-aligned end up in the next colum. A Purchase Order ID column header might line up with its data of five-digit values, but but then a manually generated two-digit PO gets thrown into its own column. The result is that my data gets stretched across an execssive number of columns inconsistently from one page to the next and makes it seemingly impossible to buid a transform query that will address each page of the PDF.

 

My Approach:

I wanted to create a function that would dynamically merge the columns of each PDF page under the report's data headers to force a consistent table structure from one page to the next. The plan was to build a nested function that would accept a PDF page, dynamically identify the default columns to merge for each data header, call a function to iterate through each row of data header and columns list calling a function to merge columns.

 

Result:

I can invoke my function on a query of any page of the PDF, but when I try to invoke it on the table where each page resides as a nested table within the [Data] column, 100% of my results are errors, even on the pages I've individually invoked successfully. I've been using ChatGPT to troubleshoot, it seems like my issue is "lazy evaluation" inside a row-level operation, but I haven't had any success resolving this.

 

Request:

Does anyone know how to diagnose or overcome this "lazy evaluation" that appears to be gumming up the works? Are there any best practices or common patterns that might be more effective when dealing with PDF files?

 

QueryErrors.png

 

functionDynamicColumnMerge.png

 

 

(SourcePage as table) as table =>
let
    // Define the function to merge a single set of columns
    DynamicColumnMerge = (SourceTable as table, MergeList as list, ColumnName as text) as table =>
        let
            MergeColumn = Table.CombineColumns(
                SourceTable, 
                MergeList, 
                Combiner.CombineTextByDelimiter("", QuoteStyle.None), 
                ColumnName
            )
        in
            MergeColumn,

    // Define the function to iterate over multiple merge instructions
    IterativeColumnMerge = (SourceTable as table, MergeParams as table) as table =>
        let
            MergedTable = List.Accumulate(
                Table.ToRecords(MergeParams), 
                SourceTable, 
                (currentTable, row) => DynamicColumnMerge(currentTable, row[MergeColumns], row[NewColumnName])
            )
        in
            MergedTable,

    // Main query logic
    Source = SourcePage,
    Indexed = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
    #"Added HeaderRow Column" = Table.AddColumn(Indexed, "HeaderRow", each 
        if [Column1] = "GL Date" then [Index] 
        else if [Column2] = "GL Date" then [Index] 
        else if [Column3] = "GL Date" then [Index] 
        else null
    ),
    #"Filtered HeaderRow" = Table.SelectRows(#"Added HeaderRow Column", each ([HeaderRow] = 3)),
    HeaderRow = #"Filtered HeaderRow"{0}[HeaderRow],
    #"Jumped Back to Source" = Source,
    #"Removed Top Rows" = Table.Skip(#"Jumped Back to Source", HeaderRow),
    #"Kept First Row" = Table.FirstN(#"Removed Top Rows", 1),
    #"Demoted Headers" = Table.DemoteHeaders(#"Kept First Row"),
    #"Transposed Table" = Table.Transpose(#"Demoted Headers"),
    #"Filled Down" = Table.FillDown(#"Transposed Table", {"Column2"}),
    //MergeLists = Table.Group(#"Filled Down", {"Column2"}, {{"Count", each _[Column1]}}),
    MergeLists = Table.Group(
        #"Filled Down", 
        {"Column2"}, 
        {
            {"MergeColumns", each _[Column1], type list},  // Ensure this is a list
            {"NewColumnName", each Text.Combine(List.Distinct(_[Column2]), "_"), type text}  // Generate a column name
        }
    ),


    // Apply the iterative merging function
    FinalMergedTable = IterativeColumnMerge(Source, MergeLists)

in
    FinalMergedTable

 

 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

I don't think you need merges. You need appends and zips (sideways appends).

 

Can you post a sample file or two?

View solution in original post

5 REPLIES 5
v-vpabbu
Community Support
Community Support

Hi @JP_Curious,

 

May I ask if you have gotten this issue resolved?

If it is solved, please mark the helpful reply or share your solution and accept it as solution, it will be helpful for other members of the community who have similar problems as yours to solve it faster.

 

Regards,
Vinay Pabbu

Hi @JP_Curious,

 

As we haven't heard back from you, we wanted to kindly follow up to see if you have resolved this issue?

If it is solved, please mark the helpful reply or share your solution and accept it as solution, it will be helpful for other members of the community who have similar problems as yours to solve it faster.

 

Regards,
Vinay Pabbu

Hi @JP_Curious,

 

As we haven’t heard back from you, we wanted to kindly follow up to check if the solution provided for the issue worked? or Let us know if you need any further assistance?
If our response addressed, please mark it as Accept as solution and click Yes if you found it helpful.

 

Regards,
Vinay Pabbu

v-vpabbu
Community Support
Community Support

Hi @JP_Curious,

 

Thanks @lbendlin for Addressing the issue.

 

As suggested by Super user can you please provide sample file.

 

Regards,

Vinay Pabbu

lbendlin
Super User
Super User

I don't think you need merges. You need appends and zips (sideways appends).

 

Can you post a sample file or two?

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.