Solved: How to Import PDF in Single Column

Jbuzios · 04-09-2025

Good morning,

How do I import a PDF without dividing it into columns?

Regards,

Julio

MarkLaf · 04-23-2025

The only built-in connector is Pdf.Tables, and I do not believe there is any way to stop it from parsing the PDF content into a table of nested tables: one per page of text and one per detected table (charts, etc.). You may want to look into using R or Python in Power Query.

That said, here is a general set of PQ transformations you can use to convert the default Pdf.Tables parsing into something similar to Lines.FromBinary, which I think is what you are asking for.

let
    // Load the PDF file from a location
    // Note: Use File.Contents for local files
    Source = Web.Contents(
        "https://file-examples.com/storage/feeed4f6296807c3196e058/2017/10/file-example_PDF_1MB.pdf"
    ),
    // Parse the PDF file to extract its content
    PdfParse = Pdf.Tables(Source),
    // Filter the parsed content to get only the pages
    GetPagesOnly = Table.SelectRows(PdfParse, each ([Kind] = "Page")),
    // Combine all the separate tables of page content into one table for the whole file
    CombineAllPageContent = Table.Combine(GetPagesOnly[Data]),
    // Merge all columns into a single column (called "Merged")
    MergeAllPageColumns = Table.CombineColumns(
        CombineAllPageContent,
        Table.ColumnNames(CombineAllPageContent),
        Combiner.CombineTextByDelimiter("", QuoteStyle.None),
        "Merged"
    )
in
    MergeAllPageColumns

(Screenshot omitted: a quick visual of what each of the steps above is doing.)
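
If it helps to keep track of which page each line came from, the same idea can be sketched per page instead of combining all pages first. This is only a rough variant under assumptions: the file path is a hypothetical placeholder, the step names are made up for illustration, and a single space is used as the delimiter (the query above uses an empty string).

```powerquery
let
    // Hypothetical local path; replace with your own file
    Source = File.Contents("C:\path\to\file.pdf"),
    // Parse the PDF into the usual table of nested tables
    Parsed = Pdf.Tables(Source),
    // Keep only the page-level entries
    Pages = Table.SelectRows(Parsed, each [Kind] = "Page"),
    // For each page, join the cells of every row into one line of text,
    // skipping nulls and converting non-text values with Text.From
    WithLines = Table.AddColumn(
        Pages,
        "Lines",
        each List.Transform(
            Table.ToRows([Data]),
            (row) => Text.Combine(List.Transform(List.RemoveNulls(row), Text.From), " ")
        )
    ),
    // Drop the nested tables and expand the list so each line gets its own row
    WithoutData = Table.RemoveColumns(WithLines, {"Data"}),
    OneLinePerRow = Table.ExpandListColumn(WithoutData, "Lines")
in
    OneLinePerRow
```

The result should have one row per line of text, with the remaining Pdf.Tables columns repeated alongside it so you can still filter by page.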

v-echaithra · 04-29-2025

Hi @Jbuzios,

As we haven’t heard back from you, we wanted to follow up and check whether the solution provided resolved the issue. Let us know if you need any further assistance.
If our response addressed your question, please mark it as Accept as solution and click Yes if you found it helpful.

Regards,

Chaithra E.

v-echaithra · 04-22-2025

Hi @Jbuzios,

We wanted to follow up and check whether the solution provided resolved the issue. Let us know if you need any further assistance.
If our response addressed your question, please mark it as Accept as solution and click Yes if you found it helpful.

Regards,

Chaithra E.

rohit1991 · 04-15-2025

Hi @Jbuzios,
When importing a PDF into Power BI, it often auto-detects and splits the data into multiple columns based on layout. To keep everything in a single column, you can open Power Query, find the table or page you’re importing, and use the “Combine” or “Extract Text” options instead of loading structured tables. Alternatively, choose to import from the “Document” or “Page” level rather than individual tables—this way, each line or block of text stays together. You can then clean or split the data manually as needed.
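
To see which entries are page-level and which are detected tables before choosing one, it can help to list what the connector returns. A minimal sketch, assuming a hypothetical local file path; it relies only on the Data column that the accepted solution in this thread also uses:

```powerquery
let
    // Hypothetical local path; replace with your own file
    Source = File.Contents("C:\path\to\file.pdf"),
    // Parse the PDF into a table with one row per detected table or page
    Parsed = Pdf.Tables(Source),
    // Drop the nested Data tables so only the descriptive columns remain
    Overview = Table.RemoveColumns(Parsed, {"Data"})
in
    Overview
```

Rows whose Kind is "Page" hold the whole-page content; the others are the tables the connector detected within a page.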

v-echaithra · 04-14-2025

Hi @Jbuzios,

As we haven’t heard back from you, we wanted to follow up and check whether the solution provided resolved the issue. Let us know if you need any further assistance.
If our response addressed your question, please mark it as Accept as solution and click Yes if you found it helpful.

Regards,

Chaithra E.

SundarRaj · 04-09-2025

Assuming you already have the data loaded in PQ, the following step should work (replace YourTableName with the name of your query or previous step):
= Table.FromColumns( { List.Combine( Table.ToColumns( YourTableName ) ) } )

If you're looking for all the contents in different columns to be in one single column, then this should work. Thanks!
Let me know if I understood your query correctly.
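
For anyone who wants to see what that kind of stacking does, here is a small worked sketch on an inline table. The sample data, step names, and the "Merged" column name are hypothetical. It uses Table.FromColumns with the flattened list wrapped in an outer list, because that function expects a list of column lists.

```powerquery
let
    // Small inline table standing in for imported PDF content (hypothetical data)
    Sample = #table({"Column1", "Column2"}, {{"a", "b"}, {"c", "d"}}),
    // One list per column: {{"a", "c"}, {"b", "d"}}
    Columns = Table.ToColumns(Sample),
    // Flatten into a single list: {"a", "c", "b", "d"}
    Stacked = List.Combine(Columns),
    // Wrap the flat list in an outer list so it becomes one column
    Result = Table.FromColumns({Stacked}, {"Merged"})
in
    Result
```

Note that this stacks the values column by column (a, c, b, d), so the text of a single line is no longer kept together. If you need each original row to stay on one line, merging across the columns of each row is the better fit.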

Sundar Rajagopalan

adudani · 04-09-2025

Hi @Jbuzios,

There are probably multiple steps to this, depending on the PDF structure. Potential steps:

1. Combine Data from Multiple PDF Files into a Single Excel File or Combine Data from Multiple PDFs with Inconsistent Column Names!
2. To get it into a couple of columns: Unpivot Multiple Column Groups

If this doesn't resolve the issue, kindly provide a sample input (with sensitive data masked) and a sample of the expected output.

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a Kudos 🙂
Kind Regards,
Avinash
