evonneflood1
New Member

How can I convert this PDF file into Excel?

I have a PDF file that I am trying to convert into Excel.  It's got multiple data records, with each record having this format.

 

[Screenshot: one sample record from the PDF]

 

How can I convert each of these 'clusters' into a single row in Excel?  Right now I have been:

1.  Converting the PDF into Excel using the built-in utility (Get Data > From File > From PDF, selecting all the tables and transforming them)

2.  Transposing the data in each table and creating a header row.

3.  Loading into Excel using append so it's all in 1 worksheet.

4.  Copying the range into a table.

5.  MANUALLY moving the second row of data to be on the same row as the first.

6.  Doing LOTS of cleanup.

 

My original file is 1,400 records so it's really not sustainable for me to do manually.  A smaller sample file of 20 records took me 30 minutes to get to a usable format.

 

Thank you!

1 ACCEPTED SOLUTION
v-heq-msft
Community Support

Hi @evonneflood1 ,

From your description, the method you are using to convert the PDF to Excel is a viable path. However, because the data in your PDF is not laid out in a uniform format, you will need to clean it after importing it into Excel. You can do this with M code, or with the buttons in the Power Query editor. In short, Excel provides the conversion step, but you still have to complete the data cleaning yourself.
The following articles describe the steps in more detail:

How to Convert PDF to Excel using Excel Power Query - Data Cycle Analytics

How to Import PDF Files into Excel with Power Query - Excel Campus
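
As a rough illustration of the M-code route mentioned above, here is a minimal sketch that imports the PDF, transposes each detected table, promotes the first row to headers, and appends everything into one table. The file path is hypothetical, and the sketch assumes each table holds its labels in the first column, so that transposing puts them in the first row; the real layout may need extra cleanup steps.

```powerquery
let
    // Hypothetical path; replace with the location of your PDF
    Source = Pdf.Tables(File.Contents("C:\Data\records.pdf")),
    // Keep only the detected tables (the connector also lists whole pages)
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Transpose each nested table, then promote its first row to headers
    Reshaped = Table.TransformColumns(
        TablesOnly,
        {"Data", each Table.PromoteHeaders(Table.Transpose(_), [PromoteAllScalars = true])}
    ),
    // Append all reshaped tables into a single table
    Combined = Table.Combine(Reshaped[Data])
in
    Combined
```

You can paste this into a blank query through the Advanced Editor and adjust the steps once you see how your tables actually come in.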

Best regards,
Albert He

If this post helps, please consider accepting it as the solution to help other members find it more quickly.


watkinnc
Super User

With PDFs, the key is to sort the column of tables in descending order by each table's column count. This is the only way that you will keep all columns, and their order, when you combine the tables.

When you have your tables in a column, add a column using Table.ColumnCount([TableColumn]), and then sort by that new column in descending order. This will solve a lot of your issues with transforming PDFs.
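
A minimal sketch of that suggestion, using two small made-up tables in place of the ones a PDF import would return. The sample data, the `Data` column name, and the step names are assumptions for illustration only.

```powerquery
let
    // Stand-in for the nested tables a PDF import returns in its [Data] column;
    // the sample tables and values are hypothetical
    Tables = Table.FromRecords({
        [Id = "Table001", Data = #table({"Name", "City"}, {{"Ann", "Oslo"}})],
        [Id = "Table002", Data = #table({"Name", "City", "Phone"}, {{"Bob", "Rome", "555-0100"}})]
    }),
    // Count the columns of each nested table
    WithCount = Table.AddColumn(Tables, "ColumnCount", each Table.ColumnCount([Data]), Int64.Type),
    // Put the widest tables first
    Sorted = Table.Sort(WithCount, {{"ColumnCount", Order.Descending}}),
    // Combine the nested tables in that order
    Combined = Table.Combine(Sorted[Data])
in
    Combined
```

With a real PDF, replace the `Tables` step with the output of your PDF import and use your own table column name in place of `Data`.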

 

--Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!
leila_saffarian
Advocate I

Hi @evonneflood1, could you please send me three pages of this file as a PDF? I want to test some solutions, and if one of them works correctly, I'll explain it here.

@leila_saffarian 

How can I get the sample file to you?  There isn't the option here to attach it.

 

Thank you!

Please send it to me by email: leila.saffarian@gmail.com
