Hi, I'm brand new to Power Query in Excel. I actually stumbled across it when I searched for how to import a PDF into Excel. Here is my issue. I have a PDF document that has about 180 pages. On each page is a table with 6 columns. I have used Get Data > From PDF and it has loaded the document into PQ beautifully, but as 180 separate tables.
I want to keep only columns 1 and 5 in each table, but the only way I can see to do it is to go into each table and select the columns I want to keep. I have to do this one table at a time, i.e. repeat it 180 times. There has to be a way I can remove columns 2 - 4 and 6 in one go? Please.
I should add - after I perform the Data > Get Data > From File > From PDF process, I am choosing Transform Data and then selecting multiple items. This way I can also ignore the Pages (non-tables), as I don't need this data.
Hi @mussaenda
Thanks for your reply. I really know little of this so let me try to explain what I am trying to do.
I have a PDF that contains multiple pages. Each page relates to an organisation, and each organisation has a table that lists contacts and email addresses. I need only the name and email address columns from each table. Once I have that as a list in Excel, I will be sending an email - simply copying and pasting the email addresses into the 'To' box of my email client. I am not trying to do anything special. I converted the PDF to Excel by getting the data with the following method - Data > Get Data > From File > From PDF. This process creates 180 tables because there are 180 tables in the PDF.
Hi @Gaffers ,
One way to do it is to get the data from a folder and do the transformation before expanding it as a table.
Do you need the 180 pages separately, or are you appending them afterwards?
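A minimal sketch of that idea in Power Query M, applied to the table of tables that the From PDF connector already returns rather than to a folder. The file path and the output column names Name and Email are placeholders I have assumed, and columns 1 and 5 are picked by position (0 and 4), so adjust these to match your file. It also assumes you want everything appended into one list. You could paste it into a blank query via the Advanced Editor:

```powerquery
let
    // Hypothetical path: replace with the location of your PDF
    Source = Pdf.Tables(File.Contents("C:\Data\Contacts.pdf")),
    // Keep only entries detected as tables (this drops the "Page" entries)
    // that have at least 5 columns, so that position 4 exists
    TablesOnly = Table.SelectRows(
        Source,
        each [Kind] = "Table" and Table.ColumnCount([Data]) >= 5
    ),
    // Rebuild each nested table from its 1st and 5th columns only
    Trimmed = Table.TransformColumns(
        TablesOnly,
        {{"Data", each Table.FromColumns(
            {Table.ToColumns(_){0}, Table.ToColumns(_){4}},
            {"Name", "Email"}  // assumed output column names
        )}}
    ),
    // Append all trimmed tables into a single table
    Combined = Table.Combine(Trimmed[Data])
in
    Combined
```

Because the columns are removed inside the nested tables before they are combined, the selection is written once instead of 180 times. If each table's header row comes through as an ordinary data row, it will be repeated in the result and can be filtered out afterwards.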