Hi,
Right now I have some PDFs which have multiple tables within each PDF.
I need to extract a single table from each of the PDFs.
The way I'm attempting or suggesting to do it is by:
1. Import the PDF data
2. In Power Query, I expand the data. At this point I have 5 columns: Source.Name, Id, Name, Kind and Data. The separate tables are within the Data column.
3. This is where I'm confused. My goal is to extract the tables that have these headers: ITEM NO., PART NUMBER, DESCRIPTION, QTY.
So, should I create a column that somehow checks the tables and gives a true/false value based on if the table contains those headers? How would I do that?
Thanks
@TameshDoobay
The attached file contains a working solution. In my case the tables came in without headers, so I had to promote headers first. I am not sure whether yours do, so I built one version with promoted headers and another without.
Basically, I created a filter column that flags which tables contain the column names in the ColNames list I created, and then expanded those tables.
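Since the attachment is not reproduced in this thread, here is a minimal Power Query M sketch of the filter-column approach described above. It is not the attached solution itself. The inline `Expanded` table is a hypothetical stand-in for the expanded PDF query (columns Source.Name, Id, Name, Kind and Data), the step names and the `HasHeaders` column are illustrative, and the header text in `ColNames` is assumed to match the PDF exactly (for example "QTY" rather than "QTY.").

```powerquery
let
    // Hypothetical sample standing in for the expanded PDF query.
    // Each row's Data cell holds one nested table, as in the real import.
    Expanded = #table(
        {"Source.Name", "Id", "Name", "Kind", "Data"},
        {
            {"a.pdf", "Table001", "Table001 (Page 1)", "Table",
                #table({"Column1", "Column2", "Column3", "Column4"},
                    {{"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},
                     {"1", "P-100", "Bracket", "2"}})},
            {"a.pdf", "Table002", "Table002 (Page 1)", "Table",
                #table({"Column1", "Column2"},
                    {{"REV", "DATE"}, {"A", "2024-01-01"}})}
        }
    ),

    // Headers that identify the tables to keep (assumed exact spelling).
    ColNames = {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},

    // Promote the first row of every nested table to headers.
    // Skip this step if your nested tables already have proper column names.
    Promoted = Table.TransformColumns(
        Expanded,
        {{"Data", each Table.PromoteHeaders(_, [PromoteAllScalars = true])}}
    ),

    // Filter column: true when the nested table contains every name in ColNames.
    Flagged = Table.AddColumn(
        Promoted,
        "HasHeaders",
        each List.ContainsAll(Table.ColumnNames([Data]), ColNames),
        type logical
    ),

    // Keep only the matching tables, then expand them next to the file name.
    Matching = Table.SelectRows(Flagged, each [HasHeaders]),
    Kept = Table.SelectColumns(Matching, {"Source.Name", "Data"}),
    Result = Table.ExpandTableColumn(Kept, "Data", ColNames)
in
    Result
```

Because the check runs once per row, the same steps should apply unchanged when several PDFs are loaded: every nested table from every file is tested, and Source.Name shows which file each kept row came from.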
Let me know if you have any questions.
If this post helps, please consider accepting it as the solution so that others can find it more quickly. I appreciate your kudos!
Hi @TameshDoobay
Can you share a sample file and screenshots so I can understand the problem better?
Regards,
NG
Unfortunately I can't share a file due to business restrictions. However, I've created a sample file and will share screenshots.
It looks like this when I import the data:
Then, I expand the Content Column, and it looks like this:
From here, I know that I want to only keep Table001. It's the only table that contains the headers ITEM NO., PART NUMBER, DESCRIPTION and QTY.
Basically, I want some kind of advanced filtering capability to find and "keep" these tables.
Keep in mind, I need to do this en masse, i.e. with multiple PDFs loaded in.
Thanks,
Tamesh