Hi,
I have several PDFs, each of which contains multiple tables.
I need to extract a single table from each of the PDFs.
My current approach is:
1. Import the PDF data
2. In Power Query, I expand the data. At this point, I have 5 columns: Source.Name, Id, Name, Kind and Data. The separate tables are nested within the Data column.
3. This is where I'm confused. My goal is to extract the tables that have these headers: ITEM NO., PART NUMBER, DESCRIPTION, QTY.
So, should I create a column that somehow checks the tables and gives a true/false value based on if the table contains those headers? How would I do that?
Thanks
@TameshDoobay
The attached file has a solution that should work. In my test, the tables came in without headers, so I had to promote the first row to headers. I am not sure whether yours are the same, so I built two versions: one with Promote Headers and one without.
Basically, I created a list called ColNames with the column names to look for, then added a filter column that flags which tables contain all of those names. After filtering on that column, I expand the tables.
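The attached file is not reproduced here, so the following is only a minimal sketch of the filter-column idea in Power Query M, not necessarily the exact query in the attachment. The inline `#table` data, the step names and the `HasPartsHeaders` column name are hypothetical stand-ins for the expanded PDF query, and it assumes the header is spelled `QTY` without a trailing period.

```
let
    // Header names to look for (from the question; "QTY" without a period is an assumption).
    ColNames = {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},

    // Hypothetical nested tables, shaped like PDF tables that arrive without headers.
    PartsTable = #table(
        {"Column1", "Column2", "Column3", "Column4"},
        {
            {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},
            {"1", "A-100", "Bracket", "2"}
        }
    ),
    NotesTable = #table({"Column1", "Column2"}, {{"REV", "DATE"}, {"A", "2020-01-01"}}),

    // Stand-in for the expanded query: one row per table found in the PDF.
    Source = #table(
        {"Source.Name", "Id", "Name", "Kind", "Data"},
        {
            {"drawing1.pdf", "Table001", "Table001 (Page 1)", "Table", PartsTable},
            {"drawing1.pdf", "Table002", "Table002 (Page 1)", "Table", NotesTable}
        }
    ),

    // Flag column: true when the promoted first row contains every name in ColNames.
    // If the tables already have proper headers, use Table.ColumnNames([Data]) directly
    // and drop the Table.PromoteHeaders call.
    AddFlag = Table.AddColumn(
        Source,
        "HasPartsHeaders",
        each List.ContainsAll(Table.ColumnNames(Table.PromoteHeaders([Data])), ColNames),
        type logical
    )
in
    AddFlag
```

With this sample data, the flag is true for Table001 and false for Table002. Filtering on the flag and then expanding the Data column leaves only the parts table.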
Let me know if you have any questions.
Hi @TameshDoobay
Can you share a sample file and some screenshots so I can understand the problem better?
Regards,
NG
Unfortunately, I can't share a file due to business restrictions. However, I've created a sample file and will share screenshots.
It looks like this when I import the data:
Then, I expand the Content Column, and it looks like this:
From here, I know that I want to only keep Table001. It's the only table that contains the headers ITEM NO., PART NUMBER, DESCRIPTION and QTY.
Basically, I want some kind of advanced filtering capability to find and "keep" these tables.
Keep in mind that I need to do this en masse, i.e. with multiple PDFs loaded in at once.
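For the multi-PDF case, one possible shape is to promote headers on every nested table, keep only the rows whose table has all four headers, and then expand while retaining Source.Name so each row can be traced back to its file. This is a rough sketch under assumptions: the `MakeParts` helper and the inline sample rows are hypothetical stand-ins for the combined PDF query, the tables are assumed to arrive without headers, and the header is assumed to be `QTY` without a trailing period.

```
let
    ColNames = {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},

    // Hypothetical helper that builds a small headerless parts table.
    MakeParts = (part as text) =>
        #table(
            {"Column1", "Column2", "Column3", "Column4"},
            {
                {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},
                {"1", part, "Bracket", "2"}
            }
        ),
    Other = #table({"Column1", "Column2"}, {{"REV", "DATE"}, {"A", "B"}}),

    // Stand-in for the combined query over several PDFs.
    Source = #table(
        {"Source.Name", "Id", "Data"},
        {
            {"drawing1.pdf", "Table001", MakeParts("A-100")},
            {"drawing1.pdf", "Table002", Other},
            {"drawing2.pdf", "Table003", MakeParts("B-200")}
        }
    ),

    // Promote the first row of every nested table to headers.
    Promoted = Table.TransformColumns(Source, {{"Data", Table.PromoteHeaders}}),

    // Keep only the rows whose nested table has all the wanted headers.
    Matching = Table.SelectRows(
        Promoted,
        each List.ContainsAll(Table.ColumnNames([Data]), ColNames)
    ),

    // Keep the file name next to the data, then expand only the wanted columns.
    Kept = Table.SelectColumns(Matching, {"Source.Name", "Data"}),
    Expanded = Table.ExpandTableColumn(Kept, "Data", ColNames)
in
    Expanded
```

Expanding with the ColNames list rather than all columns means any extra columns in a matching table are dropped, which keeps the output shape the same across PDFs.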
Thanks,
Tamesh