Hi, I am trying to import a PDF table into Power BI from https://www.caas.gov.sg/docs/default-source/pdf/singapore-registered-aircraft-engine-nos---apr-2020d...
I am new to both Power BI and Python, but from research I managed to get this code working:
import tabula
file = "https://www.caas.gov.sg/docs/default-source/pdf/singapore-registered-aircraft-engine-nos---apr-2020d..."
tables = tabula.read_pdf(file, pages = "all", multiple_tables = True)
For some reason the result does not show up as a table in Power BI, even though no errors are reported.
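A possible explanation, offered as a sketch rather than a confirmed diagnosis: the Python script source in Power BI Desktop only lists variables that are pandas DataFrames, while `tabula.read_pdf(..., multiple_tables=True)` returns a Python list of DataFrames, so `tables` itself is not offered as a table. Assuming all extracted tables share the same columns, stacking them into one DataFrame gives Power BI something to import. The helper name `combine_tables` and the variable name `engines` are illustrative, not part of the original post.

```python
import pandas as pd


def combine_tables(tables):
    """Stack a list of DataFrames (such as tabula.read_pdf returns) into one."""
    if not tables:
        # Nothing was extracted: return an empty frame instead of raising.
        return pd.DataFrame()
    # ignore_index=True renumbers the rows 0..n-1 across all tables.
    return pd.concat(tables, ignore_index=True)


# In the Power BI script editor this would be used roughly as follows
# (left commented out because it needs tabula-py, Java and the PDF itself):
#
#   import tabula
#   tables = tabula.read_pdf(file, pages="all", multiple_tables=True)
#   engines = combine_tables(tables)  # a top-level DataFrame variable
```

With a top-level DataFrame such as `engines` defined, it should appear in the Navigator under that variable name.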
Hi @Pandadev ,
Why don't you use the Power BI PDF connector?
https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-connect-pdf
Marcus Wegener works as Full Stack Power BI Engineer at BI or DIE.
His mission is clear: "Get the most out of data, with Power BI."
twitter - LinkedIn - YouTube - website - podcast - Power BI Tutorials
Thanks, yes, that works fine for this PDF. But when I import a PDF where I only want certain tables, how can I merge the selected tables into one table? All the selected tables will have the same structure, column order, etc.
Hi @Pandadev ,
Based on your description, you can get this PDF file using the Web connector:
let
Source = Pdf.Tables(Web.Contents("https://www.caas.gov.sg/docs/default-source/pdf/singapore-registered-aircraft-engine-nos---apr-2020d1324ca0a72f4d42bd40c25673b42c82.pdf"), [Implementation="1.1"]),
Table001 = Source{[Id="Table001"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Table001, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"NO.", Int64.Type}, {"TYPE", type text}, {"REG", type text}, {"OPERATOR", type text}, {"ENGINE TYPE", type text}})
in
#"Changed Type"
Use append queries to append more tables.
Best Regards,
Liang
Thanks. Here is an example where I only need selected tables from the PDF: http://inaca.or.id/wp-content/uploads/2019/07/CAR19.pdf
I only want the tables whose column headers are the same as below. Is this possible?
Hi @Pandadev ,
Looks good, try it yourself 😉
Is there a way to create just one table from only the tables that contain the correct columns? Power BI is pulling in lots of tables that are not required. I was wondering if I could say: if the table contains the column headers, then add it.
Hi @Pandadev ,
You can use Python to extract the tables from the PDF and export them to a file, and then select the appropriate connector in Power BI to connect to the exported file.
http://theautomatic.net/2019/05/24/3-ways-to-scrape-tables-from-pdfs-with-python/
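The "add it only if the headers match" idea could look roughly like this in Python. This is a minimal sketch under assumptions: `EXPECTED_COLUMNS` reuses the headers from the M query earlier in the thread as a placeholder (the headers wanted from CAR19.pdf are not given here), `select_matching` and `normalise` are hypothetical helper names, and the output file name is made up.

```python
import pandas as pd

# Placeholder header row (assumption): replace with the exact headers you want.
EXPECTED_COLUMNS = ["NO.", "TYPE", "REG", "OPERATOR", "ENGINE TYPE"]


def normalise(name):
    """Collapse whitespace and upper-case a header so minor differences still match."""
    return " ".join(str(name).split()).upper()


def select_matching(tables, expected=EXPECTED_COLUMNS):
    """Keep only the tables whose headers match `expected`, stacked into one DataFrame."""
    wanted = [normalise(c) for c in expected]
    keep = []
    for table in tables:
        if [normalise(c) for c in table.columns] == wanted:
            renamed = table.copy()
            renamed.columns = expected  # unify header spelling before stacking
            keep.append(renamed)
    if not keep:
        # No table matched: return an empty frame that still has the right columns.
        return pd.DataFrame(columns=expected)
    return pd.concat(keep, ignore_index=True)


# Possible use with tabula-py (commented out: it needs Java and the PDF):
#
#   import tabula
#   tables = tabula.read_pdf("CAR19.pdf", pages="all", multiple_tables=True)
#   select_matching(tables).to_csv("selected_tables.csv", index=False)
```

The exported CSV can then be loaded with the Text/CSV connector. Note that a table continued across pages may be extracted without its header row, in which case this sketch would skip it.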
Best Regards,
Liang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.