Solved: Dataflows not able to get data from PDF

PVO3 · ‎11-06-2023

Hello,

I posted a reply in this topic. I urgently need some help so I opened a new topic.

The error "Error: PipelineException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt" is returned when I try to use "Invoke custom function" or Pdf.Tables in a dataflow. The sources are PDF files in a SharePoint folder.

I think the error is caused by the extra "Table.TransformColumnTypes" step that dataflows add automatically, and I'm not sure how to prevent that. Dataflows add it because the 'Table' data type is not supported in dataflows, even though my end result is just a single Text column. When I run the same query over CSV files, the 'Table' data type is also involved, yet no error is returned and it works fine. On the other hand, the preview is correct even with these added steps.

- I have a dataflow and a PBI dataset with the exact same query (which gathers data from PDFs in a SharePoint folder).

- The PBI dataset refreshes and works as expected.

- Both the dataset and the dataflow are in the same Premium workspace, which I own.

- The dataflow uses the advanced engine for calculations.

- The preview of the dataflow returns the desired result.

- I have similar dataflows that extract data from txt and xlms files; these run without issues.

- The error occurs exactly at the moment I use the Pdf.Tables function or, alternatively, "Invoke custom function". Without these steps, it runs fine.

Help would be greatly appreciated. Thanks a lot!

let
  // List all files on the SharePoint site
  Source = SharePoint.Files("https://SharepointSite", [ApiVersion = 15]),
  // Keep only the files in the folder that holds the PDFs
  Path = Table.SelectRows(Source, each [Folder Path] = "https://SharepointFolderWithPDF"),
  SelectContent = Table.SelectColumns(Path, {"Content"}),
  // Parse each PDF binary; the refresh error appears once this step is present
  ExtractContent = Table.AddColumn(SelectContent, "Custom", each Pdf.Tables([Content])),
  // Step name is Dutch for "Transform columns"
  #"Kolommen transformeren" = Table.TransformColumnTypes(ExtractContent, {{"Custom", type text}}),
  // Step name is Dutch for "Replace errors"
  #"Fouten vervangen" = Table.ReplaceErrorValues(#"Kolommen transformeren", {{"Custom", null}}),
  // Step name is Dutch for "Remove columns"
  #"Kolommen verwijderen" = Table.RemoveColumns(#"Fouten vervangen", Table.ColumnsOfType(#"Fouten vervangen", {type table, type record, type list, type nullable binary, type binary, type function}))
in
  #"Kolommen verwijderen"

*Last 3 steps automatically added by Dataflows.

PVO3 · ‎12-06-2023

MS confirmed this is currently a known limitation in Premium workspaces.

https://learn.microsoft.com/en-us/power-query/connectors/pdf#power-bi-dataflows-in-a-premium-capacit...

I am currently trying to work around this by using a gateway.

lbendlin · ‎11-19-2023

You cannot just transform the column type from table to text. You need to expand the table to new columns first.
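A minimal sketch of what expanding before typing could look like. It uses literal stand-in tables instead of the SharePoint source so it can be pasted into a blank query. The Id/Kind/Data and Column1/Column2 column names and the sample values are assumptions for illustration, not taken from the actual PDFs.

```powerquery
let
    // Hypothetical stand-in for one table extracted from a PDF page
    PageTable = #table({"Column1", "Column2"}, {{"Invoice", "1001"}, {"Total", "250"}}),
    // Hypothetical stand-in for the ExtractContent step: one row per file,
    // with a nested table in "Custom" (Id/Kind/Data names are assumed)
    ExtractContent = #table(
        {"Custom"},
        {{#table({"Id", "Kind", "Data"}, {{"Table001", "Table", PageTable}})}}
    ),
    // Expand the outer nested table into regular columns first
    ExpandCustom = Table.ExpandTableColumn(ExtractContent, "Custom", {"Id", "Kind", "Data"}),
    // Then expand the inner data table, so no table-typed column remains
    ExpandData = Table.ExpandTableColumn(ExpandCustom, "Data", {"Column1", "Column2"}),
    // Only now assign primitive types to the resulting columns
    Typed = Table.TransformColumnTypes(
        ExpandData,
        {{"Id", type text}, {"Kind", type text}, {"Column1", type text}, {"Column2", type text}}
    )
in
    Typed
```

With real PDFs the inner column names differ per document, so the list passed to the second expand would need to match the actual tables.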

PVO3 · ‎11-19-2023

Thanks @lbendlin. I know. The code is just to indicate at what point the error triggers.

Whatever I do from here, the error "Error: PipelineException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt" is returned. This also happens when I expand the table.
