Solved: Error reading parquet files: Illegal Parquet type:...

Kesahli · 10-28-2024

Hi All,

I have been getting the following error when reading some parquet files in a pyspark notebook.

Illegal Parquet type: INT64 (TIME(NANOS,true))

The parquet files are loaded by a copy activity in a pipeline, contained in a forEach loop, so it's not easy to pull them out and manually map the offending column to, say, a string for later conversion.

I have done a bit of searching, and it seems this was a known Spark issue some time ago that was apparently rectified in Spark 3.2 ([SPARK-40819] Parquet INT64 (TIMESTAMP(NANOS,true)) now throwing Illegal Parquet type instead of aut...)

I have tried running the below in the first cell of the notebook.

spark.conf.set("spark.sql.legacy.parquet.nanosAsLong", "true")

spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInRead", "CORRECTED")

spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")

spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED")

spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")

Additionally I have tried setting the above in a Spark environment and assigning that to the notebook.
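One way to confirm that the settings above actually reached the running session (whether set in a cell or through an environment) is to read them back. The sketch below is only a diagnostic under assumptions: the helper name report_parquet_settings is made up for illustration, and it relies only on spark.version and spark.conf.get. Note also that the SPARK-40819 title refers to TIMESTAMP(NANOS,true), while the error here reports TIME(NANOS,true), so the nanosAsLong flag may simply not cover this column type even when it is set correctly.

```python
# Hypothetical helper: reads back the settings from this thread so you can
# see what the live session holds. It does not change any configuration.
PARQUET_KEYS = (
    "spark.sql.legacy.parquet.nanosAsLong",
    "spark.sql.legacy.parquet.int96RebaseModeInRead",
    "spark.sql.legacy.parquet.int96RebaseModeInWrite",
    "spark.sql.legacy.parquet.datetimeRebaseModeInRead",
    "spark.sql.legacy.parquet.datetimeRebaseModeInWrite",
)


def report_parquet_settings(spark, keys=PARQUET_KEYS):
    """Return the Spark version plus the explicitly set value of each key."""
    report = {"spark.version": spark.version}
    for key in keys:
        # With a default of None, an unset key comes back as None
        # instead of raising, so None means "not explicitly set here".
        report[key] = spark.conf.get(key, None)
    return report
```

In a notebook this could be used as: for k, v in report_parquet_settings(spark).items(): print(k, "=", v)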

Any other suggestions or help would be appreciated.

Cheers.

Kesahli · 10-28-2024

Just an update...

I have successfully read the offending files using pandas and the fastparquet engine (after setting up a new environment to load that library).

Once read into a pandas frame, I convert to a spark df in order to continue with the rest of the notebook without having to refactor. I found I do need to run the spark.conf.set() calls above in order to write to delta tables (obviously parquet underneath).

Not elegant, but it is a workaround, unless anyone has something else?
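The exact code was not posted in the thread, so the following is only a minimal sketch of the workaround as described: read the file with pandas and the fastparquet engine, then hand the frame to Spark. The function names and the file path are hypothetical. The step that turns timedelta columns into strings is an assumption about how the time-of-day columns come back from fastparquet; check pdf.dtypes on your own files and drop or adapt that step as needed.

```python
import pandas as pd


def stringify_timedeltas(pdf: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of pdf with any timedelta columns rendered as strings.

    Assumption: the TIME(NANOS) columns surface in pandas as timedelta
    values. Other columns are left untouched.
    """
    out = pdf.copy()
    for col in out.select_dtypes(include=["timedelta"]).columns:
        out[col] = out[col].astype(str)
    return out


def read_parquet_via_pandas(spark, path: str):
    """Read one parquet file with pandas + fastparquet, return a Spark DataFrame.

    Requires the fastparquet package in the notebook environment. The whole
    file is loaded into driver memory, so this only suits files that fit there.
    """
    pdf = pd.read_parquet(path, engine="fastparquet")
    return spark.createDataFrame(stringify_timedeltas(pdf))
```

Usage would look like df = read_parquet_via_pandas(spark, "/path/to/offending_file.parquet"), where the path is a placeholder. As noted above, the spark.conf.set() calls were still needed before writing the result to Delta tables.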

Andrei_qrg7t · 07-17-2025

Hi guys! I don't think this issue is solved. We faced the same problem too, and reading into a pandas df is not an option for us.

Is there any response from the Fabric team?

Kesahli · 07-17-2025

Not that I've seen. Unfortunately forcing a read with Pandas was the only solution I came up with.

Anonymous · 10-28-2024

Hi @Kesahli,

I'm glad to hear you found a workaround. Would you mind sharing the code here? I think it would help others who face a similar scenario.

Regards,

Xiaoxin Sheng
