Kesahli
Frequent Visitor

Error reading parquet files: Illegal Parquet type: INT64 (TIME(NANOS,true))

Hi All, 

I have been getting the following error when reading some parquet files in a PySpark notebook.

 

Illegal Parquet type: INT64 (TIME(NANOS,true))

 

The parquet files are loaded by a copy activity in a pipeline, contained in a forEach loop, so it's not easy to pull them out and manually map them to, say, a string for later conversion.

 

I have done a bit of searching and it seems this was a known Spark issue some time ago that was apparently rectified in Spark 3.2 ([SPARK-40819] Parquet INT64 (TIMESTAMP(NANOS,true)) now throwing Illegal Parquet type instead of aut...)

 

I have tried running the below in the first cell of the notebook.

spark.conf.set("spark.sql.legacy.parquet.nanosAsLong", "true")
spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInRead", "CORRECTED")
spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInRead", "CORRECTED")
spark.conf.set("spark.sql.legacy.parquet.datetimeRebaseModeInWrite", "CORRECTED")
 
Additionally, I have tried setting the above in a Spark environment and assigning that environment to the notebook.
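These settings can also be supplied before the Spark session starts via the %%configure cell magic in a Fabric notebook, which may matter for options that are only read at session start. A minimal sketch, assuming it is run as the first cell before the session is active (only the first setting shown; the rest follow the same pattern):

%%configure
{
    "conf": {
        "spark.sql.legacy.parquet.nanosAsLong": "true"
    }
}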
 
Any other suggestions or help would be appreciated. 
 
Cheers.

1 ACCEPTED SOLUTION
Kesahli
Frequent Visitor

Just an update...

 

I have successfully read the offending files using pandas with the fastparquet engine (after setting up a new environment to load that library).

Once read into a pandas DataFrame, I convert it to a Spark DataFrame so the rest of the notebook can continue without refactoring. I found I do still need to run the spark.conf.set() calls above in order to write to Delta tables (which are obviously parquet underneath).

Not elegant, but it's a workaround unless anyone has something else?
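For anyone wanting the rough shape of it, a minimal sketch of the read-convert-write pattern described above. The file path and table name are placeholders, fastparquet must be installed in the notebook environment, and spark is the session object Fabric notebooks provide:

import pandas as pd

# Read the offending file with fastparquet, which tolerates the
# INT64 (TIME(NANOS,true)) column that Spark's parquet reader rejects.
pdf = pd.read_parquet("/lakehouse/default/Files/raw/example.parquet", engine="fastparquet")

# Convert to a Spark DataFrame so the rest of the notebook is unchanged.
sdf = spark.createDataFrame(pdf)

# The spark.conf.set() calls above still need to run before this write.
sdf.write.format("delta").mode("overwrite").saveAsTable("example_table")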


2 REPLIES 2

Anonymous
Not applicable

Hi @Kesahli,

I'm glad to hear you found a workaround. Would you mind sharing the code here? I think it will help others who face a similar scenario.

Regards,

Xiaoxin Sheng
