tammekas
Regular Visitor

Storing data from JSON into lakehouse tables fails in runtime 1.3 but not in runtime 1.2

Hi!

I am having issues when creating and populating a new table with data in a notebook on runtime 1.3.

 

We have code that reads in a list of JSONs. Every item in the JSON has a content attribute that is dynamic and can be quite big; it can also be nested arbitrarily many levels deep. When reading, we pass along a schema for the data. The schema declares only the first level of the content attribute's child items, and declares them as strings. That is because we want to avoid a fixed structure: the items there can have random depth, random size, etc. The content data is there for later reading only and is not used for searching or filtering.
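
For illustration, the read looks roughly like this; the field names and path below are made up, and our real schema is much larger:

from pyspark.sql.types import StructType, StructField, StringType

# The schema declares only the first level of `content`'s children and
# reads each of them as a plain string, so their arbitrary depth and size
# stay opaque to Spark.
schema = StructType([
    StructField("id", StringType()),
    StructField("content", StructType([
        StructField("body", StringType()),
        StructField("meta", StringType()),
    ])),
])

df = spark.read.schema(schema).json("Files/raw/items")  # made-up path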

 

Saving happens like this:

        df.write.format('delta').mode('append').save(target_path)

What I have found:

The current code works fine on runtime 1.2 and does not throw any errors.

When running on runtime 1.3 with a dataframe containing smaller data, it creates the table and stores the data fine. But when the data is bigger, it creates the table with an error sign and there is no data in it.

 

[screenshots: the table appears in the lakehouse with an error icon and contains no data]

 

When digging into the logs, I ended up with the following error:

2024-12-04 14:46:26,037 ERROR Utils [Executor task launch worker for task 0.0 in stage 21.0 (TID 33)]: Aborting task
org.apache.spark.sql.delta.DeltaRuntimeException: [DELTA_STATS_COLLECTION_COLUMN_NOT_FOUND] nullCount stats not found for column in Parquet metadata: [content, body, content].

I googled it and ended up at https://learn.microsoft.com/en-us/azure/databricks/delta/data-skipping#specify-delta-statistics-colu...

 

But I do not understand whether it is related to my error or not, or what I need to do to get file saving working in runtime 1.3.
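
If the linked article does apply, then as far as I can tell the idea would be to limit which columns Delta collects statistics for, along these lines (the table name below is made up, and I am not sure delta.dataSkippingStatsColumns is supported on every runtime):

# Restrict statistics collection to columns that are actually useful for
# data skipping, so Delta does not try to compute nullCount for the big
# string fields under `content` (hypothetical table name).
spark.sql(
    "ALTER TABLE my_table "
    "SET TBLPROPERTIES ('delta.dataSkippingStatsColumns' = 'id')"
)

# Alternative: only collect stats on the first N columns of newly created tables.
spark.conf.set(
    "spark.databricks.delta.properties.defaults.dataSkippingNumIndexedCols", "1"
)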

 

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @tammekas ,

 

The error message indicates that Delta Lake failed to find a transaction log entry for table art.

 

I have a couple of suggestions:

 

  • Ensure that the _delta_log folder exists in the path where the table is located (a quick check from the notebook is sketched after this list).

[screenshots: the _delta_log folder shown inside the table's folder in the lakehouse]

 

  • If the folder does not exist, try recreating the transaction log folder.
  • Clear Spark's cache to ensure that no old cached data is interfering with the new write operation.
spark.catalog.clearCache()
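
For the first point, you can check directly from the notebook; a quick sketch, assuming target_path is the same path used in the failing write (mssparkutils is the built-in Fabric notebook utility):

# List the table folder and check that _delta_log is present.
files = mssparkutils.fs.ls(target_path)
print([f.name for f in files])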

 

If you have any other questions please feel free to contact me.

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider accepting it as the solution to help the other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!


2 REPLIES
Anonymous
Not applicable

Hi @tammekas ,

 

This is just a follow-up to ask whether the problem has been solved.

 

If it has, could you accept the correct answer as the solution, or share your own solution, to help other members find it faster?

 

Thank you very much for your cooperation!

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider accepting it as the solution to help the other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!

