Using Fabric, I created a dataset stored in delta parquet format and partitioned by EventData=YYYY-MM-DD. Then I run a PySpark script to load this data into "Tables". It generates a table named "pageview_delta_small", but without any columns. If I create my data without partitions, it works. What am I doing wrong?
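For context, here is a minimal sketch of how a dataset partitioned like this might be produced; the sample columns, values, and path below are assumptions for illustration, not details from the original post.
# Hypothetical sketch: writing a dataset partitioned by EventData
from pyspark.sql import Row
df = spark.createDataFrame([
    Row(EventData="2024-01-01", page="home", views=3),
    Row(EventData="2024-01-02", page="docs", views=5),
])
# partitionBy creates subfolders such as EventData=2024-01-01 under the target path
df.write.format("delta").mode("overwrite").partitionBy("EventData").save("Files/pageviews")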
Hi @Krumelur ,
Thanks for using Fabric Community.
You can explicitly define the schema for your DataFrame before writing it to the table. This ensures Spark uses the correct schema regardless of how the data is partitioned.
from pyspark.sql.types import StructType, StructField, StringType, IntegerType
# Define the schema explicitly (replace with your actual column names and types)
schema = StructType([
    StructField("column1", StringType(), True),
    StructField("column2", IntegerType(), True),
])
# Load your delta parquet data (for plain parquet files the schema can be enforced with .schema(schema))
df = spark.read.format("delta").load("path/to/your/data")
# Write the data as a table, partitioned by the EventData column
df.write.format("delta").partitionBy("EventData").saveAsTable("pageview_delta_small")
Can you please try the above code?
Hope this is helpful. Please do let me know in case of further queries.
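As a quick check (a sketch added here, not part of the original reply), you can confirm that the table was registered with its columns:
# Verify that the table now exposes the expected columns
spark.table("pageview_delta_small").printSchema()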
Your solution uses Python, and I can confirm that it works fine there, even without specifying a schema.
However, when I use the SQL syntax, the resulting table is empty.
Hi @Krumelur ,
Can you please try the below Spark SQL code?
CREATE TABLE IF NOT EXISTS pageview_delta_small -- Ensure this matches the expected table name
USING DELTA
PARTITIONED BY (EventData) -- Specify the partitioning column
LOCATION '/data/pageviews'; -- Location of your Delta table data
This won't work; Spark rejects it with the error: "It is not allowed to specify partitioning when the table schema is not defined."
At this point, I just use Python. 🙂
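For completeness, one pattern that typically avoids this error when the folder already contains a valid Delta table is to omit both the column list and the PARTITIONED BY clause and let the Delta transaction log supply them. This is a hedged sketch, assuming '/data/pageviews' already holds Delta data, run here through spark.sql from PySpark:
# Hedged sketch: schema and partitioning are read from the Delta log at LOCATION,
# so they do not need to be declared in the CREATE TABLE statement
spark.sql("""
CREATE TABLE IF NOT EXISTS pageview_delta_small
USING DELTA
LOCATION '/data/pageviews'
""")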
Hi @Krumelur ,
Thanks for your reply.
Glad to know that you were able to reach a resolution using PySpark. Please continue using the Fabric Community for your further queries.