
Krumelur
Microsoft Employee

Partitioned delta parquet files don't import into tables


Using Fabric, I created a dataset stored in Delta parquet format, partitioned by EventData=YYYY-MM-DD. Then I ran a PySpark script to load this data into "Tables". It generates a table named "pageview_delta_small", but without any columns. If I create my data without partitions, it works. What am I doing wrong?
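
For reference, here is a minimal sketch of the kind of partitioned write I mean (column names are placeholders; spark is the session a Fabric notebook provides):

from datetime import date

# Minimal sketch: write Delta data partitioned by an EventData date column,
# which produces EventData=YYYY-MM-DD folders like the ones described above
df = spark.createDataFrame(
    [("/home", 42, date(2024, 1, 1)), ("/about", 7, date(2024, 1, 2))],
    ["url", "views", "EventData"],
)
df.write.format("delta").partitionBy("EventData").save("Files/pageviews")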

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Krumelur ,

Thanks for using Fabric Community.

You can explicitly define the schema for your DataFrame before writing it to the table. This ensures Spark uses the correct schema regardless of the partition data. 

 

from pyspark.sql.functions import col
from pyspark.sql.types import StructType, StructField, StringType, IntegerType, DateType

# Define the schema explicitly, including the EventData partition column
schema = StructType([
    StructField("column1", StringType(), True),
    StructField("column2", IntegerType(), True),
    StructField("EventData", DateType(), True),
])

# Load your delta parquet data
df = spark.read.format("delta").load("path/to/your/data")

# Cast the columns to the expected types, then write with explicit partitioning
df = df.select([col(f.name).cast(f.dataType) for f in schema.fields])
df.write.format("delta").partitionBy("EventData").saveAsTable("pageview_delta_small")
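
As a quick sanity check, you can read the table back and confirm the columns were registered:

# Confirm the table now has the expected columns
spark.table("pageview_delta_small").printSchema()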


Can you please try the above code?

Hope this is helpful. Please do let me know in case of further queries.


5 REPLIES
Krumelur
Microsoft Employee

Your solution uses Python, and I can confirm that it works fine there, even without specifying a schema.

However, using the SQL syntax, the table ends up empty.

Anonymous
Not applicable

Hi @Krumelur ,

Can you please try the Spark SQL code below?

CREATE TABLE IF NOT EXISTS pageview_delta_small -- Ensure this matches the expected table name
USING DELTA
PARTITIONED BY (EventData)                      -- Specify the partitioning column
LOCATION '/data/pageviews';                     -- Location of your Delta table data

Krumelur
Microsoft Employee

This won't work; it fails with the error "It is not allowed to specify partitioning when the table schema is not defined."

At this point, I just use Python. 🙂
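
That said, one workaround that should sidestep this error (a sketch based on standard Delta Lake behavior, using the table name and path from above): when the LOCATION already contains Delta data, you can register the table without any schema or PARTITIONED BY clause, and Delta picks up both from its transaction log.

# Sketch: omit the schema and PARTITIONED BY when the location already
# holds Delta data; Delta reads both from the transaction log
spark.sql("""
    CREATE TABLE IF NOT EXISTS pageview_delta_small
    USING DELTA
    LOCATION '/data/pageviews'
""")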

Anonymous
Not applicable

Hi @Krumelur ,

Thanks for your reply.
Glad to know that you were able to reach a resolution using PySpark. Please continue to use the Fabric Community for your further queries.

