Hello,
I would like to understand how a partition column works.
For context, I have a pipeline that copies data from a MySQL database to a Lakehouse. What I would like to do is an upsert (with a source query that only gets the rows updated since the last run, based on an update date column).
In my pipeline, I chose Overwrite with a partition column on my ID. My thought was that it would simply "upsert" the data, i.e. insert the new rows and update the old ones.
What it actually does is overwrite everything, so I am left with only the result of my query.
Is this a misconfiguration?
What is the point of choosing a partition column?
Also, Append with a partition column is not an option either; if my understanding is correct, it does not update the old rows.
Thank you for your time,
Sincerely,
Greg
@gregbortolotti - Overwrite with a partition column specified will overwrite the entire partition; it does not perform a merge. Likewise, Append will add new rows to the partition but will not update the existing ones. You can, however, do a merge in a Notebook; we do a lot of that with both partitioned and non-partitioned data. See the Databricks MERGE documentation (it still applies to Fabric, I just like their documentation better).
Partitioning is for very large tables; it speeds up data access based on the partition field. Think of it as grouping related data/transactions together. If you have transactional data that doesn't change, you might partition it by Transaction Date; that way, when looking for transactions in a given time period, the engine only reads that partition instead of the whole dataset.
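As an illustration, here is a minimal sketch of a date-partitioned write from a Fabric notebook. The table name "transactions" and the column names are assumptions for the example, not from this thread:

```python
from pyspark.sql import Row

# Illustrative data; in practice this would come from your source query.
transactions_df = spark.createDataFrame([
    Row(ID=1, Amount=10.0, TransactionDate="2025-06-01"),
    Row(ID=2, Amount=25.0, TransactionDate="2025-06-02"),
])

# partitionBy stores each TransactionDate value in its own folder, so
# filters on that column only scan the matching partitions.
(
    transactions_df.write
    .format("delta")
    .partitionBy("TransactionDate")
    .mode("overwrite")
    .saveAsTable("transactions")
)

# This query reads only the 2025-06-01 partition, not the whole table.
spark.sql("SELECT * FROM transactions WHERE TransactionDate = '2025-06-01'").show()
```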
Here are some more options for how to update or upsert using the Delta Lake Python API in a Notebook:
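For example, a minimal merge sketch, assuming the target Lakehouse table is "dbo_customers", the incremental rows land in a "staging_customers" table, and "ID" is the key (all names are illustrative, not from this thread):

```python
from delta.tables import DeltaTable

# Rows changed since the last run (loaded by the pipeline or a JDBC read).
updates_df = spark.read.table("staging_customers")

target = DeltaTable.forName(spark, "dbo_customers")

(
    target.alias("t")
    .merge(updates_df.alias("s"), "t.ID = s.ID")
    .whenMatchedUpdateAll()       # update rows whose ID already exists
    .whenNotMatchedInsertAll()    # insert rows with a new ID
    .execute()
)
```

whenMatchedUpdateAll and whenNotMatchedInsertAll mirror the WHEN MATCHED / WHEN NOT MATCHED clauses of a SQL MERGE, which is exactly the upsert behavior the Copy activity's Overwrite and Append modes do not provide.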
Thank you very much for the explanation. It is much clearer to me now.
Best regards
There is also a Fabric Idea to get upsert functionality natively in the Data Pipeline copy activity.
Consider voting to highlight this need:
Support UPSERTs and DELETEs when copying data into Lakehouse Tables from Pipeline copy activity, as opposed to Appending new rows
https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=87f3d168-6022-ee11-a81c-6045bdc01ce4