Hi,
Usually we try to keep Parquet file sizes large, because an excess of small files can create problems for processing. The default file size, unless I'm mistaken, is set to 1 GB. Most examples end up creating a single Parquet file by default.
However, while we can update records in Delta tables, the underlying Parquet files can't be modified in place. When we update a single record in a Delta table, the entire Parquet file containing that record is rewritten as a new file, and the old file stays in storage. Unless we need to work with the history of the table, we should VACUUM the files frequently.
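To make the rewrite behaviour concrete, here is a minimal toy model in plain Python. It is only a sketch: `ToyDeltaTable` and its methods are hypothetical names invented for illustration, not the Delta Lake API, and it assumes the simple copy-on-write behaviour described above (no transaction log, no retention period).

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaTable:
    """Toy copy-on-write model (hypothetical, not the Delta Lake API)."""
    rows_per_file: int
    active: dict = field(default_factory=dict)  # file_id -> {key: value}
    stale: dict = field(default_factory=dict)   # superseded files kept for history
    next_id: int = 0

    def _new_file(self, rows):
        # Files are immutable: every write produces a brand-new file.
        file_id = self.next_id
        self.next_id += 1
        self.active[file_id] = rows
        return file_id

    def load(self, rows):
        # Split the initial data into files of at most rows_per_file rows.
        items = list(rows.items())
        for start in range(0, len(items), self.rows_per_file):
            self._new_file(dict(items[start:start + self.rows_per_file]))

    def update(self, key, value):
        """Rewrite the one file holding `key`; return the number of rows rewritten."""
        file_id = next((fid for fid, rows in self.active.items() if key in rows), None)
        if file_id is None:
            raise KeyError(key)
        old_rows = self.active.pop(file_id)
        self.stale[file_id] = old_rows      # old version stays until vacuum
        new_rows = dict(old_rows)
        new_rows[key] = value
        self._new_file(new_rows)
        return len(new_rows)

    def vacuum(self):
        """Drop superseded files; return how many were removed."""
        removed = len(self.stale)
        self.stale.clear()
        return removed


table = ToyDeltaTable(rows_per_file=1000)
table.load({i: "v0" for i in range(3000)})
rows_rewritten = table.update(1500, "v1")   # one changed row, one whole file rewritten
print(rows_rewritten, len(table.active), len(table.stale))
print(table.vacuum())
```

In this model the cost of changing one row is proportional to `rows_per_file`, and the superseded copy keeps occupying storage until `vacuum()` runs.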
Is the 1 GB file size bad for tables when we plan to make many updates to the records? It seems so; could someone confirm?
Which is worse: many small files that cause problems for reads, or one or a few very large files that cause problems for updates?
(By the way, in my opinion record updates should be kept to a minimum, but this is how the features are announced. In fact, even the Data Warehouse uses this file format for storage.)
Kind Regards,
Dennes
Hello @DennesTorres
I think this is a catch-22 situation: if you keep the file size at 1 GB, updates are slow but reads are fine; otherwise, updates are fine and reads are slow. As I understand from your post, you do not perform many updates but do perform a lot of reads, so I think going ahead with the 1 GB file size should be the right approach.
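A rough back-of-the-envelope sketch of that trade-off in Python is below. The function name and parameters are hypothetical, and the model is deliberately crude: it assumes every file is exactly the target size and that each touched file is fully rewritten, and it ignores compression, file skipping, and any engine optimization that avoids full rewrites.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3


def file_size_tradeoff(table_bytes, target_file_bytes, files_touched_per_update):
    """Return (file_count, bytes_rewritten_per_update) for a crude copy-on-write model."""
    # More files means more files to open and list on a full scan.
    file_count = math.ceil(table_bytes / target_file_bytes)
    # An update cannot touch more files than exist.
    touched = min(files_touched_per_update, file_count)
    # Each touched file is assumed to be rewritten in full.
    bytes_rewritten = touched * min(target_file_bytes, table_bytes)
    return file_count, bytes_rewritten


# Compare two target sizes for a 100 GiB table and an update that hits one file.
for target in (GIB, 128 * MIB):
    files, rewritten = file_size_tradeoff(100 * GIB, target, 1)
    print(f"target={target // MIB} MiB files={files} rewritten={rewritten // MIB} MiB")
```

Under these assumptions, smaller files reduce the bytes rewritten per update but multiply the number of files a read has to open, which is the catch-22 described above.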
Thanks
Himanshu
Hi,
Thank you, I agree, and it's important to get this kind of feedback.
I believe that, as a result, when someone decides to use an upsert as opposed to a type 2 slowly changing dimension (SCD), this has a consequence, and they should be aware of it when deciding whether a 1 GB file size is good for their solution. Or not? Any thoughts about this?
I'm also wondering why we have no control over this in the Data Warehouse; I posted a separate question about it.
Kind Regards,
Dennes