arpost
Advocate V

Does loading to Delta table from Files duplicate data in a lakehouse?

Greetings, community. I have a bunch of CSV files I'm planning to load into a Lakehouse. From there, I'm considering loading them into Delta tables where possible. Does this duplicate the data, though, since it is persisted once in its "raw" file format and then written again in Parquet format for the Delta table?

1 ACCEPTED SOLUTION
AndyDDC
Solution Sage

Hi @arpost, yes, this will duplicate the data: the CSV files stay under Files and a second copy is written as Parquet for the Delta table. But you are transforming the data into a far better and more efficient format when saving as Delta, and the underlying Parquet will be compressed and likely smaller than the source CSVs.
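
For illustration, here is a minimal sketch of the load described above, assuming a Spark notebook with a Lakehouse attached so that `spark` is available and relative `Files/...` paths resolve. The file path, the naming rule in `table_name_for`, and the function `load_csv_to_delta` are hypothetical and not part of the original answer.

```python
import re
from pathlib import PurePosixPath


def table_name_for(csv_path: str) -> str:
    """Derive a table name from a CSV path (hypothetical naming rule)."""
    stem = PurePosixPath(csv_path).stem
    # Replace anything that is not a letter, digit, or underscore.
    return re.sub(r"[^0-9a-zA-Z_]", "_", stem).lower()


def load_csv_to_delta(spark, csv_path: str) -> str:
    """Read a raw CSV and write it out as a Delta table.

    `spark` is assumed to be an active SparkSession, as provided in a
    Spark notebook. The source CSV is left untouched, so the data then
    exists twice: once as CSV and once as Parquet behind the Delta table.
    """
    df = (
        spark.read
        .option("header", "true")       # assumes the first row holds column names
        .option("inferSchema", "true")  # assumes schema inference is acceptable
        .csv(csv_path)
    )
    table = table_name_for(csv_path)
    df.write.format("delta").mode("overwrite").saveAsTable(table)
    return table
```

In a notebook this might be called as `load_csv_to_delta(spark, "Files/raw/Sales-2024.csv")`, which would write a table named `sales_2024` while the original CSV stays in place.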
