Data types in Delta
In the olden days of data warehousing, a developer/modeler would be careful to optimize data types (especially on large fact tables) for a variety of reasons.
I'm wondering if there's still a reason to care about this in a Fabric world.
- I see that the Data Warehouse supports many T-SQL types
- But these probably map to Delta logical types under the hood
- Which themselves map to Parquet primitives... where the smallest type (besides the bit) is an INT32 (see the sketch below)
So... any reasons to keep learning about data type optimization, or should we just drop this whole topic altogether?
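A minimal sketch of that mapping, assuming a Spark session with Delta Lake support (for example a Fabric notebook with a default lakehouse attached); the table path and column names are made up. Types declared at the Spark/Delta level are logical annotations over a small set of Parquet physical types, so a ShortType and an IntegerType column both end up in INT32 storage:

```python
from decimal import Decimal
from pyspark.sql import SparkSession
from pyspark.sql.types import (StructType, StructField, ShortType,
                               IntegerType, LongType, DecimalType)

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("small_key",  ShortType(),       False),  # logical INT(16) annotation, INT32 storage
    StructField("normal_key", IntegerType(),     False),  # INT32 storage
    StructField("big_key",    LongType(),        False),  # INT64 storage
    StructField("amount",     DecimalType(9, 2), False),  # precision <= 9 can be stored as an INT32-backed decimal
])

df = spark.createDataFrame([(1, 100, 1_000_000_000_000, Decimal("9.99"))], schema=schema)
df.write.format("delta").mode("overwrite").save("Tables/type_demo")  # hypothetical path

# To confirm the physical vs. logical types, inspect the footer of one of the
# Parquet part files, e.g. with pyarrow:
#   import pyarrow.parquet as pq
#   print(pq.read_schema("<path to a part-*.parquet file under Tables/type_demo>"))
```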
Well, I did offer some advice as to why we still want to optimise data types. The engine will provision the resources necessary to process data operations based on data types and precision. The larger the data type and precision, the more resources are used for processing. If you keep your data type sizes to a minimum, this helps the engine provision only what it needs.
Also, if we define our tables with data type sizes in Delta, it will stick to the declared size (e.g. you cannot declare a string as 100 characters and then insert 200 characters into it).
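As a small, hedged illustration of that last point (assuming a Spark session with Delta Lake, e.g. a Fabric notebook; the table name is made up), Spark's VARCHAR(n) on a Delta table rejects values longer than the declared length:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Declare a bounded string column on a Delta table (table name is hypothetical).
spark.sql("CREATE TABLE IF NOT EXISTS length_demo (code VARCHAR(10)) USING DELTA")

spark.sql("INSERT INTO length_demo VALUES ('ABC')")  # 3 chars <= 10: succeeds

try:
    spark.sql(f"INSERT INTO length_demo VALUES ('{'X' * 20}')")  # 20 chars > 10
except Exception as err:
    # Spark enforces the char/varchar length and fails the insert.
    print(err)
```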
Hi @fredforest
Glad that your query got resolved.
Please continue using the Fabric Community for any help with your queries.
Hi @fredforest
We haven't heard from you since the last response and were just checking back to see whether you have a resolution yet.
If you do have a resolution, please share it with the community, as it can be helpful to others.
Otherwise, please reply with more details and we will try to help.
Thanks
Hi,
No, besides @AndyDDC, who shared the same question I had, I have sadly not had any further information on the topic.
Hi @fredforest
In addition to @AndyDDC's points: Microsoft Fabric is a unified platform that allows you to store, process, and analyze data using various compute engines, such as Power BI, SQL, Spark, and others. All these engines can access the same data stored in a central lakehouse, which uses the Delta Lake table format. Delta Lake is an open-source storage layer that supports ACID transactions, schema enforcement, and time travel on top of Parquet files.
When you create a table in Microsoft Fabric, you can specify the data types for each column using T-SQL syntax. However, not all T-SQL data types are supported by Parquet, and some of them may have different precisions or representations. Therefore, Microsoft Fabric will map the T-SQL data types to the corresponding Parquet types under the hood and apply some optimizations to improve the performance and compression of the data.
While Microsoft Fabric offers significant optimizations, data type optimization remains a valuable practice for storage efficiency, query performance, data integrity, compatibility, and adherence to best practices. Understanding these implications will enable you to make informed decisions for your data warehouse design and ensure optimal performance and data quality within Microsoft Fabric.
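As a rough sketch of how you could check that mapping yourself in a notebook (the table name is a placeholder, and this assumes the notebook is attached to the lakehouse holding the table):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# 'fact_sales' is a placeholder for an existing Delta table in the attached lakehouse.
spark.sql("DESCRIBE TABLE fact_sales").show(truncate=False)        # Delta/Spark logical type per column
spark.read.format("delta").load("Tables/fact_sales").printSchema() # same information via the DataFrame API

# The Parquet physical types behind those logical types can then be confirmed
# on one of the underlying part files (e.g. with pyarrow.parquet.read_schema).
```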
For more details please refer: Link
I hope this helps. Please do let us know if you have any further questions.
Hi @fredforest
We haven't heard from you since the last response and were just checking back to see whether you have a resolution yet.
If you do have a resolution, please share it with the community, as it can be helpful to others.
If you have any question relating to the current thread, please let us know and we will try our best to help you.
If you have a question about a different issue, we request that you open a new thread.
Thanks
That's a great question @fredforest. Personally, I'm going to keep optimising data types even in this new world of Delta and Parquet. One of the best practices I worked with in Synapse serverless (which the lakehouse SQL endpoint and warehouse endpoint are built from) was optimising data types, because the data types of the columns used in a workload directly impact the amount of cluster resources given to process that workload: the larger the data type, the more resources are provided. Also, when joining data across different Delta tables, you still can't beat integers.
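A minimal sketch of that practice (all table and column names are hypothetical): cast columns down to the smallest types that safely hold the data before writing the fact table, and keep joins on compact integer surrogate keys.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Read a raw Delta table (placeholder path) and right-size the column types.
raw = spark.read.format("delta").load("Tables/raw_sales")

fact = (
    raw
    .withColumn("store_key", F.col("store_key").cast("smallint"))   # small dimension key
    .withColumn("quantity",  F.col("quantity").cast("int"))
    .withColumn("amount",    F.col("amount").cast("decimal(9,2)"))  # right-sized precision
)

fact.write.format("delta").mode("overwrite").save("Tables/fact_sales")

# Downstream joins then run on narrow integer keys rather than wide strings:
#   dim_store = spark.read.format("delta").load("Tables/dim_store")
#   fact.join(dim_store, "store_key")
```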
