delaneyjwh
New Member

Data reduction techniques for high cardinality text column?

I have some columns in different tables that take up a LOT of space. One column specifically consumes more than 40% of our data model size. 

 

I know that the typical data reduction methods are:

- Remove columns you don't need
- Remove rows you don't need
- Convert data types to numeric values when possible

 

I do need these columns, and I have already reduced the number of rows as much as I can. The data types for these columns are text because the values are in this format: "a36be-f3c5-d293f93da2-f03df-a49f".

 

The high cardinality of the data for these columns is blowing up our model size. What would be the best way to reduce our data size without removing data from our model entirely?

lbendlin
Super User

You cannot apply techniques like splitting a datetime into separate date and time parts to GUIDs. By their very nature, GUIDs have high cardinality. You could theoretically replace the GUID with an integer index column, but that would only reduce the storage needs, not the cardinality.
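
To illustrate the integer index idea, here is a minimal Python sketch, not a Power BI feature. It assumes the mapping is built somewhere upstream of the model (for example in the source database or an ETL step). The function name `build_surrogate_keys` and the second sample GUID are made up for the example.

```python
def build_surrogate_keys(guids):
    """Map each distinct GUID string to a small integer, in first-seen order."""
    key_of = {}   # GUID text -> integer surrogate key
    keys = []     # one integer per input row
    for guid in guids:
        if guid not in key_of:
            key_of[guid] = len(key_of) + 1
        keys.append(key_of[guid])
    return keys, key_of


# Sample rows: the first value is from the question, the second is hypothetical.
rows = [
    "a36be-f3c5-d293f93da2-f03df-a49f",
    "b71cd-09ae-117c02be44-9d2e1-c310",
    "a36be-f3c5-d293f93da2-f03df-a49f",
]

keys, key_of = build_surrogate_keys(rows)

# The number of distinct values is the same before and after the swap.
distinct_guids = len(set(rows))
distinct_keys = len(set(keys))
print(keys, distinct_guids, distinct_keys)
```

Each row now carries a short integer instead of a 32-character string, but the count of distinct values is identical, which is the point made above. If the original GUID text still has to be shown in reports, the lookup table has to be kept somewhere, so the saving depends on whether that text can stay out of the model.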
