Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Mays-Alreem
Frequent Visitor

Data Modeling for a concatenated column

I am working on a project in power bi when I making the data modeling, one of the columns has multiple values about country code, I want to separate them and then transform them into a dimension tableScreenshot 2023-06-04 163854.png

 

but when I merge it with the origin table which is considered to be a fact, I have a problem which is increasing the size of the fact table as an example from 10000 rows to 84000 rows, also it causes (many to many) relationship if I add an index for the fact table, is this way will be right?
can you suggest any other way much optimize and memory size saving?
I attached a screenshot for the source column and screenshots for the two dimension tables I created. Screenshot 2023-06-04 164526.png

 

Screenshot 2023-06-04 164541.png

 

5 REPLIES 5
Mays-Alreem
Frequent Visitor

Hello @NaveenGandhi 

Thanks for your help.

In fact, I am training on a dataset from the net, I took it from this site

Products Catalog - from E-commerce Retail site NewChic.com - dataset by jfreex | data.world

I want to learn the optimized way to modeling the data.

 

Hi @Mays-Alreem ,

 

Since you said 'I want to learn the optimized way to modeling the data.' and it seems you have just started learning about it.

 

The data set you are using is a sample data, we are, at least I am not sure for what purpose data would have collected. Let's say its product sales data in different countries then our interest is to analyze the sales by country. You also felt the same way, so created a country dim table, right?

 

Now the sample table showed here, contains multiple countries in one column. In practical sense, sometime in transactional softwares data is collected this way, but there is a difference in transactional process and reporting process. So to make the data reporting ready, we have to transform it.

 

The said table is basically a fact table and as far as I would like, will not use in the same format. Still I will try to answer all the question which may likely be relevant for this dataset. So there is as such no need to keep using this raw data table. So, we would first remove any unwanted columns, split country column by some delimiter, in your case a comma and then unpivot all country columns. So this wide table will convert to a long table.

 

I assume that ID column of this table is just a identity column for individual transactions and you may or may not need use that, so not much worry about in intial stage, just keep it.

 

Then create a relationship between dim country table you created and the transformed fact tables country code.

 

Now it should be 1 to many relationship between countries. Such design will help you to answer many complex question fairly easily since base design is good/optimum. Same way, you would create dim tables for transactions dates etc and all such dim table shall be used in the slicers.

 

The above concept is bascially dimensional modelling and I am fond of this simple concept becuase it works beautifully in efficient and flexible report creation.

 

I can not access the data without joining the link you provided.

 

If you wish to know more, please google it or let me know.

 

Hope it helps.

@Mays-Alreem 

Check the attached PBIX. Basically this will allow you to use unique CodCountry to be used in a slicer and filter the accessories table data accordingly. I have created a measure "Filter_" and used in the tables visual level fitler.

Since the CodCountry is just a representation of cod available for countries, there is no much use in trying to make it as a dim table anyways.

Let me know if this helps!

If this post helps, then please consider Accept it as the solution to help the others find it more quickly. Appreciate you kudos!!



@NaveenGandhi

  • Thanks alot for your efforts but this way will remove alot of records, I solved it with many to many relationship as in the pictures I put above, the problem was the direction of the cardinality it was one at the side of the fact table but it's worked good then used the crossfilter function in DAX in some visuals... Thanks again for your help. 
NaveenGandhi
Super User
Super User

Hello @Mays-Alreem 

If you want the dimension table(Unique COD Country Sliicer) to just filter the data in a table visual, I have a workaround. But there isn't a way to normalize this further and create many-1 relationship, since 1 ID has multiple CODCountry.

Share a sample data if you want solution just for filtering.

Let me know if this helps!

If this post helps, then please consider Accept it as the solution to help the others find it more quickly. Appreciate you kudos!!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.