Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
pyampy
Frequent Visitor

Flat File to Star Schema

Hi,

 

Although a lot of posts exist on this topic "Flatfile to star schema", I have been unable to wrap around the process that suits my situation.

 

I have a flat file that looks like the following. The Geography has 50 possible parameters, Sex has 3, Age Groups has 100 and Residence Settings has 3. And the rows repeat for all possible combinations giving population data for the groups

 

I uploaded this to PowerBI as a CSV file. Applying filters and visualizing this data is extremely slow. Which makes me want to convert this a star schema. But having watched a ton of videos, and having read through this form, I am still unsure how to achieve this. 

 

Do I just make a "demension" table for Sex, generate a reference ID and bring it over to this table? And similarly do the same for other dimensions? I don't see that reducing the duplicates or making anything faster since the number of rows in this table will be unaffected. 

 

Any advice would be appreciated!

 

GeographySexAge Groups

Residence Setting

Population Group #1Population Group #2Population Group #3Population Group #4
City #1TotalTotal

Apartment

100205010
City #1TotalTotal

Detached

20506050
City #1TotalTotal

Semi-detached

2050100
City #1Total0 to 14 years

Apartment

............
City #1Total0 to 14 years

Detached

............
.........

...

............
City #1Male50 to 54 years

Apartment

............
.........

...

............
City #50Female60 and over

Semi-detached

............
3 REPLIES 3
lbendlin
Super User
Super User

Unpivot your population group data.  Separate out the dimensions you mentioned and replace them with integer keys in the fact table.

Thank you! When you say unpivot, how would the table look like - Would it look like the following with figures only in one column?

 

GeographySexAge Groups

Residence Setting

GroupValue
GeoKeySKeyAgeKey

ResKey

GroupKey_Pop#1100
GeoKeySKeyAgeKey

ResKey

GroupKey_Pop#220

 

I understand that the geography table will have just a list of all geographies, similar for Sex (Total, Male and Female), Age (all ages) and Residence Setting (Apartment, detached, semi-detached).

yes, that looks right.

Helpful resources

Announcements
July 2024 Power BI Update

Power BI Monthly Update - July 2024

Check out the July 2024 Power BI update to learn about new features.

Top Solution Authors