Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreShape the future of the Fabric Community! Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions. Take survey.
Hi,
Although a lot of posts exist on this topic "Flatfile to star schema", I have been unable to wrap around the process that suits my situation.
I have a flat file that looks like the following. The Geography has 50 possible parameters, Sex has 3, Age Groups has 100 and Residence Settings has 3. And the rows repeat for all possible combinations giving population data for the groups
I uploaded this to PowerBI as a CSV file. Applying filters and visualizing this data is extremely slow. Which makes me want to convert this a star schema. But having watched a ton of videos, and having read through this form, I am still unsure how to achieve this.
Do I just make a "demension" table for Sex, generate a reference ID and bring it over to this table? And similarly do the same for other dimensions? I don't see that reducing the duplicates or making anything faster since the number of rows in this table will be unaffected.
Any advice would be appreciated!
Geography | Sex | Age Groups | Residence Setting | Population Group #1 | Population Group #2 | Population Group #3 | Population Group #4 |
City #1 | Total | Total | Apartment | 100 | 20 | 50 | 10 |
City #1 | Total | Total | Detached | 20 | 50 | 60 | 50 |
City #1 | Total | Total | Semi-detached | 20 | 50 | 10 | 0 |
City #1 | Total | 0 to 14 years | Apartment | ... | ... | ... | ... |
City #1 | Total | 0 to 14 years | Detached | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... |
City #1 | Male | 50 to 54 years | Apartment | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... |
City #50 | Female | 60 and over | Semi-detached | ... | ... | ... | ... |
Unpivot your population group data. Separate out the dimensions you mentioned and replace them with integer keys in the fact table.
Thank you! When you say unpivot, how would the table look like - Would it look like the following with figures only in one column?
Geography | Sex | Age Groups | Residence Setting | Group | Value |
GeoKey | SKey | AgeKey | ResKey | GroupKey_Pop#1 | 100 |
GeoKey | SKey | AgeKey | ResKey | GroupKey_Pop#2 | 20 |
I understand that the geography table will have just a list of all geographies, similar for Sex (Total, Male and Female), Age (all ages) and Residence Setting (Apartment, detached, semi-detached).
yes, that looks right.
User | Count |
---|---|
93 | |
92 | |
85 | |
83 | |
49 |
User | Count |
---|---|
150 | |
142 | |
112 | |
73 | |
55 |