The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Good day,
I need advice on how to solve this situation. I have "Job" tables from the ERP system that contain individual jobs. This table has dozens of new items every day. I also have a table "CalcProcessCost" in which there are individual calculation items such as operation 1, operation 2, etc. and their prices. I don't know how to create a data model. CalcProcessCost is definitely a fact table. Is it correct to specify the "Job" table as a dimensional table, even though new items are added to it all the time? Cardinality is 1:N.
Thank you in advance for the answers
Solved! Go to Solution.
From my point of view, it is appropriate to specify the "Job" table as a dimensional table,
even though you will have new items which can be added to it all the time. Having dimension tables that are updated frequently with new records is known as a Type 2 dimension,
where each change to a dimension attribute creates a new record in the dimension table.
In your case, the "Job" table can be a Type 2 dimension, where each new job added to the table creates a new record with a unique identifier,
and any changes to the attributes of an existing job create a new record with a new version identifier.
Your fact table "CalcProcessCost" would then have a foreign key reference to the dimension table using the unique identifier of the job.
The cardinality of the relationship between the "Job" table and the "CalcProcessCost" table is 1:N, which means that each job can have multiple calculation items associated with it. Therefore, the fact table "CalcProcessCost" would have a foreign key reference to the "Job" dimension table.
From my point of view, it is appropriate to specify the "Job" table as a dimensional table,
even though you will have new items which can be added to it all the time. Having dimension tables that are updated frequently with new records is known as a Type 2 dimension,
where each change to a dimension attribute creates a new record in the dimension table.
In your case, the "Job" table can be a Type 2 dimension, where each new job added to the table creates a new record with a unique identifier,
and any changes to the attributes of an existing job create a new record with a new version identifier.
Your fact table "CalcProcessCost" would then have a foreign key reference to the dimension table using the unique identifier of the job.
The cardinality of the relationship between the "Job" table and the "CalcProcessCost" table is 1:N, which means that each job can have multiple calculation items associated with it. Therefore, the fact table "CalcProcessCost" would have a foreign key reference to the "Job" dimension table.
Thank you for answer.
I would also like to know how to solve the situation when I have the "Customer" table, which is linked to the "Job" table. If I were to connect it like this, it would not be a Star Schema but rather a snowflake.
I read everywhere that star schema is better than Snowflake
I solved it by power querying the CustomerId from the "Job" table to "CalcProcessCost" so that I can directly join the "Customer" table with "CalcProcessCost"
Am I doing this right?
thank you
Can you please provide a portion of your data and the structure of your tables so we can help you ?
My point is that the data from the EPR system is structured in such a way that the "CalcProcessCost" table is not basically linked to the "Customer" table.
I completed the link by using merge Queries in POWERQUERY and adding the Calculation.Job.CustomerID link (green in the picture) from the "Jobs" table to the "CalcProcessCost" table. The simplest thing would be to connect the "Customer" table with the "Jobs" table, because they can be related to each other in the beginning, but then it would no longer be StarSchema. I also had to fill in the CreationDate (yellow in the picture) from the "Jobs" table.
The model works and the reports work, but I don't know if I'm making it unnecessarily complicated just because I want StarSchema.
My model:
model from ERP:
Based on the info you provided, your star schema seems logical to me.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
User | Count |
---|---|
118 | |
87 | |
75 | |
53 | |
45 |
User | Count |
---|---|
135 | |
121 | |
77 | |
65 | |
64 |