Learn from the best! Meet the four finalists headed to the FINALS of the Power BI Dataviz World Championships! Register now
Data nuts...
I have a desktop query built that merges in several tables to create an "uber" view of employee data. I have noticed as I pull in more attributes, the query takes longer and longer to execute, as you'd expect. All of the various tables have common connections like employee ID or similar which is what I use to make joins.
I haven't played much with keeping the tables more "pure" (no joins), and then use the DAX side and use the relationships to make the data accessible when building visualizations. My prior approach was the mega-join, so that I can have the filters and slicers operational on any of the data pieces I want.
This last week I built calculated columns in DAX to a "pure" source. It seems to work ok, I haven't pushed further into it to do a true performance comparison, but before I go there and put in the work, wanted to get others' opinions if this is a better way (better=performance).
thanks!
Solved! Go to Solution.
Can you show us some screenshots ?
The standard approach would be to have one or more "fact" tables that contain dates and amounts and multiple "dimension" tables that link to one or both of the fact tables that contain information about the "facts" that you use to filter on.
You should try for a "star" schema, where you do not have joins between multiple dimension tables. Often the "lookup fields" in dimension tables are denormalized and just hold the actual value, rather than a foreign key. Sometimes fact tables are denormalized, the table of order lines has the information from the order table added to it, for example.
I would think that the engine would be fastest working along relationships... looking things up in code would probably be the most expensive way.
I don't know if you have dax studio...it can show you execution times for your queries and is a big help in optomizing.
Help when you know. Ask when you don't!
forgot,
this post on bislogans.com gives some resources for creating star schemas https://bislogans.com/2-convert-your-data-to-a-star-schema/
Help when you know. Ask when you don't!
Can you show us some screenshots ?
The standard approach would be to have one or more "fact" tables that contain dates and amounts and multiple "dimension" tables that link to one or both of the fact tables that contain information about the "facts" that you use to filter on.
You should try for a "star" schema, where you do not have joins between multiple dimension tables. Often the "lookup fields" in dimension tables are denormalized and just hold the actual value, rather than a foreign key. Sometimes fact tables are denormalized, the table of order lines has the information from the order table added to it, for example.
I would think that the engine would be fastest working along relationships... looking things up in code would probably be the most expensive way.
I don't know if you have dax studio...it can show you execution times for your queries and is a big help in optomizing.
Help when you know. Ask when you don't!
Thanks for reply...I'm not a DB person, so I'm trying to learn as I go. Here's a sample scenario:
| Employee ID | Phone Number | Job Title | Manager Name | Manager ID | |
| 2 | 123-456-7890 | CEO | Chairperson of Board | 1 | |
| Employee ID | Address | City | State | Zipcode | Country |
| 2 | 123 Main Street | Anytown | YZ | 12345 | USA |
| Employee ID | Salary | Denomination | Bonus | ||
| 2 | $1 | USD | $1 |
|
3 tables from 3 different sources (don't ask why they're not in the same source, they're just not).
I would want to make analyses from each of the 3 tables based on different things...like how much salary do employees in City A cost vs. B, etc.
Thus I had merged them all together to be able to perform the visualizations.
I guess my fundamental question comes down to this.... (I'm a SW developer by past, but not a DB one, so excuse the analogy):
The data comparisons between tables are either a "compile time" or "run time" effort. The compile time is the queries and table joins...whereas the "compile time" is looking through the 2 different tables and doing the "join/compare" while I'm clicking through my visualizations.
If I had big tables with filters that changed the visualizations, wouldn't the "run time" effort be at or nearly the same as a "compile time"? Unless the run-time version can greatly reduce the data being compared from any filters applied, I don't see a material performance benefit, given it would still have to compare row by row of each related table.
So in all of this testing I realized my relationships weren't right. This led me down the mega-merge path...and in re-creating the mega merge in DAX I realized my relationships were wrong. Fixing them simplified everything...It's always all about the relationships. 🙂
Let the people say "Amen"
Help when you know. Ask when you don't!
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 45 | |
| 35 | |
| 30 | |
| 15 | |
| 15 |
| User | Count |
|---|---|
| 58 | |
| 56 | |
| 38 | |
| 21 | |
| 21 |