I have a large dataset, about 2.6 million rows, and I need to reduce the loading time; it currently takes more than a minute to load.
The DAX measures that calculate the starting count (count_min), the ending count (count_max), and turnover for multiple values take too long to execute.
The data is split across multiple tables, with a distinct user table as the fact table. The other tables hold records keyed by date, and I need to calculate the values for each table at a specific point in time. That requires searching each table for the max date before the selected date, and the same logic has to be repeated for each table; there are at least 5 different tables I need to pull data from.
This is one of the main measures, the count at the starting date:
count_min =
var SelectedMaxDate =
// Min('Calendar'[Date])
[relative_date_min]
// DATE(2024,2,31)
var selectedDealer =
Values('Dealer/District/Zone/Region_Lookup'[DealerID])
// SELECTEDVALUE('Dealer/District/Zone/Region_Lookup'[DealerID])
// "10130"
var selectedJob =
Values(Job_Lookup[JobID])
// SELECTEDVALUE(Job_Lookup[JobID])
// "D102"
var selectedTenure =
Values(TenureSort_Reference[TenureSort])
var selectedDriveR =
Values(DriveReimbursement_Shadow[DriveParticipation])
RETURN
CALCULATE(
SUMX(
'User_Shadow',
CALCULATE (
COUNTX(
FILTER(
User_Shadow,
// Calculation_shadow[latestJob_min] = Maxx(RELATEDTABLE(Job_Shadow),Job_Shadow[PRIMARY_JOB_CD]) &&
// Calculation_shadow[latestDealer_min] = Maxx(RELATEDTABLE(Dealer_Shadow),Dealer_Shadow[DLR_CD])
// &&
Calculation_shadow[latestDealer_min] in selectedDealer
&& Calculation_shadow[latestJob_min] in selectedJob
&& Calculation_shadow[latestTenure_min] in selectedTenure
&& Calculation_shadow[latestDriveR_min] in selectedDriveR
&&
NOT(Calculation_shadow[latestJob_min] IN {"D345", "D346", "D347", "D365", "D366", "D367"})
&& LEFT(Calculation_shadow[latestJob_min], 1) = "D"
&&
// ( LatestHireBeforeSelectedDate = Calculation_shadow[latestHire_min]
// && Calculation_shadow[latestHire_min] <= SelectedMaxDate )
Calculation_shadow[latestHire_min] <= SelectedMaxDate
&& (
ISBLANK(Calculation_shadow[latestFire_min])
|| (
// Calculation_shadow[latestFire_min] <= SelectedMaxDate
// &&
NOT(DATEVALUE(Calculation_shadow[latestFire_min]) > DATEVALUE(Calculation_shadow[latestHire_min]))
)
)
),
// DISTINCT(User_Shadow[ExternalCode])
User_Shadow[ExternalCode]
)
,CROSSFILTER(
'Job_Lookup'[JobID],
Job_Shadow[PRIMARY_JOB_CD]
,None
)
,CROSSFILTER(
'Dealer/District/Zone/Region_Lookup'[DealerID],
Dealer_Shadow[DLR_CD]
,None
)
)
)
) +0
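One pattern worth checking in the measure above: the outer SUMX over 'User_Shadow' re-evaluates the inner CALCULATE(COUNTX(FILTER(User_Shadow, ...))) for every user row, so the filter scan runs once per user instead of once per query. Below is a minimal single-pass sketch with the same predicates, assuming the same measure and column names as above (untested against your model, so verify the totals match before adopting it):

```dax
count_min_sketch =
VAR SelectedMaxDate = [relative_date_min]
VAR selectedDealer = VALUES ( 'Dealer/District/Zone/Region_Lookup'[DealerID] )
VAR selectedJob    = VALUES ( Job_Lookup[JobID] )
VAR selectedTenure = VALUES ( TenureSort_Reference[TenureSort] )
VAR selectedDriveR = VALUES ( DriveReimbursement_Shadow[DriveParticipation] )
RETURN
    CALCULATE (
        -- One scan of User_Shadow, counting each qualifying row once
        COUNTROWS (
            FILTER (
                User_Shadow,
                Calculation_shadow[latestDealer_min] IN selectedDealer
                    && Calculation_shadow[latestJob_min] IN selectedJob
                    && Calculation_shadow[latestTenure_min] IN selectedTenure
                    && Calculation_shadow[latestDriveR_min] IN selectedDriveR
                    && NOT ( Calculation_shadow[latestJob_min] IN { "D345", "D346", "D347", "D365", "D366", "D367" } )
                    && LEFT ( Calculation_shadow[latestJob_min], 1 ) = "D"
                    && Calculation_shadow[latestHire_min] <= SelectedMaxDate
                    && (
                        ISBLANK ( Calculation_shadow[latestFire_min] )
                            || NOT ( DATEVALUE ( Calculation_shadow[latestFire_min] ) > DATEVALUE ( Calculation_shadow[latestHire_min] ) )
                    )
            )
        ),
        CROSSFILTER ( 'Job_Lookup'[JobID], Job_Shadow[PRIMARY_JOB_CD], NONE ),
        CROSSFILTER ( 'Dealer/District/Zone/Region_Lookup'[DealerID], Dealer_Shadow[DLR_CD], NONE )
    ) + 0
```

If ExternalCode is not unique per row of User_Shadow, swap COUNTROWS for CALCULATE ( DISTINCTCOUNT ( User_Shadow[ExternalCode] ), ... ) over the same FILTER. The larger cost, though, is likely that every Calculation_shadow[...] reference is itself a measure evaluated per row; any of those that can become calculated columns will help far more than restructuring this outer shell.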
Everything referenced from Calculation_shadow is a measure, with logic similar to the one below:
latestJob_max =
var SelectedMaxDate =
// MAX('Calendar'[Date])
[relative_date_max]
VAR MaxDateBefore =
CALCULATE(
MAX(Job_Shadow[umd_date]),
ALL(Job_Shadow[umd_date],Job_Shadow[PRIMARY_JOB_CD]), // This removes filters only from umd_date and ExternalCode
// Job_Shadow[ExternalCode] = "D00031301",
Job_Shadow[umd_date] < SelectedMaxDate,
REMOVEFILTERS(Job_Lookup)
)
VAR Result =
CALCULATE(
MAX(Job_Shadow[PRIMARY_JOB_CD]),
Job_Shadow[PRIMARY_JOB_CD] in VALUES(Job_Lookup[JobID]),
// ALLEXCEPT(Job_Shadow, Job_Shadow[SomeOtherColumn]), // Keeps filters on SomeOtherColumn, adjust as necessary
// Job_Shadow[ExternalCode] = "D00031301",
Job_Shadow[umd_date] = MaxDateBefore
)
RETURN
Result
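latestJob_max above scans Job_Shadow twice: once for MaxDateBefore and once more for the job at that date. A sketch that folds both steps into a single TOPN pass is below; note the semantics shift slightly (this picks the latest record among the currently selected jobs, whereas the original finds the latest date across all jobs first and can return blank if that record belongs to a non-selected job), so verify which behaviour you actually want:

```dax
latestJob_max_sketch =
VAR SelectedMaxDate = [relative_date_max]
VAR LatestRow =
    -- Single pass: the most recent Job_Shadow row before the selected date,
    -- restricted to the jobs currently selected in Job_Lookup
    TOPN (
        1,
        CALCULATETABLE (
            Job_Shadow,
            Job_Shadow[umd_date] < SelectedMaxDate,
            Job_Shadow[PRIMARY_JOB_CD] IN VALUES ( Job_Lookup[JobID] ),
            REMOVEFILTERS ( Job_Lookup )
        ),
        Job_Shadow[umd_date], DESC
    )
RETURN
    -- MAXX resolves ties on umd_date the same way the original MAX does
    MAXX ( LatestRow, Job_Shadow[PRIMARY_JOB_CD] )
```

The VALUES ( Job_Lookup[JobID] ) filter argument is evaluated in the outer context before REMOVEFILTERS applies, matching the intent of the original Result step.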
count_terminated_range =
var SelectedMaxDate =
// Max('Calendar'[Date])
[relative_date_max]
var SelectedMinDate =
[relative_date_min]
// Min('Calendar'[Date])
// DATE(2024,2,31)
var selectedDealer =
Values('Dealer/District/Zone/Region_Lookup'[DealerID])
// SELECTEDVALUE('Dealer/District/Zone/Region_Lookup'[DealerID])
// "10130"
var selectedJob =
Values(Job_Lookup[JobID])
// SELECTEDVALUE(Job_Lookup[JobID])
// "D102"
var selectedTenure =
Values(TenureSort_Reference[TenureSort])
var selectedDriveR =
Values(DriveParticipation_Lookup[Column1])
return
CALCULATE(
SUMX(
'User_Shadow',
CALCULATE (
COUNTX(
FILTER(
User_Shadow,
// ( Calculation_shadow[latestDealer_min] in selectedDealer || Calculation_shadow[latestDealer_max] in selectedDealer )
// && ( Calculation_shadow[latestJob_min] in selectedJob || Calculation_shadow[latestJob_max] in selectedJob)
// &&
// ( Calculation_shadow[latestTenure_min] in selectedTenure || Calculation_shadow[latestTenure_max] in selectedTenure)
// && ( Calculation_shadow[latestDriveR_min] in selectedDriveR || Calculation_shadow[latestDriveR_max] in selectedDriveR)
[latestJob_max] in selectedJob
&& [latestTenure_max] in selectedTenure
&& [latestDriveR_max] in selectedDriveR
&& NOT(Calculation_shadow[latestJob_max] IN {"D345", "D346", "D347", "D365", "D366", "D367"})
&& LEFT(Calculation_shadow[latestJob_max], 1) = "D"
// &&
// [latestJob] in VALUES(RELATEDTABLE(Job_Lookup))[JobID]
&&
(
(
NOT(ISBLANK([latestFire_max]))
&& (
[latestDealer_max] in selectedDealer
&& SelectedMinDate <= Calculation_shadow[latestFire_max]
&& Calculation_shadow[latestFire_max] < SelectedMaxDate
&&
// not(
// DATEVALUE(Calculation_shadow[latestFire_max]) <= DATEVALUE(Calculation_shadow[latestHire_max])
// )
// DATEVALUE(Calculation_shadow[latestFire_max]) > DATEVALUE(Calculation_shadow[latestHire_max])
// INT(Calculation_shadow[latestFire_max]) > INT(Calculation_shadow[latestHire_max])
FORMAT(Calculation_shadow[latestFire_max], "YYYYMMDD") > FORMAT(Calculation_shadow[latestHire_max], "YYYYMMDD")
// && NOT(DATEVALUE(Calculation_shadow[latestFire_max]) = DATEVALUE(Calculation_shadow[latestHire_max]))
)
)
||
(
HASONEVALUE('Dealer/District/Zone/Region_Lookup'[DealerID])
&& [latestDealer_min] in selectedDealer
&& NOT( [latestDealer_max] = [latestDealer_min] )
)
)
// 'Hire_Shadow'[TRMNTN_DT] >= SelectedMinDate
// && 'Hire_Shadow'[TRMNTN_DT] <= SelectedMaxDate
),
User_Shadow[ExternalCode]
)
,CROSSFILTER(
'Job_Lookup'[JobID],
Job_Shadow[PRIMARY_JOB_CD]
,None
)
,CROSSFILTER(
'Dealer/District/Zone/Region_Lookup'[DealerID],
Dealer_Shadow[DLR_CD]
,None
)
)
)
// ,CROSSFILTER(
// 'Calendar'[Date],
// Hire_Shadow[HIRE_DATE]
// ,None
// )
) +0
count_avgEmp = DIVIDE([count_min] + [count_max],2,0)
turnover_master =
DIVIDE([count_terminated_range], [count_avgEmp], 0)
Where I see the most lag is a line chart that plots turnover for each month, which means calculating count_min, count_max (similar logic, but with the max date), and count_terminated_range for every month.
Is there any way to fix the load time, or to reduce the data that has to be processed?
Any other solution to tackle this issue would also help.
Any help is appreciated, thanks.
- Install DAX Studio
- Learn how to use it to optimize queries (videos on SQLBI.com)
- Examine your queries and refactor them based on your findings.
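To put that advice into practice: paste a query like the sketch below into DAX Studio, enable Server Timings, and compare the formula-engine vs storage-engine split before and after each refactor. 'Calendar'[Date] is an assumption taken from the commented-out references in the measures; substitute whatever month column drives your line chart:

```dax
// Benchmark query for DAX Studio (run with Server Timings enabled).
// 'Calendar'[Date] is assumed from the commented-out lines in the measures.
EVALUATE
SUMMARIZECOLUMNS (
    'Calendar'[Date],
    "count_min", [count_min],
    "count_max", [count_max],
    "count_terminated_range", [count_terminated_range],
    "turnover", [turnover_master]
)
ORDER BY 'Calendar'[Date]
```

This reproduces roughly what the line chart asks the engine to do each time it renders, so timings here should track the visual's load time.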
Hi @spradhan_pm ,
The direction lbendlin provided was pretty good, and his dedication was much appreciated.
Improve the efficiency of your data model by:
1. Removing unnecessary columns
2. Removing unnecessary rows
3. Grouping by and summarizing
4. Optimizing column data types
5. Preferring custom columns
6. Disabling Power Query query load
7. Disabling auto date/time
8. Switching to Mixed mode
More related information can be found in:
Data reduction techniques for Import modeling - Power BI | Microsoft Learn
If there are calculated columns in the model, please consider moving these to Power Query when possible, as calculated columns increase the size of the model and slow down the refresh time.
You can also use Performance Analyzer to see which visual objects take the longest to load and which DAX queries take the longest.
For more information about Performance Analyzer see:
If you have any other questions please feel free to contact me.
Best Regards,
Yang
Community Support Team
If there is any post helps, then please consider Accept it as the solution to help the other members find it more quickly.
If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!
Anything more specific based on my queries?
Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information or anything not related to the issue or question.
If you are unsure how to upload data please refer to https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Please show the expected outcome based on the sample data you provided.
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...