This is best Fabric, Power BI, SQL and AI community event. How do we know? The last event sold out! Save €200 with code FABCMTY200.
Register nowA new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.
With the click of a mouse button, dataset creators of any skill level can improve the query performance of their DirectQuery datasets! In the Power BI portal, display the settings for a DirectQuery dataset, expand the Scheduled refresh and performance optimization section, and toggle the Automatic aggregations training option to On, as in the following screenshot, and don’t forget to configure a data refresh schedule to update your aggregations on a regular basis. That's basically all it takes. For more information about how automatic aggregations can help to improve the performance of your report visualizations, see the Automatic Aggregations Overview in the product documentation.
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
Sometimes, however, when you enable automatic aggregations (auto aggs), you end up with no aggregations and no corresponding performance gains. In these troubled situations, it can be helpful to have more control over the auto aggs process than the user interface provides. Granular control lets you fine-tune auto aggs. You can exclude specific tables by setting a minimum size limit below which you don’t want to generate auto aggs. Or maybe you want to test auto-aggs training with different configuration settings and verify that Power BI indeed added auto aggs to your dataset. The Tabular Object Model (TOM) and the Tabular Model Scripting Language (TMSL) give you the advanced options to cover these scenarios, as explained in this article.
If you are a BI developer and want to follow the .NET Core console application used in this article in your own environment, refer to the attached TOM sample code [attachment=1] for details. Make sure you use the latest Analysis Services client libraries and host your dataset on a Premium or Fabric capacity with XMLA Read/Write enabled. If you prefer to work with TMSL instead, keep your SQL Server Management Studio (SSMS) installation at the latest version. And if you are not a BI developer, you still might find the following troubleshooting guidance useful. So, let's quickly review how automatic aggregations work before diving deeper into various scenarios and code snippets.
For auto aggs to work, you must accomplish the following three essential tasks:
With the basics covered, let’s now dive into a common troubleshooting scenario: You enabled auto aggs, but you don’t notice any perf gains. In this situation, a good first step is to check that you actually have aggs tables in your dataset. Perhaps Power BI didn’t generate any because there just wasn’t any meaningful input data in the query log, or perhaps the dataset hasn’t been refreshed yet so training never ran, or perhaps training didn’t finish within the time limit of 1 hour, which means training hasn’t completed yet and will resume the next day. Auto aggs training can take a long time depending on how fast the data source can process the cardinality queries and other queries that Power BI generates during the training phase. Whatever the reason, if there aren’t any aggs tables in the dataset, your DAX queries can’t leverage aggregations, of course.
Aggs tables are relatively easy to spot in SSMS. Connect to your workspace, expand your DirectQuery dataset, and look for tables with GUIDs as their names. The following screenshot shows one such aggs table named b83b1a0c-5712-4f35-8d73-d1db0a0c6c33. After scripting out this table, you can see that it is a hidden system-managed table with an inferred partition source.
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
In managed code using TOM, it’s equally uncomplicated to check for aggs tables. See the following code snippet. As mentioned earlier, the TOM sample code [attachment=2] is attached to this article.
/// Check if there are any auto aggs tables, /// i.e. system-managed tables with an inferred partition source.
var model = database.Model;
if (model.Tables.Where(t => t.SystemManaged && t.Partitions.All(x => x.SourceType == PartitionSourceType.Inferred)).Any())
{
Console.WriteLine("This dataset has auto aggs tables!");
}
else
{
Console.WriteLine("This dataset does not have any auto aggs tables!");
}
Let’s continue our troubleshooting journey and assume the dataset has no auto-aggs tables yet. A logical next question could be if auto aggs are enabled at all. Now, you might be tempted to look for auto-aggs settings in the dataset. For example, you could check the AutomaticAggregationOptions property of the Model object. Yet, note that Power BI does not store its auto-aggs settings in the model metadata. The following screenshot illustrates this aspect. The dataset owner enabled auto aggs in Power BI, but there are no AutomaticAggregationOptions in the model. The auto-aggs settings that Power BI uses are only available on the dataset settings page.
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
So, what’s the purpose of the AutomaticAggregationOptions property if Power BI doesn’t use it? Its purpose is to store a default auto-aggs configuration that you can apply programmatically to auto-aggs training by using the ApplyAutomaticAggregations method. Power BI doesn’t use this property, but custom solutions can.
Instead of using a default auto-aggs configuration, you can also provide an explicit AutomaticAggregationOptions object as an input parameter to the ApplyAutomaticAggregations method. This is the method Power BI uses to submit its own configuration settings. And this is the method that lets you easily and quickly trigger auto-aggs training with different configuration settings as well. The following snippet code illustrates this approach. Refer to the AutomaticAggregationOptions Class in the API documentation for details about supported properties, such as QueryCoverage, DetailTableMinRows, and AggregationTableSizeLimit.
/// Perform auto aggs training using a one-off configuration.
///
model.ApplyAutomaticAggregations(
new AutomaticAggregationOptions
{
QueryCoverage = 0.5
});
database.Update(Microsoft.AnalysisServices.UpdateOptions.ExpandFull);Still, even a successful completion of the ApplyAutomaticAggregations method is no guarantee that auto-aggs training indeed generated aggs tables. As mentioned several times already, the query log might be empty, so auto-aggs training only “successfully” finds out that there are no aggs to generate. Or auto-aggs training took longer than 1 hour. In this case, you can perform training repeatedly until it finishes. Training is incremental and resumes where the previous cycle left off. Just train, train, train, and when aggs training is finally done, perform a single data refresh operation.
However, how do you decide that aggs training is finished? For this, you must capture the training status raised in an AutoAggsTraining - Progress Report End trace event. This is perhaps most easily accomplished by using SQL Profiler. Make sure you include at least the Progress Report End event class in the trace. The following trace includes an AutoAggsTraining - Progress Report End event with a few important pieces of information:
| Counter | Comments |
| DroppedByUser | Query shapes dropped because the queried table(s) no longer exist in the dataset. |
| ImportMode | Query shapes dropped because the queried table(s) are in import mode, which auto aggs do not cover. |
| LimitedRelationship | Query shapes dropped because of a limited relationship between the queried table(s). |
| RelationshipMismatch | Query shapes dropped because of a relationship mismatch between the queried table(s). |
| CalculatedColumn | Query shapes dropped because the queried table(s) have calculated columns, which is not supported. |
| DualRootTable | Query shapes dropped because the queried table(s) are in dual mode, which auto aggs do not cover. |
| TableExcludedByUser | Query shapes dropped because the queried table(s) are explicitly excluded by the user. |
| UnsupportedMDataSource | Query shapes dropped because the queried table(s) use unsupported M data sources. |
| UnsupportedNativeDataSource | Query shapes dropped because the queried table(s) use unsupported native data source. |
| SSOEnabled | Query shapes dropped because the queried table(s) use an SSO-enabled data source, which is not supported. |
| CardinalityEstimationFailure | Query shapes dropped because the cardinality estimation of the queried table(s) failed. |
| LargeSizeAggs | Query shapes dropped because the aggregation size of the queried table(s) is too large. |
| TableTooSmall | Query shapes dropped because the queried table(s) are too small to create aggregations. |
| LeftUncovered | Query shapes dropped because the queried table(s) are left uncovered because of user-provided training targets. |
| Unknown | Query shapes dropped because of unknown failures. |
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
Congratulations! At this point, we have mastered the first two essential auto aggs tasks. We generated DAX queries and performed auto-aggs training, which added a system-managed aggs table to the dataset. Yet, the aggs table does not yet contain any data, as the DAX query result in the following screenshot reveals. To load the data, we must run data refresh using any of the available options (manual, scheduled, or programmatic).
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
The next screenshot shows this article’s console app performing a programmatic refresh through the following TOM code. The sample code then submits the above DAX query for all system-managed aggs tables to count their rows. In production datasets, there would likely be many aggs tables and any aggs table with actual data can help to improve the performance of your DirectQuery dataset.
/// Refresh the model to import data into the aggs tables. ///
model.RequestRefresh(RefreshType.Full);
model.SaveChanges();
Troubleshooting_automatic_aggregations_to_optimize_DirectQuery_performance
And this concludes this little excursion into auto-aggs processing details. We hope the explanations provide you with sufficient details to master the most common auto-aggs troubleshooting situations. The sample code might also serve as a starting point for a custom solution to add fine-tuned auto aggs to your enterprise BI DirectQuery models. Don’t hesitate to download the attached TOM sample code [attachment=4] and try out different AutomaticAggregationOptions settings. For example, you could set the AggregationTableMaxRows parameter or the DetailTableMinRows parameter and see for yourself how different limits affect the generation of your aggs tables. And as always, please provide us with feedback as you leverage automatic aggregations to boost the performance of your DirectQuery datasets. We would love to hear from you!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.