Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
MyNameHere
Advocate I
Advocate I

Direct Query Modelling & Optimization -> Embedded

I'm trying to build a model and it's driving me nuts.

A lot of questions here. Documenting my thoughts and trying to paint the picture.

 

I have 6 dimensions ( max 200k records) and 2 fact tables with approx 10 million records or so.

I originally denormalised the fact tables to avoid duplicate rows with the same index.

 

Turns out Direct query on  this structure with the auto-generated SQL executes subqueries with "select * "on tables relevant to the query so the wide denormalised tables a can be a severe performance obstacle. Even if the chart only reference one dimension, the query drags in every column from the fact table and the dimension table. Logical reads are out of control.

 

Currently I have written view on top of the fact tables to separate them into subfacts with the  5-6i ndexes and 1 or 2 facts each. I have views on top of the tables that hide the metadata from PBI. (only SELECT ... FROM ...)

Should I physically split these tables out rather than use views?

Should I normalise the fact tables and duplicate the indexes more and verticalise the fact columns?

 

I then upgraded to Azure SQl database premium tier (for the ssd) and changed the fact tables to memory-optimized and added clustered columnstore indexes.

 

Still runs slow. What do I do next?

Upgrading further to Azure SQL database seems overkill. Including the persist layer I have less than 10gb.

SSAS Tabular is not an option cos I cant buy hardware and the idea is to use Power BI embedded.

 

How do I diagnose what the problem is?

Do I need Application Insights?

The cpu time on the queries is low but the elapsed time is 10-12 x the cpu time. 

I dont know if this is in the normal range.

Do I need to reorganize/defrag/maintain the clustered columnstore indexes on the fact tables?

 

Also, how much processing is done on the website vs the database?

Do I need a minimum amount of website hardware?

 

Any help much appreciated.

 

 

2 REPLIES 2
Eric_Zhang
Employee
Employee

@MyNameHere

 

In my opinion, you should start from the Azure database end. Try to trace the sqls sent from Power BI to your azure database and analysis the sql execution plans to tune the performance.

 

Check How do I get SQL Profiler info from SQL Azure?

pqian
Employee
Employee

@MyNameHere, I think the subselect shouldn't cause any problems in SQL engine, it knows how to optimize the query and will fold projections into the subselect. The problem maybe elsewhere. Are your views performing OK (outside of PowerBI)?

 

It maybe worthwhile to trace the SQL statements been send when you perform a report operation. Also, check the cross filtering direction for your relationships. 

 

If you can identify the slow performing SQL, we'll be able to diagnose the performance bottleneck a little better.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.