Hi everyone,
I have a report built on a composite model that uses DirectQuery to fetch newly changed values from Dataverse (we have a Power Apps integration where users can change values, and we want those changes to flow from Dataverse into our Power BI report). Generally speaking, I have a solution that works, but it's not efficient and is causing CPU spikes across users.
One of the main issues that comes up time and time again when trying to optimize the report is how the SQL gets generated when there are filters on our main dimension (let's call it DIMENSION_A) that we split the data by. As you can see in the picture below, instead of querying and then filtering through the data model, it sticks the entire list of IDs into the WHERE clause of the DirectQuery statement. Because my priority is speed to display updates on a medium-sized dataset (50-100k rows), I think this WHERE clause hurts performance rather than helping it.
My workaround so far is essentially to create two tables: an "archive" table in import mode and an "update" table in DirectQuery mode. Using DAX, I can get the "new" records after a certain last-modified date and then limit the entries in DIMENSION_A to only those that have recently changed. My DAX formula includes logic to make sure nothing is double-counted. It's the best I could do without an alternative way to prevent these huge WHERE clauses from being generated. I do see performance gains with this option, but it's not ideal.
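A minimal sketch of what such a hybrid measure could look like. All names here (`FactArchive`, `FactUpdates`, `RowKey`, `Value`) are placeholders I've invented for illustration, not taken from the original post, and the dedup logic is only one way to avoid double counting:

```dax
-- Hypothetical: combine an imported "archive" table with a DirectQuery
-- "updates" table. Archive rows that were superseded by a recent update
-- are excluded so nothing is counted twice.
Total Value :=
VAR UpdatedKeys =
    VALUES ( FactUpdates[RowKey] )        -- keys of recently changed rows
VAR ArchivePart =
    CALCULATE (
        SUM ( FactArchive[Value] ),
        FILTER (
            ALL ( FactArchive[RowKey] ),
            NOT FactArchive[RowKey] IN UpdatedKeys   -- drop superseded rows
        )
    )
VAR UpdatePart =
    SUM ( FactUpdates[Value] )            -- fresh values from DirectQuery
RETURN
    ArchivePart + UpdatePart
```

The key design point is that only `FactUpdates` hits the DirectQuery source, so the expensive round trip is limited to the small set of recently modified rows.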
Note: I'm aware that string keys are inefficient and will be converting them to integers. That doesn't change the fact that the list is being put into the WHERE clause.
Does anyone know an approach to prevent this from happening? I've tried several things already, without success.
Your help is greatly appreciated!
Had a similar problem recently, but even worse. They seem to have introduced a limit in Dataverse, which means some of these large generated queries won't even run any more (error: "Too many literal binary expressions in the query. The maximum number is 200").
My workaround (though it's definitely not a solution) was to use virtual relationships with TREATAS and group by a higher granularity (week start dates instead of dates). To fix this, Microsoft would probably need to introduce an option similar to "Assume referential integrity" for composite datasets that shifts the joining to the Formula Engine so it doesn't introduce a massive WHERE clause. I wouldn't hold your breath on that one!
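The TREATAS workaround described above could look roughly like this. The table and column names (`LocalDates`, `DataverseFact`, `WeekStartDate`, `Value`) are illustrative assumptions, not from the original reply:

```dax
-- Hypothetical virtual relationship: instead of a physical relationship
-- that pushes every date key into the generated WHERE clause, apply the
-- imported date table's filter at week granularity via TREATAS.
Updated Value (weekly) :=
CALCULATE (
    SUM ( DataverseFact[Value] ),
    TREATAS (
        VALUES ( LocalDates[WeekStartDate] ),   -- far fewer distinct values
        DataverseFact[WeekStartDate]
    )
)
```

TREATAS still pushes a filter to the DirectQuery source, but because there are far fewer distinct week-start dates than dates (or dimension IDs), the generated literal list stays small enough to run under the 200-expression limit.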
One thing I realize might be a suggestion here is moving the attributes I'm filtering DIMENSION_A on into Dataverse itself. That would work for more traditional characteristics of DIMENSION_A. However, my report often filters on calculated measures that are specific to the user, so that isn't a general solution to the problem.