Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Power BI service uses incoherent query caches

The behavior of reports in the Power BI service means that after a model is refreshed, a user opennig a report can see totally inconsistent results as they interact with the visuals - sometimes seeing data before the refresh, and sometimes after. 

 

Repro would be as follows:

1 Have model in the service refreshed with Mondays data

2 User opens report, and selects slicer for Country France:  they see Mondays data

3. User selects slicer for Country Germany :  they see Mondays data

4 Refresh model with Tuesdays data

4 User selects slicer for Country France again :  they see Mondays data (as hitting cache) 

5 User selects slicer for Country UK :  they see Tuesdays data (as missing cache) 

6. As they continue to interact, whether they are seeing Mondays or Tuesday data is totally unknown to them and unpredicatable  - it depends on how the system chooses to use caches

7. User reopens the report sometime later  - it still remains true that could be Mon or Tuesday data, depending on query cache use   [This seemed really surprisng - it certainly occured once for me - I have not tried to reproduce]

8. User refreshes visuals - now everything is sorted out, and everything shows Tuesdays data

 

Note I explained here using example of slicer selection by country, where it might not be obvious to the user they are looking at different versions of the data - though in other cases it can be utterly myterious, and look like PBI is returning different and incorrect data on a whim.

 

I understand query caches need to be used, and that the user has a way to clear IF they know that the model has been refreshed.  But seeing as step 6 is a terrible state to be in - using incoherent caches is bad data - then somewhere on the way to that state, the caches should be discarded. Either:

- at step 5 (the first time query misses the cache, upon discovering the model has been refreshed, the system should clear the old caches)  

- at step 7 (at least when the user reopens the report, which they might reasonably expect is sure to be getting the latest data)

 

Presumably the caches age out - so if the user repeatedly reponed the report over subsequent days, presumably they eventually see Tuesdays data without needing to refresh visuals. So there Power BI is silently refreshing the visuals (discarding cache) with no user interaction - step 5 is a MUCH better point to do this.    

 

Status: Investigating
Comments
Anonymous
Not applicable

Hi @paul_sanders 

Refreshing gives different results depending on your data connection mode and how you set up your refresh. If you are using Import connection mode, when the data source is updated, you need to refresh your model to make sure that the data in your imported data model is up-to-date, and you will be able to get the latest data when you report interaction. If you are using Direct Query or Live Connection connection mode, then you will have the query sent to the data source and return the latest result when you directly interact with the report .
In addition, the time setting for automatic refresh, incremental or full refresh, can all lead to inconsistent refresh results, so you need to combine your refresh settings and data connection mode to determine the specific reason for inconsistent refresh results.

 

Best Regards,
Community Support Team _ Ailsa Tao

paul_sanders
Frequent Visitor

The model is using Import mode, as I think is obvious from step "4. Refresh model with Tuesdays data".  It is doing simple Full refresh.  

So the reason for the inconsistent query caches is because Power BI seems quite happy to use inconconsistent query caches, that have cached results obtained with different data versions of the underlying model. I am trying to argue it should NOT do that - delivering data quickly is great, which is why query caches are used. Delivery data quickly at the expense of delivering correct, coherent data is NOT great. One could go as far as to say it's terrible 🙂 

This widget could not be displayed.