The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi all,
We are running a service using Embedded on Azure along with the Import method, and it refreshes upon trigger during the day. All is working well except that we have been hitting the 3GB memory capacity limit on A1. Then, at times, it falls over rather than throttling back.
It is hugely expensive to move up to A2 (double the price) for just a few seconds of processing time as we creep over the 3GB limit. Our costs are at risk of moving from £6,700 per annum to £13,400 per annum. Our pricing model with our customers doesn't justify it, so we want to keep it on A1 for as long as possible.
Below is a graph from Azure of the Power BI max memory usage in the last 24 hours. you can see it peaking just over 3GB.
This time it didn't fall over. But it is not consistent. I had a conversation with a Microsoft employee on this and he said extra memory is allocated if available. But, at other times, it will refuse to refresh. Instead, we should be seeing a graph hitting the ceiling and holding there as it starts throttling back processing.
[EDIT: note if I graph the average memory usage it peaks at 2.5GB]
We have a plan to reduce the processing time though it remains to be seen whether it is effective. In the meantime, I have a question:
Why does it fall over rather than throttle back? Is it on the roadmap to improve this aspect? Clearly, if we are running an automated refresh on-demand during the day for our customers, then we need guaranteed refreshes. I would be happy if it managed its memory/CPU and slowed things down as we exceeded the limits. We could then make a considered decision to upgrade whilst knowing that refreshing was still taking place.
[EDIT: Why does the memory peak at just one point? Is it just taking advantage of the maximum capacity, or is there some specific reason why it peaks so high? As the average memory usage peaked at 2.5GB, does this mean that it won't fall over, or is that irrelevant?]
(This post http://community.powerbi.com/t5/Developer/New-Power-BI-Embedded-pricing-model-page-render/m-p/306765... provides a little more insight although it concerns the other limit - page renders. Reading the comment posted by Eric Zhang from the product team, the capacity seems to be a combination of CPU and memory. I don't use Direct Query.)
Thanks,
Solved! Go to Solution.
Hi @Anonymous,
I don't know if you have already seen these metris that are added in Azure for PowerBI Embedded.
https://azure.microsoft.com/en-us/updates/power-bi-embedded-azure-memory-metric-available/
https://azure.microsoft.com/en-us/updates/azure-alerts-now-availble-for-power-bi-embedded/
So now you can set some alerts when the memory or the query processing pass or reach some limits.
Hope it helps.
Thanos
We have the exact same problem
We hit the max on an A1 SKU in no time. We were forced to move to an A2 tier, which is DOUBLE the cost. So we switch our capacity to A1 in the evening, then the next morning switch back to A2, manually update the datasets which did not refresh overnight, and then run the day on A2. Come evening and we repeat the procedure.
Even on A2 we hit the maximum without any effort. In fact a single dashboard with not much data pushes us over the limit. And there is no warning - we get to know about it when customers phone in saying the visuals "show me a cross with a circle around it".
We use embedded.
What i see happening though is the memory usage does not come down. It steadily climbs during the day. Only if we pause and restart the server do we get the memory to drop. But then it starts climbing again... see the screen shot below. The Yellow circle is where we restart in the morning, and then it drops to about 500 MB but start to climb. The straight parts on the graphs is where there is NO activity... yet it consumes more than A1 SKU capacity. The spikes are where someone opens a single dashboard. You will see a single dashboard render (with really not much data) pushes it over the limit in no time - mostly because we start at such a high point even before it was rendered.
I have tried logging a call with MS TEchnical, only to find out i have to buy a service package, which I did, under protest... but then when i tried to log a support call it said I don't have the correct rights to log a service request... (I have the rights to buy it, but not use it??)
PBI is a great product, but MS is doing their best to make it unaffordable and unuseable at this point...
The reason that it grows is that it is cacheing the data, I understand. Your graph is very familiar to us! Having to manually switch tiers twice a day is a real pain, but definitely saves on the cost. We considered this but then tried other things first:
1. Avoid work in Power Query - push back changes into SQL Views or similar.
2. Logically break up the pbix so that the relevant data is only loaded when needed. Ok, we haven't done this but MS 'recommended' it. For example, keep data from different years in different pbixes, as most people look at the current year regularly but the previous years rarely.
3. Reduce the size of the pbix - remove columns from tables - especially fact tables, when they are not needed. This helped a lot.
4. Use Direct Query if possible. Haven't tried this as it means re-building a twenty page dashboard with a lot of complexity.
Your phrase " In fact a single dashboard with not much data pushes us over the limit. " is worrying. That shouldn't be happening - is your pbix size enormous? We are running about 55MB in A1 and it peaks at about 2-2.5GB, well within the 3GB limit.
@roelf - let me know how you get on!
Thanks @Anonymous for the feedback.
Some of our queries are definately not optimised - that's for sure.
We run quite a few files though - mainly because the RLS Dashboard we had could not update the dataset... So we had to split them into seperate files - one per customer... which is another pain in the neck.
A typical file for us is less than 5 MB. The datasets on PBI ranges between 1 MB and the biggest one is 19MB - which is not massive... You also say below you run a 20 page file - ours are 4 and some 5
We export the information to PBI.COM in JSON, and then parse and unpack the JSON in Power Query - so maybe we should reduce that ?
Your point on "Logicaly break up the pbix" - how did you manage that? Is the data only "pulled" when you say open a page? We can look at breaking up tables so that they are only called when someone opens the page/sheet
Remove columns - i assume this means don't import them in the data query ? Or are you saying import, but then delete them from the dataset in Power Query ?
Direct Query - we tried this, but the response is super slow... plus we encountered too many measures we wanted to use which are not available in DQ.
I have also been told that the Mem usage grows because of cacheing - but is there a way to clear cache ? That to me is quite a problem, cause that means everytime someone opens a report, there's not way of "un-doing" his usage?
Thanks again for the assistance.
Hi @roelf
Your memory usage will depend on how many customers (and therefore how many pbixes are open at any one time) you have. It could be that it's all working fine but you just have too many pbixes....? How much memory is used by one or by two customers?
JSON unpacking - maybe. We get memory peaks when our pbix is refreshed - it is refreshed on completion of some uploading at various points during the day. For us, to reduce the peak, we needed to reduce the Power Query processing (which takes place on refresh), and to do that we needed to optimise query folding - which meant in turn to reduce as much M code as possible. The reasons for your max memory could be different for you?
Breaking up the pbix - oh, not rocket science - just remove columns and set row filters in Power Query which should then fold the query back to the source. There's stuff online about what M code will reduce query folding.
Remove columns - see above. By removing them in Power Query, all the steps are 'folded' into a single query - thus optimising the query against the original source.
Direct Query - yes, it sets restrictions on DAX usage. There's a pbix option to ignore the limitations but at the risk of the refresh timing out or taking too long.
Cacheing - alas, no. The closing down of an inactive pbix is managed somewhere in the background of the service. I don't believe there are any ways to manage this manually or programmatically, unfortunately.
I don't know! The Azure metrics don't tell us. My colleague says he can count the 'connections' (outside of Power BI Embedded) but he hasn't had time yet to do so.
The Azure metrics, while welcome, don't have enough detail to monitor usage. I would want to be able to look at the usage per connection and map that back to the token. Maybe it's in development....
We have about 250-300 individuals who can connect, but I would imagine that 50-100 might connect during any particular day - but as I said, we don't actually know.
Peak usage? We're running 2-2.5GB, having brought that down from 2.8-3.2GB when we were forced onto A1. Although I see a gradual increase in cacheing, it's not as steep as yours.
Hi,
There is no workaorund that I am aware of to deal the situation when it comes to the page render limits.
The behavior should be:
1. Refresh call sent out,
2. Limits check, return the code,
The result of the report is holded in the last response, and I don't think the Refresh code would contain the responses when it was triggerred, so what it would do is to "throttle back" from the server to stop the new response.
Read more about the Page Render and other parts in the following blog:
Power BI Embedded....on its way out...
And:
Understanding the Power BI Capacity Based SKUs
Regards,
Michael
Hi Michael
Thanks for responding. Unfortunately your two links only have background information. The first one was written by me (!!!) a year ago when the new Embedded was just coming out. We know more now. And I've read John White's licencing summary but it doesn't give any technical information about how Microsoft handles memory capacity and CPU usage at the point of refresh and 'page renders' and what happens when you hit each ceiling at A1, A2, etc.
I have been told by another Microsoft employee that there will be some tools in the near future in Azure to help see what is going on. That is essential for the success of this product.
It is critical for ISVs to be able to predict, provide & manage capacity, and price accordingly with their customers. At the moment we are flying blind.
If any other Microsoft employee out there can help answer my questions (or you know someone who can), that would be great. Happy to discuss privately if needs be.
Thanks,
Hi @Anonymous,
I don't know if you have already seen these metris that are added in Azure for PowerBI Embedded.
https://azure.microsoft.com/en-us/updates/power-bi-embedded-azure-memory-metric-available/
https://azure.microsoft.com/en-us/updates/azure-alerts-now-availble-for-power-bi-embedded/
So now you can set some alerts when the memory or the query processing pass or reach some limits.
Hope it helps.
Thanos
I had seen the metrics but hadn't looked at the alerts in any detail. We'll have a look and see if we can configure something useful from it.
Thanks very much!
Hopefully some better tools will be coming out soon. And if anyone has any more explanation for when the refresh crashes out, that would be helpful.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.