The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now
Simple SQL query that runs 7 seconds in SQL takes 19 minutes(!) to validate in Dataflows
19 minutes!!!???Not just one query.
I've tried SQL Server and PostgreSQL - the time it takes to Validate is just crazy.
What is going on?
Same here. Who doesn't love working night and weekends? I see the issue in both Premium Capacity and non-Premium Workspaces, which I think is telling. Somewhat surprising to me is that my dataflow refreshes do not seem to be impacted.
Good to know I am not the only one extremely frustrated with the utter slowness of these validations
Does anyone know if having your own server capacity, rather than resources shared amongst numerous tenants, would solve the problem? I have a Premium PPU licence, so I'm guessing this is what's happening - I'm sharing with lots of folk.
I've been reading about Azure Data Lake Storage Gen2, but I'm not a techy and there are no techies in the company I can ask for help from.
Hi Claire. I am seeing the same issue in both our Premium Capacity (dedicated to our tenant) and non-Premium (shared with other tenants). We definitely have most of our Dataflows in Premium Capacity and that is probably true for the other posts here (though I obviously have no way of confirming that).As for the Azure Data Lake Storage Gen2. I'm not sure which doc you're looking at it, but maybe related to this option (high-level):By default, the files related to your dataflows (CSV and JSON) are stored "somewhere" that you can't get to directly. There is an option to change that to your own Azure Storage account which then gives you access to those files. I have that in place for some of my Workspaces but it is not making any difference in terms of the Validation performance.FTR, I have been a heavy dataflow user for more than two years. I want my work to be easily reusable by me and others... While I've grown accustomed to some slowness at the Validation step (especially if comparing to working inside of PBI Desktop), something changed last week. Now things that were taking 10s of seconds are taking 10s of minutes, sometimes close to an hour. Not OK...Here's something you might want to try out: Even when things are running normal, if I am building new things, I do it from Power BI Desktop first and then copy it to a Dataflow. If I need to make minor changes after, I do it from the Dataflow editor. If I have more complex changes to make, I copy it from the Dataflow to Power BI Desktop and then back again. The copy also brings in any “upstream” dependences, so it is a very easy to use feature. One catch is if you are copying “back” it will not overwrite and will instead duplicate. The way to address that is by using the Advanced Editor to overwrite the underlying code instead.
Thank you for taking the time to explain your thoughts.
I first wrote all my queries in desktop, but when that started to grind to a halt, I decided I'd better research other ways. For sure, my queries are probably a bit rough and would make some people cry (!), but my dataset is tiny compared to some.
Thankfully, I already did what you suggested and copied/pasted queries from desktop via the advanced editor, and only used the dataflow interface for minor tweaks.
Previewing the query in service is nigh on impossible (it times out after 10 mins), and I put that down to sharing a server with many other tenants. When there are errors (I'm told there are errors, but it's impossible to see what they are!), I hook back in via desktop to see if the errors are obvious there, e.g. double data type (my lingo is probably incorrect). More often than not, it all looks ok to me.
What is particularly interesting in what you say is how the tardiness of validation seems to have fallen off a cliff in the last week or so; that gives me a ray of hope that it can/will be better than what I have now.
Thanks again, Robert.
This idea is about two years old.
One thing I wanted to point out is that it is currently possible to work around the problem by parameterizing dataflows in a way that has a "fast" and "slow/normal" mode. The fast mode would only return a small subset of data (eg one trailing week), and it would be used during development.
The fast mode, should you create one, would allow "validating" operations to be performed more quickly
When it is time to change your dataflow parameter to run in the "slow/normal" mode you make the switch like follows:
Here is more info:
Dataflows - U/I Fixed? - No more waiting on "Validating Queries"!!!
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.