Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

PowerQuery - Validating queries when saving is taking excessively long time

I'm struggling to be able to work efficiently with Dataflows as I am experiencing excessive times to 'validate' queries after pressing the 'Save & Close' button. For entities with even a minimal amount of transformation (eg. simply changing some column data types) I've experienced wait times of over 3 hours to simply save the query to exit out back to the main entity menu/view. Should things be taking this long for an output schema validation process, regardless of the size of the underlying table data? Is this an issue being experienced by others as well?

This is making it simply impossible for me to work with the Dataflows product in this way.

 

2019-12-03_16-00-57.png

Status: New
Comments
TomaszBielak
New Member

Same here, it is impossible to validate query that was functioning without any problems for months. Each attempt ends with timeout error.

v-qiuyu-msft
Community Support

Hi all, 

 

Sorry for the late response. If the same issue still occurs, I would suggest you create a support ticket to get help. 

 

Support Ticket.gif

 

Best Regards,
Qiuyun Yu

lawsonbe
Helper I

I am experiencing the same problem. This part of the system holds so much promise. I am trying to pull in 93 csv tables from a single data source and want to be able to incorporate it into an even larger data set. There is nothing special or unusual about the tables. All data types are allowed in flows. Dataflows seem the ideal way to do this. As I am about to hit save, I see no warnings. When I hit save after entering all the querries, the warnings start appearing. There is a long count up and then a count down. It just never reaches zero and finally says 'Failed to Save' on the entity screen. I thought engineers would be watching these posts and answer here instead of directing you to create a support ticket.

ryan0585
Advocate II

I agree with @lawsonbe.  I've created support tickets in the past, and I didn't get the help there I needed, more someone telling me they couldn't recreate the error I was seeing because they didn't have PostgreSQL (separate issue there, obviously).

If anyone at Microsoft is monitoring this, it sounds like what you need to do is stress test the data flows a bit, and that is not going to be something you should do over the phone with your customers... it'll probabloy take a bit.

=================================

If it was me, I'd get a data flow going of 20 tables of different sizes from different sources, some SQL, some ODBC, some Excel/CSV.  For the first 5 tables, load them up one at a time and save the data flow between each table, monitoring the time it takes to validate queries after each save.  You should have 5 separate time metrics here at the end of those first five tables, and if it's anything like my experience so far, you should see the time taking longer each time a new dataset is added to your dataflow.

For datasets 5-10. do the same thing... load them one at a time and save the dataflow after each dataset is added, noting the increasing amount of time it takes to validate queries.  Again, the time to validate is probably increasing, as folks here are saying it is.

Now load datasets 11-15 up together and save the dataflow once, getting one metric for the time it takes to validate queries.  Once you have that, do the same thing for datasets 16-20.

=================================

There's enough feedback here from the community to do some additional testing for the dataflows feature.  You know people are saying they are struggling with the "validating queries" step, and you know people are saying the problem exacerbates once the number of datasets in the dataflow increase.  Stress test the feature.  As your customers (my company pays for a premium membership), we don't feel like we should have to walk you through how to test this stuff by opening support tickets.

This is an issue.  Please just have someone stress test the thing and try to fix it in a future update.

michaelshparber
Advocate V

Same here.

I have several API queries and it takes forever just to save the dataflow....

I don't mind it takes an hour to refresh.

But saving should be fast.

I have opened an idea to skip validation when saving, please vote:

https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/40604332-skip-validating-queries-...

@uberdube @ryan0585 @lawsonbe @v-qiuyu-msft @TomaszBielak @ChrisRMcAlpine @simr @Anonymous @Jessesmith4410 @Anonymous @RenaudB 

cvillegas
Advocate IV

I posted the same thing on Reddit last week and it seems a lot of people are facing the same issue. I even tested this process by creating a dataflow with a single manually created table entry with 1 row / column. Clicked save and waited about 15 minutes for this query to save. Support has never really been helpful as it's almost always getting phone calls and emails with little to no assistance and then a person letting you know that they've sent the issue to the product team and then the request floats off into the ether. The only time I've gotten true assistance was last night when I sent an email directly to the dataflows team at MSFT HQ in Redmond asking about Enhanced Compute in Dataflows and almost immediately one of the lead Principal Architects from MSFT emailed me back and asked for all of the pertinent data that he needed. He said he'd respond today so I'm hoping he gives me some info on that as well as this issue that all of you have mentioned because the service is almost unusable this month.

snowrider1799
Helper I

I am also having the same problem. Pulling from WebApi, and the "validating queries" is taking forever, I can't proceed with my work at all, and i still have a lot left to do on the dataflows, but this part is frustrating, we cannot control and don't know what is going on at the back, why is there so much to validate when my queries steps are just below 10? 

ryan0585
Advocate II

For those running into this issue...

It might not be ideal, but I've gotten around this by creating a different dataflow for each dataset I'm pulling in.  Again, I know that isn't ideal... it means having to manage multiple refresh schedules, configurations, etc., but if the alternative is being dead in the water, it works.

For me, I've noticed the problem exacerbates when there are more datasets added to the same data flow.  If I try to keep it to one (at most, 3-5) datasets in a data flow, I can generally get by.  The problem really starts for me once I get into the double digits of datasets (i.e. tables/sources) in a single data flow.

Stored as separate data flows, you can still bring them into a report in a single step.  Try it out if you're stuck and see if it works.  Might just be enough to get by until this gets fixed.

snowrider1799
Helper I

Hi @ryan0585 , thanks for your suggestion. I am having 5 tables (datasets) in a dataflows. I bet this is not many. I tried with double digits, it took me more than 5 hours and still loading. So i changed my approach, splitting them into multiple dataflows, yet still take more than 2 hours for validating queries. 😞

lawsonbe
Helper I

I have had a little more luck recently using 5-10 tables. I am also using the data flow to merge tables in a star schema (fact table with all associated dim tables). This will allow me to pull only the single merged table through to PowerBI. I am also working with a single table and writing a separate querry to do some complicated conversions and logic in PowerQuery in the dataflow. I think doing this processing work in the data flow before bringing it into PowerBI will speed up the dashboards. It is still not all I had hoped it would be but I am starting to see where it might really shine.