Fromit87
Advocate I

Error on PBI Service dataset refresh

Hi!

 

My model includes datasources from Databricks (cloud), Oracle (on-prem) and SharePoint (cloud).

For the Oracle data the gateway is set up and runs perfectly (in all my dataflows). All privacy options are set to "organizational". I use a deployment pipeline with three workspaces (dev, test, prod). I work with PBI Desktop April 2022, as I am aware there are issues with the May 2022 update and the Databricks connector.

 

In PBI Desktop I am able to refresh my model without any issues. I publish to the dev workspace and check the privacy settings and the gateways/credentials - the refresh runs without errors.

 

I move the dataset from the dev to the test workspace via my deployment pipeline. Again I made sure the privacy settings and gateways/credentials match the dev workspace, but this time I get an error:

 

"A retriable error occurred while attempting to download a result file from the cloud store but the retry limit had been exceeded"

The table mentioned in the error message is a table that combines Oracle (on-prem) and Databricks (cloud) data.

 

Any idea what I am doing wrong? Why does it work in one workspace but not in the other, despite the same gateways, credentials and privacy setup? Thanks in advance!

 

Fromit87_0-1653408329503.png

 

18 REPLIES
Anonymous
Not applicable

Did you connect in Direct mode or import mode?

Hi @Anonymous,

I connect via import mode.

daniel_st
Advocate III

@Fromit87 were you able to resolve the issue? I'm getting the same error.

Hi @daniel_st,

Indeed, last Friday we figured out that an additional parameter has to be added in the source step. Please give it a try and let me know if it works for you, too.

The issue might be related to Databricks' new Cloud Fetch architecture in runtime 8.3 and above. With a cluster runtime of 7.3, the refresh in the Service worked smoothly for the same dataset. The issue only occurs when merging on-prem with cloud data in Power Query and using an on-prem enterprise gateway. Since 7.3 reaches end of support in September, staying on it is not an option; the fix below worked for clusters running on 10.4, at least in our environment.

 

= Databricks.Catalogs("xyz.cloud.databricks.com", "sql/protocolv1/o/0/clusterxyz", [Catalog=null, Database=null, EnableExperimentalFlagsV1_1_0=null, EnableQueryResultDownload="0"])

 

cc @GilbertQ
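For readers who want to see where the option sits in a complete query, here is a minimal sketch of a merge between a Databricks table and an on-prem Oracle table. Only the Databricks.Catalogs call and its options record come from the post above; the catalog, schema, table, column and Oracle server names are hypothetical placeholders, and the navigation steps assume the shape Power BI Desktop usually generates, so compare them with your own query before reusing anything.

```powerquery
let
    // Host and HTTP path are the placeholders from the post above.
    DatabricksSource = Databricks.Catalogs(
        "xyz.cloud.databricks.com",
        "sql/protocolv1/o/0/clusterxyz",
        [
            Catalog = null,
            Database = null,
            EnableExperimentalFlagsV1_1_0 = null,
            // Option from the post above; per the thread it disables Cloud Fetch.
            EnableQueryResultDownload = "0"
        ]
    ),

    // Assumed navigation steps; all names below are hypothetical.
    CatalogNav = DatabricksSource{[Name = "hive_metastore", Kind = "Database"]}[Data],
    SchemaNav = CatalogNav{[Name = "sales", Kind = "Schema"]}[Data],
    Orders = SchemaNav{[Name = "orders", Kind = "Table"]}[Data],

    // On-prem Oracle source reached through the gateway (hypothetical server and table).
    OracleSource = Oracle.Database("oracle-host:1521/ORCL"),
    Customers = OracleSource{[Schema = "CRM", Item = "CUSTOMERS"]}[Data],

    // Merge cloud and on-prem data on a hypothetical shared key.
    Merged = Table.NestedJoin(
        Orders, {"customer_id"},
        Customers, {"CUSTOMER_ID"},
        "Customer", JoinKind.LeftOuter
    )
in
    Merged
```

The option only needs to be set on the Databricks source step; the Oracle step and the merge stay unchanged.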

EnableQueryResultDownload="0" fixes the issue for me, but strangely this seems to be specifically related to a config issue on my laptop for my UserID. How do I know? I can open PBI on my laptop with a colleague's credentials ("run as another user") and have no issues, and I can do the same on their laptop with my UserID and have no issues, so it's specifically my UserID on my laptop.

 

I assume this is related to some driver/config issue since I've previously installed Oracle, ODAC, etc., and even though I've uninstalled these I have to believe there's something left behind with a change to Spark ODBC driver or something.  If future users of this issue can confirm it's related to your UserID on a specific laptop only maybe we can understand what is misconfigured to avoid needing this code.

Worked for me, Thank you

hi @Fromit87 ,
Thanks for your reply. I'm aware of the parameter EnableQueryResultDownload="0".
Microsoft & Databricks suggested to set this parameter in order to get it running. However when you set this parameter you disable Cloud Fetch and you get very slow performance importing data. 

 

I noticed that the issue doesn't occur when I go through the public internet (without any gateway in place, connecting directly from the PBI Service to Databricks).

 

Let's hope that Microsoft updates the driver in the next release.

I am having the same issue. Adding that parameter works fine in Desktop; however, for just ~25k records the refresh in the Power BI Service takes about 2 minutes, which is not acceptable performance at all.

 

I have been following up with the MS team and the Databricks team on this and keep getting the same answer: add the parameter. For security reasons, I cannot connect without the gateway.

 

Are you able to confirm how slow the performance was in your case?

Fromit87
Advocate I

Hi @GilbertQ,

Indeed, it's weird. I set up a completely new PBI file with a new table that combines an Oracle and a Databricks source connection (created a new connection, no copy/pasting).

Refresh works fine in PBI Desktop, but in the Service I get the same error:

"A retriable error occurred while attempting to download a result file from the cloud store but the retry limit had been exceeded."

Are you using a Power BI gateway in between? If so, make sure you have installed the latest version. We had the very same issue and it was fixed with the November release. Hope that helps for you too 🙂

Thanks @daniel_st, lucky you. Unfortunately we have the latest gateway release running, but I still get the error when merging Databricks with other non-Databricks sources. It only works with the parameter EnableQueryResultDownload="0".

Did anyone find any solution for this? 

 

Both MS and Databricks are not helping enough on this and keep passing it back and forth to each other.

Same here, MS / Databricks are not helpful...

Does the latest announcement maybe change something in this regard?

Announcing the Public Preview of Power BI Dataset Scale-Out | Microsoft Power BI Blog | Microsoft Po...

Yeah, that is strange. It could be a setup issue on your gateway. Do you have the cloud and on-premises data sources configured in the gateway as shown below?

 

GilbertQ_0-1653956424122.png

 






Hi @GilbertQ ,

 

I checked with my IT department, and yes, the gateway settings have this option activated. My IT department has raised a ticket with MS. As soon as they find the root cause and a fix, I'll post it here.

Thank you for your time and assistance so far! Appreciated!

 

Fromit87

GilbertQ
Super User

Hi @Fromit87 

 

Can you confirm that your Dev workspace is pointing, via the gateway, to the same source as in Power BI Desktop?






Hi @GilbertQ,

Yes, the sources are identical in the gateway and in PBI Desktop. This morning the refresh on the Dev workspace failed as well, with the same error as mentioned in my original post.
I should also mention that the model refresh in the Service worked until a week ago. The only thing I changed was moving all Databricks sources to the Databricks E2 environment (which entailed changing the server address and path to point to the new Databricks cluster).

Hi @Fromit87 

 

That does seem a little weird. Can you confirm whether it works if you create a new data source pointing to the new Databricks cluster?





