ToddChitt
Super User

Notebook fails with error 200 when run from Pipeline

I'm fairly new to Spark Notebooks. I have one that dumps a JSON file into a couple of tables in my Lakehouse. It works fine when the Notebook is run on its own. But when I try to run it from the context of a Pipeline, I get this error:

 

Notebook execution failed at Notebook service with http status code - '200', please check the Run logs on Notebook, additional details - 'Error name - AnalysisException, Error value - org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Spark SQL queries are only possible in the context of a lakehouse. Please attach a lakehouse to proceed.)' 

 

What is going on? "Please attach a lakehouse to proceed"? What does that mean? In the setup of the Notebook activity in the Pipeline, I select the Workspace and Notebook from validated lists. The Pipeline should *know* that the Notebook is a part of a particular Lakehouse.
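From the message, it looks like Spark SQL and saveAsTable() resolve unqualified table names against the notebook's default (pinned) lakehouse, and with no lakehouse attached at run time the metastore throws the AnalysisException above. A toy, purely illustrative sketch of that resolution logic (resolve_table and the lakehouse name are hypothetical, not Fabric's actual code):

```python
# Toy model of default-lakehouse name resolution (illustration only).
def resolve_table(name, default_lakehouse):
    """Resolve a table name the way a default lakehouse would be consulted."""
    if "." in name:
        # Already qualified as Lakehouse.table; no default needed.
        return name
    if default_lakehouse is None:
        # No lakehouse attached -> the error quoted above.
        raise RuntimeError(
            "Spark SQL queries are only possible in the context of a lakehouse."
        )
    return f"{default_lakehouse}.{name}"

print(resolve_table("policy_info", "MyLakehouse"))  # MyLakehouse.policy_info
```

This is only a mental model for why attaching (or pinning) a lakehouse makes the error go away; the real resolution happens inside the Spark/Hive metastore layer.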

 

Any help would be appreciated. 




Did I answer your question? If so, mark my post as a solution. Also consider helping someone else in the forums!

Proud to be a Super User!





1 ACCEPTED SOLUTION

Hi @ToddChitt 

Thanks for the details. Can you please send me the code you are running in the notebook? I will try to repro it on my side and let you know.

Also, can you please try creating a new notebook and then attaching the lakehouse? Then try running this notebook from the pipeline. If the issue still persists, please let me know. This way the option won't be grayed out, and you can attach or detach the lakehouse.

[Screenshot: vnikhilanmsft_0-1710246947366.png]


Did you try setting the target lakehouse as the default lakehouse? The default lakehouse is identified by the pin icon.

[Screenshot: vnikhilanmsft_2-1710247776344.png]

 



Thanks.


11 REPLIES
pshepunde
Helper I

I am getting a similar error, but it is about a variable/dataframe not being defined: a NameError. When I run the notebook manually it succeeds, but when triggered from the pipeline it fails with the exception mentioned above.

 

Below is the line of code causing the error, where read_particular_day_files is a user-defined function:

numfiles_pti_ar, df_toprocess_pti_ar, df_policy_transaction_info_ar = read_particular_day_files(accountingDate, '', 'PolicyTransactionInfo')
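One common cause of this symptom: a pipeline-triggered run starts a fresh Spark session, so anything defined only in an earlier interactive session (or in another notebook that was never pulled in with %run) does not exist, which produces exactly this NameError. A hedged sketch, where the function body is a placeholder stand-in for the real helper, showing the definition living in the same notebook run as the call:

```python
# Placeholder stand-in for the real helper; in a pipeline run, the definition
# must execute in the same session *before* the call (e.g. in an earlier cell,
# or in a notebook included with %run).
def read_particular_day_files(accounting_date, prefix, table_name):
    # The real code would read that day's files into Spark DataFrames;
    # here we just return dummy values to show the call resolving.
    return 0, None, None

numfiles, df_toprocess, df_info = read_particular_day_files(
    "2024-03-01", "", "PolicyTransactionInfo"
)
print(numfiles)  # 0
```

If the definition lives in another notebook, running that notebook interactively first is what makes the manual run succeed; the pipeline session never saw it.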
ToddChitt
Super User
Super User

Hello @v-nikhilan-msft and thank you for the prompt reply.

Per your suggestion, I created a new stand-alone notebook (i.e., NOT from inside the lakehouse), then manually ADDED the lakehouse as a new source.

 

This worked! Invoking the Notebook from the context of a Pipeline now runs just fine.

 

Thank you. 

Do you still want the code? All it does is read a JSON file, parse it into four data frames, and save each to a table in the lakehouse.

[Screenshot: ToddChitt_0-1710249339148.png]

 

[Screenshot: ToddChitt_1-1710249378354.png]
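Based on that description (read a JSON file, split it into four DataFrames, write each to a lakehouse table), here is a minimal hedged sketch of the shape such a notebook might take; the payload, key names, and table names are invented for illustration, and the Spark step is shown as a comment since it only runs inside Fabric:

```python
import json

# Invented sample payload; the real JSON structure is not shown in the thread.
raw = """{
  "policies":     [{"id": 1, "holder": "A"}],
  "transactions": [{"id": 10, "policy_id": 1}],
  "claims":       [{"id": 100, "policy_id": 1}],
  "payments":     [{"id": 1000, "claim_id": 100}]
}"""

doc = json.loads(raw)

# One list of records per target table.
tables = {name: rows for name, rows in doc.items()}

# Inside the notebook, each list would then become a Spark DataFrame and be
# saved -- the step that needs an attached (default) lakehouse:
#   spark.createDataFrame(rows).write.mode("overwrite").saveAsTable(name)

print(sorted(tables))  # ['claims', 'payments', 'policies', 'transactions']
```

Note that even though this is PySpark rather than a Spark SQL cell, the commented saveAsTable() call still goes through the metastore, which is consistent with the AnalysisException appearing despite "not using Spark SQL".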

 









Hi @ToddChitt 

Thanks for the code. Glad that your query got resolved. Please continue using Fabric Community for any help regarding your queries.


Hi @pshepunde 

Thanks for the post. But as the initial ask is different from your issue, could you please create a new post and tag me? I will surely help.

Please attach some screenshots of the error too.

Thanks.

I would consider this a bug: a Notebook created from the context of a Lakehouse has trouble accessing that lakehouse when run from inside a Pipeline, but a Notebook created on its own, with the Lakehouse added afterwards, has no issues.

 

Please don't tell me that this was "by design"!

 









This is clearly a bug. I managed to make it work by creating a new empty notebook, but this should be fixed.

I also experienced this issue and agree it is a bug. However, an alternative workaround to creating a new notebook is to completely remove the linked lakehouse and add it back. To do this, select the default pinned lakehouse and choose Remove all Lakehouses. Once it is added back, the notebook seems to trigger from the pipeline with no errors.

[Screenshot: Ben1133111_0-1715024942571.png]

 

ToddChitt
Super User

Hello @v-nikhilan-msft 

The Notebook was created from within the Lakehouse where the tables are. It is already connected. As stated, it runs just fine by itself and populates the tables in the lakehouse. 

I cannot add it again via the method you describe above as it is already connected.

If I follow the steps for "+ Data Sources", I select "Lakehouses", then the radio button for "Existing lakehouse". The lakehouse that the notebook was created under is listed, but grayed out with the message: "This item is preselected and can't be unchecked."

For all intents and purposes, I have to assume that the Notebook is already joined/knows about/is connected to the lakehouse. 

 

The error mentions "Spark SQL queries", but none of my three code blocks uses Spark SQL; they all use the default of PySpark.

 

Regards,










v-nikhilan-msft
Community Support

Hi @ToddChitt 
Thanks for using Fabric Community.
Can you please follow the steps below and retry:

On the left side of the notebook, click on Lakehouses.

[Screenshot: vnikhilanmsft_0-1710217761933.png]

 



Click on Add lakehouse.

[Screenshot: vnikhilanmsft_1-1710217796126.png]

 



Select the lakehouse where the table resides. Attach the lakehouse to the notebook.

[Screenshot: vnikhilanmsft_2-1710217814681.png]

 



Now try to run the pipeline. Please let me know if the issue still persists. Hope this helps.
