anusha_2023
Helper II

Connecting and pinning Lakehouse to the Notebook

I have a scheduled notebook that runs daily and is connected to a Lakehouse, reading input parquet files and writing the output as tables. These tables feed a Power BI report and semantic model. My daily scheduled script is now failing because the Lakehouse pinned to the notebook keeps getting disconnected, so I have to re-add the Lakehouse manually every day and re-run the script. Is there a workaround for this issue?
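One way to make a notebook independent of the pinned default lakehouse is to address files by their absolute OneLake (ABFS) URI instead of the relative Files/... path. A minimal sketch, assuming the standard OneLake URI layout; the workspace and lakehouse names below are placeholders, not taken from this thread:

```python
# Hypothetical helper: build an absolute OneLake (ABFS) URI so that reads and
# writes do not depend on which lakehouse is pinned to the notebook.
# "MyWorkspace" and "MyLakehouse" are placeholder names.

def onelake_uri(workspace: str, lakehouse: str, relative_path: str) -> str:
    """Return an abfss:// URI for a path inside a Fabric lakehouse."""
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/{relative_path}"
    )

input_path = onelake_uri("MyWorkspace", "MyLakehouse", "Files/input/day1.parquet")
# e.g. df = spark.read.parquet(input_path)  # does not rely on a pinned lakehouse
```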

1 ACCEPTED SOLUTION
govindarajan_d
Solution Supplier

If you clone the notebook and then pin the lakehouse to the cloned notebook, does the same happen?


7 REPLIES
v-cboorla-msft
Community Support

Hi @anusha_2023 

 

Glad that your query got resolved.

Please continue using the Fabric Community for any help with your queries.

HimanshuS-msft
Community Support

Hello @anusha_2023
Thanks for using the Fabric community.
I tried to repro the issue, and as I understand it, the problem is that the notebook's scheduled runs are failing.

I tried this to read a file and write the contents to a table:

 

# Read a multiline JSON file from the lakehouse's Files area.
df = spark.read.option("multiline", "true").json("Files/JSON/test.json")
# df is now a Spark DataFrame containing the JSON data from "Files/JSON/test.json".
# Append the rows to a Delta table in the lakehouse's Tables area.
df.write.mode("append").format("delta").saveAsTable("test_table")
display(df)

I then scheduled it to run every 10 minutes. The runs went fine without any failures.

HimanshuSmsft_0-1706129903419.png

 


I am sure that I am missing something here. What is the error you are getting?

Thanks
Himanshu

I tried with sample data, pinning the lakehouse and loading it as a table with a new notebook; the schedules are successful.

anusha_2023_0-1706144890125.png

But I have the issue with my daily scheduled notebooks.

 

This is the error: 

 

Py4JJavaError: An error occurred while calling o6337.parquet. : java.io.FileNotFoundException: Operation failed: "Not Found", 404, PUT, http://onelake.dfs.fabric.microsoft.com/b04579d1-31c1-4194-a912-9f15db327234/6aa4d65e-dc1f-4212-8c1f... ArtifactNotFound, "Artifact '6aa4d65e-dc1f-4212-8c1f-1cafc20c20d5' is not found in workspace 'b04579d1-31c1-4194-a912-9f15db327234'."

Caused by: Operation failed: "Not Found", 404, PUT, http://onelake.dfs.fabric.microsoft.com/b04579d1-31c1-4194-a912-9f15db327234/6aa4d65e-dc1f-4212-8c1f... ArtifactNotFound, "Artifact '6aa4d65e-dc1f-4212-8c1f-1cafc20c20d5' is not found in workspace 'b04579d1-31c1-4194-a912-9f15db327234'." at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:231) at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:191) 

 

 

As a workaround, if I give the absolute path then I am able to write as a parquet file. But saveAsTable is not working with the absolute path, as below.

df_bookings.write.mode("overwrite").format("delta").saveAsTable("<absolute-path>/Tables/bookingsdailyupdate")
 
When I use the save option instead, I am not able to save to the default Tables area of the Lakehouse. The system suggests moving to the Files folder, and the data is saved as Delta files in the Files folder of the lakehouse.
df_bookings.write.mode("overwrite").format("delta").save("<absolute-path>/Tables/bookingsdailyupdate") 
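A note on the two calls above: saveAsTable() expects a metastore table name, not a filesystem path, which is why passing "&lt;absolute-path&gt;/Tables/..." to it fails. A hedged sketch of the path-based alternative, assuming the Delta folder is written with .save() directly under the lakehouse's Tables area via its absolute OneLake URI (Fabric generally discovers Delta folders under Tables/ as lakehouse tables); &lt;workspace&gt; and &lt;lakehouse&gt; below are placeholders:

```python
# Sketch, not verified against this thread's setup: write Delta data with
# .save() to the Tables area of the target lakehouse using its absolute
# OneLake URI. <workspace> and <lakehouse> are placeholders.

TABLES_ROOT = (
    "abfss://<workspace>@onelake.dfs.fabric.microsoft.com/"
    "<lakehouse>.Lakehouse/Tables"
)

def table_path(name: str) -> str:
    """Return the absolute Delta folder path for a lakehouse table."""
    return f"{TABLES_ROOT}/{name}"

# e.g. df_bookings.write.mode("overwrite").format("delta").save(
#          table_path("bookingsdailyupdate"))
```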

 

 

govindarajan_d
Solution Supplier

If you clone the notebook and then pin the lakehouse to the cloned notebook, does the same happen?

I made another copy and scheduled the new notebook; it worked the first time. Thanks for the tip. But the second time, the schedule shows as successful even though the script did not run. In the monitor hub, the first run succeeded at 2:23 AM (first screenshot below), and the next runs show as succeeded, but check the run details in the second screenshot below.

anusha_2023_1-1706151661845.png

 

anusha_2023_0-1706151444864.png

The notebook is not really running the script. Does that mean I have run out of capacity, or what else could be the reason?

How do you know the notebook did not execute? Does the data not get written to the table?

 

If you go to recent runs, you can open each run's status individually and click on item snapshots to see how the notebook ran. Please check that and see if any individual cell is causing a problem.

govindarajan_d_0-1706420047634.png

 

Thanks for the reply. Yes, I have checked in the same way; see the screenshot below. The server is not getting started. If I run the notebook manually it works, but when I try to schedule it the server is not picked up. Please check the screenshot below.

latestFailure.PNG
