bklooste
Frequent Visitor

Fabric CU Leak

I'm running 4 background jobs:

An event stream.
2 pipelines, each running a notebook that uses PySpark streaming.
A pipeline running SQL to copy from a Lakehouse to a Warehouse.

I have been careful to add timeouts in the pipelines and notebooks.

The amount of data is only about 20 messages an hour.

What I'm seeing is that during the day the amount of CU these background jobs use increases. However, when I stop and start the Fabric capacity it goes back down again.

[Screenshot: bklooste_0-1721174505805.png]

[Screenshot: bklooste_1-1721174766257.png]

How do I go about solving this?
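
A quick way to check whether a notebook session still has older streaming queries running (one possible reason CU stays elevated; not confirmed in this thread) is to list the active queries in the Spark session. A minimal diagnostic sketch, assuming it runs in the same session that started the streams:

# Diagnostic sketch: list streaming queries still active in this Spark session.
# 'spark' is the session the Fabric notebook provides.
for q in spark.streams.active:
    print(q.id, q.name, q.status)

# Stop anything that should no longer be running:
# for q in spark.streams.active:
#     q.stop()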



4 REPLIES
bklooste
Frequent Visitor

Staggering did not help.

bklooste
Frequent Visitor

Two of the notebooks look like this:

 

def write2table(df2, epoch_id):
    # Append each micro-batch to the Delta table, partitioned by "partition"
    df2.write.format("delta").mode("append").partitionBy("partition").save(table_delta_file_location)

df = spark \
    .readStream \
    .format("eventhubs") \
    .options(**ehConf) \
    .option("failOnDataLoss", "false") \
    .load()

df.writeStream \
    .outputMode("append") \
    .trigger(processingTime='120 seconds') \
    .option("mergeSchema", "false") \
    .option("checkpointLocation", checkpointLocation) \
    .foreachBatch(write2table) \
    .start() \
    .awaitTermination(590)
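
One detail worth noting in the snippet above: awaitTermination(590) only waits up to 590 seconds; it does not stop the query, so the stream can keep running in the session after the wait expires. A small variation that keeps a handle on the query and stops it explicitly after the wait (a sketch, not a confirmed fix for the CU growth):

# Keep a handle on the query so it can be stopped explicitly.
query = df.writeStream \
    .outputMode("append") \
    .trigger(processingTime='120 seconds') \
    .option("mergeSchema", "false") \
    .option("checkpointLocation", checkpointLocation) \
    .foreachBatch(write2table) \
    .start()

# Wait up to ~10 minutes, then stop the stream so the next scheduled run
# does not overlap with a query that is still alive in this session.
query.awaitTermination(590)
query.stop()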
bklooste
Frequent Visitor

A restart during the day gives the same result. I can't believe this is just a coincidence of more jobs running at the same peak load: there are only 4 jobs, and it was doing the same when there were 2 jobs (both pipelines running every 11 minutes, each running a notebook with a 2-minute trigger).

I know I can probably work around it by running the notebook once per day and leaving it running, but I want to know why this is happening and how to get more detail.
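
If the once-per-day workaround is tried, a minimal sketch of the long-running variant (assuming the same df, write2table, and checkpointLocation as above; the ~23-hour window is just an example) would be:

# Long-running variant: start once per day and keep the stream alive
# for roughly 23 hours instead of restarting it every 11 minutes.
query = df.writeStream \
    .outputMode("append") \
    .trigger(processingTime='120 seconds') \
    .option("checkpointLocation", checkpointLocation) \
    .foreachBatch(write2table) \
    .start()

query.awaitTermination(23 * 60 * 60)  # timeout in seconds
query.stop()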

[Screenshot: bklooste_0-1721264757397.png]

v-jiewu-msft
Community Support
Community Support

Hi @bklooste ,

Based on the description, try staggering the job start times to distribute the load more evenly throughout the day.

In addition, try reducing the polling frequency of the event stream and pipelines.
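
On the notebook side, reducing the polling frequency could mean lengthening the streaming trigger interval, or processing whatever is available and then stopping. A minimal sketch (the trigger values are examples, not recommendations from this thread; availableNow needs a recent Spark runtime):

# Longer micro-batch interval (e.g. 10 minutes instead of 120 seconds)
query = df.writeStream \
    .outputMode("append") \
    .option("checkpointLocation", checkpointLocation) \
    .foreachBatch(write2table) \
    .trigger(processingTime='10 minutes') \
    .start()

# Alternative: process whatever is currently available and then stop
# (requires a Spark version that supports availableNow):
#     .trigger(availableNow=True)
query.awaitTermination(590)
query.stop()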

 

Best Regards,

Wisdom Wu

If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
