
DennesTorres
Impactful Individual

What are the scenarios for Delta Streaming?

Hi,

I'm not sure I fully understand the usage scenarios of Delta Streaming in Fabric. I'm noticing it covered in documentation and in certification-prep labs, so I think I may be missing something.

For example in this lab: https://microsoftlearning.github.io/mslearn-fabric/Instructions/Labs/03-delta-lake.html

 

The following line starts a readStream that waits for files in the folder:

iotstream = spark.readStream.schema(jsonSchema).option("maxFilesPerTrigger", 1).json(inputPath)

This raises lots of questions:

1) Is this continuous? I tried continuous execution in notebooks before and failed; support told me that notebooks are not intended for continuous execution. Hence the question: is this stream continuous?

2) If it is continuous, does it survive even if the notebook session is stopped, or does the notebook session become continuous once a stream is started?

3) If the session stops and the stream persists, how do I manage the stream later? Stop it? Restart it?

4) If it is not continuous, what is the benefit of a stream that stops as soon as the session stops?

Thank you in advance for clarifications!

 

Kind Regards,

 

Dennes

v-ssriganesh
Community Support

Hi @DennesTorres,
Thanks for posting your query in the Microsoft Fabric Community forum.

  • Yes, Delta Streaming can be continuous. While notebooks might not be ideal for extended streaming due to potential session timeouts, Delta Streaming itself persists independently.
  • Stopping the notebook session doesn't terminate the streaming job. The session acts as a trigger for initiating the stream, but the underlying Spark job continues processing data.
  • Spark provides tools to manage streaming jobs. You can monitor, stop, and restart them using the Spark UI or programmatically.

 Benefits of Continuous Delta Streaming:

  • Continuously process incoming data streams for near real-time analytics.
  • Gain insights from data with minimal delay compared to batch processing.
  • Efficiently handle fluctuating data volumes with horizontal scaling.

 

I hope this clarifies your questions about Delta Streaming in Fabric. Should you have any further inquiries or require additional assistance, please do not hesitate to ask. If you find this information helpful, kindly Accept it as a solution and leave a "Kudos" to aid other members in locating it more easily.

Thank you.

 

Hi,

"

  • Stopping the notebook session doesn't terminate the streaming job. The session acts as a trigger for initiating the stream, but the underlying Spark job continues processing data.
  • Spark provides tools to manage streaming jobs. You can monitor, stop, and restart them using the Spark UI or programmatically."

Could you provide details about how to do this, i.e., how to manage streaming jobs after the session has finished?

Hi @DennesTorres ,

That's a great follow-up question! Here's how you can manage Delta Streaming jobs after your notebook session finishes:

  • Once your Delta Streaming job is initiated through the notebook, locate the link to the Spark UI in your notebook's output. This UI provides a web interface to monitor and manage Spark applications, including your streaming job. Within the Spark UI, look for the Streaming tab or section. This will display a list of active streaming queries, including your Delta Streaming job.
  • Each streaming job entry will provide details like its name, ID, and current status. You can then perform actions like:
    • Stop: Clicking the stop button associated with your job will gracefully terminate the streaming query.
    • Monitor Progress: You can monitor the progress of your streaming job, including the number of records processed, latency, and any errors encountered.
  • Alternatively, you can leverage the Spark Session API within your notebook to programmatically manage streaming jobs. Here's a basic example:

    Code:

    from pyspark.sql import SparkSession

    # Reuse the session that started the stream
    spark = SparkSession.builder.getOrCreate()

    # PySpark has no getQueryByName; filter the active queries by name
    for query in spark.streams.active:
        if query.name == "your_query_name":
            query.stop()

 

By following these methods, you can effectively manage your Delta Streaming jobs even after your notebook session has ended. If you find this helpful, please Accept it as a solution and consider leaving a "Kudos" so other members can locate it more easily.
Thank you.
