DennesTorres
Impactful Individual

What are the scenarios for Delta Streaming?

Hi,

I'm not sure I fully understand the usage scenarios of Delta Streaming in Fabric. I'm noticing information about it in the documentation and in labs that prepare for certification, so I think I may be missing something.

For example in this lab: https://microsoftlearning.github.io/mslearn-fabric/Instructions/Labs/03-delta-lake.html

 

The following line starts a readStream that waits for files in the folder:

iotstream = spark.readStream.schema(jsonSchema).option("maxFilesPerTrigger", 1).json(inputPath)
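
For context, the lab then writes this stream out to a Delta table along these lines (paraphrasing; the variable names for the checkpoint and table paths are approximations, not the lab's exact code):

# Roughly the lab's next step: write the stream to a Delta table.
# checkpointpath and deltatablepath are placeholder variables holding folder paths.
deltastream = (iotstream.writeStream
    .format("delta")
    .option("checkpointLocation", checkpointpath)
    .start(deltatablepath))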

This raises lots of questions:

1) Is this continuous? I tried continuous execution in notebooks before and it failed; support told me that notebooks are not intended for continuous execution. Hence the question: is this continuous?

2) If it's continuous, does it survive even if the notebook session is stopped, or does the notebook session become continuous when a stream is started?

3) If the session stops and the streaming persists, how do I manage the stream later? Stop it? Restart it?

4) If this is not continuous, what's the benefit of a stream that is not continuous and will stop once the session stops?

Thank you in advance for clarifications!

 

Kind Regards,

 

Dennes


6 REPLIES
v-ssriganesh
Community Support

Hello @DennesTorres,

Could you please confirm if your questions have been resolved? If they have, kindly mark the helpful response and Accept it as the solution. This will assist other community members in resolving similar issues more efficiently.

Thank you.

Hi,

It explains a lot, but I'm still not sure. I will need to run a detailed experiment on this.

I posted this same question in other places and received detailed answers as well, but opposite to this one. So I will need to test to find out what the answer really is.

Any help analysing the answers against each other will be very welcome.

https://www.reddit.com/r/MicrosoftFabric/comments/1hp5hkn/what_are_the_scenarios_for_delta_streaming...

Hi @DennesTorres,

Thank you for your patience and for sharing your thoughts!

Having reviewed the link you shared: Richbenmintz offered a comprehensive overview of Delta Streaming's capability to operate continuously, and their points about how the streaming job persists relative to the notebook session are accurate. Streaming queries can indeed keep running and enable near real-time analytics.

To clarify my earlier response, which may have caused some confusion: it is true that if you don't call awaitTermination() or stop(), the streaming query keeps running in the background of the session, but I should have emphasized that it will stop once the notebook session is terminated.

  • Yes, Delta Streaming can be continuous, but long-term execution typically requires a Spark job definition rather than a notebook.
  • The streaming query keeps running independently of individual notebook cells, but once the session is stopped, the stream stops as well (see the sketch below).
  • You can manage streaming queries through the Spark UI or programmatically, but they will need to be restarted if the session ends.
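
To illustrate the lifecycle (a minimal sketch, not Fabric-specific guidance; the folder paths, schema, and query name are placeholders):

Code:

    from pyspark.sql.types import StructType, StringType, IntegerType

    # "spark" is the SparkSession provided by the notebook.
    # Placeholder schema and input folder, roughly mirroring the lab's pattern.
    schema = StructType().add("device", StringType()).add("reading", IntegerType())
    stream_df = spark.readStream.schema(schema).json("Files/streaming_input")

    query = (stream_df.writeStream
        .format("delta")
        .option("checkpointLocation", "Files/checkpoints/demo_stream")
        .queryName("demo_stream")
        .start("Tables/demo_stream"))

    # The cell returns immediately; the query keeps running inside this Spark session.
    # query.awaitTermination() would block the cell instead, and query.stop() ends the query.
    # When the notebook session is stopped, the query stops with it.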

I appreciate your willingness to experiment further, as hands-on experience is invaluable for understanding these concepts. If this proves helpful, kindly Accept it as a solution and leave a "Kudos" so other members can easily find it.
Thank you. 

v-ssriganesh
Community Support

Hi @DennesTorres,
Thanks for posting your query in the Microsoft Fabric Community Forum.

  • Yes, Delta Streaming can be continuous. While notebooks might not be ideal for extended streaming due to potential session timeouts, Delta Streaming itself persists independently.
  • Stopping the notebook session doesn't terminate the streaming job. The session acts as a trigger for initiating the stream, but the underlying Spark job continues processing data.
  • Spark provides tools to manage streaming jobs. You can monitor, stop, and restart them using the Spark UI or programmatically.

 Benefits of Continuous Delta Streaming:

  • Continuously process incoming data streams for near real-time analytics (see the sketch after this list).
  • Gain insights from data with minimal delay compared to batch processing.
  • Efficiently handle fluctuating data volumes with horizontal scaling.
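
As a small illustration of the near real-time aspect (a sketch only; the table path and column name are placeholders, and it assumes a streaming query like the lab's is already appending to the table): while the stream is running, the same Delta table can be read back with an ordinary batch query, so analyses see fresh rows with only the trigger's delay.

Code:

    # Batch-read the Delta table that the streaming query is appending to
    # ("Tables/demo_stream" and the "device" column are placeholders).
    latest = spark.read.format("delta").load("Tables/demo_stream")
    latest.groupBy("device").count().show()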

 

I hope this clarifies your questions about Delta Streaming in Fabric. Should you have any further inquiries or require additional assistance, please do not hesitate to ask. If you find this information helpful, kindly Accept it as a solution and leave a "Kudos" to aid other members in locating it more easily.

Thank you.

 

Hi,

"

  • Stopping the notebook session doesn't terminate the streaming job. The session acts as a trigger for initiating the stream, but the underlying Spark job continues processing data.
  • Spark provides tools to manage streaming jobs. You can monitor, stop, and restart them using the Spark UI or programmatically."

Could you provide details about how to do this, i.e. how to manage streaming jobs after the session has finished?

Hi @DennesTorres ,

That's a great follow-up question! Here's how you can manage Delta Streaming jobs after your notebook session finishes:

  • Once your Delta Streaming job is initiated through the notebook, locate the link to the Spark UI in your notebook's output. This UI provides a web interface to monitor and manage Spark applications, including your streaming job. Within the Spark UI, look for the Structured Streaming tab; it lists the active streaming queries, including your Delta Streaming job.
  • Each streaming job entry will provide details like its name, ID, and current status. You can then perform actions like:
    • Stop: Clicking the stop button associated with your job will gracefully terminate the streaming query.
    • Monitor Progress: You can monitor the progress of your streaming job, including the number of records processed, latency, and any errors encountered (see the monitoring snippet after the code below).
  • Alternatively, you can leverage the Spark Session API within your notebook to programmatically manage streaming jobs. Here's a basic example:

    Code:

    from pyspark.sql import SparkSession

    # Reuse the notebook's active Spark session
    spark = SparkSession.builder.getOrCreate()

    # Stop a specific streaming query by name (drop the name check to stop them all)
    for query in spark.streams.active:
        if query.name == "your_query_name":
            query.stop()
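
For monitoring, the same query handles expose status information (a minimal sketch; it assumes the session from the snippet above and at least one active query):

Code:

    # Inspect each active streaming query in this session
    for query in spark.streams.active:
        print(query.name, query.id)
        print(query.status)        # current trigger/data-availability state
        print(query.lastProgress)  # metrics from the most recent micro-batch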

 

By following these methods, you can effectively manage your Delta Streaming jobs even after your notebook session has ended. If you find this helpful, please Accept it as a solution and consider leaving a "Kudos" so other members can locate it more easily.
Thank you.
