Hi,
I am trying to create a streaming data pipeline where I have a single source folder in ADLS Gen2 that receives multiple files belonging to different Oracle tables. In the Fabric workspace, I have a Spark streaming notebook that reads the files from this single source folder in ADLS Gen2, and I want to write them to multiple folders, one per table name, in OneLake in the Fabric workspace. How can we achieve this streaming requirement without creating a separate stream for each table?
Solved! Go to Solution.
Hi @_augustine_ ,
Thanks for using Fabric Community.
I am just sharing some ideas and a generalised approach based on my understanding.
Here's how you can achieve your streaming requirement in Microsoft Fabric without creating multiple streams for each table:
1. Pipeline with Notebook Activity: run the Spark notebook from a Data Factory pipeline using a Notebook activity.
2. Event-Driven Trigger: trigger the pipeline when new files land in the ADLS Gen2 source folder.
3. Notebook Logic: read the folder as a single stream, extract the table name from each incoming file, and route the data to the matching folder in OneLake.
4. Benefits: one stream to build and maintain instead of one stream per table.
By following these steps, you can create a single streaming pipeline in Microsoft Fabric that efficiently reads data from a single ADLS Gen2 folder, extracts table names, partitions the stream, and writes data to separate folders in One Lake based on the table name.
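The single-stream routing step can be sketched in PySpark with one `readStream` plus `foreachBatch`. This is only an illustration: the file-naming convention (`<TABLE>_<timestamp>.csv`), the helper names, and the Delta sink are assumptions, not part of the original answer — adapt them to your actual layout.

```python
import os
import re

def table_from_path(path: str) -> str:
    """Extract the table name from a file path such as
    '.../CUSTOMERS_20240101.csv' -> 'CUSTOMERS' (hypothetical naming)."""
    name = os.path.basename(path)
    m = re.match(r"([A-Za-z_]+?)_\d", name)
    return m.group(1) if m else os.path.splitext(name)[0]

def start_routing_stream(source_path, sink_root, checkpoint, schema):
    """One stream for all tables: tag each row with the table name derived
    from its source file, then write each table's rows to its own folder."""
    from pyspark.sql import SparkSession  # available in a Fabric notebook
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    table_udf = F.udf(table_from_path)

    df = (spark.readStream
            .format("csv")
            .option("header", "true")
            .schema(schema)  # streaming file sources need an explicit schema
            .load(source_path)
            .withColumn("_table", table_udf(F.input_file_name())))

    def write_batch(batch_df, batch_id):
        # One write per table present in this micro-batch.
        for row in batch_df.select("_table").distinct().collect():
            t = row["_table"]
            (batch_df.filter(F.col("_table") == t)
                     .drop("_table")
                     .write.mode("append")
                     .format("delta")
                     .save(f"{sink_root}/{t}"))

    return (df.writeStream
              .foreachBatch(write_batch)
              .option("checkpointLocation", checkpoint)
              .start())
```

`foreachBatch` is used here (rather than `partitionBy`) so each table lands in its own top-level OneLake folder instead of in `_table=<name>` subdirectories.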
This is just an idea for your scenario; I hope it helps. Please do let me know in case of further queries.
Hi,
Thanks a lot for the solution. I am following the above approach, but for the notebook part I want to use a Spark streaming notebook, with an option to stop the streaming notebook when there are no files in ADLS Gen2.
Example:
Multiple files arrive at certain intervals in ADLS Gen2, which triggers the pipeline and the Spark streaming notebook. When there are no files, the pipeline stops, which stops the Spark streaming notebook.
In the solution you provided, files arriving after the pipeline has started won't be read by the notebook, right?
Hi @_augustine_ ,
Glad to know that you got some insights.
Yes, your understanding is right. The notebook can read the files that are present before execution, but not those that arrive after the pipeline has started.
You can also refer to this - Get started with streaming data in lakehouse - Microsoft Fabric | Microsoft Learn
I hope this might help you.
Thank you
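One way to get the "stop when there are no more files" behaviour discussed above is Spark's `availableNow` trigger (Spark 3.3+): the query processes every file already in the source folder in micro-batches and then terminates on its own, letting the notebook and the pipeline run finish. A sketch, assuming a PySpark notebook; the helper names and the 30-second polling interval are illustrative assumptions:

```python
def trigger_kwargs(stop_when_idle: bool) -> dict:
    """Keyword arguments for DataStreamWriter.trigger().
    availableNow=True (Spark 3.3+) drains the files already present in the
    source and then stops the query, so the notebook can finish on its own."""
    if stop_when_idle:
        return {"availableNow": True}
    return {"processingTime": "30 seconds"}  # keep running, poll every 30s

def run_once_and_stop(stream_writer, checkpoint: str):
    """Start the query so it processes all available files and terminates."""
    query = (stream_writer
             .option("checkpointLocation", checkpoint)
             .trigger(**trigger_kwargs(stop_when_idle=True))
             .start())
    query.awaitTermination()  # returns once the available files are processed
    return query
```

With this trigger, files that land after the query starts are simply picked up by the next pipeline run (via its event trigger), which matches the behaviour described in the reply above.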
Hi @_augustine_ ,
We haven't heard back from you on the last response and wanted to check whether your query was answered.
If not, please reply with more details and we will try to help.
Thanks