
RealTime File Processing in Microsoft Fabric

Hi,

I'm currently working on a POC where data from multiple sources lands in a Lakehouse folder. The requirement is to automatically pick up each file as soon as it lands, process it, and push the data to EventHub.

We initially considered using Data Activator for this, but it doesn't support passing parameters to downstream jobs. This poses a risk, especially when multiple files arrive simultaneously, as it could lead to conflicts or incorrect processing.

Additionally, we are dealing with files that can range from a single record to millions of records, which adds another layer of complexity.

Given these challenges, what would be the best approach to handle this scenario efficiently and reliably? Any suggestions would be greatly appreciated.

Thanks in advance!

9 REPLIES 9
v-ssriganesh
Community Support

Hi @ananthkrishna99,
I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.

v-ssriganesh
Community Support

Hi @ananthkrishna99,

May I ask if you have resolved this issue? If so, please mark it as the solution. This will be helpful for other community members who have similar problems to solve it faster.

Thank you.

v-ssriganesh
Community Support

Hi @ananthkrishna99,

Thank you for confirming that you can use an Eventhouse. This is an excellent choice for your real-time file processing needs. Kudos to @lbendlin for the great suggestion. To meet your requirements for automatically processing files as they arrive, sending data to Azure Event Hubs, handling simultaneous file arrivals, and scaling for files ranging from single records to millions, the best approach is to use Eventstream with Eventhouse in Microsoft Fabric. This solution processes files in near real time, routes data to Event Hubs, and uses metadata to manage concurrent files and address the Data Activator limitation on parameter passing.

At a high level, Eventstream monitors your file landing location, ingests data into an Eventhouse for real-time analytics, and routes processed data to Event Hubs. Metadata ensures conflict-free processing of simultaneous files, and Eventhouse scales to handle varying file sizes, making it ideal for your proof of concept (POC).

For your reference, official documentation links:

 

If this information is helpful, please “Accept as solution” and give a "kudos" to assist other community members in resolving similar issues more efficiently.
Thank you.

Eventstream doesn't support all file formats. I need to parse CSV, Excel, and XML files, and they can be very large, sometimes around 2 GB. Eventstream is meant for event-based processing, not for actual file processing. Databricks streaming jobs have a file watcher capability, which Fabric lacks. Data Activator only sends a notification to the jobs saying that a file has arrived; it can't send any metadata about the file, which the downstream jobs require. I'm not sure when these features are planned for release. Please let us know if you have any information on this.

Hello @ananthkrishna99,

Thank you for the additional context around your use case, particularly the need to process large CSV, Excel, and XML files (up to 2 GB). You are correct that Eventstream and Data Activator in Microsoft Fabric currently have limitations.

To work around these limitations, here’s a hybrid strategy you can implement using Fabric-native tools:

  • Can still act as a lightweight trigger to detect new files. Metadata is limited but useful in certain structured landing scenarios.
  • Use these to orchestrate downstream processing by capturing arrival events (from Eventstream or another trigger source) and passing file metadata to notebooks as parameters.
  • Spark Structured Streaming in Notebooks: read the formats Spark handles natively directly, and use preprocessing or Python libraries inside the notebook for the rest. For large files, Spark supports partitioning and checkpointing to maintain state and avoid conflicts when multiple files arrive concurrently.
  • For better scalability, consider converting Excel and XML files to CSV or Parquet before loading into Spark.
  • Use Fabric to stream the processed data to Event Hubs for real-time analytics or downstream apps.
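The "pass file metadata as parameters" and "avoid conflicts when multiple files arrive concurrently" points above can be illustrated with a minimal sketch in plain Python. It is not a Fabric API: `claim_new_files`, the directory layout, and the `.claim` marker convention are all hypothetical. It also assumes the landing folder is reachable as an ordinary file system path and that exclusive file creation is atomic there, which should be verified for a mounted Lakehouse.

```python
import os
from pathlib import Path


def claim_new_files(landing_dir, claims_dir):
    """Return one parameter dict per file that this run managed to claim.

    Hypothetical helper: a file is "claimed" by creating a marker file with
    exclusive-create semantics, so two overlapping runs never both pick up
    the same file (assuming the file system honours O_EXCL atomically).
    """
    landing = Path(landing_dir)
    claims = Path(claims_dir)
    claims.mkdir(parents=True, exist_ok=True)

    claimed = []
    for path in sorted(landing.iterdir()):
        if not path.is_file():
            continue
        marker = claims / (path.name + ".claim")
        try:
            # O_EXCL makes creation fail if another run already claimed this file.
            fd = os.open(marker, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        os.close(fd)
        # These keys are illustrative; use whatever the downstream job expects.
        claimed.append({
            "file_path": str(path),
            "file_name": path.name,
            "size_bytes": path.stat().st_size,
        })
    return claimed
```

Each returned dict describes exactly one file, so an orchestrator could start one downstream run per dict and hand it over as that run's parameters, instead of relying on the trigger itself to carry the metadata.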

Please refer to the following official documentation: https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-streaming-data and check the Microsoft Fabric Roadmap for updates: https://roadmap.fabric.microsoft.com/?product=administration%2Cgovernanceandsecurity
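Because the files in this thread range from a single record to millions, one way to keep the Event Hubs side predictable is to stream a file row by row and group the rows into bounded batches. The sketch below uses only the Python standard library and does not call any Event Hubs SDK. `iter_event_batches`, `max_bytes`, and `max_events` are hypothetical names, and the default values are placeholders to be replaced with the limits of your Event Hubs tier.

```python
import csv
import io
import json


def iter_event_batches(csv_file, max_bytes=900_000, max_events=1000):
    """Yield lists of JSON payloads, each list bounded by size and count.

    Hypothetical helper: rows are read lazily, so a multi-gigabyte CSV is
    never loaded into memory in full. A single row larger than max_bytes is
    still emitted, alone in its own batch.
    """
    reader = csv.DictReader(csv_file)
    batch, batch_bytes = [], 0
    for row in reader:
        payload = json.dumps(row, separators=(",", ":"))
        size = len(payload.encode("utf-8"))
        # Close the current batch before it would exceed either limit.
        if batch and (batch_bytes + size > max_bytes or len(batch) >= max_events):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(payload)
        batch_bytes += size
    if batch:
        yield batch


# Usage: three rows with at most two events per batch.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
batches = list(iter_event_batches(sample, max_events=2))
print([len(b) for b in batches])  # prints [2, 1]
```

A one-record file yields a single batch of one event, and a file with millions of rows yields many batches through the same code path, so the downstream sender does not need to special-case file size.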

We understand the importance of native file watching and metadata-rich triggers. While these features are not yet available in Fabric, they are on the radar for future enhancements. We encourage you to submit ideas in the Fabric Ideas - Microsoft Fabric Community to help prioritize these features.

Hello @ananthkrishna99,

Hope everything’s going great on your end! Just checking in: has the issue been resolved, or are you still running into problems? Sharing an update can really help others facing the same thing.

Thank you.

Hi @ananthkrishna99,
I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please accept it as a solution and give it a 'Kudos' so other community members with similar problems can find a solution faster.
Thank you.

lbendlin
Super User

Do you have an option to land this data directly in the eventhouse, rather than the lakehouse?

@lbendlin, yes, I can land the data in the Eventhouse.
