Hi,
I have an eventstream that reads from an event hub and writes to a KQL database and a lakehouse.
At regular intervals a notebook runs to delete a portion of data from the lakehouse table, but a DELTA_CONCURRENT_APPEND error is often raised because the events arrive continuously.
Perhaps I need to change my approach to limiting the size of the lakehouse table.
I've already posted this question in the Eventstream forum and was advised to post it here.
Any suggestions, please? Thanks
Hi @pmscorca
I’m not very familiar with Eventstream, but it seems like you want to keep the data in your destination lakehouse from getting too large. You can do this in two ways: one is to reduce the data ingested into the lakehouse, and the other is to remove the data you no longer need from the lakehouse. Obviously, you have already considered both but haven’t implemented them yet. Here are some specific suggestions.
Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!
Hi, thanks for your reply.
Let me try to explain the scenario better.
Eventstream reads events from a source, in this case Azure Event Hubs, and writes them to a destination, in this case a KQL database, in real time. The events arrive continuously.
For a KQL database it is possible to set a data retention policy.
For historicization purposes, in Eventstream it is possible to add a lakehouse as a destination. This is a temporary repository that precedes the write into a warehouse. To avoid accumulating too many events in the lakehouse, I thought to delete the portion of data (in terms of rows, not fields) older than 30 minutes, because the events arrive continuously. I'm interested in writing the entire event to the lakehouse, not only specific event fields, so when I talk about limiting the size of the lakehouse table I mean removing entire event rows already written to the table. At regular intervals, the events in the lakehouse are written to the downstream warehouse.
I thought to use a delete in a notebook to limit the size of the lakehouse table, because there doesn't seem to be a simple data retention option to set, as there is for a KQL database.
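For example, the kind of delete I mean looks roughly like this; the table name eventstream_events and the timestamp column EventEnqueuedUtcTime are just placeholders for my real names:

```python
# Illustrative sketch of the periodic clean-up notebook cell.
# Table and column names are placeholders, not the real ones.
from pyspark.sql import SparkSession

# In a Fabric notebook `spark` already exists; getOrCreate() just reuses it.
spark = SparkSession.builder.getOrCreate()

# Remove whole event rows that are older than 30 minutes.
spark.sql("""
    DELETE FROM eventstream_events
    WHERE EventEnqueuedUtcTime < current_timestamp() - INTERVAL 30 MINUTES
""")
```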
The DELTA_CONCURRENT_APPEND error occurs when two or more operations attempt to add data to the same Delta table at the same time. So either there is an alternative way to remove event rows older than 30 minutes, or the delete operation run from a notebook has to work in a scenario where events arrive continuously.
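From what I understand, the delta-spark Python package exposes this conflict as ConcurrentAppendException, so in principle the notebook could catch it and retry the delete. A rough sketch, again with illustrative names (delete_old_rows, eventstream_events, EventEnqueuedUtcTime) and assuming delta-spark's exception classes are available in the Fabric notebook:

```python
# Illustrative sketch: retry the time-based delete when the eventstream's
# continuous appends cause a concurrent-write conflict.
import time

from delta.exceptions import ConcurrentAppendException
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # reuses the notebook session


def delete_old_rows(max_retries: int = 5, wait_seconds: int = 10) -> None:
    for _ in range(max_retries):
        try:
            spark.sql("""
                DELETE FROM eventstream_events
                WHERE EventEnqueuedUtcTime < current_timestamp() - INTERVAL 30 MINUTES
            """)
            return
        except ConcurrentAppendException:
            # The eventstream committed an append while the delete was running;
            # back off and try again.
            time.sleep(wait_seconds)
    raise RuntimeError("Delete kept conflicting with concurrent appends")


delete_old_rows()
```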
Thanks