I am currently exploring the different ways to drop duplicates in an Eventstream via the event processor.
My understanding is that by using Group by we can at least identify the duplicates.
Note: I need to drop duplicates in the Eventstream only.
I wanted to follow up since I haven't heard from you in a while. Have you had a chance to try the suggested solutions?
If your issue is resolved, please consider marking the post as solved. However, if you're still facing challenges, feel free to share the details, and we'll be happy to assist you further.
Looking forward to your response!
Best Regards,
Community Support Team _ C Srikanth.
Hello @chetanhiwale,
Thank you for your question in this Fabric community forum.
If the ingested data is sent to an Eventhouse KQL database table, you can also consider ignoring duplicates in the Eventhouse using a materialized view.
Consider this example based on the Weather data service:
// Keep only the latest ingested row per dateTime value (newest EventProcessedUtcTime wins).
.create materialized-view with (lookback=20m, lookback_column="EventProcessedUtcTime", backfill=true, docString="Unique Weather data entries", folder="MaterializedViews") WeatherDataset on table WeatherData
{
    WeatherData
    | summarize arg_max(EventProcessedUtcTime, *) by dateTime
}
Here, for each dateTime value, only the latest ingested row (the one with the maximum EventProcessedUtcTime) is kept, so duplicate entries for the same timestamp are dropped.
The table on which this view is based has a limited retention time to keep the database storage size to a minimum:
.alter-merge table WeatherData policy retention softdelete = 1h recoverability = disabled
Of course, this only works for this particular type of time-series database.
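For completeness, a query against the view then returns at most one row per dateTime; the time filter below is just illustrative:

// Read the deduplicated data; each dateTime appears once.
WeatherDataset
| where dateTime > ago(1h)
| order by dateTime asc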
If you find this reply helpful with your problem, please consider hitting the upvote or Accept as Solution button.
We haven't heard from you since your last response and just wanted to check whether the solution provided has worked for you. If yes, please Accept it as Solution to help others in the community benefit.
Thank you.
If the above information is helpful, please give us Kudos and mark the response as Accepted as solution.
Best Regards,
Community Support Team _ C Srikanth.
Thank you for being part of the Microsoft Fabric Community.
As highlighted by @datacoffee, the proposed approach appears to address your requirements effectively. Could you please confirm whether your issue has been resolved?
If you are still facing any challenges, kindly provide further details, and we will be happy to assist you.
If the above information is helpful, please give us Kudos and mark the response as Accepted as solution.
Best Regards,
Community Support Team _ C Srikanth.
Hi
You can use a windowing function with the "Group by" transformation.
This gives you a choice of different windowing functions, and you can set a time span over which the window should look and remove the duplicates (group the rows).
It is not the best option to remove them here, though; I would remove them in the Eventhouse using an update policy or a materialized view (see the sketch below). That way you stay out of the windowing function and can have a faster stream of data into the Eventhouse. The update policy or materialized view is also quicker than the Eventstream at data processing 🙂
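For reference, a minimal sketch of the update-policy variant, assuming the raw table is WeatherData (as in the materialized-view example earlier in this thread) and WeatherDataClean is a hypothetical target table whose schema matches the query output:

// Hypothetical sketch, not a definitive implementation; run each command separately.
// Caveat: an update policy runs per ingestion batch, so this only collapses
// duplicates that arrive within the same batch.
.create function with (docString="Dedup one ingestion batch") DedupWeatherBatch() {
    WeatherData
    | summarize arg_max(EventProcessedUtcTime, *) by dateTime
}

.alter table WeatherDataClean policy update
@'[{"IsEnabled": true, "Source": "WeatherData", "Query": "DedupWeatherBatch()", "IsTransactional": false}]'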
Hi @datacoffee, thanks for the answer.
Yes, we can drop duplicates in the Eventhouse, but our need is to drop them in the Eventstream only.
In Group by, which operation are you suggesting? Can you please mention it?
Using count, we can identify the duplicates, but removing them seems a different task to me.
Sure,
In the Eventstream you can add a transformation task; there you will find the "Group by" operator.
In this task you can add a windowing function and set up the needed aggregation.
I hope you can find it 😊
Yeah, @datacoffee is correct. You can find the Group by operation in the drop-down list of transformations.
When you configure it, you can select 'Average' as the aggregation method and choose which column to aggregate on. Then select all the remaining columns as the columns to group by.
This is how you drop duplicates with the current no-code experience.
Once Eventstream supports writing SQL code (coming soon), you will be able to use code like the following to remove duplicates:
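(The snippet that followed was not preserved in this thread; the block below is only a hedged sketch of what such a Stream Analytics-style dedup query could look like. All column names (deviceId, eventTime, temperature) and the window size are assumptions.)

-- Hypothetical sketch; the original snippet was not preserved.
-- Rows sharing the same deviceId within a 5-minute tumbling window
-- collapse into a single output row; MAX() is just one way to pick a
-- representative value per group.
SELECT
    deviceId,
    System.Timestamp() AS windowEnd,
    MAX(temperature) AS temperature
INTO
    [output]
FROM
    [input] TIMESTAMP BY eventTime
GROUP BY
    deviceId,
    TumblingWindow(minute, 5)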
In the current no-code experience, you can use the 'Group by' operator.