
DennesTorres
Impactful Individual

Group by on eventstream's lakehouse target doesn't aggregate numeric fields

Hi,

The problem is that the fields are not available for applying an aggregation function. I can apply a count, but I can't apply a sum, avg, or any other aggregation.

 

This was with the NYTaxi sample data, following closely the existing real-time walkthrough on the Microsoft portal, except that I included a lakehouse target (two targets: Kusto and lakehouse), and this happened on the transformation for the lakehouse.

Is this a bug, or did I miss something? The image below illustrates the problem. The only field that appears is "partitionId":

DennesTorres_0-1696351596900.png


Kind Regards,

 

Dennes

9 REPLIES
v-nikhilan-msft
Community Support

Hi @DennesTorres,
Thanks for using Fabric Community.

For the sample data you are using (i.e., Yellow Taxi), the incoming event stream has only VARCHAR columns, except for partition_ID, which is of type INTEGER. An aggregate function can only run on integers or decimals; this is why the aggregate can be applied only on partition_ID. I have attached the screenshot for your reference.
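If it helps to see the same constraint outside the editor, here is a minimal PySpark sketch of checking the types that land in the Lakehouse table. The table name and the exact schema are hypothetical, purely for illustration:

```python
# Minimal sketch, assuming a Fabric/Spark notebook session and a hypothetical
# Lakehouse table ("nyc_taxi_events") that the eventstream writes to.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

events = spark.read.table("nyc_taxi_events")  # hypothetical table name
events.printSchema()
# In this scenario, every column prints as string except the integer
# partition id, so only that column is offered for sum/avg in the editor.
```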

Hope this helps. Do let us know if you have any further queries.

 

 

vnikhilanmsft_0-1696418621736.png

Hi,

In the image below, you may notice three real fields and one long field from the Yellow Taxi sample data.

I even converted one of the real fields to integer.

 

DennesTorres_0-1696539672712.png

 


So the explanation that "all the fields are string" would not be possible, unless the bug with the lakehouse target is much bigger than I thought and each target is recognizing different data types from the same source, which should not happen at all.

Did I miss something, or is this a bug in how the targets recognize the data types?

Kind Regards,

Dennes

Hi @DennesTorres ,

As I understand it, you are facing an issue applying aggregate operations on the columns in the destination. There are two cases here.

 

1) In the first case, the eventstream destination is set to a Lakehouse table. The sample data you are using (i.e., Yellow Taxi) is converted to string by default here, because we are reading the data from a CSV file, which carries no type information; hence all the columns will be converted to string. For the Lakehouse it is string by default, but you can change the types using the event processor (no-code editor).

 

2) In the second case, the destination is set to a KQL database. Kusto applies type conversion by default, so the datatypes will not be changed and will match the source datatypes, as you have shown me in the screenshot.

 

So if you want to use a Lakehouse as the eventstream destination and apply aggregate operations on your columns, you have to change the datatypes of the columns; an aggregate operation cannot be applied to a string column. This can be achieved using the event processor (no-code editor). I have created a repro on my end and am attaching the screenshots for your reference; I am able to apply aggregate operations on the other columns as well.

Hence it is not a bug.
Please refer to this link for changing the datatypes of the columns: link
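For reference, the same fix can be expressed in code: a minimal PySpark sketch (table, column, and grouping names are hypothetical) that casts the string fields to double before grouping and aggregating, mirroring what the event processor does when you change the column types:

```python
# Minimal sketch, assuming a Spark session and hypothetical names: cast the
# string-typed fields to a numeric type first, then sum/avg become possible.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

events = spark.read.table("nyc_taxi_events")             # hypothetical table name

typed = (events
         .withColumn("fare_amount", F.col("fare_amount").cast("double"))
         .withColumn("trip_distance", F.col("trip_distance").cast("double")))

summary = (typed
           .groupBy("vendor_id")                          # hypothetical grouping column
           .agg(F.sum("fare_amount").alias("total_fare"),
                F.avg("trip_distance").alias("avg_distance")))

summary.show()
```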


vnikhilanmsft_1-1696606198753.png

 

 

vnikhilanmsft_0-1696606143514.png

 

vnikhilanmsft_2-1696606252556.png

Hope this helps. Do let us know if you have any further queries. Glad to help.

Hi,

 

This explains it, and it works. But it doesn't change the fact that it's a terrible idea to have transformations and types completely tied to the target.

It's bad for us, who will use the technology, because it tends to make maintenance more difficult. It's bad for the developers because it multiplies the development effort and creates the requirement to keep the development in sync in two different places - and it's not in sync now; the targets don't have the same features.

This problem was not there before. When using streaming dataflows, as you can see in the image below, the transformations are done by the dataflow, independent of the target. I could add two targets receiving the same transformation result.

DennesTorres_0-1697128897111.png


When using Stream Analytics, the transformations were done with the Stream Analytics query language, and we could send the result to multiple targets:

DennesTorres_1-1697129062556.png

Keeping the transformations, and even the data types, tied to the target doesn't seem like a good idea at all, and the previous technologies were not like this.

I hope this feedback reaches the development team.

Kind Regards,

Dennes



Thank you for sharing this feedback. We have heard it. The team is working on bringing the event-processing logic design to the main canvas, so that common transformation logic can be reused by multiple destinations.

Hi @DennesTorres ,
Thanks for your valuable feedback. We will certainly keep an eye on it.

We would appreciate it if you could share the feedback on our feedback channel, which is open for the user community to upvote and comment on. This allows our product teams to effectively prioritize your request against our existing feature backlog and gives insight into the potential impact of implementing the suggested feature.

Hi @DennesTorres ,
We haven't heard back from you on the last response and were just checking to see if your issue got resolved. Please let us know if you have any further queries.


Hi @DennesTorres ,
Glad that your query got resolved. Please continue using Fabric Community for any help regarding your queries.
