champioz
Frequent Visitor

Inconsistent GroupBy behavior that ignores certain rows and duplicates others

Hi,

I'm trying to do a simple two-column GroupBy with a single aggregation, and noticed a few entries with incorrect output. So I filtered to a single ID both before and after grouping to compare the results.

 

Here is that one ID's appearance with no grouping. I am grouping by the ID and StageName columns and taking the MIN of CreatedDate:

champioz_0-1716841609866.png

 

Here is what happens when I filter to just that ID and then GroupBy (correct):

champioz_1-1716841637408.png

 

Here is what happens when I GroupBy the entire dataset and then filter to just this ID (incorrect: Closed Lost is missing and Probe is duplicated):

champioz_2-1716841663401.png

 

As you can see in the steps, this GroupBy and filtering are the only operations I am performing. How is this error possible?

 

Edit: The plot thickens. I reduced the size of the dataset by taking out a chunk of records I was going to remove afterwards anyway, and now the GroupBy gives correct results for the previously erroneous entries. Is there a row limit on grouping accuracy in Power Query? I went from about 100,000 rows to about 50,000.
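The comparison described above (filter then group versus group then filter) can also be checked outside Power Query. The following is a minimal sketch in plain Python, not the Power Query implementation. The sample rows, the ID values, and the helper names `group_min` and `filter_id` are made up for illustration. Only the column roles (ID, StageName, MIN of CreatedDate) come from the post.

```python
from datetime import date


def group_min(rows):
    """Group by (ID, StageName) and keep the earliest CreatedDate per group."""
    earliest = {}
    for row_id, stage, created in rows:
        key = (row_id, stage)
        if key not in earliest or created < earliest[key]:
            earliest[key] = created
    return earliest


def filter_id(grouped, wanted_id):
    """Keep only the grouped entries that belong to one ID."""
    return {key: value for key, value in grouped.items() if key[0] == wanted_id}


# Hypothetical sample rows: (ID, StageName, CreatedDate)
rows = [
    ("A1", "Probe", date(2024, 1, 5)),
    ("A1", "Probe", date(2024, 1, 2)),
    ("A1", "Closed Lost", date(2024, 2, 1)),
    ("B2", "Probe", date(2024, 1, 9)),
]

# Path 1: filter to one ID, then group.
filter_then_group = group_min([r for r in rows if r[0] == "A1"])

# Path 2: group the whole dataset, then filter to the same ID.
group_then_filter = filter_id(group_min(rows), "A1")

# For a correct grouping, both paths must give the same result.
mismatch = filter_then_group != group_then_filter
print("mismatch:", mismatch)
```

Because the grouped result is keyed by (ID, StageName), a duplicated stage or a missing stage for an ID would show up as a difference between the two paths.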

2 REPLIES
v-junyant-msft
Community Support

Hi @champioz ,

I tested it on a small dataset but was not able to replicate the situation you are facing, and no documentation mentions a limit on the number of rows that Power Query can process. This may be a temporary display error. Please test with a different ID to see if you face the same issue, or click Refresh Preview or try a different machine to see if the issue is resolved.

Best Regards,
Dino Tao
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

Hmm, I'm not sure about a display error, as the errant output persisted after applying the Power Query results in the PBI table view. Only about 25 out of 42,000 unique IDs were affected that I could find, but you can see how it would make me uneasy about the accuracy of the results. If it would be helpful, I can try to replicate this behavior with a batch of fake data and try it on multiple machines, though at present I don't really need further help, as reducing the dataset seems to have worked.
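A batch of fake data for such a replication attempt could be generated along these lines. This is only a sketch under assumptions: the row and ID counts mirror the rough figures mentioned in the thread, "Proposal" is an invented stage name, and the ID format, date range, and helper names are hypothetical. The CSV could be loaded into Power Query, and `expected` gives an independent reference result to compare the GroupBy output against.

```python
import csv
import io
import random
from datetime import date, timedelta

# "Probe" and "Closed Lost" appear in the thread; "Proposal" is a made-up stage.
STAGES = ["Probe", "Closed Lost", "Proposal"]


def make_fake_rows(n_rows, n_ids, seed=0):
    """Build reproducible fake rows of (ID, StageName, CreatedDate)."""
    rng = random.Random(seed)
    start = date(2024, 1, 1)
    return [
        (
            f"ID{rng.randrange(n_ids):06d}",
            rng.choice(STAGES),
            start + timedelta(days=rng.randrange(365)),
        )
        for _ in range(n_rows)
    ]


def reference_min(rows):
    """Reference result: earliest CreatedDate per (ID, StageName)."""
    earliest = {}
    for row_id, stage, created in rows:
        key = (row_id, stage)
        if key not in earliest or created < earliest[key]:
            earliest[key] = created
    return earliest


def write_csv(rows, handle):
    """Write the rows as CSV with a header, for loading into Power Query."""
    writer = csv.writer(handle)
    writer.writerow(["ID", "StageName", "CreatedDate"])
    for row_id, stage, created in rows:
        writer.writerow([row_id, stage, created.isoformat()])


rows = make_fake_rows(n_rows=100_000, n_ids=42_000)
expected = reference_min(rows)

# An in-memory buffer keeps the sketch self-contained; a real file works the same way.
buffer = io.StringIO()
write_csv(rows, buffer)
print(len(rows), "rows,", len(expected), "expected groups")
```

The fixed seed makes the dataset identical on every machine, so any difference in the grouped output would point at the tool rather than the data.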
