sambgv
Frequent Visitor

Distinct values increase after just removing columns

Hi! 

 

I spotted a strange behaviour in datasets that I work with. 

 

When I open a certain table in Power Query, I see one number of distinct values for the column "ID". When I remove every column except this one, I see a different, larger number of distinct values for it. When I keep "ID" plus 2-3 other columns and remove the rest, I get yet another number of distinct values for "ID". I do not apply any filters to the table; I only remove columns, and I get different results. Could you please explain why this happens? It looks as if the dataset gets larger just from removing columns. This happens with all the tables that I work with.

 

Data source: Amazon Redshift (OLAP).

 

Thank you 🙂 

1 ACCEPTED SOLUTION
DOLEARY85
Super User

Hi,

 

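By default, the Power Query editor shows only a preview of the data, and the column statistics (including the distinct count) are calculated on that preview, by default the top 1,000 rows, not on the full table. Removing columns can change the query that is sent to the source and therefore which rows come back in the preview, so the distinct count shown for the remaining columns can change even though the underlying data has not.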

 

If you're concerned that there is an error somewhere, you could duplicate the query in Power Query, remove the columns in one of the two copies, and load both to Power BI. In Report view you can then create a distinct count measure on the ID field from each table. I'd expect the two results to be the same.
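To illustrate the idea, here is a small Python sketch (not Power Query itself). It assumes a hypothetical table in which every ID appears twice, and a preview limited to the first 1,000 rows. The table, the `distinct_in_preview` helper, and the two row orders are made up for the example; the point is only that the same rows returned in a different order can give a different distinct count in the preview, while the distinct count over the full data stays the same.

```python
def distinct_in_preview(ids, preview_size=1000):
    """Count distinct IDs among the first `preview_size` rows only."""
    return len(set(ids[:preview_size]))


# Hypothetical table: 2,500 different IDs, each appearing twice (5,000 rows).
# Order A: duplicates sit next to each other -> 0, 0, 1, 1, 2, 2, ...
order_a = [i for i in range(2500) for _ in range(2)]
# Order B: the same rows, but every ID appears once before any repeats.
order_b = list(range(2500)) * 2

# Distinct counts seen in a 1,000-row preview depend on the row order.
preview_a = distinct_in_preview(order_a)
preview_b = distinct_in_preview(order_b)

# Distinct counts over the full data do not depend on the row order.
full_a = len(set(order_a))
full_b = len(set(order_b))

print("preview:", preview_a, preview_b)
print("full:   ", full_a, full_b)
```

In this sketch the preview shows 500 distinct IDs for one order and 1,000 for the other, while both full counts are 2,500. That is the same comparison the duplicate-and-load check above makes on the real data.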

 

If I answered your question, please mark my post as solution, Appreciate your Kudos 👍

3 REPLIES
Thanks! It worked. Do you by chance have any links for further reading on the topic? I have just never encountered this "issue" before.

Unfortunately I don't have anything specific to this issue, but you could try the Power Query documentation. It should have something on data sources, cardinality and query folding, which may also account for the behaviour:

 

https://learn.microsoft.com/en-us/power-query/

 

