Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi community,
I have an issue with aggregating a table and getting the correct distinct count on every level of aggregation. I need this aggregation on Power Query (or SQL) not on DAX.
The table has several columns and I need the distinct count of one of them and the sum of the amounts of others.
The issue is when we need the total on different levels. For example the daily distinct count is correct, but if we sum it up to get the monthly one, it will be wrong because a same value can appear in several days. And there are plenty of combinations between dates (year, quarter, month,day) and every column and every combination of them.
The table looks like this and the column where we need the distinct count is F_id.
Any thoughts more than welcome. thanks.
| Date | A_id | B_id | C_id | D_id | E_id | F_id | Amount1 | Boolean | Amount2 |
| 2/13/2022 | 9a | 2b | 2 | 1d | 2f | 1 | 1 | 3000 | |
| 2/13/2022 | 2a | 2b | 1 | 2d | 4e | 4f | 1 | 1 | 1000 |
| 2/12/2022 | 5a | 2b | 1 | 2d | 13e | 4f | 1 | 1 | 3000 |
| 2/12/2022 | 7a | 2b | 1 | 2d | 4f | 1 | 1 | 2000 | |
| 2/11/2022 | 11a | 2b | 2 | 3d | 5e | 3f | 2 | 1 | 0 |
| 2/11/2022 | 3a | 2b | 2 | 3d | 3f | 2 | 1 | 0 | |
| 2/11/2022 | 11a | 2b | 2 | 3d | 3e | 3f | 2 | 1 | 0 |
| 2/11/2022 | 3a | 2b | 2 | 2d | 4f | 1 | 1 | 2000 | |
| 2/11/2022 | 3a | 2b | 2 | 3d | 3f | 1 | 1 | 0 | |
| 2/11/2022 | 1a | 2b | 2 | 3d | 17e | 3f | 4 | 1 | 0 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 3f | 2 | 1 | 0 | |
| 2/11/2022 | 13a | 2b | 2 | 2d | 16e | 4f | 1 | 1 | 1000 |
| 2/11/2022 | 6a | 2b | 2 | 2d | 2e | 4f | 1 | 1 | 3000 |
| 2/11/2022 | 11a | 2b | 2 | 2d | 5e | 4f | 1 | 1 | 2000 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 3f | 3 | 1 | 0 | |
| 2/11/2022 | 4a | 2b | 2 | 2d | 4f | 1 | 1 | 1000 | |
| 2/11/2022 | 11a | 2b | 2 | 3d | 1e | 3f | 1 | 1 | 0 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 10e | 3f | 3 | 1 | 0 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 9e | 3f | 2 | 1 | 0 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 6e | 3f | 3 | 1 | 0 |
| 2/11/2022 | 11a | 2b | 2 | 3d | 8e | 3f | 1 | 1 | 0 |
| 2/11/2022 | 1a | 2b | 2 | 3d | 3f | 2 | 1 | 0 | |
| 2/11/2022 | 12a | 1a | 2 | 3d | 21e | 1f | 1 | 1 | 18000 |
| 2/11/2022 | 8a | 2b | 2 | 2d | 25e | 4f | 1 | 1 | 1000 |
| 2/11/2022 | 1a | 2b | 2 | 3d | 3f | 3 | 1 | 0 | |
| 2/10/2022 | 10a | 2b | 2 | 1d | 19e | 5f | 1 | 1 | 4000 |
| 2/10/2022 | 10a | 2b | 2 | 1d | 15e | 5f | 1 | 1 | 5000 |
| 2/9/2022 | 15a | 2b | 1 | 2d | 7e | 4f | 1 | 1 | 2000 |
Solved! Go to Solution.
Hi @gp10, check this one:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndRdDoMgDADgqxiel0hb8Ocsxoct0/sfYeAGg7UI8wESEr+0pZVlUdgD9agR1U3Nd7fhw29uwdNtnT/s/vRZpLVW6y2HGKH/BD00m99SCYnEIC2XQIySREdOu1+IAVIPECBAVid5aX1M2kPpbnFHnHWyMpVoVIhmi9Eq5Y2NaUIWDtM0uYIxpmnKToCFW8mYUB4MpzMT7cApno5MKVmMnZdvtblOKtZpGrooFinFg9iPkz5KTm/1TCU4F+a05oaL8aaWAq/N2/FEvW1geNwnZK2YpF5MwsDxsZHb+M/U6Kg0f4nhaIZNI5ok4rm1zNqvnSMVXuOx8HusLw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, A_id = _t, B_id = _t, C_id = _t, D_id = _t, E_id = _t, F_id = _t, Amount1 = _t, Boolean = _t, Amount2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}}, "en-US"),
#"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
#"Inserted Quarter" = Table.AddColumn(#"Inserted Year", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
#"Inserted Month" = Table.AddColumn(#"Inserted Quarter", "Month", each Date.Month([Date]), Int64.Type),
GroupedRowsYear = Table.Group(#"Inserted Month", {"Year"}, {{"Detail Year", each _, type table}}),
StepBack = #"Inserted Month",
YearQuarterMonthDistinct = Table.Distinct(Table.SelectColumns(StepBack, {"Year", "Quarter", "Month"})),
MergedQueryItself = Table.NestedJoin(YearQuarterMonthDistinct, {"Year"}, GroupedRowsYear, {"Year"}, "GroupedRowsYear", JoinKind.LeftOuter),
#"Expanded Detail" = Table.ExpandTableColumn(MergedQueryItself, "GroupedRowsYear", {"Detail Year"}, {"Detail Year"}),
id_ColumnNames = Table.FromList(List.Select(Table.ColumnNames(#"Changed Type"), each Text.EndsWith(_, "id")), Splitter.SplitByNothing()),
AddedYearQuarterMonth = Table.AddColumn(id_ColumnNames, "Custom", each {"Year", "Quarter", "Month"}),
Expanded_id_YearQuarterMonth = Table.ExpandListColumn(AddedYearQuarterMonth, "Custom"),
StepBack2 = #"Expanded Detail",
Ad_SummaryMonthColumns = List.Accumulate(
List.Buffer(Table.ToRows(Expanded_id_YearQuarterMonth)),
StepBack2,
(s,c)=> Table.AddColumn(s, c{0} & "_Distinct_" & c{1}, each Table.RowCount(Table.Distinct(Table.SelectRows([Detail Year], (a)=> Record.Field(a, c{1}) = Record.Field(_, c{1})), {{c{0}, Comparer.Ordinal}})), Int64.Type))
in
Ad_SummaryMonthColumns
Hi @gp10, check this one:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndRdDoMgDADgqxiel0hb8Ocsxoct0/sfYeAGg7UI8wESEr+0pZVlUdgD9agR1U3Nd7fhw29uwdNtnT/s/vRZpLVW6y2HGKH/BD00m99SCYnEIC2XQIySREdOu1+IAVIPECBAVid5aX1M2kPpbnFHnHWyMpVoVIhmi9Eq5Y2NaUIWDtM0uYIxpmnKToCFW8mYUB4MpzMT7cApno5MKVmMnZdvtblOKtZpGrooFinFg9iPkz5KTm/1TCU4F+a05oaL8aaWAq/N2/FEvW1geNwnZK2YpF5MwsDxsZHb+M/U6Kg0f4nhaIZNI5ok4rm1zNqvnSMVXuOx8HusLw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, A_id = _t, B_id = _t, C_id = _t, D_id = _t, E_id = _t, F_id = _t, Amount1 = _t, Boolean = _t, Amount2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}}, "en-US"),
#"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
#"Inserted Quarter" = Table.AddColumn(#"Inserted Year", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
#"Inserted Month" = Table.AddColumn(#"Inserted Quarter", "Month", each Date.Month([Date]), Int64.Type),
GroupedRowsYear = Table.Group(#"Inserted Month", {"Year"}, {{"Detail Year", each _, type table}}),
StepBack = #"Inserted Month",
YearQuarterMonthDistinct = Table.Distinct(Table.SelectColumns(StepBack, {"Year", "Quarter", "Month"})),
MergedQueryItself = Table.NestedJoin(YearQuarterMonthDistinct, {"Year"}, GroupedRowsYear, {"Year"}, "GroupedRowsYear", JoinKind.LeftOuter),
#"Expanded Detail" = Table.ExpandTableColumn(MergedQueryItself, "GroupedRowsYear", {"Detail Year"}, {"Detail Year"}),
id_ColumnNames = Table.FromList(List.Select(Table.ColumnNames(#"Changed Type"), each Text.EndsWith(_, "id")), Splitter.SplitByNothing()),
AddedYearQuarterMonth = Table.AddColumn(id_ColumnNames, "Custom", each {"Year", "Quarter", "Month"}),
Expanded_id_YearQuarterMonth = Table.ExpandListColumn(AddedYearQuarterMonth, "Custom"),
StepBack2 = #"Expanded Detail",
Ad_SummaryMonthColumns = List.Accumulate(
List.Buffer(Table.ToRows(Expanded_id_YearQuarterMonth)),
StepBack2,
(s,c)=> Table.AddColumn(s, c{0} & "_Distinct_" & c{1}, each Table.RowCount(Table.Distinct(Table.SelectRows([Detail Year], (a)=> Record.Field(a, c{1}) = Record.Field(_, c{1})), {{c{0}, Comparer.Ordinal}})), Int64.Type))
in
Ad_SummaryMonthColumns
Hi @gp10
You can put the following code to advanced editor in power query
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndRdDoMgDADgqxiel0hb8Ocsxoct0/sfYeAGg7UI8wESEr+0pZVlUdgD9agR1U3Nd7fhw29uwdNtnT/s/vRZpLVW6y2HGKH/BD00m99SCYnEIC2XQIySREdOu1+IAVIPECBAVid5aX1M2kPpbnFHnHWyMpVoVIhmi9Eq5Y2NaUIWDtM0uYIxpmnKToCFW8mYUB4MpzMT7cApno5MKVmMnZdvtblOKtZpGrooFinFg9iPkz5KTm/1TCU4F+a05oaL8aaWAq/N2/FEvW1geNwnZK2YpF5MwsDxsZHb+M/U6Kg0f4nhaIZNI5ok4rm1zNqvnSMVXuOx8HusLw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, A_id = _t, B_id = _t, C_id = _t, D_id = _t, E_id = _t, F_id = _t, Amount1 = _t, Boolean = _t, Amount2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"A_id", type text}, {"B_id", type text}, {"C_id", Int64.Type}, {"D_id", type text}, {"E_id", type text}, {"F_id", type text}, {"Amount1", Int64.Type}, {"Boolean", Int64.Type}, {"Amount2", Int64.Type}}),
#"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
#"Inserted Quarter" = Table.AddColumn(#"Inserted Year", "Quarter", each Date.QuarterOfYear([Date]), Int64.Type),
#"Inserted Month" = Table.AddColumn(#"Inserted Quarter", "Month", each Date.Month([Date]), Int64.Type),
#"Added Custom" = Table.AddColumn(#"Inserted Month", "DistinctCount_Month", each List.Count(List.Distinct(Table.SelectRows(#"Inserted Month",(x)=>x[Year]=[Year] and x[Month]=[Month])[F_id]))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "DistinctCount_year", each List.Count(List.Distinct(Table.SelectRows(#"Inserted Month",(x)=>x[Year]=[Year])[F_id]))),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "DistinctCount_Quarter", each List.Count(List.Distinct(Table.SelectRows(#"Inserted Month",(x)=>x[Year]=[Year] and x[Quarter]=[Quarter])[F_id])))
in
#"Added Custom2"
Output
Best Regards!
Yolo Zhu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @Anonymous , thank you so much for your time and effort.
While this covers a part of the issue, it does adress only the date part.
Not the combinations of all other columns.
My goal is to get rid of the F_id column.
In this solution if I want to see the distinct count by A_id for example, I can still see only the distinct count for the month regardless of A_id's values.
All the other combinations are missing.
Thanks again.
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!