March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello all,
I have a problem which sounds so easy but as I'm a newbie I can't manage it and need your expertise to help me please:
As a sample you can have a look at the table below:
I need to remove all the rows which have not Main Category value on their date (in this example I should remove rows 9,10,11), The main category value here is the beginning point for my process per day and if it does not exist in a day, I should remove all the other records on that specific date or ignore them
could you please let me know how can I make a DAX code or M-query to fix it? as my data are thousands or millions, needs to have the fastest one.
Many thanks
Solved! Go to Solution.
Try this solution in Power Query.
1. Create table MainCategoryDates, consisting of distinct dates for "Main Category" rows.
let
Source = OriginalTable,
FilterRows = Table.SelectRows(Source, each ([Main column] = "Main Category")),
RemoveColumns = Table.SelectColumns(FilterRows,{"Date"}),
RemoveDuplicates = Table.Distinct(RemoveColumns)
in
RemoveDuplicates
2. Merge table MainCategoryDates with the data table using an inner join on the Date column.
let
Source = Table.NestedJoin(OriginalTable, {"Date"}, MainCategoryDates, {"Date"}, "MainCategoryDates", JoinKind.Inner),
RemoveColumns = Table.RemoveColumns(Source,{"MainCategoryDates"})
in
RemoveColumns
3. Result:
Proud to be a Super User!
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k3MzFNwTixJTc8vqlTSUTIyMDLUNTDTNTBUitWJVkovKlAwxC5shCmMyzBDI7guY+zCJsjCFnBhU+zCZmjCsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Main Column" = _t, Date = _t]),
#"Grouped by Date" = Table.Group(Source, {"Date"}, {{"res", each if List.PositionOf([Main Column], "Main Category")<>-1 then _ else null}}),
#"Expanded Rows" = let cols=Table.ColumnNames(Source) in Table.ExpandTableColumn(Table.SelectRows(Table.RemoveColumns(#"Grouped by Date", "Date"), each not([res] is null)), "res", cols, cols)
in
#"Expanded Rows"
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8k3MzFNwTixJTc8vqlTSUTIyMDLUNTDTNTBUitWJVkovKlAwxC5shCmMyzBDI7guY+zCJsjCFnBhU+zCZmjCsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Main Column" = _t, Date = _t]),
#"Grouped by Date" = Table.Group(Source, {"Date"}, {{"res", each if List.PositionOf([Main Column], "Main Category")<>-1 then _ else null}}),
#"Expanded Rows" = let cols=Table.ColumnNames(Source) in Table.ExpandTableColumn(Table.SelectRows(Table.RemoveColumns(#"Grouped by Date", "Date"), each not([res] is null)), "res", cols, cols)
in
#"Expanded Rows"
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
May I ask if I want to add another column to have the condition based on that as well, how can I extend this query? I have another column called category_id, and I want to check if we have the Main category check the category_id of all these group1, group2, ... and if their category_ids is the same as the Main category then accumulate prices of them:
For the above example if group 1, group 2 has the same category_id as the Main category,
I need to sum up for Feb 12th prices of Main category + group 1 + group 2 = 1290+340+367
You can use Group By in Power Query. See step GroupRows:
let
Source = OriginalTable,
GroupByDate = Table.Group(
Source,
{"Date"},
{{"res", each if List.PositionOf([Main column], "Main Category") <> - 1 then _ else null}}
),
ExpandRows =
let
cols = Table.ColumnNames(Source)
in
Table.ExpandTableColumn(
Table.SelectRows(Table.RemoveColumns(GroupByDate, "Date"), each not ([res] is null)),
"res",
cols,
cols
),
ChangeType = Table.TransformColumnTypes(
ExpandRows,
{{"ID", Int64.Type}, {"Date", type date}, {"Price", Int64.Type}}
),
GroupRows = Table.Group(
ChangeType,
{"Date", "category_id"},
{{"Sum Price", each List.Sum([Price]), type number}}
)
in
GroupRows
Proud to be a Super User!
@DataInsights thanks a lot, it works well and now I learned how can I group-by a table with one or more conditions ... that was what exactly I wanted 🙂
Many thanks, it works well 🙂
Try this solution in Power Query.
1. Create table MainCategoryDates, consisting of distinct dates for "Main Category" rows.
let
Source = OriginalTable,
FilterRows = Table.SelectRows(Source, each ([Main column] = "Main Category")),
RemoveColumns = Table.SelectColumns(FilterRows,{"Date"}),
RemoveDuplicates = Table.Distinct(RemoveColumns)
in
RemoveDuplicates
2. Merge table MainCategoryDates with the data table using an inner join on the Date column.
let
Source = Table.NestedJoin(OriginalTable, {"Date"}, MainCategoryDates, {"Date"}, "MainCategoryDates", JoinKind.Inner),
RemoveColumns = Table.RemoveColumns(Source,{"MainCategoryDates"})
in
RemoveColumns
3. Result:
Proud to be a Super User!
Thank you, it works well for both solutions 🙂
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
21 | |
19 | |
17 | |
9 | |
5 |
User | Count |
---|---|
36 | |
29 | |
16 | |
15 | |
12 |