Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Chrisjr
Helper IV
Helper IV

Remove groups if value at min date is not null in each rows of the group

Dear Community, 

 

I have the following dataset:

 

BrandModel Version01/202101-02-2103/202104/202101-05-2106/202107/2021
Skoda, Enyaq (SUV, 5D)IV 132kW 380null375nullnullnull4000
Skoda, Enyaq (SUV, 5D)IV 150kW 4400null4675nullnullnull495
Kia, EV6 (SUV, 5D)77KWH 168kW EV nullnullnull5390nullnullnull
Kia, EV6 (SUV, 5D)77KWH 168kW EV nullnullnull590nullnullnull
Volkswagen, ID.4 (SUV, 5D)82KWH 128kW EV A R PROnullnullnullnullnullnullnull
Volkswagen, ID.4 (SUV, 5D)82KWH 150kW EV A R PRO PE.nullnull460nullnull4954640
Volvo, XC40 (SUV, 5D)69KWH 175kW EV A R COREnullnullnullnullnullnull500
Volvo, XC40 (SUV, 5D)69KWH 175kW EV A R COREnullnullnullnullnullnull6500
FORD, MUSTANG MACH-E (SUV, 5D)76KWH 198kW EV A R RWD480nullnullnullnullnullnull
FORD, MUSTANG MACH-E (SUV, 5D)76KWH 198kW EV A R RWD400nullnullnullnullnullnull
Renault, MEGANE E-TECH (Hatchback, 5D)R130 96kW EV A TECHNOnullnullnullnullnullnullnull
Renault, MEGANE E-TECH (Hatchback, 5D)R220 160kW EV A TECHNOnullnullnullnullnullnullnull
Toyota, BZ4X (SUV, 5D)71KWH 150kW EV Anullnullnullnullnullnullnull
Toyota, BZ4X (SUV, 5D)71KWH 150kW EV Anullnullnullnullnullnullnull
Mercedes-Benz, EqA (SUV, 5D)67KWH 140kW EV A BUSINESS.nullnullnull560nullnullnull
Mercedes-Benz, EqA (SUV, 5D)67KWH 140kW EV A BUSINESS.nullnullnull78900nullnullnull
Polestar, 2 (Sedan, 5D)82KWH 220kW EV A Rnullnull300nullnullnullnull
Polestar, 2 (Sedan, 5D)82KWH 220kW EV A Rnullnull2500nullnullnullnull
Tesla, Model Y (SUV, 5D)LONG 378kW EV A 4 LONG R.nullnullnullnullnullnull600
Tesla, Model Y (SUV, 5D)LONG 378kW EV A 4 LONG R.nullnullnullnullnullnull600
Tesla, Model Y (SUV, 5D)LONG 378kW EV A 4 LONG R.nullnullnullnullnullnullnull
Audi, Q4 (SUV, 5D)82KWH 150kW EV A R ATTRAC.nullnullnull48500nullnullnull
Audi, Q4 (SUV, 5D)82KWH 150kW EV A R ATTRAC.nullnullnull4800nullnullnull

 

The challenge I have is the following: For each group of brand, if for the Min month, there is a value in both model version, I keep the group. However, if for the MIN month, there is a value for one model and for the same min month, the value is null for one or the other model version, I have to delete the group.

 

For example, I would keep the first group "Skoda, Enyaq (SUV, 5D)" in green color.  I would also keep the group Kia, EV6 (SUV, 5D) as the for the min date for all the model of this group there is a record. But I would have to delete the group in Tesla, Model Y (SUV, 5D) red as we don't have a not null value for all the models at the same "min" date. Same for the Volkswagen, ID.4 (SUV, 5D) group that I would need to delete.

 

The end result would be as below: 

Chrisjr_0-1677845710215.png

 

 

 

Anyone would have an idea on how to achieve this? 

(I am still a beginer in power query)

 

Thank you

2 ACCEPTED SOLUTIONS
ImkeF
Super User
Super User

Hi @Chrisjr ,
if you want a solution that:
a) works for different months as well and

b) is not too slow on large datasets
then you have to apply some advanced M-code, unfortunately:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Jan-21", type number}, {"Feb-21", type number}, {"Mar-21", type number}, {"Apr-21", type number}, {"May-21", type number}, {"Jun-21", type number}, {"Jul-21", type number}}),
    AddMonthValues = Table.AddColumn(#"Changed Type", "MonthValues", each Record.RemoveFields(_, {"Brand", "Model Version"})), 
    AddMinMonth = Table.AddColumn(
        AddMonthValues, 
        "Min Month", 
        each Table.SelectRows(Record.ToTable([MonthValues]), (t) => t[Value] <> null)[Name]{0}?
    ), 
    #"Grouped Rows" = Table.Group(
        AddMinMonth, 
        {"Brand"}, 
        {{"DistintMinMonths", each List.Count(List.Distinct([Min Month]))}, {"AllRows", each _}}
    ), 
    DeleteGroups = Table.SelectRows(#"Grouped Rows", each ([DistintMinMonths] = 1)), 
    #"Expanded AllRows" = Table.ExpandTableColumn(DeleteGroups, "AllRows", List.Difference(Table.ColumnNames(#"Changed Type"), {"Brand"})), 
    Cleanup = Table.RemoveColumns(#"Expanded AllRows", {"DistintMinMonths"})
in
    Cleanup

Please also check the file enclosed.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

Hello, @Chrisjr this?

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
    add_index = Table.AddIndexColumn(Source, "idx", 0, 1, Int64.Type),
    unpivot = Table.UnpivotOtherColumns(add_index, {"Brand", "Model Version", "idx"}, "month", "Value"),
    filter_nulls = Table.SelectRows(unpivot, each ([Value] <> "null")),
    date_type = Table.TransformColumns( filter_nulls, {"month", each Date.FromText( _, [Format = "MMM-yy"])}),
    gr = Table.Group(date_type, {"Brand"}, {{"rows", each Table.SelectColumns(_, {"idx", "month"})}}),
    selected_id = 
        List.Buffer(
            Table.Combine(
                List.Transform(
                    gr[rows], 
                    (x) => [a = List.Min(x[month]), b = Table.SelectRows(x, each _[month] = a)][b]
                )
            )[idx]
        ),
    out = 
        Table.RemoveColumns(
            Table.SelectRows( add_index, each List.Contains(selected_id, [idx])),
            {"idx"}
        )
in
    out

View solution in original post

9 REPLIES 9
ImkeF
Super User
Super User

Hi @Chrisjr ,
sorry, but I don't have time for it now.
So I would recommend to open a new thread for it.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

No worries @ImkeF  , thanks for your anwer

Chrisjr
Helper IV
Helper IV

Hi again @ImkeF , I wonder if you can help me again with a slight change in the scenario above:

 

For each group of brand, if for the Min month, there is a value in both model version, I still keep the group. But now, if for the MIN month, there is a value for one model and for the same min month, the value is null for one or the other model version, I have to keep only the model of the brand with the earliest record.

For example, for Tesla, Model Y (SUV, 5D), I would need to keep the 2 first rows and delete the third one.

Or for Volkswagen, ID.4 (SUV, 5D), I would need to keep the keep only the second model

end result.png

 

Thanks for your time,

 

Chris

Hello, @Chrisjr this?

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
    add_index = Table.AddIndexColumn(Source, "idx", 0, 1, Int64.Type),
    unpivot = Table.UnpivotOtherColumns(add_index, {"Brand", "Model Version", "idx"}, "month", "Value"),
    filter_nulls = Table.SelectRows(unpivot, each ([Value] <> "null")),
    date_type = Table.TransformColumns( filter_nulls, {"month", each Date.FromText( _, [Format = "MMM-yy"])}),
    gr = Table.Group(date_type, {"Brand"}, {{"rows", each Table.SelectColumns(_, {"idx", "month"})}}),
    selected_id = 
        List.Buffer(
            Table.Combine(
                List.Transform(
                    gr[rows], 
                    (x) => [a = List.Min(x[month]), b = Table.SelectRows(x, each _[month] = a)][b]
                )
            )[idx]
        ),
    out = 
        Table.RemoveColumns(
            Table.SelectRows( add_index, each List.Contains(selected_id, [idx])),
            {"idx"}
        )
in
    out

Hi @AlienSx thanks a lot for your answer.  Seems like it is what I need.  At the "selected id" step, what do you do (still biginner in power query 🙂 )? As I need to keep the null values, would it still work? 

Hi @Chrisjr selected_id step calculates min_month (variable a) and selects id's of records (within a group) with month = min_month. [idx] at the end refers to a single column with id's (which is list). Then we use this list (of id's) to select rows from original table. Since original table has all your nulls - nothing is lost.

nulls.jpg

Thanks for the explanation @AlienSx 

Chrisjr
Helper IV
Helper IV

Thanks a lot @ImkeF 🙏. Had the same idea to accomplish it but had no idea how to M code it. 

ImkeF
Super User
Super User

Hi @Chrisjr ,
if you want a solution that:
a) works for different months as well and

b) is not too slow on large datasets
then you have to apply some advanced M-code, unfortunately:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Jan-21", type number}, {"Feb-21", type number}, {"Mar-21", type number}, {"Apr-21", type number}, {"May-21", type number}, {"Jun-21", type number}, {"Jul-21", type number}}),
    AddMonthValues = Table.AddColumn(#"Changed Type", "MonthValues", each Record.RemoveFields(_, {"Brand", "Model Version"})), 
    AddMinMonth = Table.AddColumn(
        AddMonthValues, 
        "Min Month", 
        each Table.SelectRows(Record.ToTable([MonthValues]), (t) => t[Value] <> null)[Name]{0}?
    ), 
    #"Grouped Rows" = Table.Group(
        AddMinMonth, 
        {"Brand"}, 
        {{"DistintMinMonths", each List.Count(List.Distinct([Min Month]))}, {"AllRows", each _}}
    ), 
    DeleteGroups = Table.SelectRows(#"Grouped Rows", each ([DistintMinMonths] = 1)), 
    #"Expanded AllRows" = Table.ExpandTableColumn(DeleteGroups, "AllRows", List.Difference(Table.ColumnNames(#"Changed Type"), {"Brand"})), 
    Cleanup = Table.RemoveColumns(#"Expanded AllRows", {"DistintMinMonths"})
in
    Cleanup

Please also check the file enclosed.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors