Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
Dear Community,
I have the following dataset:
Brand | Model Version | 01/2021 | 01-02-21 | 03/2021 | 04/2021 | 01-05-21 | 06/2021 | 07/2021 |
Skoda, Enyaq (SUV, 5D) | IV 132kW | 380 | null | 375 | null | null | null | 4000 |
Skoda, Enyaq (SUV, 5D) | IV 150kW | 4400 | null | 4675 | null | null | null | 495 |
Kia, EV6 (SUV, 5D) | 77KWH 168kW EV | null | null | null | 5390 | null | null | null |
Kia, EV6 (SUV, 5D) | 77KWH 168kW EV | null | null | null | 590 | null | null | null |
Volkswagen, ID.4 (SUV, 5D) | 82KWH 128kW EV A R PRO | null | null | null | null | null | null | null |
Volkswagen, ID.4 (SUV, 5D) | 82KWH 150kW EV A R PRO PE. | null | null | 460 | null | null | 495 | 4640 |
Volvo, XC40 (SUV, 5D) | 69KWH 175kW EV A R CORE | null | null | null | null | null | null | 500 |
Volvo, XC40 (SUV, 5D) | 69KWH 175kW EV A R CORE | null | null | null | null | null | null | 6500 |
FORD, MUSTANG MACH-E (SUV, 5D) | 76KWH 198kW EV A R RWD | 480 | null | null | null | null | null | null |
FORD, MUSTANG MACH-E (SUV, 5D) | 76KWH 198kW EV A R RWD | 400 | null | null | null | null | null | null |
Renault, MEGANE E-TECH (Hatchback, 5D) | R130 96kW EV A TECHNO | null | null | null | null | null | null | null |
Renault, MEGANE E-TECH (Hatchback, 5D) | R220 160kW EV A TECHNO | null | null | null | null | null | null | null |
Toyota, BZ4X (SUV, 5D) | 71KWH 150kW EV A | null | null | null | null | null | null | null |
Toyota, BZ4X (SUV, 5D) | 71KWH 150kW EV A | null | null | null | null | null | null | null |
Mercedes-Benz, EqA (SUV, 5D) | 67KWH 140kW EV A BUSINESS. | null | null | null | 560 | null | null | null |
Mercedes-Benz, EqA (SUV, 5D) | 67KWH 140kW EV A BUSINESS. | null | null | null | 78900 | null | null | null |
Polestar, 2 (Sedan, 5D) | 82KWH 220kW EV A R | null | null | 300 | null | null | null | null |
Polestar, 2 (Sedan, 5D) | 82KWH 220kW EV A R | null | null | 2500 | null | null | null | null |
Tesla, Model Y (SUV, 5D) | LONG 378kW EV A 4 LONG R. | null | null | null | null | null | null | 600 |
Tesla, Model Y (SUV, 5D) | LONG 378kW EV A 4 LONG R. | null | null | null | null | null | null | 600 |
Tesla, Model Y (SUV, 5D) | LONG 378kW EV A 4 LONG R. | null | null | null | null | null | null | null |
Audi, Q4 (SUV, 5D) | 82KWH 150kW EV A R ATTRAC. | null | null | null | 48500 | null | null | null |
Audi, Q4 (SUV, 5D) | 82KWH 150kW EV A R ATTRAC. | null | null | null | 4800 | null | null | null |
The challenge I have is the following: For each group of brand, if for the Min month, there is a value in both model version, I keep the group. However, if for the MIN month, there is a value for one model and for the same min month, the value is null for one or the other model version, I have to delete the group.
For example, I would keep the first group "Skoda, Enyaq (SUV, 5D)" in green color. I would also keep the group Kia, EV6 (SUV, 5D) as the for the min date for all the model of this group there is a record. But I would have to delete the group in Tesla, Model Y (SUV, 5D) red as we don't have a not null value for all the models at the same "min" date. Same for the Volkswagen, ID.4 (SUV, 5D) group that I would need to delete.
The end result would be as below:
Anyone would have an idea on how to achieve this?
(I am still a beginer in power query)
Thank you
Solved! Go to Solution.
Hi @Chrisjr ,
if you want a solution that:
a) works for different months as well and
b) is not too slow on large datasets
then you have to apply some advanced M-code, unfortunately:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Jan-21", type number}, {"Feb-21", type number}, {"Mar-21", type number}, {"Apr-21", type number}, {"May-21", type number}, {"Jun-21", type number}, {"Jul-21", type number}}),
AddMonthValues = Table.AddColumn(#"Changed Type", "MonthValues", each Record.RemoveFields(_, {"Brand", "Model Version"})),
AddMinMonth = Table.AddColumn(
AddMonthValues,
"Min Month",
each Table.SelectRows(Record.ToTable([MonthValues]), (t) => t[Value] <> null)[Name]{0}?
),
#"Grouped Rows" = Table.Group(
AddMinMonth,
{"Brand"},
{{"DistintMinMonths", each List.Count(List.Distinct([Min Month]))}, {"AllRows", each _}}
),
DeleteGroups = Table.SelectRows(#"Grouped Rows", each ([DistintMinMonths] = 1)),
#"Expanded AllRows" = Table.ExpandTableColumn(DeleteGroups, "AllRows", List.Difference(Table.ColumnNames(#"Changed Type"), {"Brand"})),
Cleanup = Table.RemoveColumns(#"Expanded AllRows", {"DistintMinMonths"})
in
Cleanup
Please also check the file enclosed.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hello, @Chrisjr this?
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
add_index = Table.AddIndexColumn(Source, "idx", 0, 1, Int64.Type),
unpivot = Table.UnpivotOtherColumns(add_index, {"Brand", "Model Version", "idx"}, "month", "Value"),
filter_nulls = Table.SelectRows(unpivot, each ([Value] <> "null")),
date_type = Table.TransformColumns( filter_nulls, {"month", each Date.FromText( _, [Format = "MMM-yy"])}),
gr = Table.Group(date_type, {"Brand"}, {{"rows", each Table.SelectColumns(_, {"idx", "month"})}}),
selected_id =
List.Buffer(
Table.Combine(
List.Transform(
gr[rows],
(x) => [a = List.Min(x[month]), b = Table.SelectRows(x, each _[month] = a)][b]
)
)[idx]
),
out =
Table.RemoveColumns(
Table.SelectRows( add_index, each List.Contains(selected_id, [idx])),
{"idx"}
)
in
out
Hi @Chrisjr ,
sorry, but I don't have time for it now.
So I would recommend to open a new thread for it.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi again @ImkeF , I wonder if you can help me again with a slight change in the scenario above:
For each group of brand, if for the Min month, there is a value in both model version, I still keep the group. But now, if for the MIN month, there is a value for one model and for the same min month, the value is null for one or the other model version, I have to keep only the model of the brand with the earliest record.
For example, for Tesla, Model Y (SUV, 5D), I would need to keep the 2 first rows and delete the third one.
Or for Volkswagen, ID.4 (SUV, 5D), I would need to keep the keep only the second model
Thanks for your time,
Chris
Hello, @Chrisjr this?
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
add_index = Table.AddIndexColumn(Source, "idx", 0, 1, Int64.Type),
unpivot = Table.UnpivotOtherColumns(add_index, {"Brand", "Model Version", "idx"}, "month", "Value"),
filter_nulls = Table.SelectRows(unpivot, each ([Value] <> "null")),
date_type = Table.TransformColumns( filter_nulls, {"month", each Date.FromText( _, [Format = "MMM-yy"])}),
gr = Table.Group(date_type, {"Brand"}, {{"rows", each Table.SelectColumns(_, {"idx", "month"})}}),
selected_id =
List.Buffer(
Table.Combine(
List.Transform(
gr[rows],
(x) => [a = List.Min(x[month]), b = Table.SelectRows(x, each _[month] = a)][b]
)
)[idx]
),
out =
Table.RemoveColumns(
Table.SelectRows( add_index, each List.Contains(selected_id, [idx])),
{"idx"}
)
in
out
Hi @AlienSx thanks a lot for your answer. Seems like it is what I need. At the "selected id" step, what do you do (still biginner in power query 🙂 )? As I need to keep the null values, would it still work?
Hi @Chrisjr selected_id step calculates min_month (variable a) and selects id's of records (within a group) with month = min_month. [idx] at the end refers to a single column with id's (which is list). Then we use this list (of id's) to select rows from original table. Since original table has all your nulls - nothing is lost.
Thanks a lot @ImkeF 🙏. Had the same idea to accomplish it but had no idea how to M code it.
Hi @Chrisjr ,
if you want a solution that:
a) works for different months as well and
b) is not too slow on large datasets
then you have to apply some advanced M-code, unfortunately:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("3ZVdT4MwFIb/ysmuNOlMB+XrkrHqFt2HhX2o8QJHowYC0W0a/TX+Fn+ZBWWu2bpNp4nxhqYU3ie85z2Hi4uKH2dRiICmT+Ed7Pn9AQKjsV9BldYAaroWD19fxEa3sbimsyTJN5bxuZEXgjGuXKINsgb+kCXi+YWXzTXCjlHoHt/mqgNT0rSs42ETaqYdD8VZobxSxNAdvPpsd+l1yoMsiSeP4TVPEbQaB0Qi2FpB0N4J4AKDHuuqOOplW1Bu/gIIevRg2W1z+WvyCuQnBJeohwzByCNYophOQbGMT4rXZfRL32Pg32eYJeSwyxoI2n0/cDtH0Ha9ZpXKETALmrNQIDZs5F7Yipor67MDC3+VxXgazpKpwNEjt0OBVgPqNWGvGU7HN1fhOC6ZrKZjcMySmD/V+Wb+tmdqGhZdhX8CGmRP2VS0bv2cjGQza3Lg/6R8m9+PecQn1TpPn8X8uXPlqL/PHzJ3qt73Wx3q+8tNWzbPit79PZplO6pk5rxelvDJNLxHoAkQj8JUnkYiBvOkL6noGzK/M0AztiAEfJKI8reziCdwJtl10hVdrFvzZiVQ3GFKt1ZPoo9B9H9Ac+/cWXSL4HTjf8gNAuZ6SgixlYX6UYoScvkG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Brand = _t, #"Model Version" = _t, #"Jan-21" = _t, #"Feb-21" = _t, #"Mar-21" = _t, #"Apr-21" = _t, #"May-21" = _t, #"Jun-21" = _t, #"Jul-21" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Jan-21", type number}, {"Feb-21", type number}, {"Mar-21", type number}, {"Apr-21", type number}, {"May-21", type number}, {"Jun-21", type number}, {"Jul-21", type number}}),
AddMonthValues = Table.AddColumn(#"Changed Type", "MonthValues", each Record.RemoveFields(_, {"Brand", "Model Version"})),
AddMinMonth = Table.AddColumn(
AddMonthValues,
"Min Month",
each Table.SelectRows(Record.ToTable([MonthValues]), (t) => t[Value] <> null)[Name]{0}?
),
#"Grouped Rows" = Table.Group(
AddMinMonth,
{"Brand"},
{{"DistintMinMonths", each List.Count(List.Distinct([Min Month]))}, {"AllRows", each _}}
),
DeleteGroups = Table.SelectRows(#"Grouped Rows", each ([DistintMinMonths] = 1)),
#"Expanded AllRows" = Table.ExpandTableColumn(DeleteGroups, "AllRows", List.Difference(Table.ColumnNames(#"Changed Type"), {"Brand"})),
Cleanup = Table.RemoveColumns(#"Expanded AllRows", {"DistintMinMonths"})
in
Cleanup
Please also check the file enclosed.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries