The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hello all,
I am trying to solve the following problem in in Power Bi using Power Query. I have found some stuff around this forum but it is not exactly what i need to do, and i am not so expert to figure out by myself.
I have a lookup table like this:
NAME | ID |
aaa | id12 |
aaa | id34 |
aaa | id23 |
bbb | id15 |
bbb | id25 |
ccc | id67 |
ccc | id54 |
ccc | id33 |
ddd | id20 |
ddd | id65 |
ddd | id78 |
And a Raw Data Table like this:
(The very first row is not part of table. I added it for the sake of clarity)
#aaa | #aaa | #aaa | #bbb | #bbb | #ccc | #ccc | #ccc | #ddd | #ddd | #ddd | ||
timestamp | object | id12 | id34 | id23 | id15 | id25 | id67 | id54 | id33 | id20 | id65 | id78 |
13/04/2022 13:54 | L1 | 3202 | 4841 | 5548 | 7142 | |||||||
12/04/2022 04:22 | L3 | 2541 | 8963 | 5047 | 7596 | |||||||
11/04/2022 06:47 | N6 | 6165 | 2834 | 8034 | 3569 | 7984 | ||||||
10/04/2022 18:42 | M4 | 2586 | 8555 | 6135 | 9539 | 6527 | ||||||
09/04/2022 11:11 | M4 | 1290 | 9642 | 9944 | 5077 | 3355 | ||||||
08/04/2022 12:11 | L2 | 8060 | 5835 | 8539 | 6354 | |||||||
07/04/2022 14:00 | N6 | 6910 | 3955 | 3967 | 2868 | 9192 | ||||||
06/04/2022 13:56 | N6 | 2370 | 3002 | 4200 | 7257 | 8699 | ||||||
05/04/2022 01:11 | L1 | 7230 | 4520 | 1606 | 3594 | |||||||
04/04/2022 08:37 | M4 | 3205 | 5283 | 8986 | 8943 | 5553 | ||||||
03/04/2022 23:37 | L1 | 4740 | 6182 | 9407 | 6881 | |||||||
02/04/2022 10:00 | L2 | 1445 | 6379 | 1282 | 7989 | |||||||
01/04/2022 14:37 | L1 | 9878 | 3534 | 7907 | 2312 |
I would like to create an output table like the following:
timestamp | object | aaa | bbb | ccc | ddd |
13/04/2022 13:54 | L1 | 3202 | 4841 | 5548 | 7142 |
12/04/2022 04:22 | L3 | 2541 | 8963 | 5047 | 7596 |
11/04/2022 06:47 | N6 | 6165 | 2834 | 3569 | 7984 |
10/04/2022 18:42 | M4 | 2586 | 6135 | 9539 | 6527 |
09/04/2022 11:11 | M4 | 1290 | 9944 | 5077 | 3355 |
08/04/2022 12:11 | L2 | 8060 | 5835 | 8539 | 6354 |
07/04/2022 14:00 | N6 | 6910 | 3955 | 2868 | 9192 |
06/04/2022 13:56 | N6 | 2370 | 3002 | 7257 | 8699 |
05/04/2022 01:11 | L1 | 7230 | 4520 | 1606 | 3594 |
04/04/2022 08:37 | M4 | 3205 | 8986 | 8943 | 5553 |
03/04/2022 23:37 | L1 | 4740 | 6182 | 9407 | 6881 |
02/04/2022 10:00 | L2 | 1445 | 6379 | 1282 | 7989 |
01/04/2022 14:37 | L1 | 9878 | 3534 | 7907 | 2312 |
So that, it reprocess the raw data table replacing the "idxx" columns with "name" according to the lookup table and assign for that "name" only one value. The value must be the first non-null value of that "name" group for each row
Hope that i have been clear enough. The output example should help to better understand what i am trying to do here
Please note that the Lookup table and the Raw data table are both queried from MySQL database, so if necessary i could rearrange those two tables when loading them in power bi.
Thank you
Solved! Go to Solution.
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test (later on when you use the query on your dataset, you will have to change the source appropriately. If you have columns other than these, then delete Changed type step and do a Changed type for complete table from UI again)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VZPZjQQhDERTWfX3SINPbGLo3QRGk38ay2EakBCX4FG2i8/n4jfQGxPiD1ARvl7XDbWjulWH0dgYnoUI27OoLQPj9X11Eg4SF2yXb6odynbXXGmBEucDJK4TBAOkpR/503FCQWXM0KhJtdSH0UjUG8Zt7gUsRXxWuMn65S7LAmoiMvHUZi7kD1QF80HzgEEBmDBAT+OQK0fa3Jl7kHkFSTSfCpgFDAfsxh6TppUjI3kWtgtr2qjWa4BygLiktCfMIVjk/WlyXXLQtFXSwfFQpcsPusGQ8oSlbg7GtJRmlEY2dT9gErCIEPpRWvdYcC1Ak+4RkviMkAfHCuWZ9urRSI5UP0yH9bqa82Y0ETo0heGRBqyL4sxLh4It9zunw6dqBgEKv0Maae/1A+ZVMqW8Sga4UYdTfXr0qd9S5JbXRyPZnJ79VIQE9Qt+/wE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [timestamp = _t, object = _t, id12 = _t, id34 = _t, id23 = _t, id15 = _t, id25 = _t, id67 = _t, id54 = _t, id33 = _t, id20 = _t, id65 = _t, id78 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"timestamp", type datetime}, {"object", type text}, {"id12", Int64.Type}, {"id34", Int64.Type}, {"id23", Int64.Type}, {"id15", Int64.Type}, {"id25", Int64.Type}, {"id67", Int64.Type}, {"id54", Int64.Type}, {"id33", Int64.Type}, {"id20", Int64.Type}, {"id65", Int64.Type}, {"id78", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"timestamp", "object", "Index"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each Table1{[ID=[Attribute]]}[NAME]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Attribute"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Custom]), "Custom", "Value", List.Sum),
#"Sorted Rows" = Table.Sort(#"Pivoted Column",{{"Index", Order.Ascending}}),
#"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
in
#"Removed Columns1"
Test code for Table1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxMVNJRykwxNFKK1UFwjU1QuEbGYG5SUhJEsSkK1wjCTU5OBnPNzFG4piYoXGOIUSkpKRC9BihcM1MUrrmFUmwsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [NAME = _t, ID = _t])
in
Source
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test (later on when you use the query on your dataset, you will have to change the source appropriately. If you have columns other than these, then delete Changed type step and do a Changed type for complete table from UI again)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VZPZjQQhDERTWfX3SINPbGLo3QRGk38ay2EakBCX4FG2i8/n4jfQGxPiD1ARvl7XDbWjulWH0dgYnoUI27OoLQPj9X11Eg4SF2yXb6odynbXXGmBEucDJK4TBAOkpR/503FCQWXM0KhJtdSH0UjUG8Zt7gUsRXxWuMn65S7LAmoiMvHUZi7kD1QF80HzgEEBmDBAT+OQK0fa3Jl7kHkFSTSfCpgFDAfsxh6TppUjI3kWtgtr2qjWa4BygLiktCfMIVjk/WlyXXLQtFXSwfFQpcsPusGQ8oSlbg7GtJRmlEY2dT9gErCIEPpRWvdYcC1Ak+4RkviMkAfHCuWZ9urRSI5UP0yH9bqa82Y0ETo0heGRBqyL4sxLh4It9zunw6dqBgEKv0Maae/1A+ZVMqW8Sga4UYdTfXr0qd9S5JbXRyPZnJ79VIQE9Qt+/wE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [timestamp = _t, object = _t, id12 = _t, id34 = _t, id23 = _t, id15 = _t, id25 = _t, id67 = _t, id54 = _t, id33 = _t, id20 = _t, id65 = _t, id78 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"timestamp", type datetime}, {"object", type text}, {"id12", Int64.Type}, {"id34", Int64.Type}, {"id23", Int64.Type}, {"id15", Int64.Type}, {"id25", Int64.Type}, {"id67", Int64.Type}, {"id54", Int64.Type}, {"id33", Int64.Type}, {"id20", Int64.Type}, {"id65", Int64.Type}, {"id78", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"timestamp", "object", "Index"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each Table1{[ID=[Attribute]]}[NAME]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Attribute"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Custom]), "Custom", "Value", List.Sum),
#"Sorted Rows" = Table.Sort(#"Pivoted Column",{{"Index", Order.Ascending}}),
#"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
in
#"Removed Columns1"
Test code for Table1
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSkxMVNJRykwxNFKK1UFwjU1QuEbGYG5SUhJEsSkK1wjCTU5OBnPNzFG4piYoXGOIUSkpKRC9BihcM1MUrrmFUmwsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [NAME = _t, ID = _t])
in
Source
Hello Vijay_A_Verma,
Sorry to bug you again, but i was too excited for the quick solution (i have been on it for 1 entire week) that i didn't notice that it does not cover all cases.
There are some cases in the raw data table with more the one value for each id group, for example:
#aaa | #aaa | #aaa | #bbb | #bbb | #ccc | #ccc | #ccc | #ddd | #ddd | #ddd | ||
timestamp | object | id12 | id34 | id23 | id15 | id25 | id67 | id54 | id33 | id20 | id65 | id78 |
10/04/2022 18:42 | M4 | 2586 | 8555 | 6135 | 9539 | 6527 | ||||||
09/04/2022 11:11 | M4 | 1290 | 9642 | 9944 | 5077 | 3355 |
Here for the "aaa" there are 2 values, and i would like to take only the first non-null value, so i am expecting
timestamp | object | aaa | bbb | ccc | ddd |
10/04/2022 18:42 | M4 | 2586 | 6135 | 9539 | 6527 |
09/04/2022 11:11 | M4 | 1290 | 9944 | 5077 | 3355 |
While with your solutions, it is adding all the "idxx" of same group:
Do you know how it could be possible to manage this cases as well?
Thank you!
Never mind, i figured it out
I added a "remove duplicate" steps, selecting the "custom" and the "timestamp" (in order to have an unique releation for each row). Is not so elegant as solution but it does the trick!
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VZPbjcQgDEVbWeV7pMFPbGrIbgOj6b+N5WECSIgAgcO1ffl8Ln4DvTEh/gAV4et13VA7qkv1MxobwzMRYXsmtWVgvL6vTsJB4oLt8E21Q9nOmistUOJ8gMR1gmCAtPQtfzp2KKiMERo1qZb6ZzQS9YZxm2sBSxGfFW6yfrnLsoCaiEw8tZEL+QNVwXzQPGBQACYM0NPY5MqRNnfmHmReQRLNqwJmAcMBu7HHpGnlyEieie3Cmjaq9RqgHCAuKe0JcwgWeb+aXJccNG2VdHA8VOnyg24wpDxhqZuDMS2lGaWRTd0PmAQsIoQ9ABZcANCk+08Sn9HxYFihPFNe/RmJkeqF6a5eU3PeTCZCh54wO9KAdUGceelQsOV853R4VM0gQOF1SCPlvXbAvMqllFe5ADfqcKlPfz61W4rc8npkJJvLs5+KkKA+v+8/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [timestamp = _t, object = _t, id12 = _t, id34 = _t, id23 = _t, id15 = _t, id25 = _t, id67 = _t, id54 = _t, id33 = _t, id20 = _t, id65 = _t, id78 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"timestamp", type datetime}, {"object", type text}, {"id12", Int64.Type}, {"id34", Int64.Type}, {"id23", Int64.Type}, {"id15", Int64.Type}, {"id25", Int64.Type}, {"id67", Int64.Type}, {"id54", Int64.Type}, {"id33", Int64.Type}, {"id20", Int64.Type}, {"id65", Int64.Type}, {"id78", Int64.Type}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"timestamp", "object", "Index"}, "Attribute", "Value"),
#"Added Custom" = Table.AddColumn(#"Unpivoted Other Columns", "Custom", each Table1{[ID=[Attribute]]}[NAME]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Attribute"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns", {"Custom", "timestamp"}),
#"Pivoted Column" = Table.Pivot(#"Removed Duplicates", List.Distinct(#"Removed Duplicates"[Custom]), "Custom", "Value", List.Sum),
#"Sorted Rows" = Table.Sort(#"Pivoted Column",{{"Index", Order.Ascending}}),
#"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
in
#"Removed Columns1"
Thanks!
Thank you very much Vijay_A_Verma!
It seems exactly what i was looking for. I am going to try it on my actual data (that are a bit different) and let you know!
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.