Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Apologies if something similar has been posted.
I have a table with 5281 rows of data. I have taken screenshot, data hid as priviliged. so the first two rows are the same person and he did two types of CPD training on different dates, I would like just one row but that row to show both his trainings and both start dates as I need statistics from this. So rows 2-9 there are 4 different people but they have done two CPD trainings. Now not all 5281 row are duplicates, some are unique and also some entries have 3 entries. How do I just have one row for the duplicates containing all the data and I don't want to delete any information. Any help greatly appreciated.
Hi @bigrig33
You can group by the fullname and use Text.Combine() to combine them, you can refer to the following code to advanced editor.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlLSUUrUKEnVBNJGBkbG+ob6pkqxOlAJhKAZWNAYU7UhQgJd0ARTtRFYwhRVtbFSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Type = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Type", type text}, {"Date", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Text.From([Date])),
#"Grouped Rows1" = Table.Group(#"Added Custom", {"ID"}, {{"Count", each Text.Combine([Type],","), type text}, {"aa", each Text.Combine([Custom],","), type text}})
in
#"Grouped Rows1"
Output
Best Regards!
Yolo Zhu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @bigrig33 ,
Better to Pivot the data with the Traning and Start Date as the Value
ooooor
You can have a dim table of the entries and the table on your screenshot will be the fact table
@mussaenda Many thanks for replies, but again my stats will be skewed. So take the first two entires on screenshot, same full name, same payno, but they did two trainings on two seperate days, is there any way of getting all that info on the one line, without any data being lost. Like for Instance as i have over 5000 rows, I will have the same fullname duplicated 3 and 4 times because they have done 3 or 4 different trainings, but I still need to get stats on all the trainings they did, but I just want one entry of the Fullname and also the Payno not 3 or 4, if that makes sense. Regards..
I understand, but have you tried the suggestion? That is why it is important to have a star schema when building your data. For situations like this.
@mussaenda i tried it but unfortunately doesn't seem to be working, see I receive this data on two seperate sheets, I don't create the data, so alot of cleanup, but ultimately, I want to know who has completed training and who hasn't but I don't want 3 or 4 records per person, I just want one row to show all the information relating to that person including there trainings and start dates for the training
Please provide a sample data that we can copy to provide with a solution
Hi @mussaenda
So I had to edit data as some sensitive, but you will get the jist. I have multipe entries of similar data based over 5200 rows, I want just one row per entry per payno but i want all the data to follow, so lets take the first two lines, can I have one entry for those two lines,but both training entries and also start dates on the same line.
Hi @bigrig33 ,
you can do like this to have one entry
or you can do like this
or you can do like this and create a star schema
other than those, i hope others will meet your requirements.
Hope this helps
@bigrig33 In Power Query Editor, select all of the columns where there is duplicate information. (Everything except Training and Start Date in your case). Right-click one of those column headers and select "Remove duplicates".
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcspPUtJR8s9LBZIh5flA0jnARSOguDI5Iz8nP71SEyhiaKBvBEJGhkqxOji0YCoLKUpMriTVbPfU/KL0VGKMx6USv/m+mXkpWByFpjAWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t, Column5 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type date}}),
#"Removed Duplicates" = Table.Distinct(#"Changed Type", {"Column1", "Column2", "Column3"})
in
#"Removed Duplicates"
@Greg_Deckler many thanks for the response, so when I highlight those columns, remove duplicates isn't coming up as an option. It's only if I highlight the first column will remove duplicates happen, but If i remove duplicates from that, it removes the stats I need. Any way around this. Regards..
@bigrig33 Not sure why that isn't coming up. I click the first column, hold down Shift, select the last column ( or use Ctrl to select multiple, discontiguous columns). Right-click the header of one of the columns and it shows up (see screen shot). But regardless, the code you want is simply this (you can add it in Advanced Editor):
#"Removed Duplicates" = Table.Distinct(#"Previous Step", {"Column1", "Column2", "Column3"})
@Greg_Deckler again really appreciate you taking the time to respond, but I'm really a novice with excel. So I suppose I'm looking for it to be as straighforward as to take the first two entries on screenshot, same full name, same payno, but they did two trainings on two seperate days, is there any way of getting all that info on the one line, without any data being lost. Like for Instance as i have over 5000 rows, I will have the same fullname duplicated 3 and 4 times because they have done 3 or 4 different trainings, but I still need to get stats on all the trainings they did, but I just want one entry of the Fullname and also the Payno not 3 or 4, if that makes sense. Regards..
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.