Supplies are limited. Contact info@espc.tech right away to save your spot before the conference sells out.
Get your discountScore big with last-minute savings on the final tickets to FabCon Vienna. Secure your discount
I have some test data in .txt format. Each .txt is a test.
example:
TestA-1.txt
TestA-2.txt
TestA-3.txt
TestB-1.txt
TestB-2.txt
TestB-3.txt
Each file will look like this:
Date=2/15/2023
108.2 107.9 110.0 111.3 111.3
The number of data entries (second row) is not known, but won't be more than 100. I don't care about row 1.
First, split file name so I have a TestID and TestNumber and sort by TestID. All TestAs, TestBs, etc should be next to each other. Not too hard.
Second, find the average for each column of a TestID, and insert this as a new row. *Hard - I don't know how to do this*
I would like the data to look like this after the power query is run. (it should work for any number of files in a folder - TestC, TestD, etc)
I have tried grouping the data by SampleID which allows me to get an average for the columns. However, I cannot figure out how to add this information to the original data so that I have both the raw data and the average (as seen in the picture above). I even tried combining two queries on the same data so that both appear, but it had unexpected results (such as appending the data into new columns instead of using existing columns (which were filled with null)).
The sample data files can be obtained from my google drive:
https://drive.google.com/file/d/1UkSrK3JnMuRx9lVs4lKg1sy4DCuED1er/view?usp=sharing
Note: I only have access to the features provided by Microsoft Excel's Power Query.
Thanks in advance for any help or pointers.
Solved! Go to Solution.
Here you go, just replace the <<Folder with data>> with the location in your local drive with the data.
let
Source = Folder.Files("<<Folder with data>>"),
#"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".txt"),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.Skip(Csv.Document([Content]))),
#"Extracted Text Before Delimiter" = Table.TransformColumns(#"Added Custom", {{"Name", each Text.BeforeDelimiter(_, ".txt"), type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Text Before Delimiter", "Name", Splitter.SplitTextByEachDelimiter({"-"}, QuoteStyle.Csv, false), {"Name.1", "Name.2"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Name.1", type text}, {"Name.2", Int64.Type}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name.1", "Test ID"}, {"Name.2", "Test #"}}),
#"Removed Other Columns" = Table.SelectColumns(#"Renamed Columns",{"Test ID", "Test #", "Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Column1"}, {"Data.Column1"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded Data",{{"Data.Column1", "Data"}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Renamed Columns1", "Data", Splitter.SplitTextByDelimiter("#(tab)", QuoteStyle.Csv), {"Data.1", "Data.2", "Data.3", "Data.4", "Data.5", "Data.6", "Data.7", "Data.8"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Data.1", type number}, {"Data.2", type number}, {"Data.3", type number}, {"Data.4", type number}, {"Data.5", type number}, {"Data.6", type number}, {"Data.7", type number}, {"Data.8", type number}}),
#"Extracted Data" = #"Changed Type1",
#"Grouped Rows" = Table.Group(#"Extracted Data", {"Test ID"}, {{"Test #", each "avg"}} & List.Transform(Table.ColumnNames(Table.RemoveColumns(#"Extracted Data", {"Test ID", "Test #"})), each {_, (tbl) => List.Average(Table.Column(tbl, _))})),
Custom1 = #"Extracted Data" & #"Grouped Rows",
#"Sorted Rows" = Table.Sort(Custom1,{{"Test ID", Order.Ascending}, {"Test #", Order.Ascending}})
in
#"Sorted Rows"
This is what is should look like:
Table.Group:
Parameter 1: the table - #"Extracted Data"
Parameter 2: The key column - "Test ID"
Parameter 3: The new aggregated columns (a list of name, calculation pairs) -
other columns (notice the & is to append it to the "Test #" column:
List.Transform on each of the column names (excluding the "Test ID" and "Test #" columns) to a key value pair where key is the name of the column (in this case the _, since "each" is syntax sugar for "(_) =>" which is a function that takes one input named _) and a value of the average of the column that "each" names. Note that tbl is the sub table that is aggregated, i.e. all the tests for a given Test ID.
Here you go, just replace the <<Folder with data>> with the location in your local drive with the data.
let
Source = Folder.Files("<<Folder with data>>"),
#"Filtered Rows" = Table.SelectRows(Source, each [Extension] = ".txt"),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.Skip(Csv.Document([Content]))),
#"Extracted Text Before Delimiter" = Table.TransformColumns(#"Added Custom", {{"Name", each Text.BeforeDelimiter(_, ".txt"), type text}}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Extracted Text Before Delimiter", "Name", Splitter.SplitTextByEachDelimiter({"-"}, QuoteStyle.Csv, false), {"Name.1", "Name.2"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Name.1", type text}, {"Name.2", Int64.Type}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"Name.1", "Test ID"}, {"Name.2", "Test #"}}),
#"Removed Other Columns" = Table.SelectColumns(#"Renamed Columns",{"Test ID", "Test #", "Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Column1"}, {"Data.Column1"}),
#"Renamed Columns1" = Table.RenameColumns(#"Expanded Data",{{"Data.Column1", "Data"}}),
#"Split Column by Delimiter1" = Table.SplitColumn(#"Renamed Columns1", "Data", Splitter.SplitTextByDelimiter("#(tab)", QuoteStyle.Csv), {"Data.1", "Data.2", "Data.3", "Data.4", "Data.5", "Data.6", "Data.7", "Data.8"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Data.1", type number}, {"Data.2", type number}, {"Data.3", type number}, {"Data.4", type number}, {"Data.5", type number}, {"Data.6", type number}, {"Data.7", type number}, {"Data.8", type number}}),
#"Extracted Data" = #"Changed Type1",
#"Grouped Rows" = Table.Group(#"Extracted Data", {"Test ID"}, {{"Test #", each "avg"}} & List.Transform(Table.ColumnNames(Table.RemoveColumns(#"Extracted Data", {"Test ID", "Test #"})), each {_, (tbl) => List.Average(Table.Column(tbl, _))})),
Custom1 = #"Extracted Data" & #"Grouped Rows",
#"Sorted Rows" = Table.Sort(Custom1,{{"Test ID", Order.Ascending}, {"Test #", Order.Ascending}})
in
#"Sorted Rows"
This is what is should look like:
Thank you very much!
I didn't know you could do something like
Custom1 = #"Extracted Data" & #"Grouped Rows",
Also, could you please explain how you grouped the Test IDs here
#"Grouped Rows" = Table.Group(#"Extracted Data", {"Test ID"}, {{"Test #", each "avg"}} & List.Transform(Table.ColumnNames(Table.RemoveColumns(#"Extracted Data", {"Test ID", "Test #"})), each {_, (tbl) => List.Average(Table.Column(tbl, _))})),
I would really appreciate it.