Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.
Check it out now!Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more
Hello,
I am looking for a method to sum each individual column of a large data set, and group columns based on the name.
Example Data:
abc_X_123.stat.energize | abc_X_456.stat.energize | abc_X_789.stat.energize | abc_Y_123.stat.energize | abc_Z_123.stat.energize | abc_Z_456.stat.energize |
1 | null | null | 1 | 1 | null |
null | null | 1 | null | null | 1 |
null | 1 | null | null | 1 | null |
(Bold and Italicized to make date more readable)
End Goal:
X | Y | Z |
3 | 1 | 3 |
My main struggle with these transformations is finding a way to sum each individual column. The name part is easily handles by seperating by a delimiter.
My current approach has been to transpose the table, remove and delete the extra letters after the underscore, create a record transform the record to a list, sum the list, and delete all data except my sums and names. However this does not work because of the size of the data which essentially makes excel freeze after I transpose the table.
My next idea involved creating a duplicate query one containing column names only, which I can do without issue, and one containing only the data. On the table of data I then tried making a record into a list and using list.zip, which gives me the error:
Expression.Error: We cannot convert the value null to type List.
Details:
Value=
Type=[Type]
Is there a different way to tackle this problem, it seems like most functions I find sum the row, which is not at all what I need.
Hi @Rick_S137,
Just following up to see if the solution provided was helpful in resolving your issue. Please feel free to let us know if you need any further assistance.
If the response addressed your query, kindly mark it as Accepted Solution and click Yes if you found it helpful — this will benefit others in the community as well.
Best regards,
Prasanna Kumar
Hi @Rick_S137,
Thank you for reaching out to the Microsoft Fabric Forum Community.
Also, thanks to @lbendlin for the prompt and helpful response.
Try this Combine and Transform" from a folder in Power Query, several steps are automatically generated—such as Filtered Hidden Files, Invoke Custom Function, Expanded Table Column, and Changed Type. These are designed to prepare and combine your CSV files into one table, so you shouldn’t place your custom column-summing logic in the Source step. Instead, insert a new step after the final auto-generated one (usually Changed Type) to apply your transformation to the fully combined data. This way, you avoid hardcoding column names and ensure the query continues to work as new files are added to the folder.
let
Source = #"Changed Type",
ColumnNames = Table.ColumnNames(Source),
ColumnSums = List.Transform(
ColumnNames,
each [
OriginalName = _,
Sum = List.Sum(List.RemoveNulls(Table.Column(Source, _)))
]
),
SumsTable = Table.FromRecords(ColumnSums),
AddGroupName = Table.AddColumn(
SumsTable,
"Name",
each Text.BetweenDelimiters([OriginalName], "_", "_")
),
GroupedResult = Table.Group(
AddGroupName,
{"Name"},
{{"Total", each List.Sum([Sum]), type number}}
)
in
GroupedResult
If you find this response helpful, please consider marking it as the accepted solution and giving it a thumbs-up to support others in the community.
Thank you & Regards,
Prasanna kumar
You are not summing all individual columns. Now you introduce column groups. Will they always have a single character and an underscore at the beginning of the column name?
The column names will follow this pattern: abc_Name_123.stat.energize, the only part of the column name I am interested in keeping is the part between the underscores.
After summing each indivdual column and prepping the names, I would use the Group By function to group and sum all columns name X. My main issue is finding a way to sum an each individual column. Again I am not sure if this is the most effective method, but it is what my initial thoughts and attempts were.
My first post was a poor description please disregard the information I provided in that post.
The column names will follow this pattern: abc_Name_123.stat.energize, the only part of the column name I am interested in keeping is the part between the underscores.
so, "Name" ?
Yes, Name is a variable. There are different names in my data set, and there are also repeat names. In the case in which a name is repeated it needs to be summed and grouped with the column with which it shares the same name.
Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information. Do not include anything that is unrelated to the issue or question.
Please show the expected outcome based on the sample data you provided.
Need help uploading data? https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...
The data is now edited to match my data completely.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUcorzclBUIZQDObG6kRjkUYXQVKF3TyIWbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [abc_X_123.stat.energize = _t, abc_X_456.stat.energize = _t, abc_X_789.stat.energize = _t, abc_Y_123.stat.energize = _t, abc_Z_123.stat.energize = _t, abc_Z_456.stat.energize = _t]),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(Source, {}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Attribute.1", "Attribute.2", "Attribute.3"}),
#"Removed Other Columns" = Table.SelectColumns(#"Split Column by Delimiter",{"Attribute.2", "Value"}),
#"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each ([Value] = "1")),
#"Changed Type" = Table.TransformColumnTypes(#"Filtered Rows",{{"Value", Int64.Type}}),
#"Pivoted Column" = Table.Pivot(#"Changed Type", List.Distinct(#"Changed Type"[Attribute.2]), "Attribute.2", "Value", List.Sum)
in
#"Pivoted Column"
How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the entire Source step with your own source.
NOTE: Your expected outcome is an anti-pattern. This is Power BI, not Excel. Leave the pivoting to the visuals.
I don't fully understand how to use source code in power query, but I have a question about these lines,
in type table [abc_X_123.stat.energize = _t, abc_X_456.stat.energize = _t, abc_X_789.stat.energize = _t, abc_Y_123.stat.energize = _t, abc_Z_123.stat.energize = _t, abc_Z_456.stat.energize = _t]),
are these names hardcoded in, because these are not the actual names, and even if they were I have 200+ columns I cannot use any hardcoded mames as it would take me much to long to type all them in by hand.
I will still attempt to use this code in the meantime.
That is just the meta data of the sample data you provided.
Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the entire Source step with your own source.
I tried to use the source step but it doesnt work. I believe it is because of what I describe below
My initial source step is not the table I will be working with, power query seems to add steps automatically before hand. So do I add the Code from all the other steps as well?
I draw the data for my query from a folder, and use the "combine and transform" option, currently I have one CSV in the folder, but I will add the rest once I have the power query editing my data properly. I will also continually add data over time.
Here are pictures of what my power query looks like when i load it for the first time with my data. I would include the function bar above the table, but it has sensitive text in it.
There is no need to change type so soon in the process. Most likely you didn't disable that automatic feature.
If your data source are CSV files in a folder then provide at least two sample CSV files.
Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information. Do not include anything that is unrelated to the issue or question.
Please show the expected outcome based on the sample data you provided.
Need help uploading data? https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...