Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for a $400 discount.
Register nowGet inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.
Hello everyone,
I am still not very experienced with Power BI and data management in Power Query.
I would like to be able to reprocess in PBI the tables that my colleagues use for different projects and update on a weekly basis. The tables are all structured the same way and follow some basic rules, which I will now explain. In the meantime, I am attaching a simplified version.
The tables you see are a summary sheet in XLS that normally pulls data from different sheets, which are updated and modified over time. For simplicity, I have removed the other sheets and adjusted the formulas so that the "Summary" sheet works correctly on its own.
The tables have different column groups. The first consists of three columns with always the same headers, followed by groups of four columns each (ranging from 1 to 6 groups), separated by an empty column, and ending with a group of two final columns for totals.
The first group of three columns always includes: ID, DESCRIPTION, U.M.
The second group of four columns always includes: "Quants", "Unit Price", "Total", "% on Total".
These latter groups have the first and second rows merged into four cells where values such as "ZONE 3" and "3A" are displayed. All the values in the underlying columns obviously refer to "ZONE 3" and specifically to the subgroup "3A."
The values I need to extract from XLS are "Quants" and "Unit Price". The others, "Total" and "% on Total", can be handled directly in PBI.
Rows:
As you can see, there are some groupings that divide categories into different subcategories. The categories are almost always the same, while the subcategories may increase or decrease in number based on needs. Therefore, I need to distinguish between categories and subcategories.
Some values are written as text and should not cause errors during data processing. In XLS, we use the IFERROR formula to prevent this issue, but I don’t know how to handle it in Power BI.
How can I manage these tables?
Do you have any suggestions?
thank you in advance.
Best
Marco
Solved! Go to Solution.
Hi @Marco_88, check this:
Output
let
Source = Excel.Workbook(File.Contents("C:\Users\abc\Downloads\PBI Table Example_R01.xlsx"), null, false),
Sheet = Source{[Item="Summary",Kind="Sheet"]}[Data],
RemovedBlankAndTotalCols = [ blankCols = Table.SelectRows(Table.TransformColumnTypes(Table.Profile(Sheet),{{"NullCount", Int64.Type}, {"Count", Int64.Type}}), each [Count] = [NullCount])[Column],
totalCols = List.RemoveFirstN(Table.ColumnNames(Sheet), List.PositionOf(List.Transform(Record.ToList(Sheet{0}), Text.Trim), "TOTAL", Occurrence.First, Comparer.OrdinalIgnoreCase)),
removed = Table.RemoveColumns(Sheet, blankCols & totalCols)
][removed],
RemovedBlankAndTotalRows = Table.SelectRows(RemovedBlankAndTotalCols, each (not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))) and (not List.Contains(Record.ToList(_), "TOTAL", (x,y)=> Text.StartsWith(Text.From(x) ?? "", y)))),
Headers = Table.PromoteHeaders(Table.Skip(RemovedBlankAndTotalRows, each not List.Contains(Record.ToList(_), "unit price", Comparer.OrdinalIgnoreCase))),
GroupedRows = Table.Group(Headers, "Column1", {{"T", each
// [ T = GroupedRows{3}[T],
[ T = _,
Zones = List.Zip(List.TransformMany(Table.ToRows(Table.FirstN(RemovedBlankAndTotalCols, 2)),
each {List.RemoveNulls(List.Skip(_, 3))},
(x,y) => y )),
Transformed = List.TransformMany(List.Skip(Table.ToRows(T)),
each List.Split(List.Skip(_, 3), 4),
(x,y)=> List.FirstN(Record.ToList(T{0}), 2) & List.FirstN(x, 3) & y),
Combined = List.Transform(List.Zip({Transformed, List.Repeat(Zones, Table.RowCount(T)-1)}), List.Combine),
ToTable =
[ colNames = {"Category Code", "Category Description"} & List.FirstN(Record.ToList(RemovedBlankAndTotalRows{0}), 3) & List.FirstN(List.Skip(Table.ColumnNames(T), each Text.StartsWith(_, "column", Comparer.OrdinalIgnoreCase)), 4) & {"ZONE", "SUBGROUP"},
tbl = Table.FromRows(Combined, colNames)
][tbl]
][ToTable], type table}}, 0, (x,y)=> Byte.From(y is text)),
CombinedT = Table.Combine(GroupedRows[T])
in
CombinedT
Hi @dufoq3 ,
thank you for your reply.
attached you can find an example. I created a new sheet called Summary-Append where you can find an example of the same original sheet but formatted like a "database" that I'm able to analyze in Power BI. I know that might be better ways to format this sheet to work in PBI but at the moment it's the best way I know.
https://docs.google.com/spreadsheets/d/1AE2v_wDlqOAelkuLqYJcvcr3YAKEWBIB/edit?usp=sharing&ouid=11350...
thank you.
Marco
Hi @Marco_88, check this:
Output
let
Source = Excel.Workbook(File.Contents("C:\Users\abc\Downloads\PBI Table Example_R01.xlsx"), null, false),
Sheet = Source{[Item="Summary",Kind="Sheet"]}[Data],
RemovedBlankAndTotalCols = [ blankCols = Table.SelectRows(Table.TransformColumnTypes(Table.Profile(Sheet),{{"NullCount", Int64.Type}, {"Count", Int64.Type}}), each [Count] = [NullCount])[Column],
totalCols = List.RemoveFirstN(Table.ColumnNames(Sheet), List.PositionOf(List.Transform(Record.ToList(Sheet{0}), Text.Trim), "TOTAL", Occurrence.First, Comparer.OrdinalIgnoreCase)),
removed = Table.RemoveColumns(Sheet, blankCols & totalCols)
][removed],
RemovedBlankAndTotalRows = Table.SelectRows(RemovedBlankAndTotalCols, each (not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))) and (not List.Contains(Record.ToList(_), "TOTAL", (x,y)=> Text.StartsWith(Text.From(x) ?? "", y)))),
Headers = Table.PromoteHeaders(Table.Skip(RemovedBlankAndTotalRows, each not List.Contains(Record.ToList(_), "unit price", Comparer.OrdinalIgnoreCase))),
GroupedRows = Table.Group(Headers, "Column1", {{"T", each
// [ T = GroupedRows{3}[T],
[ T = _,
Zones = List.Zip(List.TransformMany(Table.ToRows(Table.FirstN(RemovedBlankAndTotalCols, 2)),
each {List.RemoveNulls(List.Skip(_, 3))},
(x,y) => y )),
Transformed = List.TransformMany(List.Skip(Table.ToRows(T)),
each List.Split(List.Skip(_, 3), 4),
(x,y)=> List.FirstN(Record.ToList(T{0}), 2) & List.FirstN(x, 3) & y),
Combined = List.Transform(List.Zip({Transformed, List.Repeat(Zones, Table.RowCount(T)-1)}), List.Combine),
ToTable =
[ colNames = {"Category Code", "Category Description"} & List.FirstN(Record.ToList(RemovedBlankAndTotalRows{0}), 3) & List.FirstN(List.Skip(Table.ColumnNames(T), each Text.StartsWith(_, "column", Comparer.OrdinalIgnoreCase)), 4) & {"ZONE", "SUBGROUP"},
tbl = Table.FromRows(Combined, colNames)
][tbl]
][ToTable], type table}}, 0, (x,y)=> Byte.From(y is text)),
CombinedT = Table.Combine(GroupedRows[T])
in
CombinedT
Yeah!
it seems to work perfectly!
thank you very much!
For simplicity, I have removed the other sheets
Do it the other way around. Throw away this Summary sheet and feed Power BI from the original sheets.
Hi lbendlin,
I know that it would be better to take the data directly from the other sheets but to make the reading homogeneous this is the best way as the format of this sheet is the only thing common to several of my collegues' projects.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Check out the February 2025 Power BI update to learn about new features.
User | Count |
---|---|
27 | |
25 | |
25 | |
13 | |
9 |
User | Count |
---|---|
24 | |
19 | |
16 | |
13 | |
10 |