Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply
Marco_88
Frequent Visitor

Heo to transform XLS Table in PBI?

Hello everyone,
I am still not very experienced with Power BI and data management in Power Query.
I would like to be able to reprocess in PBI the tables that my colleagues use for different projects and update on a weekly basis. The tables are all structured the same way and follow some basic rules, which I will now explain. In the meantime, I am attaching a simplified version.

The tables you see are a summary sheet in XLS that normally pulls data from different sheets, which are updated and modified over time. For simplicity, I have removed the other sheets and adjusted the formulas so that the "Summary" sheet works correctly on its own.

The tables have different column groups. The first consists of three columns with always the same headers, followed by groups of four columns each (ranging from 1 to 6 groups), separated by an empty column, and ending with a group of two final columns for totals.

The first group of three columns always includes: ID, DESCRIPTION, U.M.
The second group of four columns always includes: "Quants", "Unit Price", "Total", "% on Total".
These latter groups have the first and second rows merged into four cells where values such as "ZONE 3" and "3A" are displayed. All the values in the underlying columns obviously refer to "ZONE 3" and specifically to the subgroup "3A."

The values I need to extract from XLS are "Quants" and "Unit Price". The others, "Total" and "% on Total", can be handled directly in PBI.

Rows:
As you can see, there are some groupings that divide categories into different subcategories. The categories are almost always the same, while the subcategories may increase or decrease in number based on needs. Therefore, I need to distinguish between categories and subcategories.

Some values are written as text and should not cause errors during data processing. In XLS, we use the IFERROR formula to prevent this issue, but I don’t know how to handle it in Power BI.

 

How can I manage these tables?
Do you have any suggestions?

 

thank you in advance.

Best

 

Marco

link XSL table - Google Drive 

1 ACCEPTED SOLUTION

Hi @Marco_88, check this:

 

Output

dufoq3_0-1739799362502.png

let
    Source = Excel.Workbook(File.Contents("C:\Users\abc\Downloads\PBI Table Example_R01.xlsx"), null, false),
    Sheet = Source{[Item="Summary",Kind="Sheet"]}[Data],
    RemovedBlankAndTotalCols = [ blankCols = Table.SelectRows(Table.TransformColumnTypes(Table.Profile(Sheet),{{"NullCount", Int64.Type}, {"Count", Int64.Type}}), each [Count] = [NullCount])[Column],
    totalCols = List.RemoveFirstN(Table.ColumnNames(Sheet), List.PositionOf(List.Transform(Record.ToList(Sheet{0}), Text.Trim), "TOTAL", Occurrence.First, Comparer.OrdinalIgnoreCase)),
    removed = Table.RemoveColumns(Sheet, blankCols & totalCols)
  ][removed],
    RemovedBlankAndTotalRows = Table.SelectRows(RemovedBlankAndTotalCols, each (not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))) and (not List.Contains(Record.ToList(_), "TOTAL", (x,y)=> Text.StartsWith(Text.From(x) ?? "", y)))),
    Headers = Table.PromoteHeaders(Table.Skip(RemovedBlankAndTotalRows, each not List.Contains(Record.ToList(_), "unit price", Comparer.OrdinalIgnoreCase))),
    GroupedRows = Table.Group(Headers, "Column1", {{"T", each
        // [ T = GroupedRows{3}[T],
        [ T = _,
          Zones = List.Zip(List.TransformMany(Table.ToRows(Table.FirstN(RemovedBlankAndTotalCols, 2)),
            each {List.RemoveNulls(List.Skip(_, 3))},
            (x,y) => y )),
          Transformed = List.TransformMany(List.Skip(Table.ToRows(T)),
            each List.Split(List.Skip(_, 3), 4),
            (x,y)=> List.FirstN(Record.ToList(T{0}), 2) & List.FirstN(x, 3) & y),
          Combined = List.Transform(List.Zip({Transformed, List.Repeat(Zones, Table.RowCount(T)-1)}), List.Combine),
          ToTable = 
            [ colNames = {"Category Code", "Category Description"} & List.FirstN(Record.ToList(RemovedBlankAndTotalRows{0}), 3) & List.FirstN(List.Skip(Table.ColumnNames(T), each Text.StartsWith(_, "column", Comparer.OrdinalIgnoreCase)), 4) & {"ZONE", "SUBGROUP"},
              tbl = Table.FromRows(Combined, colNames)
            ][tbl]
        ][ToTable], type table}}, 0, (x,y)=> Byte.From(y is text)),
    CombinedT = Table.Combine(GroupedRows[T])
in
    CombinedT

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

7 REPLIES 7
dufoq3
Super User
Super User

Hi @Marco_88, could you provide expected output for this?


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hi @dufoq3 , 
thank you for your reply.
attached you can find an example. I created a new sheet called Summary-Append where you can find an example of the same original sheet but formatted like a "database" that I'm able to analyze in Power BI. I know that might be better ways to format this sheet to work in PBI but at the moment it's the best way I know. 

https://docs.google.com/spreadsheets/d/1AE2v_wDlqOAelkuLqYJcvcr3YAKEWBIB/edit?usp=sharing&ouid=11350... 

thank you.
Marco 

Hi @Marco_88, check this:

 

Output

dufoq3_0-1739799362502.png

let
    Source = Excel.Workbook(File.Contents("C:\Users\abc\Downloads\PBI Table Example_R01.xlsx"), null, false),
    Sheet = Source{[Item="Summary",Kind="Sheet"]}[Data],
    RemovedBlankAndTotalCols = [ blankCols = Table.SelectRows(Table.TransformColumnTypes(Table.Profile(Sheet),{{"NullCount", Int64.Type}, {"Count", Int64.Type}}), each [Count] = [NullCount])[Column],
    totalCols = List.RemoveFirstN(Table.ColumnNames(Sheet), List.PositionOf(List.Transform(Record.ToList(Sheet{0}), Text.Trim), "TOTAL", Occurrence.First, Comparer.OrdinalIgnoreCase)),
    removed = Table.RemoveColumns(Sheet, blankCols & totalCols)
  ][removed],
    RemovedBlankAndTotalRows = Table.SelectRows(RemovedBlankAndTotalCols, each (not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))) and (not List.Contains(Record.ToList(_), "TOTAL", (x,y)=> Text.StartsWith(Text.From(x) ?? "", y)))),
    Headers = Table.PromoteHeaders(Table.Skip(RemovedBlankAndTotalRows, each not List.Contains(Record.ToList(_), "unit price", Comparer.OrdinalIgnoreCase))),
    GroupedRows = Table.Group(Headers, "Column1", {{"T", each
        // [ T = GroupedRows{3}[T],
        [ T = _,
          Zones = List.Zip(List.TransformMany(Table.ToRows(Table.FirstN(RemovedBlankAndTotalCols, 2)),
            each {List.RemoveNulls(List.Skip(_, 3))},
            (x,y) => y )),
          Transformed = List.TransformMany(List.Skip(Table.ToRows(T)),
            each List.Split(List.Skip(_, 3), 4),
            (x,y)=> List.FirstN(Record.ToList(T{0}), 2) & List.FirstN(x, 3) & y),
          Combined = List.Transform(List.Zip({Transformed, List.Repeat(Zones, Table.RowCount(T)-1)}), List.Combine),
          ToTable = 
            [ colNames = {"Category Code", "Category Description"} & List.FirstN(Record.ToList(RemovedBlankAndTotalRows{0}), 3) & List.FirstN(List.Skip(Table.ColumnNames(T), each Text.StartsWith(_, "column", Comparer.OrdinalIgnoreCase)), 4) & {"ZONE", "SUBGROUP"},
              tbl = Table.FromRows(Combined, colNames)
            ][tbl]
        ][ToTable], type table}}, 0, (x,y)=> Byte.From(y is text)),
    CombinedT = Table.Combine(GroupedRows[T])
in
    CombinedT

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Yeah!
it seems to work perfectly!

 

thank you very much!

You're welcome Marco 😉


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

lbendlin
Super User
Super User

For simplicity, I have removed the other sheets 

Do it the other way around. Throw away this Summary sheet and feed Power BI from the original sheets.

Hi lbendlin, 

 

I know that it would be better to take the data directly from the other sheets but to make the reading homogeneous this is the best way as the format of this sheet is the only thing common to several of my collegues' projects. 

 

 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

Feb2025 NL Carousel

Fabric Community Update - February 2025

Find out what's new and trending in the Fabric community.