Reply
TheCreepster1
Helper II

Very Slow Performance on Group By

Hi,


Unfortunately, our inventory and sales data reside in different systems, so I have had to construct two queries to bring this information into Power Query. I have constructed the two queries so that they are laid out in the same format:

 

[Screenshot: TheCreepster1_0-1647599904876.png — the shared column layout of the two queries]

I've then appended the two queries and performed a Group By on the Location and Product columns, aggregating the measure fields. However, the data volume is such that it has taken nearly an hour so far to load into the Power BI data model, with no sign that it is close to finishing. In total I have approx. 13.5m rows of inventory data and 2.5m rows of sales data. Below is the actual M code generated by Power Query. Is there anything I can do to make this more efficient? I thought an append would be more efficient than a merge with a full outer join, but so far it's not proving that way!

 

 

let
    #"Sales Query" = SapHana.Database("xxxxxxxx:xxxxx", [Query="select#(tab)CREATION_WEEK as WEEK_ID,#(lf)#(tab)#(tab)ltrim(gsv.PRCTR, '0') as SITE,#(lf)#(tab)#(tab)GENERIC,#(lf)#(tab)#(tab)sum(UNITS) as UNITS,#(lf)#(tab)#(tab)sum(NET_VALUE) as NET_VALUE,#(lf)#(tab)#(tab)sum(NDW) as COST_OF_SALES,#(lf)#(tab)#(tab)sum(GROSS_MARGIN) as GROSS_MARGIN,#(lf)#(tab)#(tab)0 as STOCK#(lf)#(lf)from aggregates.global_single_view gsv#(lf)#(lf)left join ""_SYS_BIC"".""EuropeMerch/CV_MER_LOCATION_CHANNEL_HIERARCHY"" lh on case when gsv.vtweg = '10' then gsv.SOLD_TO else gsv.PRCTR end = lh.LOCATION_JOIN and lh.LOCATION_JOIN != ''#(lf)#(lf)where LEDGER = 'LOCAL'#(lf)and TYPE like 'SALE%'#(lf)and lh.REGION = 'Americas'#(lf)and lh.CHANNEL = 'Direct to Consumer'#(lf)and CREATION_WEEK >201899#(lf)group by CREATION_WEEK,#(lf)#(tab)#(tab)gsv.PRCTR,#(lf)#(tab)#(tab)GENERIC#(lf)", Implementation="2.0"]),
    #"Appended Query" = Table.Combine({#"Sales Query", #"Stock Query"}),
    #"Grouped Rows" = Table.Group(
        #"Appended Query",
        {"WEEK_ID", "SITE", "GENERIC"},
        {
            {"UNITS", each List.Sum([UNITS]), type nullable number},
            {"NET_VALUE", each List.Sum([NET_VALUE]), type nullable number},
            {"COST_OF_SALES", each List.Sum([COST_OF_SALES]), type nullable number},
            {"GROSS_MARGIN", each List.Sum([GROSS_MARGIN]), type nullable number},
            {"STOCK", each List.Sum([STOCK]), type nullable number}
        }
    )
in
    #"Grouped Rows"

 

 

1 ACCEPTED SOLUTION
BA_Pete
Super User

Hi @TheCreepster1 ,

 

Don't append or merge at all.

 

Set your model up in the STAR SCHEMA structure, where your Sales and Inventory tables are the FACT tables, and any common fields (e.g. Location, Product etc.) are aggregated into DIMENSION tables that are then related to both fact tables.

You can then write measures that span both fact tables, as well as display data from both in visuals using the dimension table fields to maintain commonality between them.

https://docs.microsoft.com/en-us/power-bi/guidance/star-schema 
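Pete's suggestion of aggregating the common fields into dimension tables can be sketched in M. This is a minimal illustration only, assuming the two fact queries from the original post (#"Sales Query" and #"Stock Query") and that GENERIC is the product key; the dimension query name is hypothetical:

```
// Build a Product dimension from the distinct keys of both fact queries.
// Repeat the same pattern for SITE to get a Location dimension.
let
    // Keep just the key column from each fact query
    SalesKeys = Table.SelectColumns(#"Sales Query", {"GENERIC"}),
    StockKeys = Table.SelectColumns(#"Stock Query", {"GENERIC"}),
    // Stack them and deduplicate - one row per product
    AllKeys = Table.Combine({SalesKeys, StockKeys}),
    DimProduct = Table.Distinct(AllKeys)
in
    DimProduct
```

In the model view you would then create one-to-many relationships from DimProduct (and DimLocation) to each fact table, leaving Sales and Inventory as separate fact tables, so measures and visuals can slice both through the shared dimension fields without ever combining the 16m rows.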

 

Another option would be to write your SQL query as a view on the source, rather than as a native query from Power Query. Assuming both of your tables are on the same DB/server, your append and group steps should then fold back to the source, which would perform these operations significantly faster than Power Query.
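As a sketch of what that looks like, the example below uses a SQL Server source, where folding behaviour is well documented (whether the SAP HANA connector folds these steps the same way may differ); server, database, and view names are hypothetical:

```
// Because no hand-written native query is used, Power Query can fold the
// combine and group steps back into a single statement run on the server.
let
    Source = Sql.Database("myserver", "mydb"),
    Sales = Source{[Schema = "dbo", Item = "vw_Sales"]}[Data],
    Stock = Source{[Schema = "dbo", Item = "vw_Stock"]}[Data],
    Combined = Table.Combine({Sales, Stock}),   // folds to UNION ALL
    Grouped = Table.Group(                      // folds to GROUP BY
        Combined,
        {"WEEK_ID", "SITE", "GENERIC"},
        {{"UNITS", each List.Sum([UNITS]), type nullable number}}
    )
in
    Grouped
```

You can confirm folding is happening by right-clicking the last applied step and checking whether "View Native Query" is enabled.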

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!





3 REPLIES 3
Hi,

 

I don't know why this didn't even cross my mind! Everything was set up to join as a star schema, but I had it in my head that I needed to smash the inventory and sales together for the end user. Unfortunately the two data sources come from different databases, hence the inability to do it on the SQL back end, but I've managed to sort it through your first suggestion. Thank you!

Vijay_A_Verma
Super User

Try buffering the table to improve the performance.

Replace 

#"Appended Query" = Table.Combine({#"Sales Query", #"Stock Query"}),

with 

#"Appended Query" = Table.Buffer(Table.Combine({#"Sales Query", #"Stock Query"})),
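In context, the buffered step slots into the original query as below (abbreviated sketch; the native SAP HANA queries stay as in the original post). One caveat worth noting: Table.Buffer reads the entire combined table (~16m rows here) into memory and prevents any query folding past that step, so it tends to help only when the upstream sources don't fold anyway:

```
let
    // #"Sales Query" and #"Stock Query" defined as before
    #"Appended Query" = Table.Buffer(Table.Combine({#"Sales Query", #"Stock Query"})),
    // Table.Group now reads from the in-memory buffer instead of
    // re-evaluating both source queries during the grouping scan
    #"Grouped Rows" = Table.Group(#"Appended Query", {"WEEK_ID", "SITE", "GENERIC"},
        {{"UNITS", each List.Sum([UNITS]), type nullable number}})
in
    #"Grouped Rows"
```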
