
Otavio_Hurtado
Frequent Visitor

Optimize Query with data calculations

I have a table of sales data grouped by location, month, product_id, and sales_segment.
I want to compare against the previous year's sales. In SQL I would normally use a subquery to populate new columns in my original table with the previous year's values.
Since I couldn't get my query to search itself the way a SUMIFS would in Excel, I tried the same approach I would take in SQL: I duplicated my table and used Table.SelectRows to look up the previous year's sales in the duplicate, like this:
PS: PYDATE is a column with the previous year's date to filter on. For example, if the current month is "SEP22", PYDATE contains "SEP21" for that row.

// For each row, sum last year's sales from a duplicate of the table,
// matching on every grouping column plus the precomputed PYDATE.
Table.AddColumn(
    #"Renamed Columns1",
    "Sales (Q) PY",
    each List.Sum(
        Table.SelectRows(
            DuplicateDatabase,
            (NestedTable) =>
                NestedTable[Country] = [Country] and
                NestedTable[DATE] = [PYDATE] and
                NestedTable[Sales_Segment] = [Sales_Segment] and
                NestedTable[ProductID] = [ProductID] and
                NestedTable[Location] = [Location]
        )[#"Sales (Q)"]
    )
)

The problem is that my solution works, but it takes forever to run because I have a large dataset.
I imagine running this as a measure in Power BI would be even slower than my current approach.
Is there a more optimal way to calculate this?

DATASET EXAMPLE:

COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY) | PYDATE | PY_SALES(QTY)
AB | OCT22 | D | A1 | ABC | 100 | OCT21 | ?
AB | OCT21 | D | A1 | ABC | 50 | OCT20 | ?
CD | SEP22 | O | B2 | DEF | 90 | SEP21 | ?



3 REPLIES
Daryl-Lynch-Bzy
Resident Rockstar

Hi @Otavio_Hurtado - 500 million rows will be okay as long as the cardinality of the foreign-key dimension columns is small, so don't load too many unique values! In the example above, if you have a row ID for each transaction, remove it and aggregate in SQL before loading to Power BI (add a count of transaction rows).

You may also want to consider limiting the history in the query to reduce the model size, and using incremental refresh to improve load times.

Daryl-Lynch-Bzy
Resident Rockstar

Hi @Otavio_Hurtado - I think DAX can handle this much better than SQL or Power Query. But how large is your dataset?

To achieve this, you need to load the following table:

COUNTRY | DATE | SALES_SEGMENT | PRODUCT_ID | LOCATION | SALES(QTY)
AB | 01/10/2022 | D | A1 | ABC | 100.00
AB | 01/10/2021 | D | A1 | ABC | 50.00
CD | 01/09/2022 | O | B2 | DEF | 90.00
AB | 01/10/2020 | D | A1 | ABC | ??
CD | 01/09/2021 | O | B2 | DEF | ??

The trick is to replace the month label with a start-of-month date. That lets you add a Calendar table and use the time-intelligence functions in DAX, which avoids the expensive joins required in SQL and Power Query. A sketch of such measures follows below.
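A minimal sketch of what those measures could look like, assuming a 'Calendar' table marked as a date table and related to the fact table on the start-of-month DATE column (the Sales table name, column references, and measure names here are illustrative, not from the thread):

Sales Qty := SUM ( Sales[SALES(QTY)] )

Sales Qty PY :=
CALCULATE (
    [Sales Qty],  -- base measure defined above
    SAMEPERIODLASTYEAR ( 'Calendar'[Date] )  -- shift the filter context back one year
)

Because every row carries a real date, SAMEPERIODLASTYEAR shifts the filter context back exactly one year at query time, so no duplicate table or row-by-row lookup is needed.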

Otavio_Hurtado
Frequent Visitor

About 24 million rows. A colleague also recommended that approach, so I will try it.
