Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Power BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.

Reply
Ackbar-Learner
Resolver I
Resolver I

How to reduce calculation time on a query?

Hi I have a few queries which are dependent on one another. The final query is taking too long to deliver.

 

Query 1 - the first query calls an API endpoint to get all projects data and with that project data, the code will apply another function to query another API endpoint to get all project invoicing data. If there are 100,000 lines of projects, it will query the 2nd API endpoint 100,000 times to get each project's invoicing data. Below is the first query:

let
Source = FunctionToken("projects"),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"ProjectId", "CustomerId", "IsActive"}, {"ProjectId", "CustomerId", "IsActive"}),
#"Removed Duplicates" = Table.Distinct(#"Expanded Column1"),
#"Added Custom" = Table.AddColumn(#"Removed Duplicates", "PIDInvoicingData", each "projects/"&Number.ToText([ProjectId])&"/invoicingdata"),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom", each FunctionToken([PIDInvoicingData])),
#"Expanded Custom" = Table.ExpandRecordColumn(#"Added Custom1", "Custom", {"ProjectId", "InvoiceGroup", "InvoiceGroupId", "UseProjectEmployeeTarification", "TarificationList", "TarificationListId", "KmPrice", "CreditRestrictionPerc", "AdministrativeCostPerc", "ChargeExpenses", "ChargeProductUse", "AccCode", "LedgerAccount", "InvoicingRemark", "DelayDuration", "DelayType", "DelayTypeId"}, {"ProjectId.1", "InvoiceGroup", "InvoiceGroupId", "UseProjectEmployeeTarification", "TarificationList", "TarificationListId", "KmPrice", "CreditRestrictionPerc", "AdministrativeCostPerc", "ChargeExpenses", "ChargeProductUse", "AccCode", "LedgerAccount", "InvoicingRemark", "DelayDuration", "DelayType", "DelayTypeId"}),
#"Removed Columns1" = Table.RemoveColumns(#"Expanded Custom",{"PIDInvoicingData", "ProjectId.1"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns1",{{"ProjectId", Int64.Type}, {"InvoiceGroup", type text}, {"InvoiceGroupId", Int64.Type}, {"UseProjectEmployeeTarification", type logical}, {"TarificationList", type text}, {"TarificationListId", Int64.Type}, {"KmPrice", type number}, {"CreditRestrictionPerc", Percentage.Type}, {"AdministrativeCostPerc", Percentage.Type}, {"ChargeExpenses", type logical}, {"ChargeProductUse", type logical}, {"AccCode", type text}, {"LedgerAccount", Int64.Type}, {"InvoicingRemark", type text}, {"DelayType", type text}, {"DelayTypeId", Int64.Type}, {"DelayDuration", Int64.Type}, {"IsActive", type logical}, {"CustomerId", Int64.Type}}),
#"Divided Column" = Table.TransformColumns(#"Changed Type", {{"AdministrativeCostPerc", each _ / 100, Percentage.Type}}),
#"Inserted Merged Column" = Table.AddColumn(#"Divided Column", "ProjectCustomerID", each Text.Combine({Text.From([ProjectId], "en-GB"), Text.From([CustomerId], "en-GB")}, ""), type text),
#"Changed Type1" = Table.TransformColumnTypes(#"Inserted Merged Column",{{"ProjectCustomerID", Int64.Type}})
in
#"Changed Type1"

 

Query 2 - the 2nd query will fetch all time registrations thorugh an API endpoint and then will merge queries with Query 1 to get some additional data from Query 1 to Query 2. Below is Query 2:

 

let
Source = FunctionTokenHistory("timeregistrations", "DateRegistration", "2024-12-31", "2026-01-01"),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"TimeRegistrationId", "CustomerId", "ProjectId", "PersonId", "PrestationId", "DateRegistration", "TimeFrom", "TimeTo", "Duration", "InvoiceId", "Overtime", "InternalRemarks", "Invoicable", "TrVat", "TrPrice", "Urgent", "Recalc", "Remarks", "InternalPrice", "ExForfait", "Benchmark", "Period", "CustomerInvoiceId"}, {"TimeRegistrationId", "CustomerId", "ProjectId", "PersonId", "PrestationId", "DateRegistration", "TimeFrom", "TimeTo", "Duration", "InvoiceId", "Overtime", "InternalRemarks", "Invoicable", "TrVat", "TrPrice", "Urgent", "Recalc", "Remarks", "InternalPrice", "ExForfait", "Benchmark", "Period", "CustomerInvoiceId"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Column1",{{"TimeRegistrationId", Int64.Type}, {"CustomerId", Int64.Type}, {"ProjectId", Int64.Type}, {"PersonId", Int64.Type}, {"PrestationId", Int64.Type}, {"TimeFrom", type time}, {"TimeTo", type time}, {"Duration", Int64.Type}, {"InvoiceId", Int64.Type}, {"Overtime", type logical}, {"InternalRemarks", type text}, {"Invoicable", type logical}, {"TrVat", type number}, {"TrPrice", type number}, {"Urgent", type logical}, {"Recalc", type logical}, {"Remarks", type text}, {"ExForfait", type logical}, {"InternalPrice", type number}, {"Benchmark", Int64.Type}, {"Period", Int64.Type}, {"CustomerInvoiceId", Int64.Type}}),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","T"," ",Replacer.ReplaceText,{"DateRegistration"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Replaced Value",{{"DateRegistration", type datetime}}),
#"Changed Type2" = Table.TransformColumnTypes(#"Changed Type1",{{"DateRegistration", type date}}),
#"Added Custom" = Table.AddColumn(#"Changed Type2", "FDuration", each #duration(0,0,[Duration],0)),
#"Changed Type3" = Table.TransformColumnTypes(#"Added Custom",{{"FDuration", type duration}}),
#"Inserted Total Hours" = Table.AddColumn(#"Changed Type3", "Total Hours", each Duration.TotalHours([FDuration]), type number),
#"Added Custom1" = Table.AddColumn(#"Inserted Total Hours", "Invoicable Amount", each if [Invoicable]= true then [Total Hours]*[TrPrice] else 0),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Registration Amount", each [Total Hours]*[TrPrice]),
#"Changed Type4" = Table.TransformColumnTypes(#"Added Custom2",{{"Invoicable Amount", type number},{"Registration Amount", type number}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type4", {"CustomerId"}, Customers, {"CustomerId"}, "Customers", JoinKind.LeftOuter),
#"Expanded Customers" = Table.ExpandTableColumn(#"Merged Queries", "Customers", {"Name"}, {"Name"}),
#"Merged Queries1" = Table.NestedJoin(#"Expanded Customers", {"ProjectId"}, ProjectsByEntity, {"ProjectId"}, "ProjectsByEntity", JoinKind.LeftOuter),
#"Expanded ProjectsByEntity" = Table.ExpandTableColumn(#"Merged Queries1", "ProjectsByEntity", {"ProjectTitle", "IsActive"}, {"ProjectTitle", "IsActive"}),
#"Merged Queries2" = Table.NestedJoin(#"Expanded ProjectsByEntity", {"PersonId"}, Employees, {"EmployeeId"}, "Employees", JoinKind.LeftOuter),
#"Expanded Employees" = Table.ExpandTableColumn(#"Merged Queries2", "Employees", {"FirstName"}, {"FirstName"}),
#"Merged Queries3" = Table.NestedJoin(#"Expanded Employees", {"PrestationId"}, RegistrationItem, {"PrestationId"}, "RegistrationItem", JoinKind.LeftOuter),
#"Expanded RegistrationItem" = Table.ExpandTableColumn(#"Merged Queries3", "RegistrationItem", {"Prestation"}, {"Prestation"}),
#"Renamed Columns" = Table.RenameColumns(#"Expanded RegistrationItem",{{"FirstName", "EmployeeFName"}, {"Name", "Customer"}}),
#"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"Customer", "ProjectTitle", "EmployeeFName", "Prestation", "TimeRegistrationId", "CustomerId", "ProjectId", "PersonId", "PrestationId", "DateRegistration", "TimeFrom", "TimeTo", "Duration", "InvoiceId", "Overtime", "InternalRemarks", "Invoicable", "TrVat", "TrPrice", "Urgent", "Recalc", "Remarks", "InternalPrice", "ExForfait", "Benchmark", "Period", "CustomerInvoiceId", "FDuration", "Total Hours", "Invoicable Amount", "Registration Amount", "IsActive"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each ([PrestationId] <> 1208)),
#"Inserted Age" = Table.AddColumn(#"Filtered Rows", "Age", each Date.From(DateTime.LocalNow()) - [DateRegistration], type duration),
#"Inserted Total Days" = Table.AddColumn(#"Inserted Age", "Total Days", each Duration.TotalDays([Age]), type number),
#"Added Custom3" = Table.AddColumn(#"Inserted Total Days", "Ageing", each if[Total Days]<=30 and Date.IsInCurrentMonth([DateRegistration]) then "Current Month" else if [Total Days]<=30 and Date.IsInPreviousMonth([DateRegistration]) then "1 Month" else if [Total Days]<=60 then "2 Months" else if [Total Days]<=90 then "3 Months" else if [Total Days]<=180 then "4-6 Months" else if [Total Days]<=365 then "7-12 Months" else "More than 1 year"),
#"Changed Type5" = Table.TransformColumnTypes(#"Added Custom3",{{"Ageing", type text}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type5",{"Age", "Total Days"}),
#"Appended Query" = Table.Combine({#"Removed Columns", TimeRegHistory}),
#"Merged Queries4" = Table.NestedJoin(#"Appended Query", {"ProjectId"}, ProjectInvoicingMult, {"ProjectId"}, "ProjectInvoicingMult", JoinKind.LeftOuter),
#"Expanded ProjectInvoicingMult" = Table.ExpandTableColumn(#"Merged Queries4", "ProjectInvoicingMult", {"InvoiceGroup"}, {"InvoiceGroup"}),
#"Replaced Value1" = Table.ReplaceValue(#"Expanded ProjectInvoicingMult",null,"None",Replacer.ReplaceValue,{"InvoiceGroup"}),
#"Sorted Rows" = Table.Sort(#"Replaced Value1",{{"TimeRegistrationId", Order.Ascending}}),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows"),
#"Grouped Rows" = Table.Group(#"Removed Duplicates", {"TimeRegistrationId"}, {{"TableWithIndex", each Table.AddIndexColumn(_,"Running Count",1,1)}}),
Custom1 = Table.Combine(#"Grouped Rows"[TableWithIndex]),
#"Filtered Rows2" = Table.SelectRows(Custom1, each ([Running Count] = 1)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows2",{"Running Count"})
in
#"Removed Columns1"

 

This is taking far too long and I need help to reduce the time lag for calculation. Ideally, I would want Query 1 to be frozen when Query 2 is calculating if something like this is possible 🙂 

 

Thanks in advance!

2 ACCEPTED SOLUTIONS

If it were me, I would do testing to pinpoint exactly what steps are adding time to the refresh. Like, how long is the refresh when you just refresh/load Query1? What is the timing of doing Query2 but just the initial load and expand? As you add in each new step, you can gauge the overhead that it's adding and perhaps figure out exactly where you are getting the big performance hits.

 

When I'm tuning queries in Excel, once I identify certain Power Query transformations that are too expensive, if there is not a more efficient way to write it in M, I'll move the transformation from Power Query to a DAX calculated column in the Power Pivot model. As in, rather than load directly from Power Query -> Sheet, you load Power Query -> Power Pivot model -> sheet. See here about loading tables from Power Pivot model, the article refers to these as "reverse linked tables": https://www.sqlbi.com/articles/linkback-tables-in-powerpivot-for-excel-2013/

 

Another thing to consider: avoid merging your project and invoice tables but instead load them into Power Pivot and add a relationship. Make sure that you have a primary key in your project table (probably, project ID - ensure it's unique in project table). Then you can easily* pull over related columns as needed either in the model or in the DAX that you use to define your tables as described in previous paragraph.

 

* or rather, as easy as you are familiar with DAX. If you are new to DAX it's probably a whole other can of worms to get into, but it's necessary in the long term if you are going to use Power BI in desktop or Excel.

View solution in original post

Thanks MarkLaf. I went back and worked on the relationships again and I found your second last paragraph very useful. The calculation used to take 10 mins and by working on the relationships and removing all merging in between invoice and project tables, it now takes around 1m30s 🙂

 

Thank you!

View solution in original post

4 REPLIES 4
MarkLaf
Memorable Member
Memorable Member

  • Use Table.Buffer when referring to the first query from the second ( e.g. Source = Table.Buffer( Query1 ) )
  • Turning off parallel load may help. See: https://blog.crossjoin.co.uk/2019/03/26/power-bi-caching-parallelism-and-power-query-refresh-perform...
  • For your step 2 where you query invoicing off of project IDs, from my understanding you are making an API call for each project. You should look into ways to query in bulk to decrease the number of API calls. E.g.
    • get the start/end of all projects and do one invoice api call for all projects in the min start and max end
    • get all invoice lines with one call, then merge on projects. Even though you are returning more rows, the 1 v thousands of calls to api would probably speed things up
    • even something a bit ham-handed where you do a call with a filter like, id eq project1 or id eq project2 or id eq project3.... for however many you can fit into filter argument before hitting character limits that usually exist with GET
    • Or, similar to above, put the full filter into the body of Web.Contents and convert to POST where you won't have character limits
  • Use dataflows

Hi thank you for your suggestions. Unfortunately this post is mainly for Power query in excel. So dataflows and parallel load won't work here. I tried table.buffer but there was no improvement. Lastly, I have to do the query like this to fetch the API because this is how it is structured at the supplier's end. You have to bring in all projects id with Function/projects and then do an API call for each project ID like this Function/projects/projectid/invoicingdata where project id starts with 1. Any other suggestions are most welcomed.

If it were me, I would do testing to pinpoint exactly what steps are adding time to the refresh. Like, how long is the refresh when you just refresh/load Query1? What is the timing of doing Query2 but just the initial load and expand? As you add in each new step, you can gauge the overhead that it's adding and perhaps figure out exactly where you are getting the big performance hits.

 

When I'm tuning queries in Excel, once I identify certain Power Query transformations that are too expensive, if there is not a more efficient way to write it in M, I'll move the transformation from Power Query to a DAX calculated column in the Power Pivot model. As in, rather than load directly from Power Query -> Sheet, you load Power Query -> Power Pivot model -> sheet. See here about loading tables from Power Pivot model, the article refers to these as "reverse linked tables": https://www.sqlbi.com/articles/linkback-tables-in-powerpivot-for-excel-2013/

 

Another thing to consider: avoid merging your project and invoice tables but instead load them into Power Pivot and add a relationship. Make sure that you have a primary key in your project table (probably, project ID - ensure it's unique in project table). Then you can easily* pull over related columns as needed either in the model or in the DAX that you use to define your tables as described in previous paragraph.

 

* or rather, as easy as you are familiar with DAX. If you are new to DAX it's probably a whole other can of worms to get into, but it's necessary in the long term if you are going to use Power BI in desktop or Excel.

Thanks MarkLaf. I went back and worked on the relationships again and I found your second last paragraph very useful. The calculation used to take 10 mins and by working on the relationships and removing all merging in between invoice and project tables, it now takes around 1m30s 🙂

 

Thank you!

Helpful resources

Announcements
June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors
Top Kudoed Authors