Good morning, everyone!
I’d like to share a situation and ask for your input.
I have a Power BI report that is about 70 MB on disk and about 200 MB in memory (measured with DAX Studio). It imports roughly 3 million rows via API from my ERP (98% of the data) and SharePoint (2%), and its daily refresh takes about 8 minutes. The model follows best practices: star schema, no unnecessary calculated columns, no automatic date hierarchies, and so on.
To optimize refresh time, I implemented incremental refresh on 8 tables (the heaviest of the 60 in the model), storing 2 years of history and refreshing only the last month. Since the source is API-based, I followed the strategy from the SQL Dusty blog (https://sqldusty.com/2020/05/20/power-bi-incremental-refresh-with-web-api-data-source/), using an internal SQL database with a calendar table to ensure query folding with the RangeStart and RangeEnd parameters.
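(For anyone reproducing this: RangeStart and RangeEnd are the two datetime parameters incremental refresh requires, each defined as its own query. A minimal sketch; the literal dates are placeholders that the service replaces with each partition's window at refresh time.)
// RangeStart — defined as its own parameter query
#datetime(2023, 1, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]
// RangeEnd — same pattern with a later placeholder date
#datetime(2025, 1, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]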
Result:
First load (full): 1h20min (expected).
Subsequent incremental refreshes: between 10 and 20 minutes — i.e., slower than the original full load (7–8 min).
I was confused. To validate, I created a new .pbix file with just the same 8 tables and the same incremental logic. Result:
First load: 5 minutes
Subsequent refreshes: 1 minute
In other words, incremental refresh works. But in the full report, with the same 8 tables configured identically, performance is worse. The new model, with incremental enabled, was even leaner (another signal that incremental was correctly configured):
Disk size: 40MB
Memory: 99MB
Data volume: 1.1 million rows (vs 3 million in the previous one)
Still, the refresh time increased.
I can’t figure out why: is it because of the API? Is it the number of simultaneous calls overloading something? Why does the version with incremental refresh, less data and less memory, take longer than the full-load version?
Could it be due to API overhead (many small calls instead of one big one)?
Here’s an example of the function and M code, with names masked for security (all functions for the 8 tables were built following the same logic below):
// Fetches one day of data; start and end date are the same value
(DataInicio as text) =>
let
    Fonte = Json.Document(
        Web.Contents(
            "URLDAMINHAAPI",
            [
                RelativePath = "RELATIVEPATHDAMINHAAPI",
                Query = [ parameters = "DATAINICIO_D=" & DataInicio & ";DATAFINAL_D=" & DataInicio ]
            ]
        )
    )
in
    Fonte
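For a single day, the call looks like this (illustrative date; the result is the parsed JSON, a list of records in my case):
Resultado = fx_fPartida("2024-01-15")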
Query using SQL Calendar + function
let
    Fonte = Sql.Database("TEST", "TEST"),
    dbo_Calendario = Fonte{[Schema = "dbo", Item = "Calendario"]}[Data],
    #"Renamed Columns1" = Table.RenameColumns(dbo_Calendario, {{"Data", "Data_Calendario"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns1", {{"Data_Calendario", type datetime}}),
    // This filter folds to SQL, so only the incremental window's dates come back
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Data_Calendario] >= RangeStart and [Data_Calendario] < RangeEnd),
    // One API call per calendar row; [DateText] is presumably a text-formatted
    // version of Data_Calendario (the conversion step appears to have been masked)
    #"Invoked Custom Function" = Table.AddColumn(#"Filtered Rows", "fx_fPartida", each fx_fPartida([DateText])),
    #"Expanded fx_fPartida" = Table.ExpandListColumn(#"Invoked Custom Function", "fx_fPartida"),
    #"Expanded fx_fPartida1" = Table.ExpandRecordColumn(#"Expanded fx_fPartida", "fx_fPartida",
        {"TEST", "COLI-TEST", "COLI-TEST", "DATA", "TEST", "COLI-TEST", "COLI-TEST"},
        {"TEST", "COLI-TEST", "COLI-TEST", "DATA", "TEST", "COLI-TEST", "COLI-TEST"}
    ),
    #"Changed Type2" = Table.TransformColumnTypes(#"Expanded fx_fPartida1", {
        {"TEST", type text}, {"COLI-TEST", type text}, {"COLI-TEST", type text},
        {"DATA", type datetimezone}, {"TEST", type text},
        {"COLI-TEST", type text}, {"COLI-TEST", type text}
    }),
    #"Changed Type3" = Table.TransformColumnTypes(#"Changed Type2", {{"DATA", type date}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type3", each [TEST] <> null)
in
    #"Filtered Rows1"
Has anyone gone through a similar situation?
Is it expected that incremental refresh via API, even with query folding, results in more latency due to the multiple calls?
Is it really worth using incremental refresh with APIs, or does it only make sense with relational databases?
I appreciate any insight in advance.
Hi @PedroModa,
Just following up to see if the solutions provided by community members were helpful in addressing the issue.
If one of the responses helped resolve your query, please consider marking it as the Accepted Solution. Feel free to reach out if you need any further clarification or assistance.
Best regards,
Prasanna Kumar
Hi @PedroModa,
Power BI won’t refresh a dataset in the service when Web.Contents builds its URL by concatenating text at runtime, because the service can’t determine in advance which addresses the query might access; it flags the query as a dynamic data source.
To fix this, separate the static part of the URL (the base) from the parts that change (like dates or filters) using the RelativePath and Query options of Web.Contents, as shown below.
(DataInicio as text, DataFim as text) =>
let
    Fonte = Json.Document(
        Web.Contents(
            "https://api.example.com",   // Static base URL
            [
                RelativePath = "endpoint",   // Static relative path
                Query = [
                    DATAINICIO_D = DataInicio,
                    DATAFINAL_D = DataFim
                ],
                Headers = [
                    #"Authorization" = "Bearer YOUR_TOKEN"   // Or whatever is required
                ]
            ]
        )
    )
in
    Fonte
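Assuming you save this as your existing fx_fPartida function, calling it looks like this. Web.Contents assembles and escapes the final URL itself, while the service validates only the static base (dates are examples):
// Resolves to https://api.example.com/endpoint?DATAINICIO_D=2025-01-01&DATAFINAL_D=2025-01-31
Resultado = fx_fPartida("2025-01-01", "2025-01-31")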
@DataNinja777, how are you?
I discovered that the earlier error came from how I built my function. When I separated the relative path from the base URL in Web.Contents, the API tried to authenticate against the base URL alone and failed, so no data was returned once I published to the service. I had to combine the URL and the relative path into a single string for authentication to work; my API requires the parameters in that form. With that error resolved by assembling the URL as one string, my function now looks like this:
(DataInicio as text, DataFim as text) =>
let
    FullUrl = "MYAPI/RELATIVEPATH/?parameters=DATAINICIO_D=" & DataInicio & ";DATAFINAL_D=" & DataFim,
    Fonte = Json.Document(Web.Contents(FullUrl))
in
    Fonte
However, now when I publish, this error appears:
This dataset includes a dynamic data source. Since dynamic data sources aren't refreshed in the Power BI service, this dataset won't be refreshed. Learn more: https://aka.ms/dynamic-data-sources. Data source for Query1
Do you know how to help me with this?
On the desktop it refreshes normally; the error only appears after publishing.
@DataNinja777 Could you help me, please?
I created the query as you mentioned, but it's giving me an error I can't resolve. It says the DATA_CRIACAO column was not found, yet the column is in the table, so I'm confused.
This is my function:
(DataInicio as text, DataFim as text) =>
let
    Fonte = Json.Document(
        Web.Contents(
            "URLDAMINHAAPI",
            [
                RelativePath = "RELATIVEPATHDAMINHAAPI",
                Query = [
                    parameters = "DATAINICIO_D=" & DataInicio & ";DATAFINAL_D=" & DataFim
                ]
            ]
        )
    )
in
    Fonte
This is my M query:
let
    StartDateText = DateTime.ToText(RangeStart, "yyyy-MM-dd"),
    EndDateText = DateTime.ToText(RangeEnd, "yyyy-MM-dd"),
    Fonte = fx_fPartida(StartDateText, EndDateText),
    #"Convertido em Tabela" = Table.FromRecords(Fonte),
    #"Tipo Alterado1" = Table.TransformColumnTypes(#"Convertido em Tabela", {{"DATA_CRIACAO", type datetimezone}}),
    #"Tipo Alterado" = Table.TransformColumnTypes(#"Tipo Alterado1", {
        {"CODCOLIGADA", type text},
        {"COLI-CODLOTE", type text},
        {"COLI-IDPARTIDA", type text},
        {"DATA", type datetimezone},
        {"IDPARTIDA", type text},
        {"COLI-CREDITO", type text},
        {"COLI-DEBITO", type text},
        {"DATA_CRIACAO", type datetime}
    }),
    #"DATA como Date" = Table.TransformColumnTypes(#"Tipo Alterado", {{"DATA", type date}}),
    #"Linhas Válidas" = Table.SelectRows(#"DATA como Date", each [IDPARTIDA] <> null)
in
    #"Linhas Válidas"
Hi @PedroModa ,
Of course, I can help with that. The "DATA_CRIACAO column was not found" error is almost certainly happening because the previous step in your query is producing an empty table. This occurs when your API call doesn't return any data, and the most likely culprit is a small formatting error in how you pass the parameters to the API within your custom function.
The Web.Contents function in Power Query is particular about how it receives query parameters. It expects the Query option to be a record where each field is a separate parameter, like [ParameterName1 = Value1, ParameterName2 = Value2]. Your function, however, combines everything into a single text string assigned to a parameter named parameters, which the API likely doesn't recognize.
Your current function is likely structured like this, which is causing the issue:
Query = [
    parameters = "DATAINICIO_D=" & DataInicio & ";DATAFINAL_D=" & DataFim
]
To fix this, you need to modify the function to pass the parameters as a proper record. This change ensures that the API receives two distinct parameters, DATAINICIO_D and DATAFINAL_D, which is the standard and expected format.
Here is the corrected version of your function:
(DataInicio as text, DataFim as text) =>
let
    Fonte = Json.Document(
        Web.Contents(
            "URLDAMINHAAPI",
            [
                RelativePath = "RELATIVEPATHDAMINHAAPI",
                Query = [
                    DATAINICIO_D = DataInicio,
                    DATAFINAL_D = DataFim
                ]
            ]
        )
    )
in
    Fonte
After updating your fx_fPartida function with this corrected code, you can debug the issue in your main query. In the Power Query Editor, select the Fonte step in the "Applied Steps" pane. Before the fix, you would likely see an empty list []. After the fix, you should see a List of Records. You can click on one of the records to inspect the fields and confirm that the DATA_CRIACAO column is now present. Once Fonte returns the correct data, all subsequent steps will execute without error.
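If Fonte still comes back empty after the fix, one possibility is that the API wraps its results in an envelope. A small defensive variant of the conversion step can handle both shapes (a sketch; the "data" field name is hypothetical):
// Drop-in replacement for the Table.FromRecords step inside the let:
// tolerate either a bare list of records or an envelope record like [data = {...}]
Registros = if Fonte is list then Fonte else Record.FieldOrDefault(Fonte, "data", {}),
#"Convertido em Tabela" = Table.FromRecords(Registros),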
Your main query will remain structurally the same, but it will now work as intended with the corrected function call.
// Main query
let
    // Power BI supplies RangeStart and RangeEnd automatically during incremental refresh
    StartDateText = DateTime.ToText(RangeStart, "yyyy-MM-dd"),
    EndDateText = DateTime.ToText(RangeEnd, "yyyy-MM-dd"),
    // Call the corrected function
    Fonte = fx_fPartida(StartDateText, EndDateText),
    // This step will now create a table with the correct columns
    #"Convertido em Tabela" = Table.FromRecords(Fonte),
    // This step will now find the 'DATA_CRIACAO' column
    #"Tipo Alterado" = Table.TransformColumnTypes(#"Convertido em Tabela", {
        {"CODCOLIGADA", type text},
        {"COLI-CODLOTE", type text},
        {"COLI-IDPARTIDA", type text},
        {"DATA", type datetimezone},
        {"IDPARTIDA", type text},
        {"COLI-CREDITO", type text},
        {"COLI-DEBITO", type text},
        {"DATA_CRIACAO", type datetime}
    }),
    #"DATA como Date" = Table.TransformColumnTypes(#"Tipo Alterado", {{"DATA", type date}}),
    #"Linhas Válidas" = Table.SelectRows(#"DATA como Date", each [IDPARTIDA] <> null)
in
    #"Linhas Válidas"
When I call the function, I need to supply StartDateText and EndDateText, don't I? How would I do that in this case?
My friend, thank you very much for your reply. I understood it well, but now I have other questions. My API supports a date range, so I believe I can pass StartDateText and EndDateText. But you say this would eliminate the need for the SQL calendar and the looping function. Are you saying that, with this approach, I wouldn't need the calendar table in SQL to ensure query folding? That's my first question.
The second question: if I don't need the SQL calendar, where would I "apply" the RangeStart and RangeEnd parameters? Shouldn't they be applied to a date column?
Could you (if possible) demonstrate what my M query would look like where I invoke the function (the one you provided)?
Hi @PedroModa ,
Yes, your suspicion is correct. The slower refresh time is almost certainly caused by the overhead of making many small API calls instead of a few large ones. Your original refresh made roughly 8 large calls, while the incremental method makes around 240 small ones (8 tables × 30 days). Each call carries a fixed cost for connection setup, authentication, and network latency; at, say, a purely illustrative 3 seconds of overhead per call, 240 calls already add up to about 12 minutes before any data is transferred, which lines up with the 10-20 minute refreshes you are seeing. Multiplying that fixed overhead 240 times creates a new bottleneck that is slower than downloading the larger dataset in a few calls.
This behavior stems directly from your Power Query logic. The combination of your SQL calendar table and this specific step forces a sequential loop:
#"Invoked Custom Function" = Table.AddColumn(
#"Filtered Rows",
"fx_fPartida",
each fx_fPartida([DateText])
)
This code iterates through every day in your refresh period and runs your custom function, which makes a separate Web.Contents call for each individual day. While you achieved query folding on the SQL side to get the list of dates, that folding stops there. The subsequent API calls are executed one by one, which is highly inefficient.
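Incidentally, the SQL calendar is not even required to produce that loop; the same per-day pattern can be written in pure M, which shows that the bottleneck is the calls themselves, not the SQL hop. A sketch, assuming the original single-date fx_fPartida that returns a list of records:
let
    // Build the date list in M instead of SQL — still one API call per day
    Dias = List.Dates(Date.From(RangeStart), Duration.Days(RangeEnd - RangeStart), #duration(1, 0, 0, 0)),
    DiasTexto = List.Transform(Dias, each Date.ToText(_, "yyyy-MM-dd")),
    Paginas = List.Transform(DiasTexto, fx_fPartida),
    // Flatten the per-day lists of records into a single list
    Combinado = List.Combine(Paginas)
in
    Combinado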
Your isolated test file was faster because it didn't have to compete for resources with the other 52 tables in your full report. To fix the issue in the main report, you must reduce the number of API calls. The best way to do this is to make a single API call that fetches all the data for the entire incremental period, from RangeStart to RangeEnd.
The ideal solution is to have the API endpoint modified to accept a start and end date. This would eliminate the need for the SQL calendar and the looping function. Your entire query for each table could be simplified to this:
let
    // Convert the incremental refresh parameters to the API's text format
    StartDateText = DateTime.ToText(RangeStart, "yyyy-MM-dd"),
    EndDateText = DateTime.ToText(RangeEnd, "yyyy-MM-dd"),
    // One call fetches the whole incremental window
    Source = Json.Document(
        Web.Contents(
            "URLDAMINHAAPI",
            [
                RelativePath = "RELATIVEPATHDAMINHAAPI",
                Query = [
                    DATAINICIO_D = StartDateText,
                    DATAFINAL_D = EndDateText
                ]
            ]
        )
    )
    // ... rest of your transformations
in
    Source
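On your question about where RangeStart and RangeEnd get "applied": in this pattern the API itself enforces the date window, and referencing the two parameters in the query is what lets the service substitute each partition's own window at refresh time. If you want a guarantee that partitions never overlap, for example if the API treats both bounds as inclusive, you can add a defensive filter after your type-conversion steps (a sketch reusing the #"DATA como Date" step from your earlier query):
// Optional guard: keep only rows strictly inside the partition window
#"Filtrado por Janela" = Table.SelectRows(
    #"DATA como Date",
    each [DATA] >= Date.From(RangeStart) and [DATA] < Date.From(RangeEnd)
)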
To answer your final questions: Yes, this slowdown is expected with your current implementation that makes a call per day. And yes, incremental refresh is absolutely worth it with APIs, but only if the API is designed to support it by accepting date range parameters in a single query. Your experience perfectly illustrates the performance penalty when the API cannot handle this efficiently.
Best regards,