Hi all,
I have a big table (1.5M rows) that I want to query through a dataflow, and I only want to refresh new rows. The table has a column called "end" that holds the row number of the last row pulled from the source database, and I want to modify the dataflow's web API query URL by adding skip=(max value of the "end" column) so that each refresh only queries new rows.
The reason I have to do it this way is that Power Query times out after 10 minutes, and I can only pull circa 100k rows before that happens, so I need to be able to change the API call to pull the correct 100k rows each time.
Any ideas on how best to do this?
Hi @Devrij ,
Thanks for @lbendlin 's reply.
Based on your description, you need to determine the maximum value of the "end" column from the already-loaded data before building the API call, which Power Query's List.Max function handles, as you have correctly identified. Make sure this step runs before every API call so you pick up the latest "end" value, and then use that maximum to construct the API URL dynamically. Your Power Query expression appears to be correct; just make sure that List.Max references the "end" column of the loaded data.
The end result can look like this:
let
    // Highest row number pulled so far; YourPreviousQueryName is the
    // query/entity holding the rows already loaded.
    MaxEndValue = List.Max(YourPreviousQueryName[end]),
    // Append that maximum as the skip parameter so only new rows are requested.
    APIUrl = "https://xxxxxxx?api_key=xxxxxxx&dbs=xxxxx&collection=xxxxxxx&limit=100000&skip=" & Text.From(MaxEndValue),
    Source = Json.Document(Web.Contents(APIUrl))
in
    Source
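One caveat worth noting: when the URL passed to Web.Contents is built entirely from a dynamic string, the service often cannot validate the data source, and scheduled refresh of the dataflow may fail. A common workaround is to keep the base URL static and move the changing parts into the Query option of Web.Contents. A sketch of that variant, reusing the placeholder host and parameter names from above:
let
    // Same call as above, but the base URL stays static so the service
    // can still validate the connection for scheduled refresh.
    MaxEndValue = List.Max(YourPreviousQueryName[end]),
    Source = Json.Document(
        Web.Contents(
            "https://xxxxxxx",
            [Query = [
                api_key = "xxxxxxx",
                dbs = "xxxxx",
                collection = "xxxxxxx",
                limit = "100000",
                skip = Text.From(MaxEndValue)
            ]]
        )
    )
in
    Source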
Best regards,
Albert He
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Not directly possible. Power BI Incremental Refresh is exclusively based on datetime partition boundaries. You can create your own partitions with different boundary rules but that would require you to re-invent all the partition management processes.
Does your source table have a suitable date or datetime column that you can use instead?
If your data is immutable, you can also consider running your own extract process based on the logic you described, and placing the results in a SharePoint folder so Power BI can pick the extracts up from there.
1.5M rows is not big, and does not normally warrant incremental refresh. Your mentioned timeout is a valid reason, however.
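If you do go the extract-to-SharePoint route, recombining the extracts in Power Query is straightforward. A minimal sketch, assuming CSV extracts and placeholder site and folder names:
let
    // List every file on the site (placeholder URL), then keep only the
    // CSV extracts sitting in the target folder.
    AllFiles = SharePoint.Files("https://yourtenant.sharepoint.com/sites/YourSite", [ApiVersion = 15]),
    Extracts = Table.SelectRows(
        AllFiles,
        each Text.Contains([Folder Path], "/extracts/") and [Extension] = ".csv"
    ),
    // Parse each extract, then append them all into one table.
    Parsed = Table.AddColumn(
        Extracts,
        "Data",
        each Table.PromoteHeaders(Csv.Document([Content], [Delimiter = ",", Encoding = 65001]))
    ),
    Combined = Table.Combine(Parsed[Data])
in
    Combined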
Thanks for the reply. I do have a datetime column, but what I'm struggling with is inserting that into the web API URL to modify the actual query sent to the source. All the incremental query seems to do is filter the results of the source query, and since I can't query all of the rows in one go without it timing out, I'm not sure that will work for us. Whereas it's a lot easier to just say "skip this many rows and pull the next 100k". As you say, the timeout is the only reason I need it to be incremental, although we are also trying to be frugal with compute usage.
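For what it's worth, if the API can filter by that datetime column, the incremental refresh parameters can be pushed into the request itself rather than filtering after download, so each partition only pulls its own slice. A sketch, assuming RangeStart/RangeEnd are configured as the usual incremental refresh parameters and that the API takes hypothetical from/to query parameters in this format:
let
    // RangeStart/RangeEnd are the datetime parameters Power BI supplies
    // per partition when incremental refresh is configured. The from/to
    // parameter names and the date format are assumptions about the API.
    FromText = DateTime.ToText(RangeStart, "yyyy-MM-ddTHH:mm:ss"),
    ToText = DateTime.ToText(RangeEnd, "yyyy-MM-ddTHH:mm:ss"),
    Source = Json.Document(
        Web.Contents(
            "https://xxxxxxx",
            [Query = [
                api_key = "xxxxxxx",
                dbs = "xxxxx",
                collection = "xxxxxxx",
                from = FromText,
                to = ToText
            ]]
        )
    )
in
    Source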