A search in the frontend application returned 21,891 records across 219 pages. We are trying to pull this data into Power Query by calling the API with a POST request.
Power Query - 3 queries
Tool Utilization Query - below is the code I'm using to return the results, and it's returning only the first 100 records. How do I loop this code to get the complete 21,891 records from all 219 pages?
let
    apiUrl = "https://abc.net/dfd/v1/unrestricted-context-search?userId=rtr",
    bearerToken = getbearerToken(),
    // requestBody is defined elsewhere in the query (omitted here)
    options = [
        Content = Text.ToBinary(requestBody),
        Headers = [
            #"Content-Type" = "application/json",
            #"Authorization" = "Bearer " & bearerToken
        ]
    ],
    apiResponse = Web.Contents(apiUrl, options),
    #"Imported JSON" = Json.Document(apiResponse, 1252),
    docs = #"Imported JSON"[docs]
    // ... further steps that convert/expand "docs" into a table were cut off here ...
in
    docs
Let's assume that for a particular search my API returns 60,000 records, and each API call will only ever return 1,000 records. Due to the sequential pagination logic, it takes a long time to load the entire data set (in the attached example I can read the complete data). With 60,000 records and 1,000 records per call, the total number of pages is 60,000 / 1,000 = 60. If each 1,000 records takes 4 seconds to load, it will take 240 seconds to load all the data. How can the data load be made faster through a parallel mechanism?
Hi @hanuraolm ,
I see the problem.
There is also a similar post regarding this: https://community.powerbi.com/t5/Power-Query/Power-Query-Parallel-or-Concurrent-REST-Invocation/td-p...
Unfortunately it states there that "Parallel operations are not permitted within a single query" - not sure if that is still the case.
I would still try the potential solution below (adding a column with the starts and a second column to get the data), BUT if that doesn't work, then, as also suggested by the OP of the question above, I'd create an intermediary (a local Python script, custom software hosted in a service, etc.) that makes the requests in parallel and exposes the results to Power Query somehow.
Below is the potential solution 😊
Given that you know the start of all pages immediately after you run the first request, one option could be to FIRST create a column with the start of all pages, like this:
and THEN add a column that invokes getPageData using the above start column instead of List.Generate - maybe this will help:
The code to do this could look like:
let
    // ... your code ...
    maxPages = Number.RoundUp(totalRecords / 1000),
    pagesStartList = List.Generate(
        () => 0,                                // initial page index
        (lastPage) => lastPage < maxPages,      // condition
        (lastPage) => lastPage + 1,             // next page index
        (lastPage) => lastPage * 1000           // selector: the "start" of each page
    ),
    pagesStartTable = Table.FromColumns({pagesStartList}, {"start"}),
    // Define a function to retrieve data for a single page
    getPageData = (start as number) =>
        let
            // ... your code ...
            data = ... // the page of data returned for this "start"
        in
            data,
    #"Added Custom" = Table.AddColumn(pagesStartTable, "pageData", each getPageData([start]))
in
    #"Added Custom"
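For illustration only, a minimal, untested sketch of what getPageData could look like against the API in this thread - apiUrl, headers and the request-body fields are assumptions based on the requests posted above:

getPageData = (startRow as number) =>
    let
        requestBodyRecord = [
            query = "... your query ...",       // same query string as your first request
            fields = {},
            count = true,
            start = startRow,                   // the "start" value from the column above
            limit = 1000,
            sort = [fieldname = "end_time", order = "desc"]
        ],
        options = [
            Content = Json.FromValue(requestBodyRecord),
            Headers = headers                   // the same headers record used elsewhere in the query
        ],
        apiResponse = Json.Document(Web.Contents(apiUrl, options)),
        data = apiResponse[docs]
    in
        data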
I haven't tested it, but please do tell us if it worked AND also mark this as the ANSWER if it helped.
Hi
Looping is done using List.Generate.
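To illustrate the pattern, a minimal, untested sketch of List.Generate paging - getPage is a hypothetical function that returns a page of records for a given offset, or null when there are none left:

pages = List.Generate(
    () => [offset = 0, page = getPage(0)],             // initial state
    (state) => state[page] <> null,                    // continue while a page came back
    (state) => [offset = state[offset] + 100,
                page = getPage(state[offset] + 100)],  // request the next page
    (state) => state[page]                             // selector: emit just the page
),
allRecords = List.Combine(pages)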
I recently answered some questions related to paging:
If you cannot manage to fix it using the above links, please reply and I'll have a look at your code.
Also, it will be important to know where the next page token/number is in the API response and how to request the next page.
Please mark this as answer if it helped.
I can see Authorization: Bearer; kvAccessToken is the same for all the pages.
When I click on the next page --
{"query":"((end_time ge '2023-01-18T05:28:31Z' and start_time le '2023-02-17T05:28:31Z') and (file_type_name eq 'datalog'))","fields":[],"count":true,"start":100,"limit":100,"sort":{"fieldname":"end_time","order":"desc"}}
Next page ----
{"query":"((end_time ge '2023-01-18T05:30:02Z' and start_time le '2023-02-17T05:30:02Z') and (file_type_name eq 'datalog'))","fields":[],"count":true,"start":200,"limit":100,"sort":{"fieldname":"end_time","order":"desc"}}
It seems "start" changes every time, while "limit":100 stays the same.
These dates are the same for all pages, because I chose the last 30 days in the application:
end_time ge '2023-01-18T05:28:31Z' and start_time le '2023-02-17T05:28:31Z'
Your help in sharing the code structure would be greatly appreciated.
Hi,
It should be something like below - it assumes that we don't know the number of available pages (do we know from the response how many pages are left?) and therefore keeps incrementing "start = start + limit" until we get an error.
Hopefully your server will raise an error if we request a start that doesn't exist; otherwise it won't stop requesting.
If below doesn't work, please get back here and we'll finish it.
Please mark this as answer if it helped.
let
    apiUrl = "http://...",
    bearerToken = getbearerToken(),
    // renamed from "getkvAccessToken" so the binding doesn't refer to itself
    kvAccessToken = getkvAccessToken(),
    headers = [
        #"Content-Type" = "application/json",
        #"Authorization" = "Bearer " & bearerToken,
        #"clientAppName" = "Voyager",
        #"clientFeatureName" = "Search",
        #"kvAccessToken" = kvAccessToken,
        #"Ocp-Apim-Subscription-Key" = "ec424b08a1a14b7b97e2e9f6138767be"
    ],
    // renamed from "limit" so the record field below can reference it
    // without a cyclic reference
    pageLimit = 100,
    runNextRequest = (lastRequest) =>
        let
            getResponse = (startRow) =>
                let
                    requestBodyRecord = [
                        query = "((end_time ge '2022-12-31T12:01:33Z' and start_time le '2023-01-30T12:01:33Z') and search.ismatch('/.*datalog.*/', 'file_type_name', 'full', 'any'))",
                        fields = {},
                        count = true,
                        start = startRow,
                        limit = pageLimit,
                        sort = [
                            fieldname = "end_time",
                            order = "desc"
                        ]
                    ],
                    options = [
                        Content = Json.FromValue(requestBodyRecord),
                        Headers = headers
                    ],
                    apiResponse = Json.Document(Web.Contents(apiUrl, options))
                in
                    apiResponse,
            nextStart = if lastRequest = null then 0 else lastRequest[start] + pageLimit,
            pageResponse = getResponse(nextStart),
            hasError = (try pageResponse)[HasError],
            currentRequest =
                // in the absence of anything in the response to tell us when to stop,
                // we will continue requesting as long as we don't get an error
                if hasError = false then [
                    response = pageResponse,
                    start = nextStart
                ] else null
        in
            currentRequest,
    allPages = List.Generate(
        // initial
        () => runNextRequest(null),
        // condition
        (requestResponse) => requestResponse <> null,
        // next
        (requestResponse) => runNextRequest(requestResponse),
        // selector
        // TODO: add [docs] at the end of the line below to extract just the docs
        (requestResponse) => requestResponse[response]
    )
in
    allPages
Thanks for your inputs. I tried the code given above, but it loads very slowly, page by page.
I also tried the ChatGPT code below, and I'm getting repeated data (the first 100 records) until the loop completes. Can somebody please help me capture all the data?
let
    apiUrl = "https://sample url/username",
    bearerToken = getbearerToken(),
    //startDateTime = DateTime.ToText(DateTime.LocalNow(), "yyyy-MM-ddThh:mm:ssZ"),
    //endDateTime = DateTime.ToText(DateTime.LocalNow() - #duration(60, 0, 0, 0), "yyyy-MM-ddThh:mm:ssZ"),
    startDateTime = "2023-02-21T07:19:22Z",
    endDateTime = "2023-01-22T07:19:22Z",
    // "#start#" is a placeholder token that is swapped for the real start row below
    requestBody = "{""query"":""((end_time ge '"
        & endDateTime
        & "' and start_time le '"
        & startDateTime
        & "') and (file_type_name eq 'datalog'))"",""fields"":[],""count"":true,""start"":#start#,""limit"":100,""sort"":{""fieldname"":""end_time"",""order"":""desc""}}",
    options = [
        Headers = [
            #"Content-Type" = "application/json",
            #"Authorization" = "Bearer " & bearerToken,
            #"kvAccessToken" = getkvAccessToken()
        ]
    ],
    // NOTE: this first request still contains the literal "#start#" token
    apiResponse = Web.Contents(apiUrl, options & [Content = Text.ToBinary(requestBody)]),
    #"Imported JSON" = Json.Document(apiResponse, 1252),
    docs = #"Imported JSON"[docs],
    totalDocs = #"Imported JSON"[numFound],
    allDocs = List.Generate(
        () => [start = 0, accumulatedDocs = {}],
        each [start] < totalDocs,
        // accumulatedDocs is never read downstream; combinedDocs below uses the selector output instead
        each [start = [start] + 100, accumulatedDocs = List.Combine({[accumulatedDocs], [docs]})],
        each
            let
                optionsWithContent = Record.AddField(
                    options,
                    "Content",
                    Text.ToBinary(Text.Replace(requestBody, "#start#", Text.From([start])))
                ),
                response = Web.Contents(apiUrl, optionsWithContent),
                json = Json.Document(response, 1252)
            in
                [docs = json[docs]]
    ),
    combinedDocs = List.Combine(List.Transform(allDocs, each _[docs])),
    #"Converted to Table" = Table.FromList(combinedDocs, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"id", "start_time"}, {"id", "start_time"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Column1", each ([id] = "0055c4e0-027d-4883-b5cc-6ec9d4731fcf"))
in
    #"Filtered Rows"
Hi,
When you're trying to build on a not-so-solid foundation, and the building collapses, IMO you need to go back to making the foundation solid.
In our case there were some missing pieces of the puzzle (e.g. how to get the total number of pages); we gave it a shot and it didn't work -> so in order for me to continue helping, I'll need you to please provide some more clarity so that we have a solid foundation.
More than this, you also tried and did not succeed with the ChatGPT solution -> 2 x collapses.
One of the most solid foundations for connecting to an API is Postman/curl -> using a browser is NOT necessarily a solid foundation (seeing you write "when I click on the next page" leads me to believe you're testing in a browser, right?).
It should take you ~30 mins (or maybe less) to get Postman set up and connected to the API.
After you have Postman set up, my advice is to make a smaller query (e.g. 3 "pages") and try to get all of them.
An important missing piece of information for me is how Power Query will know when there are no more pages.
E.g. if you have 3 pages (aka start=0, start=100 and start=200) and you request the 4th page (aka start=300), will the API raise an error, OR is there something in the response JSON that tells us the number of pages?
Also, I'd like to see what the response JSON from Postman looks like - if you post it here, DON'T forget to remove confidential information.
So if you're willing to invest in providing a solid foundation, I'm willing to invest in replying.
P.S.: looking at the ChatGPT solution I see you have "numFound" in the response -> which you didn't specify in the initial question. 😉
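If numFound really is the total record count, a minimal, untested sketch of a count-based stop could look like the below - getPage is a hypothetical function (my name, not from your code) that POSTs the request body with the given start and returns the "docs" list of that page:

let
    pageLimit = 100,
    totalDocs = 21891,   // in practice, read this from numFound in the first response
    // the "start" of every page: 0, 100, 200, ...
    starts = List.Numbers(0, Number.RoundUp(totalDocs / pageLimit), pageLimit),
    allDocs = List.Combine(List.Transform(starts, (s) => getPage(s)))
in
    allDocs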