Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now

Reply
jdusek92
Helper IV
Helper IV

Help with recursive web power query (oDATA, SuccessFactors)

Hello,

I am connecting to oDATA SuccessFactors and I have encountered the following challenge:

 

I can get only 1000 rows in one JSON file (default limitation). With these 1000 records, I also get a URL with a skip token to get the second 1000, and so on up to the last thousand of records.

 

Step 1:jdusek92_0-1622575476097.png

Step 2:jdusek92_3-1622575643719.png

 

Step 3:jdusek92_4-1622575693591.png

 

 

The last Skiptoken URL return only a List of records without "__Next" with Skiptoken URL:

jdusek92_5-1622575873553.png

 

I would like to ask for help with a recursive function - this function will take the original URL and iterate through the original URL and all subsequent URLs with SkipTokens.

The result should be a list of lists of records that I can easily work with in the next steps.

 

My previous attempt used List.Generate to firstly get the list of all SkipToken URLS. Then I opened all these URLS - this means that I had to download all the JSON files twice = twice the download time. Also, I am not sure if the SkipToken URLs are still reliable when loaded for the second time.

 

 

Thank you

Jakub

 

2 ACCEPTED SOLUTIONS
mahoneypat
Microsoft Employee
Microsoft Employee

Here is an example on how to do this with List.Generate using a test REST API.  I adapted the approach described well in this article.  You can paste the code into a blank query to step through it. 

 

List.Generate() and Looping in PowerQuery - Exceed

 

let
    fn = (pagenumber) => Json.Document(Web.Contents("https://api.instantwebtools.net/v1/passenger?page=" & pagenumber & "&size=1000")),
    mylist = List.Generate(()=> [Page = 1, Result = fn(Number.ToText(1))], each List.Count([Result][data])>0, each [Page = [Page] + 1, Result = fn(Number.ToText([Page]+1))]),
    #"Converted to Table" = Table.FromList(mylist, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Page", "Result"}, {"Page", "Result"}),
    #"Expanded Result" = Table.ExpandRecordColumn(#"Expanded Column1", "Result", {"data"}, {"data"}),
    #"Expanded data" = Table.ExpandListColumn(#"Expanded Result", "data"),
    #"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"_id", "name", "trips", "airline", "__v"}, {"_id", "name", "trips", "airline", "__v"})
in
    #"Expanded data1"

 

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


View solution in original post

Hello Pat, thank you very much for your help!

 

You inspired me and I was able to contruct my own solution. It seems very easy at the first sight and also works great for my scenario:

 

jdusek92_0-1622759931868.png

This returns a list of lists of records that I can easily expand and work with.

 

It works great, loads data - but maybe looks too simple when compared with other examples of recursive functions I have been going through.

Could you please have a look at my code if you have any possible concerns about it, because I can't believe this short code actually does tone of work.

 

Thanks, Jakub!

View solution in original post

14 REPLIES 14
robindk
Frequent Visitor

Pickup up on this tread again if someone still struggles.

Below is my current function that also works in Power Query Online - this is using Token Bearer (OAuth) for authentication, so you need to feed that Token Bearer into the Query. I use a power automate flow to store the bearer token - and just read the token from that into the Power Query. I generally recommend to use paging = cursor as part of the RelPath.

Example if an input to RelPath parameter would be: EmpJob?paging=cursor


(RelPath as text) =>
    let
    Token = Authentication{0}[Value],  
    // Explanation: Fetches the first token (Bearer Token) from the "Authentication" table or list. This token will be used for authorization when making API requests.
    
    baseurl = "https://api2.successfactors.eu/odata/v2/",
    // Explanation: Base URL for the SuccessFactors API. This is the constant part of the URL used in each API request.

    RelPath = RelPath,
    // Explanation: Takes the Relative Path as an argument passed into the function. This is the endpoint appended to the base URL - example "EmpJob"

    initReq  = Json.Document(Web.Contents(baseurl, [Headers=[#"Accept"="application/json", Authorization=Token],RelativePath=RelPath]))[d],
    // Explanation: Makes an initial API request to the specified relative path using the base URL. The request uses the token for authorization, and expects the response in JSON format. The `[d]` part selects the relevant data section from the JSON.

    nextUrl = Text.AfterDelimiter(initReq[__next],"v2/"),
    // Explanation: Retrieves the URL for the next page of data, if available. The `__next` field holds the URL for the next page. The `Text.AfterDelimiter` removes the base part of the URL, leaving just the path starting after "/v2/".

    initValue = initReq[results],
    // Explanation: Extracts the initial set of data (records) from the `results` field of the JSON response.

    gather = (data as list, url) =>
    let
        baseurl = "https://api2.successfactors.eu/odata/v2/",
        // Explanation: Reuses the base URL for the next API request.

        newReq = Json.Document(Web.Contents(baseurl, [Headers=[#"Accept"="application/json", Authorization=Token],RelativePath=url]))[d],
        // Explanation: Makes another API request for the next page of data using the relative path stored in `url`.

        newNextUrl = Text.AfterDelimiter(newReq[__next],"v2/"),
        // Explanation: Extracts the next URL for pagination if there are more pages of data to retrieve.

        newData = newReq[results],
        // Explanation: Retrieves the data (records) from the `results` field of the JSON response for the next page.

        data = List.Combine({data, newData}),
        // Explanation: Combines the previously gathered data with the newly retrieved data to form a single list.

        Converttotable = Record.ToTable(newReq),
        // Explanation: Converts the JSON response to a table format for easier processing later.

        Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
        // Explanation: Pivots the table so that each unique name in the JSON becomes a column, with corresponding values.

        Column_Names = Table.ColumnNames(Pivot_Columns),
        // Explanation: Retrieves the names of the columns in the pivoted table.

        Contains_Column = List.Contains(Column_Names, "__next"),
        // Explanation: Checks if the pivoted table contains a column named `__next`, indicating that there are more pages of data to retrieve.

        check = if Contains_Column = true then @gather(data, newNextUrl) else data
        // Explanation: If there is another page (indicated by the presence of `__next`), recursively calls `gather` to fetch the next page, otherwise returns the accumulated data.
    in
        check,
    
    Converttotable = Record.ToTable(initReq),
    // Explanation: Converts the initial JSON response into a table.

    Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
    // Explanation: Pivots the table, creating columns based on the distinct field names in the JSON response.

    Column_Names = Table.ColumnNames(Pivot_Columns),
    // Explanation: Gets the names of the columns from the pivoted table.

    Contains_Column = List.Contains(Column_Names, "__next"),
    // Explanation: Checks if the initial response contains a `__next` field, which indicates whether there are more pages of data to fetch.

    outputList = if Contains_Column = true then @gather(initValue, nextUrl) else initValue,
    // Explanation: If more pages of data are available (`__next` exists), it recursively gathers data by calling the `gather` function, otherwise it just uses the initial data.

    expand = Table.FromRecords(outputList)
    // Explanation: Converts the accumulated list of records into a table that can be used in Power Query.
    
in
    expand





mahoneypat
Microsoft Employee
Microsoft Employee

Here is an example on how to do this with List.Generate using a test REST API.  I adapted the approach described well in this article.  You can paste the code into a blank query to step through it. 

 

List.Generate() and Looping in PowerQuery - Exceed

 

let
    fn = (pagenumber) => Json.Document(Web.Contents("https://api.instantwebtools.net/v1/passenger?page=" & pagenumber & "&size=1000")),
    mylist = List.Generate(()=> [Page = 1, Result = fn(Number.ToText(1))], each List.Count([Result][data])>0, each [Page = [Page] + 1, Result = fn(Number.ToText([Page]+1))]),
    #"Converted to Table" = Table.FromList(mylist, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Page", "Result"}, {"Page", "Result"}),
    #"Expanded Result" = Table.ExpandRecordColumn(#"Expanded Column1", "Result", {"data"}, {"data"}),
    #"Expanded data" = Table.ExpandListColumn(#"Expanded Result", "data"),
    #"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"_id", "name", "trips", "airline", "__v"}, {"_id", "name", "trips", "airline", "__v"})
in
    #"Expanded data1"

 

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Hello Pat, thank you very much for your help!

 

You inspired me and I was able to contruct my own solution. It seems very easy at the first sight and also works great for my scenario:

 

jdusek92_0-1622759931868.png

This returns a list of lists of records that I can easily expand and work with.

 

It works great, loads data - but maybe looks too simple when compared with other examples of recursive functions I have been going through.

Could you please have a look at my code if you have any possible concerns about it, because I can't believe this short code actually does tone of work.

 

Thanks, Jakub!

Picking up on this - as for our organization - and for security in general - we did not want to use basic authentication against SuccessFactors - I came up with an alternative to the marked solution here.

Firstly I created a seperate dataflow for the token authentication.

For that part I use a service account and generate the assertion with the SAML generator supplied from SAP.

 

For the powerquery for this part it looks like below:

 

 

let
  Auth = Json.Document(Web.Contents(Path, [Headers=[#"Content-Type"="application/x-www-form-urlencoded"],Content=Text.ToBinary(#"insertion"), RelativePath="/oauth/token"])),
  #"Converted to table" = Record.ToTable(Auth),
  #"Transform columns" = Table.TransformColumnTypes(#"Converted to table", {{"Value", type text}}),
  #"Replace errors" = Table.ReplaceErrorValues(#"Transform columns", {{"Value", null}}),
  #"Transposed table" = Table.Transpose(#"Replace errors"),
  #"Promoted headers" = Table.PromoteHeaders(#"Transposed table", [PromoteAllScalars = true]),
  #"Changed column type" = Table.TransformColumnTypes(#"Promoted headers", {{"access_token", type text}, {"token_type", type text}, {"expires_in", Int64.Type}}),
  #"Added custom" = Table.AddColumn(#"Changed column type", "Token", each [token_type] & " " & [access_token]),
  Navigation = #"Added custom"[Token],
  #"Navigation 1" = Navigation{0},
  #"Convert to table" = Table.FromValue(#"Navigation 1")
in
  #"Convert to table"
 
The insertion part I referenced as a seperate variable.

2nd - for any new Odata sources I want to fetch into PowerBI I setup a seperate dataflow, where I reference the Authentication Token ( this is running 1 a day to get a new token)

For getting the odata - I have this function. For this to work, you should always use inlinecount=allpages in the query.

(RelPath as text) =>
    let
    Token = Authentication{0}[Value],
    RelPath = RelPath,
initReq  = Json.Document(Web.Contents(baseurl, [Headers=[#"Accept"="application/json", Authorization=Token],RelativePath=RelPath]))[d],
    nextUrl = Text.AfterDelimiter(initReq[__next],"v2/"),
initValue= initReq[results],
    gather=(data as list, url)=>
    let
    newReq=Json.Document(Web.Contents(baseurl,[Headers=[#"Accept"="application/json", Authorization=Token],RelativePath=url]))[d],
    newNextUrl = Text.AfterDelimiter(newReq[__next],"v2/"),
    newData= newReq[results],
    data=List.Combine({data,newData}),
    Converttotable = Record.ToTable(newReq),
    Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
    Column_Names=Table.ColumnNames(Pivot_Columns),
    Contains_Column=List.Contains(Column_Names,"__next"),
    check = if Contains_Column = true then @gather(data, newNextUrl) else data
in
    check,
    Converttotable = Record.ToTable(initReq),
    Pivot_Columns = Table.Pivot(Converttotable, List.Distinct(Converttotable[Name]), "Name", "Value", List.Count),
    Column_Names=Table.ColumnNames(Pivot_Columns),
    Contains_Column=List.Contains(Column_Names,"__next"),
    outputList= if Contains_Column= true then @gather(initValue,nextUrl) else initValue,
    expand=Table.FromRecords(outputList)
in
    expand




hi, can you post the advanced editor for this code. I try to replicate the code but it remains errors

Hello,

since then I improved it, this is the function that I use:

let
Source = (SFurl as text) => let
Source = List.Generate(()=>Record.AddField(Json.Document(Web.Contents(SFurl))[d],"count",1) , each Record.HasFields(_,"results") =true and (Record.HasFields(_,"count") and _[count]<=99999), each try Record.AddField(Json.Document(Web.Contents(_[__next]))[d],"count",_[count]+1) otherwise [df=[__next="SSS"]] ),
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"results"}, {"results"}),
#"Expanded results" = Table.ExpandListColumn(#"Expanded Column1", "results")
in
#"Expanded results"
in
Source

It will return a table/list of records, that you can further expand as you need.

(it has been a while I used it for the last time, so I cannot explain it, but it works for me - pull oData from SuccessFactors) 

the parameter should look like this:

"https://XXXXX/odata/v2/Position?$format=JSON..."

 

Warm regards,

Jakub

great thanks, i managed to get it to work in Power BI Desktop, but when I publish to power bi serivice it says that the dataset cannot be updated because there is a dynamic datasource as the source is contained in the query. how did you solve it?

here is the error:

image.png

 

Hi,

 

How did you make it work? I keep getting, identifier errors, commas missing, right paren error....

I just copied and pasted changing "SFurl as text" for the odata api url maybe im doing something wrong.

 

Best regards,

Hello,

I did not use it for scheduled online refresh - I was getting the same message.

So I used it manually on desktop

 

Jakub

If it works and is performant, that's the real test.  Does it make two web calls with each iteration?

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


How can I test that it does not perform all the calls twice?

 

Warm regards, Jakub

You could probably tell with the query diagnostics features (or with an external tool like Fiddler); however, if it is performant and you don't have a limit on web calls, don't worry about it.

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


mahoneypat
Microsoft Employee
Microsoft Employee

Please see the approach in this video for a simpler way to handle this.

Power BI - Tales From The Front - REST APIs - YouTube

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


Hello,

 

unfortunately, this oData API does not return the total count - therefore I cannot generate all the URLs beforehand. I would really need a recursive function to loop through the Skiptokens.

 

Warm Regards, Jakub

Helpful resources

Announcements
OCT PBI Update Carousel

Power BI Monthly Update - October 2024

Check out the October 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

October NL Carousel

Fabric Community Update - October 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors