Hey guys
I've extracted data from a REST API a few times where the data is paginated - most of the time the response has a lot more fields to help with the iteration. I've read all the responses on here and they are all great, and I understand them, but they all have a bit more info to go on.
This is my source
The API returns 200 records per call, and the links tell you to keep reading page=1, page=2, and so on until no results are returned.
In the past the API has always given me the number of pages returned so I can iterate through. Can anyone help clear the fog for me on how to iterate round this data until I have all the records? I will be forever grateful.
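To make the shape clearer, each page of the response carries roughly this (field names pulled from my query below; the values are made up):

```m
// Rough shape of one page of the response (illustrative values only)
[
    data  = { /* up to 200 visit records */ },
    links = [first = "...?page=1", last = null, prev = null, next = "...?page=2"],
    meta  = [current_page = 1, from = 1, path = "...", per_page = 200, to = 200]
]
```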
What have you tried and where are you stuck?
Hey Ibendlin
Thank you so much for reaching out! Really appreciate it
So this is the code - I must admit I used it for another piece of work and have just cobbled it together.
It doesn't work at all for some reason. Any guidance would be appreciated.
let
Source = Json.Document(Web.Contents("https://backend.signinapp.com/client-api/v1/sites/32358/history?date_from=2022-06-01&date_to=2022-08...")),
iterations = 10,
FnGetOnePage =
(url) as record =>
let
Source = Json.Document(Web.Contents("https://backend.signinapp.com/client-api/v1/sites/32358/history?date_from=2022-06-01&date_to=2022-08...",
[Query=[pagenumber=Number.ToText(url)]])),
data = try Source[first] otherwise null, //get the first
next = try Source[next] otherwise null, // get the next
res = [Data=data, Next=next]
in
res,
GeneratedList =
List.Generate(
()=>[i=0, res = FnGetOnePage(1)],
each [i]<iterations and [res][Data]<>null,
each [i=[i]+1, res = FnGetOnePage(i)],
each [res][Data]),
#"Converted to Table" = Table.FromList(GeneratedList, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Expanded data" = Table.ExpandListColumn(#"Converted to Table", "data"),
#"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"id", "group_id", "returning_visitor_id", "name", "photo_url", "badge_url", "status", "in_datetime", "out_datetime", "expected_datetime", "additional_fields", "personal_fields", "metadata"}, {"data.id", "data.group_id", "data.returning_visitor_id", "data.name", "data.photo_url", "data.badge_url", "data.status", "data.in_datetime", "data.out_datetime", "data.expected_datetime", "data.additional_fields", "data.personal_fields", "data.metadata"}),
#"Expanded data.additional_fields" = Table.ExpandRecordColumn(#"Expanded data1", "data.additional_fields", {"Company", "Visiting", "Car Reg"}, {"data.additional_fields.Company", "data.additional_fields.Visiting", "data.additional_fields.Car Reg"}),
#"Expanded data.metadata" = Table.ExpandRecordColumn(#"Expanded data.additional_fields", "data.metadata", {"rejected_sign_in"}, {"data.metadata.rejected_sign_in"}),
#"Expanded links" = Table.ExpandRecordColumn(#"Expanded data.metadata", "links", {"first", "last", "prev", "next"}, {"links.first", "links.last", "links.prev", "links.next"}),
#"Expanded meta" = Table.ExpandRecordColumn(#"Expanded links", "meta", {"current_page", "from", "path", "per_page", "to"}, {"meta.current_page", "meta.from", "meta.path", "meta.per_page", "meta.to"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded meta",{{"data.id", Int64.Type}, {"data.group_id", Int64.Type}, {"data.returning_visitor_id", Int64.Type}, {"data.name", type text}, {"data.photo_url", type text}, {"data.badge_url", type any}, {"data.status", type text}, {"data.in_datetime", type datetime}, {"data.out_datetime", type any}, {"data.expected_datetime", type any}, {"data.additional_fields.Company", type any}, {"data.additional_fields.Visiting", type any}, {"data.additional_fields.Car Reg", type any}, {"data.personal_fields", type any}, {"data.metadata.rejected_sign_in", type logical}, {"links.first", type text}, {"links.last", type any}, {"links.prev", type any}, {"links.next", type text}, {"meta.current_page", Int64.Type}, {"meta.from", Int64.Type}, {"meta.path", type text}, {"meta.per_page", Int64.Type}, {"meta.to", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"links.first", "links.last", "links.prev", "links.next", "meta.current_page", "meta.from", "meta.path", "meta.per_page", "meta.to", "data.expected_datetime", "data.additional_fields.Company", "data.additional_fields.Visiting", "data.additional_fields.Car Reg", "data.personal_fields", "data.metadata.rejected_sign_in", "data.id", "data.group_id", "data.returning_visitor_id", "data.photo_url", "data.badge_url"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Removed Columns",{{"data.out_datetime", type datetime}})
in
#"Changed Type1"
When you call it the first time, does it tell you the total number of records? If it does, then you can do the pagination more easily.
- first call, get number of records
- generate a list with the required number of pages
- generate the URL for each page and fetch the page content into a table column
- expand the table column which will automatically union all results.
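In M those steps might look roughly like this - a sketch assuming the API exposed a total count; BaseUrl, the meta[total] field, and the page parameter name are placeholders, not confirmed parts of this API:

```m
let
    // Placeholder endpoint - substitute the real URL
    BaseUrl = "https://example.com/api/history",
    // First call: read the total record count (field name assumed)
    FirstPage = Json.Document(Web.Contents(BaseUrl)),
    recordCount = FirstPage[meta][total],
    pageSize = FirstPage[meta][per_page],
    pageCount = Number.RoundUp(recordCount / pageSize),
    // One row per page number
    Pages = Table.FromColumns({List.Numbers(1, pageCount)}, {"Page"}),
    // Fetch each page's data array into a new column
    WithData = Table.AddColumn(Pages, "Data",
        each Json.Document(Web.Contents(BaseUrl, [Query=[page=Number.ToText([Page])]]))[data]),
    // Expanding the list column unions all pages into one table
    Expanded = Table.ExpandListColumn(WithData, "Data")
in
    Expanded
```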
Hey - no it doesn't - it basically gives you the link to the first page and the link to the next page. The API instructions say that when there are no more pages, the next link will be null.
In previous projects, like you said, the API has always told you how many records to iterate through - I think this API is just very badly laid out.
Yes, that is unfortunate. It forces you to use List.Accumulate or even a recursive function where you have to lug the received data around with each loop. Or you could harvest only the URLs and hope that your browser cache helps when you then call them again to collect the data.
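A sketch of the next-link version with List.Generate (List.Accumulate would work similarly) - it stops when the response no longer carries a next link. The field names data and links.next are taken from your query above; treat BaseUrl and the rest as assumptions:

```m
let
    // Placeholder endpoint - substitute the real URL
    BaseUrl = "https://example.com/api/history",
    // Fetch one page; return its records and the URL of the next page (null when done)
    GetPage = (url as text) as record =>
        let
            Source = Json.Document(Web.Contents(url)),
            data = try Source[data] otherwise null,
            next = try Source[links][next] otherwise null
        in
            [Data = data, Next = next],
    // Keep requesting pages while the previous response still had data
    AllPages = List.Generate(
        () => GetPage(BaseUrl),
        each [Data] <> null,
        each if [Next] <> null then GetPage([Next]) else [Data = null, Next = null],
        each [Data]),
    // Each element is one page's list of records; combine and convert to a table
    Combined = List.Combine(AllPages),
    Result = Table.FromRecords(Combined)
in
    Result
```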
Yes - can you help me with the code I have written to do it the best way? I have properly hit a brick wall now 😞
Unfortunately, helping in these scenarios is very difficult without access to the API.
Practice using List.Accumulate. Read about recursive functions. Try stuff out.
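As a starting point, List.Accumulate folds a list into a single value while carrying state between steps - the same shape a pagination loop needs. A minimal, API-free example:

```m
let
    // State starts at 0; each element of the list is folded into it
    Total = List.Accumulate({1, 2, 3, 4}, 0, (state, current) => state + current)
in
    Total  // 10
```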