LiorRahav
Regular Visitor

Power Query web scraping

Hi, when connecting to a website, I want to pull more than the first page. What if the page number is hidden in the metadata? For example, https://dailymed.nlm.nih.gov/dailymed/services/v2/ndcs (.xml or .json) only gives me the first 100 records. Any thoughts?

2 REPLIES
LiorRahav
Regular Visitor

Thanks, very good article, but I'm missing something.

I can't pull anything after the first page. Maybe that's because the page information is in the metadata, or because of the [paging][next] part?
 
let
 iterations = 10,          // Number of iterations
 url = 
 
 FnGetOnePage =
  (url) as record =>
   let
    Source = Json.Document(Web.Contents(url)),
    data = try Source[data] otherwise null,
    next = try Source[paging][next] otherwise null,
    res = [Data=data, Next=next]
   in
    res,
 
 GeneratedList =
  List.Generate(
   ()=>[i=0, res = FnGetOnePage(url)],
   each [i]<iterations and [res][Data]<>null,
   each [i=[i]+1, res = FnGetOnePage([res][Next])],
   each [res][Data])
in
    GeneratedList
 
This is what I get:

As an FYI, this is the metadata:
let
    Source = Json.Document(Web.Contents("https://dailymed.nlm.nih.gov/dailymed/services/v2/ndcs")),
    #"Converted to Table" = Table.FromRecords({Source}),
    #"Expanded metadata" = Table.ExpandRecordColumn(#"Converted to Table", "metadata", {"db_published_date", "elements_per_page", "current_url", "next_page_url", "total_elements", "total_pages", "current_page", "previous_page", "previous_page_url", "next_page"}, {"metadata.db_published_date", "metadata.elements_per_page", "metadata.current_url", "metadata.next_page_url", "metadata.total_elements", "metadata.total_pages", "metadata.current_page", "metadata.previous_page", "metadata.previous_page_url", "metadata.next_page"})
in
    #"Expanded metadata"
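
Going by the metadata fields expanded above, the link to the following page appears to sit under metadata[next_page_url] rather than paging[next], which would explain why the loop stops after one page. Below is a minimal sketch of the same List.Generate loop reading that field instead. It rests on several assumptions not confirmed in this thread: that the .json form of the endpoint is used, that the response has a top-level data list (as the earlier query already assumes), and that next_page_url is either null or the text "null" on the last page. The names FirstUrl, MaxPages and GetOnePage are illustrative only.

```powerquery
let
    // Assumption: the .json variant of the endpoint mentioned in the question.
    FirstUrl = "https://dailymed.nlm.nih.gov/dailymed/services/v2/ndcs.json",
    // Safety cap on the number of pages to request.
    MaxPages = 10,

    // Fetch one page and return its rows plus the URL of the next page.
    GetOnePage = (pageUrl as text) as record =>
        let
            Source = Json.Document(Web.Contents(pageUrl)),
            // Assumption: rows are in a top-level "data" field.
            Data = try Source[data] otherwise null,
            // The next-page link is read from the metadata record.
            Next = try Source[metadata][next_page_url] otherwise null
        in
            [Data = Data, Next = Next],

    Pages = List.Generate(
        // Start with the first page.
        () => [i = 0, res = GetOnePage(FirstUrl)],
        // Continue while under the cap and the last fetch returned rows.
        each [i] < MaxPages and [res] <> null and [res][Data] <> null,
        // Follow the next-page link; stop when there is none.
        each [
            i = [i] + 1,
            res = if [res][Next] = null or [res][Next] = "null"
                  then null
                  else GetOnePage([res][Next])
        ],
        // Keep only the rows of each page.
        each [res][Data]
    ),

    // Flatten the list of pages into a single list of rows.
    AllRows = List.Combine(Pages)
in
    AllRows
```

If this matches the actual response shape, AllRows should be one list covering up to MaxPages pages, which can then be turned into a table with Table.FromRecords when the rows are records.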
 
 

 

 
I hope you can help, if you have time 🙂
lbendlin
Super User