Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Anonymous
Not applicable

Trying to pull HTML data dynamically - issue parsing tables

Hello,

 

I am trying to dynamically pull data from a web-source, and have it update monthly (this update will still need a manual trigger since the PowerBI Service does not allow dynamic sources).

 

  • I am dynamically pulling a list of Item Codes that users in the business can add to (the list might change over time) in sharepoint
    • I am adding a code "03463G" to each batch generally - so that the format of the table is ALWAYS the same (I filter out this dummy code later)
  • I am using the list to generate multiple URLs. These URLs are being queried by PowerBI
    • I am generating multiple URLs because the service does not allow more than Item Codes being pulled at a given time, so I am batching it

What I am struggling with is that, the moment that I am batching my queries, I am getting some issues with pulling the table from the website properly.

 

Take as an example the HTML tables generated here (I have simplified it to 2 irrelevant Item Codes per URL - the case stays the same)

 

 

 

http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'12878T','12904E'
&ITEMCNT=4&LIST=12878T,12904E&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407

http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'13713R','03463G'
&ITEMCNT=2&LIST=13713R,03463G&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407

 

 

This is the main code I use to query after generating the Item Code lists:

 

 

 // Define the function to generate query URLs
    GenerateQueryURL = (ItemCodesList) =>
    let
        // Always add '3463G' to the batch
        ItemCodesWithStandard = List.Combine({ItemCodesList, {"03463G"}}),
        ItemCount = List.Count(ItemCodesWithStandard),
        ItemCodesTextForList = Text.Combine(ItemCodesWithStandard, ","),
        ItemCodesTextForItemlst = Text.Combine(List.Transform(ItemCodesWithStandard, each "'" & _ & "'"), ","),

        StartDate = "202101", // Adjust as needed

        // Get the current date and calculate the end date (latest month minus two)
        CurrentDate = DateTime.LocalNow(),
        TargetDate = Date.AddMonths(CurrentDate, -2),
        EndDate = Text.From(Date.Year(TargetDate)) & Text.PadStart(Text.From(Date.Month(TargetDate)), 2, "0"),

        // Construct the URL
        BaseURL = "http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=%2Fstatistics%2Fpbs_item_standard_report",
        QueryString = "&itemlst=" & ItemCodesTextForItemlst & "&ITEMCNT=" & Text.From(ItemCount) & "&LIST=" & ItemCodesTextForList & "&VAR=BENEFIT&RPT_FMT=2&start_dt=" & StartDate & "&end_dt=" & EndDate,
        FullURL = BaseURL & QueryString
    in
        FullURL,

    // split Item Codes into Batches
    BatchSize = 28,
    Batches = if List.Count(ItemCodesList) > 0 then List.Split(ItemCodesList, BatchSize) else {},

    // generate URL for Each Batch
    QueryURLs = List.Transform(Batches, each GenerateQueryURL(_)),

    // fetch and inspect HTML content
    FetchHTMLTable = (url) =>
    let
        Source = Web.Contents(url),
        RawHTML = Text.FromBinary(Source),
        // Extract the table from HTML
        Document = try Html.Table(Source, {{"Table", "table.table"}}) otherwise null,
        TableData = if Document <> null and Table.RowCount(Document) > 0 then Document{0}[Table] else null
    in
        [RawHTML = RawHTML, TableData = TableData],

    // get HTML content and inspect
    RawAndTable = List.Transform(QueryURLs, each FetchHTMLTable(_)),

    // raw HTML and table data if needed
    RawHTMLs = List.Transform(RawAndTable, each _[RawHTML]),
    TablesFromURLs = List.Transform(RawAndTable, each _[TableData]),

    // combine tables into a single table if valid
    CombinedTable = Table.Combine(List.RemoveNulls(TablesFromURLs))
in
    CombinedTable

 

 

 

I am getting stuck in getting the tables out in the right format:

FirozReinders_1-1725933060834.png

 

I was succesful in retreiving the table correctly initially, but it has stopped working properly - I am assuming because of a change in the table structure.

 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

1 REPLY 1
lbendlin
Super User
Super User

If you want to parse the HTML yourself you need to use Web.BrowserContents , not Web.Contents.

 

Web Scraping with Html.Table in Power Query – BI Polar (ssbipolar.com)

How to get the most out of Web.BrowserContents() - Foster BI

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.