Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Anonymous
Not applicable

Trying to pull HTML data dynamically - issue parsing tables

Hello,

 

I am trying to dynamically pull data from a web-source, and have it update monthly (this update will still need a manual trigger since the PowerBI Service does not allow dynamic sources).

 

  • I am dynamically pulling a list of Item Codes that users in the business can add to (the list might change over time) in sharepoint
    • I am adding a code "03463G" to each batch generally - so that the format of the table is ALWAYS the same (I filter out this dummy code later)
  • I am using the list to generate multiple URLs. These URLs are being queried by PowerBI
    • I am generating multiple URLs because the service does not allow more than Item Codes being pulled at a given time, so I am batching it

What I am struggling with is that, the moment that I am batching my queries, I am getting some issues with pulling the table from the website properly.

 

Take as an example the HTML tables generated here (I have simplified it to 2 irrelevant Item Codes per URL - the case stays the same)

 

 

 

http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'12878T','12904E'
&ITEMCNT=4&LIST=12878T,12904E&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407

http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'13713R','03463G'
&ITEMCNT=2&LIST=13713R,03463G&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407

 

 

This is the main code I use to query after generating the Item Code lists:

 

 

 // Define the function to generate query URLs
    GenerateQueryURL = (ItemCodesList) =>
    let
        // Always add '3463G' to the batch
        ItemCodesWithStandard = List.Combine({ItemCodesList, {"03463G"}}),
        ItemCount = List.Count(ItemCodesWithStandard),
        ItemCodesTextForList = Text.Combine(ItemCodesWithStandard, ","),
        ItemCodesTextForItemlst = Text.Combine(List.Transform(ItemCodesWithStandard, each "'" & _ & "'"), ","),

        StartDate = "202101", // Adjust as needed

        // Get the current date and calculate the end date (latest month minus two)
        CurrentDate = DateTime.LocalNow(),
        TargetDate = Date.AddMonths(CurrentDate, -2),
        EndDate = Text.From(Date.Year(TargetDate)) & Text.PadStart(Text.From(Date.Month(TargetDate)), 2, "0"),

        // Construct the URL
        BaseURL = "http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=%2Fstatistics%2Fpbs_item_standard_report",
        QueryString = "&itemlst=" & ItemCodesTextForItemlst & "&ITEMCNT=" & Text.From(ItemCount) & "&LIST=" & ItemCodesTextForList & "&VAR=BENEFIT&RPT_FMT=2&start_dt=" & StartDate & "&end_dt=" & EndDate,
        FullURL = BaseURL & QueryString
    in
        FullURL,

    // split Item Codes into Batches
    BatchSize = 28,
    Batches = if List.Count(ItemCodesList) > 0 then List.Split(ItemCodesList, BatchSize) else {},

    // generate URL for Each Batch
    QueryURLs = List.Transform(Batches, each GenerateQueryURL(_)),

    // fetch and inspect HTML content
    FetchHTMLTable = (url) =>
    let
        Source = Web.Contents(url),
        RawHTML = Text.FromBinary(Source),
        // Extract the table from HTML
        Document = try Html.Table(Source, {{"Table", "table.table"}}) otherwise null,
        TableData = if Document <> null and Table.RowCount(Document) > 0 then Document{0}[Table] else null
    in
        [RawHTML = RawHTML, TableData = TableData],

    // get HTML content and inspect
    RawAndTable = List.Transform(QueryURLs, each FetchHTMLTable(_)),

    // raw HTML and table data if needed
    RawHTMLs = List.Transform(RawAndTable, each _[RawHTML]),
    TablesFromURLs = List.Transform(RawAndTable, each _[TableData]),

    // combine tables into a single table if valid
    CombinedTable = Table.Combine(List.RemoveNulls(TablesFromURLs))
in
    CombinedTable

 

 

 

I am getting stuck in getting the tables out in the right format:

FirozReinders_1-1725933060834.png

 

I was succesful in retreiving the table correctly initially, but it has stopped working properly - I am assuming because of a change in the table structure.

 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

1 REPLY 1
lbendlin
Super User
Super User

If you want to parse the HTML yourself you need to use Web.BrowserContents , not Web.Contents.

 

Web Scraping with Html.Table in Power Query – BI Polar (ssbipolar.com)

How to get the most out of Web.BrowserContents() - Foster BI

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.