Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hello,
I am trying to dynamically pull data from a web-source, and have it update monthly (this update will still need a manual trigger since the PowerBI Service does not allow dynamic sources).
What I am struggling with is that, the moment that I am batching my queries, I am getting some issues with pulling the table from the website properly.
Take as an example the HTML tables generated here (I have simplified it to 2 irrelevant Item Codes per URL - the case stays the same)
http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'12878T','12904E'
&ITEMCNT=4&LIST=12878T,12904E&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407
http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=
%2Fstatistics%2Fpbs_item_standard_report&itemlst=
'13713R','03463G'
&ITEMCNT=2&LIST=13713R,03463G&VAR=BENEFIT&RPT_FMT=2
&start_dt=202101&end_dt=202407
This is the main code I use to query after generating the Item Code lists:
// Define the function to generate query URLs
GenerateQueryURL = (ItemCodesList) =>
let
// Always add '3463G' to the batch
ItemCodesWithStandard = List.Combine({ItemCodesList, {"03463G"}}),
ItemCount = List.Count(ItemCodesWithStandard),
ItemCodesTextForList = Text.Combine(ItemCodesWithStandard, ","),
ItemCodesTextForItemlst = Text.Combine(List.Transform(ItemCodesWithStandard, each "'" & _ & "'"), ","),
StartDate = "202101", // Adjust as needed
// Get the current date and calculate the end date (latest month minus two)
CurrentDate = DateTime.LocalNow(),
TargetDate = Date.AddMonths(CurrentDate, -2),
EndDate = Text.From(Date.Year(TargetDate)) & Text.PadStart(Text.From(Date.Month(TargetDate)), 2, "0"),
// Construct the URL
BaseURL = "http://medicarestatistics.humanservices.gov.au/statistics/do.jsp?_PROGRAM=%2Fstatistics%2Fpbs_item_standard_report",
QueryString = "&itemlst=" & ItemCodesTextForItemlst & "&ITEMCNT=" & Text.From(ItemCount) & "&LIST=" & ItemCodesTextForList & "&VAR=BENEFIT&RPT_FMT=2&start_dt=" & StartDate & "&end_dt=" & EndDate,
FullURL = BaseURL & QueryString
in
FullURL,
// split Item Codes into Batches
BatchSize = 28,
Batches = if List.Count(ItemCodesList) > 0 then List.Split(ItemCodesList, BatchSize) else {},
// generate URL for Each Batch
QueryURLs = List.Transform(Batches, each GenerateQueryURL(_)),
// fetch and inspect HTML content
FetchHTMLTable = (url) =>
let
Source = Web.Contents(url),
RawHTML = Text.FromBinary(Source),
// Extract the table from HTML
Document = try Html.Table(Source, {{"Table", "table.table"}}) otherwise null,
TableData = if Document <> null and Table.RowCount(Document) > 0 then Document{0}[Table] else null
in
[RawHTML = RawHTML, TableData = TableData],
// get HTML content and inspect
RawAndTable = List.Transform(QueryURLs, each FetchHTMLTable(_)),
// raw HTML and table data if needed
RawHTMLs = List.Transform(RawAndTable, each _[RawHTML]),
TablesFromURLs = List.Transform(RawAndTable, each _[TableData]),
// combine tables into a single table if valid
CombinedTable = Table.Combine(List.RemoveNulls(TablesFromURLs))
in
CombinedTable
I am getting stuck in getting the tables out in the right format:
I was succesful in retreiving the table correctly initially, but it has stopped working properly - I am assuming because of a change in the table structure.
Solved! Go to Solution.
If you want to parse the HTML yourself you need to use Web.BrowserContents , not Web.Contents.
Web Scraping with Html.Table in Power Query – BI Polar (ssbipolar.com)
How to get the most out of Web.BrowserContents() - Foster BI
If you want to parse the HTML yourself you need to use Web.BrowserContents , not Web.Contents.
Web Scraping with Html.Table in Power Query – BI Polar (ssbipolar.com)
How to get the most out of Web.BrowserContents() - Foster BI