Hi,
I am working on a dashboard that I would like to publish with a scheduled refresh. All of my data is currently sourced from the web, and apparently a scheduled refresh cannot be set up when the queries use the function Web.BrowserContents.
I have been trying to work around it with the function Web.Contents, but all I get is an empty table.
Here is the original query, the one that gives the expected output:
let
    Source = Web.BrowserContents("https://ir.tevapharm.com" & "/news-and-events" & "/press-releases" & "/default.aspx"),
    #"Extracted Table From Html" =
        Html.Table(
            Source,
            {
                {"URL", "a[href^=""/news-""]", each [Attributes][href]},
                {"Headline", ".module-news .module_headline"},
                {"Date", ".module-news .module_date-time"}
            },
            [RowSelector=".module-news .module_item-wrap"]
        ),
    #"Replaced Value" = Table.ReplaceValue(#"Extracted Table From Html", "/news-", "https://ir.tevapharm.com/news-", Replacer.ReplaceText, {"URL"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value", {{"URL", type text}, {"Headline", type text}, {"Date", type date}})
in
    #"Changed Type"
Here is the query with the Web function replaced. It gives an empty table:
let
    Source = Web.Contents("https://ir.tevapharm.com" & "/news-and-events" & "/press-releases" & "/default.aspx"),
    #"Extracted Table From Html" =
        Html.Table(
            Source,
            {
                {"URL", "a[href^=""/news-""]", each [Attributes][href]},
                {"Headline", ".module-news .module_headline"},
                {"Date", ".module-news .module_date-time"}
            },
            [RowSelector=".module-news .module_item-wrap"]
        ),
    #"Replaced Value" = Table.ReplaceValue(#"Extracted Table From Html", "/news-", "https://ir.tevapharm.com/news-", Replacer.ReplaceText, {"URL"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value", {{"URL", type text}, {"Headline", type text}, {"Date", type date}})
in
    #"Changed Type"
Then I tried converting the source from binary to text before extracting the HTML contents with this query, but I also get an empty table:
let
    Source = Web.Contents("https://ir.tevapharm.com" & "/news-and-events" & "/press-releases" & "/default.aspx"),
    #"Binary to text" = Text.FromBinary(Source),
    #"Extracted Table From Html" =
        Html.Table(
            #"Binary to text",
            {
                {"URL", "a[href^=""/news-""]", each [Attributes][href]},
                {"Headline", ".module-news .module_headline"},
                {"Date", ".module-news .module_date-time"}
            },
            [RowSelector=".module-news .module_item-wrap"]
        ),
    #"Replaced Value" = Table.ReplaceValue(#"Extracted Table From Html", "/news-", "https://ir.tevapharm.com/news-", Replacer.ReplaceText, {"URL"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value", {{"URL", type text}, {"Headline", type text}, {"Date", type date}})
in
    #"Changed Type"
Has anyone already had a similar problem? I would love to know how you worked around it, thanks!
Hi @myriamouan,
This one's got me curious and unfortunately I seem to be running into the same dead ends as you.
I have done a lot of searching without finding a definitive answer, but I suspect it comes down to the site's use of JavaScript or other website-specific formatting/design.
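One way to test that suspicion is to check whether the markup targeted by the row selector is present in the raw response at all. This is only a minimal sketch: it assumes that Web.Contents returns the page source without running any scripts, and it reuses the class name from the RowSelector in the question.

```powerquery
let
    // Page URL taken from the original question
    url = "https://ir.tevapharm.com/news-and-events/press-releases/default.aspx",
    // Raw response body as text; no JavaScript runs at this point (assumption stated above)
    rawHtml = Text.FromBinary(Web.Contents(url)),
    // Class name used in the RowSelector of the original Html.Table call
    hasRowMarkup = Text.Contains(rawHtml, "module_item-wrap")
in
    hasRowMarkup
```

If this returns false, the rows are most likely injected after the page loads, which would explain why Html.Table finds nothing to extract.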
One option which I assume you've already thought of and are probably trying to avoid...
Another possibility I've found is to use the RSS feed link but I don't think it contains all of the same info...
let
    url = "https://ir.tevapharm.com/rss/pressrelease.aspx",
    source = Binary.Buffer(Web.Contents(url)),
    #"Imported XML" = Xml.Tables(source, null, 1252),
    channel = #"Imported XML"{0}[channel],
    item = channel{0}[item],
    #"Changed Type2" = Table.TransformColumnTypes(
        item,
        {
            {"title", type text},
            {"description", type text},
            {"link", type text},
            {"pubDate", type datetimezone}
        }
    )
in
    #"Changed Type2"
Here's a similar thread which also doesn't seem to be solved (it is marked as solved but I suspect that is down to the CST member marking it as solved and not the OP)...
I'll keep looking to see if I can find any other options but let me know if any of this helps.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
I noticed that some websites simply don't offer an RSS feed. How would you proceed in this situation?
That is a good question and one I'm still trying to figure out.
My best guess so far is that you'd need to dig into the developer tools in your web browser, or use an external tool like Fiddler or Wireshark, to inspect the HTTP traffic. From there, it may be possible to find another URL that is being called when the site loads.
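If the traffic inspection does turn up such a request, the query could look roughly like this. It is a sketch under stated assumptions: the host, the path and the `page` parameter are hypothetical placeholders (nothing here was found on a real site), and the endpoint is assumed to return a JSON array of records.

```powerquery
let
    // HYPOTHETICAL host: replace with the one seen in the browser's network tab
    baseUrl = "https://example.com",
    // HYPOTHETICAL path and query parameter, for illustration only
    response = Web.Contents(baseUrl, [RelativePath = "api/press-releases", Query = [page = "1"]]),
    // Parse the response body as JSON
    json = Json.Document(response),
    // Assumes the JSON is a list of records; a different shape needs different handling
    asTable = Table.FromRecords(json)
in
    asTable
```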
Similar issues mentioned in this post...
Hi @KNP ,
Thank you so much for taking a look at my issue. The second option you suggest works perfectly, and it does allow me to schedule a refresh on Power BI Service! The only difference I noticed is that it only scrapes the data displayed on the first page and does not browse through the following pages of the same link. This is not a problem for me, since I only need the latest news releases; the older ones are just a bonus.
As you guessed, I am indeed trying to avoid using a gateway as I am working from a business environment that does not provide me with an enterprise gateway.
Thanks a lot for your help! I will implement this solution to the rest of my queries and mark your answer as solution.
Best,
Myriam
There is not really a need to concatenate the URL like that - unless you are using other parameters which might cause your refresh issues.
Try this version:
let
    Source = Web.BrowserContents("https://ir.tevapharm.com/news-and-events/press-releases/default.aspx"),
    #"Extracted Table From Html" =
        Html.Table(
            Source,
            {
                {"Column1", ".module_date-text"},
                {"Column2", ".module_headline-link"}
            },
            [RowSelector=".module_item"]
        )
in
    #"Extracted Table From Html"
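If a URL does have to be assembled from parts, a commonly used pattern is to keep the first argument of Web.Contents static and pass the variable part through its RelativePath option. The sketch below applies that to the RSS feed URL from the accepted answer. Splitting the URL this way is my own illustration, not something the earlier posts tested.

```powerquery
let
    // Static base URL, so the data source stays the same for every request
    baseUrl = "https://ir.tevapharm.com",
    // Variable part of the address goes into RelativePath (no leading slash)
    source = Binary.Buffer(Web.Contents(baseUrl, [RelativePath = "rss/pressrelease.aspx"])),
    // Same XML navigation as in the accepted answer
    imported = Xml.Tables(source, null, 1252),
    channel = imported{0}[channel],
    item = channel{0}[item]
in
    item
```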