- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
In my other community blog post, How to Scrape Multiple Webpages, I discussed how to connect to a table from a webpage and use the M code generated by the web connector's dialogue box to get the same table but from across multiple web pages. But how about if the web connector cannot detect a table or cannot generate a table with the information needed using Add Table Using Examples feature? A good example are image URLs. To do that, we'll need to inspect the underlying HTML code.
Head over to https://edge.pse.com.ph/companyPage/stockData.do?cmpy_id=15. This is a company information page of a company called ABS-CBN, a Philippine media company, in the Philippine Stock Exchange company listing website. Notice the image/logo of the company. Our goal is to be able to get the URL of that image.
Right-click on the image, click inspect/inspect element (the options will depend on the browser you're using I'm using Chrome) and then we will try to find a pattern.
The URL starts with /clogo. This could be a pattern but let's double check. Press Ctrl + F on your keyboard and type /clogo to check if there are more than one /clogo words. It appears there's only one so we will stick to this.
Let's go back to the query editor. Duplicate the StockData query from the attached sample pbix in my previous blog post and rename it to CompanyLogo. Delete all applied steps from and after Added Custom. Create a custom column called HTML using this formula. Web.BrowserContents allows as to access a webpage as HTML text but this is not our final formula as we want to be able to refresh the resulting dataset in Power BI Service.
= Web.BrowserContents("https://edge.pse.com.ph/companyInformation/form.do?cmpy_id=" & Text.From([Page Number])
This how the query should look like after adding that column:
Now let's add another column to extract the logo URL. Highlight HTML column, go to Add Column tab, click Extract and then Text Between Delimiters. Use /clogo as start and delimiter and " as end. The query should now look like below:
Notice the rows with a blank result. Let's go to the website and check if indeed the related pages do not have a logo. It appears they really don't so we'll leave it at that.
Now, complete the URL. Going back to the inspect element part of this blog, the complete path to the logo starts with the website's base URL. Replace all empty cells with null. Add a custom column called Logo URL with the following formula.
= if [Text Between Delimiters] <> null then "https://edge.pse.com.ph/clogo" & [Text Between Delimiters] else null
After removing custom columns HTML and Text Between Delimiters, the query should now look like this:
As mentioned earlier in the post, the use of Web.BrowserContents is only temporary. Chris Webb mentions in his blog that "...if you dynamically generate a url for use with Web.Contents() and do not use the RelativePath and/or Query options you will not be able refresh your dataset because the Power BI Service will not be able to analyse your code to discover what urls are present in it."
While it may be a different function, the same still applies. We made use of dynamic URLs and Web.BrowserContents doesn't have RelativePath parameter so using it will inhibit us from refresh out dataset in Powr BI service - until, at least, the service engine updates to accomodate such case.
However, do note that while Web.Contents has RelativePath parameter, unlike Web.BrowserContents, it doesn't return the underlying html code right from the get go but, instead, binary. To demonstrate, we will change the function used in our HTML custom column.
In order to return the URL as HTML code, we need to use Lines.FromBinary to return a list which every item is a line of html code and Text.Combine with line feed "#(lf)" as the delimiter to combine each item in the list into a single text. Change the custom column formula to the following.
let
html = Web.Contents ( "https://edge.pse.com.ph", [RelativePath =
"companyInformation/form.do", Query = [cmpy_id = Text.From ( [Page Number] ) ] ] )
in Text.Combine ( Lines.FromBinary ( html ), "#(lf)" )
The rest of the query will remain as is so we should still be seeing the same result. You may now load this into the model.
In this blog post, we have discussed
- how to get the image URL from a webpage by finding the pattern in the underlying HTML.
- how to return the binary file returned by Web.BrowserContents as html.
Attached is sample pbix for youre reference. A live report using a similar technique can also be found here - https://app.powerbi.com/view?r=eyJrIjoiZjljYmViZTQtMGI1NC00ZmI3LTlkNTUtYTIxYmIxMWVlMjk0IiwidCI6ImRmO...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
- Seamless Power BI Report Management with SharePoin...
- Field Parameters in Power BI
- #PBI10 Dataviz Contest | Frequently Asked Question...
- How to Organize Measures Effectively in Power BI
- Power BI Enhancements You Need to Know - Part 6: P...
- #PBI10 Dataviz Contest | Meet the Judges
- Dataviz Contests - How to Submit
- #PBI10 Dataviz Contest
- Unlock Deeper Insights: Data Modeling Best Practic...
- How to Display Power BI Reports in a Continuous Sl...
-
ajaybabuinturi
on: Seamless Power BI Report Management with SharePoin...
-
Ilgar_Zarbali on: Field Parameters in Power BI
- lokesh07 on: How to Organize Measures Effectively in Power BI
-
Magudeswaran_MR
on: Power BI Enhancements You Need to Know - Part 6: P...
-
technolog on: #PBI10 Dataviz Contest | Meet the Judges
- HerraZhang on: Dataviz Contests - How to Submit
-
edchuy
on: #PBI10 Dataviz Contest
- TharunChalla on: What Power BI Developers Often Miss and Why Does I...
-
Magudeswaran_MR
on: Power BI Enhancements You Need to Know – Part 5: O...
-
Magudeswaran_MR
on: The Power of Automation: One-Click DAX Formulas an...
-
How to
716 -
Tips & Tricks
694 -
Events
143 -
Support insights
121 -
Opinion
98 -
DAX
66 -
Power BI
65 -
Power Query
62 -
Power BI Dev Camp
45 -
Power BI Desktop
40 -
Roundup
39 -
Dataflow
23 -
Featured User Group Leader
21 -
Power BI Embedded
20 -
Time Intelligence
19 -
Tips&Tricks
18 -
PowerBI REST API
12 -
Data Protection
11 -
Power Query Tips & Tricks
8 -
finance
8 -
Power BI Service
8 -
Direct Query
7 -
Power BI REST API
6 -
Auto ML
6 -
financial reporting
6 -
Data Analysis
6 -
Power Automate
6 -
Data Visualization
6 -
Python
6 -
Income Statement
5 -
Dax studio
5 -
powerbi
5 -
service
5 -
Power BI PowerShell
5 -
Machine Learning
5 -
M language
4 -
Paginated Reports
4 -
External tool
4 -
Power BI Goals
4 -
PowerShell
4 -
Desktop
4 -
Bookmarks
4 -
Line chart
4 -
Group By
4 -
community
4 -
RLS
4 -
Visualisation
3 -
Life Sciences
3 -
Administration
3 -
M code
3 -
Visuals
3 -
SQL Server 2017 Express Edition
3 -
R script
3 -
Aggregation
3 -
calendar
3 -
Gateways
3 -
R
3 -
M Query
3 -
Webinar
3 -
CALCULATE
3 -
R visual
3 -
Reports
3 -
PowerApps
3 -
Data Science
3 -
Azure
3 -
Data model
3 -
Conditional Formatting
3 -
Forecasting
2 -
REST API
2 -
Editor
2 -
Split
2 -
measure
2 -
Microsoft-flow
2 -
Paginated Report Builder
2 -
Working with Non Standatd Periods
2 -
powerbi.tips
2 -
Custom function
2 -
Reverse
2 -
PUG
2 -
Custom Measures
2 -
Filtering
2 -
Row and column conversion
2 -
Python script
2 -
Nulls
2 -
DVW Analytics
2 -
parameter
2 -
Industrial App Store
2 -
Week
2 -
Date duration
2 -
Formatting
2 -
Weekday Calendar
2 -
Support insights.
2 -
construct list
2 -
slicers
2 -
SAP
2 -
Power Platform
2 -
Workday
2 -
external tools
2 -
index
2 -
RANKX
2 -
PBI Desktop
2 -
Date Dimension
2 -
Integer
2 -
Visualization
2 -
Power BI Challenge
2 -
Query Parameter
2 -
Date
2 -
SharePoint
2 -
Power BI Installation and Updates
2 -
How Things Work
2 -
Tabular Editor
2 -
rank
2 -
ladataweb
2 -
Troubleshooting
2 -
Date DIFF
2 -
Transform data
2 -
Healthcare
2 -
Tips and Tricks
2 -
Incremental Refresh
2 -
Number Ranges
2 -
Query Plans
2 -
Power BI & Power Apps
2 -
Random numbers
2 -
Day of the Week
2 -
Custom Visual
2 -
VLOOKUP
2 -
pivot
2 -
calculated column
2 -
M
2 -
hierarchies
2 -
Power BI Anniversary
2 -
Language M
2 -
inexact
2 -
Date Comparison
2 -
Power BI Premium Per user
2 -
API
1 -
Kingsley
1 -
Merge
1 -
variable
1 -
Issues
1 -
function
1 -
stacked column chart
1 -
ho
1 -
ABB
1 -
KNN algorithm
1 -
List.Zip
1 -
optimization
1 -
Artificial Intelligence
1 -
Map Visual
1 -
Text.ContainsAll
1 -
Tuesday
1 -
help
1 -
group
1 -
Scorecard
1 -
Json
1 -
Tops
1 -
financial reporting hierarchies RLS
1 -
Featured Data Stories
1 -
MQTT
1 -
Custom Periods
1 -
Partial group
1 -
Reduce Size
1 -
FBL3N
1 -
Wednesday
1 -
Power Pivot
1 -
Quick Tips
1 -
data
1 -
PBIRS
1 -
Usage Metrics in Power BI
1 -
Multivalued column
1 -
Pipeline
1 -
Path
1 -
Yokogawa
1 -
Dynamic calculation
1 -
Data Wrangling
1 -
native folded query
1 -
transform table
1 -
UX
1 -
Cell content
1 -
General Ledger
1 -
Thursday
1 -
Table
1 -
Natural Query Language
1 -
Infographic
1 -
automation
1 -
Prediction
1 -
newworkspacepowerbi
1 -
Performance KPIs
1 -
HR Analytics
1 -
keepfilters
1 -
Connect Data
1 -
Financial Year
1 -
Schneider
1 -
dynamically delete records
1 -
Copy Measures
1 -
Friday
1 -
Q&A
1 -
Event
1 -
Custom Visuals
1 -
Free vs Pro
1 -
Format
1 -
Active Employee
1 -
Custom Date Range on Date Slicer
1 -
refresh error
1 -
PAS
1 -
certain duration
1 -
DA-100
1 -
bulk renaming of columns
1 -
Single Date Picker
1 -
Monday
1 -
PCS
1 -
Saturday
1 -
update
1 -
Slicer
1 -
Visual
1 -
forecast
1 -
Regression
1 -
CICD
1 -
Current Employees
1 -
date hierarchy
1 -
relationship
1 -
SIEMENS
1 -
Multiple Currency
1 -
Power BI Premium
1 -
On-premises data gateway
1 -
Binary
1 -
Power BI Connector for SAP
1 -
Sunday
1 -
Training
1 -
Announcement
1 -
Features
1 -
domain
1 -
pbiviz
1 -
sport statistics
1 -
Intelligent Plant
1 -
Circular dependency
1 -
GE
1 -
Exchange rate
1 -
Dendrogram
1 -
range of values
1 -
activity log
1 -
Decimal
1 -
Charticulator Challenge
1 -
Field parameters
1 -
deployment
1 -
ssrs traffic light indicators
1 -
SQL
1 -
trick
1 -
Scripts
1 -
Color Map
1 -
Industrial
1 -
Weekday
1 -
Working Date
1 -
Space Issue
1 -
Emerson
1 -
Date Table
1 -
Cluster Analysis
1 -
Stacked Area Chart
1 -
union tables
1 -
Number
1 -
Start of Week
1 -
Tips& Tricks
1 -
Workspace
1 -
Theme Colours
1 -
Text
1 -
Flow
1 -
Publish to Web
1 -
Extract
1 -
Topper Color On Map
1 -
Historians
1 -
context transition
1 -
Custom textbox
1 -
OPC
1 -
Zabbix
1 -
Label: DAX
1 -
Business Analysis
1 -
Supporting Insight
1 -
rank value
1 -
Synapse
1 -
End of Week
1 -
Tips&Trick
1 -
Showcase
1 -
custom connector
1 -
Waterfall Chart
1 -
Power BI On-Premise Data Gateway
1 -
patch
1 -
Top Category Color
1 -
A&E data
1 -
Previous Order
1 -
Substring
1 -
Wonderware
1 -
Power M
1 -
Format DAX
1 -
Custom functions
1 -
accumulative
1 -
DAX&Power Query
1 -
Premium Per User
1 -
GENERATESERIES
1 -
Report Server
1 -
Audit Logs
1 -
analytics pane
1 -
step by step
1 -
Top Brand Color on Map
1 -
Tutorial
1 -
Previous Date
1 -
XMLA End point
1 -
color reference
1 -
Date Time
1 -
Marker
1 -
Lineage
1 -
CSV file
1 -
conditional accumulative
1 -
Matrix Subtotal
1 -
Check
1 -
null value
1 -
Excel
1 -
Cumulative Totals
1 -
Report Theme
1 -
Bookmarking
1 -
oracle
1 -
mahak
1 -
pandas
1 -
Networkdays
1 -
Button
1 -
Dataset list
1 -
Keyboard Shortcuts
1 -
Fill Function
1 -
LOOKUPVALUE()
1 -
Tips &Tricks
1 -
Plotly package
1 -
Sameperiodlastyear
1 -
Office Theme
1 -
matrix
1 -
bar chart
1 -
Measures
1 -
powerbi argentina
1 -
Canvas Apps
1 -
total
1 -
Filter context
1 -
Difference between two dates
1 -
get data
1 -
OSI
1 -
Query format convert
1 -
ETL
1 -
Json files
1 -
Merge Rows
1 -
CONCATENATEX()
1 -
take over Datasets;
1 -
Networkdays.Intl
1 -
refresh M language Python script Support Insights
1 -
Governance
1 -
Fun
1 -
Power BI gateway
1 -
gateway
1 -
Elementary
1 -
Custom filters
1 -
Vertipaq Analyzer
1 -
powerbi cordoba
1 -
Model Driven Apps
1 -
REMOVEFILTERS
1 -
XMLA endpoint
1 -
translations
1 -
OSI pi
1 -
Parquet
1 -
Change rows to columns
1 -
remove spaces
1 -
Get row and column totals
1 -
Retail
1 -
Power BI Report Server
1 -
School
1 -
Cost-Benefit Analysis
1 -
DIisconnected Tables
1 -
Sandbox
1 -
Honeywell
1 -
Combine queries
1 -
X axis at different granularity
1 -
ADLS
1 -
Primary Key
1 -
Microsoft 365 usage analytics data
1 -
Randomly filter
1 -
Week of the Day
1 -
Azure AAD
1 -
query
1 -
Dynamic Visuals
1 -
KPI
1 -
Intro
1 -
Icons
1 -
ISV
1 -
Ties
1 -
unpivot
1 -
Practice Model
1 -
Continuous streak
1 -
ProcessVue
1 -
Create function
1 -
Table.Schema
1 -
Acknowledging
1 -
Postman
1 -
Text.ContainsAny
1 -
Power BI Show
1 -
Get latest sign-in data for each user
1
- 06-29-2025 - 07-05-2025
- 06-22-2025 - 06-28-2025
- 06-15-2025 - 06-21-2025
- 06-08-2025 - 06-14-2025
- 06-01-2025 - 06-07-2025
- 05-25-2025 - 05-31-2025
- 05-18-2025 - 05-24-2025
- 05-11-2025 - 05-17-2025
- 05-04-2025 - 05-10-2025
- 04-27-2025 - 05-03-2025
- 04-20-2025 - 04-26-2025
- 04-13-2025 - 04-19-2025
- 04-06-2025 - 04-12-2025
- View Complete Archives