- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
- Subscribe to RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
In my other community blog post, How to Scrape Multiple Webpages, I discussed how to connect to a table from a webpage and use the M code generated by the web connector's dialogue box to get the same table but from across multiple web pages. But how about if the web connector cannot detect a table or cannot generate a table with the information needed using Add Table Using Examples feature? A good example are image URLs. To do that, we'll need to inspect the underlying HTML code.
Head over to https://edge.pse.com.ph/companyPage/stockData.do?cmpy_id=15. This is a company information page of a company called ABS-CBN, a Philippine media company, in the Philippine Stock Exchange company listing website. Notice the image/logo of the company. Our goal is to be able to get the URL of that image.
Right-click on the image, click inspect/inspect element (the options will depend on the browser you're using I'm using Chrome) and then we will try to find a pattern.
The URL starts with /clogo. This could be a pattern but let's double check. Press Ctrl + F on your keyboard and type /clogo to check if there are more than one /clogo words. It appears there's only one so we will stick to this.
Let's go back to the query editor. Duplicate the StockData query from the attached sample pbix in my previous blog post and rename it to CompanyLogo. Delete all applied steps from and after Added Custom. Create a custom column called HTML using this formula. Web.BrowserContents allows as to access a webpage as HTML text but this is not our final formula as we want to be able to refresh the resulting dataset in Power BI Service.
= Web.BrowserContents("https://edge.pse.com.ph/companyInformation/form.do?cmpy_id=" & Text.From([Page Number])
This how the query should look like after adding that column:
Now let's add another column to extract the logo URL. Highlight HTML column, go to Add Column tab, click Extract and then Text Between Delimiters. Use /clogo as start and delimiter and " as end. The query should now look like below:
Notice the rows with a blank result. Let's go to the website and check if indeed the related pages do not have a logo. It appears they really don't so we'll leave it at that.
Now, complete the URL. Going back to the inspect element part of this blog, the complete path to the logo starts with the website's base URL. Replace all empty cells with null. Add a custom column called Logo URL with the following formula.
= if [Text Between Delimiters] <> null then "https://edge.pse.com.ph/clogo" & [Text Between Delimiters] else null
After removing custom columns HTML and Text Between Delimiters, the query should now look like this:
As mentioned earlier in the post, the use of Web.BrowserContents is only temporary. Chris Webb mentions in his blog that "...if you dynamically generate a url for use with Web.Contents() and do not use the RelativePath and/or Query options you will not be able refresh your dataset because the Power BI Service will not be able to analyse your code to discover what urls are present in it."
While it may be a different function, the same still applies. We made use of dynamic URLs and Web.BrowserContents doesn't have RelativePath parameter so using it will inhibit us from refresh out dataset in Powr BI service - until, at least, the service engine updates to accomodate such case.
However, do note that while Web.Contents has RelativePath parameter, unlike Web.BrowserContents, it doesn't return the underlying html code right from the get go but, instead, binary. To demonstrate, we will change the function used in our HTML custom column.
In order to return the URL as HTML code, we need to use Lines.FromBinary to return a list which every item is a line of html code and Text.Combine with line feed "#(lf)" as the delimiter to combine each item in the list into a single text. Change the custom column formula to the following.
let
html = Web.Contents ( "https://edge.pse.com.ph", [RelativePath =
"companyInformation/form.do", Query = [cmpy_id = Text.From ( [Page Number] ) ] ] )
in Text.Combine ( Lines.FromBinary ( html ), "#(lf)" )
The rest of the query will remain as is so we should still be seeing the same result. You may now load this into the model.
In this blog post, we have discussed
- how to get the image URL from a webpage by finding the pattern in the underlying HTML.
- how to return the binary file returned by Web.BrowserContents as html.
Attached is sample pbix for youre reference. A live report using a similar technique can also be found here - https://app.powerbi.com/view?r=eyJrIjoiZjljYmViZTQtMGI1NC00ZmI3LTlkNTUtYTIxYmIxMWVlMjk0IiwidCI6ImRmO...
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
- While Loop in M Language - In Short
- 🌟 Judges’ Favorites: Honoring More Standout Entri...
- 🎉 And the Winners Are… Celebrating the Best of th...
- 🎉Power BI Turns 10! Grab your digital swag!
- Level Up Your Reports: Mastering Power BI Performa...
- Integrate Python Scripts in Power BI
- ISINSCOPE vs HASONEVALUE in Power BI: The Real Dif...
- Great-Looking Dashboard, Bad Decisions: How Poor D...
- Button Slicer / How to Make Images as Slicers in P...
- Power BI Enhancements You Need to Know – Part 7: V...
-
slindsay on: 🎉Power BI Turns 10! Grab your digital swag!
- sgunasekhar1 on: Level Up Your Reports: Mastering Power BI Performa...
-
bhanu_gautam on: ISINSCOPE vs HASONEVALUE in Power BI: The Real Dif...
-
v-agajavelly on: Button Slicer / How to Make Images as Slicers in P...
- lokesh0909 on: Unlocking the Power of Calculation Group - Beyond ...
-
Abhilash_P
on: Seamless Power BI Report Management with SharePoin...
-
Magudeswaran_MR
on: Field Parameters in Power BI
- Vikranth426 on: How to Organize Measures Effectively in Power BI
-
Magudeswaran_MR
on: Power BI Enhancements You Need to Know - Part 6: P...
-
technolog on: #PBI10 Dataviz Contest | Meet the Judges
-
How to
672 -
Tips & Tricks
649 -
Events
146 -
Support insights
121 -
Opinion
83 -
DAX
66 -
Power BI
65 -
Power Query
62 -
Power BI Dev Camp
45 -
Power BI Desktop
40 -
Roundup
37 -
Dataflow
23 -
Featured User Group Leader
21 -
Power BI Embedded
20 -
Time Intelligence
19 -
Tips&Tricks
18 -
PowerBI REST API
12 -
Data Protection
11 -
Power Query Tips & Tricks
8 -
finance
8 -
Power BI Service
8 -
Direct Query
7 -
Power BI REST API
6 -
Auto ML
6 -
Data Analysis
6 -
financial reporting
6 -
Power Automate
6 -
Data Visualization
6 -
Python
6 -
Dax studio
5 -
Income Statement
5 -
powerbi
5 -
service
5 -
Power BI PowerShell
5 -
Machine Learning
5 -
RLS
4 -
M language
4 -
External tool
4 -
Paginated Reports
4 -
Power BI Goals
4 -
Desktop
4 -
PowerShell
4 -
Bookmarks
4 -
Group By
4 -
Line chart
4 -
community
4 -
Data model
3 -
Conditional Formatting
3 -
Life Sciences
3 -
Visualisation
3 -
Administration
3 -
M code
3 -
SQL Server 2017 Express Edition
3 -
Visuals
3 -
R script
3 -
Aggregation
3 -
calendar
3 -
Gateways
3 -
Webinar
3 -
R
3 -
M Query
3 -
R visual
3 -
CALCULATE
3 -
Reports
3 -
PowerApps
3 -
Tips and Tricks
3 -
Data Science
3 -
Azure
3 -
M
2 -
hierarchies
2 -
Power BI Anniversary
2 -
Language M
2 -
Custom Visual
2 -
VLOOKUP
2 -
pivot
2 -
calculated column
2 -
Power BI Premium Per user
2 -
inexact
2 -
Date Comparison
2 -
Split
2 -
Forecasting
2 -
REST API
2 -
Editor
2 -
Working with Non Standatd Periods
2 -
powerbi.tips
2 -
Custom function
2 -
Reverse
2 -
measure
2 -
Microsoft-flow
2 -
Paginated Report Builder
2 -
PUG
2 -
Custom Measures
2 -
Filtering
2 -
Row and column conversion
2 -
Python script
2 -
Nulls
2 -
DVW Analytics
2 -
Industrial App Store
2 -
Week
2 -
Date duration
2 -
parameter
2 -
Weekday Calendar
2 -
Support insights.
2 -
construct list
2 -
Formatting
2 -
Workday
2 -
external tools
2 -
slicers
2 -
SAP
2 -
Power Platform
2 -
index
2 -
RANKX
2 -
Integer
2 -
PBI Desktop
2 -
Date Dimension
2 -
Power BI Challenge
2 -
Query Parameter
2 -
Date
2 -
Visualization
2 -
Tabular Editor
2 -
SharePoint
2 -
Power BI Installation and Updates
2 -
How Things Work
2 -
Troubleshooting
2 -
Date DIFF
2 -
Transform data
2 -
Healthcare
2 -
rank
2 -
ladataweb
2 -
Incremental Refresh
2 -
Query Plans
2 -
Power BI & Power Apps
2 -
Random numbers
2 -
Day of the Week
2 -
Number Ranges
2 -
DIisconnected Tables
1 -
Sandbox
1 -
Honeywell
1 -
Combine queries
1 -
X axis at different granularity
1 -
ADLS
1 -
Primary Key
1 -
Microsoft 365 usage analytics data
1 -
Randomly filter
1 -
Week of the Day
1 -
Azure AAD
1 -
Retail
1 -
Power BI Report Server
1 -
School
1 -
Cost-Benefit Analysis
1 -
ISV
1 -
Ties
1 -
unpivot
1 -
Practice Model
1 -
Continuous streak
1 -
ProcessVue
1 -
Create function
1 -
Table.Schema
1 -
Acknowledging
1 -
Postman
1 -
Text.ContainsAny
1 -
Power BI Show
1 -
Get latest sign-in data for each user
1 -
query
1 -
Dynamic Visuals
1 -
KPI
1 -
Intro
1 -
Icons
1 -
Issues
1 -
function
1 -
stacked column chart
1 -
ho
1 -
ABB
1 -
KNN algorithm
1 -
List.Zip
1 -
optimization
1 -
Artificial Intelligence
1 -
Map Visual
1 -
Text.ContainsAll
1 -
Tuesday
1 -
API
1 -
Kingsley
1 -
Merge
1 -
variable
1 -
financial reporting hierarchies RLS
1 -
Featured Data Stories
1 -
MQTT
1 -
Custom Periods
1 -
Partial group
1 -
Reduce Size
1 -
FBL3N
1 -
Wednesday
1 -
Power Pivot
1 -
help
1 -
group
1 -
Scorecard
1 -
Json
1 -
Tops
1 -
Multivalued column
1 -
Pipeline
1 -
Path
1 -
Yokogawa
1 -
Dynamic calculation
1 -
Data Wrangling
1 -
native folded query
1 -
transform table
1 -
UX
1 -
Cell content
1 -
General Ledger
1 -
Thursday
1 -
Quick Tips
1 -
data
1 -
PBIRS
1 -
Usage Metrics in Power BI
1 -
HR Analytics
1 -
keepfilters
1 -
Connect Data
1 -
Financial Year
1 -
Schneider
1 -
dynamically delete records
1 -
Copy Measures
1 -
Friday
1 -
Q&A
1 -
Table
1 -
Natural Query Language
1 -
Infographic
1 -
automation
1 -
Prediction
1 -
newworkspacepowerbi
1 -
Performance KPIs
1 -
Active Employee
1 -
Custom Date Range on Date Slicer
1 -
refresh error
1 -
PAS
1 -
certain duration
1 -
DA-100
1 -
bulk renaming of columns
1 -
Single Date Picker
1 -
Monday
1 -
PCS
1 -
Saturday
1 -
update
1 -
Event
1 -
Custom Visuals
1 -
Free vs Pro
1 -
Format
1 -
Current Employees
1 -
date hierarchy
1 -
relationship
1 -
SIEMENS
1 -
Multiple Currency
1 -
Power BI Premium
1 -
On-premises data gateway
1 -
Binary
1 -
Power BI Connector for SAP
1 -
Sunday
1 -
Training
1 -
Slicer
1 -
Visual
1 -
forecast
1 -
Regression
1 -
CICD
1 -
sport statistics
1 -
Intelligent Plant
1 -
Circular dependency
1 -
GE
1 -
Exchange rate
1 -
Dendrogram
1 -
range of values
1 -
activity log
1 -
Decimal
1 -
Charticulator Challenge
1 -
Field parameters
1 -
Announcement
1 -
Features
1 -
domain
1 -
pbiviz
1 -
Color Map
1 -
Industrial
1 -
Weekday
1 -
Working Date
1 -
Space Issue
1 -
Emerson
1 -
Date Table
1 -
Cluster Analysis
1 -
Stacked Area Chart
1 -
union tables
1 -
Number
1 -
Start of Week
1 -
Tips& Tricks
1 -
Workspace
1 -
deployment
1 -
ssrs traffic light indicators
1 -
SQL
1 -
trick
1 -
Scripts
1 -
Topper Color On Map
1 -
Historians
1 -
context transition
1 -
Custom textbox
1 -
OPC
1 -
Zabbix
1 -
Label: DAX
1 -
Business Analysis
1 -
Supporting Insight
1 -
rank value
1 -
Synapse
1 -
End of Week
1 -
Tips&Trick
1 -
Theme Colours
1 -
Text
1 -
Flow
1 -
Publish to Web
1 -
Extract
1 -
Top Category Color
1 -
A&E data
1 -
Previous Order
1 -
Substring
1 -
Wonderware
1 -
Power M
1 -
Format DAX
1 -
Custom functions
1 -
accumulative
1 -
DAX&Power Query
1 -
Premium Per User
1 -
GENERATESERIES
1 -
Showcase
1 -
custom connector
1 -
Waterfall Chart
1 -
Power BI On-Premise Data Gateway
1 -
patch
1 -
Top Brand Color on Map
1 -
Tutorial
1 -
Previous Date
1 -
XMLA End point
1 -
color reference
1 -
Date Time
1 -
Marker
1 -
Lineage
1 -
CSV file
1 -
conditional accumulative
1 -
Matrix Subtotal
1 -
Check
1 -
null value
1 -
Excel
1 -
Report Server
1 -
Audit Logs
1 -
analytics pane
1 -
step by step
1 -
pandas
1 -
Networkdays
1 -
Button
1 -
Dataset list
1 -
Keyboard Shortcuts
1 -
Fill Function
1 -
LOOKUPVALUE()
1 -
Tips &Tricks
1 -
Plotly package
1 -
Cumulative Totals
1 -
Report Theme
1 -
Bookmarking
1 -
oracle
1 -
mahak
1 -
total
1 -
Filter context
1 -
Difference between two dates
1 -
get data
1 -
OSI
1 -
Query format convert
1 -
ETL
1 -
Json files
1 -
Merge Rows
1 -
CONCATENATEX()
1 -
take over Datasets;
1 -
Networkdays.Intl
1 -
refresh M language Python script Support Insights
1 -
Sameperiodlastyear
1 -
Office Theme
1 -
matrix
1 -
bar chart
1 -
Measures
1 -
powerbi argentina
1 -
Canvas Apps
1 -
REMOVEFILTERS
1 -
XMLA endpoint
1 -
translations
1 -
OSI pi
1 -
Parquet
1 -
Change rows to columns
1 -
remove spaces
1 -
Get row and column totals
1 -
Tutorial Requests
1 -
Governance
1 -
Fun
1 -
Power BI gateway
1 -
gateway
1 -
Elementary
1 -
Custom filters
1 -
Vertipaq Analyzer
1 -
powerbi cordoba
1 -
Model Driven Apps
1
- 07-20-2025 - 07-24-2025
- 07-13-2025 - 07-19-2025
- 07-06-2025 - 07-12-2025
- 06-29-2025 - 07-05-2025
- 06-22-2025 - 06-28-2025
- 06-15-2025 - 06-21-2025
- 06-08-2025 - 06-14-2025
- 06-01-2025 - 06-07-2025
- 05-25-2025 - 05-31-2025
- 05-18-2025 - 05-24-2025
- 05-11-2025 - 05-17-2025
- 05-04-2025 - 05-10-2025
- 04-27-2025 - 05-03-2025
- 04-20-2025 - 04-26-2025
- View Complete Archives