I've made an R script to scrape a certain website. It works fine when I run it in RStudio. Now I want to integrate it into Power BI Desktop so my co-workers can use it without having to open RStudio. However, I keep getting an error; it seems the read_html function doesn't work.
Below is my original script, which works in RStudio. It reads a list of URLs from an .xlsx file (Input.xlsx):
rm(list = ls())
library(rvest)
library(readxl)
library(xlsx)
library(rstudioapi)
# Set working directory and import URLs
setwd(dirname(getActiveDocumentContext()$path))
dfURL <- read_xlsx("Input.xlsx")
# Initiate dataframe
dfETA <- data.frame(matrix(ncol = 4, nrow = nrow(dfURL)))
colnames(dfETA) <- c("URL", "Vessel", "Destination", "ETA")
dfETA$URL <- dfURL$URL
for (i in 1:nrow(dfURL)){
  # Get URL & load webpage
  url <- as.character(dfURL[i,])
  page <- read_html(url)
  # Extract nodes by CSS selector
  CSSextract1 <- html_nodes(page, '.n3ata')
  CSSextract2 <- html_nodes(page, '.st')
  # Convert to text
  toText1 <- html_text(CSSextract1)
  toText2 <- html_text(CSSextract2)
  # Extract information from text
  destination <- trimws(toText1[1])
  ETA <- toText1[2]
  # Fill df with information
  dfETA$Vessel[i] <- toText2
  dfETA$Destination[i] <- destination
  dfETA$ETA[i] <- ETA
}
# Write to xlsx
write.xlsx(dfETA, "Output.xlsx", append = FALSE, row.names = FALSE)

I've already searched a lot; one of the things I found is that you should point out the file and library locations explicitly. In Power BI, I made sure that the library locations are the same as the ones RStudio uses (.libPaths()). Also, prior to this error, Power BI complained that it couldn't find the xml2 package; I installed and loaded it along with the other packages. The resulting code below is where I stand now and what produces the error. I've written it so that whoever is willing and able to help can just copy and paste the code, with the library locations resolved generically.
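An alternative to computing each package's location separately is to prepend the RStudio library directory once with `.libPaths()`. This is only a sketch: the path shown is a placeholder that you would replace with the first entry printed by running `.libPaths()` inside RStudio.

```r
# Placeholder path: substitute the first entry from .libPaths() in RStudio.
# .libPaths() silently drops directories that don't exist, so this is safe.
.libPaths(c("C:/Users/<you>/Documents/R/win-library/4.2", .libPaths()))

library(xml2)
library(rvest)
library(readxl)
```

After this, plain `library()` calls find the same packages that RStudio uses, without per-package `find.package()` gymnastics.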
libloc_rvest <- dirname(find.package("rvest"))
libloc_readxl <- dirname(find.package("readxl"))
libloc_rstudioapi <- dirname(find.package("rstudioapi"))
libloc_xml2 <- dirname(find.package("xml2"))
library(rvest, lib.loc=libloc_rvest)
library(readxl, lib.loc=libloc_readxl)
library(rstudioapi, lib.loc=libloc_rstudioapi)
library(xml2, lib.loc=libloc_xml2)
# Import URLs (full path, since Power BI has no script directory to set)
dfURL <- read_xlsx("N:/ETAscraper/ETAscraper/Input.xlsx")
# Initiate dataframe
dfETA <- data.frame(matrix(ncol = 4, nrow = nrow(dfURL)))
colnames(dfETA) <- c("URL", "Vessel", "Destination", "ETA")
dfETA$URL <- dfURL$URL
for (i in 1:nrow(dfURL)){
  # Get URL & load webpage
  url <- as.character(dfURL[i,])
  page <- read_html(url)
  # Extract nodes by CSS selector
  CSSextract1 <- html_nodes(page, '.n3ata')
  CSSextract2 <- html_nodes(page, '.st')
  # Convert to text
  toText1 <- html_text(CSSextract1)
  toText2 <- html_text(CSSextract2)
  # Extract information from text
  destination <- trimws(toText1[1])
  ETA <- toText1[2]
  # Fill df with information
  dfETA$Vessel[i] <- toText2
  dfETA$Destination[i] <- destination
  dfETA$ETA[i] <- ETA
}
# Write to xlsx
# write.xlsx(dfETA,"Output.xlsx", append = FALSE, row.names = FALSE)
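Power BI's R script dialog often truncates or obscures the real error message. One way to surface it is to wrap the call in `tryCatch` so the full message is printed. This is only a sketch; `safe_read_html` is a hypothetical helper name, not part of rvest.

```r
library(rvest)  # read_html comes via xml2, which rvest loads

# Hypothetical helper: instead of aborting, print the full error
# message (visible in Power BI's script output) and return NULL.
safe_read_html <- function(url) {
  tryCatch(
    read_html(url),
    error = function(e) {
      message("read_html failed for ", url, ": ", conditionMessage(e))
      NULL
    }
  )
}
```

A `NULL` result in the loop then tells you exactly which URL failed and why, instead of the whole script dying.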
If anyone could help me out, or point me in the right direction, that would be greatly appreciated!
Hi @Anonymous ,
It looks like an rvest error; check the reference below:
https://github.com/yusuzech/r-web-scraping-cheat-sheet/blob/master/README.md
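For example, a common cause of `read_html` failing outside an interactive session is the site rejecting R's default user agent. Fetching the page with httr and a browser-like agent, then parsing the response, sometimes helps. This is only a sketch; the URL is a placeholder for one of the URLs from Input.xlsx.

```r
library(httr)
library(rvest)

# Placeholder URL: replace with a URL from Input.xlsx.
url <- "https://example.com"

# Send a browser-like User-Agent header and fail loudly on HTTP errors.
resp <- GET(url, user_agent("Mozilla/5.0"))
stop_for_status(resp)

# Parse the downloaded body instead of letting read_html fetch the URL itself.
page <- read_html(content(resp, as = "text", encoding = "UTF-8"))
```

If this works in Power BI where a bare `read_html(url)` does not, the problem is the request, not the parsing.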
Best Regards,
Kelly
Did I answer your question? Mark my post as a solution!