Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Anonymous
Not applicable

Data importing issue / approach for hyperlinks

Hi All, sorry I am new to all of this and to Power BI so apologies if I am asking very basic questions.

 

Objective:

  • I am trying to create a database which has a list of companies registered in a country with fiels like name of the company, address, ... and accounts information (e.g.: shareholders funds, creditors, debitors, profit, .... - the usual financial info)

 

Where I have got to:

  • The information I need is made available by an official website in two different type of files:
  1.  One massive csv file that include 4.5mil companies (this includes all information except for the financials - each company has an ID)
  2. Zipped folders (so 12 zip files) that include between 178K to 850K html files. Each file represent the financial accounts of each company within the 4.5mil csv file above

 - I have uploaded file (1) into Power BI and discovered that it does have a handy column which shows the hyperlinks to retrieve for each row (each company) related to their financial accounts potentially making useless downloading file (2) as well....

 

Issues:

  1. I don't know if / how to use those hyperlinks to import financials. Given the number (c.4.5mil) I suspect there may be a much easier / more efficient way to do so
  2. I have played with the importing of one of those HTML files to see how Power BI reads it. It looks like the HTML includes some table and with some effort I can isolate those tables that have relevant financials. The point though is going one by one will take probably a year... 😁

 

Thanks

Patrick

Annotation 2020-05-16 172135.jpg

1 ACCEPTED SOLUTION
AlB
Community Champion
Community Champion

Create an custome column and in the custom column formula use

 

= Csv.Document(Web.Contents([URI]),[Delimiter=",", Columns=18, Encoding=65001, QuoteStyle=QuoteStyle.None])

 

Reading from a website can take long so I would first try it with a few rows only. Again, if this doesn't work do share some sampel data so that I can be more specific in my answers

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs

Cheers 

SU18_powerbi_badge

View solution in original post

3 REPLIES 3
AlB
Community Champion
Community Champion

Hi @Anonymous 

Based on the table you are showing, with the URLs in the forst column, you could create a custom column in which you use a function to extract the info from each URL/Company. I've seen that the site offers the info in different file formats (json, html, csv...)

You could use something like:

 

Source = Csv.Document(Web.Contents( [FirstColumnNameHere_with_the_URLS]),[Delimiter=",", Columns=18, Encoding=65001, QuoteStyle=QuoteStyle.None])

 

and then add the transformations thta you need to get to the table format you want. After that you would have a table for each URL and just need to combine them all in one final table with all the finacial information. If you share a sample of the data (or point me to where I can download it) I'd be able to be more specific.

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs

Cheers 

SU18_powerbi_badge

Anonymous
Not applicable

Hi as requested I am posting through initial thread.

 

Firstly thanks for your help.   I have tried to use your 'Source' code for the custom column and typed it as per image but doesn't seem to work. It may be as very new to Power BI and its functions so would welcome if you can shed further lights on the why. Thanks in advance for your further assistance.

 

Annotation 2020-05-17 155625.jpg

AlB
Community Champion
Community Champion

Create an custome column and in the custom column formula use

 

= Csv.Document(Web.Contents([URI]),[Delimiter=",", Columns=18, Encoding=65001, QuoteStyle=QuoteStyle.None])

 

Reading from a website can take long so I would first try it with a few rows only. Again, if this doesn't work do share some sampel data so that I can be more specific in my answers

Please mark the question solved when done and consider giving kudos if posts are helpful.

Contact me privately for support with any larger-scale BI needs

Cheers 

SU18_powerbi_badge

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.