- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Get the csv zipped file from a URL
Hi,
I came across a website providing the zipped csv file wanna get the csv into the Power Bi. Any indication? I need to get only the file in the top first row every day.
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Inspecting the HTML shows that these ZIP files are coded by day
So you could either assume that "yesterday's file" is available when you poll it, or run a screen scraper.
Assumptions are boring so lets go with the second approach. Power Query is reasonably adept at reading tables in HTML pages and actually works well with your URL.
let
Source = Web.Page(Web.Contents("https://www.emi.ea.govt.nz/Wholesale/Datasets/RealTimePrices/2020")),
Data0 = Source{0}[Data],
ZipFile = Web.Contents("https://www.emi.ea.govt.nz/Wholesale/Datasets/RealTimePrices/2020/" & Data0[Name]{0}),
ZipData = Unzip(ZipFile),
#"Imported CSV" = Csv.Document(ZipData,[Delimiter=",", Columns=9, Encoding=1252, QuoteStyle=QuoteStyle.None])
in
#"Imported CSV"
The Unzip function can be whatever your preference is, if you want to take the one from my blog you need to amend it slightly.
// expects binary of the ZIP file, only extracts the first data file after getting its size from the central directory
// https://en.wikipedia.org/wiki/Zip_(file_format)#Structure
(ZIPFile) =>
let
//read the entire ZIP file into memory - we'll use it often so this is worth it
Source = Binary.Buffer(ZIPFile),
// get the full size of the ZIP file
Size = Binary.Length(Source),
//Find the start of the central directory at the sixth to last byte
Directory = BinaryFormat.Record([
MiscHeader=BinaryFormat.Binary(Size-6),
Start=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian)
]) ,
Start = Directory(Source)[Start],
//find the first entry in the directory and get the compressed file size
FirstDirectoryEntry = BinaryFormat.Record([
MiscHeader=BinaryFormat.Binary(Start+20),
FileSize=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
UnCompressedFileSize=BinaryFormat.Binary(4),
FileNameLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
ExtrasLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian)
]) ,
//figure out where the raw data starts
Offset = 30+FirstDirectoryEntry(Source)[FileNameLen]+FirstDirectoryEntry(Source)[ExtrasLen],
Compressed = FirstDirectoryEntry(Source)[FileSize]+1,
//get the raw data of the compressed file
Raw = BinaryFormat.Record([
Header=BinaryFormat.Binary(Offset),
Data=BinaryFormat.Binary(Compressed)
])
// unzip it
in
Binary.Decompress(Raw(Source)[Data], Compression.Deflate)
And here's the result: (note the column headers are not part of the data, you need to supply that yourself).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Inspecting the HTML shows that these ZIP files are coded by day
So you could either assume that "yesterday's file" is available when you poll it, or run a screen scraper.
Assumptions are boring so lets go with the second approach. Power Query is reasonably adept at reading tables in HTML pages and actually works well with your URL.
let
Source = Web.Page(Web.Contents("https://www.emi.ea.govt.nz/Wholesale/Datasets/RealTimePrices/2020")),
Data0 = Source{0}[Data],
ZipFile = Web.Contents("https://www.emi.ea.govt.nz/Wholesale/Datasets/RealTimePrices/2020/" & Data0[Name]{0}),
ZipData = Unzip(ZipFile),
#"Imported CSV" = Csv.Document(ZipData,[Delimiter=",", Columns=9, Encoding=1252, QuoteStyle=QuoteStyle.None])
in
#"Imported CSV"
The Unzip function can be whatever your preference is, if you want to take the one from my blog you need to amend it slightly.
// expects binary of the ZIP file, only extracts the first data file after getting its size from the central directory
// https://en.wikipedia.org/wiki/Zip_(file_format)#Structure
(ZIPFile) =>
let
//read the entire ZIP file into memory - we'll use it often so this is worth it
Source = Binary.Buffer(ZIPFile),
// get the full size of the ZIP file
Size = Binary.Length(Source),
//Find the start of the central directory at the sixth to last byte
Directory = BinaryFormat.Record([
MiscHeader=BinaryFormat.Binary(Size-6),
Start=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian)
]) ,
Start = Directory(Source)[Start],
//find the first entry in the directory and get the compressed file size
FirstDirectoryEntry = BinaryFormat.Record([
MiscHeader=BinaryFormat.Binary(Start+20),
FileSize=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
UnCompressedFileSize=BinaryFormat.Binary(4),
FileNameLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
ExtrasLen=BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian)
]) ,
//figure out where the raw data starts
Offset = 30+FirstDirectoryEntry(Source)[FileNameLen]+FirstDirectoryEntry(Source)[ExtrasLen],
Compressed = FirstDirectoryEntry(Source)[FileSize]+1,
//get the raw data of the compressed file
Raw = BinaryFormat.Record([
Header=BinaryFormat.Binary(Offset),
Data=BinaryFormat.Binary(Compressed)
])
// unzip it
in
Binary.Decompress(Raw(Source)[Data], Compression.Deflate)
And here's the result: (note the column headers are not part of the data, you need to supply that yourself).
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thanks for your great help. I was away so sorry for belated reply. I applied the code you sent (the first short part) in the blank query and I got few errors as attached. I could not get through. and what I see is the list of zipped file names not their contents:
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

you will need the Unzip function as well - that's the code in the second listing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

@Alirezam Check out this blog article by @lbendlin , it seems to be exactly what you are looking for:
https://community.powerbi.com/t5/Community-Blog/Working-With-Zip-Files-in-Power-Query/ba-p/1190186
Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: Power BI Cookbook Third Edition (Color)
DAX is easy, CALCULATE makes DAX hard...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Helpful resources
Join us at the Microsoft Fabric Community Conference
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!
Power BI Monthly Update - February 2025
Check out the February 2025 Power BI update to learn about new features.

Subject | Author | Posted | |
---|---|---|---|
11-15-2024 04:53 PM | |||
09-20-2024 02:15 AM | |||
12-30-2024 06:17 PM | |||
09-17-2023 05:11 AM | |||
09-08-2024 12:39 PM |
User | Count |
---|---|
124 | |
114 | |
73 | |
65 | |
46 |