Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi all
I am looking to extract data from multiple ZIP files all stored in the same folder. REsearching online I I was able to find Mark White's function that should enable the extraction of csv or txt files from any zip: Reading Zip files in PowerQuery / M (sql10.blogspot.com)
Upon testing with my files, for some reason I only get one file returned from the function and also I am unable to extract its content.
The zip files has a folder structure, meaning the actual txt files are nexted within folders. But as per my knowledeg that should not cause an issue to the function.
Anyone have an idea?
For reference I've linked one of the ZIPs: Link to the ZIP
And below you can find the function:
(ZIPFile) =>
let
Header = BinaryFormat.Record([
MiscHeader = BinaryFormat.Binary(14),
BinarySize = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
FileSize = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
FileNameLen= BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
ExtrasLen = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian)
]),
HeaderChoice = BinaryFormat.Choice(
BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
each if _ <> 67324752 // not the IsValid number? then return a dummy formatter
then BinaryFormat.Record([IsValid = false, Filename=null, Content=null])
else BinaryFormat.Choice(
BinaryFormat.Binary(26), // Header payload - 14+4+4+2+2
each BinaryFormat.Record([
IsValid = true,
Filename = BinaryFormat.Text(Header(_)[FileNameLen]),
Extras = BinaryFormat.Text(Header(_)[ExtrasLen]),
Content = BinaryFormat.Transform(
BinaryFormat.Binary(Header(_)[BinarySize]),
(x) => try Binary.Buffer(Binary.Decompress(x, Compression.Deflate)) otherwise null
)
]),
type binary // enable streaming
)
),
ZipFormat = BinaryFormat.List(HeaderChoice, each _[IsValid] = true),
Entries = List.Transform(
List.RemoveLastN( ZipFormat(ZIPFile), 1),
(e) => [FileName = e[Filename], Content = e[Content] ]
)
in
Table.FromRecords(Entries)
Solved! Go to Solution.
Hi @Data-Crunch
The below post by @artemus contains a function that appears to work with ZIP files containing multiple files:
It returns a list of records, each of which contains fields FileName and Content.
I have pasted the function here for convenience:
// Unzip
(ZIPFile) =>
let
ushort = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
uint = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
EDOCfn = BinaryFormat.Record([
ZipContent = BinaryFormat.Binary(Binary.Length(ZIPFile) - 22),
Magic = uint,
DiskNum = ushort,
CDirectoryDiskId = ushort,
CDirectoryRecordCountOnDisk = ushort,
CDirectoryRecordCount = ushort,
SizeOfCentralDirectory = uint,
CentralDirectoryOffset = uint,
CommendLength = ushort
]),
EDOC = EDOCfn(ZIPFile),
BeforeCentralDirectory = BinaryFormat.Binary(EDOC[CentralDirectoryOffset]),
CentralDirectory = BinaryFormat.Length(BinaryFormat.Record(
[
ZipContent = BeforeCentralDirectory,
Items = BinaryFormat.List(BinaryFormat.Record(
[
Magic = uint,
CurrentVersion = ushort,
MinVersion = ushort,
Flags = ushort,
CompressionMethod = ushort,
FileModificationTime = ushort,
FileModificationDate = ushort,
CRC32 = uint,
BinarySize = uint,
FileSize = uint,
FileInfo = BinaryFormat.Choice(
BinaryFormat.Record(
[
Len = ushort,
FieldsLen = ushort,
FileCommentLength = ushort,
Disk = ushort,
InternalFileAttr = ushort,
ExternalAttr = uint,
PosOfFileHeader = uint
]),
(fileInfo) => BinaryFormat.Record(
[
FileName = BinaryFormat.Text(fileInfo[Len], TextEncoding.Ascii),
Fields = BinaryFormat.Binary(fileInfo[FieldsLen]),
FileComment = BinaryFormat.Text(fileInfo[FileCommentLength], TextEncoding.Ascii),
Disk = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[Disk]),
InternalFileAttr = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[Disk]),
ExternalAttr = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[InternalFileAttr]),
PosOfFileHeader = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[PosOfFileHeader])
])
)
]),
EDOC[CDirectoryRecordCount]
)
]),
EDOC[CentralDirectoryOffset] + EDOC[SizeOfCentralDirectory]),
Contents = List.Transform(
CentralDirectory(ZIPFile)[Items],
(cdEntry) =>
let
ZipEntry = BinaryFormat.Record(
[
PreviousData = BinaryFormat.Binary(cdEntry[FileInfo][PosOfFileHeader]),
Magic = uint,
ZipVersion = ushort,
ZipFlags = ushort,
CompressionMethod = ushort,
FileModificationTime = ushort,
FileModificationDate = ushort,
CRC32 = uint,
BinarySize = uint,
FileSize = uint,
FileName = BinaryFormat.Choice(
BinaryFormat.Record(
[
Len = ushort,
FieldsLen = ushort
]),
(fileInfo) => BinaryFormat.Record(
[
FileName = BinaryFormat.Text(fileInfo[Len], TextEncoding.Ascii),
Fields = BinaryFormat.Binary(fileInfo[FieldsLen])
])
),
FileContent = BinaryFormat.Transform(
BinaryFormat.Binary(cdEntry[BinarySize]),
each Binary.Decompress(_, Compression.Deflate)
)
])(ZIPFile)
in
[FileName=ZipEntry[FileName][FileName], Content=ZipEntry[FileContent]]
)
in
Contents
Hi @Data-Crunch
The below post by @artemus contains a function that appears to work with ZIP files containing multiple files:
It returns a list of records, each of which contains fields FileName and Content.
I have pasted the function here for convenience:
// Unzip
(ZIPFile) =>
let
ushort = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger16, ByteOrder.LittleEndian),
uint = BinaryFormat.ByteOrder(BinaryFormat.UnsignedInteger32, ByteOrder.LittleEndian),
EDOCfn = BinaryFormat.Record([
ZipContent = BinaryFormat.Binary(Binary.Length(ZIPFile) - 22),
Magic = uint,
DiskNum = ushort,
CDirectoryDiskId = ushort,
CDirectoryRecordCountOnDisk = ushort,
CDirectoryRecordCount = ushort,
SizeOfCentralDirectory = uint,
CentralDirectoryOffset = uint,
CommendLength = ushort
]),
EDOC = EDOCfn(ZIPFile),
BeforeCentralDirectory = BinaryFormat.Binary(EDOC[CentralDirectoryOffset]),
CentralDirectory = BinaryFormat.Length(BinaryFormat.Record(
[
ZipContent = BeforeCentralDirectory,
Items = BinaryFormat.List(BinaryFormat.Record(
[
Magic = uint,
CurrentVersion = ushort,
MinVersion = ushort,
Flags = ushort,
CompressionMethod = ushort,
FileModificationTime = ushort,
FileModificationDate = ushort,
CRC32 = uint,
BinarySize = uint,
FileSize = uint,
FileInfo = BinaryFormat.Choice(
BinaryFormat.Record(
[
Len = ushort,
FieldsLen = ushort,
FileCommentLength = ushort,
Disk = ushort,
InternalFileAttr = ushort,
ExternalAttr = uint,
PosOfFileHeader = uint
]),
(fileInfo) => BinaryFormat.Record(
[
FileName = BinaryFormat.Text(fileInfo[Len], TextEncoding.Ascii),
Fields = BinaryFormat.Binary(fileInfo[FieldsLen]),
FileComment = BinaryFormat.Text(fileInfo[FileCommentLength], TextEncoding.Ascii),
Disk = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[Disk]),
InternalFileAttr = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[Disk]),
ExternalAttr = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[InternalFileAttr]),
PosOfFileHeader = BinaryFormat.Transform(BinaryFormat.Null, each fileInfo[PosOfFileHeader])
])
)
]),
EDOC[CDirectoryRecordCount]
)
]),
EDOC[CentralDirectoryOffset] + EDOC[SizeOfCentralDirectory]),
Contents = List.Transform(
CentralDirectory(ZIPFile)[Items],
(cdEntry) =>
let
ZipEntry = BinaryFormat.Record(
[
PreviousData = BinaryFormat.Binary(cdEntry[FileInfo][PosOfFileHeader]),
Magic = uint,
ZipVersion = ushort,
ZipFlags = ushort,
CompressionMethod = ushort,
FileModificationTime = ushort,
FileModificationDate = ushort,
CRC32 = uint,
BinarySize = uint,
FileSize = uint,
FileName = BinaryFormat.Choice(
BinaryFormat.Record(
[
Len = ushort,
FieldsLen = ushort
]),
(fileInfo) => BinaryFormat.Record(
[
FileName = BinaryFormat.Text(fileInfo[Len], TextEncoding.Ascii),
Fields = BinaryFormat.Binary(fileInfo[FieldsLen])
])
),
FileContent = BinaryFormat.Transform(
BinaryFormat.Binary(cdEntry[BinarySize]),
each Binary.Decompress(_, Compression.Deflate)
)
])(ZIPFile)
in
[FileName=ZipEntry[FileName][FileName], Content=ZipEntry[FileContent]]
)
in
Contents
This one worked like a charm! Thanks!
Check out the July 2025 Power BI update to learn about new features.
User | Count |
---|---|
73 | |
72 | |
39 | |
31 | |
26 |
User | Count |
---|---|
97 | |
87 | |
43 | |
40 | |
35 |