Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
doyoudo911
New Member

Decompressing Zip resulting in null

I'm trying to decompress a zip. It can contain one or more txt files. In this example I have created two with different names but the exact same content (for testing). My problem is that the first will always show up as null whereas the other shows as binary. When I run it containing one txt file, that one shows up as null. How do I get it to show/import the data and not show null? I'm using the code from Mark Whites Blog 

 

pic.png

4 REPLIES 4
Vijay_A_Verma
Most Valuable Professional
Most Valuable Professional

For null issue, use this code

// Function build for decompressing a Zip file even if the filelength is missing in the localheader
// More info on zip files here: https://en.wikipedia.org/wiki/ZIP_(file_format)
// Known limitation: If there is a comment appended to the central header then this function will fail. You can use a hex editor to find these comments at the end of the file wicht will be just readable text  
//Credit - https://github.com/Michael19842/PowerBiFunctions/blob/main/ZipFile/Unzip.m
let
    Source = File.Contents("C:\Users\Downloads\fo16AUG2022bhav.csv.zip"),
    UnzipContents=(ZipFile as binary) =>
        let
            //Load the file into a buffer
            ZipFileBuffer = Binary.Buffer(ZipFile),
            ZipFileSize = Binary.Length(ZipFileBuffer),
            //Constant values used in the query
            CentralHeaderSignature = 0x02014b50,
            CentralHeaderSize = 42,
            LocalHeaderSize = 30,
            // Predefined byteformats that are used many times over 
            Unsigned16BitLittleIEndian =
                BinaryFormat.ByteOrder(
                    BinaryFormat.UnsignedInteger16,
                    ByteOrder.LittleEndian
                ),
            Unsigned32BitLittleIEndian =
                BinaryFormat.ByteOrder(
                    BinaryFormat.UnsignedInteger32,
                    ByteOrder.LittleEndian
                ),
            // Definition of central directory header
            CentralDirectoryHeader =
                BinaryFormat.Record(
                    [
                        Version = Unsigned16BitLittleIEndian,
                        VersionNeeded = Unsigned16BitLittleIEndian,
                        GeneralPurposeFlag = Unsigned16BitLittleIEndian,
                        CompressionMethod = Unsigned16BitLittleIEndian,
                        LastModifiedTime = Unsigned16BitLittleIEndian,
                        LastModifiedDate = Unsigned16BitLittleIEndian,
                        CRC32 = Unsigned32BitLittleIEndian,
                        CompressedSize = Unsigned32BitLittleIEndian,
                        UncompressedSize = Unsigned32BitLittleIEndian,
                        FileNameLength = Unsigned16BitLittleIEndian,
                        ExtrasLength = Unsigned16BitLittleIEndian,
                        FileCommentLenght = Unsigned16BitLittleIEndian,
                        DiskNumberStarts = Unsigned16BitLittleIEndian,
                        InternalFileAttributes = Unsigned16BitLittleIEndian,
                        EnternalFileAttributes = Unsigned32BitLittleIEndian,
                        LocalHeaderOffset = Unsigned32BitLittleIEndian
                    ]
                ),
            // Definition of the end of central directory record
            EndOfCentralDirectoryRecord =
                BinaryFormat.Record(
                    [
                        RestOfFile = BinaryFormat.Binary(ZipFileSize - 22),
                        EOCDsignature = Unsigned32BitLittleIEndian,
                        NumberOfThisDisk = Unsigned16BitLittleIEndian,
                        DiskWhereCentralDirectoryStarts = Unsigned16BitLittleIEndian,
                        NumberOfRecordsOnThisDisk = Unsigned16BitLittleIEndian,
                        TotalNumberOfRecords = Unsigned16BitLittleIEndian,
                        CentralDirectorySize = Unsigned32BitLittleIEndian,
                        OffsetToStart = Unsigned32BitLittleIEndian
                    ]
                ),
            //Formatter used for building a table of all files in te central directory
            CentralHeaderFormatter =
                BinaryFormat.Choice(
                    Unsigned32BitLittleIEndian,
                    // Should contain the signature
                    each
                        if _ <> CentralHeaderSignature // Test if the signature is not there
                        then
                            BinaryFormat.Record(
                                [
                                    LocalHeaderOffset = null,
                                    CompressedSize = null,
                                    FileNameLength = null,
                                    HeaderSize = null,
                                    IsValid = false,
                                    Filename = null
                                ]
                            )
                        // if so create a dummy entry 
                        else
                            BinaryFormat.Choice(
                                //Catch the staticly sized part of the central header
                                BinaryFormat.Binary(CentralHeaderSize),
                                //Create a record containing the files size, offset(of the local header), name, etc.. 
                                each
                                    BinaryFormat.Record(
                                        [
                                            LocalHeaderOffset = CentralDirectoryHeader(_)[LocalHeaderOffset],
                                            CompressedSize = CentralDirectoryHeader(_)[CompressedSize],
                                            FileNameLength = CentralDirectoryHeader(_)[FileNameLength],
                                            HeaderSize =
                                                LocalHeaderSize
                                                + CentralDirectoryHeader(_)[FileNameLength]
                                                + CentralDirectoryHeader(_)[ExtrasLength],
                                            IsValid = true,
                                            Filename = BinaryFormat.Text(CentralDirectoryHeader(_)[FileNameLength])
                                        ]
                                    ),
                                type binary
                            )
                ),
            //Get a record of the end of central directory, this contains the offset of the central header so we can itterate from that position
            EOCDR = EndOfCentralDirectoryRecord(ZipFileBuffer),
            //Get the central directory as a binary extract
            CentralDirectory =
                Binary.Range(
                    ZipFileBuffer,
                    EOCDR[OffsetToStart]
                ),
            //A list formatter for the central directory  
            CentralDirectoryFormatter =
                BinaryFormat.List(
                    CentralHeaderFormatter,
                    each _[IsValid] = true
                ),
            //Get a Table from Records containing the file info extracted from the central directory
            FilesTable =
                Table.FromRecords(
                    List.RemoveLastN(
                        CentralDirectoryFormatter(CentralDirectory),
                        1
                    )
                ),
            //Add the binary to the table and decompress it
            ReturnValue =
                Table.AddColumn(
                    FilesTable,
                    "Content",
                    each
                        Binary.Decompress(
                            Binary.Range(
                                ZipFileBuffer,
                                [LocalHeaderOffset] + [HeaderSize],
                                [CompressedSize]
                            ),
                            Compression.Deflate
                        )
                )
        in
            ReturnValue,
            Files = UnzipContents(Source),
            Content = Files{0}[Content]
in
    Content

If I'm using a portal like sharepoint, how would I go about updating the first line of the source? When changing the link it's either too long or error says must be a valid absolute path.

Vijay_A_Verma
Most Valuable Professional
Most Valuable Professional

Can you first try this with a local zip file and see whether the code is working or not for you?

I get: "DataFormat.Error: Failed to construct a huffman tree using the length array. The stream might be corrupted."

 

For what it's worth the data is double piped delimited. 

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors