Solved: Re: decompress, skip lines, delimit

manyad

All, I have a bunch of .gz files in a folder that I have to import into Power BI. After looking up online, i have built the following function. My files have the very first record which is file info header (with about 4 strings concatenated by a pipe), while the rest of the data has about 20 odd fields split by a pipe.

Using the below function, I get only 4 fields because of the file header record. If i use the commented line, i end up with all data in a single column with a different challenge i.e. promoting headers becomes a problem.

In essence, i need to:

1. decompress the zips

2. skip the first line of the file

3. read the file after delimiting

4. promote headers as column names

Is anyone able to advise how the above is possible?

(gzFile) =>
let
Decompressed = Binary.Decompress(File.Contents(gzFile), Compression.GZip),
//Imported = Table.FromColumns({Lines.FromBinary(Decompressed,null,null,1252)}),
Imported = Csv.Document(Decompressed, [Delimiter="|", Encoding=1252]),
SkipRowFirst = Table.Skip(Imported,1),
SkipRowLast = Table.RemoveLastN(SkipRowFirst,1),
PromotedHeaders = Table.PromoteHeaders(SkipRowLast , [PromoteAllScalars=true])
in
PromotedHeaders

followed by this query to read the files:

let
Source = Folder.Files("C:\temp"),
gzFiles = Table.SelectRows(Source, each ([Extension] = ".gz")),
unpackGZ = Table.AddColumn(gzFiles, "Custom", each unzip([Folder Path]&[Name]))
in
unpackGZ

MarkLaf

Edit: actually, looking again at your issue + code (which is basically the same as mine), I think the main thing to change is to add a Columns field to your options record. When it's unspecified (as we are both doing in our M), I think it just looks at top row to determine column count (some number less than the ~20 you mentioned). So, I think the simple fix for you is:

Imported = Csv.Document(Decompressed, [Columns=20, Delimiter="|", Encoding=1252]), // <-- added Columns=#

Or, upstream, get the appropriate number of delimiters appended to first row such that it matches column count for primary data

Original post for references:

The following worked for me.

For reference, I made two simple csvs of format:

Zipped each with 7zip into gz. Note: original csvs in same folder but we are filtering them out per your procedure.

The following M successfully unpacks these.

let
    Source = Folder.Files( #"<FolderPath>" ),
    FilterGz = Table.SelectRows(Source, each [Extension] = ".gz"),
    AddGzParse = Table.AddColumn(
        FilterGz, "gz_tables", each 
        // steps for unpacking each gz - you can put these in a separate function if desired
        [ 
            gz = [Content],
            csv_binary = Binary.Decompress(gz, Compression.GZip), 
            csv_table = Csv.Document(csv_binary,null,"|"), 
            csv_table_trim = Table.RemoveLastN( Table.RemoveFirstN(csv_table,1), 1), 
            csv_table_promoted = Table.PromoteHeaders(csv_table_trim)
        ] [csv_table_promoted]
    ),
    // use our output from first row to automate column type and expansion
    // note, it's csv so all cols come in as text
    FirstTableType = Value.Type( List.First( AddGzParse[gz_tables] ) ),
    FixGzParseColType = Table.TransformColumns( 
        AddGzParse, 
        {{"gz_tables", each {_}{0}, FirstTableType }} 
    ),
    SelectMetaAndTables = Table.SelectColumns(
        FixGzParseColType,{"Name", "Date created", "gz_tables"}
    ),
    ExpandGz_tables = Table.ExpandTableColumn(
        SelectMetaAndTables, "gz_tables", 
        Type.TableSchema(FirstTableType)[Name] 
    )
in
    ExpandGz_tables

Output

View solution in original post

v-pnaroju-msft

Thankyou, @lbendlin and @MarkLaf for your responses.

Hi manyad,

We are pleased to note that your issue has been resolved. Thank you for sharing your insights and approach in resolving the issue, which will be beneficial to other members of the community. Should you have any further queries, please feel free to contact the Microsoft Fabric community.

Thank you.

MarkLaf

Edit: actually, looking again at your issue + code (which is basically the same as mine), I think the main thing to change is to add a Columns field to your options record. When it's unspecified (as we are both doing in our M), I think it just looks at top row to determine column count (some number less than the ~20 you mentioned). So, I think the simple fix for you is:

Imported = Csv.Document(Decompressed, [Columns=20, Delimiter="|", Encoding=1252]), // <-- added Columns=#

Or, upstream, get the appropriate number of delimiters appended to first row such that it matches column count for primary data

Original post for references:

The following worked for me.

For reference, I made two simple csvs of format:

Zipped each with 7zip into gz. Note: original csvs in same folder but we are filtering them out per your procedure.

The following M successfully unpacks these.

let
    Source = Folder.Files( #"<FolderPath>" ),
    FilterGz = Table.SelectRows(Source, each [Extension] = ".gz"),
    AddGzParse = Table.AddColumn(
        FilterGz, "gz_tables", each 
        // steps for unpacking each gz - you can put these in a separate function if desired
        [ 
            gz = [Content],
            csv_binary = Binary.Decompress(gz, Compression.GZip), 
            csv_table = Csv.Document(csv_binary,null,"|"), 
            csv_table_trim = Table.RemoveLastN( Table.RemoveFirstN(csv_table,1), 1), 
            csv_table_promoted = Table.PromoteHeaders(csv_table_trim)
        ] [csv_table_promoted]
    ),
    // use our output from first row to automate column type and expansion
    // note, it's csv so all cols come in as text
    FirstTableType = Value.Type( List.First( AddGzParse[gz_tables] ) ),
    FixGzParseColType = Table.TransformColumns( 
        AddGzParse, 
        {{"gz_tables", each {_}{0}, FirstTableType }} 
    ),
    SelectMetaAndTables = Table.SelectColumns(
        FixGzParseColType,{"Name", "Date created", "gz_tables"}
    ),
    ExpandGz_tables = Table.ExpandTableColumn(
        SelectMetaAndTables, "gz_tables", 
        Type.TableSchema(FirstTableType)[Name] 
    )
in
    ExpandGz_tables

Output

manyadg

This is exactly what I needed. Thanks. I also added logic to read the number of fields from the second/good line and made the columns dynamic. Thanks for your help. Much appreciated.

lbendlin

Can you post one or two sample gz files to a publicly available file store?

manyadg

Thanks. Can't post the files. But based on the other response my problem was resolved by adding the columns attribute.

decompress, skip lines, delimit

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026

FabCon is coming to Atlanta

decompress, skip lines, delimit

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - February 2026

FabCon Atlanta 2026