Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Score big with last-minute savings on the final tickets to FabCon Vienna. Secure your discount

Reply
Katie484
New Member

Combined CSV files top remove top 3 rows of each with junk data & keep 1 header row

I have several csv files combining from a folder. Each csv file has 3 rows of junk data / heading info which needs to be removed. The 4th row of each file is the actual header row but I need to remove duplicates. Please can you assist in the best way of achieving this?

1 ACCEPTED SOLUTION
KNP
Super User
Super User

Hi @Katie484,

 

I use the below custom function for processing CSV files.

I've created it so the column number that you specify that has no 'junk' data is used to dynamically find the headers. e.g. it won't matter if one file has 3 rows and another has 5.

 

Create a blank query and paste this code into the advanced editor...

 

// fProcessFiles
// Use with Add Column - Invoke Custom Function
// Specify the column number that has only blank rows before the headers/data to deal with inconsistent files.
let
    fProcessFiles =
        (myFile as binary, myColumn as text) =>
            let
                CSV = Csv.Document(myFile, [Encoding = 1252]),
                ConditionalBlankIndex =
                    Table.AddColumn(
                        CSV,
                        "Custom",
                        each
                            if Record.Field(_, "Column" & myColumn) = "" then
                                0
                            else
                                1
                    ),
                HeaderPosition =
                    List.PositionOf(
                        ConditionalBlankIndex[Custom],
                        1
                    ),
                RemoveRows = Table.Skip(CSV, HeaderPosition),
                PromotedHeaders =
                    Table.PromoteHeaders(
                        RemoveRows,
                        [PromoteAllScalars = true]
                    )
            in
                PromotedHeaders
in
    fProcessFiles

 

You can then use this function to apply to a list of files.

https://youtu.be/rXFzuvqIbsg

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
xOIEmaj

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
image
fabric-SUbadge
Proud to be a Super User!

View solution in original post

2 REPLIES 2
ahadkarimi
Solution Specialist
Solution Specialist

Hey @Katie484, just change your folder path instead of "C:\Users\Ahadk\OneDrive\Desktop\New folder (7)", let me know if there is any problem.

let
    Source = Folder.Files("C:\Users\Ahadk\OneDrive\Desktop\New folder (7)"),
    #"Imported CSVs" = Table.AddColumn(Source, "Custom", each Csv.Document([Content], [Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None])),
    #"Removed Other Columns" = Table.SelectColumns(#"Imported CSVs", {"Custom"}),
    #"Expanded Tables" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", Table.ColumnNames(#"Removed Other Columns"[Custom]{0})),
    #"Removed Top Rows" = Table.Skip(#"Expanded Tables", 3),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
    #"Removed Duplicates" = Table.Distinct(#"Promoted Headers")
in
    #"Removed Duplicates"

 

KNP
Super User
Super User

Hi @Katie484,

 

I use the below custom function for processing CSV files.

I've created it so the column number that you specify that has no 'junk' data is used to dynamically find the headers. e.g. it won't matter if one file has 3 rows and another has 5.

 

Create a blank query and paste this code into the advanced editor...

 

// fProcessFiles
// Use with Add Column - Invoke Custom Function
// Specify the column number that has only blank rows before the headers/data to deal with inconsistent files.
let
    fProcessFiles =
        (myFile as binary, myColumn as text) =>
            let
                CSV = Csv.Document(myFile, [Encoding = 1252]),
                ConditionalBlankIndex =
                    Table.AddColumn(
                        CSV,
                        "Custom",
                        each
                            if Record.Field(_, "Column" & myColumn) = "" then
                                0
                            else
                                1
                    ),
                HeaderPosition =
                    List.PositionOf(
                        ConditionalBlankIndex[Custom],
                        1
                    ),
                RemoveRows = Table.Skip(CSV, HeaderPosition),
                PromotedHeaders =
                    Table.PromoteHeaders(
                        RemoveRows,
                        [PromoteAllScalars = true]
                    )
            in
                PromotedHeaders
in
    fProcessFiles

 

You can then use this function to apply to a list of files.

https://youtu.be/rXFzuvqIbsg

 

Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
xOIEmaj

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
image
fabric-SUbadge
Proud to be a Super User!

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors