Hello,
I have to deal with .csv files containing millions of rows, extracted from a building management software.
The data are already unpivoted, which is why there are so many rows.
In each row, the first field is the timestamp and the second is the value (comma separated).
The issue concerns the "attribute": it is not repeated on each row. Instead, it appears on its own line at the beginning of each set, as in the minimal example below:
Attribute_1
Date_1, value_1
Date_2, value_2
Date_3, value_3
#linebreak
Attribute_2
Date_1, value_1
Date_2, value_2
Date_3, value_3
My goal is to transform the previous format into this:
Date_1, value_1, Attribute_1
Date_2, value_2, Attribute_1
Date_3, value_3, Attribute_1
Date_1, value_1, Attribute_2
Date_2, value_2, Attribute_2
Date_3, value_3, Attribute_2
Thanks for your help.
Hi @Atiroocky
This is how you can achieve that. I used the sample you shared with us, and after a few transformations I got a table in your target format.
How did I achieve that?
1) Create a conditional column with the following logic: if Column2 is blank, output the value of Column1 (the attribute name); otherwise output null. If you prefer a custom column, you can use the following code:
if [Column2] = "" then [Column1] else null
2) Fill Down the Attribute
Right-click the header of your new "Attribute" column.
Select Fill > Down.
All the null values will be replaced by the attribute name from the row above, propagating down until the next attribute is found.
3) Exclude the attribute and linebreak lines: pick a column that should always be filled on data rows (here Column2) and filter out the null or empty values.
And ta-da, you have the right table.
If the table is too heavy for this transformation, you can use a dataflow instead of Power BI Desktop.
If you need more help, do not hesitate to ask.
PS: this is the generated code:
let
    Source = Csv.Document(File.Contents("C:\Users\dgodfroid\Desktop\test.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
    //Attribute rows are the ones with a blank Column2
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom.1", each if [Column2] = "" then [Column1] else null),
    //Propagate each attribute down to its data rows
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom.1"}),
    //Drop the attribute and #linebreak rows, which have a blank Column2
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Column2] <> ""))
in
    #"Filtered Rows"
Here's another method which relies on Grouping. It seems to run faster on your sample data, but you should test both methods against your large data set.
I started by pasting your sample data into a *.csv file.
Please read the code comments to better understand the algorithm.
let
    Source = Csv.Document(File.Contents(FullPathToCSV),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    //Remove rows with #linebreak
    #"Filtered Rows" = Table.SelectRows(Source, each ([Column1] <> "#linebreak")),
    //Group by Attribute, then add the Attribute column and remove the attribute row
    //Note: an attribute row is identified by the blank in Column2, as it comes from a CSV file
    #"Group by Attribute" = Table.Group(#"Filtered Rows", "Column2", {
        {"Rows", (t)=> [a = t{0}[Column1],
                        b = Table.AddColumn(Table.Skip(t), "Attribute", each a)][b],
            //change the types according to the actual types;
            //for example, Column1 might be date and Column2 might be number or similar
            type table[Column1=any, Column2=any, Attribute=text]}
        }, GroupKind.Local, (x,y)=> Number.From(x = "" and y = "")),
    //Remove grouping column
    #"Removed Columns" = Table.RemoveColumns(#"Group by Attribute", {"Column2"}),
    //Expand grouped rows and rename
    #"Expanded Rows" = Table.ExpandTableColumn(#"Removed Columns", "Rows",
        {"Column1", "Column2", "Attribute"},
        {"Date", "Value", "Attribute"})
in
    #"Expanded Rows"
Please note that if you really want a comma-separated result, as you show in your question, you will need to add a Merge Columns step to the above, which you can easily do from the UI:
Table.CombineColumns(#"Expanded Rows",{"Date", "Value", "Attribute"},Combiner.CombineTextByDelimiter(", ", QuoteStyle.None),"Merged")
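If you go that route, the merge step would become the final expression of the query above, along these lines ("Merged" is just the default column name the UI assigns, and this assumes Date and Value are still text, as they come straight from the CSV):

    //Combine the three columns into a single comma-separated text column
    #"Merged Columns" = Table.CombineColumns(#"Expanded Rows",
        {"Date", "Value", "Attribute"},
        Combiner.CombineTextByDelimiter(", ", QuoteStyle.None), "Merged")
in
    #"Merged Columns"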
Thank you @Cookistador, you nailed it!
Processing the large amount of data was very quick.
Just note that in the first step, with the conditional column, if you want the "equals blank" condition, the value field of the condition MUST be left empty. If you type something in it and then erase it, you can no longer validate the condition; in that case you have to delete the condition and create a new one without typing anything in the value field.