Atiroocky
Frequent Visitor

Help to import huge csv data

Hello,

 

I have to deal with .csv files containing millions of rows.

The data is extracted from building management software.

The data is already unpivoted, which is why there are so many rows.

In each row, the first field is the timestamp and the second is the value (comma-separated).

The issue concerns the "attribute": it is not included on each row.

Instead, it appears on its own line at the beginning of each set, as in the minimal example below:

 

Attribute_1 
Date_1, value_1
Date_2, value_2
Date_3, value_3
#linebreak
Attribute_2
Date_1, value_1
Date_2, value_2
Date_3, value_3

 

My goal is to transform the previous format into this:

 

Date_1, value_1, Attribute_1
Date_2, value_2, Attribute_1
Date_3, value_3, Attribute_1
Date_1, value_1, Attribute_2
Date_2, value_2, Attribute_2
Date_3, value_3, Attribute_2

 

Thanks for your help.

2 ACCEPTED SOLUTIONS
Cookistador
Super User

Hi @Atiroocky 

 

Here is how you can achieve that. I used the sample you shared with us:

[screenshot: the sample data loaded into Power Query]

 

And this is what I got after a few transformations:

[screenshot: the transformed result table]

 

How did I achieve that?

1)  Create a conditional column and use the following logic:

[screenshot: the conditional column dialog]

 

If Column2 is blank, the row holds an attribute name, so take it from Column1; otherwise return null. If you prefer a custom column, you can use the following code:

if [Column2] = "" then [Column1] else null
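
As a complete step, this is how the formula appears in the generated script at the end of this post (there the new column keeps the default name Custom.1):

#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom.1", each if [Column2] = "" then [Column1] else null)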

2) Fill Down the Attribute:

 

  • Right-click the header of your new "Attribute" column.

  • Select Fill > Down.

  • All the null values will be replaced by the attribute name from the row above, propagating down until the next attribute is found.
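
In M, this entire step is the single Table.FillDown call from the generated script below:

#"Filled Down" = Table.FillDown(#"Added Custom",{"Custom.1"})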

3) Exclude the line-break rows: select a column that should always be populated and filter out its null or empty values.
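
As a formula, this filter is the Table.SelectRows step from the generated script, with Column2 as the always-populated column:

#"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Column2] <> ""))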

 

And ta-da, you have the right table.

If the table is too heavy for this transformation, you can use a dataflow instead of Power BI Desktop.

 

If you need more help, do not hesitate to ask.

 

PS: this is the generated code:

 

let
    Source = Csv.Document(File.Contents("C:\Users\dgodfroid\Desktop\test.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom.1", each if [Column2] = "" then [Column1] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Custom.1"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([Column2] <> ""))
in
    #"Filtered Rows"

 

 


ronrsnfld
Super User

Here's another method, which relies on grouping. It seems to operate more rapidly on your sample data, but you should test both methods against your large data set.

 

I started by pasting your sample data into a *.csv file.

Please read the code comments to better understand the algorithm.

let
    Source = Csv.Document(File.Contents(FullPathToCSV),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    
//Remove rows with #linebreak
    #"Filtered Rows" = Table.SelectRows(Source, each ([Column1] <> "#linebreak")),

//Group by Attribute, then add the Attribute Column and remove the row
//  Note: Attribute is defined by the blank in column2 as it comes from a CSV file
    #"Group by Attribute" = Table.Group(#"Filtered Rows","Column2",{
        {"Rows",(t)=>[a=t{0}[Column1],
                      b=Table.AddColumn(Table.Skip(t),"Attribute", each a)][b],

                //change the types according to the actual types.
                //for example, Column1 might be date and Column2 might be number or similar
                      type table[Column1=any, Column2=any, Attribute=text]}
    }, GroupKind.Local,(x,y)=>Number.From(x="" and y="")),
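//  Note: with GroupKind.Local, the comparer starts a new group whenever it
//  returns a nonzero value. The group key is Column2, which is blank ("")
//  only on attribute rows, so each attribute row opens a new local group
//  (this assumes actual data values in Column2 are never blank).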

//Remove Grouping Column
    #"Removed Columns" = Table.RemoveColumns(#"Group by Attribute",{"Column2"}),

//Expand Grouped Rows and rename
    #"Expanded Rows" = Table.ExpandTableColumn(#"Removed Columns", "Rows",
        {"Column1", "Column2", "Attribute"}, 
        {"Date","Value","Attribute"})
in
    #"Expanded Rows"

Source:

[screenshot: source data in Power Query]

 

Results:

[screenshot: resulting table with Date, Value, and Attribute columns]

 

Please note that if you really want a comma-separated result, as shown in your question, you will need to add a Merge Columns step to the above, which you can easily do from the UI:

 

 Table.CombineColumns(#"Expanded Rows",{"Date", "Value", "Attribute"},Combiner.CombineTextByDelimiter(", ", QuoteStyle.None),"Merged")

[screenshot: merged single-column result]


3 REPLIES
Atiroocky
Frequent Visitor

Thank you @Cookistador, you just nailed it!

Processing the large amount of data was very quick.

Just note that in the first step, with the conditional column, if you want "equals blank" the Value field of the condition MUST be left empty. If you type something into it and then erase it, you cannot validate the condition; in that case you have to delete the condition and create a new one without typing anything in the Value field.
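
A workaround is to skip the dialog entirely and add the column as a custom column instead, using the formula from the accepted answer, which avoids the Value-field issue altogether:

if [Column2] = "" then [Column1] else null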
