Hi there. I am working with a .csv file produced by a proprietary application that includes 4 different tables all jammed together into a single file. The tables themselves are different to each other and are separated by a single blank row.
Header 1 | Header 2 | Header 3 | Header 4 |
Value 1 | Value 2 | Value 3 | Value 4 |
Header 1 | Header 2 | Header 3 | Header 4 |
Value 1 | Value 2 | Value 3 | Value 4 |
Header 1 | Header 2 | Header 3 | Header 4 |
Value 1 | Value 2 | Value 3 | Value 4 |
Header 1 | Header 2 | Header 3 | Header 4 |
Value 1 | Value 2 | Value 3 | Value 4 |
They contain different information which I would like to interrogate separately. Currently, I take the .csv and manually separate into four .csvs which I then pull into Power BI separately and interrogate as 4 separate queries.
Is there a way for me to import the .csv into Power BI and then split the four tables into four queries? The clear delimiter here is the blank row that separates each table from each other.
Solved! Go to Solution.
1. Import your csv and have following code for primary file (C:\test.csv should be replaced with your csv file path)
let
Source = Csv.Document(File.Contents("C:\test.csv"),[Delimiter=",", Columns=4, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Added Custom" = Table.AddColumn(Source, "IsBlank", each List.IsEmpty(List.RemoveItems(Record.ToList(_),{""}))),
#"List with Index" = List.Generate(()=>[x=0,i=0], each [i]<Table.RowCount(#"Added Custom"), each [i=[i]+1, x=(if #"Added Custom"[IsBlank]{i}=true then [x]+1 else [x])], each [x]),
#"Create Table with Index" = Table.FromColumns(Table.ToColumns(#"Added Custom") & {#"List with Index"},Table.ColumnNames(#"Added Custom")&{"Index"}),
#"Filtered Rows" = Table.SelectRows(#"Create Table with Index", each ([IsBlank] = false)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"IsBlank"})
in
#"Removed Columns"
2. Right click on this query in left pane and uncheck Enable load as this file need not be loaded into your report.
3. Right click on this query and Reference. Create overall 4 references. Rename them as csv1, csv2, csv3 and csv4 or whatever name you like.
4. In csv1, filter it on Index 0, in csv2, filter on 1, in csv3, filter on 2 and in csv4, filter on 3. Delete Index columns from there.
5. Promote headers in all these 4 queries and do a Detect data type for entire table.
The solution file as well as test.csv used in this can be downloaded from - https://1drv.ms/u/s!Akd5y6ruJhvhuWwDHySKxd-kYc0-?e=scwGxK
Hi there Vijay,
Thanks for the quick reply and the solution. I should have included two bits of information:
1) The data I am using is not static. It is used in a recurring piece of analysis and as such, the original .csv will be downloaded regularly (usually weekly) with the most up-to-date information. I want to get the dashboard to a situation where I just need to download the .csv, move it into the correct folder and overwrite the existing file, then click refresh on Power BI to get the most up-to-date stats. Will this solution work with that?
2) Each table has a different number of columns. Two have 91 columns, one has 21, and one has 126. This may change in the future.
Do either of these factors affect the code you have shared? Here is a dummy version of the .csv if that helps.
Hi @Iarampatta ,
@Iarampatta ‘s workaround is good.
Download his file and open it. Go to the Power Query Editor, and then click the gear next to Source in the main table and replace the source file with yours.
Then remove the 'Change Type' step, and you can see the data after removing it.
Best Regards,
Stephen Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
1. Import your csv and have following code for primary file (C:\test.csv should be replaced with your csv file path)
let
Source = Csv.Document(File.Contents("C:\test.csv"),[Delimiter=",", Columns=4, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Added Custom" = Table.AddColumn(Source, "IsBlank", each List.IsEmpty(List.RemoveItems(Record.ToList(_),{""}))),
#"List with Index" = List.Generate(()=>[x=0,i=0], each [i]<Table.RowCount(#"Added Custom"), each [i=[i]+1, x=(if #"Added Custom"[IsBlank]{i}=true then [x]+1 else [x])], each [x]),
#"Create Table with Index" = Table.FromColumns(Table.ToColumns(#"Added Custom") & {#"List with Index"},Table.ColumnNames(#"Added Custom")&{"Index"}),
#"Filtered Rows" = Table.SelectRows(#"Create Table with Index", each ([IsBlank] = false)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"IsBlank"})
in
#"Removed Columns"
2. Right click on this query in left pane and uncheck Enable load as this file need not be loaded into your report.
3. Right click on this query and Reference. Create overall 4 references. Rename them as csv1, csv2, csv3 and csv4 or whatever name you like.
4. In csv1, filter it on Index 0, in csv2, filter on 1, in csv3, filter on 2 and in csv4, filter on 3. Delete Index columns from there.
5. Promote headers in all these 4 queries and do a Detect data type for entire table.
The solution file as well as test.csv used in this can be downloaded from - https://1drv.ms/u/s!Akd5y6ruJhvhuWwDHySKxd-kYc0-?e=scwGxK