Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Import multiple files with various date ranges from one folder

Hi,

I have a folder with 50 Excel files. All Excel files consist of 1 sheet named ‘data’ that contains the sales numbers per product. Every sheet contains data over a different time range. The amount of columns can differ per file. Every month new files are being added to this folder, they’re always added in the same format.

 

3 example tables from these files:

 

 

Example file A

Product

Jan 1 2019

Jan 2 2019

Product A

100

100

Product B

100

100

Product C

100

100

 

 

Example file B

Product

Jan 3 2019

Jan 5 2019

Jan 6 2019

Product A

200

200

200

Product B

200

200

200

Product C

200

200

200

 

 

Example file C

Product

Jan 7 2019

Product A

300

Product B

300

Product C

300

 

 

I want to pull all the files in the folder at once so I can process the data. However, when I do that using the default PowerQuery option to import a folder, the data stacks up. This is the result I then get:

 

 

Product

Jan 1 2019

Jan 2 2019

 

Product A

100

100

 

Product B

100

100

 

Product C

100

100

 

Product

Jan 3 2019

Jan 5 2019

Jan 6 2019

Product A

200

200

200

Product B

200

200

200

Product C

200

200

200

Product

Jan 7 2019

 

 

Product A

300

 

 

Product B

300

 

 

Product C

300

 

 

 

Ofcourse what I want is that the dates are all in the same row, so that I can unpivot the dates and process the data.

How can I import this table in a way that I eventually can get a product column, a sales column and a date column?


Regards

Bas

 

1 ACCEPTED SOLUTION

Hi  @Anonymous ,

 

Looking at the examples in your data you can use the option from folder, but believe that you need to do a few things in the sample file, this always have 3 columns of dates so believe you can treat your file like below:

  • Remove Total Column
  • Add a custom column for each date with the following code:
if [Column1] = "" then [Column4] else null
if [Column1] = "" then [Column5] else null
if [Column1] = "" then [Column6] else null
  • Now do a fill down on all the previous tree columns to get the dates
  • Add a new column for each of the previou ones with the concatenation between the values and the date:
[Column4] & "|" & [Column4_Date]
[Column5] & "|" & [Column5_Date]
[Column6] & "|" & [Column6_Date]
  • Remove the column 4, 5, 6 and columnDate4, 5, 6 from your model
  • Filter the procuct column without blanks
  • Rename your columns
  • Select first 3 columns and unpivot others
  • Remove Column atttribute
  • Split the column values by the | separator
  • Rename the columns

 

Now this will merge the files as you need.

Check the code for the example file below:

let
    Source = Csv.Document(Parameter1,[Delimiter=";", Columns=7, QuoteStyle=QuoteStyle.None]),
    #"Filtered Rows1" = Table.SelectRows(Source, each ([Column2] <> "Menu items Menu Item Name")),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Column7"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns1", "Column4_Date", each if [Column1] = "" then [Column4] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Column4_Date"}),
    #"Added Custom1" = Table.AddColumn(#"Filled Down", "Column5_Date", each if [Column1] = "" then [Column5] else null),
    #"Filled Down1" = Table.FillDown(#"Added Custom1",{"Column5_Date"}),
    #"Added Custom2" = Table.AddColumn(#"Filled Down1", "Column6_Date", each if [Column1] = "" then [Column6] else null),
    #"Filled Down2" = Table.FillDown(#"Added Custom2",{"Column6_Date"}),
    #"Added Custom3" = Table.AddColumn(#"Filled Down2", "Week1-Values", each [Column4] & "|" & [Column4_Date]),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Week2-Values", each [Column5] & "|" & [Column5_Date]),
    #"Added Custom5" = Table.AddColumn(#"Added Custom4", "Week3-Valiues", each [Column6] & "|" & [Column6_Date]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom5",{"Column4", "Column5", "Column6", "Column4_Date", "Column5_Date", "Column6_Date"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Column2] <> "")),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Column1", "Item"}, {"Column2", "Item NAme"}, {"Column3", "Unit Price Paid for Item"}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Renamed Columns", {"Item", "Item NAme", "Unit Price Paid for Item"}, "Attribute", "Value"),
    #"Removed Columns2" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns2", "Value", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Value.1", "Value.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Item", Int64.Type}, {"Item NAme", type text}, {"Unit Price Paid for Item", Int64.Type}, {"Value.1", Int64.Type}, {"Value.2", type date}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type",{{"Value.2", "Date_DeliveryedWeek"}, {"Value.1", "Menus Item Count"}})
in
    #"Renamed Columns1"

 

Also check a PBIX file attach with the example: you need to change the Source from the query Sample file and FromFolderData to your folder.


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



View solution in original post

7 REPLIES 7
amitchandak
Super User
Super User

Anonymous
Not applicable

Hi @amitchandak @MFelix ,

Thanks for the suggestions. These options would work if I would have 1 query per file. That's not the case; all files are loaded up in the same query. The end result is showed up above.

As you can see, the dates can show up in any row (the amount of products can also differ per file so there's no use in picking up the nth row number and using that). If I start unpivoting the data, then the link between the date and the sales number is gone. 

 

Would you have any other suggestions?

Many thanks
Bas

 

Hi @Anonymous ,

 

If as you refer the files have all the same format, even though the number of columns differs because of the dates the use of from folder should work correclty.

 

Can you please answer me these questions:

  • Are your files all in the same folder?
  • Is the 1st row of your file always the headers? (product, date 1, date 2 ,date 3)
  • Does the headers repeat within your files?
  • Why do you refer that unpivot breaks the link between date and sales number?
  • Are the examples you gave in the first post correct according to your model?

If you can answer this I can then prepare a sample with the examples you provide and show you the needed steps.

 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Anonymous
Not applicable

Hi @MFelix ,

Thank you very much for your help on this.

All files in the folder are setup in the exact same way. They are all CSV files. These are three example CSV files that, apart from the product name, are identical to the ones I’m using, so never mind the examples I gave on the start of the thread:

 

 

 

 

Dates Delivered Week

Dates Delivered Week

13-11-17

06-11-17

30-10-17

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

8.75

 

125

143

268

2

Product B

54.99

 

21

22

43

3

Product C

54.99

 

3

 

3

4

Product D

54.99

 

3

1

4

5

Product E

23

 

3

1

4

 

 

 

  

Dates Delivered Week

25-06-18

18-06-18

11-06-18

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

8.75

691

937

655

2,283

2

Product B

54.99

92

19

1

112

3

Product C

54.99

59

10

 

69

4

Product D

54.99

50

3

 

53

5

Product E

48.99

45

27

 

72

6

Product F

54.99

40

8

 

48

 

 

 

  

Dates Delivered Week

23-09-19

16-09-19

09-09-19

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

19.72

126

40

 

166

2

Product B

19.72

54

16

 

70

3

Product C

26

40

46

42

128

4

Product D

12.95

34

33

42

109

5

Product E

3.5

32

15

31

78

 

 

This is the PowerQuery code I’ve used to pull this data:

 

 

let

    Source = Folder.Files("C:\Users\XXX\Desktop\Data"),

    #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),

    #"Added Custom" = Table.AddColumn(#"Filtered Hidden Files1", "FileLoad", each Csv.Document([Content])),

    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"FileLoad"}),

    #"Expanded FileLoad" = Table.ExpandTableColumn(#"Removed Other Columns", "FileLoad", List.Union(List.Transform(#"Removed Other Columns"[FileLoad], each Table.ColumnNames(_)))),

 

 

So I load the data from the folder, then add a custom column called FileLoad that shows the content of each CSV document. And then I expand that FileLoad column to show me all columns of these files, no matter how many columns there are.

 

As a result right now, I get the following: all data is stacked. What I want to achieve is that the dates are showing up as columns.

 

 

 

 

Dates Delivered Week

Dates Delivered Week

13-11-17

06-11-17

30-10-17

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

8.75

 

125

143

268

2

Product B

54.99

 

21

22

43

3

Product C

54.99

 

3

 

3

4

Product D

54.99

 

3

1

4

5

Product E

23

 

3

1

4

 

 

Dates Delivered Week

25-06-18

18-06-18

11-06-18

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

8.75

691

937

655

2,283

2

Product B

54.99

92

19

1

112

3

Product C

54.99

59

10

 

69

4

Product D

54.99

50

3

 

53

5

Product E

48.99

45

27

 

72

6

Product F

54.99

40

8

 

48

 

 

Dates Delivered Week

23-09-19

16-09-19

09-09-19

 

 

Menu items Menu Item Name

Menu items Unit Price Paid for Item

Menu items Count

Menu items Count

Menu items Count

 

1

Product A

19.72

126

40

 

166

2

Product B

19.72

54

16

 

70

3

Product C

26

40

46

42

128

4

Product D

12.95

34

33

42

109

5

Product E

3.5

32

15

31

78

 

 

To answer your questions specifically:

  • Are your files all in the same folder?
    • Yes
  • Is the 1st row of your file always the headers? (product, date 1, date 2 ,date 3)
    • The headers are always the same
      I tried to simply remove the first column (row number) and the last column (totals), as these headers are always blank. The product header is also blank, so I tried naming that ‘product’ for all files. But it had no effect on the result.
  • Does the headers repeat within your files?
    • The product headers repeat. The date headers always differ.
  • Why do you refer that unpivot breaks the link between date and sales number?
    • Because if I use unpivot in my end result, it’s not making it possible for me to see my data in the way I want to.. Unless I’m looking at it wrong and I can just unpivot this end result?
  • Are the examples you gave in the first post correct according to your model?
    • Updated the tables in this post

 

I'm sure there's some kind of small thing that I'm not thinking of that can fix this!


Best Regards
Bas

 

Hi  @Anonymous ,

 

Looking at the examples in your data you can use the option from folder, but believe that you need to do a few things in the sample file, this always have 3 columns of dates so believe you can treat your file like below:

  • Remove Total Column
  • Add a custom column for each date with the following code:
if [Column1] = "" then [Column4] else null
if [Column1] = "" then [Column5] else null
if [Column1] = "" then [Column6] else null
  • Now do a fill down on all the previous tree columns to get the dates
  • Add a new column for each of the previou ones with the concatenation between the values and the date:
[Column4] & "|" & [Column4_Date]
[Column5] & "|" & [Column5_Date]
[Column6] & "|" & [Column6_Date]
  • Remove the column 4, 5, 6 and columnDate4, 5, 6 from your model
  • Filter the procuct column without blanks
  • Rename your columns
  • Select first 3 columns and unpivot others
  • Remove Column atttribute
  • Split the column values by the | separator
  • Rename the columns

 

Now this will merge the files as you need.

Check the code for the example file below:

let
    Source = Csv.Document(Parameter1,[Delimiter=";", Columns=7, QuoteStyle=QuoteStyle.None]),
    #"Filtered Rows1" = Table.SelectRows(Source, each ([Column2] <> "Menu items Menu Item Name")),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Column7"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns1", "Column4_Date", each if [Column1] = "" then [Column4] else null),
    #"Filled Down" = Table.FillDown(#"Added Custom",{"Column4_Date"}),
    #"Added Custom1" = Table.AddColumn(#"Filled Down", "Column5_Date", each if [Column1] = "" then [Column5] else null),
    #"Filled Down1" = Table.FillDown(#"Added Custom1",{"Column5_Date"}),
    #"Added Custom2" = Table.AddColumn(#"Filled Down1", "Column6_Date", each if [Column1] = "" then [Column6] else null),
    #"Filled Down2" = Table.FillDown(#"Added Custom2",{"Column6_Date"}),
    #"Added Custom3" = Table.AddColumn(#"Filled Down2", "Week1-Values", each [Column4] & "|" & [Column4_Date]),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Week2-Values", each [Column5] & "|" & [Column5_Date]),
    #"Added Custom5" = Table.AddColumn(#"Added Custom4", "Week3-Valiues", each [Column6] & "|" & [Column6_Date]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom5",{"Column4", "Column5", "Column6", "Column4_Date", "Column5_Date", "Column6_Date"}),
    #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Column2] <> "")),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows",{{"Column1", "Item"}, {"Column2", "Item NAme"}, {"Column3", "Unit Price Paid for Item"}}),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Renamed Columns", {"Item", "Item NAme", "Unit Price Paid for Item"}, "Attribute", "Value"),
    #"Removed Columns2" = Table.RemoveColumns(#"Unpivoted Other Columns",{"Attribute"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Removed Columns2", "Value", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), {"Value.1", "Value.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Item", Int64.Type}, {"Item NAme", type text}, {"Unit Price Paid for Item", Int64.Type}, {"Value.1", Int64.Type}, {"Value.2", type date}}),
    #"Renamed Columns1" = Table.RenameColumns(#"Changed Type",{{"Value.2", "Date_DeliveryedWeek"}, {"Value.1", "Menus Item Count"}})
in
    #"Renamed Columns1"

 

Also check a PBIX file attach with the example: you need to change the Source from the query Sample file and FromFolderData to your folder.


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Anonymous
Not applicable

Thank you for the detailed explanation @MFelix . I took me a bit of time to implement and adjust it to my situation but I can very happily say that works! Thanks so much!

MFelix
Super User
Super User

HI @Anonymous ,

 

I'm assuming you are using the import file from folder that creates a function when you select a file.

 

On the example file you need to select the product column and the go to unpivot and select the Unpivot other columns. If you make this option this will make on all files that the date columns are all unpivot no matter how many columns there are.

 

Be aware that if have other columns than the ones presenting they cal also be unpivot if you don't want that select also those columns.

 

This also prevents errors when uploading new data.

 


Regards

Miguel Félix


Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Check out my blog: Power BI em Português



Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.