Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
JohnBI
Frequent Visitor

Combine multiple "FOLDERS" (not files)

Hi,

 

Here is a "simplified" version of my problem.

 

I know how to use "Combine Binaries" in Query Editor to combine multiple csv files.

 

My goal:

Combine two folders (.csv), with.....

 

Same columns (ie DATE, CATEGORY, VALUE)

 

Different date period

Folder A:     1 Jan 2000 - 31 Dec 2005

Folder B:     1 Jan 2006 - 31 Dec 2010

 

I found no solution by quick googling the websites. So any solution would be appreciated !! Thanks in advance !

1 ACCEPTED SOLUTION
v-shex-msft
Community Support
Community Support

Hi @JohnBI,

 

You can try to use below steps to achieve your requirement:

 

1. Get data from the parent folder.

2. Add filter to find out ther related folder.

 

For query:

let
    Source = Folder.Files("C:\Users\xxxxxx\Desktop"),
    FolderFilter= Table.SelectRows(Source, each [Folder Path] = "C:\Users\xxxxxx\Desktop\file\" or [Folder Path] = "C:\Users\xxxxxx\Desktop\Read\"),
    NameFilter = Table.SelectRows(FolderFilter, each [Name] ="abc.xlsx" or [Name] ="cde.xlsx" )
in
    NameFilter

 

After above steps, you can use combine binaries function to merge these records.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

5 REPLIES 5
Bwendan
New Member

The solution is relatively simple, just requires an extra step or two. 

Power query does most of the hard work for you by using the 'Get data from folders' option, but you need to add to some extra steps to extract from multiple folders. 

Here's the code. I've given a full explanation below though.

 

let
    Paths = {{"C:\Users\ME\Desktop\Folder"}, {"C:\Users\ME\Documents\Folder 2"}},
    #"Converted to Table" = Table.FromList(Paths, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Extracted Values" = Table.TransformColumns(#"Converted to Table", {"Column1", each Text.Combine(List.Transform(_, Text.From)), type text}),
    #"Added Custom" = Table.AddColumn(#"Extracted Values", "Custom", each Folder.Files([Column1])),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}, {"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Expanded Custom", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
    #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
    #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}})
in
    #"Changed Type"

 




First, just extract some data from one folder using Power Query's built in option. 
in this case i've got a folder on my desktop imaginatively called 'Folder' with 3 files in it, CSV1, CSV2, CSV3

Bwendan_0-1606213886024.png


Bwendan_1-1606214101518.png

at this point Power query has included in the code a line that specifies the folder path we'd given it 

and then applied a number of steps to the files to bring them into a single table. 

Power query generated code from Import FolderPower query generated code from Import Folder

 

We need to replace this folder path with multiple folders.

Personally, I've used a list. You can do this other ways. But you need to get a table with a single column of all your folder paths, this is before the code written by Power query.

Create table with single column from a list of folder pathsCreate table with single column from a list of folder paths

 

and then all we need to do is replace the 'Source' line with an expansion of the table with the method Folder.Files
First step, add a column with the function Folder.Files([Column1])

 

Add column Folder.FilesAdd column Folder.Files

 

 

Then expand all the columns from this.

 

Expand allExpand all

Remove the original 'Source' using the query settings & the whole thing should work. 

 

curtisp
Helper II
Helper II

This solution assumes that the multiple folders are all subfolders in the same parent folder.

 

But what about a different scenario where there are multiple folders that are not all subfolders in same parent folder?

 

Can the Source have OR statements? something like ..

 

Source = Folder.Files("C:\Users\xxxxxx\myDocuments\folder2" OR "C:\Users\xxxxxx\Desktop\folder2"),

 

I haven't been able to make anything work or see any other possible solutions by searching.

 

v-shex-msft
Community Support
Community Support

Hi @JohnBI,

 

You can try to use below steps to achieve your requirement:

 

1. Get data from the parent folder.

2. Add filter to find out ther related folder.

 

For query:

let
    Source = Folder.Files("C:\Users\xxxxxx\Desktop"),
    FolderFilter= Table.SelectRows(Source, each [Folder Path] = "C:\Users\xxxxxx\Desktop\file\" or [Folder Path] = "C:\Users\xxxxxx\Desktop\Read\"),
    NameFilter = Table.SelectRows(FolderFilter, each [Name] ="abc.xlsx" or [Name] ="cde.xlsx" )
in
    NameFilter

 

After above steps, you can use combine binaries function to merge these records.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.
Anonymous
Not applicable

Hello, 

 

Thank you for the code. I tried to run it in a folder that includes a large amounts of subfolders and got the following error:

 

DataFormat.Error: The file name 'RANDOM FILE PATH' is longer than the system-defined maximum length

 

The folders include a very large number of documents and many of them exceed the maximum length so it's not possible to shorten all the documents. Is there another way to fix this? 

 

Thank you

Phil_Seamark
Employee
Employee

Have you tried the get data from Folders option?  There is a good explanation here http://powerbi.tips/2016/06/loading-data-from-folder/

 

folder.png


To learn more about DAX visit : aka.ms/practicalDAX

Proud to be a Datanaut!

Helpful resources

Announcements
PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.