Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.
Check it out now!Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more
Hi,
Here is a "simplified" version of my problem.
I know how to use "Combine Binaries" in Query Editor to combine multiple csv files.
My goal:
Combine two folders (.csv), with.....
Same columns (ie DATE, CATEGORY, VALUE)
Different date period
Folder A: 1 Jan 2000 - 31 Dec 2005
Folder B: 1 Jan 2006 - 31 Dec 2010
I found no solution by quick googling the websites. So any solution would be appreciated !! Thanks in advance !
Solved! Go to Solution.
Hi @JohnBI,
You can try to use below steps to achieve your requirement:
1. Get data from the parent folder.
2. Add filter to find out ther related folder.
For query:
let Source = Folder.Files("C:\Users\xxxxxx\Desktop"), FolderFilter= Table.SelectRows(Source, each [Folder Path] = "C:\Users\xxxxxx\Desktop\file\" or [Folder Path] = "C:\Users\xxxxxx\Desktop\Read\"), NameFilter = Table.SelectRows(FolderFilter, each [Name] ="abc.xlsx" or [Name] ="cde.xlsx" ) in NameFilter
After above steps, you can use combine binaries function to merge these records.
Regards,
Xiaoxin Sheng
The solution is relatively simple, just requires an extra step or two.
Power query does most of the hard work for you by using the 'Get data from folders' option, but you need to add to some extra steps to extract from multiple folders.
Here's the code. I've given a full explanation below though.
let
Paths = {{"C:\Users\ME\Desktop\Folder"}, {"C:\Users\ME\Documents\Folder 2"}},
#"Converted to Table" = Table.FromList(Paths, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Extracted Values" = Table.TransformColumns(#"Converted to Table", {"Column1", each Text.Combine(List.Transform(_, Text.From)), type text}),
#"Added Custom" = Table.AddColumn(#"Extracted Values", "Custom", each Folder.Files([Column1])),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}, {"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
#"Filtered Hidden Files1" = Table.SelectRows(#"Expanded Custom", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}})
in
#"Changed Type"
First, just extract some data from one folder using Power Query's built in option.
in this case i've got a folder on my desktop imaginatively called 'Folder' with 3 files in it, CSV1, CSV2, CSV3
at this point Power query has included in the code a line that specifies the folder path we'd given it
and then applied a number of steps to the files to bring them into a single table.
Power query generated code from Import Folder
We need to replace this folder path with multiple folders.
Personally, I've used a list. You can do this other ways. But you need to get a table with a single column of all your folder paths, this is before the code written by Power query.
Create table with single column from a list of folder paths
and then all we need to do is replace the 'Source' line with an expansion of the table with the method Folder.Files
First step, add a column with the function Folder.Files([Column1])
Add column Folder.Files
Then expand all the columns from this.
Expand all
Remove the original 'Source' using the query settings & the whole thing should work.
This solution assumes that the multiple folders are all subfolders in the same parent folder.
But what about a different scenario where there are multiple folders that are not all subfolders in same parent folder?
Can the Source have OR statements? something like ..
Source = Folder.Files("C:\Users\xxxxxx\myDocuments\folder2" OR "C:\Users\xxxxxx\Desktop\folder2"),
I haven't been able to make anything work or see any other possible solutions by searching.
Hi @JohnBI,
You can try to use below steps to achieve your requirement:
1. Get data from the parent folder.
2. Add filter to find out ther related folder.
For query:
let Source = Folder.Files("C:\Users\xxxxxx\Desktop"), FolderFilter= Table.SelectRows(Source, each [Folder Path] = "C:\Users\xxxxxx\Desktop\file\" or [Folder Path] = "C:\Users\xxxxxx\Desktop\Read\"), NameFilter = Table.SelectRows(FolderFilter, each [Name] ="abc.xlsx" or [Name] ="cde.xlsx" ) in NameFilter
After above steps, you can use combine binaries function to merge these records.
Regards,
Xiaoxin Sheng
Hello,
Thank you for the code. I tried to run it in a folder that includes a large amounts of subfolders and got the following error:
DataFormat.Error: The file name 'RANDOM FILE PATH' is longer than the system-defined maximum length
The folders include a very large number of documents and many of them exceed the maximum length so it's not possible to shorten all the documents. Is there another way to fix this?
Thank you
Have you tried the get data from Folders option? There is a good explanation here http://powerbi.tips/2016/06/loading-data-from-folder/