The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi,
Am hoping for some help. I will outline three steps below where I take a folder as a source, combine the binaries and then import. It all works fine, however, at the final step after I have combined and imported each binary I would like there to be a column which displays which file (e.g. Filename1.csv) that each row originated from. When you do the combine and import in Steps 2 & Step 3, the file metadata from Step1 (e.g. Filename, DateCreated) is no longer visible.
Has anyone found a strategy to add column(s) for file metadata, particularly the originating filename as a new column in the imported dataset?
The steps are outlined below.
Kind regards
Brian
Step 1:
Source = Folder.Files("C:\Users\XXX\YYYY"),
Content Name
Binary Filename1.csv
Binary Filename2.csv
Binary Filename3.csv
Binary Filename4.csv
Binary Filename5.csv
Binary Filename6.csv
Step 2:
CombinedBinaries = Binary.Combine(#"Removed Other Columns"[Content])
-- Displays an CSV file icon
-- 180159607 bytes
Step 3:
Imported = Csv.Document(#"Combined Binaries",[Delimiter=";", Columns=16, Encoding=1252, QuoteStyle=QuoteStyle.None])
--All the rows for the appropriate columns from each file are imported and appended, but I cannot
--see how to append a column for the originating filename.
Solved! Go to Solution.
Actually, it's pretty easy with this line of code:
= Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=";", Encoding=1252])))
Replace your Step 2&3 with it.
It will retain all the metadata from the 1st step and add the content in an additional custom column.
If your csv's have all the same headers, you simply expand this column and the headers of the 1st file will be shown - and expand all other files on these cols as well.
If there are differnt cols - just come back & I'll post the code for the auto-expand of different headers.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Your help is greatly appreciated. I am trying to retain the file name with the contents of each csv file. My data is below. Help?
Source = Folder.Files("C:\Users\sjgue\OneDrive\Desktop\accounting\GilfordJ\Reports\Trial Bal\csv tb"),
#"Removed Other Columns" = Table.SelectColumns(Source,{"Content"}),
#"Filtered Hidden Files1" = Table.SelectRows(#"Removed Other Columns", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File", each #"Transform File"([Content])),
#"Removed Other Columns1" = Table.SelectColumns(#"Invoke Custom Function1", {"Transform File"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))),
#"Added Custom" = Table.AddColumn(#"Expanded Table Column1", "Custom", each Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=";", Encoding=1252])))),
Custom = #"Added Custom"{0}[Custom]
in
Custom
Hi,
Can you advise me on using the Table name in the new column of the table, when we use the data from websource with multiple tables
What would be the code if you were connecting to a folder with fixed width text files rather than CSV files?
Sorry, no idea as I haven't done this already. But you can easily find out by using the UI: Start the import from folder process and you will see a table popping up with all files listed with different metadata-field. Choose "Content" and click into one of its fields. A further dialogue might start but at the end you will be able to see the full code in the advanced editor.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Actually, it's pretty easy with this line of code:
= Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=";", Encoding=1252])))
Replace your Step 2&3 with it.
It will retain all the metadata from the 1st step and add the content in an additional custom column.
If your csv's have all the same headers, you simply expand this column and the headers of the 1st file will be shown - and expand all other files on these cols as well.
If there are differnt cols - just come back & I'll post the code for the auto-expand of different headers.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hey Imke!
Thanks for all the support.
I have a slighly different problem and since I'm new to M I don't know how to proceed.
In my case, I need to keep the creation date of the document, as I may benefit from it for some calculations.
How can I transform the creation date into a new column in Power Querry?
Thanks
The column should be/stay there, if you don't use Combine Binaries but use the method I've described and add a column to retrieve the data from the "Content"-column.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF,
I am doing something a little bit different and was wondering if you could help guide me in my use case. I have a SharePoint folder where different CSVs will be uploaded with different headers. Each csv will have at least one of the common merge headers, so there will always be "Key1", "Key2", or "Key3" as one of the headers - the rest might have extra data. I am trying to modify my power query to handle these cases, so that if the file only has "Key2" - just that column will be read in and the rest ignored. Or if it just has "Key1" and "Key2" - those will be read and the rest ignored.. etc
I created a sample file with 3 columns for the 3 keys for my default file, but I need help modifying my query because right now the new files just read the first 3 columns of data and doesn't search for "Key2" in the data. Any help on this use case?
Thanks in advance for the help!
You have to select the 3 columns in your function before you expand. Therefore you have to modify the Table.SelectColumns-command to cater for missing columns:
Table.SelectColumns(AppendToSource,{"Key1", "Key2", "Key3"}, MissingField.Ignore)
That way, any of the 3 key-columns will be selected and no error will be thrown if one or more of them are missing.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hello, thanks for your help so far. I'm curious if this solution works in PowerPivot?
I've tried several variations of your solution and I can't seem to get it to work.
I want to retain on each row of the Binary Data CLE/CVO and Actual/Budget.
My folder paths:
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CLE\Actual\ |
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Actual\ |
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Budget\ |
C:\Users\User\OneDrive - Corporate\PowerPivot\TB\CVO\Budget\ |
My Binary Data:
This is the code I tried:
let
Source = Folder.Files("C:\Users\User\OneDrive - Corporate\PowerPivot\TB"),
#"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Folder Path],[Delimiter="\", Encoding=1252])))
in
#"Added Custom"
I ended up solving this myself. I finally decided to use my head and look up table.expandtablecolumn which made it immediately obvious what I was doing wrong. I'm new to this so I had no reference point for what was actually happening when I used table.expandtablecolumn.
let Source = Folder.Files("C:\Users\User\OneDrive - Corporate\PowerPivot\TB"), #"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content], [Delimiter=",", Encoding=1252]))), #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom", "Folder Path"}), #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Period", "No", "Current Debit/(Credit)", "YTD Debit/(Credit)"}, {"Custom.Period", "Custom.No", "Custom.Current Debit/(Credit)", "Custom.YTD Debit/(Credit)"}) in #"Expanded Custom"
From here I'll split the Folder Path on "\" and then remove all the Folder Path columns besides the two containing CLE/CVO and Actual/Budget.
When I try this solution for every entry in the csvs I get a different row in the data for every filepath in the folder regardless of whether the entry appeared in it or not. This makes it impossible to find entries by specifying the csv which is my ultimate goal.
Sorry, but I have no clue what you want and what the current problem is you're facing.
Any chance to refer more specificly to the examples given and:
1) Point out where your requirement differs from the examples given and
2) Provide some examples of your case?
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi the code for the auto expand for different headers would be awesome!
Perfect thanks Imke. At all my data came through in one column but after noticing the [Delimiter=";" and changing it to [Delimiter="," then I was sorted.
For anyone else reading, I'll try to describe the difference between my first attempt and Imke's solution. Imke avoids steo where I combined the binaries (and lost the filename), instead she adds a new column that contains the data from each file.
My Attempt - loose filename...
let
//Lists all the files in the folder
Source = Folder.Files("C:\Users\Brian\Desktop\FolderName")
/*Code generated for me by clicking "Combine binaries" double down arrow on Content column
Combines all the files into one binary, (NB: here I lose the filename which I wanted to keep 😞 ...) */
MyStep1 = Binary.Combine(Source[Content]),
/*Code generate for me by right clicking binary object and choosing Csv.
Converts single binary object from Step1 above, expands all the data - but I am missing the filename that I wanted to keep.
*/
MyStep2 = Csv.Document(MyStep1,[Delimiter=",", Columns=12, Encoding=1252, QuoteStyle=QuoteStyle.None])
in
MyStep2
Imke's Solution - keep filename...
let
//Lists all the files in the folder
Source = Folder.Files("C:\Users\Brian\Desktop\FolderName")
/*Instead of combining binaries into one single binary and losing all the file metadata (like filename), use the
Table.AddColumn - best of both worlds, keep the file metadata plus add a new column containing the data from
each file.*/
ImkeStep1 = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Csv.Document([Content],[Delimiter=",", Encoding=1252])))
in
ImkeStep1
I am using the folder type for get data, how can I show the file name or file create date for each row of the merged data?
It is not clear to me what you are actually doing. Could you please share the M-code that has been created so far (from the advanced editor).
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Here is where I am at:
let
Source = Folder.Files("C:\Users\pbakaric\Google Drive\DFM Dashboard\Reports"),
#"Combined Binaries" = Binary.Combine(Source[Content]),
#"Imported CSV" = Csv.Document(#"Combined Binaries",[Delimiter=";", Columns=21, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(#"Imported CSV",{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type text}, {"Column9", type text}, {"Column10", type text}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}, {"Column14", type text}, {"Column15", type text}, {"Column16", type text}, {"Column17", type text}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Column21", type text}}),
#"Removed Top Rows" = Table.Skip(#"Changed Type",1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows"),
#"Removed Blank Rows" = Table.SelectRows(#"Promoted Headers", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null})))
in
#"Removed Blank Rows"
I tried incorporating your script but I couldn't seem to get it to work. I appreciate the help!!!
Your code should look like this:
let Source = Folder.Files("C:\Users\pbakaric\Google Drive\DFM Dashboard\Reports"), #"Added Custom" = Table.AddColumn(Source, "Custom", each Table.PromoteHeaders(Table.Skip(Csv.Document([Content], [Delimiter=";", Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]),1))), #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Custom"}), #"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Datum", "Menge", "Wert", "Kommentar"}, {"Datum", "Menge", "Wert", "Kommentar"}) in #"Expanded Custom"
Check out the video as well, as there you'll find another method, showing how to use an explicit function to do that. This might be more convenient if there are more steps to perform an the nesting into the #"Added Custom"-steps becomes too confusing.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi ImkeF,
I'm following the steps outlined in the video to retain the file names and I'm receiving an error at the very last step, when I add a custom column:
= Table.AddColumn(Source, "Custom", each Folder([Content]))
The error I am receiving is as follows: "Expression error: The name 'Folder" wasn't recognized. Make sure it's spelled correctly."
Here's the function I created:
= (Content) =>
let
// Source = Folder.Files("C:XXXContentDrilldown"),
// #"C:XXXContentDrilldown csv" = Source{[#"Folder Path"="C:\XXXContentDrilldown\",Name="XXXContent Drilldown 20160501-20160531.csv"]}[Content],
#"Imported CSV" = Csv.Document(Content,[Delimiter=",", Columns=7, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Changed Type" = Table.TransformColumnTypes(#"Imported CSV",{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}}),
#"Removed Top Rows" = Table.Skip(#"Changed Type",6),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows")
in #"Promoted Headers"
I would appreciate any insight.
Thanks!
Margaryta
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
User | Count |
---|---|
112 | |
80 | |
74 | |
52 | |
50 |
User | Count |
---|---|
132 | |
124 | |
78 | |
64 | |
61 |