Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
How to import a text file content as one single column of text?
My use case is getting folder of csv files, but they all have a
1 top row = header row of text that is a title
2nd row = column header titles
rest of rows = comma separated values
I want to do following:
1. Import file data as a single column of text
2. filter out the title rows
3. promote column headers (now first row) as column names
4. Split by delimiter (now that all data is csv)
5. filter out all other files' redundant column name rows
But I am struggling to find syntax to do something like
Csv.Document([Content],[Delimiter="", Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]))
is there something like this?
Thanks!
Solved! Go to Solution.
Here's the function I use to process multiple CSV files. I also dislike the helper functions that are created.
// fProcessFiles
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252]),
ConditionalBlankIndex = Table.AddColumn(CSV, "Custom", each if [Column2] = "" then 0 else 1),
HeaderPosition = List.PositionOf(ConditionalBlankIndex[Custom], 1),
RemoveRows = Table.Skip(CSV, HeaderPosition),
PromotedHeaders = Table.PromoteHeaders(RemoveRows, [PromoteAllScalars = true])
in
PromotedHeaders
in
fProcessFiles
This assumes there's a title (like you have described) but nothing in the first row of column 2.
Use the 'Add Column' >> 'Invoke Custom Function' on a table with a list of files.
Let me know if you have any questions. I can't say this is definitely faster but it is certainly cleaner to work with.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
I recommend reading this documentation for loading CSV files from a folder. It explicitly covers the header and column names situation you describe:
https://docs.microsoft.com/en-us/power-query/combine-files-csv
The basic idea is that you apply the same transformation function (e.g. remove the top row, promote headers, split by a delimiter, etc.) to each CSV file you load from the folder.
Thanks AlexisOlson,
Good suggestion. I am in fact actually using that solution right now!
But I am now experimenting to try something less complex eg without the helper query, function and parameter. My working assumption is that less complex will be faster.
So this method gets me same result with less complexity.
But still needs to be seen if it is faster!
Yeah. This is a good starting point but when I do it myself, I also bypass the extra helper queries and sample files to reduce clutter. The basic logic is the same but I do a bit more hand-coding.
Just tried "faking it" by specifying a delimiter that doesn't exist in the data eg in my case a pipe character "|" and it returned a single column as I wanted.
Csv.Document([Content],[Delimiter="|", Columns=1, Encoding=1252, QuoteStyle=QuoteStyle.None]))
Is there perhaps a "proper way" to get same result, that is a purpose created Power Query function or method?
Hi @009co,
I'm not sure why you need to bring in the file as a single column.
Typically, the pattern would look like this...
let
Source = Csv.Document(
File.Contents("H:\My Drive\Power BI\Community Solutions\Files\CSV with Title.csv"),
[Delimiter = ",", Columns = 2, Encoding = 1252, QuoteStyle = QuoteStyle.None]
),
#"Removed Top Rows" = Table.Skip(Source, 1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars = true]),
#"Changed Type1" = Table.TransformColumnTypes(
#"Promoted Headers",
{{"Header 1", type text}, {" Header 2", type text}}
)
in
#"Changed Type1"
Sometimes Power Query will automatically add a 'Changed Type' and 'Promoted Headers' step that needs to be/can be removed. Basically, the Table.Skip removes as many rows as necessary, 1 in this instance. Then you're left with the column headers at the top, so the Table.PromoteHeaders moves them to the column headers and you're good to go.
Files attached for reference.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Thanks KNP,
I am trying to use a different method, given the specific data issues in my files as described.
Here's the function I use to process multiple CSV files. I also dislike the helper functions that are created.
// fProcessFiles
let
fProcessFiles = (myFile as binary) =>
let
CSV = Csv.Document(myFile, [Encoding=1252]),
ConditionalBlankIndex = Table.AddColumn(CSV, "Custom", each if [Column2] = "" then 0 else 1),
HeaderPosition = List.PositionOf(ConditionalBlankIndex[Custom], 1),
RemoveRows = Table.Skip(CSV, HeaderPosition),
PromotedHeaders = Table.PromoteHeaders(RemoveRows, [PromoteAllScalars = true])
in
PromotedHeaders
in
fProcessFiles
This assumes there's a title (like you have described) but nothing in the first row of column 2.
Use the 'Add Column' >> 'Invoke Custom Function' on a table with a list of files.
Let me know if you have any questions. I can't say this is definitely faster but it is certainly cleaner to work with.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Yes the helper functions work but clutter things up too much.
Wondeful, a thing of beauty!
No need to filter out each successive files titles or header rows!
The custom function used on the list of files [Content] column values resulted in Tables that are then simply expanded so show all the rows of the data in all of the files.
I retained my file and folder names to parse out additional information to add a couple extra colums and was ready to go.
Can't say if it was faster but maybe .. yes? But sure is less complexity and number of query objects!
Thanks!
Glad you like it.
I am in the process of modifying it to allow it to work for multiple file formats but it's not done yet.
If you're happy this is solved, please click the 'Accept as Solution' button.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Check out the July 2025 Power BI update to learn about new features.