Solved: Re: I need help importing this .txt file with its ...

finglonger76 · ‎04-03-2024

I need to bring in data from multiple .txt files. Each one has sections for products, with three columns of attribute descriptions followed by values. I need help getting the data into a useful form. I was thinking the product header would be the key or row(s) and the attributes would be the columns. I can't figure this out with the weird mix of delimiters. Columns 1, 2, and 3 are all attributes followed by values.

Any help would be awesome.

ToddChitt · ‎04-03-2024

So if I understand this correctly, the .txt file is not one long vertical list of key/value pairs, but instead three parallel lists. Is that right?

Maybe you start with one Power Query that strips out as many header rows as needed. Even take out the column header row above the actual data.

From there, REFERENCE that query, and remove the columns for lists 2 and 3. Set appropriate Column Header names.

Repeat for the middle list (removing lists 1 and 3), and again for list 3 (removing 1 and 2).

Finally, do a UNION (Append Queries in Power Query, not Merge, which is a join) to bring the 3 lists together into one long one.
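
The reference-and-append approach described above can be sketched in a single query. This is only an illustration under assumptions: the `Base` table, its column names (`Attr1`, `Val1`, and so on), and the sample values are hypothetical stand-ins for the cleaned base query, not the poster's real data, and the helper `PickList` is made up for this example.

```powerquery
let
    // Hypothetical stand-in for the cleaned base query:
    // three attribute/value pairs sitting side by side.
    Base = #table(
        {"Attr1", "Val1", "Attr2", "Val2", "Attr3", "Val3"},
        {
            {"Length", "10.5", "Width", "4.2", "Height", "7.0"},
            {"Weight", "3.1", "Color", "Red", "Finish", "Matte"}
        }
    ),
    // Keep one pair of columns and rename it to common headers
    // so the three pieces line up when appended.
    PickList = (attrCol as text, valCol as text) as table =>
        Table.RenameColumns(
            Table.SelectColumns(Base, {attrCol, valCol}),
            {{attrCol, "Attribute"}, {valCol, "Value"}}
        ),
    List1 = PickList("Attr1", "Val1"),
    List2 = PickList("Attr2", "Val2"),
    List3 = PickList("Attr3", "Val3"),
    // Append (union) the three two-column tables into one long list.
    Combined = Table.Combine({List1, List2, List3})
in
    Combined
```

In the UI this corresponds to three referenced queries followed by Append Queries. If the file holds several products, a product key column would need to be carried through each piece as well, which this sketch leaves out.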

dufoq3 · ‎04-03-2024

I started to play with it but don't want to spend much time.

You can start with something like this:

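let
    // Load the text file; replace the placeholder path with the real file location.
    Source = Csv.Document(File.Contents("Address\sample_file.txt"),[Delimiter=":", Columns=1, Encoding=1250, QuoteStyle=QuoteStyle.None]),
    // Drop empty lines.
    FilteredRows = Table.SelectRows(Source, each ([Column1] <> "")),
    // For each line, build a list of pieces in a new "Split" column.
    Ad_Split = Table.AddColumn(FilteredRows, "Split", each 
        [ // a: split wherever a space is followed by a digit or "-"
          a = Splitter.SplitTextByCharacterTransition({" "}, {"0".."9", "-"})([Column1]),
          // b: drop the first piece if it is blank
          b = if Text.Trim(a{0}) = "" then List.Skip(a) else a,
          // c: keep the list only if at least one piece contains "."; otherwise null
          c = if List.Count(List.Select(b, (x)=> Text.Contains(x, "."))) = 0 then null else b
        ][c], type list)
in
    Ad_Split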
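
To see what the character-transition splitter in the query above does, it can help to try it on a single line. The input string here is a hypothetical example, not taken from the poster's file, and the step names are made up for illustration.

```powerquery
let
    // Hypothetical input line, not from the real file.
    Line = "Weight 12.5 Height -3.0",
    // Build a splitter that cuts wherever a space is immediately
    // followed by a digit or a minus sign.
    SplitAtNumbers = Splitter.SplitTextByCharacterTransition({" "}, {"0".."9", "-"}),
    Parts = SplitAtNumbers(Line),
    // Trim leftover spaces from each piece.
    Trimmed = List.Transform(Parts, Text.Trim)
in
    Trimmed
```

Because the cut happens only in front of each number, a label that follows a number should stay attached to that number's piece, so a second split is usually still needed to separate values from the next attribute name.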

finglonger76 · ‎04-03-2024

Thank you. That is the way I started to go and after your suggestion I pressed on and it seems to work.

ronrsnfld · ‎04-03-2024

I'd start by converting the picture to the actual text. Power Query does not have a built-in OCR function.

If your data is really text and you are just showing a picture, I'd suggest you post your data as text which can be easily copy/pasted, and also what you expect the output to be. Power Query can split a string on multiple delimiters, and Unpivot would seem to be useful also.
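
As a rough sketch of the two features mentioned above (splitting on several delimiters, then Unpivot), assuming a made-up layout: the `Product` and `Raw` columns, the `;`, `|`, and `=` delimiters, and the sample values are all hypothetical and would need to be adapted to the real file.

```powerquery
let
    // Hypothetical sample: one row per product, with attribute pairs packed
    // into one text field using a mix of ";" and "|" as separators.
    Source = #table(
        {"Product", "Raw"},
        {
            {"Widget A", "Length=10.5;Width=4.2|Height=7.0"},
            {"Widget B", "Length=8.0;Width=3.3|Height=5.5"}
        }
    ),
    // Split Raw on either delimiter into three columns.
    SplitPairs = Table.SplitColumn(
        Source, "Raw",
        Splitter.SplitTextByAnyDelimiter({";", "|"}),
        {"Pair1", "Pair2", "Pair3"}
    ),
    // Turn the three pair columns into rows, keeping Product as the key.
    Unpivoted = Table.UnpivotOtherColumns(SplitPairs, {"Product"}, "Slot", "Pair"),
    // Split each "name=value" pair into Attribute and Value columns.
    SplitAttr = Table.SplitColumn(
        Unpivoted, "Pair",
        Splitter.SplitTextByDelimiter("="),
        {"Attribute", "Value"}
    ),
    // The helper column from the unpivot step is no longer needed.
    Result = Table.RemoveColumns(SplitAttr, {"Slot"})
in
    Result
```

The result is a long table with one row per product and attribute. Pivoting the Attribute column afterwards would give one row per product with attributes as columns, which is the shape the original question describes.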

finglonger76 · ‎04-03-2024

That is a picture for reference.

Here is a .txt file:

https://1drv.ms/t/s!ArRjzwL-cGRGnlQc_Fx5_asBVQY9?e=fXFXuT
