Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
cgkas
Helper V
Helper V

How to import not homogeneous text file?

Good day, 

 

May someone could help me with this complex problem [at least for me 🙂 ]. I'm using Power Query in Excel, but the rules would be almost the same in Power Bi.

 

I ´m trying to import this not homogeneous text file that has upper part with one format and the lower part with other format.

Upper data appears in blocks that begins with parameter Az until Gz and values are to the right after the "=" sign in same line. Within each block [Az to Gz] are sub blocks separated with an empty line. So when tabulate the values of those sub blocks should go in different lines.

 

For the lower part of the data with second format (a mix of first format and new format), the blocks of data begins with parameter "Bz" until RRCO. Then comes some consecutive lines arranged in 4 columns. 

 

I have the idea to import it separating by spaces or by "=" sign, but since the file has the second format in lower part, separating by "=" wouldn't separate the second format in lower part

 

My expected output is like this:

 

image.png

 

My input file is like this:

Thanks in advance for any help.

SOME TEXT SOME TEXT SOME TEXT
SOME TEXT
RETCODE = 0

                        Az = 1
                        Bz = 5
                        Cz = 3
						 
                        Dz = 0
                        Ez = 0
						 
                        Dz = 4
                        Ez = 1
                        Gz = 1


blah blah


---    ENDBLOCK

SOME TEXT
RETCODE = 0
                        Az = 3
                        Bz = 2
                        Cz = 8
                        Dz = 6
                        Ez = 9
                        Fz = 3



blah blah

blah blah

---    ENDBLOCK

SOME TEXT
RETCODE = 0
                        Az = 7
                        Bz = 2
                        Cz = 2
                        Dz = 9
                        Ez = 0
						 
                        Dz = 1
                        Ez = 4
						 
                        Dz = 7
                        Ez = 0
                        Fz = 1
						 
                        Dz = 1
                        Gz = 8


blah blah

blah blah

---    ENDBLOCK
%%XYZ HGGYT:%%

BLA BLAH = 0 

                            Bz = 2
                            Cz = MAIN-332
                           DRM = CTRRES
                          RRCO = DLAM
 
blah blahblah blahblah blahblah blah
-----------------------------------------------
 
RDX         FCD         MBZ          HGW                        
PRM         <NULL>      XJN          bbv.uj002.pkl706.svfdata   
PRM         ALLO        STT          bbv.uj003.pkl706.svfdata   
PRM         <NULL>      BNN          bbv.uj410.pkl310.svfdata   


blah blah

---    ENDBLOCK
%%XYZ HGGYT:%%

BLA BLAH = 0 

                            Bz = 5
                            Cz = GOPAG
                           DRM = BRI5T
                          RRCO = TALD
 
blah blahblah blahblah blahblah blah
-----------------------------------------------
 
RDX         FCD         MBZ          HGW  
Rdx         Fcd         Mbz          bbv.123k.svfdata
REJ         PRMRJC      YW           bbv.322h7.svfdata
PRM         <NULL>      G4           bbv.119.svfdata

 

 

2 REPLIES 2
cgkas
Helper V
Helper V

Thanks for your answer. I was able to import the file and remove garbage and empty lines. Now I have a table containing only rows I need, but I'm stuck in how to manipulate it in order to get the desired output.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("zZVdT8IwFIb/SkOyO0f2wYeYeLGtY4D7IKXGAXLBQEOECxPFmP16186tMLZ2RIyehHJWnr49PX3J5vPGJPBsgO0Qg5Kssbg6IOgTsrEVQBvcAoU+0wFUhBEnnMpFTIK0uYhFEF28G4zzqqoQO65XOJVqiaX4Z3MYwoZot9wAMhTmZVmmsj403cC6Yz/w+1+1t8GaVoXQ1mvi1l+LW9URt6rHRfqFO+b2q2TqN7rXvUj3+AgUt+Ysz/INaTNbi6X4x7fFf7Z+wf4/KNxhPryEMyQpnM7AwHGm+EaSGGu6Bkg+A3I0IC6cRA0bkKBW8IyhL+u6kIbIS2gLI2RPBCxCVpCw0DW8lDxuAyfJmnNOHGyBYJgX0bdgnnvmjFU3cB6qCqci4+SgWTzuFUVf+feuS7OndDYc+WxNFH009y+KojVft7uu0mm+fTyvl+/LMjnDdYMsn2B8KqLXECmvyfRPamqpCpHTk6+C3J97k/9uJUG96QRjw6lnTBMN27ieL7Hhwv/qyxRffzJ8tWZ4FDOc3LGq6dvsbr9fJaMcSDyDRlaaTw8tT1bqmrbpHi0VW8xpFURUtcckFl8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Filtered Rows" = Table.SelectRows(Source, 
                         each 
                            Text.StartsWith([Column1], " ") or
                            //List.Count(Text.Split([Column1]," ")) > 4 
                            Text.Contains([Column1], "svfdata")
                        ),
    RemoveMultiSpaces = Table.TransformColumns(#"Filtered Rows", {{"Column1", each Text.Combine(List.Select(Text.SplitAny(_, " "), each _ <> "")," "), type text}}),
    #"Filtered Rows1" = Table.SelectRows(RemoveMultiSpaces, each ([Column1] <> "")),
    #"Replaced Value" = Table.ReplaceValue(#"Filtered Rows1"," = ","@",Replacer.ReplaceText,{"Column1"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value"," ","@",Replacer.ReplaceText,{"Column1"}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value1", "Column1", Splitter.SplitTextByDelimiter("@", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4"})
in
    #"Split Column by Delimiter"

 

This is my current output. 

v-rzhou-msft
Community Support
Community Support

Hi @cgkas ,

 

As far as I know, Power BI does not support to load text file with unhomogeneous structure as you requirement.

Here I suggest you to try "Extract Table Using Examples" function in Text connector.

RicoZhou_0-1670915287066.png

Function:

RicoZhou_1-1670915762123.png

You may refer to this video to learn more details: Import Text/CSV using Examples in Power BI

 

Best Regards,
Rico Zhou

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.