Fabric is Generally Available. Browse Fabric Presentations. Work towards your Fabric certification with the Cloud Skills Challenge.
Hello,
I want to import a legacy text file which is having data split into two rows for each field. The headers are repeated for each page.
Screenshot of Sample Data
Thank you
Solved! Go to Solution.
That's quite a challenge.
My suggestion would be to first create a table with the data of each first row and another table with the data of each second row.
Then combine both tables.
I created the following example file with part of your data (at estimated positions):
Assuming you need no data from the headers and assuming input is at fixed positions, I created the following code.
I renamed each step and otherwise I adjusted generated code (or created code using the advance editor) for the steps:
Source (adjusted in order to get only 1 column)
SplitByPosition1 and -2
RemovedTopRows1 and - 2 (remove based on condition)
CombinedAsRecords
let Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"), // First the first rows SplitByPosition1 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})), Trimmed1 = Table.TransformColumns(SplitByPosition1,{},Text.Trim), RemovedTopRows1 = Table.Skip(Trimmed1,each _[Column1.1]<>"A/C NO"), PromotedHeaders1 = Table.PromoteHeaders(RemovedTopRows1), Typed1 = Table.TransformColumnTypes(PromotedHeaders1,{{"A/C NO", type number}, {"Princ Bal", type number}, {"Int Accr", type number}}, "en-US"), RemovedErrors1 = Table.RemoveRowsWithErrors(Typed1), ResultPart1 = Table.SelectRows(RemovedErrors1, each ([#"A/C NO"] <> null)), // Now parse the second rows SplitByPosition2 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{18,15},{50,10}})), Trimmed2 = Table.TransformColumns(SplitByPosition2,{},Text.Trim), RemovedTopRows2 = Table.Skip(Trimmed2,each _[Column1.1]<>"Interest Rat"), PromotedHeaders2 = Table.PromoteHeaders(RemovedTopRows2), Typed2 = Table.TransformColumnTypes(PromotedHeaders2,{{"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"), RemovedErrors2 = Table.RemoveRowsWithErrors(Typed2), ResultPart2 = Table.SelectRows(RemovedErrors2, each ([Interest Rat] <> null)), // Now combine the results CombinedAsRecords = Table.FromColumns({Table.ToRecords(ResultPart1),Table.ToRecords(ResultPart2)}), Expanded1 = Table.ExpandRecordColumn(CombinedAsRecords, "Column1", {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}, {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}), Result = Table.ExpandRecordColumn(Expanded1, "Column2", {"Interest Rat", "Fees Recv"}, {"Interest Rat", "Fees Recv"}) in Result
This should help you to create the code for your file. If you need further help then please provide a representative sample file.
This is the result from my example file:
You may want to add some finishing steps, like reordering columns.
An alternative solution which is probably much faster is the following in which the source is duplicated in 2 table columns, with the 2nd column shifted 1 row.
let Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"), // Add column shifted 1 row AddedShiftedColumn = Table.FromColumns({Source[Column1],List.Skip(Source[Column1],1)}), SplitByPosition1 = Table.SplitColumn(AddedShiftedColumn,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})), SplitByPosition2 = Table.SplitColumn(SplitByPosition1,"Column2",Splitter.SplitTextByRanges({{18,15},{50,10}})), Trimmed = Table.TransformColumns(SplitByPosition2,{},Text.Trim), RemovedTopRows = Table.Skip(Trimmed,each _[Column1.1] <> "A/C NO"), PromotedHeaders = Table.PromoteHeaders(RemovedTopRows), Typed = Table.TransformColumnTypes(PromotedHeaders,{{"A/C NO", Int64.Type}, {"Princ Bal", type number}, {"Int Accr", type number}, {"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"), RemovedErrors = Table.RemoveRowsWithErrors(Typed), Filtered = Table.SelectRows(RemovedErrors, each ([#"A/C NO"] <> null)) in Filtered
That's quite a challenge.
My suggestion would be to first create a table with the data of each first row and another table with the data of each second row.
Then combine both tables.
I created the following example file with part of your data (at estimated positions):
Assuming you need no data from the headers and assuming input is at fixed positions, I created the following code.
I renamed each step and otherwise I adjusted generated code (or created code using the advance editor) for the steps:
Source (adjusted in order to get only 1 column)
SplitByPosition1 and -2
RemovedTopRows1 and - 2 (remove based on condition)
CombinedAsRecords
let Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"), // First the first rows SplitByPosition1 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})), Trimmed1 = Table.TransformColumns(SplitByPosition1,{},Text.Trim), RemovedTopRows1 = Table.Skip(Trimmed1,each _[Column1.1]<>"A/C NO"), PromotedHeaders1 = Table.PromoteHeaders(RemovedTopRows1), Typed1 = Table.TransformColumnTypes(PromotedHeaders1,{{"A/C NO", type number}, {"Princ Bal", type number}, {"Int Accr", type number}}, "en-US"), RemovedErrors1 = Table.RemoveRowsWithErrors(Typed1), ResultPart1 = Table.SelectRows(RemovedErrors1, each ([#"A/C NO"] <> null)), // Now parse the second rows SplitByPosition2 = Table.SplitColumn(Source,"Column1",Splitter.SplitTextByRanges({{18,15},{50,10}})), Trimmed2 = Table.TransformColumns(SplitByPosition2,{},Text.Trim), RemovedTopRows2 = Table.Skip(Trimmed2,each _[Column1.1]<>"Interest Rat"), PromotedHeaders2 = Table.PromoteHeaders(RemovedTopRows2), Typed2 = Table.TransformColumnTypes(PromotedHeaders2,{{"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"), RemovedErrors2 = Table.RemoveRowsWithErrors(Typed2), ResultPart2 = Table.SelectRows(RemovedErrors2, each ([Interest Rat] <> null)), // Now combine the results CombinedAsRecords = Table.FromColumns({Table.ToRecords(ResultPart1),Table.ToRecords(ResultPart2)}), Expanded1 = Table.ExpandRecordColumn(CombinedAsRecords, "Column1", {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}, {"A/C NO", "Borrower", "Princ Bal", "Int Accr"}), Result = Table.ExpandRecordColumn(Expanded1, "Column2", {"Interest Rat", "Fees Recv"}, {"Interest Rat", "Fees Recv"}) in Result
This should help you to create the code for your file. If you need further help then please provide a representative sample file.
This is the result from my example file:
You may want to add some finishing steps, like reordering columns.
Note this topic is cross posted over here.
An alternative solution which is probably much faster is the following in which the source is duplicated in 2 table columns, with the 2nd column shifted 1 row.
let Source = Csv.Document(File.Contents("C:\Users\Marcel\Documents\Forum bijdragen\Power BI Community\Power Query to import text file having two rows for each field\Legacy.txt"),1,"#(lf)"), // Add column shifted 1 row AddedShiftedColumn = Table.FromColumns({Source[Column1],List.Skip(Source[Column1],1)}), SplitByPosition1 = Table.SplitColumn(AddedShiftedColumn,"Column1",Splitter.SplitTextByRanges({{0,13},{17,15},{35,15},{63,10}})), SplitByPosition2 = Table.SplitColumn(SplitByPosition1,"Column2",Splitter.SplitTextByRanges({{18,15},{50,10}})), Trimmed = Table.TransformColumns(SplitByPosition2,{},Text.Trim), RemovedTopRows = Table.Skip(Trimmed,each _[Column1.1] <> "A/C NO"), PromotedHeaders = Table.PromoteHeaders(RemovedTopRows), Typed = Table.TransformColumnTypes(PromotedHeaders,{{"A/C NO", Int64.Type}, {"Princ Bal", type number}, {"Int Accr", type number}, {"Interest Rat", type number}, {"Fees Recv", type number}}, "en-US"), RemovedErrors = Table.RemoveRowsWithErrors(Typed), Filtered = Table.SelectRows(RemovedErrors, each ([#"A/C NO"] <> null)) in Filtered
Thank you so much Sir.
Check out the November 2023 Power BI update to learn about new features.
Read the latest Fabric Community announcements, including updates on Power BI, Synapse, Data Factory and Data Activator.