Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi, im a new power query user. Appreciate any response for my query below:
I have a file which contain the header fields and data. How can i filter the rows to only select the header fields and data.
Sample as below:
START-OF-FILE
START-OF-FIELDS
header1
header2
header3
END-OF-FIELDS
TIMESTARTED=xxx
START-OF-DATA
data1,data2,data3
data4,data5,data6
data7,data8,data9
data10,data11,data12
END-OF-DATA
TIMEFINISHED=xxx
END-OF-FILE
Solved! Go to Solution.
Hi @Isma ,
In Power Query create a new blank query, go to Advanced Editor andpaste the following over the default code to follow the steps I took:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCg5xDArR9XfTdfP0cVWK1UERcfVxCQaLZaQmpqQWGSKxjZDYxmC2q58Lmi4wEeLp6wo20tXFtqKiAtUKF8cQR7BISmJJoqEOiDQCk8ZwURMw3xRMmsFFzcF8CzBpiTDBACxgCDHJ0AjZXXCrQA5y8/TzDPZAchHc8aBAiAUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
filterUnwanted = Table.SelectRows(chgTypes, each [Column1] <> "" and [Column1] <> "START-OF-FILE" and [Column1] <> "START-OF-FIELDS" and [Column1] <> "END-OF-FIELDS" and [Column1] <> "TIMESTARTED=xxx" and [Column1] <> "START-OF-DATA" and [Column1] <> "END-OF-DATA" and [Column1] <> "TIMEFINISHED=xxx" and [Column1] <> "END-OF-FILE"),
addValueType = Table.AddColumn(filterUnwanted, "valueType", each if Text.Contains([Column1], ",") then "data" else "header"),
headerTable = Table.SelectRows(addValueType, each ([valueType] = "header")),
listHeaders = Table.Group(headerTable, {"valueType"}, {{"Column1", each Text.Combine([Column1], ",")}}),
dataTable = Table.SelectRows(addValueType, each ([valueType] = "data")),
appendHeadersAndData = Table.Combine({listHeaders, dataTable}),
remValueTypeCol = Table.RemoveColumns(appendHeadersAndData,{"valueType"}),
splitColsByComma = Table.SplitColumn(remValueTypeCol, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3"}),
promTableHeads = Table.PromoteHeaders(splitColsByComma, [PromoteAllScalars=true])
in
promTableHeads
Summary:
1) Filter out any row that we know we don't want, e.g. "END-OF-FIELDS" etc.
2) Classify the remaining values - if cell contains "," then assume data, else header.
3) Split this table into separate header and data tables, and combine headers into comma-separated list.
4) Append comma-separated headers back onto data table.
5) Split into columns by comma delimiter.
6) Promote first row to headers.
This gives me the following output:
Pete
Proud to be a Datanaut!
Hi @Isma ,
In Power Query create a new blank query, go to Advanced Editor andpaste the following over the default code to follow the steps I took:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCg5xDArR9XfTdfP0cVWK1UERcfVxCQaLZaQmpqQWGSKxjZDYxmC2q58Lmi4wEeLp6wo20tXFtqKiAtUKF8cQR7BISmJJoqEOiDQCk8ZwURMw3xRMmsFFzcF8CzBpiTDBACxgCDHJ0AjZXXCrQA5y8/TzDPZAchHc8aBAiAUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
filterUnwanted = Table.SelectRows(chgTypes, each [Column1] <> "" and [Column1] <> "START-OF-FILE" and [Column1] <> "START-OF-FIELDS" and [Column1] <> "END-OF-FIELDS" and [Column1] <> "TIMESTARTED=xxx" and [Column1] <> "START-OF-DATA" and [Column1] <> "END-OF-DATA" and [Column1] <> "TIMEFINISHED=xxx" and [Column1] <> "END-OF-FILE"),
addValueType = Table.AddColumn(filterUnwanted, "valueType", each if Text.Contains([Column1], ",") then "data" else "header"),
headerTable = Table.SelectRows(addValueType, each ([valueType] = "header")),
listHeaders = Table.Group(headerTable, {"valueType"}, {{"Column1", each Text.Combine([Column1], ",")}}),
dataTable = Table.SelectRows(addValueType, each ([valueType] = "data")),
appendHeadersAndData = Table.Combine({listHeaders, dataTable}),
remValueTypeCol = Table.RemoveColumns(appendHeadersAndData,{"valueType"}),
splitColsByComma = Table.SplitColumn(remValueTypeCol, "Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3"}),
promTableHeads = Table.PromoteHeaders(splitColsByComma, [PromoteAllScalars=true])
in
promTableHeads
Summary:
1) Filter out any row that we know we don't want, e.g. "END-OF-FIELDS" etc.
2) Classify the remaining values - if cell contains "," then assume data, else header.
3) Split this table into separate header and data tables, and combine headers into comma-separated list.
4) Append comma-separated headers back onto data table.
5) Split into columns by comma delimiter.
6) Promote first row to headers.
This gives me the following output:
Pete
Proud to be a Datanaut!
Hi,
With some modification to your codes I managed to solve my issues. Thanks a lot with the solution and tips. Thank again 😉