Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
I have pulled into the Power Query Editor a bank statement from a PDF and it has Check Num, $ Amt, and Date, repeated three times for a total of 9 columns. I've been trying to find a key word to search on how to "unpivot" the similar columns, but this is not a pure unpivot. How can I transform this to one table with three columns?
For now my workaround is copy the table 3x, remove different columns, merge these three queries into a new query.
Here is a pic:
Here's some sample data:
Number | Amount | Date | Number_1 | Amount_2 | Date_3 | Number_4 | Amount_5 | Date_6 |
89489 | $10. | 10/28/2023 | 92147 | $1,200. | 5-Oct-23 | 92180 | $2,500. | 10/17/2023 |
91728 | $25. | 10/05/2023 | 92148 | $3,740.92 | 20-Oct-23 | 92183 | $689. | 10/14/2023 |
92022 | $6. | 10/11/2023 | 92151 | $5,819.45 | 3-Oct-23 | 92184 | $1,667.02 | 10/20/2023 |
92072 | $117.5 | 10/03/2023 | 92152 | $236. | 4-Oct-23 | 92185 | $58.75 | 10/14/2023 |
92073 | $13,505.99 | 10/17/2023 | 92155 | $500.5 | 4-Oct-23 | 92186 | $7,778.24 | 10/27/2023 |
Solved! Go to Solution.
My #"Renamed Columns" should reference your last step:
let
Source = Pdf.Tables(
File.Contents(
"J:\Accounting\Bank Statements\2022\10-Oct 2022\WF TRH Main Operating 1425-Oct 2022.pdf"
),
[Implementation = "1.3"]
),
Table005 = Source{[Id = "Table005"]}[Data],
#"Promoted Headers1" = Table.PromoteHeaders(Table005, [PromoteAllScalars = true]),
#"Changed Type1" = Table.TransformColumnTypes(
#"Promoted Headers1",
{
{"Number", type text},
{"Amount", type number},
{"Date", type date},
{"Number_1", type text},
{"Amount_2", type number},
{"Date_3", type date},
{"Number_4", type text},
{"Amount_5", type number},
{"Date_6", type date}
}
),
#"Changed Type" = Table.TransformColumnTypes(
#"Changed Type1",
{
{"Number", type text},
{"Amount", Currency.Type},
{"Date", type date},
{"Number_1", type text},
{"Amount_2", Currency.Type},
{"Date_3", type date},
{"Number_4", type text},
{"Amount_5", Currency.Type},
{"Date_6", type date}
}
),
#"Extracted Text Before Delimiter" = Table.TransformColumns(
#"Changed Type",
{
{"Number", each Text.BeforeDelimiter(_, " *"), type text},
{"Number_1", each Text.BeforeDelimiter(_, " *"), type text},
{"Number_4", each Text.BeforeDelimiter(_, " *"), type text}
}
),
#"Renamed Columns" = Table.RenameColumns(
#"Extracted Text Before Delimiter",
{
{"Date", "Date_1"},
{"Date_3", "Date_2"},
{"Date_6", "Date_3"},
{"Number_1", "Number_2"},
{"Number_4", "Number_3"},
{"Amount_5", "Amount_3"},
{"Amount", "Amount_1"},
{"Number", "Number_1"}
}
),
#"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1, Int64.Type),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(
#"Unpivoted Columns",
"Attribute",
Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv),
{"Column", "ColumnGroup"}
),
#"Pivoted Column" = Table.Pivot(
#"Split Column by Delimiter",
List.Distinct(#"Split Column by Delimiter"[Column]),
"Column",
"Value"
),
#"Changed Type" = Table.TransformColumnTypes(
#"Pivoted Column",
{
{"Index", Int64.Type},
{"ColumnGroup", Int64.Type},
{"Number", Int64.Type},
{"Amount", Currency.Type},
{"Date", type date}
}
)
in
#"Changed Type"
Nice, @AlexisOlson
I would just List.Union each of the three similar columns, then List.Zip them for a Table.FromColumns step.
--Nate
That sound similar to my 3rd solution but I don't see where List.Zip comes into play.
Also, note that you probably want List.Combine rather than List.Union since the latter removes duplicates (assuming you don't use the equationCriteria argument).
@AlexisOlson @I agree--List.Combine is the way to go. But as for List.Zip, after you List.Combine each of the 3 columns that are similar, you have 3 lists of column values. If you List.Zip the lists, you get a list of lists combined by position, so a list of the first evaluate each column than a list of the second value in each column, and so forth. Then you have the list structure for Table.FromColumns:
Table.FromColumns({
{1, "Bob", "123-4567"},
{2, "Jim", "987-6543"},
{3, "Paul", "543-7890"}
})
So
= Table.FromColumns(List.Zip({List.Combine({Table[Number], Table[Number_1], Table[Number_4]}), List.Combine({Table[Date], Table[Date_2], Table[Date_5]}), List.Combine({Table[Amount], Table[Amount_3], Table[Amount_6]})}))
--Nate
Here's a method that unpivots all of the columns, groups the corresponding columns together and then pivots to three columns:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBLDsQgCEDvYrq0FlAETjEHaLqaK8z9M0proplZgXny+JxnUCtqIYYNIbWAcJAeBJTbwwiLOIsEjnl/vT/7gAodUmQYtSh37RXPYCik/oEfCjybneUoBZJRywlWeY9bVRvqMqlbRo4HxMnM2BFHRUuFW55XcblXqlUS0LMzLHJxOaIkfibPs98pZW9eVjd7a03Cf6cWXwpzOxkns/Vot/w2tIvyr712JlFEE5Vn8nHx6ws=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Number = _t, Amount = _t, Date = _t, Number_1 = _t, Amount_2 = _t, Date_3 = _t, Number_4 = _t, Amount_5 = _t, Date_6 = _t]),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Date", "Date_1"}, {"Date_3", "Date_2"}, {"Date_6", "Date_3"}, {"Number_1", "Number_2"}, {"Number_4", "Number_3"}, {"Amount_5", "Amount_3"}, {"Amount", "Amount_1"}, {"Number", "Number_1"}}),
#"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1, Int64.Type),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Column", "ColumnGroup"}),
#"Pivoted Column" = Table.Pivot(#"Split Column by Delimiter", List.Distinct(#"Split Column by Delimiter"[Column]), "Column", "Value"),
#"Changed Type" = Table.TransformColumnTypes(#"Pivoted Column",{{"Index", Int64.Type}, {"ColumnGroup", Int64.Type}, {"Number", Int64.Type}, {"Amount", Currency.Type}, {"Date", type date}})
in
#"Changed Type"
Thank you for your help. As I'm still new to Power Query, can I beg one more assist in Advanced Editor?
Below are my transformation steps, and then your recommendated script. I know I need to keep my source so do I blend in your script at "let _t = ((type nullable text)..."
let
Source = Pdf.Tables(File.Contents("J:\Accounting\Bank Statements\2022\10-Oct 2022\WF TRH Main Operating 1425-Oct 2022.pdf"), [Implementation="1.3"]),
Table005 = Source{[Id="Table005"]}[Data],
#"Promoted Headers1" = Table.PromoteHeaders(Table005, [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers1",{{"Number", type text}, {"Amount", type number}, {"Date", type date}, {"Number_1", type text}, {"Amount_2", type number}, {"Date_3", type date}, {"Number_4", type text}, {"Amount_5", type number}, {"Date_6", type date}}),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type1",{{"Number", type text}, {"Amount", Currency.Type}, {"Date", type date}, {"Number_1", type text}, {"Amount_2", Currency.Type}, {"Date_3", type date}, {"Number_4", type text}, {"Amount_5", Currency.Type}, {"Date_6", type date}}),
#"Extracted Text Before Delimiter" = Table.TransformColumns(#"Changed Type", {{"Number", each Text.BeforeDelimiter(_, " *"), type text}, {"Number_1", each Text.BeforeDelimiter(_, " *"), type text}, {"Number_4", each Text.BeforeDelimiter(_, " *"), type text}})
in
#"Extracted Text Before Delimiter"
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBLDsQgCEDvYrq0FlAETjEHaLqaK8z9M0proplZgXny+JxnUCtqIYYNIbWAcJAeBJTbwwiLOIsEjnl/vT/7gAodUmQYtSh37RXPYCik/oEfCjybneUoBZJRywlWeY9bVRvqMqlbRo4HxMnM2BFHRUuFW55XcblXqlUS0LMzLHJxOaIkfibPs98pZW9eVjd7a03Cf6cWXwpzOxkns/Vot/w2tIvyr712JlFEE5Vn8nHx6ws=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Number = _t, Amount = _t, Date = _t, Number_1 = _t, Amount_2 = _t, Date_3 = _t, Number_4 = _t, Amount_5 = _t, Date_6 = _t]),
#"Renamed Columns" = Table.RenameColumns(Source,{{"Date", "Date_1"}, {"Date_3", "Date_2"}, {"Date_6", "Date_3"}, {"Number_1", "Number_2"}, {"Number_4", "Number_3"}, {"Amount_5", "Amount_3"}, {"Amount", "Amount_1"}, {"Number", "Number_1"}}),
#"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1, Int64.Type),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv), {"Column", "ColumnGroup"}),
#"Pivoted Column" = Table.Pivot(#"Split Column by Delimiter", List.Distinct(#"Split Column by Delimiter"[Column]), "Column", "Value"),
#"Changed Type" = Table.TransformColumnTypes(#"Pivoted Column",{{"Index", Int64.Type}, {"ColumnGroup", Int64.Type}, {"Number", Int64.Type}, {"Amount", Currency.Type}, {"Date", type date}})
in
#"Changed Type"
My #"Renamed Columns" should reference your last step:
let
Source = Pdf.Tables(
File.Contents(
"J:\Accounting\Bank Statements\2022\10-Oct 2022\WF TRH Main Operating 1425-Oct 2022.pdf"
),
[Implementation = "1.3"]
),
Table005 = Source{[Id = "Table005"]}[Data],
#"Promoted Headers1" = Table.PromoteHeaders(Table005, [PromoteAllScalars = true]),
#"Changed Type1" = Table.TransformColumnTypes(
#"Promoted Headers1",
{
{"Number", type text},
{"Amount", type number},
{"Date", type date},
{"Number_1", type text},
{"Amount_2", type number},
{"Date_3", type date},
{"Number_4", type text},
{"Amount_5", type number},
{"Date_6", type date}
}
),
#"Changed Type" = Table.TransformColumnTypes(
#"Changed Type1",
{
{"Number", type text},
{"Amount", Currency.Type},
{"Date", type date},
{"Number_1", type text},
{"Amount_2", Currency.Type},
{"Date_3", type date},
{"Number_4", type text},
{"Amount_5", Currency.Type},
{"Date_6", type date}
}
),
#"Extracted Text Before Delimiter" = Table.TransformColumns(
#"Changed Type",
{
{"Number", each Text.BeforeDelimiter(_, " *"), type text},
{"Number_1", each Text.BeforeDelimiter(_, " *"), type text},
{"Number_4", each Text.BeforeDelimiter(_, " *"), type text}
}
),
#"Renamed Columns" = Table.RenameColumns(
#"Extracted Text Before Delimiter",
{
{"Date", "Date_1"},
{"Date_3", "Date_2"},
{"Date_6", "Date_3"},
{"Number_1", "Number_2"},
{"Number_4", "Number_3"},
{"Amount_5", "Amount_3"},
{"Amount", "Amount_1"},
{"Number", "Number_1"}
}
),
#"Added Index" = Table.AddIndexColumn(#"Renamed Columns", "Index", 1, 1, Int64.Type),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(
#"Unpivoted Columns",
"Attribute",
Splitter.SplitTextByDelimiter("_", QuoteStyle.Csv),
{"Column", "ColumnGroup"}
),
#"Pivoted Column" = Table.Pivot(
#"Split Column by Delimiter",
List.Distinct(#"Split Column by Delimiter"[Column]),
"Column",
"Value"
),
#"Changed Type" = Table.TransformColumnTypes(
#"Pivoted Column",
{
{"Index", Int64.Type},
{"ColumnGroup", Int64.Type},
{"Number", Int64.Type},
{"Amount", Currency.Type},
{"Date", type date}
}
)
in
#"Changed Type"
I thought I posted a thank you, but I don't see it. Your solution works great. Thank you so much. I'll now study teh Applied Steps to learn how to do this.
Here are another two possible methods:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBLDsQgCEDvYrq0FlAETjEHaLqaK8z9M0proplZgXny+JxnUCtqIYYNIbWAcJAeBJTbwwiLOIsEjnl/vT/7gAodUmQYtSh37RXPYCik/oEfCjybneUoBZJRywlWeY9bVRvqMqlbRo4HxMnM2BFHRUuFW55XcblXqlUS0LMzLHJxOaIkfibPs98pZW9eVjd7a03Cf6cWXwpzOxkns/Vot/w2tIvyr712JlFEE5Vn8nHx6ws=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Number = _t, Amount = _t, Date = _t, Number_1 = _t, Amount_2 = _t, Date_3 = _t, Number_4 = _t, Amount_5 = _t, Date_6 = _t]),
#"Removed Other Columns" =
Table.FromColumns(
List.Transform(
List.Split(
Table.ToColumns(
Table.SelectColumns(
Source,
List.Sort(Table.ColumnNames(Source))
)
),
3
),
List.Combine
),
{"Amount", "Date", "Number"}
),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"Amount", Currency.Type}, {"Date", type date}, {"Number", Int64.Type}})
in
#"Changed Type"
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZBLDsQgCEDvYrq0FlAETjEHaLqaK8z9M0proplZgXny+JxnUCtqIYYNIbWAcJAeBJTbwwiLOIsEjnl/vT/7gAodUmQYtSh37RXPYCik/oEfCjybneUoBZJRywlWeY9bVRvqMqlbRo4HxMnM2BFHRUuFW55XcblXqlUS0LMzLHJxOaIkfibPs98pZW9eVjd7a03Cf6cWXwpzOxkns/Vot/w2tIvyr712JlFEE5Vn8nHx6ws=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Number = _t, Amount = _t, Date = _t, Number_1 = _t, Amount_2 = _t, Date_3 = _t, Number_4 = _t, Amount_5 = _t, Date_6 = _t]),
Col_List = {{"Number", "Number_1", "Number_4"}, {"Amount", "Amount_2", "Amount_5"}, {"Date", "Date_3", "Date_6"}},
NumberCols = List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Number")), // {Number,Number_1,Number_4}
AmountCols = List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Amount")), // {Amount,Amount_2,Amount_5}
DateCols = List.Select(Table.ColumnNames(Source), each Text.Contains(_, "Date")), // {Date,Date_3,Date_6}
CombineNumbers = List.Combine(List.Transform(NumberCols, each Table.Column(Source, _))),
CombineAmounts = List.Combine(List.Transform(AmountCols, each Table.Column(Source, _))),
CombineDates = List.Combine(List.Transform(DateCols, each Table.Column(Source, _))),
ToTable = Table.FromColumns({CombineNumbers, CombineAmounts, CombineDates}, {"Number", "Amount", "Date"}),
#"Changed Type" = Table.TransformColumnTypes(ToTable,{{"Number", Int64.Type}, {"Amount", Currency.Type}, {"Date", type date}})
in
#"Changed Type"