Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
So I have a bunch of really messy pdf files that i'm drawing data from and they have the data all in one column when pulled into power bi like this:
I want to reorgnaize the data into this format, summing the numbers that belong in the same category:
Any ideas on how to approach?
Solved! Go to Solution.
So I went through and used power query to get a table that looks like this:
Column1 | Column1 - Copy |
Build Hours | 300 |
Build Hours | 200 |
Demo Hours | 400 |
Other Hours | 200 |
Other Hours | 200 |
Other Hours | 100 |
Here's the power query (and I'm sure it could be cleaned up, but it gets a usuable result:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcirNzElR8MgvLSpWitWJVjI2MADT6OJGUHGX1Nx8JGETqLB/SUZqERblMNoQRMcCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}), #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Column1", "Column1 - Copy"), #"Remove Numbers" = Table.TransformColumns(#"Duplicated Column",{{"Column1", each Text.Combine(List.RemoveItems(Text.ToList(_),{"0".."9"}))}}), #"Replaced Value" = Table.ReplaceValue(#"Remove Numbers","",null,Replacer.ReplaceValue,{"Column1"}), #"Filled Down" = Table.FillDown(#"Replaced Value",{"Column1"}), #"Changed Type1" = Table.TransformColumnTypes(#"Filled Down",{{"Column1 - Copy", Int64.Type}}), #"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type1", {"Column1 - Copy"}) in #"Removed Errors"
So I went through and used power query to get a table that looks like this:
Column1 | Column1 - Copy |
Build Hours | 300 |
Build Hours | 200 |
Demo Hours | 400 |
Other Hours | 200 |
Other Hours | 200 |
Other Hours | 100 |
Here's the power query (and I'm sure it could be cleaned up, but it gets a usuable result:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcirNzElR8MgvLSpWitWJVjI2MADT6OJGUHGX1Nx8JGETqLB/SUZqERblMNoQRMcCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}), #"Duplicated Column" = Table.DuplicateColumn(#"Changed Type", "Column1", "Column1 - Copy"), #"Remove Numbers" = Table.TransformColumns(#"Duplicated Column",{{"Column1", each Text.Combine(List.RemoveItems(Text.ToList(_),{"0".."9"}))}}), #"Replaced Value" = Table.ReplaceValue(#"Remove Numbers","",null,Replacer.ReplaceValue,{"Column1"}), #"Filled Down" = Table.FillDown(#"Replaced Value",{"Column1"}), #"Changed Type1" = Table.TransformColumnTypes(#"Filled Down",{{"Column1 - Copy", Int64.Type}}), #"Removed Errors" = Table.RemoveRowsWithErrors(#"Changed Type1", {"Column1 - Copy"}) in #"Removed Errors"
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
101 | |
72 | |
47 | |
39 | |
33 |
User | Count |
---|---|
157 | |
101 | |
60 | |
43 | |
40 |