Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for a $400 discount.
Register nowGet inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.
We're aiming to transform a table with unstructured data into a more organized format using Power Query. The goal is to group information by an "Identifier" column, extracting specific data points like company, address, and ZIP code into their respective columns. Now I have a table with the information mix into many rows in just one column, in other column is the identificatory appearing in every row containing information about that identificator. I need to identify with information in the rows that belong to the Company column, those to Address and those to Postcode. I want to ignore those that do not belong anywhere. The information about the company can appear in two rows. How can I perform this conversion in Power Query? I show one table as it looks now and the other as I want it to look.
I have other information in the table in other columns that cannot be deleted (e.g. by deleting all rows that contain @)
Current table
Identifikator | Information |
6 |
|
6 | Unternehmen 1 |
6 | Abteilung 3 |
6 | Hügelstr. 69 |
6 | Kommentar |
6 | 60433 Frankfurt |
8 |
|
8 | Unternehmen 2 |
8 |
|
8 | Zeil 5 |
8 | 60313 Frankfurt am Main |
8 | |
9 |
|
9 | Unternehmen 3 |
9 | Elbestrasse 48 |
9 | 60329 Frankfurt |
9 |
|
10 | Unternehmen 4 |
10 | Abteilung 6 |
10 | Gutleutstr.7-11 |
10 | 60329 Frankfurt |
10 |
Final table
Identificator | Company | Adress | Zip Code | |
6 | Unternehmen 1 - Abteilung 3 | Hügelstr. 69 | 60433 Frankfurt |
|
8 | Städtische Kitas Frankfurt am Main | Zeil 5 | 60313 Frankfurt am Main | |
9 | ABG Frankfurt Holding | Elbestrasse 48 | 60329 Frankfurt | |
10 | Stadt Frankfurt am Main -Amt für Bau und Immobilien | Gutleutstr.7-11 | 60329 Frankfurt |
Notes on the transformations
- Identifiers 6 and 10 have company information in two rows, they need to be combined in the column for the company
- Emails should be ignored
- Address goes in one column
- Zip code also goes in one column
I will very grateful if you have an idea on how to proceed for this transformation. Thank you!
Solved! Go to Solution.
Hi @Luzadriana255 ,
@Shravan133 's reply is great and vaild.
Here's my solution with .pbix file attached.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZC7DoIwFIZfpemshFKssOngJTGOLhKGqgck9JKUMvhubr6YiPFg0K39/vT/Tk+WUUEnlOaT9+FgPDgDVw2GMKTLk4dKtaYkHNn2cS9BNd4FRKRId1Z3T710SEQYc07WTpq6aJ3vefJRJiNl9BsfOzOZ4VWEnH3VEanJXlYG87oOykWpZaWCs9U9Tj916cjGka7UCbqvyKYBEieIO1mUjmbHNhaO6uIBDwsTA9y0XkHrXyubTxkbgn+aPqhMYRc3ebU2uADN8yc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Identifikator = _t, Information = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Identifikator", Int64.Type}, {"Information", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Information] <> "")),
//Rows where the Information column is empty are filtered out.
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each not Text.Contains([Information], "@")),
//Rows where the Information column contains “@” are filtered out.
#"Added Custom" = Table.AddColumn(#"Filtered Rows1", "Custom", each if Text.Contains([Information], "Zeil ") or Text.Contains([Information], "Hügelstr") or Text.Contains([Information], "Elbestrasse") or Text.Contains([Information], "Gutleutstr") then "Address"
else if Text.Contains([Information], "603") or Text.Contains([Information], "604") then "ZIP Code"
else if Text.Contains([Information], "Unternehmen") or Text.Contains([Information], "Abteilung") then "Company"
else null),
//A new column Custom is added based on the content of the Information column:
//If Information contains certain street names, it is labeled as “Address”.
//If Information contains certain ZIP codes, it is labeled as “ZIP Code”.
//If Information contains certain company-related terms, it is labeled as “Company”.
//Otherwise, it is labeled as null.
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Table.Group(#"Added Custom", {"Custom","Identifikator"}, {{"Information", each Text.Combine([Information], "-"), type text}})),
//Another custom column is added, grouping the table by Custom and Identifikator, and combining the Information values into a single string separated by “-”.
#"Expanded Custom.1" = Table.ExpandTableColumn(#"Added Custom1", "Custom.1", {"Information"}, {"Information.1"}),
//The grouped table is expanded back into the main table, adding a new column Information.1.
#"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Custom.1", each Text.Contains([Information.1],[Information])),
//A new column Custom.1 is added, checking if Information.1 contains the original Information value.
#"Filtered Rows2" = Table.SelectRows(#"Added Custom2", each ([Custom.1] = true)),
//Rows where Custom.1 is true are kept.
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows2",{"Information"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Filtered Rows3" = Table.SelectRows(#"Removed Duplicates", each ([Custom] <> null)),
//Rows where Custom is not null are kept.
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows3",{"Custom.1"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns1", List.Distinct(#"Removed Columns1"[Custom]), "Custom", "Information.1")
//The table is pivoted, using the distinct values in the Custom column as new columns, and the Information.1 values as the data.
in
#"Pivoted Column"
Current table:
Final result:
Best Regards,
Stephen Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @Luzadriana255 ,
@Shravan133 's reply is great and vaild.
Here's my solution with .pbix file attached.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZC7DoIwFIZfpemshFKssOngJTGOLhKGqgck9JKUMvhubr6YiPFg0K39/vT/Tk+WUUEnlOaT9+FgPDgDVw2GMKTLk4dKtaYkHNn2cS9BNd4FRKRId1Z3T710SEQYc07WTpq6aJ3vefJRJiNl9BsfOzOZ4VWEnH3VEanJXlYG87oOykWpZaWCs9U9Tj916cjGka7UCbqvyKYBEieIO1mUjmbHNhaO6uIBDwsTA9y0XkHrXyubTxkbgn+aPqhMYRc3ebU2uADN8yc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Identifikator = _t, Information = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Identifikator", Int64.Type}, {"Information", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Information] <> "")),
//Rows where the Information column is empty are filtered out.
#"Filtered Rows1" = Table.SelectRows(#"Filtered Rows", each not Text.Contains([Information], "@")),
//Rows where the Information column contains “@” are filtered out.
#"Added Custom" = Table.AddColumn(#"Filtered Rows1", "Custom", each if Text.Contains([Information], "Zeil ") or Text.Contains([Information], "Hügelstr") or Text.Contains([Information], "Elbestrasse") or Text.Contains([Information], "Gutleutstr") then "Address"
else if Text.Contains([Information], "603") or Text.Contains([Information], "604") then "ZIP Code"
else if Text.Contains([Information], "Unternehmen") or Text.Contains([Information], "Abteilung") then "Company"
else null),
//A new column Custom is added based on the content of the Information column:
//If Information contains certain street names, it is labeled as “Address”.
//If Information contains certain ZIP codes, it is labeled as “ZIP Code”.
//If Information contains certain company-related terms, it is labeled as “Company”.
//Otherwise, it is labeled as null.
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Table.Group(#"Added Custom", {"Custom","Identifikator"}, {{"Information", each Text.Combine([Information], "-"), type text}})),
//Another custom column is added, grouping the table by Custom and Identifikator, and combining the Information values into a single string separated by “-”.
#"Expanded Custom.1" = Table.ExpandTableColumn(#"Added Custom1", "Custom.1", {"Information"}, {"Information.1"}),
//The grouped table is expanded back into the main table, adding a new column Information.1.
#"Added Custom2" = Table.AddColumn(#"Expanded Custom.1", "Custom.1", each Text.Contains([Information.1],[Information])),
//A new column Custom.1 is added, checking if Information.1 contains the original Information value.
#"Filtered Rows2" = Table.SelectRows(#"Added Custom2", each ([Custom.1] = true)),
//Rows where Custom.1 is true are kept.
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows2",{"Information"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Columns"),
#"Filtered Rows3" = Table.SelectRows(#"Removed Duplicates", each ([Custom] <> null)),
//Rows where Custom is not null are kept.
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows3",{"Custom.1"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns1", List.Distinct(#"Removed Columns1"[Custom]), "Custom", "Information.1")
//The table is pivoted, using the distinct values in the Custom column as new columns, and the Information.1 values as the data.
in
#"Pivoted Column"
Current table:
Final result:
Best Regards,
Stephen Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
To transform your unstructured data into a more organized format using Power Query, follow these steps:
Example Custom Column Formula:
if Text.Contains([Information], "Str.") or Text.Contains([Information], "Hauptstraße") then "Address"
else if Text.Contains([Information], "603") or Text.Contains([Information], "604") then "ZIP Code"
else if Text.Contains([Information], "Unternehmen") or Text.Contains([Information], "Firma") then "Company"
else null
Example of concatenating company information:
Text.Combine([Information], " - ")
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!
Check out the February 2025 Power BI update to learn about new features.
User | Count |
---|---|
28 | |
28 | |
23 | |
14 | |
10 |
User | Count |
---|---|
23 | |
21 | |
17 | |
10 | |
9 |