March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
I have 87000+ responses from a survey and one of the questions was zip codes. I'm trying to filter through the zip codes and get rid of answers like "234", "V3g1s4", "no thanks", etc. Using the filter featers works for getting rid of blanks, but for this many responses, how can I get rid of invalid answers efficiently? I'm looking for a method to filter out any non-numeric responses and any responses less than or greater than 5 digits.
On a side note, if anyone knows of the best mapping app to use within PowerBi, please share! I'm working with several different features. ArcGis cannot display the amount of zip codes I have.
Solved! Go to Solution.
Perhaps:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjE2MDNRitWJVjIyhtBlxumGxRBmXr5CSUZiXnYxmAdWamqmFBsLAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Int32.From([Column1])), #"Removed Errors" = Table.RemoveRowsWithErrors(#"Added Custom", {"Custom"}), #"Duplicated Column" = Table.DuplicateColumn(#"Removed Errors", "Custom", "Custom - Copy"), #"Calculated Text Length" = Table.TransformColumns(Table.TransformColumnTypes(#"Duplicated Column", {{"Custom - Copy", type text}}, "en-US"),{{"Custom - Copy", Text.Length, Int64.Type}}), #"Filtered Rows" = Table.SelectRows(#"Calculated Text Length", each ([#"Custom - Copy"] = 5)), #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom", "Custom - Copy"}) in #"Removed Columns"
If you are wanting to validate the zipcodes you might need to take it a step futher. Basic validation, about the only thing you can check is if the zipcode is numeric or not (USA Only). There are regex for zipcodes - but we didn't have much luck with them since we only have USA data - although it might help if you are getting USA and Canada zipcodes coming in.
We require a 5 digit zipcode (USA Only) to be entered on our microsites. The main goto entry for spam leads is "12345".. which is a valid zipcode (Schenectady, NY) - so we have to use other criteria to weed out the spam leads from this region. Point is - if you have 5 digits - it might look ok - but it could still be an invalid zipcode.
We resorted to buying zipcode data from: https://www.zip-codes.com/ which gets updated monthly. This has the added benefit of providing us additional data: County, City, State, Longitude, Latitude, AreaCodes, etc. We have a data cleanup step before loading the data into Power BI. With this data, we can use a SQL join to determine if we have an invalid zipcode or not (with exceptions).
Perhaps:
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjE2MDNRitWJVjIyhtBlxumGxRBmXr5CSUZiXnYxmAdWamqmFBsLAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Int32.From([Column1])), #"Removed Errors" = Table.RemoveRowsWithErrors(#"Added Custom", {"Custom"}), #"Duplicated Column" = Table.DuplicateColumn(#"Removed Errors", "Custom", "Custom - Copy"), #"Calculated Text Length" = Table.TransformColumns(Table.TransformColumnTypes(#"Duplicated Column", {{"Custom - Copy", type text}}, "en-US"),{{"Custom - Copy", Text.Length, Int64.Type}}), #"Filtered Rows" = Table.SelectRows(#"Calculated Text Length", each ([#"Custom - Copy"] = 5)), #"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom", "Custom - Copy"}) in #"Removed Columns"
Okay, I'm a little illiterate when it comes to syntax. I tried pasting the code and it gave me this:
Yeah, can't do that, I started with an Enter Data query. Create a blank query and paste in my code. Basically, what I did was to create a custom column with this formula:
=Int32.From([Column1])
This returns a number if it is a number or Error if not. I then right-clicked this column and chose "Remove Errors". Then you could select this column and choose Transform | Extract | Length. Then filter to just 5. Finally, just remove this column.
You should be left with your original column with just valid zip codes.
Amazing! Thank you so much!
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
124 | |
89 | |
84 | |
70 | |
51 |
User | Count |
---|---|
206 | |
143 | |
97 | |
79 | |
68 |