The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hello,
can someone help me out with a piece of M code that allows me to make a check on the data quality of an input file.
I have 2 tables:
tbl_data : this table if the result of an inputform and contains 2 columns (which are both free text)
- a postal code
- name of a city / commune
tbl_Bpost : contains 2 columns and is a download of the Belgian postal organisation. (so this s the only version of the truth)
- a postal code
- name of a city / commune
Remark : 1 postal code can occur several times in the table, with each time a different city namen (cf example 2 below)
I want to add in tbl_data a custom column 'CHECK" with the value "OK" or "NOK" where
OK is the result of a check that is ok = the name of the city in tbl_data matches with one of the names in tbl_Bpost of that specific postalcode
Example 1 :
for Postal code 9300 in Tbl_data, I have 4 lines --> row 1 and row 4 are NOK row 2 and 3 are OK
because in Tbl_Bpost :
Example 2
for Postal code 8400 in Tbl_data, I have 3 lines --> row 2 is NOK row 1 and 3 are OK
because in Tbl_Bpost :
I assume I have to use the function List.contain" but I am unable to make the list in table "Tbl_Post" varialbe (based on the postal code in Tbl_data
Thank you for your help
regards
Bjorn
Solved! Go to Solution.
Hi @Bjorn_C
I agree with @HotChilli , you need to select both postcode and city columns as matching columns, just like below. Hold on Ctrl key on keyboard and select columns in the same sequence.
The result is like this
Then add a custom column to get the Check result.
if Table.IsEmpty([Tbl_Bpost]) then "NOK" else "OK"
All M code
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WsjAxMFDSUfLPLy5JzUtJVYrVQYiVpqTmJZUWpSMLRiXmpZTl5xdBlVoag0X9MlNLy1OLskEoD1nCMTGnuASvAJiKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"postal code" = _t, city = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"postal code", type text}, {"city", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"postal code", "city"}, Tbl_Bpost, {"Postcode", "Plaatsnaam"}, "Tbl_Bpost", JoinKind.LeftOuter),
#"Added Custom" = Table.AddColumn(#"Merged Queries", "Check", each if Table.IsEmpty([Tbl_Bpost]) then "NOK" else "OK")
in
#"Added Custom"
Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.
Hi @Bjorn_C
The merge step is not correct. It needs to join on 2 pairs of columns - postcode and city. You can do this in the Merge dialog using ctrl key (a small index number will appear next to the column to show the order of matching).
If you get this step correct, the column that gets returned will contain a table with the matching records. If there are no matching records the table will be empty, so if you use Table.IsEmpty, passing it the name of the column, you will get a true or false
As said above, make sure your types match, and then:
--Nate
Initially, I am going to assume the data is in a range (however, it could be a table). Convert both ranges to Power Queries by High-Lighting one of the Range/Table, then click on the Data ribbon, then the group: Get and Transform Data , and click on the command button: From Table/Range. Make sure the postal codes are "text"; this is done by clicking on the characters immediately to the left of the field and selecting "Text".
Example:
From:
To:
"Name" the first query: Tbl_data.
Close and Load the table.
Do similar step with the data: Tbl_Bpost, except of course, name the query: Tbl_Bpost.
Then again click on the Data ribbon, and in the group: Get and Transform Data, click on the command button: Get Data, Combine queries, Merge. For the first (or top) query select table: Tbl_data and the second (lower) query select table: Tbl_Bpost.
Click on the fields: postal code and Postcode, respectfully, and then click OK.
When the Power Query editor opens, expand the table: Tbl_Bpost by clicking the icon to the right of the table name and select the Plaatsnaam field, uncheck the “Use original column name as prefix”, and click “OK”. The Postcode is not needed as it is linked to the postal code.
Now select the ribbon: Add Column, and click the “Conditional Column” button. Name the new column: Check, complete as shown below and click Ok.
Finally, click the ribbon: Home, click Close & Load, click the top choice: Close & Load
A new worksheet will be added with the desired results.
Good Luck
You can do a Merge (comparing on 2 pairs of columns). It would be a Left Outer. This would return a column containing a table of matched rows.
Then you can add a column which uses Table.IsEmpty to test this returned column and return true or false
---
Alternatively, add a column using Table.ContainsAny -> the documentation on this is not great but if you have a go, I will help if you get stuck
I loaded both tables into PowerQuery made a merge of both.
For the example of postal code "8400" in my tbl_data, I have again the 3 rows, and due to to the merge I have a table matching each row (I did not expand the tables).
It is the final step (comparing for each line if the value in the column city, matches at least one of the value in the
the code so far is (including the filter on postam code "8400") :
let
Source = Table.NestedJoin(Tbl_data, {"postal code"}, Tbl_Bpost, {"Postcode"}, "Tbl_Bpost", JoinKind.LeftOuter),
#"Filtered Rows" = Table.SelectRows(Source, each [postal code] = "8400")
in
#"Filtered Rows"
Can someone help me with the next line of code to add the custom column?
Regards
Bjorn
Your next step would be:
= Table.AddColumn(#"Filtered Rows", "Valid", each if List.Contains([Tbl_Bpost][Postcode], [city]) then "OK" else "NOK", type text)
--Nate
Hi Nate,
thanks for the input. However I do not get the expected result.
In the first and third row I expected "OK"
Below the code I am using
let
Source = Table.NestedJoin(Tbl_data, {"postal code"}, Tbl_Bpost, {"Postcode"}, "Tbl_Bpost", JoinKind.LeftOuter),
#"Filtered Rows" = Table.SelectRows(Source, each [postal code] = "8400"),
Custom1 = Table.AddColumn(#"Filtered Rows", "Valid", each if List.Contains([Tbl_Bpost][Postcode], [city]) then "OK" else "NOK", type text)
in
Custom1
Any idea where it goes wrong
thx
Bjorn
Hi @Bjorn_C
I agree with @HotChilli , you need to select both postcode and city columns as matching columns, just like below. Hold on Ctrl key on keyboard and select columns in the same sequence.
The result is like this
Then add a custom column to get the Check result.
if Table.IsEmpty([Tbl_Bpost]) then "NOK" else "OK"
All M code
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WsjAxMFDSUfLPLy5JzUtJVYrVQYiVpqTmJZUWpSMLRiXmpZTl5xdBlVoag0X9MlNLy1OLskEoD1nCMTGnuASvAJiKjQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"postal code" = _t, city = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"postal code", type text}, {"city", type text}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"postal code", "city"}, Tbl_Bpost, {"Postcode", "Plaatsnaam"}, "Tbl_Bpost", JoinKind.LeftOuter),
#"Added Custom" = Table.AddColumn(#"Merged Queries", "Check", each if Table.IsEmpty([Tbl_Bpost]) then "NOK" else "OK")
in
#"Added Custom"
Best Regards,
Community Support Team _ Jing
If this post helps, please Accept it as Solution to help other members find it.
Thanks everyone!!!
it is working now 🙂
Regards
Bjorn
The appearance of your post "seems" to indicate that table: Tbl_data, field: Postal Code is "text" (as it is left-aligned); whereas the table: Tbl_Bpost, field: Postcode is "numeric" (as it is right-aligned).
To my knowledge, to set a relationship between tables, the linked fields MUST be the same type. I follow the rule that field should be numeric ONLY if one record's field can be added to another record's same field and the result is meaningful. In that regard, codes associated with Postal Codes should be "text", not "numeric".
I hope that helps. 🙂