The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
I'm currently working on a Power BI project with a dataset called "CRM Data Set." In this dataset, I have columns for "email," "mobile," and "course." I'm facing a specific challenge where I need to create a custom index column based on the following criteria:
Main Objectives:
Identify Repeated Inquiries: I want to identify repeated inquiries, which can be determined by comparing the "mobile" and "email" columns. If either the mobile number or email address matches with any other entry in the dataset, I need to mark it as a repeated entry.
Course Inquiry Identification: I also need to track which course each inquiry is about. This is crucial for understanding which programs each individual inquired about.
Create a Unique Identifier: Since my dataset lacks a primary or foreign key column, I need to generate a unique identifier for each record. This identifier should allow me to reference records in other tables. It's important to note that different types of students may appear in the dataset, and their phone numbers or email addresses might duplicate in other inquiries. Additionally, I can't solely rely on the date, email, or mobile for differentiation, as data import can result in identical date and time values.
Attached screenshot is of my sample columns for your clarity I want to make unique indentifier which should represent (Unique record Identifier, Track individual record e.g. I am not clear but maybe combination of created on (date), email, mobile, course)
Specific Requirements for the Index Column:
If the "email" or "mobile" and course value matches any other record in the entire dataset, the index should be set to 0 for that row.
If the "email" or "mobile" and course values are unique among all other rows, the index should be incremented by 1 for each unique combination of "email" and "mobile."
Thanks in advance
Hello - here you are.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rdDPC4IwFAfw/8Wzpc0ij1HRDyGQspN4GDbcaPOJC8L/vpFzNLND1Pew994OH3gvTZ25Nwk85KPAcZ0IaKnLohCY8XEOQn1MTNQQ00ayXKpuA5zDfXSunMz9IxR64VfOjskb1I3qjkRWUEqiGeR3zgnXmHbVkpCJGvbJAII65MCuRBeLCEzUsGTAoWgGlnp3cooJ/4mKmGhfS5maWFfuL9YhMRbtayEzEzWsKBHqys8jr2uoKnLpKwmUjS6fnS0BQVrmZafsAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Created = _t, Lead = _t, Email = _t, Mobile = _t, Course = _t, Stage = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Created", type date}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Course", Order.Ascending}, {"Created", Order.Ascending}, {"Email", Order.Ascending}, {"Mobile", Order.Ascending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"Course", "Email", "Mobile"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"Data", each Table.AddIndexColumn ( _, "Id", 0, 1, Int64.Type ), type table}}, 0, (x,y) => Number.From ( x[Course] <> y[Course] and x[Email] <> y[Email] or x[Mobile] <> y[Mobile] ) ),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Created", "Lead", "Email", "Mobile", "Course", "Stage", "Id"}, {"Created", "Lead", "Email", "Mobile", "Course", "Stage", "Id"})
in
#"Expanded Data"
Here is a preview from the grouped rows step...
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.