Don't miss your chance to take the Fabric Data Engineer (DP-700) exam on us!
Learn moreNext up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now
Hi guys,
I have some data and it came from a manual entry CRM so it has alot of human errors. (which you will see in the Customer Stage, the status will be marked as XXX Error)
The first step that I have to do is to filter out the Customer profile which is at the latest stage.
Customer Stage ranking: Lead OR Lead Error -> Prospect OR Prospect Error-> Profile OR Profile Error (final stage)
Below is a piece of sample data I have:
| Customer | Profile Status | Asset |
| A | Lead | 10 |
| A | Prospect | 20 |
| B | Lead | 30 |
| B | Prospect | 30 |
| B | Profile | 50 |
| C | Lead | 40 |
| C | Profile | 60 |
| C | Profile | 60 |
| D | Lead | 15 |
| D | Prospect | 20 |
| D | Prospect Error | 20 |
| E | Profile | 5 |
| E | Profile Error | 5 |
The result I am looking for as per below:
| Customer | Profile Status | Asset |
| A | Prospect | 20 |
| B | Profile | 50 |
| C | Profile | 60 |
| C | Profile | 60 |
| D | Prospect | 20 |
| D | Prospect Error | 20 |
| E | Profile | 5 |
| E | Profile Error | 5 |
You will probably realize that there are still duplicates after the exercise, however, my goal here is not trying to de-dup but picking up the profile at their latest stages.
Lastly, if this can be done in power editor that would be great as I will need to join this table with other tables afterwards.
Any tips are appreciated.
Thank you
Solved! Go to Solution.
Hi @ljx0648 ,
Try this in the query editor. Paste the code in a blank query.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUfJJTUxRANKGBkqxOhChgKL84oLU5BIg0wgi6oSk0BghhKQQVTQtMycVyDKFCDoj6TZBCCHUmeERdEF2oylcCNON2EVdUR2EXSwWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, #"Profile Status" = _t, Asset = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Profile Status", type text}, {"Asset", Int64.Type}}),
#"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Profile Status", Text.Trim, type text}}),
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Profile Status Order", each if [Profile Status] = "Lead" then 1 else if [Profile Status] = "Prospect" then 2 else 3),
#"Sorted Rows" = Table.Sort(#"Added Custom",{{"Profile Status Order", Order.Descending}}),
#"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Max", each let customer = [Customer]
in
List.Max(
Table.SelectRows(#"Sorted Rows", each [Customer ] = customer)[Profile Status Order])),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Filter", each [Profile Status Order]=[Max]),
#"Filtered Rows" = Table.SelectRows(#"Added Custom2", each [Filter] = true),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Customer", "Profile Status", "Asset"}),
#"Sorted Rows1" = Table.Sort(#"Removed Other Columns",{{"Customer", Order.Ascending}})
in
#"Sorted Rows1"
You can actually click the wrench of the first Added Custom column step and add more conditions.
Hi @ljx0648 ,
Try this in the query editor. Paste the code in a blank query.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUfJJTUxRANKGBkqxOhChgKL84oLU5BIg0wgi6oSk0BghhKQQVTQtMycVyDKFCDoj6TZBCCHUmeERdEF2oylcCNON2EVdUR2EXSwWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Customer = _t, #"Profile Status" = _t, Asset = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Customer", type text}, {"Profile Status", type text}, {"Asset", Int64.Type}}),
#"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Profile Status", Text.Trim, type text}}),
#"Added Custom" = Table.AddColumn(#"Trimmed Text", "Profile Status Order", each if [Profile Status] = "Lead" then 1 else if [Profile Status] = "Prospect" then 2 else 3),
#"Sorted Rows" = Table.Sort(#"Added Custom",{{"Profile Status Order", Order.Descending}}),
#"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Max", each let customer = [Customer]
in
List.Max(
Table.SelectRows(#"Sorted Rows", each [Customer ] = customer)[Profile Status Order])),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Filter", each [Profile Status Order]=[Max]),
#"Filtered Rows" = Table.SelectRows(#"Added Custom2", each [Filter] = true),
#"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Customer", "Profile Status", "Asset"}),
#"Sorted Rows1" = Table.Sort(#"Removed Other Columns",{{"Customer", Order.Ascending}})
in
#"Sorted Rows1"
Moreover, I stuck at running this part of the code
#"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Max", each let customer = [Customer] in List.Max( Table.SelectRows(#"Sorted Rows", each [Customer ] = customer)[Profile Status Order])),
May I know if I should put , behind the customer)
That code is passing the [Customer] column as a variable named customer to be used inside Table.SelectRows. What it does is it filters the previous applied step on the customer column according to the customer column value in the current step. The current customer value had to be passed as a variable otherwise the code would have been written as
Table.SelectRows(#"Sorted Rows", each [Customer ] = [Customer])
which will return true thus not filtering the previous step at all.
[Profile Status Order]
This drills down into the Profile Status Order column of the previous step, creating a list. List.Max will return the max value from that list.
Thank you so much for your patience and the code works as you suggested.
However, this line of code
#"Added Custom1" = Table.AddColumn(#"Sorted Rows", "Max", each let customer = [Customer] in List.Max( Table.SelectRows(#"Sorted Rows", each [Customer ] = customer)[Profile Status Order])),
took me 8 hours (overnight) to run the codes & get the result as there are 150K records in the real data.
I also tried approached this by grouping the customer 1st then select the Max order, but this approach only return ONE record (which I willl need duplicate if it is in the same stage).
May I know if you have any idea that can optimize this?
Thank you so much for your quick response!
Another quick question here, I just found out there are manual entry errors in the raw data.
Thus the Customer Stage ranking: (Lead Or Lead Error) -> (Prospect Or Propect Error) -> (Profile Or Profile Error) (final stage)
In other words, there are two status in each Stage.
May I know how should I modify your code to capture that as well?
Much appreicated!
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 50 | |
| 44 | |
| 41 | |
| 18 | |
| 18 |
| User | Count |
|---|---|
| 69 | |
| 68 | |
| 32 | |
| 32 | |
| 32 |