Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi all,
I am an apprentice to M language, but I am aware of ETL using Power query. I need help to add a custom column using group by in order to flag each transaction if the customer is new or not.
I have a reference date - "1/1/2021".
The logic I am looking for is to add an IF ELSE statement within the table.group function to achieve a "Y" flag for new customer if the helper column( unique identifier ) has been occured after the reference date, else "N".
Please see my sample below, I have been deriving the flag in excel using formulas.
Open to all suggestion and thanks in advance. Appreciate your help to cross this bridge and be able to build this logic in PQ using excel sample as reference.
@Alex_T
I wrote this logic for Sku Count based on your solution for New Customer Flag above. Please see below query:
= Table.Group(Source,"Account-Zip-Sku",{{"Sku Count", each if List.Dates[Invoice Dates] <= List.Dates[Start of Month] then "0" else "1"}})
But it throws me this error :
Expression.Error: We cannot apply field access to the type Function.
Details:
Value=[Function]
Key=Invoice Dates
Notes: I referenced the orginal source and created a reference query to test and derive Sku Count using the above logic.
If I remove the List. Dates function then it throws this error:
Expression.Error: We cannot apply operator <= to types List and List.
Details:
Operator=<=
Left=[List]
Right=[List]
Could you please help me write the correct query?
Thank you!
Hiya @Anonymous ,
Table.Group requires aggregations. [Invoice Dates] is referrring to the column as a column in the first error and as a list in the second error, in both cases it is trying to make sense of the expression, which unfortunately doesn't follow the right syntax
List.Dates(start as date, count as number, step as duration) as list
List.Dates creates a sequential list of dates, at best using [Invoice Dates], if the syntax was right, would offer a start date but this would fail straight away as it's a list and not a scalar value, your syntax is List.Dates[Invoice Dates], not List.Dates(<parameters>)
TBH, I have never used Table.Groups before but it appeared to produce the right result for you for New Customer, in this case I think you want a count and Table.Groups may again be the right approach (in which case I think you may want to use List.Min again with the dates, but I need to be sure I understand what you are trying to achieve). In the example PQ_Test, all the Invoice Dates are >= Start of Month so it looks like that is poor test data as it contradicts your <= comparison (or at least will work poorly with your test).
So this is a bit confusing (I also have no previous experience of what SKUs are! I work with hospital data!), can you clarify, is this right...
I'm not sure it is and if not could you re-write it and clarify what you need and we can have another go 🙂
Cheers
PS this is perhaps closer to what you appeared to be trying to do so may be what you want...
Table.Group(Source,"Unique ID",{ "SKU_count", each
if List.Min([Invoice Date]) <= List.Min([Start of month]) then 1 else 0})
Also, these are good references for learning PowerQuery...
Power Query M function reference - PowerQuery M | Microsoft Docs
Power Query M Primer (part 1): Introduction, Simple Expressions & let | Ben Gribaudo
Hi @Alex_T
I appreciate you taking time to help.
I am attaching a new sample data close to my orginal file (due to confidentiality can't share the OG file).I build pivot table based reports using csv and excel files for CPG industry, SKU is item in terms of supply chain.
Here's what I am trying to achieve. I get data sources from different databases and I am trying to perform all the ETL using power query and automate the refresh as each week fresh transactions are posted.
I have been in a fix to derive these two columns:
The samples I am able to share here doesn't contain previous two years data but my original file does. that's why we are considering, reference date.
We are using the logic Invoice date<= Start of the month - becuase the client needs to set filters at month level.
I hope there is an agile way I can derive these in the same table as two new columns?
Can you please look into this and let me know?
Note: I tried the logic
Table.Group(Source,"Unique ID",{ "SKU_count", each if List.Min([Invoice Date]) <= List.Min([Start of month]) then 1 else 0})
But I got this error
Cheers,
V
Hi @Anonymous ,
Hopefully this will help...
New_customer_table
let
_reference_date = #date(2021,01,01),
Source = PQ_Test,
Custom1 = Table.Group(Source,"Unique ID",{
{"New Customer", each if List.Min([Sku Count]) = 0 and List.Min([Invoice Date]) >= _reference_date then "Yes" else "No"}
})
in
Custom1
Using Table.Groups you will obviously be creating a separate reference table of customer status. You can create a new column with a join.
New_PQ_Test_table
let
Source = Table.NestedJoin(PQ_Test, {"Unique ID"}, New_customer_table,
{"Unique ID"}, "Join_table", JoinKind.LeftOuter),
Expanded_Join_table = Table.ExpandTableColumn(Source, "Join_table",
{"New Customer"}, {"Join_table.New Customer"})
in
Expanded_Join_table
You need a new table for this as joining onto the existing PQ_Test will create a circular reference, to avoid seeing both original and new tables in the report, deselect "Enable load"
Let me know if this works for you.
Best wishes, Alex
Hi @Alex_T
Thank you, I appreciate your help.
I guess I am one step closer to achieving what I need to.
I forgot to mention a prerequisite step in my orginal post. Kindly help me on this.
In order to derive the New Customer flag, I need to first derive the sku count in PQ as a new custom coulmn.
The Sku count is dependent on the Unique ID that I have already derived.
Request/Questions
1) Can you please guide me - how to derive the Sku count in PQ? logic for SKu count is in the sample file.
2)The table in the sample file is already loaded into PQ in my computer for the weekly tasks. Must I still include the line "Source = PQ_Test," in the query editor?
Thank you!