Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount.
Register nowThe Power BI DataViz World Championships are on! With four chances to enter, you could win a spot in the LIVE Grand Finale in Las Vegas. Show off your skills.
Good morning! I'd like to ask for some help with something...
I have a table of transactions. I am summarizing them by IdenityID, Date, Time and Marketing Source. Problem is that there are some duplicates for (Identity, Date, Time) with different Marketing Sources (one with a value, one blank).
Here's some code and a pic of the output of the table statement below for one person on one day...
Marketing Count Test =
VAR
UniqueLogins =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
RETURN
UniqueLogins
In the above data, there are duplicates at 11:14, 2:42 and 2:49. How would I go about removing the only the duplicated records with a blank marketing source for those times? The records at other times with a blank value should be retained.
Any help would be appreciated!
Solved! Go to Solution.
Ok...this works
Marketing Count Test =
VAR
LoginTransactions =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
VAR
UniqueLogins =
SUMMARIZE(
LoginTransactions,
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
[ModifiedTime],
"MarketingSource",
MAX(IdentityLog[MarketingSourceLogId])
)
RETURN
UniqueLogins
But this seems overly complicated. I could achieve the same result with SUMMARIZECOLUMNS if I could truncate seconds from the time. Could anyone please offer some guidance on how to achieve that?
This is what I was looking for!
Active Users by Marketing Source Earliest Attribution =
VAR
LoginTransactions =
ADDCOLUMNS(
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
"Created Time",
MIN(IdentityLog[CreatedTime])
),
"MarketingSource",
LOOKUPVALUE(
IdentityLog[MarketingSourceLogId],
IdentityLog[IdentityId],
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
[Created Time]
)
)
RETURN
COUNTROWS(LoginTransactions)
A second, simpler version of code...
Marketing Count Test =
VAR
LoginTransactions =
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
"MarketingSource",
FIRSTNONBLANK(
IdentityLog[MarketingSourceLogId],
VALUES(IdentityLog[MarketingSourceLogId])
)
)
RETURN
LoginTransactions
Unfortunately this produces only those records where there is a value in Marketing Source. I did have the thought that maybe I could use this as a second table, produce a third filtering only blank from the previous code and then union them together but that would put me exactly where I started 🙄
Continuing to play with this. This produces almost exactly what I want
Marketing Count Test Redux =
VAR
LoginTransactions =
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
"MarketingSource",
IGNORE(
FIRSTNONBLANK(
IdentityLog[MarketingSourceLogId],
VALUES(IdentityLog[MarketingSourceLogId])
)
)
)
RETURN
LoginTransactions
The only problem with it is that it's grouping by HH:MM:SS as shown.
The original code using GROUPBY allowed me to add a column for truncating seconds from the time, but I couldn't remove the duplicates. This works, except I can't truncate seconds from the time.
Any suggestions most appreciated!
Ok...this works
Marketing Count Test =
VAR
LoginTransactions =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
VAR
UniqueLogins =
SUMMARIZE(
LoginTransactions,
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
[ModifiedTime],
"MarketingSource",
MAX(IdentityLog[MarketingSourceLogId])
)
RETURN
UniqueLogins
But this seems overly complicated. I could achieve the same result with SUMMARIZECOLUMNS if I could truncate seconds from the time. Could anyone please offer some guidance on how to achieve that?
Could you post some of that sample data as text?
This is what I was looking for!
Active Users by Marketing Source Earliest Attribution =
VAR
LoginTransactions =
ADDCOLUMNS(
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
"Created Time",
MIN(IdentityLog[CreatedTime])
),
"MarketingSource",
LOOKUPVALUE(
IdentityLog[MarketingSourceLogId],
IdentityLog[IdentityId],
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
[Created Time]
)
)
RETURN
COUNTROWS(LoginTransactions)
User | Count |
---|---|
120 | |
67 | |
66 | |
57 | |
50 |
User | Count |
---|---|
176 | |
83 | |
69 | |
65 | |
54 |