Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
Good morning! I'd like to ask for some help with something...
I have a table of transactions. I am summarizing them by IdenityID, Date, Time and Marketing Source. Problem is that there are some duplicates for (Identity, Date, Time) with different Marketing Sources (one with a value, one blank).
Here's some code and a pic of the output of the table statement below for one person on one day...
Marketing Count Test =
VAR
UniqueLogins =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
RETURN
UniqueLogins
In the above data, there are duplicates at 11:14, 2:42 and 2:49. How would I go about removing the only the duplicated records with a blank marketing source for those times? The records at other times with a blank value should be retained.
Any help would be appreciated!
Solved! Go to Solution.
Ok...this works
Marketing Count Test =
VAR
LoginTransactions =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
VAR
UniqueLogins =
SUMMARIZE(
LoginTransactions,
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
[ModifiedTime],
"MarketingSource",
MAX(IdentityLog[MarketingSourceLogId])
)
RETURN
UniqueLogins
But this seems overly complicated. I could achieve the same result with SUMMARIZECOLUMNS if I could truncate seconds from the time. Could anyone please offer some guidance on how to achieve that?
This is what I was looking for!
Active Users by Marketing Source Earliest Attribution =
VAR
LoginTransactions =
ADDCOLUMNS(
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
"Created Time",
MIN(IdentityLog[CreatedTime])
),
"MarketingSource",
LOOKUPVALUE(
IdentityLog[MarketingSourceLogId],
IdentityLog[IdentityId],
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
[Created Time]
)
)
RETURN
COUNTROWS(LoginTransactions)
A second, simpler version of code...
Marketing Count Test =
VAR
LoginTransactions =
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
"MarketingSource",
FIRSTNONBLANK(
IdentityLog[MarketingSourceLogId],
VALUES(IdentityLog[MarketingSourceLogId])
)
)
RETURN
LoginTransactions
Unfortunately this produces only those records where there is a value in Marketing Source. I did have the thought that maybe I could use this as a second table, produce a third filtering only blank from the previous code and then union them together but that would put me exactly where I started 🙄
Continuing to play with this. This produces almost exactly what I want
Marketing Count Test Redux =
VAR
LoginTransactions =
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
"MarketingSource",
IGNORE(
FIRSTNONBLANK(
IdentityLog[MarketingSourceLogId],
VALUES(IdentityLog[MarketingSourceLogId])
)
)
)
RETURN
LoginTransactions
The only problem with it is that it's grouping by HH:MM:SS as shown.
The original code using GROUPBY allowed me to add a column for truncating seconds from the time, but I couldn't remove the duplicates. This works, except I can't truncate seconds from the time.
Any suggestions most appreciated!
Ok...this works
Marketing Count Test =
VAR
LoginTransactions =
GROUPBY(
ADDCOLUMNS(
IdentityLog,
"ModifiedTime",
TIME(
HOUR(IdentityLog[CreatedTime]),
MINUTE(IdentityLog[CreatedTime]),
0
)
),
IdentityLog[IdentityId],
IdentityLog[MarketingSourceLogId],
IdentityLog[CreatedDate],
[ModifiedTime]
)
VAR
UniqueLogins =
SUMMARIZE(
LoginTransactions,
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
[ModifiedTime],
"MarketingSource",
MAX(IdentityLog[MarketingSourceLogId])
)
RETURN
UniqueLogins
But this seems overly complicated. I could achieve the same result with SUMMARIZECOLUMNS if I could truncate seconds from the time. Could anyone please offer some guidance on how to achieve that?
Could you post some of that sample data as text?
This is what I was looking for!
Active Users by Marketing Source Earliest Attribution =
VAR
LoginTransactions =
ADDCOLUMNS(
SUMMARIZECOLUMNS(
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
"Created Time",
MIN(IdentityLog[CreatedTime])
),
"MarketingSource",
LOOKUPVALUE(
IdentityLog[MarketingSourceLogId],
IdentityLog[IdentityId],
IdentityLog[IdentityId],
IdentityLog[CreatedDate],
IdentityLog[CreatedDate],
IdentityLog[CreatedTime],
[Created Time]
)
)
RETURN
COUNTROWS(LoginTransactions)
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
92 | |
86 | |
84 | |
66 | |
49 |
User | Count |
---|---|
140 | |
113 | |
106 | |
64 | |
60 |