Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
djkoenig
Helper II
Helper II

Creating a Sequence ID to Reconcile Time Spent

Hello Experts,

Link to Sample Dataset + Below Excel Workbook excerpt: https://docs.google.com/spreadsheets/d/1K4oGRoCh5o_nEpEPot7T8gd5ZahjYSdE/edit?usp=drive_link&ouid=10...

 

SUMMARY

I have encountered an issue that I’m sure someone has solved before. In essence, I need a way to calculate hours per person per day spent on site, encompassing breaks, days off, and improper sign-outs.

My idea is that I want to create a running difference and use that running difference as a flag along (with a bit of additional logic) to create a sequence ID. This sequence ID then will be the join between two tables (Entry and Exit) and I will basically smush the proper exit with the proper entry on a single line as a result.

BACKGROUND

We are monitoring check-in and check-out times at a security gate. We have some data that looks like:

Yellow lines indicate potential problem areas. Red numbers are calculated fields. Please note, this is viewing just two individuals to hopefully make things a bit easier to visualize.

djkoenig_0-1708655348473.png

So, it’s not a terrible exercise in Excel. Column G is set to equal 1. Colum H says if Badge ID = Prev. Badge ID then Add the running total to the Record number, if not just evaluate record. You see this working in line 29 when the Name/Badge ID switches.

Column I reads the check-in gate and then the running difference (column F) to create the sequence ID. The logic is set that if the Check-In Gate equals “Main Gate Exit” AND Running Difference is greater than 1, then add one to the running total on badge ID, otherwise if Check-In Gate still equals “Main Gate Exit” but Running Difference is Less than 1, give me just running total on Badge ID. If none of that is true (basically Check-In Gate equals Main Gate Entry), then add the record number (1) to the running total on Badge ID.

 

REQUIREMENTS

The only other caveat to this is that we cannot generate anything on a spreadsheet and must perform the running totals and creation of logic solely within the PowerBI platform.

 

CONCLUSION

I’d be absolutely thrilled if anyone could provide an elegant solution (or a basic one for that matter). There may be an easier way than I described, and I am certainly all ears if that is the case. Otherwise, I think I have the baseline logic, just not enough experience to know how to quite execute inside PowerBI. 

Let me know if you need any more information, but I am hoping that excel workbook will suffice. First tab is sample data, second is desired format, third is that screenshot from above. 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

This data is immutable so doesn't require a DAX solution.  Here is the Power Query version.

 

If you want you can highlight any durations that are not 7 hours - those are your anomalies

 

lbendlin_0-1708729979698.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZVNC4JAEED/yuJZ0P108yYSUSQE3RIPEkIeWqM8VL++NW8ROjJzG1x4vN0RXlkGu+7RsH13a95BGHAhldHjwLbuPEwRj4RiOo1jdij8h6JuHdvUfcPWrr+/giqEQMQA4WKgZD+UZ9svgWBNJIWJpDBRFCaKwkRTmGgKE/M1wYkYMhEP4WiI/Wey8DY2lajbJAMEud+E4lktxZ9mKUx4TKEyUtAunMSFZEMcsqLaNex4bfuLP1/ZxGjlh+yUs7wDN2OOAUkGiIH0gAQDxEB6QHIBYiA9ILEAMZAegFSAEBQaM6EAMaY7AURMZmKOMVOJ6gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Badge Id" = _t, Company = _t, #"Date Time" = _t, #"Check-In Gate" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Badge Id", Int64.Type}, {"Company", type text}, {"Date Time", type datetime}, {"Check-In Gate", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Badge Id"}, {{"Rows", each Table.AddIndexColumn(Table.Sort(_,{{"Date Time", Order.Ascending}}), "Index", 0, 1, Int64.Type), type table [Name=nullable text, Badge Id=nullable number, Company=nullable text, Date Time=nullable datetime, #"Check-In Gate"=nullable text, Index= Int64.Type]}}),
    #"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"Name", "Company", "Date Time", "Check-In Gate","Index"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Rows", "Check-In Time", each if Text.EndsWith([#"Check-In Gate"],"Entry") then [Date Time] else null, type datetime),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckOut Gate", (k)=> Table.SelectRows(#"Added Custom", each [Badge Id]=k[Badge Id] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[#"Check-In Gate"], type text),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Checkout Time", (k)=> Table.SelectRows(#"Added Custom", each [Badge Id]=k[Badge Id] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[Date Time], type datetime),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom2", each Text.EndsWith([#"Check-In Gate"], "Entry")),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Badge Id", "Name", "Company", "Check-In Gate", "Check-In Time", "CheckOut Gate", "Checkout Time"}),
    #"Added Custom3" = Table.AddColumn(#"Removed Other Columns", "On-Site", each [Checkout Time]-[#"Check-In Time"],type duration)
in
    #"Added Custom3"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

 

View solution in original post

15 REPLIES 15
lbendlin
Super User
Super User

This data is immutable so doesn't require a DAX solution.  Here is the Power Query version.

 

If you want you can highlight any durations that are not 7 hours - those are your anomalies

 

lbendlin_0-1708729979698.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rZVNC4JAEED/yuJZ0P108yYSUSQE3RIPEkIeWqM8VL++NW8ROjJzG1x4vN0RXlkGu+7RsH13a95BGHAhldHjwLbuPEwRj4RiOo1jdij8h6JuHdvUfcPWrr+/giqEQMQA4WKgZD+UZ9svgWBNJIWJpDBRFCaKwkRTmGgKE/M1wYkYMhEP4WiI/Wey8DY2lajbJAMEud+E4lktxZ9mKUx4TKEyUtAunMSFZEMcsqLaNex4bfuLP1/ZxGjlh+yUs7wDN2OOAUkGiIH0gAQDxEB6QHIBYiA9ILEAMZAegFSAEBQaM6EAMaY7AURMZmKOMVOJ6gM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Badge Id" = _t, Company = _t, #"Date Time" = _t, #"Check-In Gate" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Badge Id", Int64.Type}, {"Company", type text}, {"Date Time", type datetime}, {"Check-In Gate", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Badge Id"}, {{"Rows", each Table.AddIndexColumn(Table.Sort(_,{{"Date Time", Order.Ascending}}), "Index", 0, 1, Int64.Type), type table [Name=nullable text, Badge Id=nullable number, Company=nullable text, Date Time=nullable datetime, #"Check-In Gate"=nullable text, Index= Int64.Type]}}),
    #"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"Name", "Company", "Date Time", "Check-In Gate","Index"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Rows", "Check-In Time", each if Text.EndsWith([#"Check-In Gate"],"Entry") then [Date Time] else null, type datetime),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckOut Gate", (k)=> Table.SelectRows(#"Added Custom", each [Badge Id]=k[Badge Id] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[#"Check-In Gate"], type text),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Checkout Time", (k)=> Table.SelectRows(#"Added Custom", each [Badge Id]=k[Badge Id] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[Date Time], type datetime),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom2", each Text.EndsWith([#"Check-In Gate"], "Entry")),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"Badge Id", "Name", "Company", "Check-In Gate", "Check-In Time", "CheckOut Gate", "Checkout Time"}),
    #"Added Custom3" = Table.AddColumn(#"Removed Other Columns", "On-Site", each [Checkout Time]-[#"Check-In Time"],type duration)
in
    #"Added Custom3"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

 

Wow, thank you @lbendlin !! I am a PowerQuery novice, so I really didn't think to do it this way. Your output looks perfect!

 

However, I am running into an issue when I translate to my own dataset: "Token Identifier Expected". This is something trivial, but I don't know how to solve. 

 

djkoenig_0-1708735041523.png

But maybe instead of worrying about that (although I would like to know what is happening there), could we attempt to replicate it on the source table? I'm curious at how you m-code handles a much larger data volume. I've removed personal data and the link is below:

https://drive.google.com/file/d/1nwt9FNH22rivTdBydxJ1vbx952RzreQd/view?usp=drive_link

I really do appreciate any further guidance!

 

 

Should all the other readers be ignored?

 

lbendlin_0-1708737308108.png

 

Yes, can remove everything but Main Gate Exit and Main Gate Entry. 

Do I need to worry about the type column?

 

lbendlin_0-1708738452120.png

 

Not really. Let's set it to be just access granted entries. This should be ~98% of the data.

I don't think we need to sequence someone trying to badge in 3-4 times in a row in a 2 min interval. In theory, they should eventually check-out/in and show up as access granted. 

Try this - might be some column spelling issues.

 

let
    Source = Odbc.DataSource("dsn=PostgreSQL30", [HierarchicalNavigation=true]),
    s2logactivity_View = Source{[Name="s2config",Kind="Database"]}[Data]{[Name="report",Kind="Schema"]}[Data]{[Name="s2logactivity",Kind="View"]}[Data],
    Filtered = Table.SelectRows(s2logactivity_View, each [type]="Access Granted" and List.Contains({"Main Gate Entry","Main Gate Exit"},[portalname])),
    #"Grouped Rows" = Table.Group(Filtered, {"personkey"}, {{"Rows", each Table.AddIndexColumn(Table.Sort(_,{{"dttm", Order.Ascending}}), "Index", 0, 1, Int64.Type), type table [personkey=nullable number, dttm=nullable datetime, portalname=nullable text, Index= Int64.Type]}}),
    #"Expanded Rows" = Table.ExpandTableColumn(#"Grouped Rows", "Rows", {"dttm", "portalname","Index"},{"Date Time", "Check-In Gate","Index"}),
    #"Added Custom" = Table.AddColumn(#"Expanded Rows", "Check-In Time", each if Text.EndsWith([#"Check-In Gate"],"Entry") then [Date Time] else null, type datetime),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckOut Gate", (k)=> Table.SelectRows(#"Added Custom", each [personkey]=k[personkey] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[#"Check-In Gate"], type text),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Checkout Time", (k)=> Table.SelectRows(#"Added Custom", each [personkey]=k[personkey] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[Date Time], type datetime),
    #"Filtered Rows" = Table.SelectRows(#"Added Custom2", each Text.EndsWith([#"Check-In Gate"], "Entry")),
    #"Removed Other Columns" = Table.SelectColumns(#"Filtered Rows",{"personkey", "Check-In Gate", "Check-In Time", "CheckOut Gate", "Checkout Time"}),
    #"Added Custom3" = Table.AddColumn(#"Removed Other Columns", "On-Site", each [Checkout Time]-[#"Check-In Time"],type duration)
in
    #"Added Custom3"

Thank you for this @lbendlin . The preview looks great and I appreciate the warning on the spelling. That wasn't an issue, but Power Query appears to be case sensitive. I needed to switch "Access Granted" to "Access granted" on the initial filter line, but then the data populated.

The problem is that too much data populated and I'm a bit confused at what the query is pulling. It's past a billion lines and still counting. The total table count is: djkoenig_2-1708799289331.png

Loaded record count is: 

djkoenig_1-1708799203574.png

I can't finish the load currently, nor am I sure that it is going to finish. It's been a few hours. Do you know what might be causing this to iterate so many times?

Replace the  #"Added Custom"  with the below to add a Table.Buffer statement so you don't have to pull that table for each row of the next step.

 

 #"Added Custom" = Table.Buffer(Table.AddColumn(#"Expanded Rows", "Check-In Time", each if Text.EndsWith([#"Check-In Gate"],"Entry") then [Date Time] else null, type datetime)),

Ohhhhh, that makes a lot of sense! I need to be connected to my work's internet to pull the data, and I'm not going back this weekend (again). 

 

I'll let you know on Monday how this turns out, but I think you've nailed it!!

Huge thanks @lbendlin !! That is a beautiful looking table. The table.buffer really cut down on the load time as well. 

 

I'm trying to not to abuse you here, so last question from this Power Query newbie. Is there any way you could talk me through the "Added Custom Rows1" and "AddedCustomRows2" portion? I know it mirrors the Excel logic, but I'd value a technical explanation straight from the Super User. I'm clearly a little hazy on the syntax, and as a result I'm getting lost on the introduction of 'k'. I guess maybe I'm thinking in terms of DAX, but I don't see 'k' defined anywhere and then it just starts popping up. 

djkoenig_0-1708986028836.png

djkoenig_1-1708986077065.pngAlso, marking the original answer as a solution and singing your praises!!

    #"Added Custom1" = Table.AddColumn(#"Added Custom", "CheckOut Gate", (k)=> Table.SelectRows(#"Added Custom", each [personkey]=k[personkey] and [Index]>k[Index] and Text.EndsWith([#"Check-In Gate"],"Exit")){0}[#"Check-In Gate"], type text),

 

Usually when you add a new column or do some other computation row by row you will see that Power Query inserts an "each"  keyword to indicate that.  Most of the time that is sufficient, and you can reference to the "current"  row via the "_"  shortcut. 

 

But if you need to compare two rows against each other (sorry for the pun), Power Query gets confused as to which one is which. so instead of "each"  you now have to use a function that explicitly carries the row context (plus whatever other parameters you may need). 

 

The "k"  is that parameter, indicating "current row of the outer context". Why "k"?  Frankly I have no idea. It's what I used when I started learning Power Query, and as a creature of habit I keep using it ever since.  You can use whatever parameter name you want.

 

You will notice another "each", this time inside the Table.SelectRows.  That now points to the inner context and allows me to compare columns from both tables against each other.  So

 

each [personKey]=k[personKey] 

 

refers to the inner table on the left and the outer table on the right.

If all your comparisons were for "equal"  you could have used nested joins.  But since your conditions are different this Table.AddColumns with custom column generator function (that's the official term)  is the best way to do this.

I feel as if I've learned more about Power Query from this one exercise, then all the rest of the time I've spent with it haha. Much appreciation!

 

You did this in like 8 lines, so I can't imagine this is a terribly difficult solution. Is there a good place to start learning some more of the nuances of PQ? Any great websites, books, etc that you've come across? 

 

Also, if the need arises, I get the sneaking suspicision that I will also use k. So, now there will be at least two of us 🙂

There is only one thing you need to master Power Query. Read through Ben Gribaudo's Primer.

 

https://bengribaudo.com/blog/2017/11/17/4107/power-query-m-primer-part1-introduction-simple-expressi...

I was hoping you would say something like that and that I would enjoy the writing style. Check and check! I don't even need to buy anything... legends the both of ya. 

Thanks for everything! 

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.