Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
BuzzyBeeDee
Frequent Visitor

Find records within 6 month intervals of first row in group

Hi there,

I have a dataset of patients discharged from hospital. This is used to generate a list of patient experience surveys to be sent out. HOWEVER, we only want to send out a survey for the same patient if it's been more than 6 months since they were last surveyed. I have been trying for days to figure out how to achieve this in power query M (since I'm constructing this in a dataflow), but I'm all out of ideas. I can generate a 'ReSurveyDate' column easily in Excel, just can't work out how to get the equivalent in Power Query M. Sample data would be something like:

 

IDRecordDateReSurveyDate
101/01/202401/07/2024
103/02/202401/07/2024
104/05/202401/07/2024
107/09/202407/03/2025
110/10/202407/03/2025
121/04/202521/10/2025
118/06/202521/10/2025
120/07/202521/10/2025
130/10/202630/04/2027
207/02/202407/08/2024
212/04/202407/08/2024
216/09/202416/03/2025
305/01/202405/07/2024
318/04/202405/07/2024
301/05/202405/07/2024
401/03/202401/09/2024
418/05/202401/09/2024
511/05/202411/11/2024


In Excel, I can simply set the ReSurveyDate for the first record for each ID group as the RecordDate+6 months, then each subsequent one uses the formula =IF(RecordDate<Previous ReSurveyDate, ReSurveyDate,RecordDate+6 months)

Once I have the ReSurveyDate generated I can then easily just select the first record within each group of ID+ReSurveyDate. 

Any help on how to generate the ReSurveyDate as above would be greatly appreciated!!!

EDIT: Sorry, I should have mentioned that the data is currently sitting in several .csv files (approx 3 months data in each file) that get imported and combined in PowerBI into a single file. So there's no 'ReSurveyDate' field in the .csv data, it needs to be generated within PowerBI. I've just included that as an example of the expected output for that field 🙂

1 ACCEPTED SOLUTION
aduguid
Super User
Super User

Try this in Power Query 

let
    // Load your data source
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

    // Convert RecordDate and ReSurveyDate columns to Date type
    ChangedType = Table.TransformColumnTypes(Source,{{"RecordDate", type date}, {"ReSurveyDate", type date}}),

    // Sort the data by ID and RecordDate
    SortedRows = Table.Sort(ChangedType,{{"ID", Order.Ascending}, {"RecordDate", Order.Ascending}}),

    // Add a new column for ReSurveyDate calculation
    AddReSurveyDate = Table.AddColumn(SortedRows, "ReSurveyDate", each
        let
            currentId = [ID],
            currentRecordDate = [RecordDate],
            previousRows = Table.SelectRows(SortedRows, each [ID] = currentId and [RecordDate] < currentRecordDate),
            maxPreviousReSurveyDate = if Table.IsEmpty(previousRows) then null else List.Max(previousRows[ReSurveyDate])
        in
            if maxPreviousReSurveyDate = null or currentRecordDate > maxPreviousReSurveyDate then
                Date.AddMonths(currentRecordDate, 6)
            else
                maxPreviousReSurveyDate
    ),

    // Remove the old ReSurveyDate column if you want to replace it with the new one
    RemoveOldColumn = Table.RemoveColumns(AddReSurveyDate,{"ReSurveyDate"}),

    // Rename the new ReSurveyDate column
    RenameColumn = Table.RenameColumns(RemoveOldColumn,{{"ReSurveyDate", "ReSurveyDate"}}),

    // Group by ID and ReSurveyDate and keep the first row in each group
    GroupedRows = Table.Group(RenameColumn, {"ID", "ReSurveyDate"}, {{"FirstRecord", each Table.FirstN(_, 1), type table [ID=Int64.Type, RecordDate=Date.Type, ReSurveyDate=Date.Type]}}),

    // Expand the grouped table to get the final result
    ExpandedRows = Table.ExpandTableColumn(GroupedRows, "FirstRecord", {"ID", "RecordDate", "ReSurveyDate"})
in
    ExpandedRows

View solution in original post

3 REPLIES 3
aduguid
Super User
Super User

Try this in Power Query 

let
    // Load your data source
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],

    // Convert RecordDate and ReSurveyDate columns to Date type
    ChangedType = Table.TransformColumnTypes(Source,{{"RecordDate", type date}, {"ReSurveyDate", type date}}),

    // Sort the data by ID and RecordDate
    SortedRows = Table.Sort(ChangedType,{{"ID", Order.Ascending}, {"RecordDate", Order.Ascending}}),

    // Add a new column for ReSurveyDate calculation
    AddReSurveyDate = Table.AddColumn(SortedRows, "ReSurveyDate", each
        let
            currentId = [ID],
            currentRecordDate = [RecordDate],
            previousRows = Table.SelectRows(SortedRows, each [ID] = currentId and [RecordDate] < currentRecordDate),
            maxPreviousReSurveyDate = if Table.IsEmpty(previousRows) then null else List.Max(previousRows[ReSurveyDate])
        in
            if maxPreviousReSurveyDate = null or currentRecordDate > maxPreviousReSurveyDate then
                Date.AddMonths(currentRecordDate, 6)
            else
                maxPreviousReSurveyDate
    ),

    // Remove the old ReSurveyDate column if you want to replace it with the new one
    RemoveOldColumn = Table.RemoveColumns(AddReSurveyDate,{"ReSurveyDate"}),

    // Rename the new ReSurveyDate column
    RenameColumn = Table.RenameColumns(RemoveOldColumn,{{"ReSurveyDate", "ReSurveyDate"}}),

    // Group by ID and ReSurveyDate and keep the first row in each group
    GroupedRows = Table.Group(RenameColumn, {"ID", "ReSurveyDate"}, {{"FirstRecord", each Table.FirstN(_, 1), type table [ID=Int64.Type, RecordDate=Date.Type, ReSurveyDate=Date.Type]}}),

    // Expand the grouped table to get the final result
    ExpandedRows = Table.ExpandTableColumn(GroupedRows, "FirstRecord", {"ID", "RecordDate", "ReSurveyDate"})
in
    ExpandedRows

Argh, I just realised this is not working because I think it was assumed I already had a 'ReSurveyDate' column? As an example, here's one set of records, where you can see row index 7 is incorrect. The ReSurveyDate should only be updated if it's been 6 months+ since the previous ReSurveyDate. But since the code is used to generate the ReSurveyDate in the first place, I don't know how I would get it to refer to the previous result (rather than the 'first resurvey' field, which just applies to all of the patient's records).

 

MRNDisch DateFirst resurveyIndexWhat I should getWhat I do get
110/01/202410/07/2024110/07/202410/07/2024
124/02/202410/07/2024210/07/202410/07/2024
121/03/202410/07/2024310/07/202410/07/2024
128/03/202410/07/2024410/07/202410/07/202
109/06/202410/07/2024510/07/202410/07/2024
118/07/202410/07/2024618/01/202518/01/2025
126/08/202410/07/2024718/01/202526/02/2025



The code I put in was:

#"IDENTIFY RESURVEY DATES" = Table.AddColumn(
    Buff,
    "ReSurveyDate",
    each
      let
        currentMRN = [MRN],
        currentDD = [#"Disch Date"],
        previousRows = Table.SelectRows(
          Buff,
          each [MRN] = currentMRN and [#"Disch Date"] < currentDD
        ),
        maxPreviousReSurveyDate =
          if Table.IsEmpty(previousRows) then
            null
          else
            List.Max(previousRows[First resurvey])
      in
        if maxPreviousReSurveyDate = null or currentDD > maxPreviousReSurveyDate then
          Date.AddMonths(currentDD, 6)
        else
          maxPreviousReSurveyDate
  )

Any help would be most gratefully appreciated!!

I cannot thankyou enough for this! You have absolutely saved my sanity! 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.