Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more
Dear Forum Users,
I am work on a dataset of child maltreatment records to calculate/visualize maltreatment recurrence counts/percentage, etc. A sample data file is like this:
ChildID | CaseID | ReportDate | Maltreat Type | Confirmed | gender |
11 | C01 | 1/2/2021 | physical abuse | yes | M |
11 | C02 | 4/2/2021 | physical abuse | no | M |
11 | C03 | 5/12/2022 | neglect | yes | M |
12 | C01 | 1/2/2021 | neglect | no | F |
12 | C02 | 4/2/2021 | physical abuse | yes | F |
12 | C03 | 5/12/2022 | physical abuse | yes | F |
12 | C04 | 3/12/2023 | sexual abuse | no | F |
13 | C05 | 11/2/2021 | emotional abuse | yes | M |
13 | C06 | 1/20/2022 | emotional abuse | yes | M |
The data table has ChildID, each ChildID can have several maltreatment records, each with a separate ReportDate. Each report of maltreatment is either Confirmed "yes" or "no" Confirmed. There are other related columns like child Gender and Maltreatment Type (and many other columns in the raw data file).
The final goal of the project is (1) Not only to calculate frequencty of all maltreatment recurrence, (2) but also only the Confirmed maltreat recurrence, and (3) the confirmed recurrence within 1 year (365 days). Based on previous posts on creating lagged values, the case of (1) should be easy to handle in Power Query. I can sort the query table by ChildID and ReportDate (in ascending order), create two index columns (0 to N, and 1 to N+1), and then merge the table to itself with the two index columns as keys. Then the merged table will have the next_record's report date and its confirmed status (yes, no). I can also calculate the day differences to the next report, and create a column for 365 days or not. All these can be done in Power Query.
However, this method won't work for tasks (2) / (3). Not all next records are confirmed. If the immediately next record is not confirmed, we need to look further down to find if the further next record is confirmed, and if so, we need to fetch its date (and other info) to the present record. I have not found any solution to this issue in Power Query. Besides, my original data table is very large (nearly a million records with nearly 100 columns), and sorting such a large file takes a long time. Besides, the table needs buffer for the sorting order to stay.
Fortunately I found a previous post in this forum which uses "Earlier" function for a similar problem. I have used the "Earlier" Function to create calculated columns for both "nex report date" and "next confirmed date". I can also calculate the column 'days to next confirmed date" and a column to show if the days are with 365 days. The formulaes for these columns are shown below:
nxt_confirm_dt = MINX( FILTER(maltreat_data, [ChildID]=EARLIER(maltreat_data[ChildID]) && [ReportDate] > EARLIER(maltreat_data[ReportDate]) && maltreat_data[Confirmed] = "yes" ), [ReportDate])
days_nxtconfirmdt = DATEDIFF(maltreat_data[ReportDate],maltreat_data[nxt_confirm_dt],DAY)
days_nxtconfirmLT365 = IF (maltreat_data[days_nxtconfirmdt] <365 && NOT(ISBLANK(maltreat_data[days_nxtconfirmdt])), 1, 0)
Then I created a Dax measure to sum the number of maltreatment recurrence.
AMeasure_Confirm_lt1yr = SUMX(maltreat_data,maltreat_data[days_nxtconfirmLT365])
I created several different visuals using the Dax Measure to display the counts of confirmed maltreatment recurrence for Gender, Report Year, and current records Conformed status (yes / no).
I also used a slicer for report Year, and the results turned out to be all correct. I have thought using the Year filter/slicer would change the current context and mess up the calculated columns with Earlier function (since it uses the whole ChildID group regardless of report year/dates). However, the columns are not affected by the slicer/filer.
Here is an screenshot of the Power BI visuals:
I am sharing this to have your suggestions on (1) whether this can be done in Power Query, and (2) whether there are any potential pitfalls such as using filters messing up the calculation. And if possible, I wonder whether you can help create a single Dax Measure that will create the next confirmed dates, calculate the days, and get the final summary results of confirmed maltreatment recurrence within 1 Year. So I don't have to create the separate calculated columns.
Thanks.
Solved! Go to Solution.
You use variables instead of EARLIER.
Is this what you are looking for?
Thanks, @lbendlin . As I am counting the maltreatment recurrence within 365 days, I need to calculate the next confirmed report date instead of the "Previous Report Date" as in your measure. So I updated your measure accordingly, as follows:
AMaltreatRecurrence Count =
IF(
COUNTROWS(maltreat_data) > 1,
VAR a = ADDCOLUMNS(
maltreat_data,
"Next Report Date",
VAR c = [ChildID]
VAR r = [ReportDate]
CALCULATE(
MIN(maltreat_data[ReportDate]),
FILTER(
ALL(maltreat_data),
maltreat_data[ChildID] = c &&
maltreat_data[Confirmed] = "yes" &&
maltreat_data[ReportDate] > r)
)
)
)
VAR b = FILTER(
a,
NOT ISBLANK([Next Report Date]) &&
DATEDIFF(maltreat_data[ReportDate], [Next Report Date], DAY) < 365
)
RETURN
COUNTROWS(b)
With this new measure I got the same results as the Earlier methods.
Interestingly when I checked the codes with ChatGPT, it adviced I use Earlier instead of the creating immediate Variables. Its reasoning is:
Your DAX formula has the right structure and approach, but it still contains a minor issue with how variables (c and r) are used. These variables are being defined within the ADDCOLUMNS calculation but are not effectively scoped for use within CALCULATE. Specifically:
AMaltreatRecurrence Count =
IF(
COUNTROWS(maltreat_data) > 1,
VAR a = ADDCOLUMNS(
maltreat_data,
"Next Report Date",
CALCULATE(
MIN(maltreat_data[ReportDate]),
FILTER(
ALL(maltreat_data),
maltreat_data[ChildID] = EARLIER(maltreat_data[ChildID]) &&
maltreat_data[Confirmed] = "yes" &&
maltreat_data[ReportDate] > EARLIER(maltreat_data[ReportDate])
)
)
)
VAR b = FILTER(
a,
NOT ISBLANK([Next Report Date]) &&
DATEDIFF(maltreat_data[ReportDate], [Next Report Date], DAY) < 365
)
RETURN
COUNTROWS(b)
)
I don't know whether its reasoning hold water. But anyway the results with both get the same results. There is a total count of 8 maltreatment recurrences.
Thanks for your help.
Your solution is so great lbendlin
Hi, @Anonymous
I wish you all the best. Previously Super user have provided a solution to help you solve the problem. Since we haven't heard back from you yet, I'd like to confirm if you've successfully resolved this issue or if you need further help?
If yes, you are welcome to share your workaround and mark it as a solution so that other users can benefit as well. If you find a reply particularly helpful to you, you can also mark it as a solution.
If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.
Best Regards
Jianpeng Li
Problem solved. Thanks.
Hi,
In the PBI file that you have shared, show the expected result very clearly.
EARLIER means "outer context", it is not necessarily something to do with dates. Nowadays we recommend using variables instead of EARLIER.
You mention performance concerns. However your processing seems to focus around each individual ChildID so you shouldn't have that issue as much.
Measures must be used when the result of the calculation can be impacted by users interacting with the report. For immutable data points you can use calculated columns, or you can consider pushing this up into Power Query (although that may be quite a bit slower).
Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information. Do not include anything that is unrelated to the issue or question.
Please show the expected outcome based on the sample data you provided.
Need help uploading data? https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...
Hi @lbendlin , The forum does not allow me to attach files. But here is link to the sample data I used to create the dashboard.
https://docs.google.com/spreadsheets/d/1YlQK-H1Z9uiF7ARF9LsQAadX6VGriqph/edit?usp=sharing&ouid=10420...
The major columns are just 3: Child ID, Maltreat Report Date, and Confirmation Status. In my original data file, the data is not sorted in any clear order (probably by case ID, child Id, report dates). A child ID can have multiple reports (with different report dates) and each report has a maltreatmet confirmation status (Yes or No).
The task is to find the number of maltreatment recurrence within 1 year. That is whether a Child ID has a subsequent Confirmed (value=yes) maltreatment report with 365 days from the current report date. So I need to get the next Confimred report date for each report record if it exists. And calculate the lapse of days between the current report date and the next report date and see whether it is within 365 days. Then I can create a summary Dax measures for the number of maltreat recurrences, and also visualize the data by gender, year and other features.
Here is the link to the Pbix:
https://drive.google.com/file/d/11_a6ERN6Un9lPVwlvRsYRm4rD9jPiU6-/view?usp=sharing
I have read that Earlier is not widely used now. But I have no idea of other methods that can be solve the issue. I think using Earlier can apply to my probem. I am open to any other suggestions you can provide. Thanks.
Thanks, @lbendlin . As I am counting the maltreatment recurrence within 365 days, I need to calculate the next confirmed report date instead of the "Previous Report Date" as in your measure. So I updated your measure accordingly, as follows:
AMaltreatRecurrence Count =
IF(
COUNTROWS(maltreat_data) > 1,
VAR a = ADDCOLUMNS(
maltreat_data,
"Next Report Date",
VAR c = [ChildID]
VAR r = [ReportDate]
CALCULATE(
MIN(maltreat_data[ReportDate]),
FILTER(
ALL(maltreat_data),
maltreat_data[ChildID] = c &&
maltreat_data[Confirmed] = "yes" &&
maltreat_data[ReportDate] > r)
)
)
)
VAR b = FILTER(
a,
NOT ISBLANK([Next Report Date]) &&
DATEDIFF(maltreat_data[ReportDate], [Next Report Date], DAY) < 365
)
RETURN
COUNTROWS(b)
With this new measure I got the same results as the Earlier methods.
Interestingly when I checked the codes with ChatGPT, it adviced I use Earlier instead of the creating immediate Variables. Its reasoning is:
Your DAX formula has the right structure and approach, but it still contains a minor issue with how variables (c and r) are used. These variables are being defined within the ADDCOLUMNS calculation but are not effectively scoped for use within CALCULATE. Specifically:
AMaltreatRecurrence Count =
IF(
COUNTROWS(maltreat_data) > 1,
VAR a = ADDCOLUMNS(
maltreat_data,
"Next Report Date",
CALCULATE(
MIN(maltreat_data[ReportDate]),
FILTER(
ALL(maltreat_data),
maltreat_data[ChildID] = EARLIER(maltreat_data[ChildID]) &&
maltreat_data[Confirmed] = "yes" &&
maltreat_data[ReportDate] > EARLIER(maltreat_data[ReportDate])
)
)
)
VAR b = FILTER(
a,
NOT ISBLANK([Next Report Date]) &&
DATEDIFF(maltreat_data[ReportDate], [Next Report Date], DAY) < 365
)
RETURN
COUNTROWS(b)
)
I don't know whether its reasoning hold water. But anyway the results with both get the same results. There is a total count of 8 maltreatment recurrences.
Thanks for your help.
As I am counting the maltreatment recurrence within 365 days, I need to calculate the next confirmed report date instead of the "Previous Report Date" as in your measure.
I would like to respectfully disagree on the methodology. I think comparing to a prior event makes more sense.
Interestingly when I checked the codes with ChatGPT, it adviced I use Earlier instead of the creating immediate Variables
That's funny.
@lbendlinThat is right. The "previos report date" makes perfect sense for counting of recurrence. But the final measure I need to create is the "maltreatment recurrence RATE", which is defined as "among the number of maltreatment reports in a certain year, what percentage has a subsequent confirmed report within 1 year of the report date". So I also need to count the current reports in a year, which is then divided by the number with subsequent reports.
Thanks for your quick response. I think ChatGPT is poorly trained in the Variable method, just like me.
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.