March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello, everyone!
Having used a different BI tool for many years, our team is making the move to Power BI. I'm steadily getting used with it, but I've hit a bit of a stumbling block. I'm sure it's something very simple, but after a search of the forum I can't quite find a question and answer that match my issue.
I have a dataset (pulled from a SQL Server using DirectQuery) of record-level patient attendance data, a date table (created using PowerQuery M) and a couple of lookup tables (.csvs stored in SharePoint). My only relationship is a many to one from a date field in my dataset to my date table.
Now, I might be going about this all wrong (and please tell me, if so) but I've created a number of measures to match-up to our KPIs using CALCULATE. For example:
Indicator 1.1 = CALCULATE(
DISTINCTCOUNT( 'table'[ID] ),
LEFT('table'[ID],2) = "SH",
'table'[WaitingDays] <=2
)
/
CALCULATE(
DISTINCTCOUNT( 'table'[ID] ),
LEFT('table'[ID],2) = "SH"
)
The above indicator should give us the percentage of distinct patients whose IDs begin with "SH" who waited 2 days or less for their appointment. Seems to work fine.
The issue I'm having is for some indicators we need to count non-distinct IDs - like when a patient appears two or three times in a period. However, it's complicated by each ID having multiple records per date which capture other fields (it's how the client's recording system is setup, so I don't think we can escape it). This throws up an issue. For example:
Indicator 10.2 = CALCULATE(
COUNT( 'table'[ID] ),
'table'[Outcome] IN {"New","Repeat"}
)
This measure counts way too many patients, as they appear more than once each date.
What I need is a way to count each ID only once per date they appear in the period.
I might be approaching this totally wrong (again, although I'm not new to BI, I'm very new to PBI) so happy to change the approach if easier/more logical. Here is a mock of the data and problem:
ID | Date | Outcome |
001 | 01/01/2024 | New |
001 | 01/01/2024 | New |
002 | 01/01/2024 | New |
002 | 01/01/2024 | New |
002 | 01/01/2024 | New |
003 | 02/01/2024 | New |
001 | 04/01/2024 | Repeat |
001 | 04/01/2024 | Repeat |
003 | 08/01/2024 | Repeat |
004 | 08/01/2024 | New |
004 | 08/01/2024 | New |
001 | 08/01/2024 | Follow-up |
001 | 08/01/2024 | Follow-up |
What I'd expect to see is a count of 6. As is, I'm getting every row other than the "Follow-up" outcomes.
Thanks in advance!
Solved! Go to Solution.
Hello! I entered the sample data as below:
I created the following DAX:
Proud to be a Super User! | |
The quickest way to do this is to simply replace COUNT with DISTINCTCOUNT(), i.e
Indicator 10.2 = CALCULATE(
DISTINCTCOUNT( 'table'[ID] ),
'table'[Outcome] IN {"New","Repeat"}
)
But the issue with this way is that the totals (if you don't have the dates in the visual) will be incorrect, as it's counting the distinct new / repeat customers over all days
To get around this, i would recommend creating a new table just so that the totals are correct.
NEW Indicator 10.2 =
var __newTable = ADDCOLUMNS(
VALUES('Table'[Date]),
"distinct customer count",
CALCULATE(
DISTINCTCOUNT( 'table'[ID] ),
'table'[Outcome] IN {"New","Repeat"}
)
)
return SUMX(__newTable, [distinct customer count])
Hi, Vicky.
First, thanks for your time.
Yes, as you say DISTINCTCOUNT is no good in this instance as the date field will not be used in the visual.
My only concern with this is whether adding new tables impacts performance. I've been impressed with the speed of reports thus far - but with *lots* of KPIs to calculate, if I have to make a new table for each one not only might it become messy quickly but will it also impact report speed?
Hello! I entered the sample data as below:
I created the following DAX:
Proud to be a Super User! | |
Hi, Audrey.
Thanks for your time to respond!
I think this may work. My only "issue" is that it's throwing an error to say the date column used in SUMMARIZE doesn't appear in the table. Am I correct in thinking that I can simply switch out the date column from my date table to the original date column from my dataset and it won't affect time intelligence?
For this, it's not really doing any time intelligence (MTD, YTD, YoY, MoM, etc.), it's really just doing the following:
- FILTER is used to filter the table to include only rows where the [Outcome] is "New" or "Repeat".
- SUMMARIZE then creates a summary table with unique combinations of [ID] and [Date].
- COUNTROWS counts the number of unique combinations meeting the criteria of "New" or "Repeat" outcomes
In this case you would want the summarization to have the date field from your fact table because that is what summarize is doing, it's making another temporary table in the background with each unique combination of ID and Date if the Outcome is New or Repeat, then it is adding them together.
If you were doing any time intelligence to compare to another time period, in that meausre, you would reference the Date from your date table.
Proud to be a Super User! | |
That's perfect; with such a good explanation, I should be able to adapt it as neccesary for each KPI measure myself.
Thanks so much for taking the time to help - it's been a big change getting to grips with PBI, but I'm very inspired by the possibilities and community that enables it!
Best wishes.
You are very welcome! Power BI is a great tool. It takes, some time but there are so many great references. Good luck on your journey!
Proud to be a Super User! | |
Very sorry to bother you, Audrey.
All was going well until I got to a specific indicator:
Indicator 13.1 =
COUNTROWS(
SUMMARIZE(
FILTER('table',
'table'[Outcome] <> "NEW" || 'table'[Outcome] <> "REVIEW" || 'table'[Age] < 19
),
'table'[ID], 'table'[Date]
)
)
The [Age] column is another measure, like so:
Age = ROUNDDOWN(DATEDIFF('table'[BirthDate], TODAY(), DAY) / 365.25, 0)
For some reason, the indicator is counting everybody, rather than just patients who do not have those outcomes and are under 19 years of age. If I remove the age part of the filter, there is no change in the end count. Am I missing something really simple?
Thanks in advance.
Hi! Can you add a column for age in Power Query? That should fix the issue.
Proud to be a Super User! | |
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
126 | |
85 | |
69 | |
53 | |
44 |
User | Count |
---|---|
202 | |
106 | |
100 | |
64 | |
56 |