Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi,
I would like to calculate the distriution of the number of occurences of a field. More specifically, I want to see the distribution of the number of applications customers have in a particular time period.
For example, I can easily for the first column (customer number) determine how many applications they made:
However, I want to be able to create a graph / summary table that would tell me:
Number of applications | Number of customers (having this number of applications) |
1 | 11 |
2 | 1 |
How would I go about doing so with DAX (and without creating a table that would not filter according to filters on a page)?
Solved! Go to Solution.
To make this work, I believe you'll need to create a disconnected dimension table containing all possible "count" values.
Then create a Frequency measure to count the number of CUSTOMER_NUMBERs with a given count.
See for example this pbix using your sample data
I created a table called 'Count' and a measure Frequency
Count = VAR MaxCount = MAXX ( VALUES ( YourTable[CUSTOMER_NUMBER] ), CALCULATE ( COUNTROWS ( YourTable ) ) ) RETURN SELECTCOLUMNS ( GENERATESERIES ( 0, MaxCount ), "Count", [Value] )
Frequency = SUMX ( 'Count', COUNTROWS ( FILTER ( VALUES ( YourTable[CUSTOMER_NUMBER] ), CALCULATE ( COUNTROWS ( YourTable ) ) = 'Count'[Count] ) ) )
The Frequency measure uses SUMX to iterate over the 'Count' table so it can aggregate over multiple Count values.
You can then create visuals showing Frequency filtered by 'Count'[Count]
Regards,
Owen
1. Create a measure Distribution = COUNT(yourtablename[Customer])
2. Now create a table visual with
YourTable[Count] & Distribution as values.
For YourTable[Count] value set to do not summarize.
CheenuSing
Hi,
Thanks for the response.
I don't exactly understand how to do the second step. Do I have to use the SUMMARIZE statement? But in that case I cannot use a calculated field as the second parameter. Could you perhaps provide an example?
Thanks
Sorry, maybe I understand your solution incorrectly, but let me explain with an example:
If I have the following data:
ACCOUNT_NUMBER | CUSTOMER_NUMBER |
1 | A |
2 | B |
3 | C |
4 | D |
5 | E |
6 | E |
7 | F |
The creating a measure - COUNT([CUSTOMER_NUMBER]) and creating a visual table based on that, would give me:
CUSTOMER_NUMBER | COUNT(Customer_number) |
A | 1 |
B | 1 |
C | 1 |
D | 1 |
E | 2 |
F | 1 |
However, what I want, requires another step, because what I now want is this:
COUNT(Customer_number) | COUNT(COUNT(CUSTOMER_NUMBER)) |
1 | 5 |
2 | 1 |
To make this work, I believe you'll need to create a disconnected dimension table containing all possible "count" values.
Then create a Frequency measure to count the number of CUSTOMER_NUMBERs with a given count.
See for example this pbix using your sample data
I created a table called 'Count' and a measure Frequency
Count = VAR MaxCount = MAXX ( VALUES ( YourTable[CUSTOMER_NUMBER] ), CALCULATE ( COUNTROWS ( YourTable ) ) ) RETURN SELECTCOLUMNS ( GENERATESERIES ( 0, MaxCount ), "Count", [Value] )
Frequency = SUMX ( 'Count', COUNTROWS ( FILTER ( VALUES ( YourTable[CUSTOMER_NUMBER] ), CALCULATE ( COUNTROWS ( YourTable ) ) = 'Count'[Count] ) ) )
The Frequency measure uses SUMX to iterate over the 'Count' table so it can aggregate over multiple Count values.
You can then create visuals showing Frequency filtered by 'Count'[Count]
Regards,
Owen
Thank you very much!
This works well, especially for showing the distribution over the whole range of possible counts - which can be beneficial.
(I also got it to work with a COUNT and SUMMARIZE statement, together with a bi-directional join between the 2 tables, in which case it only keeps the range of counts relevant for the selection).