Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hi,
I have a dataset of several thousand companies that have memberships in work groups. There are 16 different work groups (A-P). Companies can be member of one or several groups. Most are members in one or two groups. I am trying to figure out the most common combinations of work groups. I think this would be called a co-occurrence matrix.
My original dataset looks like this:
Work groups | A | B | C | D |
Company 1 | 1 | 0 | 0 | 0 |
Company 2 | 0 | 1 | 1 | 0 |
Company 3 | 1 | 1 | 1 | 0 |
Company 4 | 1 | 0 | 0 | 1 |
… |
|
|
|
|
Company 1 is member of work group A only. Company 3 has membership to work groups A, B and C.
My expected output would be:
| A | B | C | D |
A | 3 | 0 | 1 | 1 |
B | 0 | 2 | 2 | 0 |
C | 1 | 2 | 2 | 0 |
D | 1 | 0 | 0 | 1 |
… |
|
|
|
|
The diagonal shows the total membership in a given work group, all other field show the “overlap” as in how many companies are in both those work groups. It doesn’t matter to me whether the output is in a full matrix (where all combinations appear twice – once above and once below the diagonal) or a triangle shape.
My approach (that doesn’t work 😞 ) was to create three tables:
table1: source
(this is the unpivoted original dataset with an extra copy of the Workgroup column)
Company | Workgroup | Workgroup copy | Value |
Company1 | A | A | 1 |
Company1 | B | B | 0 |
Company1 | C | C | 0 |
Company1 | D | D | 0 |
Company2 | A | A | 0 |
Company2 | B | B | 1 |
Company2 | C | C | 1 |
Company2 | D | D | 0 |
… |
|
|
|
table 2: Workgrouplist1
Workgroups |
A |
B |
C |
D |
… |
table 3: Workgrouplist2
Workgroups |
A |
B |
C |
D |
… |
There are no relationships between these three tables.
And here is the DAX code of my measure:
countifs =
VAR CurrentX = SELECTEDVALUE('Workgrouplist2'[Workgroups])
VAR CurrentY = SELECTEDVALUE('Workgrouplist1'[Workgroups])
VAR Cooccurrence =
CALCULATE (
COUNTROWS('source'),
FILTER(
'source',
'source'[Workgroup] = CurrentX
&&
'source'[Workgroup copy] = CurrentY
&& 'source'[Value] = 1
)
)
RETURN
Cooccurrence
The actual output is
| A | B | C | D |
A | 3 |
|
|
|
B |
| 2 |
|
|
C |
|
| 2 |
|
D |
|
|
| 1 |
What is my mistake?
Hi @v-yaningy-msft,
Thank you for asking. The numbers represent the number of companies that participate in the respective two workgroups. No company is a member of both A and B, one company is in both A and C, two are in both B and C etc. The two red triangles show in fact the same numbers. So a way to display only one of these triangles would be perfect (look better than two that replicate the same information).
Thank you!! Best, Suus
Hi, @Suus
Didn't get the logic of the implementation of the numbers in the red box, can you explain it in more detail
Best Regards,
Yang
Community Support Team
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
102 | |
75 | |
44 | |
39 | |
32 |
User | Count |
---|---|
163 | |
90 | |
66 | |
46 | |
43 |