Re: variables and tables or columns

CAPEconsulting · ‎09-03-2017

I have a matrix table with over 500K rows and over 100 columns. A few columns have TRUE FALSE based values

I am trying to use a slicer filter by condition, where if selected then the table gets filtered to rows that have a TRUE or FALSE value for a column based on the slicer value. So the slicer filters columns and gets me the population that for that column are applicable

Then based on the selction for example if MUSCULOSKELETAL is selected then I want to know for persons that are a TRUE for MUSCULOSKELETAL how many have chd, and how many have anxiety etc. The prevalence of chd and anxiety have already been created as measures like CHD Prev = COUNT(Epi[chd active])/COUNT(Epi[ID]) and Anxiety Prev = COUNT(Epi[anxiety active])/COUNT(Epi[ID]) respectively.

What I tried was

COMORBIDITY = IF(ISCROSSFILTERED(Diagnosis[Diagnosis]),

VAR

depression = CALCULATE(COUNTROWS(Epi),Epi[depression active] = "true")

VAR

diabetes = [DIABETES]

VAR

etc. many VAR like the above and then

RETURN

SWITCH(ALLSELECTED(Diagnosis[Diagnosis]), "acute renal failure", acute_renal_failure, "anxiety", anxiety, "asthma", asthma, "bipolar", bipolar, "chd", chd, "chronic renal failure", etc.

While this worked to give me the total number of the selcted slicer it does not filter the prevalence measures - rightly so as the row context and filter context are not being met.

So any suggestions of how I can look at the co-morbidity of selected groups - i.e. use the slicer to filter though the table for a particular condition and for patients of that condition find the prevalence (measures) of other conditions.

AlbertoFerrari · ‎09-05-2017

Your scenario requires a deeper analysis, which I cannot perform using the forum, but at first sight, I would say that you have a wrong model. Well, wrong is probably too bad, let us say "suboptimal". In other words, the code tend to be very complex to write.

Instead of having 100 columns, most of which are empty, I would create 100 rows, removing the empty ones. You will increase the number of rows from 500K to several millions, but the size is still very manageable and the model probably results smaller.

Then, using techniques like the one described here for basket analysis, you can achieve the results you need.

I worked on several healthcare models like yours, and I have always been impressed by the complexity of the queries that you guys generate. On the wrong (suboptimal) model they become a nightmare. Anyway, there's no quick solution, as I said, it would require more dedicated time to produce a good result. Not that I want to sell you my time, it is only that a discussion in the forum is likely not enough for your needs, you will probably need to roll up your sleeves and be prepared to change the model and learn some advanced DAX.

A scenari similar to this can also be found in The Definitive Guide to DAX, on page 392 where we speak about "Frequent itemset search", which is a relaxed version of basket analysis.

Have fun with DAX!

Alberto Ferrari
http://www.sqlbi.com

Alberto Ferrari - SQLBI

CAPEconsulting · ‎09-08-2017

Many thanks for that @AlbertoFerrari

I have heard several of your sessions and will try to get your book. I actually did purchase your initial book "Excel 2013 Building Data Models with PowerPivot". So I am very aware of the amazing work you and Mark do.

I will look into your trainings/mentorship/consultation.

In general what would be your advise as in ideal steps in setting up models for cases like these where the fact table (currently just the one called Epi) is 550k rows x 100 columns - should I split it into 3-4 smaller fact tables like conditions, investigations, intervention dates and patient details all lonked via the same primary key such as patient ID??? And have all smaller filter tables like geography, time, age all linked to all the patient ID column of all these smaller fact tables??? That way I will be able to pivot columns and hence not have as many columns as you have suggested

Currently I have the typical star schema as follows

Abhijeet

PS: Would you revommend a mentoring arrangement with SQLBI or just the gebric trainibgs that you guys have?

AlbertoFerrari · ‎09-11-2017

Hi,

It is very hard to give you any advice without analyzing much deeper your requirements, sorry for that but whatever I tell you would likely be a wrong answer.

As a general idea, for the kind of queries that I think you are trying, I would go much farther in normalizing the model.

Instead of having 100 columns, I think you can end up with a small set of columns (patient, date, doctor, hospital and stuff like this) to identify the event (a visit, perhaps) and then another table where your 100 columns become 100 rows and the column names become a new "Symptom" dimension.

So, something like:

Patient     Date     Symptom     Measure
Alberto     09/11    Headache    YES
Alberto     09/11    Pain             YES
Alberto     09/11    Dead            NO

This lets you compute questions like: give the percentage of patients that had a headache and were not dead, or show me the patients without a headache, with pain and died. or any combination of AND, OR and NOT conditions. The code is not easy to write, but can be very generic and driven by the user through slicers.

Of course, the size of the table increases by a lot, but this is not an issue at all, a few hundreds of millions of rows are not scary at all.

Alberto

Alberto Ferrari - SQLBI

AlbertoFerrari · ‎09-11-2017

Hi,

It is very hard to give you any advice without analyzing much deeper your requirements, sorry for that but whatever I tell you would likely be a wrong answer.

As a general idea, for the kind of queries that I think you are trying, I would go much farther in normalizing the model.

Instead of having 100 columns, I think you can end up with a small set of columns (patient, date, doctor, hospital and stuff like this) to identify the event (a visit, perhaps) and then another table where your 100 columns become 100 rows and the column names become a new "Symptom" dimension.

So, something like:

Patient

Alberto Ferrari - SQLBI

CAPEconsulting · ‎09-05-2017

Any assistance please

@Eric_Zhang, @kolson256, @konstantinos, @Phil_Seamark, @ImkeF, @v-ljerr-msft, @MattAllington, @Sean

Just to give everyone a bit more clarity what I am trying to do is

I have prevalence of conditions in a chart like this

and I am trying to use a slicer to automatically pick up a section of the population as if I was creating a chart like this

and then use a selction of the 2nd chart to find prevalence in the 1st chart like this

BUT INSTAED OF CHART 1 which is based on one column of the table I am trying to use a slicer to automatically pick up the right column. And also the slicer fields to have composite columns too as in if the slicer value is DIABETES then the population selcted is FILTER(Epi, NOT(ISBLANK(Epi[diabetes type1 active])) || NOT(ISBLANK(Epi[diabetes type2 active])) || NOT(ISBLANK(Epi[diabetes undefined active])))

v-chuncz-msft · ‎09-05-2017

@CAPEconsulting,

You may learn to use the SELECTEDVALUE function.

https://www.sqlbi.com/blog/marco/2017/08/01/new-function-and-the-importance-of-variables-in-dax-dax/

Community Support Team _ Sam Zha
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

CAPEconsulting · ‎09-05-2017

@v-chuncz-msft

Not sure how this helps in my model.

Coukd you please elaborate on how you'd expect me to use SELECTEDVALUE for my particular scenario

variables and tables or columns

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly

Party with Power BI’s own Guy in a Cube

variables and tables or columns

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly