Solved: Re: Fact table filtering dim table

AlB · ‎09-03-2023

Hi all,

We have a simple model with one dim table (DimT) and one fact table (FactT) (pbix attached) :

We now create a simple table visual (V1) with DimT[Product] and SUM(FactT[Sales]). The result is as expected, with DimT[Product] filtering FactT:

We create another simple table visual (V2) with DimT[Product] and FactT[Sales] (set to Don't summarize). The result is as expected:

However, if we create a a third visual (V3) by changing the order in V2, i.e. with FactT[Sales] first and then DimT[Product] (set to Don't summarize) the result is this

where Fact[Sales] seems to be filtering DimT[Product]. The default filtering direction however is DimT --> FactT

Question 1: Why is FactT filtering DimT? Shouldn't the result be a full Cartesian product of FactT[Sales] and DimT[Product]?

In a fourth visual (V4), we add to the table in V3 a simple measure:

Product_M = 
SELECTEDVALUE(DimT[Product])

with the result (note not all the results are shown here, see attached pbix)

where we see that the measure is now forcing the Cartesian product we mentioned earlier

Question 2 : Can you explain this behavior?

Thanks

OwenAuger · ‎09-07-2023

@AlB

The difference lies in the DAX queries Power BI generates for the different visuals.

1. When no measure is included in the visual (as in V2/V3), a "hidden measure" is automatically added to the query within SUMMARIZECOLUMNS, in this case with this expression (wrapped in CALCULATE):

COUNTROWS ( 'FactT' )

This ensures that only combinations of Product/Sales values where FacT is nonempty are returned.

The DAX query also includes a check that at least one of Product/Sales is nonblank.

2. When a measure is included the visual (as in V4), the nonblank values of this measure will determine which combinations of Product/Sales values are returned. In the case of [Product_M], this measure is nonblank for all combinations in the Cartesian product of Product/Sales values.

We could summarise this as: If a table visual includes fact/dim columns but no measures, Power BI will automatically filter the combinations of fact/dim columns to those where fact is nonempty.

I extracted the essential parts of the V2/V3 & V4 queries below to illustrate:

-- V2/V3 DAX Query
-- Includes rows where
--   1. CALCULATE ( COUNTROWS ( FactT ) ) is nonblank,
--      which is true when FactT is nonempty when Product/Sales values are applied as filters.
--   2. At least one of DimT[Product] or FactT[Sales] is nonblank
EVALUATE
FILTER (
    SUMMARIZECOLUMNS (
        'DimT'[Product],
        'FactT'[Sales],
        "CountRowsFactT", CALCULATE ( COUNTROWS ( 'FactT' ) )
    ),
    OR ( NOT ( ISBLANK ( 'DimT'[Product] ) ), NOT ( ISBLANK ( 'FactT'[Sales] ) ) )
)

-- V4 DAX Query
-- Includes rows where
--   1. [Product_M] is nonblank.
--      which is true for any combination of DimT[Product] & FactT[Sales]
--      resulting in the Cartesian product.
EVALUATE
SUMMARIZECOLUMNS (
    'DimT'[Product],
    'FactT'[Sales],
    "Product_M", 'FactT'[Product_M]
)

It would be interesting to understand the general rule determining what expressions are automatically added when a visual includes no measures. Presumably tables on the "end" of a chain of 1:many relationships would always be aggregated.

Regards

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

View solution in original post

AlB · ‎09-05-2023

Thanks @jdbuchanan71

What I do not understand is, why does the measure force that behavior (the change from V3 to V4)?

OwenAuger · ‎09-07-2023

@AlB

The difference lies in the DAX queries Power BI generates for the different visuals.

1. When no measure is included in the visual (as in V2/V3), a "hidden measure" is automatically added to the query within SUMMARIZECOLUMNS, in this case with this expression (wrapped in CALCULATE):

COUNTROWS ( 'FactT' )

This ensures that only combinations of Product/Sales values where FacT is nonempty are returned.

The DAX query also includes a check that at least one of Product/Sales is nonblank.

2. When a measure is included the visual (as in V4), the nonblank values of this measure will determine which combinations of Product/Sales values are returned. In the case of [Product_M], this measure is nonblank for all combinations in the Cartesian product of Product/Sales values.

We could summarise this as: If a table visual includes fact/dim columns but no measures, Power BI will automatically filter the combinations of fact/dim columns to those where fact is nonempty.

I extracted the essential parts of the V2/V3 & V4 queries below to illustrate:

-- V2/V3 DAX Query
-- Includes rows where
--   1. CALCULATE ( COUNTROWS ( FactT ) ) is nonblank,
--      which is true when FactT is nonempty when Product/Sales values are applied as filters.
--   2. At least one of DimT[Product] or FactT[Sales] is nonblank
EVALUATE
FILTER (
    SUMMARIZECOLUMNS (
        'DimT'[Product],
        'FactT'[Sales],
        "CountRowsFactT", CALCULATE ( COUNTROWS ( 'FactT' ) )
    ),
    OR ( NOT ( ISBLANK ( 'DimT'[Product] ) ), NOT ( ISBLANK ( 'FactT'[Sales] ) ) )
)

-- V4 DAX Query
-- Includes rows where
--   1. [Product_M] is nonblank.
--      which is true for any combination of DimT[Product] & FactT[Sales]
--      resulting in the Cartesian product.
EVALUATE
SUMMARIZECOLUMNS (
    'DimT'[Product],
    'FactT'[Sales],
    "Product_M", 'FactT'[Product_M]
)

It would be interesting to understand the general rule determining what expressions are automatically added when a visual includes no measures. Presumably tables on the "end" of a chain of 1:many relationships would always be aggregated.

Regards

Owen Auger
Did I answer your question? Mark my post as a solution!
Blog
LinkedIn

AlB · ‎09-10-2023

Thanks very much @OwenAuger for the fantastic explanation.

That makes sense.

I also see that the queries generated by PBI for V2 and V3 are exactly the same, regardless of the order in which DimT[Product] and FactT[Sales] are placed in the table visual. I was always under the impression this order would play a role but apparently I was mistaken.

Thanks

jdbuchanan71 · ‎09-04-2023

v3 should not show the cartesian product. The order the fields are added to the visual has no effect on the filtering because the relationship has not changed.

Adding the measure in v4 forces the model to consider every FactT row that is filtered by SalesT. Since the relationship is set to single direction, there is no filter so it is giving you the answer against every combination.

If you set the relationship to bi-directional the filtering in v4 using the measure can flow up hill.

AmiraBedh · ‎09-03-2023

In a one-to-many relationship between tables in Power BI, the default filtering behavior is from the "one" side (DimT) to the "many" side (FactT). This means that if you filter or slice based on DimT[Product], only the corresponding rows in FactT that have a match with DimT will be shown. The opposite is not true by default: selecting a value from FactT will not filter DimT unless you have enabled bidirectional filtering (which is generally not recommended as it can cause ambiguous results).

When you design visuals in Power BI, the way the data is presented is based on the order in which you add fields to the Values section of a visual. But this doesn't mean that the order of the fields changes the filtering behavior; it's just a display preference. The relationship and the filtering defined between tables are maintained.
- **V1**: Summarized view showing total sales for each product. The filtering behavior here is straightforward and as expected.

- **V2**: This visual lists sales for each product in separate rows. Since you have FactT[Sales] set to "Don't summarize", it lists each sales value on a separate line, but still respects the relationship, showing which product it relates to.

- **V3**: The behavior is still the same, but you've changed the visual order. It doesn't mean FactT is filtering DimT. It's simply presenting sales and then its corresponding product. Power BI still understands the relationship and shows the matching DimT[Product] for each FactT[Sales].

To clarify, you are NOT getting a Cartesian product in V3. If you were getting a Cartesian product, you'd see every combination of DimT[Product] and FactT[Sales], which you aren’t.

`SELECTEDVALUE` is a DAX function that returns the value of the column for the row when there's only one value. If there are multiple or no values, it returns a blank. When you use this measure in conjunction with a visual that has FactT[Sales], it checks for the product associated with that sale. If there's only one product for a sale (which there should be, based on your data model), it returns it. It doesn’t create a Cartesian product; it just returns the associated product for the sale.

FactT isn't filtering DimT in V3. Instead, it's displaying sales and the corresponding product based on the relationship you've established. If it were a full Cartesian product, you'd see every sales value for every product, which is not what you observed.

The `Product_M` measure you introduced using `SELECTEDVALUE(DimT[Product])` isn't forcing a Cartesian product. It's merely returning the associated product for the sale based on the existing relationship between FactT and DimT.

Proud to be a Power BI Super User !

Microsoft Community : https://docs.microsoft.com/en-us/users/AmiraBedhiafi
Linkedin : https://www.linkedin.com/in/amira-bedhiafi/
StackOverflow : https://stackoverflow.com/users/9517769/amira-bedhiafi
C-Sharp Corner : https://www.c-sharpcorner.com/members/amira-bedhiafi
Power BI Community :https://community.powerbi.com/t5/user/viewprofilepage/user-id/332696

AlB · ‎09-03-2023

Hi @AmiraBedh

Thanks very much for you quick response.

You have not clarified all of the questions, I'm afraid.

I did not say V3 shows the Cartesian product. Quite to the contrary, I said it should, but it does not.

V4 is clearly showing the Cartesian product of Sales and Product. Like I said, the screen cap does not show all the results (only products A to C) but you can see the whole thing in the attached pbix.

My main question is, why adding the measure causes the change of behavior between V3 and V4?

Thanks

@TomMartens @MFelix @jdbuchanan71 @OwenAuger