Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
fabric_user
Frequent Visitor

Direct Lake relationships - duplicates, null values, assume referential integrity

Hi,

 

1)

When setting up the relationships for a direct lake semantic model, there is the option to select "Assume referential integrity". What does this option mean in the context of direct lake?

I have been trying to read up a bit on "Assume referential integrity", and here is my understanding so far:

 

"Assume referential integrity" is available as an option only in the context of DirectQuery in Power BI Desktop, and this setting enables DirectQuery to run more efficient queries against the data source by using inner joins instead of outer joins, according to the documentation.

 

Let's say a relationship is a one-to-many relationship:

By checking the "Assume referential integrity" box, the developer of the semantic model is telling Power BI to assume that

  • for each value in the many-column, there will be a corresponding value in the one-column (the inner join will exclude records from the many-table which don't have a corresponding value in the one-table).
  • the data in the many-column is never null or blank (the inner join will exclude records from the many-table where the key column is null or blank).

The developer of the semantic model bears the risk of getting inconsistent results if these assumptions about the data source proves to be wrong. For example, exclusion of records from the many-table would impact visual totals.

 

In Microsoft Fabric, when setting up the relationships in a Direct Lake semantic model, there is the option of checking "Assume referential integrity".

Does "Assume referential integrity" apply to the Direct Lake mode, or does it only apply if the visual queries fall back to DirectQuery?


Does Direct Lake behave like Import mode or like DirectQuery, with regards to "Assume referential integrity"? 

 

fabric_user_0-1704837043081.png

(Sorry for using FactOrder and FactOrderLine in this example of a relationship 😅 I know relationships between fact tables are not optimal from a performance perspective. Imagine instead that this is a relationship between a regular dimension table and a fact table.)

 

2)

Primary, foreign and unique keys in Fabric Data Warehouse and SQL Analytics Endpoint are not enforced, meaning that the keys in Data Warehouse and SQL Analytics Endpoint aren't a guarantee for avoiding duplicate values in the key columns. 

The documentation says: "Primary key columns must contain unique values. DAX queries will fail if duplicate primary key values are detected."

My interpretation of this is that the DAX queries, which are triggered when we browse or refresh a Power BI report page, validates whether or not the relationship key columns have unique values. This is also my experience from reports built on top of the direct lake default semantic model. If I have duplicate values (or null values) in the column on the one-side of a relationship, then the visuals in the Power BI report which are depending on this relationship will display error messages (see below).

I'm curious if the DAX queries (in Direct Lake mode) always will detect if there are duplicate (or null) values in the primary key?
("DAX queries will fail if duplicate primary key values are detected.")
I hope they always will detect such cases, so we get consistent results (i.e. an error message if there are duplicates or null values in the primary key column) 😃

 

I hope someone can shed some light on this. Thank you! 😀

 

 

 

fabric_user_1-1704837671959.png

 

 

fabric_user_2-1704837687803.png

 

 

Ps. The Direct Lake documentation mentions that

 

  • "Validation is limited for Direct Lake models. User selections are assumed correct and no queries will validate cardinality and cross filter selections for relationships, or for the selected date column in a date table."

    Does the limited validation which is mentioned here, refer to the edit relationship dialog we use when we are defining the relationships in the semantic model? (and also the "mark as date table" dialog)

    Or does the limited validation also refer to the DAX queries in the Power BI report visuals (as mentioned above, i.e. checking for duplicate or null values on the one-side of relationships before rendering a visual)? 
3 REPLIES 3
v-gchenna-msft
Community Support
Community Support

Hi @fabric_user ,

Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this .
We will update you once we hear back from them.

Hi @fabric_user ,

Apologies for the delay in reply from our side. 
Unfortunately we didn't get any update from the internal team regarding the ask.
I kindly request you to raise a support-ticket here. Our team will coordinate with you and help you with the same.

After creating a Support ticket please provide the ticket number as it would help us to track for more information. Hope this helps. Please let us know if you have any other queries.

Hi @fabric_user ,

We haven’t heard from you on the last response and was just checking back to see if you got a chance to create a support ticket. 
Incase if created, please provide the ticket number as it would help us to track for more information.

Thanks,
Gopi Krishna Chenna

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

Expanding the Synapse Forums

New forum boards available in Synapse

Ask questions in Data Engineering, Data Science, Data Warehouse and General Discussion.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.