Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
Hello Community,
I'm looking for Best Practice (or otherwise) guidance on how to handle Power BI reporting for this specific undesirable source data situation.
PROBLEM: Usually in source data, ID lookup fields are numeric & singular. However, in this interesting case, the ID field is an nvarchar in the database, and can sometimes have multiple values (i.e. comma delimited).
Here is an oversimplified mock-up example.
- Sales Agents help Customers to fill out & submit applications.
- Once submitted, the Auditing Team can either accept or reject each application.
- A single rejection might have only 1 rejection reason, or it might have multiple rejection reasons.
'Fact Rejections' source data:
'D Rejection Reasons' source data:
'D Sales Persons' source data is as follows:
As expected, the Dimension 'D Rejection Reasons' can only filter the Fact table when there is a single rejection reason. Any time there are multiple reasons for a single rejection, the relationship is broken.
When SalesPerson = Abby, everything works because each rejection has only 1 reason.
But if SalesPerson = Shawn, things start to break.
- Measure 01 DISTINCT Rejections still works because it's dependent on Rejection ID.
- Measure 02 DISTINCT Rejection Reasons SHOULD return 5; because all 5 reasons are in the result set.
- Instead, it incorrectly returns 2, because it's counting each string as 1 DISTINCT reason
- "2, 3, 4" = 1 DISTINCT reason
- "1, 2, 3, 4, 5" = 1 DISTINCT reason
- Measure 03 Total Rejection Reasons SHOULD return 8; because 3 reasons (record 1) + 5 reasons (record 2) = 8.
- Instead, it incorrectly returns 2, because it can't join the FACT comma delimited list back to the dimension.
- And even if it could, the counting would still incorrectly return 2 (counting each comma delimited string as 1).
- Additionally, in this example, I am not able to slice on Rejection Reason because the relationship is broken.
Closing Q's
- Is there a way for Power BI to handle source data like this without modifying it?
- Or is it absolutely necessary to transform/modify the source data for Power BI use?
- Either in the Database or the Data Warehouse:
- Change Rejection Reason ID from an nvarchar to an int so it can be used as a proper relationship ID field.
- Ensure that Rejection Reason ID only shows a single value per record (no comma delimited values allowed).
- Expand table rows to repeat column data for each single Rejection Reason ID value. (Screenshot below)
- Are there other options (Best Practice or otherwise) that I'm not considering?
Regards,
Nathan
Solved! Go to Solution.
Hi @WinterMist
To split the [Rejection Reason ID] column, in Power Query, do the following:
1. Select 'Fact Rejections'[Rejection Reason ID] and hit "Split Column".
2. Use "comma" as the delimiter.
3. Open "Advanced Options" and set "Split into" to "Rows"
4. Close & Apply
I hope this helps.
Works beautifully! Thank you!
So is it just personal preference then at which layer the transformation occurs?
- DB Layer
- DW Layer
- Power Query Layer
Regards,
Nathan
Ideally, you want to do it as far upstream as possible. (but that can be very limiting/impossible in certain scenarios)
I'm not very good at PQ's M language but if I can find a way to do it in PQ instead of DAX, I'd go with PQ.
Hi @WinterMist
To split the [Rejection Reason ID] column, in Power Query, do the following:
1. Select 'Fact Rejections'[Rejection Reason ID] and hit "Split Column".
2. Use "comma" as the delimiter.
3. Open "Advanced Options" and set "Split into" to "Rows"
4. Close & Apply
I hope this helps.
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
86 | |
75 | |
74 | |
55 | |
45 |
User | Count |
---|---|
117 | |
106 | |
77 | |
66 | |
65 |