Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
WinterMist
Impactful Individual
Impactful Individual

Best Practices Q - Handling Text ID Fields With Comma Delimited Values

Hello Community,

 

I'm looking for Best Practice (or otherwise) guidance on how to handle Power BI reporting for this specific undesirable source data situation.

 

PROBLEM: Usually in source data, ID lookup fields are numeric & singular.  However, in this interesting case, the ID field is an nvarchar in the database, and can sometimes have multiple values (i.e. comma delimited).

 

Here is an oversimplified mock-up example.

- Sales Agents help Customers to fill out & submit applications.

- Once submitted, the Auditing Team can either accept or reject each application.

- A single rejection might have only 1 rejection reason, or it might have multiple rejection reasons.

 

WinterMist_0-1730480503712.png

 

'Fact Rejections' source data:

 

WinterMist_2-1730480857992.png

 

'D Rejection Reasons' source data:

 

WinterMist_4-1730481145230.png

'D Sales Persons' source data is as follows:

 

WinterMist_5-1730481195398.png

 

As expected, the Dimension 'D Rejection Reasons' can only filter the Fact table when there is a single rejection reason.  Any time there are multiple reasons for a single rejection, the relationship is broken.

 

WinterMist_6-1730481518860.png

 

When SalesPerson = Abby, everything works because each rejection has only 1 reason.

 

WinterMist_7-1730481788710.png

 

But if SalesPerson = Shawn, things start to break.

 

WinterMist_8-1730481906141.png

 

 

- Measure 01 DISTINCT Rejections still works because it's dependent on Rejection ID.

- Measure 02 DISTINCT Rejection Reasons SHOULD return 5; because all 5 reasons are in the result set.

   - Instead, it incorrectly returns 2, because it's counting each string as 1 DISTINCT reason

      - "2, 3, 4" = 1 DISTINCT reason

      - "1, 2, 3, 4, 5" = 1 DISTINCT reason

- Measure 03 Total Rejection Reasons SHOULD return 8; because 3 reasons (record 1) + 5 reasons (record 2) = 8.

   - Instead, it incorrectly returns 2, because it can't join the FACT comma delimited list back to the dimension.

   - And even if it could, the counting would still incorrectly return 2 (counting each comma delimited string as 1).

- Additionally, in this example, I am not able to slice on Rejection Reason because the relationship is broken.

 

Closing Q's

- Is there a way for Power BI to handle source data like this without modifying it?

- Or is it absolutely necessary to transform/modify the source data for Power BI use?

   - Either in the Database or the Data Warehouse:

      - Change Rejection Reason ID from an nvarchar to an int so it can be used as a proper relationship ID field.

      - Ensure that Rejection Reason ID only shows a single value per record (no comma delimited values allowed).

      - Expand table rows to repeat column data for each single Rejection Reason ID value.  (Screenshot below)

          

WinterMist_9-1730486624419.png

 

- Are there other options (Best Practice or otherwise) that I'm not considering?

 

Regards,

Nathan

1 ACCEPTED SOLUTION
gmsamborn
Super User
Super User

Hi @WinterMist 

To split the [Rejection Reason ID] column, in Power Query, do the following:
1. Select 'Fact Rejections'[Rejection Reason ID] and hit "Split Column".
2. Use "comma" as the delimiter.
3. Open "Advanced Options" and set "Split into" to "Rows"
4. Close & Apply

 

I hope this helps.

View solution in original post

3 REPLIES 3
WinterMist
Impactful Individual
Impactful Individual

@gmsamborn 

 

Works beautifully!  Thank you!

 

So is it just personal preference then at which layer the transformation occurs?

- DB Layer

- DW Layer

- Power Query Layer

 

WinterMist_0-1730495779055.png

 

Regards,

Nathan

 

Ideally, you want to do it as far upstream as possible.  (but that can be very limiting/impossible in certain scenarios)

 

I'm not very good at PQ's M language but if I can find a way to do it in PQ instead of DAX, I'd go with PQ.

 

gmsamborn
Super User
Super User

Hi @WinterMist 

To split the [Rejection Reason ID] column, in Power Query, do the following:
1. Select 'Fact Rejections'[Rejection Reason ID] and hit "Split Column".
2. Use "comma" as the delimiter.
3. Open "Advanced Options" and set "Split into" to "Rows"
4. Close & Apply

 

I hope this helps.

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.