Re: Cross reference table column with multiple val...

newtofabric · ‎02-08-2024

Hello,

We are working on dataflow in fabric and running into a problem, where we were unable to look up information dynamically.

Eg:

We have Tbl_ref which has following info

ref_ID ref_type System ref_code Short_Txt long_txt

1 cg A PA PersonAcc PersonalAccount

2 cg_part A PIN PersonID PersonalIdentification

we have dim_orig with following info

Orid_id Orig_key system cg_Part_Code Ref_part_Code Ref_part_short_text Ref_part_long_text CG_Code Ref_CG_Code Ref_cg_short_text Ref_gc_long_text
1 xxxx-FPPxxxxxxOV A PIN null null null PA null null null

We are trying to fill in the short text and long text information from tbl ref to dim_orig using merge operator.

The problem here is: we had to use merge operators one for each ref_type from tbl_ref and appending the values to dim_orig .

Below is the merge conditions in Fabric translated to SQL

SELECT Ref_cg_short_text Ref_gc_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg'

SELECT Ref_part_short_text Ref_part_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_part_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg_part'

Do we have any alternatives to avoid using merge operator multiple times with hardcoded values in dataflows as above and combine as one merge?

cpwebb · ‎02-20-2024

Are there only ever two possible values in the REF_TYPE column - 'Cg' and 'Cg_part'? If so you could do a single merge and then pivot the data so the two rows for Cg and Cg_part become two columns. I don't know if that would perform better than two separate merges though.

newtofabric · ‎02-21-2024

Hello cpwebb,

Thank you for your reponse. There are 38 more columns that we had to deal with including cg and Cg_part We took an approach of creating views on our Database with required joins and then use views as source on dataflows to import data. We are not quite certain if this is an ideal way. but this approach seems to work for our needs.

Element115 · ‎02-21-2024

I am not a DB guy, but here is what I've seen... If what I am about to say is wrong, someone please correct me.

It is indeed better to do some operations at the source, ie the DB, and sometimes not. It really depends on many factors, such as whether your DB can have materialized views (Oracle, but not SQL Server), or if you have a competent DBA that has done a top notch job at optimizing the DB performance, then I'd say do as much as possible in the DB. But if not, PQ in the cloud can save the day.

This business of views, if the tables keep growing and growing, and everytime you request the view, it has to be rebuilt. There could be performance issues one day. A solution would be to push the raw data into a Fabric lakehouse, then use the SQL Analytics endpoint to create your views in the cloud and feed these to your DFg2. Or instead of the SQL endpoint, use a Spark notebook, which can actually write and create new tables in the lakehouse, if I remember my webinar correctly. The SQL endpoint can't do this, only create views, functions, and procedures.

v-cboorla-msft · ‎02-21-2024

Hi @newtofabric

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.

Thanks.

v-cboorla-msft · ‎02-19-2024

Hi @newtofabric

Apologies for the issue you have been facing and delay in the response from my end. I would like to check are you still facing this issue?

If the issue still persists, please reach out to our support team.

Please go ahead and raise a support ticket to reach our support team:

https://support.fabric.microsoft.com/support

After creating a Support ticket please provide the ticket number as it would help us to track for more information.

Thank you.

Element115 · ‎02-09-2024

Try to run the same M code in Power Query Desktop to see if you get the same error. That way we can determine if it is a Power Query Fabric issue or just a general Power Query engine issue.

Also, are you using a DFg1 or DFg2? In other words, did you create the DF from the Power BI persona, or the Data Factory persona?

newtofabric · ‎02-21-2024

Hello Element115, Thank you for your response. We are using data factory persona for our ETL not power BI to use M code

Element115 · ‎02-21-2024

Data factory... so I guess dataflows gen 2, or DFg2, which is an ETL tool. In fact, it is Power Query in the cloud. So, when you right click on any query in your DFg2, you can select Advanced editor from the contextual menu and it will show you the M code for the query.

For debugging purposes, and curiosity's sake 😉, once the Advanced editor is open, you can select all and copy the code, and then paste it into Power Query Desktop to see if it causes the same issue. If not, then we would have determined that it is something specific to Power Query in the cloud, ie DFg2. When you have time, of course.

v-cboorla-msft · ‎02-09-2024

Hi @newtofabric

Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this .
We will update you once we hear back from them.

Appreciate your patience.

Thanks

Cross reference table column with multiple values in Fabric Dataflow

Helpful resources

Join us at the Microsoft Fabric Community Conference

We want your feedback!

Microsoft Fabric Community Conference 2025

A Year in Review - December 2024

Join us at the 2025 Microsoft Fabric Community Conference

Cross reference table column with multiple values in Fabric Dataflow

Helpful resources

Join us at the Microsoft Fabric Community Conference

We want your feedback!

Microsoft Fabric Community Conference 2025

A Year in Review - December 2024