Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
newtofabric
Regular Visitor

Cross reference table column with multiple values in Fabric Dataflow

Hello,

We are working on dataflow in fabric and running into a problem, where we were unable to look up information dynamically.

Eg:

We have Tbl_ref which has following info

ref_ID  ref_type  System  ref_code  Short_Txt        long_txt

1          cg           A           PA           PersonAcc       PersonalAccount

2          cg_part    A           PIN          PersonID        PersonalIdentification

 

we have dim_orig with following info


Orid_id Orig_key system cg_Part_Code Ref_part_Code Ref_part_short_text Ref_part_long_text CG_Code Ref_CG_Code Ref_cg_short_text Ref_gc_long_text
1 xxxx-FPPxxxxxxOV A PIN null null null PA null null null


We are trying to fill in the short text and long text information from tbl ref to dim_orig using merge operator.

The problem here is: we had to use merge operators one for each ref_type from tbl_ref and appending the values to dim_orig .


Below is the merge conditions in Fabric translated to SQL

SELECT Ref_cg_short_text Ref_gc_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg'

 

SELECT Ref_part_short_text Ref_part_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_part_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg_part'


Do we have any alternatives to avoid using merge operator multiple times with hardcoded values in dataflows as above and combine  as one merge?

9 REPLIES 9
cpwebb
Microsoft Employee
Microsoft Employee

Are there only ever two possible values in the REF_TYPE column - 'Cg' and 'Cg_part'? If so you could do a single merge and then pivot the data so the two rows for Cg and Cg_part become two columns. I don't know if that would perform better than two separate merges though.

Hello cpwebb,

 

Thank you for your reponse. There are 38 more columns that we had to deal with including cg and Cg_part We took an approach of creating views on our Database with required joins and then use views as source on dataflows to import data. We are not quite certain if this is an ideal way. but this approach seems to work for our needs.

I am not a DB guy, but here is what I've seen... If what I am about to say is wrong, someone please correct me.

 

It is indeed better to do some operations at the source, ie the DB, and sometimes not.  It really depends on many factors, such as whether your DB can have materialized views (Oracle, but not SQL Server), or if you have a competent DBA that has done a top notch job at optimizing the DB performance, then I'd say do as much as possible in the DB.  But if not, PQ in the cloud can save the day.  

 

This business of views, if the tables keep growing and growing, and everytime you request the view, it has to be rebuilt.  There could be performance issues one day.  A solution would be to push the raw data into a Fabric lakehouse, then use the SQL Analytics endpoint to create your views in the cloud and feed these to your DFg2. Or instead of the SQL endpoint, use a Spark notebook, which can actually write and create new tables in the lakehouse, if I remember my webinar correctly.  The SQL endpoint can't do this, only create views, functions, and procedures.

Hi @newtofabric 

 

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.


Thanks.

v-cboorla-msft
Community Support
Community Support

Hi @newtofabric 

 

Apologies for the issue you have been facing and delay in the response from my end. I would like to check are you still facing this issue? 

If the issue still persists, please reach out to our support team.

Please go ahead and raise a support ticket to reach our support team:

https://support.fabric.microsoft.com/support

 

After creating a Support ticket please provide the ticket number as it would help us to track for more information.

 

Thank you.

Element115
Power Participant
Power Participant

Try to run the same M code in Power Query Desktop to see if you get the same error. That way we can determine if it is a Power Query Fabric issue or just a general Power Query engine issue.

 

Also, are you using a DFg1 or DFg2?  In other words, did you create the DF from the Power BI persona, or the Data Factory persona?

Hello Element115, Thank you for your response. We are using data factory persona for our ETL not power BI to use M code

Data factory... so I guess dataflows gen 2, or DFg2, which is an ETL tool. In fact, it is Power Query in the cloud.  So, when you right click on any query in your DFg2, you can select Advanced editor from the contextual menu and it will show you the M code for the query.

 

For debugging purposes, and curiosity's sake 😉, once the Advanced editor is open, you can select all and copy the code, and then paste it into Power Query Desktop to see if it causes the same issue. If not, then we would have determined that it is something specific to Power Query in the cloud, ie DFg2.  When you have time, of course. 

v-cboorla-msft
Community Support
Community Support

Hi @newtofabric 

 

Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this .
We will update you once we hear back from them.

Appreciate your patience.

 

Thanks

Helpful resources

This widget could not be displayed.