Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hello,
We are working on dataflow in fabric and running into a problem, where we were unable to look up information dynamically.
Eg:
We have Tbl_ref which has following info
ref_ID ref_type System ref_code Short_Txt long_txt
1 cg A PA PersonAcc PersonalAccount
2 cg_part A PIN PersonID PersonalIdentification
we have dim_orig with following info
Orid_id Orig_key system cg_Part_Code Ref_part_Code Ref_part_short_text Ref_part_long_text CG_Code Ref_CG_Code Ref_cg_short_text Ref_gc_long_text
1 xxxx-FPPxxxxxxOV A PIN null null null PA null null null
We are trying to fill in the short text and long text information from tbl ref to dim_orig using merge operator.
The problem here is: we had to use merge operators one for each ref_type from tbl_ref and appending the values to dim_orig .
Below is the merge conditions in Fabric translated to SQL
SELECT Ref_cg_short_text Ref_gc_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg'
SELECT Ref_part_short_text Ref_part_long_text FROM dim_orig CS
LEFT JOIN Tbl_ref RM ON
CS.CG_part_CODE = RM.Ref_Code AND CS.SYSTEM = RM.System
AND RM.REF_TYPE_ = 'Cg_part'
Do we have any alternatives to avoid using merge operator multiple times with hardcoded values in dataflows as above and combine as one merge?
Are there only ever two possible values in the REF_TYPE column - 'Cg' and 'Cg_part'? If so you could do a single merge and then pivot the data so the two rows for Cg and Cg_part become two columns. I don't know if that would perform better than two separate merges though.
Hello cpwebb,
Thank you for your reponse. There are 38 more columns that we had to deal with including cg and Cg_part We took an approach of creating views on our Database with required joins and then use views as source on dataflows to import data. We are not quite certain if this is an ideal way. but this approach seems to work for our needs.
I am not a DB guy, but here is what I've seen... If what I am about to say is wrong, someone please correct me.
It is indeed better to do some operations at the source, ie the DB, and sometimes not. It really depends on many factors, such as whether your DB can have materialized views (Oracle, but not SQL Server), or if you have a competent DBA that has done a top notch job at optimizing the DB performance, then I'd say do as much as possible in the DB. But if not, PQ in the cloud can save the day.
This business of views, if the tables keep growing and growing, and everytime you request the view, it has to be rebuilt. There could be performance issues one day. A solution would be to push the raw data into a Fabric lakehouse, then use the SQL Analytics endpoint to create your views in the cloud and feed these to your DFg2. Or instead of the SQL endpoint, use a Spark notebook, which can actually write and create new tables in the lakehouse, if I remember my webinar correctly. The SQL endpoint can't do this, only create views, functions, and procedures.
Hi @newtofabric
We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.
Thanks.
Hi @newtofabric
Apologies for the issue you have been facing and delay in the response from my end. I would like to check are you still facing this issue?
If the issue still persists, please reach out to our support team.
Please go ahead and raise a support ticket to reach our support team:
https://support.fabric.microsoft.com/support
After creating a Support ticket please provide the ticket number as it would help us to track for more information.
Thank you.
Try to run the same M code in Power Query Desktop to see if you get the same error. That way we can determine if it is a Power Query Fabric issue or just a general Power Query engine issue.
Also, are you using a DFg1 or DFg2? In other words, did you create the DF from the Power BI persona, or the Data Factory persona?
Hello Element115, Thank you for your response. We are using data factory persona for our ETL not power BI to use M code
Data factory... so I guess dataflows gen 2, or DFg2, which is an ETL tool. In fact, it is Power Query in the cloud. So, when you right click on any query in your DFg2, you can select Advanced editor from the contextual menu and it will show you the M code for the query.
For debugging purposes, and curiosity's sake 😉, once the Advanced editor is open, you can select all and copy the code, and then paste it into Power Query Desktop to see if it causes the same issue. If not, then we would have determined that it is something specific to Power Query in the cloud, ie DFg2. When you have time, of course.
Hi @newtofabric
Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this .
We will update you once we hear back from them.
Appreciate your patience.
Thanks
Check out the September 2024 Fabric update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.