Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Syndicate_Admin
Administrator
Administrator

Performance ETL powerQuery or previous ETL ?

Good afternoon community.

I have some doubts regarding the performance with 3 custiones.

1- What is the best performance with respect to the ETL part?
I understand that to do data cleaning in powerBi it is convenient to do it from powerQuery and not with DAX functions.
But suppose I have a SQL server database and a table of 50 million records, of which 10 million have to be filtered because they are not useful.
It is convenient to make a view in SQL, clean all the data (do the ETL process) and then from PowerBi I take that view with the data that is useful to me?
Or do I take the table directly from powerBi and do ETL from powerQuery?

2- Is it convenient to put together a "massive" data flow from which you can then connect from different reports?
Assuming this flow connected to a SQL server database and took 1 hour to upgrade. Assuming a super consultation.
Or is it better to query the database directly in each report?
One way I see that it is better not to stick to the database many times for the same information, but in the other way I think it is more efficient.
In my opinion it is more effective to throw 100 small queries to the database, than a single query that contains the 100.

3- What would be the best way to "relate" or solve this problem.
I have 3 tables. Form A / Form B / Users.
Form A and B are related. And both have a user code that relates to the users table.
BUT, the user of Form A and B are not necessarily the same.
Should you duplicate the User table and relate each one to each Form? Or should you have the table loose and search from both forms with a calculated column your user?
Or if there is another alternative it would help me.
This example also appears to me in case in the same table I have more than 1 way of relating to another.
Example if Form had a cod_usuario_creador / cod_usuario_aprobador / cod_usuario_modifica.

Thanks a lot!

1 REPLY 1
lbendlin
Super User
Super User

Great questions. The answer to all of them is "It depends".  There are many factors that can impact ETL performance. You will need to test both options in your specific scenario and then make a decision. Generally you want to do the data destruction as late as possible. Same for the data modeling.  Try both options and see which one works better for you. Generally you want to let the data model do the work for you as much as possible.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.