Showing results for 
Search instead for 
Did you mean: 

Performance ETL powerQuery or previous ETL ?

Good afternoon community.

I have some doubts regarding the performance with 3 custiones.

1- What is the best performance with respect to the ETL part?
I understand that to do data cleaning in powerBi it is convenient to do it from powerQuery and not with DAX functions.
But suppose I have a SQL server database and a table of 50 million records, of which 10 million have to be filtered because they are not useful.
It is convenient to make a view in SQL, clean all the data (do the ETL process) and then from PowerBi I take that view with the data that is useful to me?
Or do I take the table directly from powerBi and do ETL from powerQuery?

2- Is it convenient to put together a "massive" data flow from which you can then connect from different reports?
Assuming this flow connected to a SQL server database and took 1 hour to upgrade. Assuming a super consultation.
Or is it better to query the database directly in each report?
One way I see that it is better not to stick to the database many times for the same information, but in the other way I think it is more efficient.
In my opinion it is more effective to throw 100 small queries to the database, than a single query that contains the 100.

3- What would be the best way to "relate" or solve this problem.
I have 3 tables. Form A / Form B / Users.
Form A and B are related. And both have a user code that relates to the users table.
BUT, the user of Form A and B are not necessarily the same.
Should you duplicate the User table and relate each one to each Form? Or should you have the table loose and search from both forms with a calculated column your user?
Or if there is another alternative it would help me.
This example also appears to me in case in the same table I have more than 1 way of relating to another.
Example if Form had a cod_usuario_creador / cod_usuario_aprobador / cod_usuario_modifica.

Thanks a lot!

Super User
Super User

Great questions. The answer to all of them is "It depends".  There are many factors that can impact ETL performance. You will need to test both options in your specific scenario and then make a decision. Generally you want to do the data destruction as late as possible. Same for the data modeling.  Try both options and see which one works better for you. Generally you want to let the data model do the work for you as much as possible.

Helpful resources

Exciting changes

Power BI Community Changes

Check out the changes to the Power BI Community announced at Build.

May 2023 update

Power BI May 2023 Update

Find out more about the May 2023 update.

Kudo Data Story carousel

Data Stories Gallery

Visit our Data Stories Gallery and give kudos to your favorite Data Stories.

Top Solution Authors