Before getting onto the tips, it's important to understand that "close & apply" button will run all the steps on each table we have added in the edit queries or transform data window. This means that the more we forget about good practices in power query, the longer the query generated will take.
That's why it is important to apply good practices in Transform Data with Power Query.
Best Practices
- Consolidate repeated steps, it's probably that the engine will run automatically steps for any minor change we can make or even add some "Change Types". Let's try to reduce the steps the maximum we can. All removing columns together, all change types together. Check how is the best way in your query to get the smallest amount of steps building the same query.
- Create Group of tables to organize the tables in the data model. For example: Group of dimensions, facts, auxiliars, etc.
- Be sure that your tables and columns names are easy to understand for the final user. If they are not, rename tables and columns to make them clearer.
- All the columns must have a data type assigned corresponding to the data. We should avoid having “ABC123″ in a column header because the correct type of data can make our model better and even smaller sometimes.
- Always have the formula bar turned on to learn and modify Power Query in order to win more flexibility in transformations and optimize steps.
- If you are adding a custom column or function it is important to rename the step in the advanced editor to make the reading easier in the steps. We have to avoid having steps like "Add Custom Columns" or "Change Type 4". You should also rename default steps if you have more than one to avoid numbers at the end. Clarify what column you have rename or change type.
- If the transformation is too complex to explain it with just the rename of the step (recommendation 6) we can get in the advanced editor where all the power query code is and add a comment line above with //Comment text. This will create a small "i" icon next to the steps on the right transformations pane that will allow the user to read the description of the step.
- The Power Query parameters can help us to filter the data in order to work with a smaller sample of data during the development with Power Bi Desktop and increase it after the publish to Service.
- Run transformations as sooner or earlier as possible. This means that if we have a single database/warehouse source it will be better to run the more transformations we can right there. Building store procedures for new tables or building views with the expected fact or dimension. If we can't because we have more than one source or we have files instead of databases, then we can go ahead and transform data model in Power Query.
- Keeping previous idea, we should avoid breaking the famous "dark" query folding. We can check here which Power Query steps might break it.
- If we achieve query folding and the data sources allows us to have incremental, we can configure incremental refresh in order to reduce the refresh time for a dataset in Service.
- We have to be aware of the diagnostics tool to discover performance issues in the queries and tables.
- If you want to check which is the best way to do something, look for Chris Webb blog.
- In order to deep dive in Power Query M and learn really good the language you can read the blog series written by Ben Gribaudo. It now has almost 20 chapters of knowledge.
These are not all the good practices we can have. I'm sure that this community can provide more at this list 🙂
I encourage you to start using this list. Baby steps. Start using the numbers that make sense for you and soon you might have your own list to improve the usage of Transform Data with Power Query (data modeling) in Power Bi Desktop.
This is an original post from my spanish blog LaDataWeb.