travbum
Advocate I

How many Applied Steps do you use for a typical query?

I've just worked my way through a somewhat more advanced task, and it took me 12 steps to complete. I'm wondering if this is excessive and if I should aim to use fewer.

1 ACCEPTED SOLUTION
KHorseman
Community Champion

Unfortunately that's too vague a question to give a sensible answer to. It's like asking "How much water do I need?" Need for what? To drink today? To survive in the desert for a month? To fill a fish tank?

 

You need exactly as many steps as it takes to get the result you need by the most efficient means. In the report I have open there are 14 queries. One of them has two steps, and it's one of the largest and most important tables in the whole model. Another one has 21 steps, plus 6 more steps that take place in a different query that is merged with it on step 15.

 

A better question might be, "Are my query load times as quick as they could be?" If refreshing your data takes a long time, maybe there's a better way to write your query in fewer steps. But maybe there isn't. Fewer steps also might not be more efficient. Sometimes an extra step or two might make things load faster.
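One concrete case of that last point (a hypothetical sketch, not from this thread — the file path and column names below are made up for illustration): if a table is referenced more than once, Power Query may re-evaluate it each time, so adding a Table.Buffer step can make the refresh faster even though it is one more step.

```
let
    // Hypothetical source; the path and columns are illustrative only
    Source = Csv.Document(File.Contents("C:\data\responses.csv")),
    Promoted = Table.PromoteHeaders(Source),
    // The "extra" step: buffer the table in memory so the two
    // references below don't each re-read and re-parse the file
    Buffered = Table.Buffer(Promoted),
    CorrectOnly = Table.SelectRows(Buffered, each [Response] = "Correct"),
    Joined = Table.NestedJoin(Buffered, {"Email"}, CorrectOnly, {"Email"}, "CorrectRows", JoinKind.LeftOuter)
in
    Joined
```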

 

Can you give an example of what this query does and how? Maybe there's a better way.





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!





8 REPLIES
Table1 (Sheet1):

Email        Question   Response
a@test.com   Q1         Correct
a@test.com   Q2         Incorrect
b@test.com   Q1         Incorrect
b@test.com   Q2         Incorrect
c@test.com   Q1         Correct
c@test.com   Q2         Correct

Table2 (Sheet2):

Email        Question   Response
a@test.com   Q1         Incorrect
a@test.com   Q2         Correct
b@test.com   Q2         Incorrect
d@test.com   Q1         Correct
d@test.com   Q2         Incorrect

Desired result:

Email        Question   Sheet 1 Response   Sheet 2 Response
a@test.com   Q1         Correct            Incorrect
a@test.com   Q2         Incorrect          Correct
b@test.com   Q1         Incorrect
b@test.com   Q2         Incorrect          Incorrect
c@test.com   Q1         Correct
c@test.com   Q2         Correct
d@test.com   Q1                            Correct
d@test.com   Q2                            Incorrect

 

I've got two tables, Table1 and Table2, each with a set of email addresses, questions, and how people responded; the desired output is shown in the final table. Column 3 of the final table is column 3 of Table1, and column 4 of the final table is column 3 of Table2. This is what I've got thus far:

 

let
    // Tag each response with its source sheet so the two can be told apart after combining
    Table1 = Table.AddColumn(Sheet1, "Response_Unique", each "S1" & [Response]),
    Table2 = Table.AddColumn(Sheet2, "Response_Unique", each "S2" & [Response]),
    Source = Table.Combine({Table1, Table2}),
    // Build a key identifying one person's answer to one question
    #"Added Custom" = Table.AddColumn(Source, "Merge_ID", each [Email] & "-" & [Question]),
    // Concatenate the tagged responses from both sheets into one string per key
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Merge_ID"}, {{"Merged Responses", each Text.Combine([Response_Unique], ""), type text}}),
    Joined = Table.Join(#"Added Custom", "Merge_ID", #"Grouped Rows", "Merge_ID"),
    // Split the combined string back into a Sheet1 part and a Sheet2 part
    #"Split Column by Delimiter" = Table.SplitColumn(Joined, "Merged Responses", Splitter.SplitTextByDelimiter("S2", QuoteStyle.Csv), {"Merged Responses.1", "Merged Responses.2"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter", {{"Merged Responses.1", type text}, {"Merged Responses.2", type text}}),
    // Strip the "S1" tag; the "S2" tag was consumed as the split delimiter
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type", "S1", "", Replacer.ReplaceText, {"Merged Responses.1"}),
    // The join produced one row per source row; keep one row per key
    #"Removed Duplicates" = Table.Distinct(#"Replaced Value", {"Merge_ID"}),
    #"Renamed Columns1" = Table.RenameColumns(#"Removed Duplicates", {{"Merged Responses.1", "Sheet 1 Response"}, {"Merged Responses.2", "Sheet 2 Response"}}),
    // Rows that only appeared in Sheet2 end up with an empty Sheet 1 Response; make it null
    #"Replaced Value1" = Table.ReplaceValue(#"Renamed Columns1", "", null, Replacer.ReplaceValue, {"Sheet 1 Response"}),
    #"Removed Columns" = Table.RemoveColumns(#"Replaced Value1", {"Response", "Response_Unique", "Merge_ID"})
in
    #"Removed Columns"

 

This is my first adventure into the Advanced Editor and making something a bit more complicated than just a new column. I'm guessing then that this is just the way things go with Power BI when you're dealing with data sources that aren't so organized.
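(For comparison, a hypothetical sketch of how the same shape could be produced with a single full-outer merge on Email and Question rather than a concatenated Merge_ID. It assumes the sheets load as Sheet1 and Sheet2 with columns Email, Question and Response as in the sample tables above, and it has not been tested against the real data:)

```
let
    S1 = Table.RenameColumns(Sheet1, {{"Response", "Sheet 1 Response"}}),
    S2 = Table.RenameColumns(Sheet2, {{"Response", "Sheet 2 Response"}}),
    // Full outer join keeps rows that appear in only one sheet
    Joined = Table.NestedJoin(S1, {"Email", "Question"}, S2, {"Email", "Question"}, "FromS2", JoinKind.FullOuter),
    Expanded = Table.ExpandTableColumn(Joined, "FromS2", {"Email", "Question", "Sheet 2 Response"}, {"Email.2", "Question.2", "Sheet 2 Response"}),
    // Rows only in Sheet2 have null keys on the left side; coalesce them
    WithEmail = Table.AddColumn(Expanded, "EmailKey", each if [Email] = null then [Email.2] else [Email]),
    WithQuestion = Table.AddColumn(WithEmail, "QuestionKey", each if [Question] = null then [Question.2] else [Question]),
    Cleaned = Table.RemoveColumns(WithQuestion, {"Email", "Question", "Email.2", "Question.2"}),
    Result = Table.RenameColumns(Cleaned, {{"EmailKey", "Email"}, {"QuestionKey", "Question"}})
in
    Result
```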

@travbum I don't see anything obvious that I would do differently to make that more efficient, though some here might spot something I missed. I might change the order slightly so that similar steps go together when possible (e.g. do all the Replace Value stuff in a row before moving on) but that's a purely cosmetic thing. How long does it take to refresh this?

 

Messy input definitely does make for long queries. If you have some Advil on hand I'll tell you about that 20-something step query of mine.









Thanks for the feedback. As far as refreshing goes, with two Excel sheets of 6 rows and 3 columns each it takes about 1 s. The data we're actually working with is ~60,000 rows from Salesforce, though, and it honestly seems like pulling that takes up most of the refresh time. Should I run a refresh to see how long Power BI takes to pull all the data from Salesforce and then use that as a baseline? If so, looking only at the time after the data has been pulled from Salesforce, at what point should I consider my query too slow? 5 s? 30 s? Something else?

kcantor
Community Champion

@travbum In all honesty, when I hit refresh on the UPS file I mentioned, I cannot do any other work while it churns. It finishes in under 3 minutes that way. It is a beast, but every one of those steps is necessary. It currently has 360,000 rows and about 8 columns in the largest query. Refresh time is important, more so if you are hitting live tables. I have, in the past, crashed Navision here at my company, but in the grand scheme of things, getting the data cleaned and usable is much more important than time spent in refresh. Now, if your query is crashing the software or the live tables, you may want to take a second look...









@travbum pfft, I don't consider a query slow until it takes more than 10 minutes! But then the database I'm pulling from wasn't so much "designed" as "conglomerated", "congealed", "summoned", "reanimated", "excreted"...









This question is akin to asking how many licks it takes to get to the center of a Tootsie Pop. Some steps are tiny and others are the large bite at the end. Each data set is different and needs to be treated as such. My UPS file tops out at 22 steps, but each one is necessary, and without them I would be dealing with crap data.

If it works, it works.









@travbum As mentioned, the number of steps shouldn't be the initial concern; performance should be. If you need to clean the data, then you need the steps. If the performance is horrible and you are asking about better ways to do something, I'm sure someone can help. But the quantity of steps alone shouldn't worry you, other than that more steps take more time to come back to and figure out later.


Looking for more Power BI tips, tricks & tools? Check out PowerBI.tips, the site I co-own with Mike Carlo. Also, if you are near SE WI, join our PUG Milwaukee Brew City PUG.
