Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
mafaber
Helper II
Helper II

Filter by highest and second highest value in Power Query

Hi everyone,

 

I have a table containing data from different months. I want to filter for the latest and the previous months only.

 

My question is, what is the most efficient way of doing this?

I tried using the following methods:

  • Calculated column using List.Max and List.MaxN + filter
  • Using a supplementary table where I removed everything else but two cells containing the latest and previous periods, then filtering my main table using inner merge
  • or same method as above, but left outer merge + expand + filter nulls

All 3 methods resulted in a significant increase in calculation time bringing the loading times from 10-30 secs to 6-10 minutes effectively making it unusable to develop further. Also refresh time is up significantly due to any of these steps.

 

Do you have a more effective way to achieve this result in a dataflow?

 

Thank you,

M

4 REPLIES 4
v-jianboli-msft
Community Support
Community Support

Hi @mafaber ,

 

Could you please tell me if my post helped you? please feel free to ask.

 

Best regards,

Jianbo li

v-jianboli-msft
Community Support
Community Support

Hi @mafaber ,

 

You can improve the performance of power query, such as: Remove unnecessary columns, Remove unnecessary rows, Group by and summarize and so on.

Besides, how about using Keep Top Rows after sorting?

vjianbolimsft_0-1658832137470.png

 

For more details, please refer to:

Data reduction techniques for Import modeling

Speeding up power query

 

Best Regards,

Jianbo Li

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

mafaber
Helper II
Helper II

Not sure if it's the best, but I managed to significantly speed up the filtering by using supplementary tables containing only the months I want to keep in my main table, converting said tables to list (Transform pane > convert to list), it's an important step, without converting it to list, load still takes 6-8 minutes, then filter using List.Contains(#"supplementary_list", [yyyymm]) logic.

 

This way it takes ~30-60 secs to load.

mafaber
Helper II
Helper II

What I really don't understand is that it is the filtering that always slows the loading down so much, it's not the merge & expand, nor the calculation of List.MaxN.

When I simply filter manually like 'yyyymm' = "202207" it is quick, ~30 seconds.

Whenever I create a calculated column comparing the highest value of 'yyyymm' to 'yyyymm' using List.Max it's still only ~30 seconds to create the column containing the 1s and 0s (1 if 'yyyymm' = List.Max('yyyymm')), but when I do the filter on that column, that is the step which is killing my dataflow. It's the same with Merge+Expand+filter.

 

Why is the filter after a List.Max or Merge+Expand so heavy on the resource?

 

(my main question is still how to solve it, but I'm really curious about this one)

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors