Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
How can I filter rows based on comparisons to other rows?
My desired filter is "Remove any row that has another row within X rows whose timestamp is more than Y seconds away from the timestamp of the next row"
My data is a series of about 600 CSV files with around 7000 rows and <500kB each, totalling less than 400MB. If it makes a difference, each file is in order by the Time column. Sample data:
Time (s) | Data |
0 | A |
1.012 | B |
2.022 | C |
3.032 | D |
4.041 | E |
5.05 | F |
5.906 | G |
5.917 | H |
5.937 | I |
5.948 | J |
5.959 | K |
5.97 | L |
5.98 | M |
5.991 | N |
The desired result is to remove A, B, C, D, E, F, G, H, I, J for an X rows of 3 and Y seconds of 0.025
I tried adding an index column and creating a custom column using List.Range to calculate the time difference to the previous row
try [#"Time (s)"] - List.Range(#"Reordered Columns"[#"Time (s)"], [Index]-1, 1){0} otherwise #"Max Timestep"+1
Followed by creating a second column again using List.Range to check if any values nearby exceed Y
try List.Max( List.Range(#"Reordered Columns1"[Time to Previous], [Index]-#"Filter Range", (#"Filter Range"*2)+1) ) < #"Max Timestep" otherwise false
Then just filtered out any rows that had false in that column.
This worked when testing on a single file, but encountered issues when I tried it on the whole folder. Loading has still not completed after about 3 hours at the time of writing (Imports of similar datasets without the filtering have completed in minutes in the past), and I am seeing insane file sizes in the load dialog - 100+GB from individual files in some cases.
What is a better way to do this kind of filtering?
User | Count |
---|---|
77 | |
76 | |
43 | |
30 | |
26 |
User | Count |
---|---|
97 | |
91 | |
52 | |
47 | |
46 |