March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello,
We have a very large dataset we are trying to manage (100 million+ rows) but this data is accross the past couple of months and gets updated/uploaded daily to our sharepoint. We can get our data into a more manageable size if we can cut out everything out in the Query to only the files that were created in the current month.
I have created the "PeriodM" column which will return the month the file was created. I want to create a flag that will return a 1 in a new column if the "PeriodM" month is equal to the max of all those values.
Example in the following image the latest/current month is 6/1/2022 so Latest Period should return a 1. And for 5/1/2022 it should return a 0.
This way I will be able to apply a step in our query to only filter rows with 1 in this new column, hopefully soving our size issue.
Solved! Go to Solution.
Your goal is to filter down to just the latest period, right? Although it doesn't create a flag column as an intermediate step, that's exactly what I'm doing in the example I linked to. You can add a flag column too but that adds extra intermediate steps.
Here's an example that adds a flag an then uses it to filter:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLVN9Q3MjAyAjKNTA2UYnUgwmYIYUNTU7CwEapqZGEk1UbGBnBhcyTVBgjVFkjCZkDVsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, PeriodM = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PeriodM", type date}, {"Value", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"PeriodMax", each List.Max([PeriodM]), type nullable date}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID", "PeriodM"}, #"Grouped Rows", {"ID", "PeriodMax"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"PeriodMax"}, {"Flag"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Grouped Rows",each [Flag], each if [Flag] = null then 0 else 1,Replacer.ReplaceValue,{"Flag"}),
#"Filtered Rows" = Table.SelectRows(#"Replaced Value", each ([Flag] = 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Flag"})
in
#"Removed Columns"
Here's a simpler version that bypasses that extra work:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLVN9Q3MjAyAjKNTA2UYnUgwmYIYUNTU7CwEapqZGEk1UbGBnBhcyTVBgjVFkjCZkDVsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, PeriodM = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PeriodM", type date}, {"Value", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"PeriodMax", each List.Max([PeriodM]), type nullable date}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID", "PeriodM"}, #"Grouped Rows", {"ID", "PeriodMax"}, "Grouped Rows", JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"Grouped Rows"})
in
#"Removed Columns"
This post might help with this.
https://community.powerbi.com/t5/Community-Blog/Select-Distinct-Rows-Ordered-by-Another-Column-Power...
I'd recommend approach #4 so that it can be folded into a native query.
@AlexisOlson Thanks for the suggestion but this is not what I'm looking to do and I'm not sure how to apply it to my our case. I was really looking for the "flag" or 1/0 solution as this is also something we can apply to other projects.
Your goal is to filter down to just the latest period, right? Although it doesn't create a flag column as an intermediate step, that's exactly what I'm doing in the example I linked to. You can add a flag column too but that adds extra intermediate steps.
Here's an example that adds a flag an then uses it to filter:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLVN9Q3MjAyAjKNTA2UYnUgwmYIYUNTU7CwEapqZGEk1UbGBnBhcyTVBgjVFkjCZkDVsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, PeriodM = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PeriodM", type date}, {"Value", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"PeriodMax", each List.Max([PeriodM]), type nullable date}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID", "PeriodM"}, #"Grouped Rows", {"ID", "PeriodMax"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"PeriodMax"}, {"Flag"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Grouped Rows",each [Flag], each if [Flag] = null then 0 else 1,Replacer.ReplaceValue,{"Flag"}),
#"Filtered Rows" = Table.SelectRows(#"Replaced Value", each ([Flag] = 1)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Flag"})
in
#"Removed Columns"
Here's a simpler version that bypasses that extra work:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLVN9Q3MjAyAjKNTA2UYnUgwmYIYUNTU7CwEapqZGEk1UbGBnBhcyTVBgjVFkjCZkDVsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, PeriodM = _t, Value = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"PeriodM", type date}, {"Value", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"PeriodMax", each List.Max([PeriodM]), type nullable date}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ID", "PeriodM"}, #"Grouped Rows", {"ID", "PeriodMax"}, "Grouped Rows", JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Queries",{"Grouped Rows"})
in
#"Removed Columns"
Please note I did not try this solution and the PBI forum auto selected this as a correct answer.
I was able to come up with my own solution.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.