Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
smpa01
Super User
Super User

Incorporating Group by Max in Table.SelectRows

I have a dataset and I want Table.SelectRows to return the rows that has max fileIndex group by file Name.

 

I can currently do this

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    #"Added Custom" = Table.AddColumn(ct, "Custom", each Table.SelectRows(
        ct,
        (r) => r[fileName]
            = [fileName] and
           r[fileIndex]
            = List.Max(Table.SelectRows(ct, (q) => q[fileName] = [fileName])[fileIndex])
      ))
in
    #"Added Custom"

 but instead of adding a column I want to utilize that in Table .SelectRows so that it returns the filtered rows only that have the max fileIndex group by fileName. 

 

This is what I tried which does not work

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    Custom1 = Table.SelectRows(
        ct, each 
        (r) => r[fileName]
            = [fileName] and
           r[fileIndex]
            = List.Max(Table.SelectRows(ct, (q) => q[fileName] = [fileName])[fileIndex])
      )
in
    Custom1

 

DAX equivalent is

smpa01_0-1674180482121.png

 

pbix is attached

 

++ @AlexisOlson @CNENFRNL 

 

Thank you in advance

 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
3 ACCEPTED SOLUTIONS
ronrsnfld
Super User
Super User

Group and filter:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),

    #"Max fileIndex per filName" = Table.Group(ct,{"fileName"},{
        {"filtered", each Table.FromRecords({Table.Max(_,"fileIndex" )}),
            type table[fileName=text, fileIndex=Int64.Type, Content=text]}
        }),
    #"Expanded filtered" = Table.ExpandTableColumn(#"Max fileIndex per filName", "filtered", {"fileIndex", "Content"})
in
    #"Expanded filtered"

ronrsnfld_0-1674184151630.png

 

 

 

View solution in original post

AlexisOlson
Super User
Super User

This might help. Check out my Approach #4 in particular:

https://community.powerbi.com/t5/Community-Blog/Select-Distinct-Rows-Ordered-by-Another-Column-Power...

 

Here's a version with Table.Join instead of Table.NestedJoin:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    #"Grouped Rows" = Table.Group(ct, {"fileName"}, {{"fileIndex", each List.Max([fileIndex]), type nullable number}}),
    #"Merged Queries" = Table.Join(ct, {"fileName", "fileIndex"}, #"Grouped Rows", {"fileName", "fileIndex"}, JoinKind.Inner)
in
    #"Merged Queries"

View solution in original post

CNENFRNL
Community Champion
Community Champion

In terms of performance, Table.Group() is better. All in all, try to use database to finish this task.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    Grouped = Table.Group(ct, "fileName", {"grp", each Table.Max(_, "fileIndex")}),
    #"Expanded grp" = Table.ExpandRecordColumn(Grouped, "grp", {"fileIndex", "Content"}, {"fileIndex", "Content"})
in
    #"Expanded grp"
-- for SQL Server
WITH cte_rn
     AS (SELECT *
                , Row_number ()
                    OVER (
                      partition BY filename
                      ORDER BY fileindex DESC) AS RN
         FROM   yearly_sales)
SELECT *
FROM   cte_rn
WHERE  rn = 1 

 


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

View solution in original post

6 REPLIES 6
CNENFRNL
Community Champion
Community Champion

In terms of performance, Table.Group() is better. All in all, try to use database to finish this task.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    Grouped = Table.Group(ct, "fileName", {"grp", each Table.Max(_, "fileIndex")}),
    #"Expanded grp" = Table.ExpandRecordColumn(Grouped, "grp", {"fileIndex", "Content"}, {"fileIndex", "Content"})
in
    #"Expanded grp"
-- for SQL Server
WITH cte_rn
     AS (SELECT *
                , Row_number ()
                    OVER (
                      partition BY filename
                      ORDER BY fileindex DESC) AS RN
         FROM   yearly_sales)
SELECT *
FROM   cte_rn
WHERE  rn = 1 

 


Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension!

DAX is simple, but NOT EASY!

How much I wish that the data already went to a server so that I did not have to use PQ. I have nothing against PQ but it is not scalable.  Datamart is also switched off  (org-level) which is why can't take advantage of that too.

 

While we are still at it, other TSQL to pull the same string

 

declare @t1 as table (fileName int, fileIndex int, cat varchar(max))

insert into @t1

select * from 
(values(1,1,'bin'),(1,2,'bin'),(1,3,'bin'),(2,1,'bin'),(2,2,'bin')) t(a,b,c)

--option#1
select a.fileName, a.fileIndex, a.cat
from @t1 a
where exists 
(select c.fileName, c.fileIndex from 
		(select b.fileName, max(b.fileIndex) as fileIndex from @t1 b group by b.fileName) as c 
			where a.fileName =c.fileName and a.fileIndex=c.fileIndex)

--option#2
select top 1 with ties
a.fileName, a.fileIndex, a.cat
from @t1 a
order by row_number() over (partition by a.fileName order by a.fileIndex desc)

 

smpa01_0-1674224923072.png

 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs
AlexisOlson
Super User
Super User

This might help. Check out my Approach #4 in particular:

https://community.powerbi.com/t5/Community-Blog/Select-Distinct-Rows-Ordered-by-Another-Column-Power...

 

Here's a version with Table.Join instead of Table.NestedJoin:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),
    #"Grouped Rows" = Table.Group(ct, {"fileName"}, {{"fileIndex", each List.Max([fileIndex]), type nullable number}}),
    #"Merged Queries" = Table.Join(ct, {"fileName", "fileIndex"}, #"Grouped Rows", {"fileName", "fileIndex"}, JoinKind.Inner)
in
    #"Merged Queries"
ronrsnfld
Super User
Super User

Group and filter:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjNU0lEC4WinzLzEospYpVgdqKgRVlFjDFEjrCYYYTXBCKcJJiiisQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [fileName = _t, fileIndex = _t, Content = _t]),
    ct = Table.TransformColumnTypes(Source,{{"fileName", type text}, {"fileIndex", Int64.Type}, {"Content", type text}}),

    #"Max fileIndex per filName" = Table.Group(ct,{"fileName"},{
        {"filtered", each Table.FromRecords({Table.Max(_,"fileIndex" )}),
            type table[fileName=text, fileIndex=Int64.Type, Content=text]}
        }),
    #"Expanded filtered" = Table.ExpandTableColumn(#"Max fileIndex per filName", "filtered", {"fileIndex", "Content"})
in
    #"Expanded filtered"

ronrsnfld_0-1674184151630.png

 

 

 

@ronrsnfld  thanks for this, I am going to use this on a 12mil data source which has not been incorporated into any server yet.

 

Hence, I am looking for a working and optimized solution. Is it the most optimized solution? 

Did I answer your question? Mark my post as a solution!
Proud to be a Super User!
My custom visualization projects
Plotting Live Sound: Viz1
Beautiful News:Viz1, Viz2, Viz3
Visual Capitalist: Working Hrs

I don't know how optimal this is. Often things may be faster if you can get it done on the server. Maybe someone will contribute a different solution and you can test it.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors