Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Harshi_bi
Frequent Visitor

Need help, on Count if row value repeats, Count number of times value repeats in column

My dataset is in direct query mode and i wanted to achieve below requirement using power query.

In PowerQuery I want to add a custom column that evaluates how often a value repeats. I want to do this in PowerQuery, not DAX.

For example, if same combination of column 1 and column 2 repeats then the new custom column(Count) would show "2" in each row. The below screen shot is what I'm trying to create where Column A exists and Column B is the column that does the countif.

I realize I could do a grouping and count, but I need to maintain the detail. I also don't want to create another group table and then merge it back

Harshi_bi_0-1672045442013.png

 

 

14 REPLIES 14
AlB
Super User
Super User

@Harshi_bi 

If by "retrieve" you mean you want to keep the row with the non-blank extension id and discard the other, you can just sort by request id, plant and extension id  in descending order, to make sure that the row with the non-blank extension id stays above the on with the blank extension id , then remove duplicates on request id, plant. Power Query will keep the first one it encounters going down, i.e. the one with the non-blank request id

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

     

Mahesh0016
Super User
Super User

@Harshi_bi ,
Can you share sample data and sample output in table format? Or a sample pbix after removing sensitive data.

 

@Mahesh0016 

I am not able to attach the sample data, but i tried using custom column and it's not allowed as my connect is "Direct Query" and grouping is also not woking. 
I have 2 tables, Table1 - request ID, plant, active stage etc. where as table 2 - request Id and plants which are not present in Table1. So i appended these 2 queries to get all plants related to each request.
Now, there is a extension id related to only completed request and plant from table1.
When request ID and plant are same then I need to retrieve row which has extension id. In below example i need second row for ID- 3000.

Harshi_bi_0-1672122195938.png

 

Mahesh0016
Super User
Super User

@Harshi_bi ,

>>Add custom Column

Mahesh0016_0-1672048907562.png

=Table.RowCount(Table.SelectRows(#"Changed Type", (inner)=> [Column1]=inner[Column1] and [Column2]=inner[Column2]))

AlB
Super User
Super User

Hi @Harshi_bi 

Here's an alternative to the grouping and merging you do not want. Place the following M code in a blank query to see the steps.

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8istUYrVQec5OjkDeW5FpZn4+F6pOTmVSLqdE/NSgPxYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),

#"Added Custom" = Table.AddColumn(#"Changed Type", "Count", each Table.RowCount(Table.SelectRows(#"Changed Type", (inner)=> [Column1]=inner[Column1] and [Column2]=inner[Column2])))

in
    #"Added Custom"

 

Do bear in mind though than grouping and merging will be faster, particularly if your table is large:

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8istUYrVQec5OjkDeW5FpZn4+F6pOTmVSLqdE/NSgPxYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),

#"Grouped Rows" = Table.Group(#"Changed Type", {"Column1", "Column2"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type", {"Column1", "Column2"}, #"Grouped Rows", {"Column1", "Column2"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Count"}, {"Count"})

in

#"Expanded Grouped Rows"

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Harshi_bi
Frequent Visitor

Hi @Mahesh0016 - Thank you for the suggestion, grouping is not giving me the desired result. Also adding custom column as above is too slow. How to apply below code on two columns can you suggest.

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVorVQdCOTs6YdCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Repeats", each let
thisrow = [Column1],
templist = List.Buffer(#"Changed Type"[Column1])
in
List.Count(List.Select(templist, each _ = thisrow)))
in
#"Added Custom"

@Harshi_bi 

You are responding to my post but tagging somebody else. Who's is the question for?

You can extend that reasoning to two columns as below but I do not believe it will deliver any significant speed improvement over the approach based on the custom column I shared earlier

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8istUYrVQec5OjkDeW5FpZn4+F6pOTmVSLqdE/NSgPxYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),

#"Added Custom" = Table.AddColumn(#"Changed Type", "Repeats", each let
thisrowCol1 = [Column1],
thisrowCol2 = [Column2],
templist = List.Buffer(List.Zip({#"Changed Type"[Column1], #"Changed Type"[Column2]}))
in
List.Count(List.Select(templist, each _{0} = thisrowCol1 and _{1} = thisrowCol2 )), Int64.Type)
in
#"Added Custom"

 

 

SU18_powerbi_badge

Please accept the solution when done and consider giving a thumbs up if posts are helpful. 

Contact me privately for support with any larger-scale BI needs, tutoring, etc.

 

Mahesh0016
Super User
Super User

@Harshi_bi 

Step by step:

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtJR8istUYrVQec5OjkDeW5FpZn4+F6pOTmVSLqdE/NSgPxYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Column1"}, {{"Count", each Table.RowCount(_), type number}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Column1"},#"Grouped Rows",{"Column1"},"Grouped Rows",JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Count"}, {"Grouped Rows.Count"})
in
#"Expanded Grouped Rows"

 

 

** If this post helps, please consider accept as solution to help other members find it more quickly **

 

Hi @Mahesh0016 
 I tried using grouping as above and while doing the expand step i got the below error. But the table contains only 2300 rows. Any idea?

Harshi_bi_0-1672056503709.png

 

@Harshi_bi share advance editor Query .

@Mahesh0016 

let
Source = cmp_material,
#"Grouped Rows" = Table.Group(Source, {"cmp_name", "cmp_locationidname", "LocalExt Name", "LocExtStatuscodename"}, {{"Count", each _, type table [cmp_materialid=text, createdon=nullable datetime, ownerid=text, owneridname=text, statecodename_GB=nullable text, statuscodename=nullable text, cmp_name=nullable text, cmp_activestageid=nullable text, cmp_lastmovementdate=nullable datetime, cmp_locationid=nullable text, cmp_materialflowid=nullable text, cmp_materialtypeid=nullable text, cmp_movementdirection=nullable number, cmp_movementdirectionname=nullable text, cmp_newcodedescription=nullable text, cmp_projectid=nullable text, cmp_projecttypeid=nullable text, cmp_requestingcountryid=nullable text, cmp_sapcode=nullable text, cmp_subprojectid=nullable text, cmp_locationidname=nullable text, cmp_materialflowidname=nullable text, cmp_materialtypeidname=nullable text, cmp_projecttypeidname=nullable text, cmp_requestingcountryidname=nullable text, cmp_subprojectidname=nullable text, cmp_activestageidname=nullable text, cmp_bpoownerid=nullable text, cmp_bpoowneridname=nullable text, Requestor email=nullable text, BPO owner email=nullable text, cmp_validatoremail=nullable text, Stage Owner=nullable text, LocExtStatuscodename=nullable text, LocalExt Name=nullable text, Extension Name=nullable text, cmp_needhelprecipients=nullable text, description=nullable text, NeedHelp CreatedOn=nullable datetime, NeedHelp Owner=nullable text, task statecodename=nullable text, cmp_needhelpsectiongcidname=nullable text, Today=nullable datetime, Need Help Recipients=nullable text, Owner of Pending Activity=nullable text, Need Help Description=nullable text, Need Help Created On=nullable datetime, Need Help Requestor=nullable text, Need Help statecodename=nullable text, Duration=duration, Open Section=nullable text, URL=nullable text, KeyColumn=nullable text]}}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Count"}),
#"Appended Query" = Table.Combine({#"Removed Columns", cmp_materialprimaryplant}),
#"Merged Queries" = Table.NestedJoin(#"Appended Query", {"cmp_name"}, cmp_material, {"cmp_name"}, "cmp_material", JoinKind.LeftOuter),
#"Expanded cmp_material" = Table.ExpandTableColumn(#"Merged Queries", "cmp_material", {"cmp_activestageidname"}, {"cmp_activestageidname"}),
#"Removed Duplicates" = Table.Distinct(#"Expanded cmp_material"),
#"Added Conditional Column" = Table.AddColumn(#"Removed Duplicates", "Plant", each if [cmp_activestageidname] = "Complete" then [cmp_locationidname] else null),
#"Changed Type" = Table.TransformColumnTypes(#"Added Conditional Column",{{"Plant", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Extension Name", each if ([cmp_activestageidname] = "Complete") and ([LocExtStatuscodename] <> "Unnecessary request") then [LocalExt Name] else ""),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Extension Name", type text}}),
#"Grouped Rows1" = Table.Group(#"Changed Type1", {"cmp_name", "Plant"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
#"Merged Queries1" = Table.NestedJoin(#"Changed Type1", {"cmp_name", "Plant"}, #"Grouped Rows1", {"cmp_name", "Plant"}, "Grouped Rows1", JoinKind.LeftOuter),
#"Expanded Grouped Rows1" = Table.ExpandTableColumn(#"Merged Queries1", "Grouped Rows1", {"Count"}, {"Grouped Rows1.Count"})
in
#"Expanded Grouped Rows1"

@Harshi_bi just simple add custom column for use below code.
=Table.RowCount(Table.SelectRows(#"Changed Type", (inner)=> [Column1]=inner[Column1] and [Column2]=inner[Column2]))

 

>> I hope you get it.

Hi Mahesh,

 

Thanks for the help, but as mentioned in example, i need the combination of column1 and column 2 repeated.

Harshi_bi_0-1672047950005.png

 

Mahesh0016_0-1672055595681.png

@Harshi_bi I hope this helps You!

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors