Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
Underscore
New Member

Assign batch number according to consecutive occurrence of a value in a column and group them

This is the snapshot of raw data of my data source (up to 100k rows)

IndexCategoryOutput 1Output 2
1A11
2A11
3A11
4A11
5A11
6A11
7B11
8B11
9B11
10B11
11B11
12B11
13B11
14B11
15B10
16B10
17B10
18B10
19B10

 

How do I achieve this in Power Query where I need to assign batch number based on the condition consecutive 3 rows of A. This means Batch #1 starts at Index #4 based on the sample data above. And after the next 3 consecutive rows of A, Batch number will increment by 1 (i.e Batch #2). The final result should count and sum it according to the batch number:

BatchABOutput 1Output 2
13131611



1 ACCEPTED SOLUTION
AnkitKukreja
Super User
Super User

Hi! @Underscore 

 Try to use this m code and see if you get the desired result.

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
AddBatchStartFlag = Table.AddColumn(AddIndex, "BatchStartFlag", each
if [Category] = "A" and Number.Mod([Index] - 1, 3) = 0 then 1 else 0, Int64.Type),
AddBatchNumber = Table.AddColumn(AddBatchStartFlag, "Batch", each
let
currentBatchStart = List.Sum(List.FirstN(AddBatchStartFlag[BatchStartFlag], [Index])),
currentBatchNumber = if [Category] = "A" then currentBatchStart - 1 else currentBatchStart - 1
in
currentBatchNumber, Int64.Type),
RemoveBatchStartFlag = Table.RemoveColumns(AddBatchNumber, {"BatchStartFlag"})
in
RemoveBatchStartFlag

 

 

 

For Power BI trainings or support dm or reach out to me on LinkedIn.
If my response has successfully addressed your question or concern, I kindly request that you mark this post as resolved. Additionally, if you found my assistance helpful, a thumbs-up would be greatly appreciated.

Thanks,
Ankit Kukreja
www.linkedin.com/in/ankit-kukreja1904
https://topmate.io/ankit_kukreja

View solution in original post

6 REPLIES 6
dufoq3
Super User
Super User

Hi @Underscore, different approach here:

 

Output:

dufoq3_0-1725548455891.png

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
    AddedIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
    GroupedRows = Table.Group(AddedIndex, {"Category", "Index"}, {{"All", each Table.RemoveColumns(_, {"Index"}), type table}}, GroupKind.Local,
        (x,y)=> Number.From( List.ContainsAll({"A"}, {x[Category], y[Category]}) and y[Index] - x[Index] >= 3 ) ),
    Ad_Batch = Table.AddIndexColumn(GroupedRows, "Batch", 0, 1, Int64.Type),
    RemovedOtherColumns = Table.SelectColumns(Ad_Batch,{"All", "Batch"}),
    ExpandedAll = Table.ExpandTableColumn(RemovedOtherColumns, "All", Table.ColumnNames(Source))
in
    ExpandedAll

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

AnkitKukreja
Super User
Super User

Hi! @Underscore 

 Try to use this m code and see if you get the desired result.

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
AddBatchStartFlag = Table.AddColumn(AddIndex, "BatchStartFlag", each
if [Category] = "A" and Number.Mod([Index] - 1, 3) = 0 then 1 else 0, Int64.Type),
AddBatchNumber = Table.AddColumn(AddBatchStartFlag, "Batch", each
let
currentBatchStart = List.Sum(List.FirstN(AddBatchStartFlag[BatchStartFlag], [Index])),
currentBatchNumber = if [Category] = "A" then currentBatchStart - 1 else currentBatchStart - 1
in
currentBatchNumber, Int64.Type),
RemoveBatchStartFlag = Table.RemoveColumns(AddBatchNumber, {"BatchStartFlag"})
in
RemoveBatchStartFlag

 

 

 

For Power BI trainings or support dm or reach out to me on LinkedIn.
If my response has successfully addressed your question or concern, I kindly request that you mark this post as resolved. Additionally, if you found my assistance helpful, a thumbs-up would be greatly appreciated.

Thanks,
Ankit Kukreja
www.linkedin.com/in/ankit-kukreja1904
https://topmate.io/ankit_kukreja
dufoq3
Super User
Super User

Provide better explanation with expected result please.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Omid_Motamedise
Memorable Member
Memorable Member

The question is not clear to me, can you provide your desired result?

AnkitKukreja
Super User
Super User

Hi! @Underscore 
Just to clarify one thing why do you say Batch 1 has count 3 for A? do you want to exclud first 3 rows for category - A ?
Try using the group by feature in power query for this.

 

 

For Power BI trainings or support dm or reach out to me on LinkedIn.
If my response has successfully addressed your question or concern, I kindly request that you mark this post as resolved. Additionally, if you found my assistance helpful, a thumbs-up would be greatly appreciated.

Thanks,
Ankit Kukreja
www.linkedin.com/in/ankit-kukreja1904
https://topmate.io/ankit_kukreja

Hi @AnkitKukreja , 

The first 3 rows of A is considered as previous batch (or Batch 0 in this case), so I exclude them out.

I need to separate them by batch, grouping them will just group all into 2 categories which is A and B which I do not want.

New example: 
Row 1 to 3: Category A (Batch 0)
Row 4 to 6: Category A (Batch 1) (New batch after 3 consecutive A at row 1-3)
Row 7 to 15: Category B (Batch 1)
Row 16 to 18: Category A (Batch 1)
Row 19 to 21: Category A (Batch 2) (New batch after 3 consecutive A at row 16-18)
Row 21 to 60: Category B (Batch 2) 
and so on.

Generally 3 consecutive rows of Category A act as a batch separator.

I can do this in Excel, but I'm not sure how to do it in Power Query. Doing it in Excel is slow as it involves 100k rows and it's growing.

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.