Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
This is the snapshot of raw data of my data source (up to 100k rows)
Index | Category | Output 1 | Output 2 |
1 | A | 1 | 1 |
2 | A | 1 | 1 |
3 | A | 1 | 1 |
4 | A | 1 | 1 |
5 | A | 1 | 1 |
6 | A | 1 | 1 |
7 | B | 1 | 1 |
8 | B | 1 | 1 |
9 | B | 1 | 1 |
10 | B | 1 | 1 |
11 | B | 1 | 1 |
12 | B | 1 | 1 |
13 | B | 1 | 1 |
14 | B | 1 | 1 |
15 | B | 1 | 0 |
16 | B | 1 | 0 |
17 | B | 1 | 0 |
18 | B | 1 | 0 |
19 | B | 1 | 0 |
How do I achieve this in Power Query where I need to assign batch number based on the condition consecutive 3 rows of A. This means Batch #1 starts at Index #4 based on the sample data above. And after the next 3 consecutive rows of A, Batch number will increment by 1 (i.e Batch #2). The final result should count and sum it according to the batch number:
Batch | A | B | Output 1 | Output 2 |
1 | 3 | 13 | 16 | 11 |
Solved! Go to Solution.
Hi! @Underscore
Try to use this m code and see if you get the desired result.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
AddBatchStartFlag = Table.AddColumn(AddIndex, "BatchStartFlag", each
if [Category] = "A" and Number.Mod([Index] - 1, 3) = 0 then 1 else 0, Int64.Type),
AddBatchNumber = Table.AddColumn(AddBatchStartFlag, "Batch", each
let
currentBatchStart = List.Sum(List.FirstN(AddBatchStartFlag[BatchStartFlag], [Index])),
currentBatchNumber = if [Category] = "A" then currentBatchStart - 1 else currentBatchStart - 1
in
currentBatchNumber, Int64.Type),
RemoveBatchStartFlag = Table.RemoveColumns(AddBatchNumber, {"BatchStartFlag"})
in
RemoveBatchStartFlag
Hi @Underscore, different approach here:
Output:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
AddedIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
GroupedRows = Table.Group(AddedIndex, {"Category", "Index"}, {{"All", each Table.RemoveColumns(_, {"Index"}), type table}}, GroupKind.Local,
(x,y)=> Number.From( List.ContainsAll({"A"}, {x[Category], y[Category]}) and y[Index] - x[Index] >= 3 ) ),
Ad_Batch = Table.AddIndexColumn(GroupedRows, "Batch", 0, 1, Int64.Type),
RemovedOtherColumns = Table.SelectColumns(Ad_Batch,{"All", "Batch"}),
ExpandedAll = Table.ExpandTableColumn(RemovedOtherColumns, "All", Table.ColumnNames(Source))
in
ExpandedAll
Hi! @Underscore
Try to use this m code and see if you get the desired result.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIE41gdavGcBpQ3/Hw0yhvkvFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, #"Output 1" = _t, #"Output 2" = _t]),
AddIndex = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
AddBatchStartFlag = Table.AddColumn(AddIndex, "BatchStartFlag", each
if [Category] = "A" and Number.Mod([Index] - 1, 3) = 0 then 1 else 0, Int64.Type),
AddBatchNumber = Table.AddColumn(AddBatchStartFlag, "Batch", each
let
currentBatchStart = List.Sum(List.FirstN(AddBatchStartFlag[BatchStartFlag], [Index])),
currentBatchNumber = if [Category] = "A" then currentBatchStart - 1 else currentBatchStart - 1
in
currentBatchNumber, Int64.Type),
RemoveBatchStartFlag = Table.RemoveColumns(AddBatchNumber, {"BatchStartFlag"})
in
RemoveBatchStartFlag
The question is not clear to me, can you provide your desired result?
Hi! @Underscore
Just to clarify one thing why do you say Batch 1 has count 3 for A? do you want to exclud first 3 rows for category - A ?
Try using the group by feature in power query for this.
Hi @AnkitKukreja ,
The first 3 rows of A is considered as previous batch (or Batch 0 in this case), so I exclude them out.
I need to separate them by batch, grouping them will just group all into 2 categories which is A and B which I do not want.
New example:
Row 1 to 3: Category A (Batch 0)
Row 4 to 6: Category A (Batch 1) (New batch after 3 consecutive A at row 1-3)
Row 7 to 15: Category B (Batch 1)
Row 16 to 18: Category A (Batch 1)
Row 19 to 21: Category A (Batch 2) (New batch after 3 consecutive A at row 16-18)
Row 21 to 60: Category B (Batch 2)
and so on.
Generally 3 consecutive rows of Category A act as a batch separator.
I can do this in Excel, but I'm not sure how to do it in Power Query. Doing it in Excel is slow as it involves 100k rows and it's growing.
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
27 | |
17 | |
11 | |
9 | |
6 |
User | Count |
---|---|
48 | |
30 | |
21 | |
18 | |
11 |