Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Philippfk
New Member

Remove values in specific cell based on duplicates in other columns

Hi everyone, 

 

I’m trying to solve the following problem in Power Query:
I have two tables — one contains all the costs, the other contains adjustments related to those costs. I want to merge these two tables because I need the values for costs, adjustments, and adjusted costs (= costs - adjustments) in a single table.

Each table includes, among other things, the columns Account Nr, Month, and Year. Normally, that would be sufficient to merge the tables, since the Account Nr is supposed to serve as a unique identifier for each month. However, in some months there are multiple identical account numbers in the adjustments table. As a result, when merging the tables, some rows get duplicated — because in the costs table, each account number only appears once per month.

These duplicate rows aren't really a problem — except for the Value column.

What I want to achieve now is to remove duplicate values from the Value column if the combination of Account Nr, Month, and Year also appears more than once.

Philippfk_0-1752494514588.png

 

What’s the best way to approach this? The table is also quite large, so performance matters.

Thanks in advance!

1 ACCEPTED SOLUTION
DataNinja777
Super User
Super User

Hi @Philippfk ,

 

The most efficient way to solve this in Power Query is to group the rows by Account Nr, Month, and Year, which bundles all related rows together. You can then add an index to each of these bundles, allowing you to identify the first row. With the first row identified, you can create a new column that keeps the Values amount only for that row and sets all others to null. This approach is highly performant, even on very large datasets, because it uses Power Query's optimized grouping engine.

 

To begin, after you have merged your tables, you will use the Group By function. Configure it to group by Account Nr, Month, and Year, and set the operation to All Rows with a new column name like AllData. This creates a nested table for each group. Next, add a custom column to introduce an index into each of these nested tables using the following M code formula. This step is crucial for identifying the unique first entry in each group.

Table.AddIndexColumn([AllData], "Index", 1, 1)

After creating the index, expand this new custom column to reveal all your original columns plus the new Index column. With the table expanded, you can now add a Conditional Column. Set the condition so that if the Index column equals 1, the output is the value from your Values column; otherwise, the output should be null. Finally, clean up your table by removing the original Values column and the temporary Index column, then rename your new conditional column to Values. This will leave you with the exact table structure you want.

 

For those who prefer using the Advanced Editor, you can paste the following M code after the step where you merged the tables. Be sure to replace PreviousStep with the actual name of your last step. This code performs the grouping, indexing, conditional logic, and cleanup in a single query block.

let
    // Assumes 'PreviousStep' is the name of your merged table
    GroupedRows = Table.Group(PreviousStep, {"Account Nr", "Month", "Year"}, {{"Data", each _, type table}}),

    // Add an index to each nested table to identify the first row
    AddIndex = Table.AddColumn(GroupedRows, "WithIndex", each Table.AddIndexColumn([Data], "Index", 1, 1)),

    // Remove the intermediate data column
    RemovedIntermediateCol = Table.RemoveColumns(AddIndex,{"Data"}),

    // Expand the indexed table
    ExpandedTable = Table.ExpandTableColumn(RemovedIntermediateCol, "WithIndex", {"Account Nr", "Month", "Year", "Values", "AdjustedValues", "Index"}),

    // Add the final conditional column for Values
    AddConditionalCol = Table.AddColumn(ExpandedTable, "Final Values", each if [Index] = 1 then [Values] else null, type number),

    // Clean up helper and old columns
    FinalCleanup = Table.SelectColumns(AddConditionalCol,{"Account Nr", "Month", "Year", "Final Values", "AdjustedValues"}),

    // Rename the column to its final name
    RenamedCol = Table.RenameColumns(FinalCleanup,{{"Final Values", "Values"}})
in
    RenamedCol

 

Best regards,

 

View solution in original post

3 REPLIES 3
DataNinja777
Super User
Super User

Hi @Philippfk ,

 

The most efficient way to solve this in Power Query is to group the rows by Account Nr, Month, and Year, which bundles all related rows together. You can then add an index to each of these bundles, allowing you to identify the first row. With the first row identified, you can create a new column that keeps the Values amount only for that row and sets all others to null. This approach is highly performant, even on very large datasets, because it uses Power Query's optimized grouping engine.

 

To begin, after you have merged your tables, you will use the Group By function. Configure it to group by Account Nr, Month, and Year, and set the operation to All Rows with a new column name like AllData. This creates a nested table for each group. Next, add a custom column to introduce an index into each of these nested tables using the following M code formula. This step is crucial for identifying the unique first entry in each group.

Table.AddIndexColumn([AllData], "Index", 1, 1)

After creating the index, expand this new custom column to reveal all your original columns plus the new Index column. With the table expanded, you can now add a Conditional Column. Set the condition so that if the Index column equals 1, the output is the value from your Values column; otherwise, the output should be null. Finally, clean up your table by removing the original Values column and the temporary Index column, then rename your new conditional column to Values. This will leave you with the exact table structure you want.

 

For those who prefer using the Advanced Editor, you can paste the following M code after the step where you merged the tables. Be sure to replace PreviousStep with the actual name of your last step. This code performs the grouping, indexing, conditional logic, and cleanup in a single query block.

let
    // Assumes 'PreviousStep' is the name of your merged table
    GroupedRows = Table.Group(PreviousStep, {"Account Nr", "Month", "Year"}, {{"Data", each _, type table}}),

    // Add an index to each nested table to identify the first row
    AddIndex = Table.AddColumn(GroupedRows, "WithIndex", each Table.AddIndexColumn([Data], "Index", 1, 1)),

    // Remove the intermediate data column
    RemovedIntermediateCol = Table.RemoveColumns(AddIndex,{"Data"}),

    // Expand the indexed table
    ExpandedTable = Table.ExpandTableColumn(RemovedIntermediateCol, "WithIndex", {"Account Nr", "Month", "Year", "Values", "AdjustedValues", "Index"}),

    // Add the final conditional column for Values
    AddConditionalCol = Table.AddColumn(ExpandedTable, "Final Values", each if [Index] = 1 then [Values] else null, type number),

    // Clean up helper and old columns
    FinalCleanup = Table.SelectColumns(AddConditionalCol,{"Account Nr", "Month", "Year", "Final Values", "AdjustedValues"}),

    // Rename the column to its final name
    RenamedCol = Table.RenameColumns(FinalCleanup,{{"Final Values", "Values"}})
in
    RenamedCol

 

Best regards,

 

HarishKM
Solution Sage
Solution Sage

@Philippfk  hey,
Can you do join in SQL data because it will be more efficient .

you can try something like this as below

 

WITH B_unique AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY key ORDER BY some_column) AS rn
FROM B
)
SELECT A.*, B_unique.*
FROM A
LEFT JOIN B_unique ON A.key = B_unique.key AND B_unique.rn = 1;

 

 

Thanks

Harish M

Please accept this as a solution if this solves your problem and give kudos as well.

lbendlin
Super User
Super User

Power Query is notoriously bad for things like this.  Have you considered implementing your logic in the data model instead?

 

Please provide sample data that covers your issue or question completely, in a usable format (not as a screenshot).
Do not include sensitive information. Do not include anything that is unrelated to the issue or question.
Please show the expected outcome based on the sample data you provided.

Need help uploading data? https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...
Want faster answers? https://community.fabric.microsoft.com/t5/Desktop/How-to-Get-Your-Question-Answered-Quickly/m-p/1447...

Helpful resources

Announcements
July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors