Solved: Matching single transactions when multiple matches...

ArturasT · ‎11-26-2019

Hey all,

First time posting on these forums, so apologies if I end up doing anything wrong 😐

I have a single table from my accounting software with all of my company's issued invoices and received payments, and I am trying to create a query to match them. I do not have unique identifiers, so I am using approximations (date / amount, etc.). I've got some of my matching logic in place, and have successfully done what I set out to do - except for all of the duplicates that occur when using such common variables as date and amount...

Here is an example of the data set I am starting with:

Document ID	Document type	Invoice date	Invoice amount
Invoice A	Invoice	2019-10-01	100
Invoice B	Invoice	2019-10-01	100
Invoice C	Invoice	2019-10-01	100
Invoice D	Invoice	2019-10-01	100
Invoice E	Invoice	2019-10-01	100
Payment 1	Payment	2019-10-15	100
Payment 2	Payment	2019-10-07	100
Payment 3	Payment	2019-10-01	100
Payment 4	Payment	2019-10-01	100
Payment 5	Payment	2019-10-01	100

I queried and split this table into two queries, and joined them again based date and amount combinations. There are 5 invoices on 2019-10-01, and 3 payments on 2019-10-01. This results in 15 total rows, as each invoice is assigned 3 payments - no good for me, as I need their to be only 1 payment for each invoice, and it has to be a different payment every time (so 1 invoice can't have 3 payments, and 1 payment can't be applied to 5 invoices). Which Invoice ID is assigned which Payment ID does not matter. So long as they are matched, any combination is equally viable (so Invoice A - Payment 5 is just as good as Invoice A - Payment 3). Here is what I am getting, and which rows would need to be kept:

Invoice ID	Document type	Invoice amount	Payment number	Document type	Payment amount	Comment on result
Invoice A	Invoice	100	Payment 3	Payment	100	This row is good - unmatched invoice and unmatched payment
Invoice A	Invoice	100	Payment 4	Payment	100	Invoice A has already been matched, row not needed and payment needs to remain "unmatched"
Invoice A	Invoice	100	Payment 5	Payment	100	Invoice A has already been matched, row not needed and payment needs to remain "unmatched"
Invoice B	Invoice	100	Payment 3	Payment	100	Payment 3 has already been matched with Invoice A, it cannot be used here
Invoice B	Invoice	100	Payment 4	Payment	100	This row is good - unmatched invoice and unmatched payment
Invoice B	Invoice	100	Payment 5	Payment	100	Invoice B has already been matched, row not needed and payment needs to remain "unmatched"
Invoice C	Invoice	100	Payment 3	Payment	100	Payment 3 has already been matched with Invoice A, it cannot be used here
Invoice C	Invoice	100	Payment 4	Payment	100	Payment 4 has already been matched with Invoice B, it cannot be used here
Invoice C	Invoice	100	Payment 5	Payment	100	This row is good - unmatched invoice and unmatched payment
Invoice D	Invoice	100	Payment 3	Payment	100	No unmatched payments remaining
Invoice D	Invoice	100	Payment 4	Payment	100	No unmatched payments remaining
Invoice D	Invoice	100	Payment 5	Payment	100	No unmatched payments remaining
Invoice E	Invoice	100	Payment 3	Payment	100	No unmatched payments remaining
Invoice E	Invoice	100	Payment 4	Payment	100	No unmatched payments remaining
Invoice E	Invoice	100	Payment 5	Payment	100	No unmatched payments remaining

Any advice on how to achieve this? Duplicate filtering doesn't really fit my need, as it is too restrictive (for example, the match Invoice C - Payment 5 is a duplicate on both Invoice ID column and Payment ID column, but it should be kept because it is the first "unmatched" occurence of both IDs).

Best I have come up with so far is several loops of duplicate filtering on the base query, where I take the rows with first occurence of both ID fields for this query, duplicate it, filter those unique values out of my duplicate query and then run the duplicate. It then gives me some additional unique results, and I can combine them with my first query to get some more hits. But this creates a lot of extra queries, and in my dataset I have cases of 10 or more duplicates for the same date & amount combination, so I am looking for a simpler and more effective method.

v-frfei-msft · ‎11-26-2019

Hi @ArturasT ,

We can insert an index column in table 2 and achieve that by DAX.

Table 2 =
VAR k =
    ADDCOLUMNS (
        'Table (2)',
        "filter",
        VAR ind = 'Table (2)'[Index]
        VAR sn =
            CALCULATE (
                DISTINCTCOUNT ( 'Table (2)'[Invoice ID] ),
                FILTER ( ALL ( 'Table (2)' ), 'Table (2)'[Index] <= ind )
            )
        VAR no =
            CALCULATE (
                DISTINCTCOUNT ( 'Table'[Document ID] ),
                FILTER (
                    'Table',
                    'Table'[Document type] = "Payment"
                        && 'Table'[Invoice date] = DATE ( 2019, 10, 01 )
                )
            )
        VAR index2 = ( sn - 1 ) * ( no + 1 ) + 1
        RETURN
            IF ( sn <= no && 'Table (2)'[Index] = index2, 1, BLANK () )
    )
RETURN
    FILTER ( k, [filter] = 1 )

For more details, please check the pbix as attached.

Community Support Team _ Frank
If this post helps, then please consider Accept it as the solution to help the others find it more quickly.

View solution in original post

v-frfei-msft · ‎11-26-2019

Hi @ArturasT ,

We can insert an index column in table 2 and achieve that by DAX.

Table 2 =
VAR k =
    ADDCOLUMNS (
        'Table (2)',
        "filter",
        VAR ind = 'Table (2)'[Index]
        VAR sn =
            CALCULATE (
                DISTINCTCOUNT ( 'Table (2)'[Invoice ID] ),
                FILTER ( ALL ( 'Table (2)' ), 'Table (2)'[Index] <= ind )
            )
        VAR no =
            CALCULATE (
                DISTINCTCOUNT ( 'Table'[Document ID] ),
                FILTER (
                    'Table',
                    'Table'[Document type] = "Payment"
                        && 'Table'[Invoice date] = DATE ( 2019, 10, 01 )
                )
            )
        VAR index2 = ( sn - 1 ) * ( no + 1 ) + 1
        RETURN
            IF ( sn <= no && 'Table (2)'[Index] = index2, 1, BLANK () )
    )
RETURN
    FILTER ( k, [filter] = 1 )

For more details, please check the pbix as attached.

Community Support Team _ Frank
If this post helps, then please consider Accept it as the solution to help the others find it more quickly.