Solved: Calculating new column using row values from same ...

Anonymous · ‎09-06-2018

Hello,

I have a LogTable listing all mutation events on some objects.

The row format is:

Object_ID: the object on which mutations are carried on

Mutation_Date: Timestamp of the mutation

Mutation_Parameter: the object parameter on which the mutation is done, e.g. Status, Urgency, Other

Old_Value: the value of the object parameter prior to the mutation

New_Value: the value of the object parameter after the mutation.

I would like to create a new column "Status_before_Mutation" showing the value of the status parameter prior to each mutation. In case of a status mutation, this is logically equal to the existing value in column Old_Value, but in case of other mutations, I need to consider the latest change of status prior to the time of the mutation being considered.

For each mutation, the logic I am pursuing is to select all rows with the same Object_ID, then filter rows with MutationParameter = "Status", then filter rows with Mutation_Date prior to Mutation_Date of the mutation being considered, sort the remaining rows in descending order of Date, pick the first one and extract the column vale "Old_Value" from this row.

Unfortunately, I get a circular dependency error. I guess I need to exclude the new column "Status_before_Mutation" from the CALCULATETABLE row context, which I thought I did with the ALLEXCEPT...

Here is the code leading to this circular dependency:

Status_before_Mutation = 
    CALCULATE(
        VALUES(LogTable[Old_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
                LogTable[Mutation_Parameter] = "Status";
                LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])
            );
            LogTable[Mutation_Date];
            DESC
        )
    )

Another potential issue is maybe the use of EARLIER, which I have read is not recommended for large tables.

The LogTable is quite big, up to several hundred thousands rows. I have several thousands different objects each with a number of mutations in the range [1...50].

Any idea on solving this, possible avoiding the use of EARLIER?

Thanks for any suggestion.

Karim

Anonymous · ‎09-07-2018

Hi Jimmy,

Thanks for your proposition. I have never used RANKX before....

In the meantime, I could get rid of the circular dependency problem I had by using DISTINCT instead of VALUES in my CALCULATE new column:

Status_after_Mutation = 
    CALCULATE(
        DISTINCT(LogTable[New_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
                LogTable[Mutation_Parameter] = "Status";
                LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])
            );
            LogTable[Mutation_Date];
            DESC
        )
    )

Also, I corrected the column to be "Status_after_Mutation" using the LogTable[New_Value].

Now I am getting what I want (see marked column below):

In the last column, I have also included what I got with your proposition:

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

I have two questions now:

- Your solution only provides the status for a status mutation. I tried to modify it without success, as I don't quite understand the use of RANKX within RANKX. How should I modify it to get the current status for all mutations on the same object?

- With my solution using TOPN on a big mutation table (> 200,000 rows), I get the following error from PowerBI
"There is not enough memory to complete this operation. Please try again later when there may be more memory available".
Is RANKX or another approach more memory efficient on big tables? Of course I could first split the table in smaller tables before computing the new columns but if I could avoid it with a better method, that would be interesting.

Thanks

Karim

View solution in original post

Anonymous · ‎09-06-2018

Addendum:

Based on the LogTable I gave as an example, here is the new column "Status_before_Mutation" I would like to get:

Object_ID	Mutation_Date	Mutation_Parameter	Old_Value	New_Value	Status_before_Mutation
Object_A	25.01.2018 17:38	Status		open
Object_A	25.01.2018 17:40	Urgency		1	open
Object_A	26.01.2018 04:15	Other		yy	open
Object_A	26.01.2018 07:30	Urgency	1	2	open
Object_A	29.01.2018 08:47	Status	open	work	open
Object_A	29.01.2018 17:54	Urgency	2	1	work
Object_A	30.01.2018 06:08	Other	yy	zz	work
Object_A	30.01.2018 06:53	Status	work	review	work
Object_A	31.01.2018 13:30	Status	review	closed	review
Object_B	24.01.2018 08:47	Status		open
Object_B	24.01.2018 08:47	Urgency		3	open
Object_B	26.01.2018 14:43	Urgency	3	2	open
Object_B	26.01.2018 15:02	Other		xx	open
Object_B	02.02.2018 15:38	Status	open	work	open
Object_B	07.02.2018 10:06	Urgency	2	1	work
Object_B	07.02.2018 10:06	Status	work	closed	work

v-yuta-msft · ‎09-07-2018

Hi Karim11,

Modify your calculate column using DAX below and check if it can meet your requirement:

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

Regards,

Jimmy Tao

Anonymous · ‎09-07-2018

Hi Jimmy,

Thanks for your proposition. I have never used RANKX before....

In the meantime, I could get rid of the circular dependency problem I had by using DISTINCT instead of VALUES in my CALCULATE new column:

Status_after_Mutation = 
    CALCULATE(
        DISTINCT(LogTable[New_Value]);
        TOPN(
            1;
            CALCULATETABLE(
                LogTable;
                ALLEXCEPT(LogTable; LogTable[Object_ID]);
                LogTable[Mutation_Parameter] = "Status";
                LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])
            );
            LogTable[Mutation_Date];
            DESC
        )
    )

Also, I corrected the column to be "Status_after_Mutation" using the LogTable[New_Value].

Now I am getting what I want (see marked column below):

In the last column, I have also included what I got with your proposition:

Status_before_Mutation = 
VAR Ranking = RANKX(FILTER(LogTable, LogTable[Object_ID] = EARLIER(LogTable[Object_ID]) && LogTable[Mutation_Date] <= EARLIER(LogTable[Mutation_Date])), RANKX(ALL(LogTable), LogTable[Mutation_Date]), , DESC, Dense)
RETURN
IF(Ranking = 1, LogTable[New_Value])

I have two questions now:

- Your solution only provides the status for a status mutation. I tried to modify it without success, as I don't quite understand the use of RANKX within RANKX. How should I modify it to get the current status for all mutations on the same object?

- With my solution using TOPN on a big mutation table (> 200,000 rows), I get the following error from PowerBI
"There is not enough memory to complete this operation. Please try again later when there may be more memory available".
Is RANKX or another approach more memory efficient on big tables? Of course I could first split the table in smaller tables before computing the new columns but if I could avoid it with a better method, that would be interesting.

Thanks

Karim

Calculating new column using row values from same table: circular dependency; avoiding EARLIER?

Helpful resources

Join us at the Microsoft Fabric Community Conference

Power BI Monthly Update - February 2025

Fabric Community Update - February 2025

How to Get Your Question Answered Quickly

Join us at the 2025 Microsoft Fabric Community Conference

Calculating new column using row values from same table: circular dependency; avoiding EARLIER?

Helpful resources

Join us at the Microsoft Fabric Community Conference

Power BI Monthly Update - February 2025

Fabric Community Update - February 2025

How to Get Your Question Answered Quickly