Anonymous
Not applicable

Calculate date difference between two consecutive rows grouped by column using python script

Hi all,

 

I am working on clinical data where I need to calculate sessions for patients from the date difference between consecutive rows. The script works in plain Python, but when I use the same script in Power BI's Power Query it gives the error below:

DataSource.Error: ADO.NET: Python script error.
<pi>TypeError: unsupported operand type(s) for -: 'str' and 'str'
</pi>
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
<pi>TypeError: unsupported operand type(s) for -: 'str' and 'str'
</pi>
ErrorCode=-2147467259
ExceptionType=Microsoft.PowerBI.Scripting

The original data is shown below:

data.PNG

The expected output is shown below:

CD.PNG

 

The Python script I used is:

# 'dataset' holds the input data for this script
import pandas as pd
import numpy as np
import datetime
dataset['Days_btw'] = dataset.groupby('PatientID')['SessionDate'].diff() / np.timedelta64(1, 'D')

Any help or suggestions are appreciated.

Thank You.
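
A note on the error itself: "unsupported operand type(s) for -: 'str' and 'str'" suggests that SessionDate reaches the Python script as text rather than as dates, so diff() has nothing it can subtract. A minimal sketch of a possible fix, assuming the strings are in a format pd.to_datetime can parse (the sample rows below are hypothetical stand-ins for the 'dataset' that Power BI passes in):

```python
import pandas as pd

# Hypothetical stand-in for the 'dataset' DataFrame supplied by Power BI.
# The dates are plain strings here, which is the condition the TypeError points to.
dataset = pd.DataFrame({
    "PatientID": [1, 1, 1, 2, 2],
    "SessionDate": ["2021-01-01", "2021-01-05", "2021-01-12",
                    "2021-02-01", "2021-02-03"],
})

# Parse the strings into datetime64 values so that subtraction is defined.
dataset["SessionDate"] = pd.to_datetime(dataset["SessionDate"])

# Sort so that diff() compares each session with the previous one of the same patient.
dataset = dataset.sort_values(["PatientID", "SessionDate"]).reset_index(drop=True)

# Days since the previous session; the first session of each patient gets 0.
dataset["Days_btw"] = (
    dataset.groupby("PatientID")["SessionDate"]
    .diff()
    .dt.days
    .fillna(0)
    .astype(int)
)
```

In the actual Power BI step the sample DataFrame would be dropped and the remaining lines applied to the provided 'dataset'.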

1 ACCEPTED SOLUTION
MFelix
Super User

Hi @Anonymous ,

 

You can do this using two different approaches: Power Query or DAX.

 

Power Query

  • Sort the table by ID and by Date
  • Add an index column
  • Add the following column to your model:
try if [PatientID] = #"Added Index"{[Index]-1}[PatientID] then  [SessionDate] - #"Added Index"{[Index]-1}[SessionDate] else 0 otherwise 0

 

Result and complete code below:

MFelix_0-1639995221388.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bc7bCcAwDEPRXfwdiKXmOUvI/mskpaW0Rb8HG90xjCVZMNToOdIJm+HBprALpCuEwqQwKywKVSd/nQ1n5x7CRvQ3FoFOgTwUNvWeL/ysV3XYdRH8xrkA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [PatientID = _t, SessionDate = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"SessionDate", type date}, {"PatientID", Int64.Type}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"PatientID", Order.Ascending}, {"SessionDate", Order.Ascending}}),
    #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "Days_btw", each try if [PatientID] = #"Added Index"{[Index]-1}[PatientID] then  [SessionDate] - #"Added Index"{[Index]-1}[SessionDate] else 0 otherwise 0),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Days_btw", Int64.Type}})
in
    #"Changed Type1"
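
For readers who prefer to stay in a Python step, the same sort / previous-row / compare logic can be sketched in pandas. This is only an illustration of the steps above, not part of the original answer; the sample rows are hypothetical, and shift(1) plays the role of the [Index]-1 lookup.

```python
import pandas as pd

# Hypothetical sample rows, deliberately unsorted.
df = pd.DataFrame({
    "PatientID": [2, 1, 1, 2, 1],
    "SessionDate": pd.to_datetime(
        ["2021-02-03", "2021-01-05", "2021-01-01", "2021-02-01", "2021-01-12"]
    ),
})

# Step 1: sort by patient and date (the "Sorted Rows" step).
df = df.sort_values(["PatientID", "SessionDate"]).reset_index(drop=True)

# Step 2: look at the previous row (the equivalent of the [Index]-1 lookup).
prev_patient = df["PatientID"].shift(1)
prev_date = df["SessionDate"].shift(1)

# Step 3: subtract only when the previous row belongs to the same patient, else 0.
same_patient = df["PatientID"] == prev_patient
gap_days = (df["SessionDate"] - prev_date).dt.days
df["Days_btw"] = gap_days.where(same_patient, 0).fillna(0).astype(int)
```

The first row has no predecessor, so it falls through to 0, much as the try ... otherwise 0 does in the M expression.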

 

DAX

  • Add a calculated column with the following code:
Days_btw_dax = 
COALESCE (
    DATEDIFF (
        CALCULATE (
            MAX ( 'Table (3)'[SessionDate] ),
            FILTER (
                ALL ( 'Table (3)'[PatientID],'Table (3)'[SessionDate] ),
                'Table (3)'[PatientID] = EARLIER ( 'Table (3)'[PatientID] )
                    && 'Table (3)'[SessionDate] < EARLIER ( 'Table (3)'[SessionDate] )
            )
        ),
        'Table (3)'[SessionDate],
        DAY
    ),
    0
)
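
To make the logic of the measure easier to follow, here is a rough Python rendering of what the calculated column does for each row: find the latest strictly earlier session of the same patient, take the day difference, and fall back to 0 when there is none. It is an illustrative sketch with hypothetical data and a hypothetical function name, not a translation that Power BI would run.

```python
from datetime import date

# Hypothetical (PatientID, SessionDate) rows; patient 1 has a repeated date.
rows = [
    (1, date(2021, 1, 1)),
    (1, date(2021, 1, 5)),
    (1, date(2021, 1, 5)),
    (2, date(2021, 2, 1)),
    (2, date(2021, 2, 3)),
]

def days_since_previous(rows):
    """For each row, days since the latest strictly earlier session of the same patient."""
    result = []
    for patient, session in rows:
        # FILTER: same patient and an earlier date than the current row.
        earlier = [d for p, d in rows if p == patient and d < session]
        # MAX + DATEDIFF in days, with COALESCE-style fallback to 0.
        result.append((session - max(earlier)).days if earlier else 0)
    return result

days_btw = days_since_previous(rows)
```

Because the filter is "strictly earlier", two sessions on the same date both get the gap to the preceding distinct date, whereas a row-by-row comparison with the previous row would give 0 for the second one. The loop is quadratic in the number of rows, which is fine for a sketch but not for large tables.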

 

MFelix_1-1639996179799.png

 


Regards

Miguel Félix

