shanebo3239
Helper I

Join Data on Date Range

I work for a customer service center where we have customers come into a lobby and are served with tickets (kind of like the old "take a number" system).  In our office, we have two big datasets.  The first is from our queueing system, and the second is from our cashiering system.  These systems are not linked in any way.  The queueing system has a table called history and the cashiering system has a table called transactions.

 

Here is what the queueing system's history table contains:

 

StartDate | EndDate | CustomerID | DeskID

09-14-2017 12:47 PM | 09-14-2017 1:02 PM | 101 | 12

09-14-2017 12:50 PM | 09-14-2017 1:16 PM | 102 | 15

 

The transactions table will contain something like this:

 

TransactionDate | AmountPaid | DeskID

09-14-2017 1:01 PM | 45.25 | 12

09-14-2017 1:14 PM | 102.60 | 15

 

Only one customer can make one transaction at a given DeskID at any one time.  Therefore, I should be able to join the data by checking the DeskID and the date range to determine which transaction belongs to which history entry.  That would result in something like this:

 

history.StartDate | history.EndDate | transaction.TransactionDate | transaction.AmountPaid | DeskID | history.CustomerID

 

I have this working somewhat in SQL (both of these datasets are on SQL Azure) by doing a JOIN, but it is SLOW, and I'd rather bring the datasets into Power BI natively and then let Power BI join them.  I just don't see a way of doing it.

 

Thoughts?

1 ACCEPTED SOLUTION
Eric_Zhang
Employee


@shanebo3239

You could implement similar JOIN logic in DAX by creating a calculated table.

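Table =
-- Pair every history row with every transaction, then keep only the pairs
-- where the desk matches and the transaction falls inside the visit window.
FILTER (
    CROSSJOIN (
        -- Rename the history columns so they do not clash with transactions[DeskID].
        SELECTCOLUMNS (
            history,
            "history.StartDate", history[StartDate],
            "history.EndDate", history[EndDate],
            "history.CustomerID", history[CustomerID],
            "history.DeskID", history[DeskID]
        ),
        transactions
    ),
    [history.DeskID] = transactions[DeskID]
        && transactions[TransactionDate] > [history.StartDate]
        && transactions[TransactionDate] <= [history.EndDate]
)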

However, I would be concerned about performance when the dataset is huge. I'd still suggest doing the JOIN in the Azure database. To improve performance there, create appropriate indexes and add a WHERE clause that narrows the date range, so less data has to be joined.
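
If the join does stay in DAX, the same range match can also be written per history row with GENERATE instead of filtering a full CROSSJOIN. The following is only a sketch under assumptions: the table name MatchedTransactions and the variable names DeskKey, RangeStart and RangeEnd are made up for illustration, the column names are taken from the sample tables in the question, and whether it is any faster than the CROSSJOIN version on a large model has not been measured.

```dax
MatchedTransactions =
GENERATE (
    -- Rename the history columns so they do not clash with transactions[DeskID].
    SELECTCOLUMNS (
        history,
        "history.StartDate", history[StartDate],
        "history.EndDate", history[EndDate],
        "history.CustomerID", history[CustomerID],
        "history.DeskID", history[DeskID]
    ),
    -- Evaluated once per history row: capture that row's desk and time window.
    VAR DeskKey = [history.DeskID]
    VAR RangeStart = [history.StartDate]
    VAR RangeEnd = [history.EndDate]
    RETURN
        -- Keep only the transactions at the same desk inside the window.
        FILTER (
            transactions,
            transactions[DeskID] = DeskKey
                && transactions[TransactionDate] > RangeStart
                && transactions[TransactionDate] <= RangeEnd
        )
)
```

As with the CROSSJOIN version, history rows that have no matching transaction are dropped from the result, because GENERATE only returns rows where the inner table is non-empty.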


2 REPLIES
Ashish_Mathur
Super User

Hi,

 

You can use Merge Queries.  These screenshots should help.

[Images: Untitled.png, Untitled1.png, Untitled2.png]


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
