uvil
Resolver I

SSIS Lookup No Match Output

Good Afternoon, 

 

I started working with Microsoft Fabric a few days ago. In general it is intuitive and easy to explore. After a few days of experimenting, I wanted to replicate an SSIS package that we run every night to copy all the jobs done during the day and insert them into our data warehouse.

 

The problem is that I can't find an equivalent of the SSIS Lookup transformation that I was using. Its function is simple. Every night I run a SELECT statement that gets all the jobs from the last 30 days, because some jobs can be entered manually after the fact. In SSIS I use a Lookup with the No Match Output to check whether each row key already exists in the destination, and insert the row only if it does not. For example, an employee might manually enter a job done 5 days ago because the mobile app was down.

 


 

In the Dataflow, I can't see anything that would help me avoid inserting duplicates, since my SELECT looks 30 days back just to pick up manual entries.

 

Is there any way to do this, or do I need to change my way of thinking about this process?

 

For the moment it seems that SSIS is more complete than ADF in this respect... Maybe I'm missing something.

 

Any help is welcome,

 

Best Regards,

2 ACCEPTED SOLUTIONS
frithjof_v
Community Champion

I'm not sure I followed everything in your description, but I am guessing you want to use the Dataflow Gen2 to insert only new rows into the Data Warehouse.

 

If so, I think it can be done the following way:

 

Bring in both your source system ("jobs") and the data warehouse table as two separate queries into your Dataflow Gen2 (i.e. Get Data).

 

Filter the data from the data warehouse table to only query the last 30 days (as per your description).

 

Next, I am assuming you have a Job ID column or another column/set of columns which uniquely identify each row.

 

Do an anti join between your source system query and the data warehouse query. The anti join will ensure that any rows in the source system query, which are already in the warehouse, will be filtered out of the source system query. 

 

Then write the remaining rows to the data warehouse table. Use the warehouse table as the data destination and choose the Append method.
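
The anti join step can be illustrated outside Power Query. The following is only a minimal Python sketch of the logic, not Dataflow Gen2 code; the column names (`job_id`, `employee`, `hours`) and the sample rows are made up for illustration. In the Power Query editor, the equivalent is typically a Merge Queries step with the "Left Anti" join kind.

```python
def left_anti_join(source_rows, warehouse_rows, key_columns):
    """Return the source rows whose key does not appear in warehouse_rows."""
    # Build the set of keys that already exist in the warehouse.
    existing_keys = {
        tuple(row[col] for col in key_columns) for row in warehouse_rows
    }
    # Keep only source rows with no match (the "No Match Output" in SSIS terms).
    return [
        row for row in source_rows
        if tuple(row[col] for col in key_columns) not in existing_keys
    ]


# Hypothetical sample data: jobs 101 and 102 are already in the warehouse.
source = [
    {"job_id": 101, "employee": "ana", "hours": 3.0},
    {"job_id": 102, "employee": "luis", "hours": 1.5},
    {"job_id": 103, "employee": "ana", "hours": 2.0},
]
warehouse = [
    {"job_id": 101, "employee": "ana", "hours": 3.0},
    {"job_id": 102, "employee": "luis", "hours": 1.5},
]

new_rows = left_anti_join(source, warehouse, ["job_id"])
print([row["job_id"] for row in new_rows])  # prints [103]
```

Passing a list of key columns lets the same function cover the case where several columns together identify a row.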


Good Afternoon, 

 

Yes, I finally did it inside the Dataflow, as you suggested:

 

1 - In Power Query (in my case it was easier with the Advanced Editor), I created a variable, Destination, that selects just the keys that are already in the destination.

2 - I filtered with the anti join, so Power Query returns only the new data.

3 - In the data destination section I chose Append, so the rows are inserted.
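
As a rough illustration of why these three steps avoid duplicates, here is a small Python sketch of the nightly flow. It is not Power Query code, and the names (`nightly_load`, `job_id`, `job_date`) and the dates are assumptions made for the example. It shows that running the load twice over the same 30-day window inserts nothing the second time.

```python
from datetime import date, timedelta


def nightly_load(source_rows, warehouse_table, today, window_days=30):
    """Append to warehouse_table the source rows in the window that are not there yet."""
    cutoff = today - timedelta(days=window_days)
    # Step 1: select just the keys already in the destination, within the window.
    destination_keys = {
        row["job_id"] for row in warehouse_table if row["job_date"] >= cutoff
    }
    # Step 2: anti join, keeping only source rows in the window with an unseen key.
    new_rows = [
        row for row in source_rows
        if row["job_date"] >= cutoff and row["job_id"] not in destination_keys
    ]
    # Step 3: append the remaining rows to the destination.
    warehouse_table.extend(new_rows)
    return len(new_rows)


today = date(2024, 8, 7)
warehouse = [{"job_id": 1, "job_date": date(2024, 8, 1)}]
source = [
    {"job_id": 1, "job_date": date(2024, 8, 1)},  # already loaded
    {"job_id": 2, "job_date": date(2024, 8, 2)},  # entered manually, late
    {"job_id": 3, "job_date": date(2024, 8, 7)},  # today's job
]

first_run = nightly_load(source, warehouse, today)
second_run = nightly_load(source, warehouse, today)
print(first_run, second_run, len(warehouse))  # prints 2 0 3
```

The sketch assumes the source itself contains no duplicate keys; if it can, the source would need to be deduplicated before the append.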

 

This works. It's a bit more complex than before, but it does the job. Thanks for the help,

 

Best Regards,

