pace-jp
Frequent Visitor

Dataflow Join Bug

I have a JSON file with many nested arrays that I'm trying to flatten into a single table using a Dataflow Gen2. I have a main table (table0) that contains some metadata about each record and an expandable object. I added a few steps to filter out unwanted records from table0.


I referenced table0 to make two more tables (table1 and table2), each expanding a different nested array but keeping the id field, which is the unique record identifier. I then left joined table1 to table2. As soon as I expanded table2 to add its fields to table1, 81 records disappeared from table1 (the left side of the join) and were replaced by the same number of new records, all of which had been filtered out of table0 (the source for both table1 and table2). I compared the datasets immediately before and after expanding the fields and confirmed that this step introduces the bad data. Is this a known issue, or has anyone seen anything like this before? It's making dataflows unusable.
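For reference, the intended logic can be sketched outside the dataflow. This is a minimal Python sketch, not the Dataflow Gen2 implementation: the sample records, the field names (`keep`, `items`, `tags`) and the helpers `expand` and `left_join` are hypothetical stand-ins for the real JSON and query steps. It shows the invariant a left join should preserve, namely that every id in the output is an id that survived the filter on table0.

```python
# Hypothetical sample standing in for the nested JSON; all field names are assumptions.
records = [
    {"id": 1, "keep": True,  "items": [{"sku": "a"}, {"sku": "b"}], "tags": [{"tag": "x"}]},
    {"id": 2, "keep": False, "items": [{"sku": "c"}],               "tags": [{"tag": "y"}]},
    {"id": 3, "keep": True,  "items": [{"sku": "d"}],               "tags": []},
]


def expand(rows, field):
    """Return one row per element of the nested array `field`, carrying the id along."""
    return [{"id": r["id"], **elem} for r in rows for elem in r[field]]


def left_join(left, right, key="id"):
    """Left join: every left row is kept; unmatched rows gain no extra columns."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    out = []
    for left_row in left:
        # A left row with no match is emitted once, unchanged.
        for match in index.get(left_row[key], [None]):
            out.append({**left_row, **(match or {})})
    return out


table0 = [r for r in records if r["keep"]]   # filter out unwanted records
table1 = expand(table0, "items")             # first nested array
table2 = expand(table0, "tags")              # second nested array
joined = left_join(table1, table2)

filtered_ids = {r["id"] for r in table0}
joined_ids = {r["id"] for r in joined}

# The join must not introduce ids that the filter removed.
assert joined_ids <= filtered_ids
```

In this sketch record 3 has an empty `tags` array, so it is absent from table2 but still present in the joined output, and the filtered-out record 2 never appears.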

2 REPLIES
pace-jp
Frequent Visitor

Thanks for the reply @v-yilong-msft. I double-checked the tables and joins and couldn't find anything suspicious. I also did not find anything on the known issues site. I'm guessing it has something to do with the way the system parses the JSON, possibly (mis)handling empty arrays, even though I am actively filtering them out. So I've abandoned trying to parse the file with a dataflow and am switching to a Spark notebook instead.

v-yilong-msft
Community Support

Hi @pace-jp ,

The behavior you describe is unusual: records that had been filtered out reappear once you expand the fields after the join.

You could try the following steps:

1. Ensure that the join conditions are correctly specified and that no unintended matches are reintroducing the filtered records.

2. Verify that the data in table0, table1, and table2 is consistent and that the filtering steps are applied before the join operation.

3. The issue might be related to how the expansion step is handled. Double-check the configuration of this step to ensure it doesn't inadvertently reintroduce filtered records.

4. You can also check the document below for known dataflow issues:

Fabric known issues - Microsoft Fabric | Microsoft Learn
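Steps 1 to 3 amount to comparing record ids at each stage. Here is a minimal sketch of that check in Python, assuming the id column from each stage has been exported to a list. The function name `diff_ids` and the sample values are hypothetical and only illustrate the comparison.

```python
def diff_ids(before_ids, after_ids, filtered_out_ids):
    """Compare the ids present before and after a step.

    Returns the ids that vanished, the ids that appeared, and the subset of
    the appeared ids that an upstream filter was supposed to have removed.
    """
    before, after = set(before_ids), set(after_ids)
    missing = before - after
    unexpected = after - before
    leaked = unexpected & set(filtered_out_ids)
    return {
        "missing": sorted(missing),
        "unexpected": sorted(unexpected),
        "leaked": sorted(leaked),
    }


# Hypothetical ids: 3 vanished after the expand step, 9 appeared,
# and 9 is one of the ids that was filtered out upstream.
report = diff_ids(before_ids=[1, 2, 3], after_ids=[1, 2, 9], filtered_out_ids=[8, 9])
print(report)
```

For a left join followed by an expand, `missing` and `unexpected` should both be empty. A non-empty `leaked` list would match the behavior reported in this thread.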

 

 

Best Regards

Yilong Zhou

If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
