Hello everyone,
I have a Dataflow Gen2 that copies data from a Dataverse table (D365) to a Lakehouse table.
The Dataverse table has 70,037 rows.
In the dataflow development screen I can see the same number of rows.
However, when I go to the resulting table, I only have 18,468 rows.
And if I re-run it, the number isn't the same.
In the end, the run always reports a success...
After some research I tried to deactivate ‘Fast Copy’, but it doesn't change anything...
Have you ever encountered this problem?
For information, the table is large (over 300 columns), but that shouldn't be the reason, especially as the run reports success.
Thanks in advance for your reply,
Vivien
I'd still recommend raising the support ticket so an engineer can take a closer look at your scenario and get the traces for it.
Would you happen to have a way in which we can reproduce the issue?
I initially tried to create a Dataflow with the sources you mentioned but was never able to get the same result that you describe. It could be something quite specific about your query and/or your sources, so a deeper investigation through the support route would be the best place to get a much closer look.
Before creating a ticket, I wanted to see if the community had already encountered this problem, as it does not seem specific to my environment.
Yes, to reproduce the problem, I think you just need to:
With the first code, some rows are lost, while with the second code, all rows are transferred.
In fact, as shown in this screenshot, the first code does not transfer all the rows because it cannot read them all.
It transfers every row it manages to read, but since it cannot read everything, not everything is transferred.
Could you confirm what code is being passed for each of those queries? The screenshot uses references that aren't visible (specifically Dataverse_Tablename).
I specified this in my reply here (dbo.Account).
Thanks for the info!
I'd recommend removing Staging from your queries if it's enabled. I would also recommend removing the added custom column step to see if you get the same issue and to isolate the variables, as you've mentioned that the issue could be strictly with the Value.NativeQuery usage.
There's a situation where the native query might not give you the same data that you get through the navigator of a connector. However, the difference would be at the table schema shown in the query and not the number of rows. This is because the connector provides projections or reference columns that have either "record" or "table" values which are not supported to be loaded to a Lakehouse, but they'll show up as columns in your query. A native query wouldn't "project" or provide those column references.
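As a rough sketch of what that looks like in M (assuming the Dataverse connector's CommonDataService.Database function; the environment URL and table name are placeholders), you could list and drop those reference columns from a navigator-based query like this:

let
    // Placeholder environment URL; replace with your own Dataverse environment.
    Source = CommonDataService.Database("yourorg.crm.dynamics.com"),
    // Navigator-style access to the account table through the dbo (TDS) schema.
    Account = Source{[Schema = "dbo", Item = "account"]}[Data],
    // Table.Schema reports the kind of each column; "record" and "table" kinds
    // are the projected reference columns that can't be loaded into a Lakehouse.
    ReferenceColumns = Table.SelectRows(Table.Schema(Account), each [Kind] = "record" or [Kind] = "table")[Name],
    // Remove them so the remaining columns can be written to the Lakehouse table.
    Result = Table.RemoveColumns(Account, ReferenceColumns)
in
    Result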
I finished doing some tests and I can see a behavior that I'd like to explain.
When we create a connection to a source, we may need to read multiple rows or make multiple calls to get the data that we need. The activity statistics may show the total of all the rows that were read to perform the activity, but that doesn't directly mean it was the output of the query that ran inside the activity.
Both of the attempts that I made show this behavior:
But they write the same number of rows, which is the number of rows that your query, in theory, should have.
You can run two validations to make sure that your query evaluates to the expected number of rows:
If both of the queries you've created are exactly equivalent to each other and they yield the same number of rows within the Power Query Editor (you can validate this through the process stated in #1 using "Count rows"), then this is something we would like to understand better and it should be raised as a support ticket so an engineer can take a closer look. With a support ticket we can get to the bottom of your scenario and determine why one query loads X rows whereas the other loads Y.
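For the row-count check in #1, a throwaway query along these lines (a minimal sketch; QueryWithNativeQuery and QueryWithNavigation are placeholders for your two queries) shows both evaluated counts side by side in the Power Query Editor:

let
    // Row count of the Value.NativeQuery-based query (placeholder name).
    CountNativeQuery = Table.RowCount(QueryWithNativeQuery),
    // Row count of the navigator-based query (placeholder name).
    CountNavigation = Table.RowCount(QueryWithNavigation),
    // Returning a record makes it easy to compare both numbers at a glance.
    Result = [NativeQuery = CountNativeQuery, Navigation = CountNavigation]
in
    Result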
@miguel,
Thank you very much for your feedback and testing.
I can confirm that the Staging feature is not enabled.
When I count the rows, I get more than 70,000 rows.
I did another test. I took only the 111 columns I need (out of 350).
And in this case, it works:
This confirms that it is Value.NativeQuery that causes problems when there are too many columns/too much data.
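For reference, the working variant keeps Value.NativeQuery but lists only the needed columns in the SELECT, roughly like this (the environment URL and column names below are only illustrative, not my actual 111 columns):

let
    // Placeholder environment URL.
    Source = CommonDataService.Database("yourorg.crm.dynamics.com"),
    // Listing the needed columns explicitly instead of SELECT *.
    Narrow = Value.NativeQuery(Source, "SELECT accountid, name, createdon FROM dbo.account")
in
    Narrow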
Can you reproduce a test on your side with a volume (rows and columns) equivalent to what I have on my side?
Thank you in advance for your feedback.
Vivien
I don't have an environment that matches that criteria, but it would help tremendously if our engineering team can capture some traces for your Dataflow. They wouldn't need access to the environment, but just be able to capture the traces from your account / dataflow with explicit consent and do some deeper investigations. Definitely do feel free to raise the support ticket if you have a chance and our team can get to the bottom of it.
In theory a "SELECT * FROM dbo.Account" is not exactly the same query plan that shows up for the query without the Value.NativeQuery, so the team might start by analyzing that first.
Hello,
I have just made some progress in investigating the issue.
Here are two Dataflow Gen2 queries (in the first one, Dataverse_Tablename = ‘account’).
With the first one, I lose rows as explained above, but not with the second one.
This seems to be caused by the use of the Value.NativeQuery step.
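For clarity (the exact code is in the screenshots), the two queries follow these patterns, with a placeholder environment URL:

// Query 1: native query variant, which loses rows.
let
    Dataverse_Tablename = "account",
    Source = CommonDataService.Database("yourorg.crm.dynamics.com"),
    Data = Value.NativeQuery(Source, "SELECT * FROM dbo." & Dataverse_Tablename)
in
    Data

// Query 2: navigator-based variant, which transfers all rows.
let
    Source = CommonDataService.Database("yourorg.crm.dynamics.com"),
    Data = Source{[Schema = "dbo", Item = "account"]}[Data]
in
    Data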
Does that give you any ideas? Have you ever encountered this problem before?
Thanks in advance,
Have a nice day,
Vivien
Hello @vivien57
First, can you try selecting only a few columns?
Then, to check whether the issue comes from Dataverse or from the Fabric side, could you try executing it in a Dataflow Gen1 and check whether you retrieve all the data in your Power BI report?
Keep us informed about the results.
Hello,
Thank you for your feedback.
By randomly removing 87 of 366 columns, I managed to read and write 32,152 rows out of 70,037.
By randomly removing 151 of 366 columns, I managed to read and write 54,922 rows out of 70,037.
By randomly removing 364 of 366 columns, I managed to read and write 70,037 rows out of 70,037.
So there seems to be a link between the number of columns and this problem.
These tests were done on an F2 capacity; I also tried an F64 capacity, which doesn't solve the problem.
Do you have any idea what the problem is? It seems to come from Dataflow Gen2...
Thanks in advance,
Vivien
Hello @v-saisrao-msft ,
No the issue is not resolved. I have tested again, and it's the same.
Thanks in advance,
Vivien
Hi @vivien57,
May I know if the issue has been resolved? If not, please raise a support ticket as suggested by @miguel, so that the backend team can investigate the issue further.
Below is the link to raise the support ticket:
How to create a Fabric and Power BI Support ticket - Power BI | Microsoft Learn
Thank you.
Would you mind raising a support ticket so an engineer can take a closer look at your scenario and collect all the information needed to troubleshoot? Below is the link to raise the support ticket:
Hi @vivien57,
Please see the thread below, which outlines an issue similar to yours. In this case, the data preview in Dataflow Gen2 displays correctly, but some rows are missing in the Lakehouse after publishing.
Solved: Dataflow Gen2 to Lakehouse - missing rows - Microsoft Fabric Community
Thank you.
Hello,
I am on vacation. I will do more tests when I get back in mid-August.
Vivien