Hi, I've got the most basic of issues. I'm trying to write some Fabric notebooks to apply change data capture logic on several large tables, but I can't even get the notebook to return the correct number of rows in the table.
Here's the test notebook I'm running. Note that according to this code, the table has 113,019,072 rows:
But if I go into the SQL endpoint for the Lakehouse and run a count(*) against that table - it returns a different number of rows - 112,417,049:
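In case the screenshots don't come through, the two checks were roughly the following (a sketch; the table name UserActionLogLoginEventFact comes from the comparison discussed later in this thread):

```python
# Notebook (PySpark) side - this returned 113,019,072
df = spark.read.table("UserActionLogLoginEventFact")
print(df.count())

# SQL Analytics Endpoint side - this returned 112,417,049 (run as T-SQL):
# SELECT COUNT(*) FROM UserActionLogLoginEventFact;
```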
Any ideas what could be going wrong? I'm going to open a ticket with Microsoft but hoping someone knows what the underlying issue might be.
Thanks!
Scott
Hi all, I should have updated this thread a few weeks ago. I opened a ticket with Microsoft for this issue. It seems like the notebooks were somehow referring to older copies of the parquet tables instead of the latest and greatest. This is a known bug and a fix is being worked on (I'm not sure if it's been implemented or not).
In the meantime, the Microsoft engineer had me run the following code in a notebook cell. I then waited for 15 or 20 minutes and everything was fine:
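For reference, the cell (also quoted later in this thread) was:

```python
# Unmount the default lakehouse mount and reset the Fabric (Trident) runtime
# context so the session stops pointing at stale parquet snapshots
mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()
```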
I am facing the same issue. When I create a shortcut from Dataverse to a lakehouse, the row count in Dataverse is around 250,000, but when I write Spark code to get the row count of the shortcut delta table, it shows 100,000. Can anybody help me solve this issue?
Do you know which rows are different between the Notebook and the SQL Analytics Endpoint?
If you need to find out which rows are different, I guess you could use the notebook to write the test dataframe to a new lakehouse table, e.g. UserActionLogLoginEventFact_test.
Then I think you will find all 113,019,072 rows of this new table in the SQL Analytics Endpoint, and you could compare the two tables (UserActionLogLoginEventFact and UserActionLogLoginEventFact_test) there or in Power BI Desktop.
Hello, in my scenario a Lookup activity in a pipeline is referring to an older version of a delta table. Just before that Lookup activity, I update the delta table in a notebook. Do you think running the same code in the notebook would help the Lookup activity see the latest version?
Hi @Anonymous ,
I can see you have raised your query in the Fabric Community thread "In pipeline, Lookup activity does not query update..." - Microsoft Fabric Community.
My teammate is looking into your query closely; please follow the steps she mentioned there to get a resolution.
You can also give a try to what @Scott_Powell mentioned -
# Unmount the default lakehouse mount so stale file references are dropped
mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
# Reset and re-initialize the Fabric (Trident) runtime context for this session
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()
Thank you
I will try it. What are the "default" and "default_lh" parameters? Just to adapt it to my scenario.
I guess default is the workspace name and default_lh is the lakehouse name?
Hi @amaaiia , just leave the code "as is" - you don't need to replace anything with your actual lakehouse names or anything.
Hope it helps!
Scott
@Scott_Powell did you use dataflow gen2 to populate the Lakehouse table?
There are also some other community threads which show different row counts in the Notebook and the SQL Analytics Endpoint.
(Maybe the Notebook and the SQL Analytics Endpoint interpret the delta table's version history differently?)
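One way to probe that theory from the notebook side is to look at the delta log and compare the latest version/timestamp against what the SQL Analytics Endpoint reports. A sketch, assuming the table name from earlier in the thread:

```python
# Show the delta table's version history as the notebook sees it
history = spark.sql("DESCRIBE HISTORY UserActionLogLoginEventFact")
history.select("version", "timestamp", "operation").show(5)
```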
I created an idea: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=cea60fc6-93f2-ee11-a73e-6045bd7cb2b6
Here are the links to the other community threads:
Hi @Scott_Powell ,
Thanks for using fabric community.
Apologies for the issue you're facing. At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
Awesome, thank you @Anonymous . Please let me know if you hear anything.
Appreciate the help!
Scott
Hi @Scott_Powell ,
Apologies for the issue you have been facing. The best course of action is to open a support ticket and have our support team take a closer look at it.
Please go ahead and raise a support ticket to reach our support team: Link
After creating a support ticket, please share the ticket number here so we can track it for more information.
Thank you.
Thanks @Anonymous , I've opened support request 2403200040013153 for this.
Scott