Scott_Powell
Advocate III

Fabric Notebook - shows wrong # of rows in a Lakehouse table?!?

Hi, I've got the most basic of issues. I'm trying to write some Fabric notebooks to apply change data capture logic on several large tables. But I'm running into issues - I can't get the notebook to even return the proper number of rows that are in the table.

 

Here's the test notebook I'm running. Note that according to this code, the table has 113,019,072 rows:

[Screenshot: Scott_Powell_0-1710612245690.png]

 

But if I go into the SQL endpoint for the Lakehouse and run a count(*) against that table, it returns a different number of rows, 112,417,049:

[Screenshot: Scott_Powell_1-1710612312250.png]

 

Any ideas what could be going wrong? I'm going to open a ticket with Microsoft but hoping someone knows what the underlying issue might be.

 

Thanks!

Scott

1 ACCEPTED SOLUTION

Hi all, I should have updated this thread a few weeks ago. I opened a ticket with Microsoft for this issue. It seems the notebooks were somehow referring to older copies of the parquet tables instead of the latest ones. This is a known bug and a fix is being worked on (I'm not sure whether it has been implemented yet).

 

In the meantime, the Microsoft engineer had me run the following code in a notebook cell. I then waited for 15 or 20 minutes and everything was fine:

 

mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()
 
I hope this helps anyone else who runs into this issue.
 
Thanks,
Scott
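Since the fix above took 15 to 20 minutes to take effect, a small polling helper can save re-running the count by hand. This is only a sketch: `wait_for_row_count` is a hypothetical helper (not part of Fabric or mssparkutils), and the expected value is assumed to be a count you trust, for example the one reported by the SQL endpoint.

```python
import time
from typing import Callable


def wait_for_row_count(
    get_count: Callable[[], int],
    expected: int,
    timeout_s: float = 30 * 60,
    interval_s: float = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_count() until it returns `expected` or the timeout elapses."""
    waited = 0.0
    while True:
        if get_count() == expected:
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# In a notebook this might be called as follows (assumes the built-in `spark`
# session; `count_from_sql_endpoint` is a placeholder for a count you trust):
# wait_for_row_count(
#     lambda: spark.read.table("UserActionLogLoginEventFact").count(),
#     expected=count_from_sql_endpoint,
# )
```

The `sleep` parameter is injectable only so the helper can be exercised without actually waiting.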


13 REPLIES
RG1565
Regular Visitor

I am facing the same issue. When I create a shortcut from Dataverse to the lakehouse, the original row count in Dataverse is around 250,000, but when I write Spark code to count the rows in the shortcut Delta table it shows 100,000. Can anybody help me solve this issue?

 

frithjof_v
Super User

Do you know which rows are different between the Notebook and the SQL Analytics Endpoint?

 

If you need to find out which rows are different, I guess you could use the notebook to write the test dataframe to a new lakehouse table, e.g. UserActionLogLoginEventFact_test.

 

Then I think you will find all the 113,019,072 rows of this new table UserActionLogLoginEventFact_test in the SQL Analytics Endpoint.

 

Then you could compare the two tables (UserActionLogLoginEventFact and UserActionLogLoginEventFact_test) in the SQL Analytics Endpoint or in Power BI Desktop to find out which rows are different.
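The comparison suggested above could look roughly like this. It is a sketch only: an in-memory SQLite database stands in for the SQL Analytics Endpoint, and the table names, columns, and rows are made up for illustration. The `EXCEPT` query itself is standard SQL and should carry over to the endpoint with the real table names.

```python
import sqlite3

# In-memory SQLite stands in for the SQL Analytics Endpoint; the tables,
# columns, and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fact      (event_id INTEGER, user_name TEXT);
    CREATE TABLE fact_test (event_id INTEGER, user_name TEXT);
    INSERT INTO fact      VALUES (1, 'a'), (2, 'b');
    INSERT INTO fact_test VALUES (1, 'a'), (2, 'b'), (3, 'c');
    """
)


def rows_only_in(conn, left, right):
    """Return the distinct rows present in `left` but not in `right`."""
    # Table names are interpolated directly, so pass trusted names only.
    query = f"SELECT * FROM {left} EXCEPT SELECT * FROM {right} ORDER BY 1"
    return conn.execute(query).fetchall()


only_in_test = rows_only_in(conn, "fact_test", "fact")  # extra rows the notebook wrote
only_in_fact = rows_only_in(conn, "fact", "fact_test")  # rows missing from the test copy
```

Note that `EXCEPT` works on distinct rows, so if the difference is caused by exact duplicate rows it will not show up here; a `GROUP BY` over all columns with `HAVING COUNT(*) > 1` would be needed for that case.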

Anonymous
Not applicable

Hello. In my scenario, a Lookup activity in a pipeline is referring to an older version of a Delta table. Just before that Lookup activity, I update the Delta table in a notebook. Do you think running the same code in the notebook would help the Lookup activity refer to the latest updated version?

Anonymous
Not applicable

Hi @Anonymous ,

I can see you have raised your query in the Fabric Community: In pipeline, Lookup activity does not query update... - Microsoft Fabric Community

I can see that my teammate is looking into your query closely. Please follow the steps she mentioned in order to reach a resolution.

You can also give it a try as @Scott_Powell mentioned:

mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()


Thank you

I will try it. What are the "default" and "default_lh" parameters? I just want to adapt it to my scenario.

I guess default is the workspace name and default_lh is the lakehouse name?

Hi @amaaiia , just leave the code "as is" - you don't need to replace anything with your actual lakehouse names or anything.

 

Hope it helps!

Scott

frithjof_v
Super User

@Scott_Powell did you use dataflow gen2 to populate the Lakehouse table?

 

There are also some other community threads that show different row counts in the Notebook and the SQL Analytics Endpoint.

 

(Maybe the Notebook and the SQL Analytics Endpoint interpret the Delta table version history differently?)
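To make the version-history idea concrete, here is a rough sketch of how the row count of a Delta table depends on which version is read. It replays the JSON commits in `_delta_log` and sums the `numRecords` statistic of the data files that are still live. `row_count_at_version` is a hypothetical helper, and it assumes that commits from version 0 onward are still present as JSON (no checkpoint needed), that every `add` action carries statistics, and that deletion vectors are not in use. It says nothing about what the Notebook or the SQL Analytics Endpoint actually do internally.

```python
import json
import tempfile
from pathlib import Path


def row_count_at_version(table_dir, version):
    """Replay _delta_log commits 0..version and sum numRecords of live files."""
    live = {}  # data file path -> numRecords (None when stats are missing)
    log_dir = Path(table_dir) / "_delta_log"
    for v in range(version + 1):
        commit = log_dir / f"{v:020d}.json"  # commit files use 20-digit names
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                stats = action["add"].get("stats")  # JSON-encoded string
                n = json.loads(stats).get("numRecords") if stats else None
                live[action["add"]["path"]] = n
            elif "remove" in action:
                live.pop(action["remove"]["path"], None)
    if any(n is None for n in live.values()):
        raise ValueError("some files have no numRecords statistic")
    return sum(live.values())


# Demo on a throwaway directory: version 1 replaces the file added in version 0.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "_delta_log"
    log.mkdir()

    def write_commit(v, actions):
        (log / f"{v:020d}.json").write_text("\n".join(json.dumps(a) for a in actions))

    write_commit(0, [{"add": {"path": "a.parquet", "stats": json.dumps({"numRecords": 10})}}])
    write_commit(1, [
        {"remove": {"path": "a.parquet"}},
        {"add": {"path": "b.parquet", "stats": json.dumps({"numRecords": 7})}},
    ])
    stale_count = row_count_at_version(tmp, 0)   # a reader stuck on version 0
    latest_count = row_count_at_version(tmp, 1)  # a reader on the latest version
```

A reader that resolves an older version, or that ignores `remove` actions, would report a different count from one that reads the latest version, which is the kind of mismatch described in this thread.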

 

I created an idea: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=cea60fc6-93f2-ee11-a73e-6045bd7cb2b6 

 

Here are the links to the other community threads:

 

https://community.fabric.microsoft.com/t5/General-Discussion/Duplicated-Rows-In-Tables-Built-By-Note...

 

https://community.fabric.microsoft.com/t5/General-Discussion/Duplicated-rows-between-notebook-and-SQ...

Anonymous
Not applicable

Hi @Scott_Powell ,

Thanks for using the Fabric Community.
Apologies for the issue you're facing. We are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.

Awesome, thank you @Anonymous . Please let me know if you hear anything. 

 

Appreciate the help!

Scott

Anonymous
Not applicable

Hi @Scott_Powell ,

Apologies for the issue you have been facing. The best course of action is to open a support ticket and have our support team take a closer look at it.

 

Please go ahead and raise a support ticket to reach our support team: Link 

 

After creating a support ticket, please provide the ticket number, as it will help us track the issue.

Thank you.

Thanks @Anonymous , I've opened support request 2403200040013153 for this.

 

Scott

