Hi, I've got the most basic of issues. I'm trying to write some Fabric notebooks to apply change data capture logic on several large tables, but I can't even get the notebook to return the correct number of rows in the table.
Here's the test notebook I'm running. Note that according to this code, the table has 113,019,072 rows:
But if I go into the SQL endpoint for the Lakehouse and run a count(*) against that table - it returns a different number of rows - 112,417,049:
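In case the screenshots don't come through, the two checks were roughly the following (a sketch; the table name UserActionLogLoginEventFact comes from the comparison discussed later in this thread):

```python
# Notebook (PySpark) side - this returned 113,019,072
df = spark.read.table("UserActionLogLoginEventFact")
print(df.count())

# SQL Analytics Endpoint side - this returned 112,417,049 (run as T-SQL):
# SELECT COUNT(*) FROM UserActionLogLoginEventFact;
```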
Any ideas what could be going wrong? I'm going to open a ticket with Microsoft but hoping someone knows what the underlying issue might be.
Thanks!
Scott
Hi all, I should have updated this thread a few weeks ago. I opened a ticket with Microsoft for this issue. It seems like the notebooks were somehow referring to older copies of the parquet tables instead of the latest and greatest. This is a known bug and a fix is being worked on (I'm not sure if it's been implemented or not).
In the meantime, the Microsoft engineer had me run the following code in a notebook cell. I then waited for 15 or 20 minutes and everything was fine:
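For reference, the cell (also quoted later in this thread) was:

```python
# Unmount the default lakehouse mount and reset the Fabric (Trident) runtime
# context so the session stops pointing at stale parquet snapshots
mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()
```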
I am facing the same issue. When I create a shortcut from Dataverse to a lakehouse, the row count in Dataverse is around 250,000, but when I write Spark code to get the row count of the shortcut delta table, it shows 100,000. Can anybody help me solve this issue?
Do you know which rows are different between the Notebook and the SQL Analytics Endpoint?
If you need to find out which rows are different, I guess you could use the notebook to write the test dataframe to a new lakehouse table, e.g. UserActionLogLoginEventFact_test.
Then I think you will find all 113,019,072 rows of this new table in the SQL Analytics Endpoint, and you could compare the two tables (UserActionLogLoginEventFact and UserActionLogLoginEventFact_test) there or in Power BI Desktop.
Hello, in my scenario a Lookup activity in a pipeline is referring to an older version of a delta table. Just before that Lookup activity, I update the delta table in a notebook. Do you think running the same code in the notebook would help the Lookup activity see the latest version?
Hi @Anonymous ,
I can see you have raised your query in the Fabric Community thread "In pipeline, Lookup activity does not query update..." - Microsoft Fabric Community.
My teammate is looking into your query closely; please follow the steps she mentioned there to get a resolution.
You can also give a try to what @Scott_Powell mentioned -
# Unmount the default lakehouse mount so stale file references are dropped
mssparkutils.fs.unmount("/default", {"scope": "default_lh"})
# Reset and re-initialize the Fabric (Trident) runtime context for this session
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.reset()
sc._jvm.com.microsoft.spark.notebook.common.trident.TridentRuntimeContext.personalizeSession()
Thank you
I will try it. What are the "default" and "default_lh" parameters? Just to adapt it to my scenario.
I guess default is the workspace name and default_lh is the lakehouse name?
Hi @amaaiia , just leave the code "as is" - you don't need to replace anything with your actual lakehouse names or anything.
Hope it helps!
Scott
@Scott_Powell did you use dataflow gen2 to populate the Lakehouse table?
There are also some other community threads which show different row counts in the Notebook and the SQL Analytics Endpoint.
(Maybe the Notebook and the SQL Analytics Endpoint interpret the delta table's version history differently?)
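One way to probe that theory from the notebook side is to look at the delta log and compare the latest version/timestamp against what the SQL Analytics Endpoint reports. A sketch, assuming the table name from earlier in the thread:

```python
# Show the delta table's version history as the notebook sees it
history = spark.sql("DESCRIBE HISTORY UserActionLogLoginEventFact")
history.select("version", "timestamp", "operation").show(5)
```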
I created an idea: https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=cea60fc6-93f2-ee11-a73e-6045bd7cb2b6
Here are the links to the other community threads:
Hi @Scott_Powell ,
Thanks for using fabric community.
Apologies for the issue you're facing. At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
Awesome, thank you @Anonymous . Please let me know if you hear anything.
Appreciate the help!
Scott
Hi @Scott_Powell ,
Apologies for the issue you have been facing. The best course of action is to open a support ticket and have our support team take a closer look at it.
Please go ahead and raise a support ticket to reach our support team: Link
After creating a support ticket, please share the ticket number here so we can track it for more information.
Thank you.
Thanks @Anonymous , I've opened support request 2403200040013153 for this.
Scott