Dear Community,
Have you encountered this problem? I'm processing a text file using the code below. The script runs successfully on my laptop, but in Fabric it doesn't work. Do you have any other solutions for this? I have tried increasing the workspace to the largest nodes available, and the same issue still arises.
import pandas as pd

with open(amm, 'rt') as myfile:  # `amm` holds the path to lorem.txt; open it for reading text
    contents = myfile.read()     # Read the entire file into a single string
print(contents)

# Split the text body by the delimiter <TASK
split1 = contents.split('<TASK')

# Create a DataFrame with one row per split segment
update_df = pd.DataFrame(split1, columns=['row_text'])

# Drop the first row so only the AMM tasks remain
df = update_df.drop(0)
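For readers following along, here is a minimal self-contained sketch of the same split-to-DataFrame step on a tiny inline sample (the '<TASK' delimiter and the row_text column come from the snippet above; the sample text is invented for illustration):

import pandas as pd

# Invented sample standing in for the real file contents.
sample = "preamble to discard<TASK 01 first task<TASK 02 second task"
rows = sample.split('<TASK')   # ['preamble to discard', ' 01 first task', ' 02 second task']

# One row per segment; drop(0) removes everything before the first delimiter.
df = pd.DataFrame(rows, columns=['row_text']).drop(0)
print(df)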
To all, this is the latest discovery. They will investigate this more deeply, as we found the following warning during the execution of a simple read-and-print of a 200 MB text file:

2024-01-03 05:09:38,976 WARN ClientCnxn [Thread-68-SendThread(vm-fdb01374:XXXX)]: Session 0x10000008c0c0002 for server vm-fdb01374/XX.X.XXX.X:XXX, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.
org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 73538ms for session id 0x10000008c0c0002
To all following this issue: they asked us to increase the workspace setting, but before reaching them we had already increased it to an XX-Large node, and the issue persisted.
We will have a support session tomorrow. I will update everyone here.
Dear Everyone,
The problem is solved =D by removing the print() function.
# print(contents)
For some reason, Fabric can't handle printing a very large concatenated string.
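A plausible reading of this fix (not confirmed by the support team in this thread): print(contents) on a ~200 MB string forces the notebook to serialize and render the entire output, which can stall the driver long enough for its ZooKeeper session to time out, as in the warning above. If you still want to eyeball the contents, a minimal sketch that previews only a slice (the 1,000-character cut-off is arbitrary):

# Preview the first part of the string instead of printing all of it;
# `contents` is the full file text read earlier.
print(contents[:1000])
print(f"Total characters read: {len(contents):,}")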
Hi @kirah2128,
Glad to know your query got resolved. Please continue using the Fabric Community for any further queries.
Hi @kirah2128
Since small files are working fine and only the big files are failing, I suggest you work with the Microsoft support team on this. The reason is that they will have the details about what the real error is under the hood. At this time I do not think that cluster resources are the issue, as Spark is designed to handle big loads. Once you have created a support ticket, please share it here so that we can also keep an eye on it.
Thanks
Himanshu
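A side note for anyone hitting the same wall: open()/read() runs entirely on the driver and bypasses Spark's distributed execution. A minimal sketch of a Spark-native alternative, assuming the file sits in the attached Lakehouse (the path below is hypothetical; spark is predefined in Fabric notebooks):

# Read the file as a DataFrame of lines so the driver never has to hold
# the whole 200 MB string at once. Adjust the hypothetical path as needed.
lines_df = spark.read.text("Files/data/lorem.txt")
lines_df.show(5, truncate=80)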
Hi @kirah2128,
We haven't heard from you since the last response and wanted to check back to see if you have a resolution yet. Did you get a chance to create a support ticket? If you have created one, please share it here so that we can also keep an eye on it.
Hello @kirah2128
Are you sure that it's the size of the file that is creating the issue? I am just trying to make sure that you have tried to process a smaller file, and whether that works.
Also, I am assuming that the path of the file (maybe the file is in a Lakehouse) is correct.
I will wait to hear back from you on this.
Thanks
Himanshu
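One quick way to confirm the path is to list the Lakehouse folder from the notebook. A minimal sketch, assuming the file is under a hypothetical Files/data folder (mssparkutils is available by default in Fabric notebooks):

# List the folder to confirm the file exists where the script expects it;
# "Files/data" is a hypothetical location, adjust to the actual folder.
for f in mssparkutils.fs.ls("Files/data"):
    print(f.name, f.size)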
I tried with a smaller file, around 2 MB, and it's working just fine. Both files are located in the same folder. But the file I raised this issue about is still not working.
Hello @kirah2128,
At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.