kirah2128
Helper II

One Large Text File Parsing 200MB Error 500

Dear Community,

Have you encountered this problem? I'm processing a text file using the code below. The script runs successfully on my laptop, but it does not work in Fabric. Do you have any other solutions for this? I have tried increasing the workspace to the largest nodes available, and the same issue still arises.

 

 

import pandas as pd

# `amm` holds the path to the text file (its definition is not shown in the post)
with open(amm, 'rt') as myfile:  # Open the file for reading text
    contents = myfile.read()     # Read the entire file into a string
print(contents)

# Split the text body on the delimiter <TASK
split1 = contents.split('<TASK')

# Create a DataFrame with one row per split segment
update_df = pd.DataFrame(split1, columns=['row_text'])

# Delete the first row and retain only the AMM tasks
df = update_df.drop(0)

 

 

LivyHttpRequestFailure: Something went wrong while processing your request. Please try again later. HTTP status code: 500. Trace ID: f606057c-c7bd-46de-8c3f-c215cc076b03.
 
Regards,
King
1 ACCEPTED SOLUTION

Dear Everyone,

 

The problem is solved =D by removing the print() call.

# print(contents)

For some reason, Fabric can't handle such a large concatenated string.
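
A minimal sketch of the same fix for anyone who still wants some feedback in the notebook output: print only the length and a short preview instead of the full string. The helper names `preview`, `split_tasks`, and `read_tasks`, and the 500-character limit, are hypothetical and not from the original code; only the `<TASK` delimiter and the "drop the first piece" step come from the post.

```python
def preview(text: str, limit: int = 500) -> str:
    """Return at most `limit` characters of `text`, marking any truncation."""
    return text if len(text) <= limit else text[:limit] + "..."


def split_tasks(contents: str, delimiter: str = "<TASK") -> list[str]:
    """Split on `delimiter` and drop the piece before the first delimiter."""
    parts = contents.split(delimiter)
    return parts[1:]


def read_tasks(path: str, delimiter: str = "<TASK") -> list[str]:
    """Read the whole file, show a short preview, and return the task pieces."""
    with open(path, "rt") as myfile:
        contents = myfile.read()
    # Print the size and a short preview instead of print(contents).
    print(f"Read {len(contents):,} characters")
    print(preview(contents))
    return split_tasks(contents, delimiter)
```

The returned list can be passed to `pd.DataFrame(tasks, columns=['row_text'])`. Because the first piece is already dropped, the `drop(0)` step from the original code is not needed, and the index starts at 0 rather than 1.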


10 REPLIES
kirah2128
Helper II

To all, this is the latest discovery. They will investigate this more deeply, as we found this warning while executing a simple read and print of the 200 MB text file:

 

2024-01-03 05:09:38,976 WARN ClientCnxn [Thread-68-SendThread(vm-fdb01374:XXXX)]: Session 0x10000008c0c0002 for server vm-fdb01374/XX.X.XXX.X:XXX, Closing socket connection. Attempting reconnect except it is a SessionExpiredException.

org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 73538ms for session id 0x10000008c0c0002

 

 

To all following this issue: they asked us to increase the workspace setting, but we had already done that before contacting them, increasing it to an XXLarge node, and the issue is still the same.

kirah2128
Helper II

We will have a support session tomorrow. I will update everyone here. 


Anonymous
Not applicable

Hi @kirah2128 ,

Glad to know your query got resolved. Please continue using Fabric Community for your further queries.

HimanshuS-msft
Community Support

Hi @kirah2128 

Since small files work fine and only the big files fail, I suggest you work with the Microsoft support team on this, because they will have the details of the real error under the hood. At this time I do not think cluster resources are the issue, as Spark is designed to handle big loads. Once you have created a support ticket, please share it here so that we can also keep an eye on it.

Thanks 
Himanshu 

Anonymous
Not applicable

Hi @kirah2128 ,

We haven’t heard from you since the last response and were just checking back to see if you have a resolution yet. Did you get a chance to create a support ticket? If you have created one, please share it here so that we can also keep an eye on it.

HimanshuS-msft
Community Support

Hello @kirah2128 
Are you sure that it's the size of the file which is creating the issue? I am just trying to make sure that you have tried to process a smaller file and whether that works.
Also, I am assuming that the path of the file (maybe the file is in a Lakehouse) is correct.
I will wait to hear back from you on this.
Thanks 
Himanshu 

I tried with a smaller file, around 2 MB, and it's working just fine. Both files are located in the same folder. But the one I raised is still not working.
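
Since only the large file fails, one alternative worth trying is to avoid building (or printing) one 200 MB string at all and to split on the delimiter while reading in chunks. This is only a sketch and was not proposed or tested in this thread; the function name `iter_tasks` and the 1 MiB chunk size are hypothetical. It is meant to yield the same pieces as `contents.split('<TASK')` with the first piece dropped.

```python
import io
from typing import Iterator, TextIO


def iter_tasks(stream: TextIO, delimiter: str = "<TASK",
               chunk_size: int = 1 << 20) -> Iterator[str]:
    """Yield the text after each `delimiter`, reading `stream` in chunks."""
    buffer = ""
    seen_first_piece = False  # the text before the first delimiter is not a task
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        parts = buffer.split(delimiter)
        # The last piece may be incomplete or end in a partial delimiter,
        # so keep it in the buffer until more text arrives.
        buffer = parts.pop()
        for part in parts:
            if not seen_first_piece:
                seen_first_piece = True
                continue
            yield part
    # Whatever remains after the last delimiter is the final task.
    if seen_first_piece:
        yield buffer


# Usage with an in-memory stream; a real file opened with open(path, "rt")
# can be passed in the same way.
sample = io.StringIO("header<TASK a<TASK b")
tasks = list(iter_tasks(sample, chunk_size=3))
```

The tiny chunk size in the usage lines is only there to exercise delimiters that straddle chunk boundaries.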

Anonymous
Not applicable

Hello @kirah2128 ,

At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
