<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Getting error Py4JJavaError: An error occurred while calling o36814.save. : org.apache.spark.Spa in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4323839#M5576</link>
    <description>&lt;P&gt;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/888369"&gt;@Lakssh&lt;/a&gt;&amp;nbsp; Can you share how you are extracting data from the REST API? Spark has no native REST API connector, so we usually use the Python requests library to fetch data from the API.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you are calling the API from a Python UDF, that is not the most efficient approach and can cause errors. I would recommend separating the extraction step, implementing it with Python requests, a Data Factory pipeline, or Dataflow Gen2, and then running the transformation process.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 11 Dec 2024 12:32:28 GMT</pubDate>
    <dc:creator>govindarajan_d</dc:creator>
    <dc:date>2024-12-11T12:32:28Z</dc:date>
    <item>
      <title>Getting error Py4JJavaError: An error occurred while calling o36814.save. : org.apache.spark.SparkEx</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4314597#M5492</link>
      <description>&lt;P&gt;I am trying to validate images loaded into a Lakehouse using an API call. The API is hosted on a VM server. I will be getting 40,000 rows per day. When I call the API in batches of 50 rows, my notebook fails at the 1000th row (25th batch) with the above error. I am using an F4 license; what could be the issue? I tried batch sizes of 10, 20, and 50, and even tried using RDD and mapPartitions, but I cannot process more than 1000 rows; every time it fails with some error or another. What am I missing here?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2024 06:00:10 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4314597#M5492</guid>
      <dc:creator>Lakssh</dc:creator>
      <dc:date>2024-12-05T06:00:10Z</dc:date>
    </item>
    <item>
      <title>Re: Getting error Py4JJavaError: An error occurred while calling o36814.save. : org.apache.spark.Spa</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4314887#M5497</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/888369"&gt;@Lakssh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;DIV class=""&gt;
&lt;DIV class="" aria-live="assertive"&gt;
&lt;DIV class=""&gt;
&lt;P&gt;It seems like this might be an issue related to query limits or data volume limits.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;You could check the relevant API documentation to see if there are any query limits. For example, some APIs might limit the number of read and write requests per minute. Additionally, some APIs might restrict the maximum amount of data that can be queried per request.&lt;/SPAN&gt;&lt;/P&gt;
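&lt;P&gt;If a per-minute request limit does turn out to be the cause, one way to stay under it is to pause between batches and retry a failed batch with a growing delay. The following is only a minimal sketch under stated assumptions: the helper name call_in_batches, its parameters, and the caller-supplied call_api function (which would wrap the actual HTTP request, for example with the Python requests library) are all hypothetical and not taken from the original notebook.&lt;/P&gt;

```python
import time
from typing import Callable, List, Sequence


def call_in_batches(
    rows: Sequence[dict],
    call_api: Callable[[Sequence[dict]], List[dict]],  # hypothetical: wraps the real HTTP call
    batch_size: int = 50,
    pause_seconds: float = 1.0,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Send rows to call_api in fixed-size batches, pausing between batches
    and retrying a failed batch with exponential backoff."""
    results: List[dict] = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                results.extend(call_api(batch))
                break  # batch succeeded, move on to the next one
            except Exception:
                if attempt == max_retries:
                    raise  # give up after the last allowed retry
                sleep(pause_seconds * 2 ** attempt)  # back off before retrying
        sleep(pause_seconds)  # throttle to stay under any per-minute limit
    return results
```

&lt;P&gt;With 40,000 rows per day and a batch size of 50, this would make 800 calls, so the pause length is worth tuning against whatever limit the API documentation states.&lt;/P&gt;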
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;It might also be related to the computing power of the F4 SKU. You could try temporarily upgrading the Fabric capacity to F8 to see if it yields better results.&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class=""&gt;
&lt;DIV class=""&gt;
&lt;DIV class=""&gt;
&lt;DIV class=""&gt;
&lt;DIV class=""&gt;&amp;nbsp;
&lt;DIV class=""&gt;&lt;SPAN&gt;&lt;SPAN&gt;Are you loading all the data into the lakehouse at once after querying from the API, or are you loading it in batches? If you change the loading frequency, would you get the same error?&lt;/SPAN&gt;&lt;/SPAN&gt;
&lt;DIV class="" role="toolbar" data-tabster="{&amp;quot;mover&amp;quot;:{&amp;quot;cyclic&amp;quot;:false,&amp;quot;direction&amp;quot;:2,&amp;quot;memorizeCurrent&amp;quot;:true}}"&gt;&amp;nbsp;
&lt;DIV class="" role="toolbar" data-tabster="{&amp;quot;mover&amp;quot;:{&amp;quot;cyclic&amp;quot;:false,&amp;quot;direction&amp;quot;:2,&amp;quot;memorizeCurrent&amp;quot;:true}}"&gt;Best Regards,&lt;BR /&gt;Jing&lt;BR /&gt;Community Support Team&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Thu, 05 Dec 2024 08:05:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4314887#M5497</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-12-05T08:05:28Z</dc:date>
    </item>
    <item>
      <title>Re: Getting error Py4JJavaError: An error occurred while calling o36814.save. : org.apache.spark.Spa</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4323839#M5576</link>
      <description>&lt;P&gt;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/888369"&gt;@Lakssh&lt;/a&gt;&amp;nbsp; Can you share how you are extracting data from the REST API? Spark has no native REST API connector, so we usually use the Python requests library to fetch data from the API.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If you are calling the API from a Python UDF, that is not the most efficient approach and can cause errors. I would recommend separating the extraction step, implementing it with Python requests, a Data Factory pipeline, or Dataflow Gen2, and then running the transformation process.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 11 Dec 2024 12:32:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Getting-error-Py4JJavaError-An-error-occurred-while-calling/m-p/4323839#M5576</guid>
      <dc:creator>govindarajan_d</dc:creator>
      <dc:date>2024-12-11T12:32:28Z</dc:date>
    </item>
  </channel>
</rss>

