<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Json.Document and Table.GenerateByPage slow performance/alternatives in Developer</title>
    <link>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4112185#M54895</link>
    <description>&lt;P&gt;You could consider running Python or R scripts to parse the JSON but I'd think these may not be faster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Binary.Buffer might speed up some of the network transfers but probably won't help with parsing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for the sample code - your main tools are List.Generate and List.Accumulate . Both have their strengths and weaknesses, you'd have to test how each behaves in your scenario. Then use Table.ExpandTableColumn rather than Table.Combine.&lt;/P&gt;</description>
    <pubDate>Thu, 22 Aug 2024 00:20:13 GMT</pubDate>
    <dc:creator>lbendlin</dc:creator>
    <dc:date>2024-08-22T00:20:13Z</dc:date>
    <item>
      <title>Json.Document and Table.GenerateByPage slow performance/alternatives</title>
      <link>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4111400#M54888</link>
      <description>&lt;P&gt;We have a Power BI connector that receives JSON data page-by-page. We see that performance in terms of rows of data loaded per second lags behind that of an ODBC driver very significantly.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;In particular, there were a couple of performance bottlenecks that I indentified:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;It takes about 1.4 seconds to parse a 30MB JSON object with&lt;FONT face="courier new,courier"&gt; Json.Document&lt;/FONT&gt;. By contrast, it only takes a small fraction of a second to do the same in Node.js.&lt;BR /&gt;&lt;UL&gt;&lt;LI&gt;Is there any faster way to parse JSON (maybe we're calling it the wrong way?) or can the implementation of &lt;FONT face="courier new,courier"&gt;Json.Document &lt;/FONT&gt;be potentially improved if it's an inherent limitation?&lt;BR /&gt;&lt;BR /&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Furthermore, we used the&amp;nbsp;&lt;A href="https://learn.microsoft.com/en-us/power-query/helper-functions#tablegeneratebypage" target="_self"&gt;Table.GenerateByPage sample code&lt;/A&gt;&amp;nbsp;to combine pages of data together. However, this seems to take about 2.4 seconds to add each ~30MB page of data to the combined table, which seems very slow.&lt;BR /&gt;&lt;UL&gt;&lt;LI&gt;Could there be any other way to combine tables together in Power Query that still works for very large datasets? We can also use &lt;FONT face="courier new,courier"&gt;Table.Combine&lt;/FONT&gt;, but it seems to slow down greatly/fail for 2-3 GB of data.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Wed, 21 Aug 2024 13:59:38 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4111400#M54888</guid>
      <dc:creator>xli629b</dc:creator>
      <dc:date>2024-08-21T13:59:38Z</dc:date>
    </item>
    <item>
      <title>Re: Json.Document and Table.GenerateByPage slow performance/alternatives</title>
      <link>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4112185#M54895</link>
      <description>&lt;P&gt;You could consider running Python or R scripts to parse the JSON but I'd think these may not be faster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Binary.Buffer might speed up some of the network transfers but probably won't help with parsing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As for the sample code - your main tools are List.Generate and List.Accumulate . Both have their strengths and weaknesses, you'd have to test how each behaves in your scenario. Then use Table.ExpandTableColumn rather than Table.Combine.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Aug 2024 00:20:13 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4112185#M54895</guid>
      <dc:creator>lbendlin</dc:creator>
      <dc:date>2024-08-22T00:20:13Z</dc:date>
    </item>
    <item>
      <title>Re: Json.Document and Table.GenerateByPage slow performance/alternatives</title>
      <link>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4140755#M55188</link>
      <description>&lt;P&gt;Thank you for the recommendations. We don't think we can use Python/R scripts in our connector itself as I think these would create external dependencies for our clients.&lt;BR /&gt;&lt;BR /&gt;I have tried using Binary.Buffer on the response, and it seem that it's slightly slower (measured by total processing time) than directly passing the result of Web.Contents to Json.Document.&lt;BR /&gt;&lt;BR /&gt;I think we do need to always use List.Generate as we don't have a pre-determined number of pages, but only learn whether there is a next page or not after receiving each page. I have also tried List.Combine on raw lists (rather than of tables) instead of Table.ExpandTableColumn and Table.Combine, but the performance seems to be similar.&lt;BR /&gt;&lt;BR /&gt;Do you think there might be anything else that could affect the performance?&lt;/P&gt;</description>
      <pubDate>Mon, 09 Sep 2024 13:43:11 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4140755#M55188</guid>
      <dc:creator>xli629b</dc:creator>
      <dc:date>2024-09-09T13:43:11Z</dc:date>
    </item>
    <item>
      <title>Re: Json.Document and Table.GenerateByPage slow performance/alternatives</title>
      <link>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4140764#M55189</link>
      <description>&lt;P&gt;The size of the JSON payload plays a big role. See if you can get a less chatty endpoint that only has the fields you need.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here's an interesting article, slightly off topic but may help&amp;nbsp;&lt;A href="https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/effective-strategies-for-storing-and-parsing-json-in-sql-server/" target="_blank"&gt;Effective Strategies for Storing and Parsing JSON in SQL Server - Simple Talk (red-gate.com)&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 09 Sep 2024 15:35:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Developer/Json-Document-and-Table-GenerateByPage-slow-performance/m-p/4140764#M55189</guid>
      <dc:creator>lbendlin</dc:creator>
      <dc:date>2024-09-09T15:35:28Z</dc:date>
    </item>
  </channel>
</rss>

