
amien
Helper V

Reading a huge table from Oracle (using the on-premises data gateway)

What is the best approach in Fabric if I need to read 100 million records from an Oracle table using an on-premises gateway?

Do I have other options besides incremental reloads using pipelines/Dataflow Gen2?

Thanks in advance

5 REPLIES
amien
Helper V

 

Thanks for your reply.

 

The query is SELECT *; it takes hours and hours to load, so I stopped it. Every day/week, more data is added to the table.

 

> How many columns is * ?
> Is the data slowly changing, or can you use Incremental Refresh?
> Is it any faster when you export to CSV (for example)?

The table has about 50 columns. One week of data already takes an hour to load.

Incremental refresh for each week would be possible. There are no smart Fabric features that I can leverage, right? I was looking at mirroring, etc.
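For concreteness, the incremental pattern boils down to filtering on a watermark column so each run pulls only the new rows; below is a minimal sketch in Python using the python-oracledb driver. The LOAD_DATE column, connection details, and table name are placeholders, not known details of this table. A pipeline copy activity expresses the same idea with a parameterized source query.

```python
import oracledb  # python-oracledb driver

# Hypothetical connection details and column/table names -- adjust to your schema.
conn = oracledb.connect(user="app", password="***", dsn="dbhost/ORCLPDB1")

# Only fetch rows added since the last successful load (the watermark),
# instead of re-reading all 100M rows with a bare SELECT *.
last_watermark = "2024-10-01"  # persisted after the previous run

sql = """
    SELECT *
      FROM big_table
     WHERE load_date > TO_DATE(:wm, 'YYYY-MM-DD')
"""

with conn.cursor() as cur:
    cur.arraysize = 10_000  # larger fetch batches cut gateway round trips
    cur.execute(sql, wm=last_watermark)
    total = 0
    while batch := cur.fetchmany():
        total += len(batch)  # replace with a write to your ingestion layer
    print(f"copied {total} new rows since {last_watermark}")
```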

 

If I want to explore incremental loads and store the data in an ingestion layer (the first layer), I have two questions:

 

* Should I use a pipeline for that, or is it better to use Dataflow Gen2?

* What is the best way to import the previous data from a CSV file into the lakehouse, including adding the field mapping? (See the sketch after this list.)
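On the second question, one common route is a Fabric notebook: read the CSV with an explicit schema, which doubles as the field mapping, and append it to a lakehouse Delta table. A hedged sketch follows; the file path, table name, and columns are placeholders, not the real schema.

```python
# Sketch for a Fabric notebook (the spark session is predefined there;
# outside Fabric you would build a SparkSession first).
from pyspark.sql.types import (StructType, StructField, StringType,
                               TimestampType, DecimalType)

# The explicit schema is the field mapping: name, type, nullability.
schema = StructType([
    StructField("ORDER_ID",  DecimalType(18, 0), False),
    StructField("LOAD_DATE", TimestampType(),    False),
    StructField("STATUS",    StringType(),       True),
    # ... remaining columns
])

df = (spark.read
      .option("header", "true")
      .schema(schema)            # enforce the mapping instead of inferring it
      .csv("Files/history/big_table/"))

df.write.format("delta").mode("append").saveAsTable("ingest_big_table")
```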

Here's another idea: create Parquet files from Oracle and then use them directly in the lakehouse.
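To make that idea concrete, here is a hedged sketch that dumps the table to Parquet part files in batches with python-oracledb and pyarrow; the connection details and names are placeholders. The resulting files can be uploaded to the lakehouse Files area and loaded as a table.

```python
import oracledb
import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical connection details -- adjust to your environment.
conn = oracledb.connect(user="app", password="***", dsn="dbhost/ORCLPDB1")

with conn.cursor() as cur:
    cur.arraysize = 50_000
    cur.execute("SELECT * FROM big_table")
    cols = [d[0] for d in cur.description]

    part = 0
    while batch := cur.fetchmany():
        # Pivot the row batch into columns and write one Parquet part file.
        table = pa.table({c: [row[i] for row in batch]
                          for i, c in enumerate(cols)})
        pq.write_table(table, f"big_table_{part:05d}.parquet")
        part += 1
```

Run next to the Oracle server, this avoids pulling 100M rows through the gateway row by row; only the compressed Parquet files cross the network.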

lbendlin
Super User

Please provide more details. Is this a one-time load, or do you plan to refresh the data? How fast is the Oracle source? How complex is the query?
