Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
nrogers
New Member

Data corruption when copying an Oracle table with number fields

I created a pipeline with a Copy Data task that reads from Oracle and writes to a Fabric lakehouse.  I noticed that every column defined in Oracle as NUMBER with no specified scale or precision is corrupted by the copy task.  The values in the lakehouse table are off by factors of around 10^18.  The factor seems to depend on the data.

 

As a test, I ran a copy task that reads from an Oracle query that hard-codes a variety of number values.  The values show correctly in the preview in the copy task wizard, but not in the lakehouse table.

 

The only workaround I've found is to first copy from Oracle to csv and then from csv to lakehouse tables.

 

select cast(1 as number) test_1number, cast(0.1 as number) test_01number, cast(10 as number) test_10number, cast(0.15 as number) test_015number, cast(1 as number) test_mixed_number, cast(15 as number) test_15number,  cast(1 as number(5)) test_number_5, cast(1 as number(5,2)) test_number_5_2 from dual
union all
select cast(1 as number) test_1number, cast(0.1 as number) test_01number, cast(10 as number) test_10number, cast(0.15 as number) test_015number, cast(0.1 as number) test_mixed_number, cast(15 as number) test_15number, cast(1 as number(5)) test_number_5, cast(1 as number(5,2)) test_number_5_2 from dual
nrogers_0-1730389739610.png

 

1 ACCEPTED SOLUTION
FabianSchut
Impactful Individual
Impactful Individual

This has been a struggle for me for a few weeks too. I've created my own workaround for this problem. It took some time to finetune, but it works for me. Hope it helps you too.
First, I created a lookup that retrieves all the column names from the table in Oracle. The table and schema are pipeline parameters in my case, but can be hardcoded or retrieved in another way too.

FabianSchut_0-1730391126814.png

The Query for the lookup is (I use XMLAGG instead of other options to mitigate the problem that the returned string was too long) :

@concat('SELECT RTRIM(XMLAGG(XMLELEMENT(E,CASE WHEN DATA_TYPE = ''NUMBER'' AND (CAST (DATA_SCALE AS NUMBER(5,0)) >= CAST (DATA_PRECISION AS NUMBER(5,0)) OR DATA_SCALE IS NULL OR DATA_PRECISION IS NULL) THEN ''CAST('' || COLUMN_NAME || '' AS NUMBER(38,18)) AS '' || COLUMN_NAME ELSE COLUMN_NAME END,'', '').EXTRACT(''//text()'') ORDER BY COLUMN_NAME).GetClobVal(),'', '') AS query_string
 FROM ALL_TAB_COLUMNS
 WHERE TABLE_NAME = ''', pipeline().parameters.table_name, '''
 AND OWNER = ''', pipeline().parameters.schema, '''')

 Then I use these retrieved columns in my copy data:

FabianSchut_1-1730391271767.png

And the query for the source is:

@concat('SELECT ',activity('lookup_columns_from_table').output.firstRow.QUERY_STRING, ' FROM ', pipeline().parameters.schema, '.', pipeline().parameters.table_name)

 

View solution in original post

1 REPLY 1
FabianSchut
Impactful Individual
Impactful Individual

This has been a struggle for me for a few weeks too. I've created my own workaround for this problem. It took some time to finetune, but it works for me. Hope it helps you too.
First, I created a lookup that retrieves all the column names from the table in Oracle. The table and schema are pipeline parameters in my case, but can be hardcoded or retrieved in another way too.

FabianSchut_0-1730391126814.png

The Query for the lookup is (I use XMLAGG instead of other options to mitigate the problem that the returned string was too long) :

@concat('SELECT RTRIM(XMLAGG(XMLELEMENT(E,CASE WHEN DATA_TYPE = ''NUMBER'' AND (CAST (DATA_SCALE AS NUMBER(5,0)) >= CAST (DATA_PRECISION AS NUMBER(5,0)) OR DATA_SCALE IS NULL OR DATA_PRECISION IS NULL) THEN ''CAST('' || COLUMN_NAME || '' AS NUMBER(38,18)) AS '' || COLUMN_NAME ELSE COLUMN_NAME END,'', '').EXTRACT(''//text()'') ORDER BY COLUMN_NAME).GetClobVal(),'', '') AS query_string
 FROM ALL_TAB_COLUMNS
 WHERE TABLE_NAME = ''', pipeline().parameters.table_name, '''
 AND OWNER = ''', pipeline().parameters.schema, '''')

 Then I use these retrieved columns in my copy data:

FabianSchut_1-1730391271767.png

And the query for the source is:

@concat('SELECT ',activity('lookup_columns_from_table').output.firstRow.QUERY_STRING, ' FROM ', pipeline().parameters.schema, '.', pipeline().parameters.table_name)

 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

November Update

Fabric Monthly Update - November 2024

Check out the November 2024 Fabric update to learn about new features.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.