arpost
Post Prodigy

Anyone know how to fix external data type error with Lakehouse files?

Greetings, all. I'm trying to copy data from files loaded into a Lakehouse into a Data Warehouse table and am running into some sort of data type error. The files loaded into the Lakehouse were loaded as CSVs (DelimitedText).

 

Here's a "cleaned up" version of the error I get with my Copy Activity:

 

Column 'TotalItems' of type 'INT' is not compatible with external data type 'Parquet physical type: BYTE_ARRAY, logical type: UTF8'

I was rather surprised, as I didn't know or expect that files in a Lakehouse would be typed. The column referenced is a column of integers that can range from 1 to 100+.

 

Anyone know how to resolve this? I'm not specifying any data types when loading into the Lakehouse, and I'd REALLY like to get rid of this error.

23 REPLIES
smeetsh
Helper V

To fix this, you will first have to read the file into a Lakehouse table. Next, you create a second Copy Data step.

 

The connector for this step will be your warehouse, but the source of your data will be a query.

In that query you will select all your fields, but your FROM clause will reference the Lakehouse table.

In the query you will need to use CONVERT or CAST to get the correct type for your warehouse. CSV columns are always interpreted as strings, and Fabric is not always able to convert them via the GUI, hence the occasional need for CONVERT/CAST in a query step.
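A sketch of the kind of source query that second step might use; the table and column names here are hypothetical, so substitute your own Lakehouse table and warehouse column types:

```sql
-- Cast the string columns from the Lakehouse table to the target
-- warehouse column types. All names below are illustrative.
SELECT
    CAST(TotalItems AS INT)  AS TotalItems,
    CAST(LoadDate   AS DATE) AS LoadDate,
    FileName
FROM MyLakehouse.dbo.StagedFiles;
```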
An example of one of our pipelines below:

smeetsh_1-1738615736494.png

 

smeetsh_2-1738615833267.png

 



smeetsh_0-1738615703664.png

 



pavannarani
Helper III

Did anyone find a solution to this? How can we load data from CSV files in a Lakehouse into Fabric warehouse tables through a data pipeline, or is there another component in Fabric that can do this job?

frithjof_v
Super User

I'm experiencing the same issue.

v-cboorla-msft
Microsoft Employee

Hi @arpost 

 

Welcome to Microsoft Fabric Community and thanks for posting your question here.

 

As I understand it, you are trying to copy data from files loaded into a Lakehouse into a Data Warehouse table and are facing a data type error.

 

When I tried to reproduce this scenario using CSV files, I didn't get any error. I was able to load the file into a table in the warehouse. I tried the repro with multiple CSV files, yet there was no error.

vcboorlamsft_0-1697802834643.png

 

Can you please share the screenshot of the copy activity which you are using?

 

Thanks.

 

 

Hi, @v-cboorla-msft. I am wondering if the issue is that the Parquet type is being interpreted by Fabric as a BYTE_ARRAY for some files because those files only have a value of 1 for the TotalItems column. So while File A's TotalItems column may have values of 1, 2, 5, 10, 25, etc., File B's TotalItems column only has 1.

 

Here's the Copy Activity Source/Destination that loads the files into the Lakehouse:

 

arpost_1-1697808098502.png

 

And here's the Copy Activity that's failing when trying to load into Data Warehouse:

 

arpost_3-1697808393232.png

 

Hi @arpost 

 

Apologies for the delay in response.

Thank you for the information that you have provided.

 

I can see in your screenshot that you're loading data into an existing warehouse table. If the file's schema doesn't match that existing table, we can encounter the same error. I attempted to replicate the same situation with my data and encountered a similar issue.

 

vcboorlamsft_0-1698324110776.png

 

It is recommended to always validate the schema of the source before using the 'auto create table' option, to ensure that the table is created with the correct data types. Try loading the data using the 'auto create table' option; if that succeeds, it helps confirm whether the issue is schema-related.

 

When I tried using the exact file data with an existing table, it copied successfully for me.

 

vcboorlamsft_0-1698324591831.png

 

I hope this information is helpful. Please do let us know if you have any questions.

Any thoughts, @v-cboorla-msft?

Hi @arpost 

 

Apologies for the delay in response.

Sorry for the inconvenience that you are facing.

I have reached the internal team for help on this. I will update you once I hear from them.
Appreciate your patience.

Hi @arpost 

 

When you say "When loaded into the Lakehouse as a file, FileA.Value is being typed as an Integer while FileB.Value is being typed as a Byte Array". Where are you verifying that? In the lakehouse?

 

Could you please provide me the details so that I can understand the issue better.

 

Thanks.

That's what the error message (see image you posted above) in ADF states. It notes that the external data type (i.e., data in Lakehouse) is of a particular Parquet physical type (i.e., Byte Array).

 

Column 'TotalItems' of type 'INT' is not compatible with external data type 'Parquet physical type: BYTE_ARRAY, logical type: UTF8'

 

That typing wasn't something I ever set up, and there isn't a way to set that up when copying files into the Lakehouse initially before trying to load to the DW.

Hi @arpost 

 

Here is an update from the internal team:

 

"There's unique mapping for the combination of Parquet physical type & logical type to .NET type.

 

Check out this online tool to read Parquet file content and inspect schema & metadata with an intuitive user experience." 
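To illustrate what such a mapping looks like, here is a small sketch. The pairings below follow the Parquet format specification; Fabric's actual internal .NET mapping is not published, so treat the exact table as an assumption:

```python
# Illustrative mapping of (physical type, logical type) pairs to .NET types,
# following the Parquet format spec. Fabric's internal mapping may differ.
PARQUET_TO_DOTNET = {
    ("BYTE_ARRAY", "UTF8"): "System.String",
    ("INT32", None): "System.Int32",
    ("INT64", None): "System.Int64",
    ("INT32", "DATE"): "System.DateTime",
    ("DOUBLE", None): "System.Double",
    ("BOOLEAN", None): "System.Boolean",
}

def dotnet_type(physical, logical=None):
    """Look up the target .NET type for a Parquet column."""
    return PARQUET_TO_DOTNET[(physical, logical)]

# The failing column from the error message in this thread was stored as
# BYTE_ARRAY/UTF8, i.e. a string, which is why it can't land in an INT column.
failing = dotnet_type("BYTE_ARRAY", "UTF8")
```

Because (BYTE_ARRAY, UTF8) maps to a string type, a warehouse INT column cannot accept it without an explicit cast somewhere in the pipeline.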

 

Please let us know if you need any help.

 

Thanks.

Could you clarify this a bit further, @v-cboorla-msft? Is this purely informational? The issue is that Fabric is storing the files in a Lakehouse with types that can't be converted for use in a DW, and there's nothing I as the user can do to solve that (to my knowledge), because the Parquet typing is all done by Fabric under the hood.

Hi @arpost 

 

Can you please provide a sample Parquet file for us to repro the issue? We would like to understand why there is an issue.

You can send us this information through email to AzCommunity[at]Microsoft[dot]com with the below details,

Subject of the email: ATTN: PRADEEP - Thread title

 

Thanks.

Any update on this, @v-cboorla-msft?

@v-cboorla-msft, I emailed the address you provided (AzCommunity@Microsoft.com) and got the following back:

 

Good day!

 

I am one of the moderators for Azure Community Connect DL. Post reviewing your email, we couldn’t find active thread details. Please confirm if there is any active thread posted in Microsoft Q&A platform : https://learn.microsoft.com/en-us/answers/index.html ?

 

If yes, share us the URL. Otherwise, request you to open a new thread on Q&A platform for your query/issue so that one of the community experts will reach you shortly for further assistance.

Hi @arpost 

 

We haven't heard from you since the last response and were just checking back to see if you have a resolution yet.
If you do have a resolution, please share it with the community, as it can be helpful to others.

Thanks

@v-cboorla-msft, any update on this?

I just emailed you some sample files. Both CSVs get uploaded to a Lakehouse as files. We then need to copy them into a Data Warehouse with a table that has an INT column using a Data Pipeline. ColumnB in the CSV files should be mapped to the INT column in the DW.

arpost_1-1701709281048.png

 

 

@v-cboorla-msft, appreciate the reply. Glad to see you were able to replicate the issue. The destination table in the DW is in the desired schema; the issue is that Fabric is apparently identifying the Parquet data type automatically behind the scenes, which makes it impossible to load files from a Lakehouse into a DW because they "should" have the same data types but are being read into the Lakehouse with different Parquet types.

 

So if File A looks like this:

Date        Value
1/1/2023    25
1/2/2023    50

 

But File B looks like this:

Date        Value
2/1/2023    1
2/2/2023    1

 

When loaded into the Lakehouse as a file, FileA.Value is being typed as an Integer while FileB.Value is being typed as a Byte Array. But they are the same "schema" and should be treated as such by Fabric.

 

Is there a solution to fix this behavior? Otherwise, it really doesn't make sense to load files into the Lakehouse.
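Pending a platform-side fix, the practical workaround (per the CAST/CONVERT suggestion earlier in the thread) is to apply one explicit schema rather than rely on per-file type inference. A minimal stdlib sketch of that principle, using the toy File A / File B data above:

```python
import csv
import io

# Two CSVs with the "same" schema; File B's Value column only ever contains 1.
file_a = "Date,Value\n1/1/2023,25\n1/2/2023,50\n"
file_b = "Date,Value\n2/1/2023,1\n2/2/2023,1\n"

def load_rows(text):
    """Read a CSV and apply an explicit schema instead of inferring types,
    so both files come out with Value as an int regardless of contents."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({"Date": row["Date"], "Value": int(row["Value"])})
    return rows

rows_a = load_rows(file_a)
rows_b = load_rows(file_b)

# Both files now carry an identical schema, unlike per-file inference.
assert type(rows_a[0]["Value"]) is type(rows_b[0]["Value"]) is int
```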

Hello, old post but I'm facing a similar issue. Did you get a fix?
