Fabroulous
New Member

Fabric Copy Activity (Data Pipeline) Reading Excel File Does NOT recognise Data Types

In Microsoft Fabric I have built a Data Pipeline that uses a Copy Activity to read an Excel file in my Lakehouse (I uploaded the file there).

But the Copy Activity doesn't reliably infer or recognize data types, especially when the data passes through formats like Excel or Parquet along the way.

For example, I have a column that simply contains numbers between 1 and 5000, and it assumes it's a string. In fact, it assumes all of my columns are strings.

 

Is there a solution to this?

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Fabroulous ,
Thank you for reaching out to us on Microsoft Fabric Community Forum!

Excel files may be tricky when used with Copy Activity in Microsoft Fabric. Even when a column clearly has numbers, Fabric might treat it as text due to how Excel handles formatting. There are a few ways to fix this and ensure your data types are set correctly:

  • Manually define data types in the Copy Activity. In the Copy Activity’s Source settings, go to the 'Schema' tab and manually set each column’s data type.
  • Use a Dataflow Gen2 to cast types with functions like toInteger() before loading.
  • Check your Excel formatting: make sure columns are set to "Number" before upload, or save as CSV for cleaner inference (see the pandas sketch below).
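If you go the CSV route, one quick way to convert the workbook before re-uploading it is a small pandas step. This is only a minimal sketch: the file name sales.xlsx and the column name OrderId are placeholders, and reading .xlsx files assumes the openpyxl engine is available.

    import pandas as pd

    # Read the workbook and force the numeric column to an integer type
    # (reading .xlsx files requires the openpyxl engine).
    df = pd.read_excel("sales.xlsx", sheet_name=0, dtype={"OrderId": "Int64"})

    # Write a plain CSV that the Copy Activity can parse more predictably.
    df.to_csv("sales.csv", index=False)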

Refer to these documents for more details:
https://learn.microsoft.com/en-us/fabric/data-factory/format-excel 
https://learn.microsoft.com/en-us/fabric/data-factory/data-type-mapping 

 

If this resolved your query, consider accepting it as the solution.

Regards,
Pallavi.

 

Anonymous
Not applicable

Hi @Fabroulous ,
I wanted to follow up on our previous suggestions regarding the issue you are facing. We would like to hear back from you to ensure we can assist you further. If our response has addressed your query, please accept it as a solution and give a ‘Kudos’ so other members can easily find it. 
Thank you.

andrewsommer
Super User

Excel doesn’t store explicit data types. A cell might look like a number, but unless all values in the column conform, the ingestion engine plays it safe.  Type inference is schema-on-read and based on a sample of rows (often the first N rows), so a few string-like entries (e.g., a blank cell or a header row that wasn't skipped) can cause the entire column to be inferred as a string.

 

Option 1: Schema Definition via Mapping in Copy Activity

Use the "Mapping" tab inside the Copy Activity to manually define the column types. Here’s how:

  • After selecting source and sink, go to the Mapping tab.
  • Click Import schemas (you’ll likely see all columns as string).
  • Change the destination column types to what you want (integer, double, boolean, etc.).
  • This doesn't change the source, but it will force type coercion during the copy.

Option 2: Preprocess the Data via Dataflow or Notebook

If the Copy Activity mapping is too rigid, consider adding a Dataflow Gen2 or a notebook (Spark or PySpark) step before ingestion:

  • Read the Excel file using spark.read.format("excel") or the Dataflow equivalent.
  • Use cast() to explicitly set column types (see the sketch after this list).
  • Write the cleaned and typed data to your Lakehouse as Delta/Parquet.
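A minimal sketch of that notebook step, using the notebook's built-in spark session and assuming a Spark Excel reader (e.g. the com.crealytics spark-excel library) is installed in your Fabric Spark environment; the path Files/sales.xlsx and the column names OrderId and Amount are placeholders, not anything from the original post.

    from pyspark.sql.functions import col

    # Read everything as string first, then cast explicitly.
    df = (
        spark.read.format("excel")          # needs a Spark Excel reader library
        .option("header", "true")
        .option("inferSchema", "false")
        .load("Files/sales.xlsx")
    )

    # cast() coerces each column; values that don't parse become null.
    typed = (
        df.withColumn("OrderId", col("OrderId").cast("int"))
          .withColumn("Amount", col("Amount").cast("double"))
    )

    # Write the typed result to the Lakehouse as a Delta table.
    typed.write.mode("overwrite").format("delta").saveAsTable("sales_typed")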

Option 3: Convert Excel to CSV Before Upload

  • CSV often allows better control during parsing (e.g., headers, delimiters, missing values).
  • Then, use Copy Activity with schema mapping, or read via a Notebook/Dataflow with explicit schema (sketched below).
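For the Notebook route with an explicit schema, a rough sketch (the path Files/sales.csv and the column names are placeholders): declaring the schema up front means nothing is inferred from sampled rows.

    from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DoubleType

    # Declare the schema explicitly so no types are inferred.
    schema = StructType([
        StructField("OrderId", IntegerType(), True),
        StructField("Product", StringType(), True),
        StructField("Amount", DoubleType(), True),
    ])

    df = (
        spark.read
        .option("header", "true")
        .schema(schema)
        .csv("Files/sales.csv")
    )

    df.write.mode("overwrite").format("delta").saveAsTable("sales_from_csv")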

 

 

Please mark this post as a solution if it helps you. Appreciate Kudos.

Anonymous
Not applicable

Hi @Fabroulous ,
Thank you for reaching out to us on Microsoft Fabric Community Forum!

Excel files may be tricky when used with Copy Activity in Microsoft Fabric. Even when a column clearly has numbers, Fabric might treat it as text due to how Excel handles formatting. There are a few ways to fix this and ensure your data types are set correctly:

  • Manually define data types in the Copy Activity.In the Copy Activity’s Source settings, go to the 'Schema' tab and manually set each column’s data type.
  • Use a Dataflow Gen2 to cast types with functions like toInteger() before loading.
  • Check your Excel formatting , make sure columns are set to "Number" before upload, or save as CSV for cleaner inference.

Refer the document here for more understanding:
https://learn.microsoft.com/en-us/fabric/data-factory/format-excel 
https://learn.microsoft.com/en-us/fabric/data-factory/data-type-mapping 

 

If this resolved your query,consider accepting it as solution.

Regards,
Pallavi.

 

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Fabric Update Carousel

Fabric Monthly Update - October 2025

Check out the October 2025 Fabric update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.