Anonymous
Not applicable

Issue with copy data from CSV

I'm copying data from a CSV file that was uploaded to a lakehouse into a table in the same lakehouse. I've set it up three different ways:

1. A pipeline Copy activity with the CSV as the source and a new table as the destination

2. Using the "Load to Tables" feature on the file from the Lakehouse explorer

3. A simple dataflow with the CSV file as the source and a new table as the destination

 

Options 2 and 3 work as expected. Option 1, the pipeline, produces the following:

1. The table appears to be missing the data for the last column when viewed in the Lakehouse explorer.

2. An error occurs when accessing the SQL Endpoint with a table in this state.

3. Any further use of the table in other ways, such as a dataflow, is also missing the last column's data.

4. However, querying the table with Spark SQL in a notebook does show the data (see the sketch below).
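
For illustration, a check along these lines in a notebook shows the data. This is a minimal sketch, assuming the pre-created `spark` session of a Fabric notebook, with `my_csv_table` and `last_col` as placeholder names for the actual table and column:

```python
# Minimal sketch of the notebook check in point 4.
# "my_csv_table" and "last_col" are placeholders, not the real names.
spark.sql("DESCRIBE my_csv_table").show()                       # the last column appears in the schema
spark.sql("SELECT last_col FROM my_csv_table LIMIT 10").show()  # and its values are populated here,
                                                                # even though they are blank elsewhere
```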

 

Here is a series of screenshots depicting the issue. NOTE: I conducted a similar experiment with a JSON file and everything was fine, so it seems to be an issue with how the pipeline Copy activity handles CSVs.

 

Load via Pipeline (missing last column)

kwebb_2-1687109628305.png

 

Load via "Load to Tables" (last column OK)

kwebb_1-1687109604019.png

 

Load via Dataflow (last column OK)

kwebb_0-1687109572579.png

 

Error in the SQL endpoint when the table loaded via the pipeline exists.

kwebb_3-1687109733959.png

 

Spark SQL query of the table, showing the last column's data that is missing everywhere else.

kwebb_4-1687109782214.png

 

 

4 REPLIES
ajarora
Microsoft Employee

Yes, that was my assumption. In that case, reset any existing mappings and your issue should be resolved.

sudhav
Helper V

Hi, in the pipeline case you have to take care of the import schema option: map the column names and data types there, and then you will get the expected output.

 

sudhav_0-1687193826769.png

If this answers your question, please mark it as the solution.

ajarora
Microsoft Employee

Make sure you have no explicit mapping set in the copy activity.

Please log a support ticket with your activity run ID as well as the pipeline JSON. We will resolve it ASAP.

Anonymous
Not applicable

Thanks, ajarora. Upon inspecting the JSON of the Copy activity, I found it contained a schema from a different CSV file than the one selected for the source. What I think happened is that when creating the new pipeline I reused the name of one I had previously deleted, and somehow content from the old one got mixed into the new one, but I'm not sure. I will try to reproduce.
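
For anyone hitting the same symptom, one way to spot a mapping carried over from a different CSV is to compare the source file's header with the loaded table's columns from a notebook. A rough sketch, with a placeholder path and table name:

```python
# Rough sketch: compare the CSV's actual header with the loaded table's columns
# to detect a schema/mapping left over from a different file.
# The path and table name are placeholders.
csv_df = spark.read.option("header", "true").csv("Files/my_folder/my_file.csv")
table_df = spark.table("my_csv_table")

print("CSV columns:  ", csv_df.columns)
print("Table columns:", table_df.columns)
print("In the CSV but not in the table:",
      [c for c in csv_df.columns if c not in table_df.columns])
```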
