Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
I have a batch of Excel files that I'm trying to use as a source for copy data. The first row in these files is blank, and even if I specify the range start with A2, the run will fail with the error below. Removing the first row resolves the issue, but is not ideal as these files are automated and this would require manual edits.
Is this a known issue, and is there any sort of workaround?
Activity failed because an inner activity failed; Inner activity name: Copy data1, Error: ErrorCode=ExcelUnsupportedFormat,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Only '.xls' and '.xlsx' format is supported in reading excel file while error is ' at NPOI.HSSF.Record.RecordInputStream.get_HasNextRecord()
at NPOI.HSSF.Record.RecordFactoryInputStream.NextRecord()
at NPOI.HSSF.Record.RecordFactory.CreateRecords(Stream in1)
at NPOI.HSSF.UserModel.HSSFWorkbook..ctor(DirectoryNode directory, Boolean preserveNodes)
at Microsoft.DataTransfer.ClientLibrary.ExcelUtility.GetExcelWorkbook(String fileExtension, TransferStream stream)'.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=NPOI.HSSF.Record.LeftoverDataException,Message=Initialisation of record 0x200 left 2 bytes remaining still to be read.,Source=NPOI,'
Hi @dude1500
Depending on your requirements, you do not want to manually delete the first row.
We can deal with this in an automated way. Here are the steps on how to automatically skip the first line in an Excel file using the Data Pipeline tool in Microsoft Fabric:
First, make sure your Excel file has been uploaded to an accessible storage location.
In the Microsoft Fabric-enabled workspace, select new Data Flow Gen2.
In the Data Flow Editor, select "Import from File" and connect to your Excel file.
In the data flow editor, select Add Step and select Delete Row. Configure the Delete row step to delete the first row.
Once the configuration is complete, click "Save and Publish" to save the data stream.
In the workspace, select New Data Pipeline.
Add a data Flow activity Add a data flow activity in the Data Pipeline editor. In Settings, configure the data flow details.
The replication data activity is then configured to copy the processed data from the data stream to the destination store.
Once the configuration is complete, click "Save" and select "Run" to execute the data pipeline.
Regards,
Nono Chen
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.