The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
I have a folder with aprox 2000 (and more is added weekly) similar files, that I want to load into a single table.
All the files should have the same amount of columns - however because humen is involved in creating the files some of them doesn't match the correct syntax. This result in my query to fail.
What I am after is some kind of error handling in the query, so the query doesn't stop, but skip the file and then continue. A bonus feature would be an extra table with the filename of those files that failed to be loaded.
Have anyone of you done anything similar to this or perhaps have an idea on how to solve it?
Solved! Go to Solution.
I found a solution for this and wanted to share it:
I created a function to handle the load of the individual Excel sheet. I then use this funtion in to individual queries. One to load the data from the files and one that will return the file name of the files that are unable to be processed by the funtion.
The function look like this:
(sourceFile as text) as table => let Source = Excel.Workbook(File.Contents(sourceFile), null, true), ExpandData = Table.ExpandTableColumn( Source, "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15"}, {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15"} ), AddRowNumber = Table.AddIndexColumn(ExpandData, "Row", 0, 1), //Remove Rows with only Nulls AddColumnRowWithData = Table.AddColumn(AddRowNumber, "RowWithData", each if [Column1] = null and [Column2] = null and [Column3] = null and [Column4] = null and [Column5] = null and [Column6] = null and [Column7] = null and [Column8] = null and [Column9] = null and [Column10] = null and [Column11] = null and [Column12] = null and [Column13] = null and [Column14] = null and [Column15] = null then "Empty Row" else "Row with data" ), RemoveRowsWithNoData = Table.SelectRows(AddColumnRowWithData, each ([RowWithData] = "Row with data")), RemoveColomnRowWithData = Table.RemoveColumns(RemoveRowsWithNoData, {"RowWithData"}), //Remove Columns with No Name TransposeTable = Table.Transpose(RemoveColomnRowWithData), RemoveColumnsWithNoName = Table.SelectRows(TransposeTable , each ([Column1] <> null)), TransposeTable2 = Table.Transpose(RemoveColumnsWithNoName), PromoteHeaders = Table.PromoteHeaders(TransposeTable2, [PromoteAllScalars=true]), RenameColumn = Table.RenameColumns(PromoteHeaders,{{"0", "Row"}}) in RenameColumn
The code for returning a list of files with errors:
let FilePath = fnGetParameter("File Path"), Source = Folder.Files(FilePath), RemoveOtherColumns = Table.SelectColumns( Source, {"Name", "Folder Path"}), CallFunction = Table.AddColumn(RemoveOtherColumns, "LoadAllFiles", each fnLoadFiles([Folder Path]&[Name])), KeepFilesWithErrors = Table.SelectRowsWithErrors(CallFunction, {"LoadAllFiles"}), RemoveColumns = Table.RemoveColumns(KeepFilesWithErrors ,{"Folder Path", "LoadAllFiles"}) in RemoveColumns
HI @sdjensen,
>>All the files should have the same amount of columns - however because humen is involved in creating the files some of them doesn't match the correct syntax.
In my opinion, you can try to add some steps to filter the error records , then merge these filtered records.
BTW, you can also add the 'Data Validation' to columns of your source file if you use the excel files as source.
Excel data validation - regular expressions?
Regards,
Xiaoxin Sheng
It's not errors on a single row - it's errors like where some of the files have more columns so data is not in the same columns as the correct syntax - this breaks the load, so can't really remove error rows, because it's not single rows that returns the error, but the file itself and as I said that stops the import.
HI @sdjensen,
If your tables contains the different structure or column index, I think you need to format them before merge these table.
I don't think it will autoformat your source data.
For example: Use Table.SelectColumns function to find out speicfic columns and index.
Table.Combine | Returns a table that is the result of merging a list of tables. The tables must all have the same row type structure. |
Regards,
Xiaoxin Sheng
As I wrote in my initial post there is a lot of files and manually going through all of them would be an almost impossible task - hence my request for a function that could return the filename of files that fail to load in a different table/list.
Then I could return this list to the involved staff and ask them to handle the files with errors.
Table.SelectColumn will not solve my problem - I can't be sure that the columns is named the same.
Hi @sdjensen,
For your scenario, I'd like to suggest you to use programming languages to deal with these data validation and error handling, then output the formatted data. (I think it will be more simply than power query)
Finally, you can use power bi to import these analysis/formatted data.
Regards,
Xiaxoin Sheng
I found a solution for this and wanted to share it:
I created a function to handle the load of the individual Excel sheet. I then use this funtion in to individual queries. One to load the data from the files and one that will return the file name of the files that are unable to be processed by the funtion.
The function look like this:
(sourceFile as text) as table => let Source = Excel.Workbook(File.Contents(sourceFile), null, true), ExpandData = Table.ExpandTableColumn( Source, "Data", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15"}, {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6", "Column7", "Column8", "Column9", "Column10", "Column11", "Column12", "Column13", "Column14", "Column15"} ), AddRowNumber = Table.AddIndexColumn(ExpandData, "Row", 0, 1), //Remove Rows with only Nulls AddColumnRowWithData = Table.AddColumn(AddRowNumber, "RowWithData", each if [Column1] = null and [Column2] = null and [Column3] = null and [Column4] = null and [Column5] = null and [Column6] = null and [Column7] = null and [Column8] = null and [Column9] = null and [Column10] = null and [Column11] = null and [Column12] = null and [Column13] = null and [Column14] = null and [Column15] = null then "Empty Row" else "Row with data" ), RemoveRowsWithNoData = Table.SelectRows(AddColumnRowWithData, each ([RowWithData] = "Row with data")), RemoveColomnRowWithData = Table.RemoveColumns(RemoveRowsWithNoData, {"RowWithData"}), //Remove Columns with No Name TransposeTable = Table.Transpose(RemoveColomnRowWithData), RemoveColumnsWithNoName = Table.SelectRows(TransposeTable , each ([Column1] <> null)), TransposeTable2 = Table.Transpose(RemoveColumnsWithNoName), PromoteHeaders = Table.PromoteHeaders(TransposeTable2, [PromoteAllScalars=true]), RenameColumn = Table.RenameColumns(PromoteHeaders,{{"0", "Row"}}) in RenameColumn
The code for returning a list of files with errors:
let FilePath = fnGetParameter("File Path"), Source = Folder.Files(FilePath), RemoveOtherColumns = Table.SelectColumns( Source, {"Name", "Folder Path"}), CallFunction = Table.AddColumn(RemoveOtherColumns, "LoadAllFiles", each fnLoadFiles([Folder Path]&[Name])), KeepFilesWithErrors = Table.SelectRowsWithErrors(CallFunction, {"LoadAllFiles"}), RemoveColumns = Table.RemoveColumns(KeepFilesWithErrors ,{"Folder Path", "LoadAllFiles"}) in RemoveColumns
I know this is old but thanks for sharing the solution you found. I'm in the same scenario. You dont happen to have a file you can share where this function is operating? I use a lot more DAX than M so its a little difficult to follow.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
User | Count |
---|---|
111 | |
80 | |
71 | |
52 | |
50 |
User | Count |
---|---|
129 | |
123 | |
78 | |
64 | |
60 |