Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
osabhaois
Frequent Visitor

Dealing with multiple excel file submission and data inconsistency

Hi

I am updating a data set on a weekly basis with multiple excel files. Some files are added to a folder and some are connected via onedrive web link.

 

The files are provided to me by team members and naturally sometimes columns can be missing.

When this happens to the OneDrive web linked files; this throws exception errors related to the missing columns.

 

I find that when I get data from a folder, if an excel file is missing the correct column it just doesnt load with no error so it can be difficult to troubleshoot when the data doesnt appear in the model. 

 

Is there best practise when setting up a workflow were you are recieving excel data from multiple team members to ensure the data will not cause errors or omissions?

 

 

 

1 ACCEPTED SOLUTION
edhans
Super User
Super User

If the missing columns are needed for the report, nothing you can do about that as the data is missing and the report won't work.

 

However, if these are optional columns that you aren't using, then absolutely! Suggestions:

  1. Remove the automatic "Changed Type" step that Power Query puts in. I actually disable that in global options now. What happens is PQ will apply data types to every column, some of which you don't care about and remove. But by explicitly converting the column data type, it will cause an error if that column is missing on the next refresh.edhans_0-1602694628067.png
  2. Select only the columns you need, and select Remove Other Columns from the Home Ribbon. Now missing columns will not be a factor as you never actually use them. By using Remove Other, you are actually telling Power Query to "Keep these columns" so you are not naming the unneeded columns in a "Remove this column" step.
  3. Now you can set your data types manually for the columns you need to keep.

There are other ways to help make your report more robust. DataChant did a series of Pitfalls in Power Query few years ago and it is still 100% relevant. You can read those, or post back here with some more specific issues you are having.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

2 REPLIES 2
edhans
Super User
Super User

If the missing columns are needed for the report, nothing you can do about that as the data is missing and the report won't work.

 

However, if these are optional columns that you aren't using, then absolutely! Suggestions:

  1. Remove the automatic "Changed Type" step that Power Query puts in. I actually disable that in global options now. What happens is PQ will apply data types to every column, some of which you don't care about and remove. But by explicitly converting the column data type, it will cause an error if that column is missing on the next refresh.edhans_0-1602694628067.png
  2. Select only the columns you need, and select Remove Other Columns from the Home Ribbon. Now missing columns will not be a factor as you never actually use them. By using Remove Other, you are actually telling Power Query to "Keep these columns" so you are not naming the unneeded columns in a "Remove this column" step.
  3. Now you can set your data types manually for the columns you need to keep.

There are other ways to help make your report more robust. DataChant did a series of Pitfalls in Power Query few years ago and it is still 100% relevant. You can read those, or post back here with some more specific issues you are having.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
AnkitKukreja
Super User
Super User

Hi @osabhaois 

 

I can feel that. What I practise is, I used to sent back those files initially and ask them to make the changes and keep the format as it was earlier, though it takes time but that's the best practise. 
So ideally they should keep the column name and they might leave it blank if there is no data but the column sequence and names should be similar as they were in last file.

 

Yeah it woun't show you an explanation for the error, but it would just say load failed or something, but if you'll check the same in Power Query you can find the root cause for the same.

 

Hope it helps.

 

Thanks

Ankit

If my response has successfully addressed your question or concern, I kindly request that you mark this post as resolved. Additionally, if you found my assistance helpful, a thumbs-up would be greatly appreciated.
Thanks,
Ankit Kukreja
www.linkedin.com/in/ankit-kukreja1904

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors