Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
Syndicate_Admin
Administrator
Administrator

Data Validation and Error Handling in PowerQuery

Hello people,

 

i do have a source dataset thats generated from user inputs and therefore need to be validated before i can pass the dataset for analyzing.

 

There are mostly two possible type of errors:

- logical erros ( [Field1] = 0, [Field2] <> 0)

- lookup errors ([Field3] value not in lookup/ list

 

I wrote a powerquery M script that validates the data in three steps.
1.  "if then" statements  to find logical erros, and "Table.Contains" functions to find data thats not in the lookup tables.

 

i.e.
E20 = if [Field1] = 0 and [Field2] <> 0 then 20 else null,

E21 = if [Field3]  = x and [Field4] <> y then 21 else null,

E22 = if Table.Contains(tab_categories, [category = [out_category]]) = false then 22 else null,

 

2.  create a result list of all the conditions/ if clauses and remove null values

 

results = List.RemoveNulls({E20,E21,E22)

 

3. Translate those results into error messages by looking up each element of the result list in a tab_errorcodes and combine the texts:

error_message = Text.Combine(
List.Transform([results], each tab_errorcodes[ErrorMessage]{List.PositionOf(tab_errorcodes[ErrorCode], _,0)})
)

 

 

The script has to handle approx. 2k Datarows but seems to be very slow and produces timeout errors at some point.
I am looking for tips/ hints to learn new ways/ best practices to get the task done and would love to hear some feedback to my approach.

Greetings,
S

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi S,

 

Your approach to validating the data using Power Query M script seems reasonable. However, there might be some room for optimization to improve performance. Here are a few suggestions:

Parallelize the validation process: If possible, try to parallelize the validation process by splitting the dataset into smaller chunks and validating them concurrently. This can help to speed up the overall process.

Optimize the M script: Review your M script for any redundant or unnecessary calculations that might be slowing down the process. You can also consider using native M functions instead of custom functions, as they are usually faster.

Use buffering: If your dataset is large, consider using the function to cache the data in memory. This can help to speed up the validation process by reducing the number of times the data is read from the source.

Performance tip for List.Generate (1): Buffer your tables in Power BI and Power Query

Optimize lookup tables: Make sure your lookup tables are optimized for quick lookups. You can use the function to sort the lookup tables by the key column, which can help to speed up the function.Table.SortTable.Contains

 

Remember that performance optimization is an iterative process, and you might need to try different approaches to find the best solution for your specific use case.I hope these suggestions help you improve the performance of your data validation process. If you have any further questions, please feel free to ask.

 

Best Regards,
Gao

Community Support Team

 

If there is any post helps, then please consider Accept it as the solution  to help the other members find it more quickly. If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

How to get your questions answered quickly --  How to provide sample data in the Power BI Forum

View solution in original post

2 REPLIES 2
Syndicate_Admin
Administrator
Administrator

Hi Gao,
thank you for your Feedback and the input regarding the table buffering.

I will look to implement table buffering in my solution and trying to imporve my overall flow setup.
Greetings,

S

Anonymous
Not applicable

Hi S,

 

Your approach to validating the data using Power Query M script seems reasonable. However, there might be some room for optimization to improve performance. Here are a few suggestions:

Parallelize the validation process: If possible, try to parallelize the validation process by splitting the dataset into smaller chunks and validating them concurrently. This can help to speed up the overall process.

Optimize the M script: Review your M script for any redundant or unnecessary calculations that might be slowing down the process. You can also consider using native M functions instead of custom functions, as they are usually faster.

Use buffering: If your dataset is large, consider using the function to cache the data in memory. This can help to speed up the validation process by reducing the number of times the data is read from the source.

Performance tip for List.Generate (1): Buffer your tables in Power BI and Power Query

Optimize lookup tables: Make sure your lookup tables are optimized for quick lookups. You can use the function to sort the lookup tables by the key column, which can help to speed up the function.Table.SortTable.Contains

 

Remember that performance optimization is an iterative process, and you might need to try different approaches to find the best solution for your specific use case.I hope these suggestions help you improve the performance of your data validation process. If you have any further questions, please feel free to ask.

 

Best Regards,
Gao

Community Support Team

 

If there is any post helps, then please consider Accept it as the solution  to help the other members find it more quickly. If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

How to get your questions answered quickly --  How to provide sample data in the Power BI Forum

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors