The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
I have a large dataset (about 300,000 records), and I have to build a custom column based on one of the columns in the dataset. The custom column is basically an 'IF-ELSE' condition, however I have too many conditions (about 1000ish). Clearly I cannot write an if-else with that many conditions. What are my other options?
An example if clause for my use case with a single condition would look like this - if Text.Contains([Line Description],"ABC") then "AAA" else ""
Solved! Go to Solution.
Even better would be to add some exception handling: Return all replacement values found, (separated by commas) and an empty string if no conditions are found:
= Text.Combine(Table.SelectRows(ReplaceConditions, (current_row) => Text.Contains([Line Description], current_row[SearchText]))[ReplaceText],",")
And even better, better, buffering the ReplaceConditions for performance and I tested it now:
Created 2 tables in Excell. The purple one is your big dataset 😃
Added column with this query:
// BigDataset
let
Source = Excel.CurrentWorkbook(){[Name="BigDataset"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Line", Int64.Type}, {"Line Description", type text}}),
// Load to memory
BufferdConditions = Table.Buffer(ReplaceConditions),
// Add the new column. I named it MatchResults
#"Added Custom" = Table.AddColumn(#"Changed Type", "MatchResult", each Text.Combine(Table.SelectRows(BufferdConditions, (current_row) => Text.Contains([Line Description], current_row[SearchText]))[ReplaceText],","))
in
#"Added Custom"
With this result, loaded back into Excel in 45 seconds on my reasonably fast laptop:
And pleas keep in mind: PowerQuery is case sensitive!
If you don't want the match to be case sensitive use
Text.Contains([Line Description], current_row[SearchText],Comparer.OrdinalIgnoreCase)
Which loads in 60 seconds (so 30% slower)
Even better would be to add some exception handling: Return all replacement values found, (separated by commas) and an empty string if no conditions are found:
= Text.Combine(Table.SelectRows(ReplaceConditions, (current_row) => Text.Contains([Line Description], current_row[SearchText]))[ReplaceText],",")
Oh wow!! This works very well. Just tried. Mannnn I've got tears of happiness rolling down my eyes rn. I do need to straighten up my ReplaceConditions table though. Refine it so that each pattern has only one o/p. I will reach out to you separately if I have any trouble refining the code (if I have to). Thank you so much!
Ha ha, glad to make you so happy and I appreciate you sharing it!
You're welcome!
I would create a separate table named ReplaceConditions with 2 columns SearchText and ReplaceText
Your add column formula would be something like
= Table.SelectRows(ReplaceConditions, (current_row) => Text.Contains([Line Description], current_row[SearchText])[ReplaceText]{0})
Haven't tested it.... If you have trouble implementing it, I can provide more help.
Where does the data come from? Can you add the column at the source?
Is the condition something you can do with a regular expression?