The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi guys,
i am splitting a text column by a text delimiter. Delimiter in my case is the text "CAUTION: ". After the split i retrieve 1000+ columns with Null values in each cell. I would actually only expect two columns since the text "Caution: " only appears once in each cell of the column. Here the example:
Why is the splitting by this text delimiter giving me so many columns with null values?
I would really like to understand why this split is not working as expected.
Thanks!
Solved! Go to Solution.
Are you sure that none of the cells has multiple instances of "CAUTION:" in it? If just one has multiple instances, then you'll get more columns than you're expecting.
I'd recommend splitting it into rows instead of columns to track down where the multiple "CAUTION:" values are coming from. So, instead of
Splitting into this:
You split it like this:
After this step, you can group on the DateTimeReceived to see which ones are responsible for the duplicate values.
It'll be the one(s) with a count higher than most of the rest.
Are you sure that none of the cells has multiple instances of "CAUTION:" in it? If just one has multiple instances, then you'll get more columns than you're expecting.
I'd recommend splitting it into rows instead of columns to track down where the multiple "CAUTION:" values are coming from. So, instead of
Splitting into this:
You split it like this:
After this step, you can group on the DateTimeReceived to see which ones are responsible for the duplicate values.
It'll be the one(s) with a count higher than most of the rest.
This code is beeing created automatically after going via the menu to split columns and eventually selecting this:
= Table.SplitColumn(#"Trimmed Text1", "Body.TextBody", Splitter.SplitTextByDelimiter("CAUTION: ", QuoteStyle.Csv), {"Body.TextBody.1", "Body.TextBody.2", "Body.TextBody.3", "Body.TextBody.4", "Body.TextBody.5", "Body.TextBody.6", "Body.TextBody.7", "Body.TextBody.8", "Body.TextBody.9", "Body.TextBody.10", "Body.TextBody.11", "Body.TextBody.12", "Body.TextBody.13", "Body.TextBody.14", "Body.TextBody.15", "Body.TextBody.16", "Body.TextBody.17", "Body.TextBody.18", "Body.TextBody.19", "Body.TextBody.20", "Body.TextBody.21", "Body.TextBody.22", "Body.TextBody.23", "Body.TextBody.24", "Body.TextBody.25", "Body.TextBody.26", "Body.TextBody.27", "Body.TextBody.28", "Body.TextBody.29", "Body.TextBody.30", "Body.TextBody.31", "Body.TextBody.32", "Body.TextBody.33", "Body.TextBody.34", "Body.TextBody.35", "Body.TextBody.36", "Body.TextBody.37", "Body.TextBody.38", "Body.TextBody.39", "Body.TextBody.40", "Body.TextBody.41", "Body.TextBody.42", "Body.TextBody.43", "Body.TextBody.44", "Body.TextBody.45", "Body.TextBody.46", "Body.TextBody.47", "Body.TextBody.48", "Body.TextBody.49", "Body.TextBody.50", "Body.TextBody.51", "Body.TextBody.52", "Body.TextBody.53", "Body.TextBody.54", "Body.TextBody.55", "Body.TextBody.56", "Body.TextBody.57", "Body.TextBody.58", "Body.TextBody.59", "Body.TextBody.60", "Body.TextBody.61", "Body.TextBody.62", "Body.TextBody.63", "Body.TextBody.64", "Body.TextBody.65", "Body.TextBody.66", "Body.TextBody.67", "Body.TextBody.68", "Body.TextBody.69", "Body.TextBody.70", "Body.TextBody.71", "Body.TextBody.72", "Body.TextBody.73", "Body.TextBody.74", "Body.TextBody.75", "Body.TextBody.76", "Body.TextBody.77", "Body.TextBody.78", "Body.TextBody.79", "Body.TextBody.80", "Body.TextBody.81", "Body.TextBody.82", "Body.TextBody.83", "Body.TextBody.84", "Body.TextBody.85", "Body.TextBody.86", "Body.TextBody.87", "Body.TextBody.88", "Body.TextBody.89", "Body.TextBody.90", "Body.TextBody.91", "Body.TextBody.92", "Body.TextBody.93", "Body.TextBody.94", "Body.TextBody.95", "Body.TextBody.96", "Body.TextBody.97", "Body.TextBody.98", "Body.TextBody.99", "Body.TextBody.100"})
What i don't understand is why in the code automatically all these additional colums are inserted "Body.Textbody.XXX". This is probably why so many columns are created allthough they are all empty.
If this is a matter of only one CAUTION:, then you can select Left-most delimiter. Though this won't work if there are multiple CAUTION: are there.
To see this problem in your case - I'm not able to replicate this. I will need either your pbix file or complete string which you are trying to post here.
Please post the code of this step.