The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Dear all,
While working on an exhaustive database, I had to recognize that the data I had been provided with, are of worse quality than expected.
Actually I do have a column “DN” (formatted as text) that contains data looking as follows:
DN 1200 / 48" | DN 1200 |
DN 1200 / 48" ETFE | DN 1200 |
DN 125 / 5" | DN 125 |
DN 1400 / 56" | DN 1400 |
DN 150 / 6" | DN 150 |
DN 150 / 6" | DN 150 |
DN 1600 / 64" | DN 1600 |
DN 1800 / 72" | DN 1800 |
DN 2.5 / 1/10" | DN 2,5 |
DN 2.5 / 1/10" | DN 2,5 |
The complete list currently consists of 102 different wordings (left column) that could be reduced to 31 (right column). I just checked this with Excel.
As you can see, sometimes there are also double/ triple spaces. I learned that these could be removed with “Trim”!?
However, my request is more complicated.
What I can recognize is that following actions are required (exceeding my skills by far):
Is there anybody out there to help with a custom column or something similar that I can implement? I am lost. 😥
= Table.AddColumn(Source,"extract", each Text.Combine(List.FirstN(Splitter.SplitTextByWhitespace()([txt]),2)," "))
See if my code works
@Orstenpowers here's what i would do:
If this answers your question, kindly mark as solution.
Try the following:
1) Select column DN
2) In 'Add Column' tab, press 'Column From Examples'
3) Start typing DN 1200 in the first row, it should suggest Text Before Delimiter transformation, if not move to second row and type it again
4) Press OK, it should generate following formula (#"Changed Type" is previous step name, most likely will be different in your case):
= Table.AddColumn(#"Changed Type", "Text Before Delimiter", each Text.BeforeDelimiter([DN], " ", 1), type text)
5) Select the new column, go to Transform tab and use Replace Values to convert '.' to ','
Dear Stachu,
Thanks for your reply!
It looks quite good, but does your proposal consider the double/ triple spaces that the original column could consist of?
Looking forward to hearing from you I remain
In the Transform tab, under Format there is Trim you can apply to columns - does it work for you as expected?
I also have the impression that "Trim" does not work, although column is formatted as text...
I got this by following the steps I mentioned earlier.
Hi Marius,
Unfortunately this does not work out, because separating by delimiter creats more than twenty additional columns in my file...😔
I see...
Would it be possible for you to provide a larger sample for testing?
Dear Stachu,
I just recognize that
= Table.AddColumn(#"Renamed Columns8", "Text Between Delimiters", each Text.BeforeDelimiter([DN], " ", 1), type text)
did not work correctly. 😐
Sometimes the information is cut as intended, but often it is not, e.g. you can see with line "DN 200..."
Do you have any idea?
try this instead:
Table.AddColumn(#"Renamed Columns8", "Text Before Delimiter", each Text.BeforeDelimiter([DN], "/", 0), type text)
and then Trim the column. Please note that Trim in PowerQuery only removes spaces at the beginning and the end of the text, not in the middle (like an Excel formula does). So ' DN 2000 ' will become 'DN 2000' with double space in the middle
https://docs.microsoft.com/en-us/powerquery-m/text-trim
Hi Stachu,
Thank you!
I will try it.
For the spaces in the middle, is it possible to create a looped "If"-formula, stating that double spaces should be replaced by single ones as long as only single ones left? Combined this with "Trim" should remove all unnecessary spaces, shouldn't it?
you can create custom Trim function with recursion, but performance may be an issue here:
(Text as text) as text =>
if Text.PositionOf(Text," ")=-1
then Text
else CustomTrim(Text.Replace(Text," "," "))
this one needs to be named CustomTrim to work
EDIT this code might have better performance, both remove all double spaces in the string - in the beginning, middle, and at the end of the string
(Text as text) as text =>
let
ListGenerate = List.Generate(
() => Text,
each Text.PositionOf(_," ")>-1,
each Text.Replace(_," "," "),
each Text.Replace(_," "," ")
),
Result = if Text.PositionOf(Text," ")=-1 then Text else List.Last(ListGenerate)
in
Result
Dear Stachu,
Thank you for your help!
Where do I need to enter your proposed custom trim function??? Custom column in Power Query Editor?
@Orstenpowers you need to create a new, blank query and paste the code I posted in the Advanced Editor, replacing existing code.
After that you can reference the function anywhere you want, could be as custom column, could be invoked separately etc.
Hi Stachu,
Where can I enter this custom Trim function? @ "Conditional Column"?
Sorry for my many questions...
At the moment the result is as follows:
This is good, but you can see that - I guess due to text formatting - the info is not listed ascending from 2.5...2000. Any idea how to change this?
the values are text so it sorts alphabetically, you need to create a new column that would have the sort you need
https://docs.microsoft.com/en-us/power-bi/desktop-sort-by-column
as for the trim you can try:
Table.AddColumn(#"Renamed Columns8", "Text Before Delimiter", each CustomTrim(Text.BeforeDelimiter([DN], "/", 0), type text))
Hi Stachu,
I just split the column indicating e.g. "DN 200" into two separate columns "DN" and "200". The latter one I formatted as decimal number. Is it possible to change this like in Excel where you can define customized format such as "DN"0, which would result in illustrating DN 200, but still being a value?
Where can I enter this trim function???
You could change data type in Query Editor or in Data model.
https://community.powerbi.com/t5/Desktop/how-to-change-column-into-mixed-data-type/td-p/575728
Best Regards
Maggie
Community Support Team _ Maggie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Maybe not the most sexy solution (or at all), but you could try the following:
1) Split the column at each occurence of SPACE
2) Replace .(dot) with ,(comma) in the column that end up containing for example 2.5
3) Create a Custom Column with the following M expression --> [DN]&" "&[2,5] --> DN and 2,5 must be replaced with the name of the column containing these values
This is based on the assumption that your uotput should be like the column to the right, and the format before / is the same for all rows.