Solved: Re: Replace or remove values that are not in speci...

SamiS · ‎02-28-2022

I am working with a bad quality dataset that has many annoying errors. Please help me with one that I haven't solved yet.

There is two columns: SERIAL-NUMBER and DESCRIPTION

- Serialnumber should be in format ####-#### for example 1230-0123.

- Description is just a text

The problem:

There is several rows where description is written into serial-number column. Serial-number may be mentioned in the middle of that text. I want to find and keep only the serial number and remove all extra text. Serial-number to keep can be recognized by its' format.

Question:

How to only keep value that is in format ####-#### and remove all other text from the serial-number column?

SamiS · ‎03-01-2022

Thanks to all who answered!

In fact, I took a part from everybody's answer to build my own solution.

I created a new column where...

1...all acharacters but digits and used delimeters are removed.

2...text is splitted by any defined delimeter.

3...text is combined back with delimeter "-", but only if format is correct (4 digits + 4 digits)

View solution in original post

SamiS · ‎03-01-2022

Thanks to all who answered!

In fact, I took a part from everybody's answer to build my own solution.

I created a new column where...

1...all acharacters but digits and used delimeters are removed.

2...text is splitted by any defined delimeter.

3...text is combined back with delimeter "-", but only if format is correct (4 digits + 4 digits)

v-yingjl · ‎03-02-2022

Hi @SamiS ,

Glad to hear that it can be solved by yourself with others' ideas and thanks for your sharing your workaround.

You can accept your reply as solution, that way, other community members could easily find the answer when they get same issues.

Best Regards,
Community Support Team _ Yingjie Li

serpiva64 · ‎02-28-2022

Hi,

if you have only that hyphen in your text, you can easily achieve this:

You have to:

- extract text before delimiter -

- extract last 4 characters

- extract text after delimiter -

- extract first 4 characters

- merge the two column

- remove columns not necessary

That's done

If this post is useful to help you to solve your issue consider giving the post a thumbs up

and accepting it as a solution !

wdx223_Daniel · ‎02-28-2022

NewStep=let lst={"0".."9","#","-"},fx=(x,y,z)=>Splitter.SplitTextByCharacterTransition(x,y,z) in Table.TransformColulmns(PreviousStepName,{"Serial Number",each List.RemoveNulls(List.TransformMany(fx(lst,each not List.Contains(lst,_),_),each fx((x)=>not List.Contains(lst,x),lst,_),(x,y)=>let a=Text.SplitAny(y,"-#") in if List.Count(a)=2 and Text.Length(a{0})=4 and Text.Length(a{1})=4 then Text.Combine(a,"-") else null)){0}?})

SamiS · ‎03-01-2022

I didn't get this to work. After fixing typos teh first error was something like "there is 3 arguments for a function that excepts 2". I fixed it and then there was an other one "cannot convert a value of type Function to type List".

However, I got some ideas how to approach this from your answer. Thank you!

BeaBF · ‎02-28-2022

@SamiS can you paste some data and an example of the expected result?

BF

SamiS · ‎02-28-2022

Hi,

Here is an example of existing data:

And here is how it should look like:

Those are shop floor quality notes from manufacturing operators. Unfortunately system hasn't forced operators to fill fields correctly.

I already have filled empty description fields by adding custom column:

= Table.AddColumn(#"Renamed Columns2", "NewDescription", each if [Description] = null then [Serial-Number] else [Description])

Next I want to remove extra strings from Serial-Number and keep actual serial number only.

mahoneypat · ‎02-28-2022

If there are not numbers or hyphens in the other text, you can use this expression in your custom column.

= Text.Select([Description], {"-"} & {"0".."9"})

Pat

Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.

@mahoneypa HoosierBI on YouTube

Replace or remove values that are not in specified format. (Find and keep values)

Helpful resources

Microsoft Fabric Learn Together

Power BI Monthly Update - April 2024

Fabric Community Update - April 2024