Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Irwin
Helper IV
Helper IV

Easiest way to treat/clean data for micro biology containing "<" sign?

Hi,


I have a large dataset with many columns containing micro biological data.

Many of these columns will have a value of <1000 or <100. For example. Enterobaceriacea <10.

 

Powerbi cannot handle a less than value when the other values are whole numbers. So I have to split my columns into:

Column A, Column B

 

Column A would then be "attribute": <

Column B would be the whole number.


This way I can later create a measures that uses the value and checks column A to be blank/not blank.

 

However, I find it quite a big amount of work to go through 20+ columns in power query and split them.

The method I use now are as follows:

 

= Table.AddColumn(#"Removed Columns", "Custom", each if Text.Contains([Enterobacteriacea], "<" ) then "<" else [Enterobacteriacea])

 

And then 

 

= Table.AddColumn(#"Changed Type", "Custom", each Text.Select([TPC], {"0".."9"}))

 

 

After this I have to remove the original column, rename the new column and change the type to whole number.

Doing this x 20 seems stupid to me. Is there a better solution to split all columns that has "<" into Attribute columns and whole number coloumns?

Thank you for any assistance. 🙂

 

1 ACCEPTED SOLUTION

Hi @Irwin,

Here is a custom M query function to add placeholder into the records if it does not include "<" character, you can use it to add whitespace to number values and them you can do ‘split column’ by number of characters.

#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Function.Invoke((x)=> Record.FromList( List.Transform(Record.ToList(x), each if Text.Contains(_,"<") then _ else " "&_),Record.FieldNames(_)),{_}))

Full query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wiik1MDBONjQwMFDSgXNwMGN1qKeediYPlPoBCpNBGhoo6gePS4zo7BLjwRk7IJXUCL6BiULa5sJR9WSojwUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TPC = _t, #"Entero-bacteriaceae" = _t, #"Bacillus cereus" = _t, #"Yeast/Mould" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"TPC", type text}, {"Entero-bacteriaceae", type text}, {"Bacillus cereus", type text}, {"Yeast/Mould", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Function.Invoke((x)=> Record.FromList( List.Transform(Record.ToList(x), each if Text.Contains(_,"<") then _ else " "&_),Record.FieldNames(_)),{_})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"}),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Removed Columns", "Custom", {"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"}, {"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"})
in
    #"Expanded Custom"

1.png
Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

4 REPLIES 4
v-shex-msft
Community Support
Community Support

Hi @Irwin,

Can you please share a pbix or some dummy data that keep the raw data structure with expected results? It should help us clarify your scenario and test to coding formula.

How to Get Your Question Answered Quickly  

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Sure here is some example data. Imagine this with 20 ish columns.

 

As is:

TPCEntero-bacteriaceaeBacillus cereusYeast/
Mould
<1000<10<10<10
<1000<10<10<10
<1000<1010<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
1000<10<10<10
<1000<1010<10
<1000<10<1010
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<100010<10<10
<1000<10<10<10
<1000<10<10<10
<100020<10<10
<1000<10<10<10
<1000<10<1030
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
10000<10<10<10
<100020<1010
<1000<10<10<10
<1000<10<10<10
<1000200<10<10
1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10

 

To be

 

TPC AttributeTPCEB AttributeEntero-bacteriaceaeBC AttributeBacillus cereusY/M AttributeYeast/Mould
<1000<10<10<10
<1000<10<10<10
<1000<10 10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
 1000<10<10<10
<1000<10 10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000 10<10<10
<1000<10<10<10
<1000<10<10<10
<1000 20<10<10
<1000<10<10<10
<1000<10<10 30
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
 10000<10<10<10
<1000 20<10 10
<1000<10<10<10
<1000<10<10<10
<1000 200<10<10
 1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10
<1000<10<10<10

 

Hi @Irwin,

Here is a custom M query function to add placeholder into the records if it does not include "<" character, you can use it to add whitespace to number values and them you can do ‘split column’ by number of characters.

#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Function.Invoke((x)=> Record.FromList( List.Transform(Record.ToList(x), each if Text.Contains(_,"<") then _ else " "&_),Record.FieldNames(_)),{_}))

Full query:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wiik1MDBONjQwMFDSgXNwMGN1qKeediYPlPoBCpNBGhoo6gePS4zo7BLjwRk7IJXUCL6BiULa5sJR9WSojwUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TPC = _t, #"Entero-bacteriaceae" = _t, #"Bacillus cereus" = _t, #"Yeast/Mould" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"TPC", type text}, {"Entero-bacteriaceae", type text}, {"Bacillus cereus", type text}, {"Yeast/Mould", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each Function.Invoke((x)=> Record.FromList( List.Transform(Record.ToList(x), each if Text.Contains(_,"<") then _ else " "&_),Record.FieldNames(_)),{_})),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"}),
    #"Expanded Custom" = Table.ExpandRecordColumn(#"Removed Columns", "Custom", {"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"}, {"TPC", "Entero-bacteriaceae", "Bacillus cereus", "Yeast/Mould"})
in
    #"Expanded Custom"

1.png
Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Excellent. I will  try it out when I get time on work. Thank you. 🙂

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.