Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
kirvis
Helper I
Helper I

Create bins for subcategories based on MIN and MAX values

Hi all,

 

I am working on a project for which I would like to create bins based on the MAX and MIN of a value within a subset.

 

The dataset has 3 columns:

 

  1. Cell ID
  2. Gene ID
  3. Value

I want to create 10 bins for each Gene ID by doing (MAX(Value) - MIN(Value))/10 and then assign every row in my table the result.

 

Since my datasets are (very) large with millions of rows, I want to do this in Power Query and not in DAX. Once the bin is set, it will not change anymore.

 

Any ideas on how I can do this?

 

An example file with data can be found here.

 

Thanks!

 

1 ACCEPTED SOLUTION

Hi @kirvis 

you have to turn the query into a function and apply it on the grouped data.

Please see file attached.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

5 REPLIES 5
ImkeF
Super User
Super User

Hi @kirvis 

please check this query:

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Expression_FACT"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"GeneID", Int64.Type}, {"Cell ID", Int64.Type}, {"Value", type number}}),
    ListOfValues = List.Buffer(#"Changed Type"[Value]),
    Min = List.Min(ListOfValues),
    Max = List.Max(ListOfValues),
    Increment = (Max - Min) / 10,
    Buckets = List.Transform({1..10}, each [Index = _, Value =  Min +  (_ * Increment) ] ),
    #"Converted to Table" = Table.FromList(Buckets, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"Index", "Value"}, {"Index", "Value"}),
    Custom1 = #"Expanded Column1" & #"Changed Type",
    #"Sorted Rows" = Table.Buffer(Table.Sort(Custom1,{{"Value", Order.Descending}, {"Index", Order.Descending}})),
    #"Filled Down" = Table.FillDown(#"Sorted Rows",{"Index"}),
    #"Filtered Rows" = Table.SelectRows(#"Filled Down", each ([GeneID] <> null))
in
    #"Filtered Rows"

 

 

This is not the fastest operation one can run...

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi Imke,

 

Thanks for your response. The query seems to work like a charm (and is quite fast as well). 

 

However, I realize that I might not have made myself completely clear in my original post: I would like to do exactly this, but then for each GeneID separately. In your query, you determine the MIN and MAX of all values combined, so of all GeneIDs together.

 

Is there an easy way to do exactly what you have done, but then grouped per GeneID?

 

Thanks,

 

Kirvis 

Hi @kirvis 

Sure that's possible - will send solution later.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi @kirvis 

you have to turn the query into a function and apply it on the grouped data.

Please see file attached.

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hi Imke,

 

Excellent, thanks! This is indeed exactly what I was looking for!

 

Best regards,

 

Kirvis

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

AugPowerBI_Carousel

Power BI Monthly Update - August 2024

Check out the August 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors