
Reply
Dave1mo1
Helper II

Most Efficient Way to Create Buckets?

Hi,

 

I'd like to bucket order size by units (1-5, 6-10, 11-20, 21-30, 31+) and use it as a slicer or field in report visuals. My fact table contains more than 17 million rows, so I want to do this as efficiently as possible. Is it best to create a conditional column in Power Query during the ETL stage, or to add a column using DAX?

 

Or is there another alternative that I'm missing? Thanks!

15 REPLIES
Ashish_Mathur
Super User

Hi,

You may try using measures to solve this problem.  Please see the attached files.
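The attached files aren't reproduced here, but a measure-based approach typically pairs a disconnected bucket table with a measure that assigns each order's total units to the selected bucket. A hedged sketch (table and column names such as Sales, OrderID, and Units are assumptions):

```dax
-- Disconnected bucket table (no relationship to the fact table)
Buckets =
DATATABLE (
    "Bucket", STRING, "Min", INTEGER, "Max", INTEGER,
    {
        { "1-5", 1, 5 }, { "6-10", 6, 10 }, { "11-20", 11, 20 },
        { "21-30", 21, 30 }, { "31+", 31, 999999 }
    }
)

-- Counts the orders whose total units fall in the bucket(s) in filter context
Orders in Bucket =
SUMX (
    VALUES ( Sales[OrderID] ),
    VAR TotalUnits = CALCULATE ( SUM ( Sales[Units] ) )
    RETURN
        IF (
            COUNTROWS (
                FILTER (
                    Buckets,
                    TotalUnits >= Buckets[Min] && TotalUnits <= Buckets[Max]
                )
            ) > 0,
            1
        )
)
```

Using Buckets[Bucket] as a slicer then filters the measure dynamically, at the cost of iterating the distinct orders at query time.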


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
Thejeswar
Community Champion

@Dave1mo1 ,

You can use a two-table approach:

 

1. The first is your original table, where you want the bucket information.

2. Create a new table that groups Units at the Order level. In my view, this table needs only two columns (Order, Units).

3. In the table created in step 2, add a column that defines the buckets per your requirement, using the conditional column option.

4. Using the Merge option in the Power Query editor, merge the two tables and bring only the Bucket column into the first table.

5. This gives you a table at the order-line grain that also carries the bucket information for the Order.
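The steps above can be sketched in Power Query M as follows (query, table, and column names such as Sales, Order, and Units are assumptions):

```m
let
    Source = Sales,  // your original fact table query

    // Step 2: aggregate units at the order level
    OrderTotals = Table.Group(
        Source, {"Order"},
        {{"Units", each List.Sum([Units]), Int64.Type}}
    ),

    // Step 3: add the bucket column
    WithBucket = Table.AddColumn(OrderTotals, "Bucket", each
        if [Units] <= 5 then "1-5"
        else if [Units] <= 10 then "6-10"
        else if [Units] <= 20 then "11-20"
        else if [Units] <= 30 then "21-30"
        else "31+", type text),

    // Step 4: merge the bucket back onto the fact table
    Merged = Table.NestedJoin(Source, {"Order"}, WithBucket, {"Order"},
                              "BucketTable", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "BucketTable", {"Bucket"})
in
    Expanded
```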

 

With your volume of records, I can't be sure how long it will take, but it is worth a try.

 

That said, I would prefer doing this at a higher level, such as writing it as a SQL statement and pulling in only the output, rather than doing it inside Power BI.
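If the source is a relational database, pushing the same logic upstream might look like this (table and column names such as Sales, OrderID, and Units are assumptions):

```sql
SELECT s.*,
       CASE
           WHEN t.TotalUnits <= 5  THEN '1-5'
           WHEN t.TotalUnits <= 10 THEN '6-10'
           WHEN t.TotalUnits <= 20 THEN '11-20'
           WHEN t.TotalUnits <= 30 THEN '21-30'
           ELSE '31+'
       END AS Bucket
FROM Sales AS s
JOIN (
    SELECT OrderID, SUM(Units) AS TotalUnits
    FROM Sales
    GROUP BY OrderID
) AS t
    ON t.OrderID = s.OrderID;
```

The database engine then does the aggregation and join, and Power BI only loads the finished column.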

 

Regards,

Dave1mo1
Helper II

Bump - Any other ideas? Thanks!

Thank you Tahreem24 

Hi, @Dave1mo1

When a conditional column depends only on other columns in the same table, it is usually better to create it in the ETL phase.
If a calculated column depends on fields from other tables, you can create it as a DAX calculated column.
SQLBI has an article dedicated to this comparison, and I believe it can help you make the right decision:


 

Comparing DAX calculated columns with Power Query computed columns - SQLBI 

 

 

Best Regards

Jianpeng Li

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

Hi Jian,

 

That is helpful - thanks. I am trying to create buckets based on an aggregation. My data arrives at an "order line" granularity, but I'd like to aggregate at the order level to show total units for each order. Then I'd like to group the total units for each order into buckets (1-5, 6-10, etc.).

 

I can easily write a DAX measure that gives me total units for an order, but that result doesn't "live" anywhere for me to bucket. When I tried to aggregate in the ETL stage by duplicating the query and using "Group By" in the second query, it took a very long time (this table has ~17M records, so that's somewhat understandable).

 

I'm not sure how I can both aggregate at a higher granularity than my fact table and bucket that aggregation in an efficient way. Do you have any other ideas?
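One pattern that makes the order-level total "live" somewhere, at the cost of a 17M-row calculated column, is a pair of DAX calculated columns on the fact table (table and column names such as Sales, OrderID, and Units are assumptions):

```dax
-- Order-level total, repeated on every order line
Order Units =
CALCULATE (
    SUM ( Sales[Units] ),
    ALLEXCEPT ( Sales, Sales[OrderID] )
)

-- Bucket label derived from the order-level total
Order Bucket =
SWITCH (
    TRUE (),
    Sales[Order Units] <= 5,  "1-5",
    Sales[Order Units] <= 10, "6-10",
    Sales[Order Units] <= 20, "11-20",
    Sales[Order Units] <= 30, "21-30",
    "31+"
)
```

The low-cardinality Order Bucket column compresses well and can be used directly as a slicer, though the Order Units column itself adds memory overhead.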

Hi, @Dave1mo1 

I wish you all the best. Previously we provided a solution to help you solve the problem. Since we haven't heard back from you yet, I'd like to confirm whether you've successfully resolved this issue or still need further help.
You are welcome to share your workaround and mark it as a solution so that other users can benefit as well. If you find a reply particularly helpful to you, you can also mark it as a solution.
If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.

 

Best Regards

Jianpeng Li

Hi Jianpeng,

 

I'm planning on spending some time today trying the "two table" method to see if that works for me. Thanks!

 

-David

Hi, @Dave1mo1 

I wish you all the best. Previously, Thejeswar provided a solution to help you solve the problem. Since we haven't heard back from you yet, I'd like to confirm whether you've successfully resolved this issue or still need further help.
If yes, you are welcome to share your workaround and mark it as a solution so that other users can benefit as well. If you find a reply particularly helpful to you, you can also mark it as a solution.
If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.

 

 

Best Regards

Jianpeng Li

 

I've not been able to get the preview to load the second table in a reasonable time to test. I think this will add too much time to my refreshes to be worth the outcome, to be honest 😞 I think the solution will work but can't verify it, in other words.

Hi, @Dave1mo1 

Merry Christmas to you. Previously we have provided a solution to help you solve the problem. Since we haven't heard back from you yet, I'd like to confirm if you've successfully resolved this issue or if you need further help?
If yes, you are welcome to share your workaround and mark it as a solution so that other users can benefit as well. If you find a reply particularly helpful to you, you can also mark it as a solution.
If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.

 

 

Best Regards

Jianpeng Li

Hi, @Dave1mo1 

I understand your concerns. This is a challenge when dealing with large datasets.
I think you could try doing it in Power Query and use the Table.Buffer function to cache the table in memory.

Table.Buffer - PowerQuery M | Microsoft Learn
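A minimal sketch of where Table.Buffer might sit in the grouping step (source, table, and column names are assumptions; buffering a 17M-row table is memory-intensive and prevents query folding, so it mainly helps when the same table is re-read several times downstream):

```m
let
    Source = Sql.Database("server", "db"),  // assumed relational source
    // Cache the fact table in memory before it is scanned repeatedly
    Sales = Table.Buffer(Source{[Schema = "dbo", Item = "Sales"]}[Data]),
    OrderTotals = Table.Group(
        Sales, {"OrderID"},
        {{"TotalUnits", each List.Sum([Units]), Int64.Type}}
    )
in
    OrderTotals
```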
Or, if your source is a relational database, you can do this by writing SQL at load time. Of course, the best approach is to do it upstream (in the database), which guarantees the loading speed in Power BI.
I also have a small idea: if you have permission to use Fabric, you can feed your data through a Fabric pipeline into a lakehouse and let the lakehouse perform the transformation. In Power BI, you then only need to read the lakehouse data to create the report.

If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.

 

 

Best Regards

Jianpeng Li


Hi, @Dave1mo1 

Yeah. That's right, and we look forward to your follow-up feedback. It would be great if it was possible to solve your current problem smoothly!

 

Best Regards

Jianpeng Li

Hi @Dave1mo1 

Are you able to create a calculated column? You can use two tools to determine which is faster on your data: the DAX calculated column or the Power Query conditional column:

Power Query: (screenshot of the tool attached)

Calculated Column: (screenshot of the tool attached)

If you still have any questions or need more support, please feel free to let us know. We are more than happy to continue to help you.
Thank you for your patience and look forward to hearing from you.

 

 

Best Regards

Jianpeng Li


 

 

 

 

 

Dave1mo1
Helper II

Hi Tahreem,

 

Thanks so much for the link. I did try using the "bins" feature, but I can't figure out how to change the name of the bin. If I set a bin size of "5" for units, it gives me bin names of "0", "5", "10", etc. I want the bin name to reflect the range: 1-5, 6-10, etc.
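The built-in bins can't be renamed, but a calculated column can generate range labels directly. A hedged sketch assuming a fixed bin size of 5 (table and column names are assumptions):

```dax
-- Maps Units 1-5 to "1-5", 6-10 to "6-10", and so on
Bin Label =
VAR BinStart = FLOOR ( Sales[Units] - 1, 5 ) + 1
RETURN
    BinStart & "-" & ( BinStart + 4 )
```

Note this produces uniform 5-unit bins, matching the built-in bin-size behavior; buckets of mixed widths (11-20, 21-30, 31+) would instead need a SWITCH over explicit thresholds.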

Tahreem24
Super User

@Dave1mo1 The best way to define buckets is with the "Grouping and binning" feature of Power BI. Here is the official documentation for reference:

https://learn.microsoft.com/en-us/power-bi/create-reports/desktop-grouping-and-binning

 

Don't forget to give thumbs up and accept this as a solution if it helped you!!!

