…
Part1: Randomly select a few rows of records from the entire table
In Power BI, there is a DAX function named SAMPLE. It will return a sample of N rows from the specified table. And if OrderBy arguments are pro...
Here are examples:
EX1: Return a calculated table in which 3 samples are randomly selected from Table1.
Table1_Sample_1 =
SAMPLE (
3, /* The number of rows to return. */
Table1, /* Any DAX expression that returns a table of data from where to extract the 'n' sample rows. */
1 /* Since the minimum argument count for SAMPLE function is 3, put any scalar DAX expression. You can also specify a column. */
)
EX2: Return a calculated table in which 3 samples are randomly selected from Table1 and sorted by one specific column.
Table1_Sample_2 =
SAMPLE (
3,
Table1,
[Metrics] /* Sort by [Value] column. */
)
In addition, from my limited testing, I find that SAMPLE function will not update the result returned after refresh if data has not been updated. Here is the test result:
Then how do we get different results after each refresh?
Let’s consider using RAND function. It works the same as in Excel, returning a random number greater than or equal to 0 a...
There are two methods.
a. Combined with SUMMARIZE, TOPN and ADDCOLUMNS functions to create a calculated table.
Table1_RAND =
SUMMARIZE (
TOPN ( 3, ADDCOLUMNS ( Table1, "Rand", RAND () ), [Rand] ),
[Product],
[Product ID],
[Production Date],
[Metrics]
)
b. Create a calculated column in the original table and just return the filtered values in Table visual.
Note: To avoid affecting the test results above, we copy the Table1 below, generate Table2, and then create a new computed column in Table2.
Rand = RAND()
We can see that after each refresh, the returned rows are different.
Now, we know how to get sample rows from one table. Then, how do we get the same number of rows for each category? Let’s go to the next part.
Part2: Randomly select records with the same number of rows by category from the entire table
a. Based on my test, class-by-class sampling cannot be achieved by SAMPLE function in this scenario. For example:
Table1_Sample_3 =
VAR ProductNum_ =
DISTINCTCOUNT ( Table1[Product] )
RETURN
SAMPLE ( ProductNum_ * 2, Table1, [Product] )
b. Then we use RANK function to create measures based on the Rand column in Part1 and then use “Filters on this visual” feature.Rank_Table2 =
RANKX (
ALLEXCEPT ( Table2, Table2[Category] ),
CALCULATE ( SUM ( Table2[Rand] ) ),
,
DESC,
DENSE
)
So far, we have implemented sampling by category. Then, if some categories have fewer records and some categories have more records. Then how do we sample it? Let’s go to Part3.
Part3: Randomly select records with different rows by category from the entire table
It is, in fact, very simple that we only need to combine the SWITCH function and add a judgment condition and then use “Filters on this visual” feature. For exam...
RankFilter_Table2 = VAR Count_ = CALCULATE ( COUNTROWS ( Table2 ), ALLEXCEPT ( Table2, Table2[Product] ) ) VAR Threshold_ = SWITCH ( TRUE (), Count_ >= 8, 4, Count_ >= 6, 3, Count_ >= 4, 2, 1 ) RETURN IF ( [Rank_Table2] <= Threshold_, 1 )
If you want to get more details, please check the attached .pbix file.
Hope this blog can help you.
Author: Icey Zhang
Reviewer: Kerry Wang & Ula Huang
|