Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.
Hi team,
I have a datset with 1 million rows, which are loaded into PowerBI by ODBC(MongoDB). Below is some sample data.
I want to extract each element in multi-vlaue column "C_DS" into a new table , and then stats the value count and create a pie cart by creating a new measurement.
Here is the two Dax scripts I use:
1. Extract each element
C_DS_label =
VAR SplitByCharacter = ","
VAR Table0 =
SELECTCOLUMNS(
ADDCOLUMNS (
GENERATE (
ROW ( "Text", CONCATENATEX(SampleData,SampleData[C_DS],",") ),
VAR TokenCount =
PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
RETURN
GENERATESERIES ( 1, TokenCount )
),
"Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
),
"Word",[Word])
RETURN
SUMMARIZE(Table0,[Word])
2. stats the value count
Count_C_DS =
var curr = SELECTEDVALUE(C_DS_label[Word])
return
calculate(COUNTROWS('SampleData'), FILTER(ALL('SampleData'[C_DS]), PATHCONTAINS(SUBSTITUTE ( SampleData[C_DS], ",", "|" ) , curr) ))
But I found that the first DAX script(extract each element) is running very very ....very slow. It has run for 3 hours for 1 million records and hasn't been ended yet.
can any expert help take a look and advice how to improve the DAX script?
Thanks,
Cherie
Solved! Go to Solution.
Great @Anonymous
Please use
C_DS_label =
DISTINCT (
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords =
PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries =
GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
)
Hi @Anonymous
Please try
C_DS_label =
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords = PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries = GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
Hi @tamerj1 ,
Thank you for the advice. I tried, and it's very fast to extract the values. But the extracted values looks like duplicated, and not unique . What I expect is to extract the disctinct unique values from the C_DS column.
Here is the actual result in powerBI Desktop.
Expected Result I want:
Word |
Profile_Criteria |
Keyword |
Faceted_Criteria |
Backgroud_Criteria |
Rating_Criteria |
Great @Anonymous
Please use
C_DS_label =
DISTINCT (
SELECTCOLUMNS (
GENERATE (
FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
VAR NumofWords =
PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
VAR NumSeries =
GENERATESERIES ( 1, NumofWords )
RETURN
ADDCOLUMNS (
NumSeries,
"@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
)
),
"Word", [@Word]
)
)
User | Count |
---|---|
98 | |
76 | |
74 | |
49 | |
26 |