Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.

Reply
Anonymous
Not applicable

DAX script is running slow for large dataset scenario.(1 million records)

Hi team, 

 

I have a datset with 1 million rows, which are loaded into PowerBI  by ODBC(MongoDB).  Below is some sample data.

 

I want to extract each element in multi-vlaue column "C_DS" into a new table , and then stats the value count and create a pie cart by creating a new measurement.

Here is the two Dax scripts I use:

1. Extract each element

 

 

 

C_DS_label = 

VAR SplitByCharacter = ","
VAR Table0 =
    SELECTCOLUMNS(
    ADDCOLUMNS (
        GENERATE (
            ROW ( "Text", CONCATENATEX(SampleData,SampleData[C_DS],",") ),
            VAR TokenCount =
                PATHLENGTH ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ) )
            RETURN
                GENERATESERIES ( 1, TokenCount )
        ),
        "Word", PATHITEM ( SUBSTITUTE ( [Text], SplitByCharacter, "|" ), [Value] )
    ),
    "Word",[Word])
RETURN
    SUMMARIZE(Table0,[Word])

 

 

 

 

2. stats the value count

 

 

 

Count_C_DS =
var curr = SELECTEDVALUE(C_DS_label[Word])
return
calculate(COUNTROWS('SampleData'), FILTER(ALL('SampleData'[C_DS]), PATHCONTAINS(SUBSTITUTE ( SampleData[C_DS], ",", "|" ) , curr) ))

 

 

 

 

But I found that  the first DAX script(extract each element) is running very very ....very slow. It has run for 3 hours for 1 million records and hasn't been ended yet.

 

can any expert help take a look and advice how to improve the DAX script?

 

Thanks,

Cherie

1 ACCEPTED SOLUTION

Great @Anonymous 
Please use

C_DS_label =
DISTINCT (
    SELECTCOLUMNS (
        GENERATE (
            FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
            VAR NumofWords =
                PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
            VAR NumSeries =
                GENERATESERIES ( 1, NumofWords )
            RETURN
                ADDCOLUMNS (
                    NumSeries,
                    "@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
                )
        ),
        "Word", [@Word]
    )
)

View solution in original post

4 REPLIES 4
tamerj1
Super User
Super User

 Hi @Anonymous 
Please try 

C_DS_label =  
SELECTCOLUMNS (
    GENERATE (
        FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
        VAR NumofWords = PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
        VAR NumSeries = GENERATESERIES ( 1, NumofWords )
        RETURN
            ADDCOLUMNS ( 
                NumSeries,
                "@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
            )
    ),
    "Word", [@Word]
)

1.png

Anonymous
Not applicable

Hi @tamerj1 ,

 

Thank you for the advice. I tried, and it's very fast to extract the values. But the extracted values looks like duplicated, and not unique . What I expect is to extract the disctinct unique values from the C_DS column.

Here is the actual result in powerBI Desktop.

cheriemilk_0-1653381259530.png

 

Expected Result I want:

Word
Profile_Criteria
Keyword
Faceted_Criteria
Backgroud_Criteria
Rating_Criteria

 

Great @Anonymous 
Please use

C_DS_label =
DISTINCT (
    SELECTCOLUMNS (
        GENERATE (
            FILTER ( VALUES ( SampleData[C_DS] ), SampleData[C_DS] <> BLANK () ),
            VAR NumofWords =
                PATHLENGTH ( SUBSTITUTE ( [C_DS], ",", "|" ) )
            VAR NumSeries =
                GENERATESERIES ( 1, NumofWords )
            RETURN
                ADDCOLUMNS (
                    NumSeries,
                    "@Word", PATHITEM ( SUBSTITUTE ( [C_DS], ",", "|" ), [Value] )
                )
        ),
        "Word", [@Word]
    )
)
Anonymous
Not applicable

@tamerj1  thanks, works perfect.

Helpful resources

Announcements
September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Kudoed Authors