reinaldogani
New Member

Text Analytics Sentiment API in R - Twitter

I am using R to stream tweets from Twitter.
After some cleaning of the tweets (removing links, duplicates, etc.), I convert them to serialized JSON.

I am trying to call the Text Analytics (sentiment) API, but it always returns 400 (Bad Request).
When I limit the number of tweets to a very small amount (10 tweets), it works.

Any suggestions on how to solve this problem?

 

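library(twitteR)   # searchTwitter
library(httr)      # POST, add_headers
library(jsonlite)  # toJSON (assumed: jsonlite, which writes a data frame as an array of row objects)

GalaxyS8 <- searchTwitter("Galaxy S8", n=10000, lang='en')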

 

GalaxyS8_tweets_df = do.call("rbind", lapply(GalaxyS8, as.data.frame))
GalaxyS8_tweets_df = subset(GalaxyS8_tweets_df, select = c(text))

 

textScrubber <- function(dataframe)
{
  # Replace em-dashes and ampersands with a space
  dataframe$text <- gsub("—", " ", dataframe$text)
  dataframe$text <- gsub("&", " ", dataframe$text)
  # Remove RT/via prefixes, @mentions and links before stripping punctuation,
  # because these patterns need "@", ":" and "/" to still be present
  dataframe$text <- gsub("(RT|via)((?:\\b\\W*@\\w+)+)", " ", dataframe$text)
  dataframe$text <- gsub("@\\w+", "", dataframe$text)
  dataframe$text <- gsub("http[^[:space:]]*", "", dataframe$text)
  dataframe$text <- gsub("[[:punct:]]", "", dataframe$text)
  # Collapse runs of spaces/tabs to a single space, then trim both ends
  dataframe$text <- gsub("[ \t]{2,}", " ", dataframe$text)
  dataframe$text <- gsub("^\\s+|\\s+$", "", dataframe$text)
  # Drop rows whose cleaned text duplicates an earlier row
  dataframe <- subset(dataframe, !duplicated(dataframe$text))

  return(dataframe)
}

GalaxyS8_tweets_df <- textScrubber(GalaxyS8_tweets_df)

 

GalaxyS8_tweets_df["language"] = "en"
GalaxyS8_tweets_df["id"] = seq.int(nrow(GalaxyS8_tweets_df))
request_body_GalaxyS8 = GalaxyS8_tweets_df[c(2,3,1)]

 

request_body_json_GalaxyS8 = toJSON(list(documents = request_body_GalaxyS8))

 

result_GalaxyS8 <- POST("https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/sentiment",
body = request_body_json_GalaxyS8,
add_headers(.headers = c('Content-Type'='application/json','Ocp-Apim-Subscription-Key'='your-api-key')))

 

2 REPLIES
BeardyGeorge
Advocate I

Not specifically related to your problem, but you can use | to separate multiple patterns in gsub, like so:

dataframe$text <- gsub("—|&", " ", dataframe$text)

which might make your code a little more concise - there's some great documentation on the grep function and how R handles regular expressions available online :).
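
As a quick check that the combined pattern behaves like the two separate calls, here is a minimal sketch. The sample string is made up for illustration, and "\u2014" is simply the escaped form of the em-dash:

```r
# Made-up sample text containing both characters to be replaced
tweet <- "Galaxy S8 \u2014 fast & light"

# Two separate substitutions, as in the original textScrubber
two_calls <- gsub("&", " ", gsub("\u2014", " ", tweet))

# One substitution using | to match either character
one_call <- gsub("\u2014|&", " ", tweet)

# TRUE if both approaches produce the same string
same_result <- identical(two_calls, one_call)
print(same_result)
```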

v-sihou-msft
Microsoft Employee

@reinaldogani

 

As mentioned in the documentation, the Text Analytics Sentiment API limits the size of the input JSON, so a request built from thousands of tweets can be rejected while a 10-tweet request succeeds:

 

The maximum size of a single document that can be submitted is 10 KB, and the total maximum size of submitted input is 1 MB. No more than 1,000 documents may be submitted in one call. Rate limiting exists at a rate of 100 calls per minute - we therefore recommend that you submit large quantities of documents in a single call. 
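
One way to stay within the limits quoted above is to split the documents into batches of at most 1,000 rows and send one request per batch. The following is only a sketch under assumptions: `split_into_batches` is a hypothetical helper, and the `documents` data frame is stand-in data with the same columns as `request_body_GalaxyS8`.

```r
# Limit taken from the documentation quoted above
max_docs_per_call <- 1000

# Hypothetical helper: split a documents data frame into a list of
# data frames with at most `batch_size` rows each
split_into_batches <- function(documents, batch_size = max_docs_per_call) {
  if (nrow(documents) == 0) return(list())
  # Rows 1..batch_size get index 1, the next batch_size rows get 2, and so on
  batch_index <- ceiling(seq_len(nrow(documents)) / batch_size)
  unname(split(documents, batch_index))
}

# Stand-in data shaped like request_body_GalaxyS8 (language, id, text)
documents <- data.frame(
  language = "en",
  id = seq_len(2500),
  text = paste("sample tweet", seq_len(2500)),
  stringsAsFactors = FALSE
)

batches <- split_into_batches(documents)

# Number of rows in each batch
batch_sizes <- sapply(batches, nrow)
print(batch_sizes)
```

Each batch can then be serialized with toJSON(list(documents = batch)) and sent in its own POST call, exactly as in the original code. With 10,000 tweets this comes to about 10 calls, well under the quoted rate limit of 100 calls per minute. If tweets are long, it may also be worth checking that each serialized batch stays under the 1 MB total.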

 

Regards,

 

 
