Sentiment Analysis in R with the syuzhet package

Sentiment analysis is the process of determining whether a piece of writing or a set of texts is positive, negative or neutral. Here, we’ll work with the package “syuzhet”.
Suppose there are long emails that we would like to check. We’ll read the emails from the database.

Read emails into syuzhet
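library('DBI')      # provides dbGetQuery()
library('syuzhet')
# `database` is assumed to be an open DBI connection; dbGetQuery() returns a data frame
Emails <- dbGetQuery(database, "SELECT * FROM Emails")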
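
The snippet above assumes that `database` is an already open connection. As a minimal sketch of how such a connection might be set up, the following uses an in-memory SQLite database through the DBI and RSQLite packages. The SQLite backend, the `Id` column and the sample texts are assumptions made for illustration; a real setup would connect to its own database instead.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the real one.
database <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical sample data; only the RawText column is used later on.
sample_emails <- data.frame(
  Id = 1:2,
  RawText = c("I am happy with the results.", "This delay is terrible."),
  stringsAsFactors = FALSE
)
dbWriteTable(database, "Emails", sample_emails)

# Same query as in the post; the result is already a data frame.
Emails <- dbGetQuery(database, "SELECT * FROM Emails")
dbDisconnect(database)

str(Emails)
```

Closing the connection with dbDisconnect() once the data is in memory keeps the rest of the analysis independent of the database.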

The get_nrc_sentiment function in “syuzhet” uses the NRC Emotion Lexicon.

The NRC emotion lexicon is a list of words and their associations with eight emotions (trust, surprise, anger, fear, anticipation, sadness, joy, and disgust) and two sentiments (negative and positive).
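
To make the idea of a word-emotion lexicon concrete, here is a small base R sketch. The three-word `toy_lexicon` and the helper `count_emotions()` are hypothetical and written only for this illustration; they are not the real NRC data and not how syuzhet is implemented internally.

```r
# Hypothetical three-word lexicon, NOT the real NRC data.
toy_lexicon <- data.frame(
  word    = c("happy", "happy", "terrible", "terrible", "terrible", "surprise"),
  emotion = c("joy", "positive", "fear", "sadness", "negative", "surprise"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count how often each emotion is triggered by a text.
count_emotions <- function(text, lexicon) {
  # Lower-case the text and split it into words on non-letter characters.
  words <- strsplit(tolower(text), "[^a-z]+")[[1]]
  # Join every word occurrence with its lexicon entries (one row per match).
  matched <- merge(
    data.frame(word = words, stringsAsFactors = FALSE),
    lexicon,
    by = "word"
  )
  # Tabulate the matches, keeping emotions with zero hits as well.
  table(factor(matched$emotion, levels = unique(lexicon$emotion)))
}

counts <- count_emotions("A happy surprise after a terrible, terrible week", toy_lexicon)
print(counts)
```

A word can be linked to several emotions at once, so a single occurrence of "terrible" adds one to fear, sadness and negative in this toy setup.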

The get_nrc_sentiment function returns a data frame in which each row represents one element of the input text vector (here, one email). The columns include one for each emotion type as well as the positive and negative sentiment valence. Each cell counts the words in that text that the lexicon associates with the emotion or sentiment, so we can take a body of text and see which emotions it expresses and how positive or negative it is.
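
As a quick way to inspect that structure, the sketch below calls get_nrc_sentiment on two made-up texts. The texts are assumptions chosen for illustration, and the actual counts depend on the lexicon version shipped with the installed package.

```r
library(syuzhet)

# Made-up example texts; each element becomes one row of the result.
texts <- c(
  "I am delighted with the wonderful service.",
  "The delay was awful and I am angry."
)
scores <- get_nrc_sentiment(texts)

dim(scores)    # one row per text, ten columns
names(scores)  # eight emotions plus negative and positive
print(scores)
```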

Do sentiment analysis of the email
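# One row per email, one column per emotion or sentiment
d <- get_nrc_sentiment(as.character(Emails$RawText))
# Transpose so that each row is an emotion and each column is an email
td <- data.frame(t(d))

# rowSums() adds up each row, giving the total for each emotion across all emails
td_new <- data.frame(rowSums(td))

# Transformation and cleaning
names(td_new)[1] <- "count"
td_new <- cbind("sentiment" = rownames(td_new), td_new)
rownames(td_new) <- NULL
# Keep the eight emotions; rows 9 and 10 are the negative and positive sentiments
td_new2 <- td_new[1:8, ]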
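
The transpose-and-sum step above can also be written without transposing. The following sketch uses a small made-up data frame `d` in place of the real get_nrc_sentiment output (the values and the reduced set of columns are assumptions) and checks that summing the columns directly gives the same totals.

```r
# Made-up scores for three emails; the columns mimic part of the real output.
d <- data.frame(
  anger    = c(1, 0, 2),
  joy      = c(0, 3, 1),
  negative = c(1, 0, 2),
  positive = c(0, 3, 1)
)

# Route used in the post: transpose, then sum each row.
via_transpose <- rowSums(data.frame(t(d)))

# Shorter route: sum each column directly.
totals <- colSums(d)

# Both routes give the same named totals.
stopifnot(isTRUE(all.equal(via_transpose, totals)))

# Build the two-column data frame and drop the sentiment rows by name.
emotion_totals <- data.frame(sentiment = names(totals), count = unname(totals))
emotion_totals <- emotion_totals[
  !emotion_totals$sentiment %in% c("negative", "positive"),
]
print(emotion_totals)
```

Dropping the sentiment rows by name rather than by position keeps the result correct even if the column order of the input changes.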

Now we’ll use “ggplot2” to create a bar graph. Each bar shows how prominent each emotion is in the text.

Graph the sentiment analysis in ggplot2
# Visualisation
library("ggplot2")
# geom_col() draws one bar per emotion with height equal to its count
ggplot(td_new2, aes(x = sentiment, y = count, fill = sentiment)) +
  geom_col() +
  ggtitle("Email sentiments")
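
Raw counts grow with the number of emails, so it can be easier to compare emotions as percentage shares. The sketch below is one possible variation of the chart; the three-row `td_new2` and the `share` column are made up for illustration and stand in for the real totals.

```r
library(ggplot2)

# Made-up totals standing in for td_new2.
td_new2 <- data.frame(
  sentiment = c("anger", "joy", "trust"),
  count = c(10, 30, 60)
)

# Convert raw counts to percentage shares of all emotion words.
td_new2$share <- 100 * td_new2$count / sum(td_new2$count)

# Order the bars from the largest share to the smallest.
p <- ggplot(td_new2, aes(x = reorder(sentiment, -share), y = share, fill = sentiment)) +
  geom_col() +
  labs(x = "emotion", y = "share of emotion words (%)", title = "Email sentiments")
print(p)
```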
