
1373 papers with code • 40 benchmarks • 97 datasets

Sentiment Analysis is the task of classifying the polarity of a given text. For instance, a tweet can be categorized as "positive", "negative", or "neutral". Given texts and accompanying labels, a model can be trained to predict the correct sentiment.

Sentiment Analysis techniques can be categorized into machine learning approaches, lexicon-based approaches, and hybrid methods. Subcategories of sentiment analysis research include multimodal sentiment analysis, aspect-based sentiment analysis, fine-grained opinion analysis, and language-specific sentiment analysis.
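The lexicon-based family of approaches can be sketched in a few lines. The word lists below are illustrative toy lexicons, not a real resource such as VADER or SentiWordNet:

```python
# Minimal lexicon-based polarity scorer (toy word lists, for illustration only).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def lexicon_sentiment(text: str) -> str:
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

Real lexicon methods add negation handling, intensifiers, and weighted scores, but the core idea is this count-and-compare step.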

More recently, deep learning models such as RoBERTa and T5 have been used to train high-performing sentiment classifiers, evaluated with metrics such as precision, recall, and F1. Benchmark datasets such as SST, GLUE, and the IMDb movie reviews corpus are used to evaluate sentiment analysis systems.
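The evaluation metrics mentioned above follow directly from confusion-matrix counts; a minimal sketch:

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from confusion-matrix counts
    (true positives, false positives, false negatives)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For multi-class sentiment (positive/negative/neutral), these are computed per class and then macro- or weighted-averaged.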

Further reading:

  • Sentiment Analysis Based on Deep Learning: A Comparative Study


Benchmarks

Across the per-dataset leaderboards, the current best models include T5-11B, RoBERTa-large (combined with LlamBERT, Heinsen Routing, or Entailment as Few-shot Learner), XLNet, BERT-large, Bangla-BERT (large), MA-BERT, AnglE-LLaMA-7B, InstructABSA, W2V2-L-LL60K, BERTweet, UDALM, k-RoBERTa, CalBERT, RobBERT v2, AEN-BERT, RuBERT-RuSentiment, xlmindic-base-uniscript, FiLM, Space-XLNet, AraBERTv1, RoBERTa-wwm-ext-large, LSTM+CNN ensembles, CNN-LSTM, RCNN, fastText, and classical baselines such as SVM and Naive Bayes.


Most implemented papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

Convolutional Neural Networks for Sentence Classification


We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks.

Universal Language Model Fine-tuning for Text Classification


Inductive transfer learning has greatly impacted computer vision, but existing approaches in NLP still require task-specific modifications and training from scratch.

Bag of Tricks for Efficient Text Classification

facebookresearch/fastText • EACL 2017

This paper explores a simple and efficient baseline for text classification.
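The fastText baseline combines unigram and bigram features hashed into a fixed-size table before a linear classifier. A rough sketch of that featurization step follows; the CRC32 hash and bucket count are illustrative stand-ins for fastText's internal hashing:

```python
import zlib

def hashed_features(text: str, buckets: int = 2**10) -> dict[int, int]:
    """Unigram + bigram counts hashed into a fixed number of buckets,
    in the spirit of the fastText bag-of-tricks featurization.
    CRC32 stands in for fastText's own hash; the bucket count is arbitrary."""
    tokens = text.lower().split()
    ngrams = tokens + [" ".join(p) for p in zip(tokens, tokens[1:])]
    feats: dict[int, int] = {}
    for g in ngrams:
        idx = zlib.crc32(g.encode()) % buckets
        feats[idx] = feats.get(idx, 0) + 1
    return feats
```

Hashing keeps the feature space bounded regardless of vocabulary size, which is what makes the model fast to train on large corpora.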

RoBERTa: A Robustly Optimized BERT Pretraining Approach

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

A Structured Self-attentive Sentence Embedding

This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention.

Deep contextualized word representations

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).

Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training.

Domain-Adversarial Training of Neural Networks

Our approach is directly inspired by the theory on domain adaptation suggesting that, for effective domain transfer to be achieved, predictions must be made based on features that cannot discriminate between the training (source) and test (target) domains.

Revisiting Sentiment Analysis for Software Engineering in the Era of Large Language Models


CCS concepts: Software and its engineering → Software creation and management → Software post-development issues → Maintaining software

Recommendations

Sentiment analysis for software engineering: How far can we go?

Sentiment analysis has been applied to various software engineering (SE) tasks, such as evaluating app reviews or analyzing developers' emotions in commit messages. Studies indicate that sentiment analysis tools provide unreliable results when used out-...

Sentiment in software engineering: detection and application

In software engineering the role of human aspects is an important one, especially as developers indicate that they experience a wide range of emotions while developing software. Within software engineering researchers have sought to understand the ...

Joint sentiment/topic model for sentiment analysis

Sentiment analysis or opinion mining aims to use automated tools to detect subjective information such as opinions, attitudes, and feelings expressed in text. This paper proposes a novel probabilistic modeling framework based on Latent Dirichlet ...

Information

Published in: ACM Transactions on Software Engineering and Methodology

Association for Computing Machinery

New York, NY, United States

Author tags

  • Large Language Models
  • Sentiment Analysis
  • Software Engineering
  • Research-article



Open Access | Peer-reviewed | Research Article

Longitudinal analysis of sentiment and emotion in news media headlines using automated labelling with Transformer language models

  • David Rozado (Te Pūkenga–New Zealand Institute of Skills and Technology, Dunedin, Otago, New Zealand). Roles: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing
  • Ruth Hughes (Department of Psychology, University of Otago, Dunedin, Otago, New Zealand). Roles: Data curation, Project administration
  • Jamin Halberstadt. Roles: Conceptualization, Methodology, Supervision, Writing – original draft, Writing – review & editing

PLOS ONE

  • Published: October 18, 2022
  • https://doi.org/10.1371/journal.pone.0276367

Abstract

This work describes a chronological (2000–2019) analysis of sentiment and emotion in 23 million headlines from 47 news media outlets popular in the United States. We use Transformer language models fine-tuned for detection of sentiment (positive, negative) and Ekman's six basic emotions (anger, disgust, fear, joy, sadness, surprise) plus neutral to automatically label the headlines. Results show an increase of sentiment negativity in headlines across written news media since the year 2000. Headlines from right-leaning news media have been, on average, consistently more negative than headlines from left-leaning outlets over the entire studied time period. The chronological analysis of headline emotionality shows a growing proportion of headlines denoting anger, fear, disgust and sadness, and a decrease in the prevalence of emotionally neutral headlines, across the studied outlets over the 2000–2019 interval. The prevalence of headlines denoting anger appears to be higher, on average, in right-leaning news outlets than in left-leaning news media.

Citation: Rozado D, Hughes R, Halberstadt J (2022) Longitudinal analysis of sentiment and emotion in news media headlines using automated labelling with Transformer language models. PLoS ONE 17(10): e0276367. https://doi.org/10.1371/journal.pone.0276367

Editor: Sergio Consoli, European Commission, ITALY

Received: January 31, 2022; Accepted: October 5, 2022; Published: October 18, 2022

Copyright: © 2022 Rozado et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The URL sources of the articles' headlines, the Transformer models used for sentiment/emotion predictions, the sentiment and emotion labels generated by the Transformer language models for each headline, the human sentiment/emotion annotations for a small subset of headlines used as ground truth to evaluate the models' performance, and the analysis scripts are available in the following repository: https://doi.org/10.5281/zenodo.5144113.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Headlines from written news media constitute an important source of information about current affairs. News and opinion articles headlines often establish the first point of contact between an article and potential readers, with the reader often deciding whether to engage more in-depth with an article’s content after evaluating its headline [ 1 ]. In doing so, headlines also set the tone about the main text body of the article and affect readers’ processing of articles’ content to the point of constraining further information processing and biasing readers towards specific interpretations of the article [ 2 , 3 ].

The sentiment and emotionality of text has been shown to influence its virality [ 4 ]. Textual content that evokes high arousal, such as text conveying an emotion of anger, diffuses more profusely through online platforms [ 5 , 6 ]. Emotionally charged fake news also spreads further and faster through social media [ 7 ]. A study measuring the reach of tweets found that each moral or emotional word used in a tweet increased its virality by 20 percent, on average [ 8 ]. Thus, user engagement can be maximized by news article posts that trigger negative sentiment/emotions [ 9 ]. This creates a financial incentive for news outlets to maximize incoming web traffic by modulating the emotional saliency of headlines.

News content has also been shown to be predictive of public mood [ 10 ], public opinion [ 11 ] and outlets' biases [ 12 , 13 ]. Thus, studying the sentiment (positive/negative) and emotional payload (anger, disgust, fear, joy, sadness, surprise or neutral) of news article headlines is of sociological interest. As far as we can tell, however, a comprehensive longitudinal analysis of the sentiment and emotion of news media headlines remains lacking in the existing literature. Here, we attempt to remedy this knowledge gap by documenting chronologically the sentiment and emotion of headlines in a representative sample of news media outlets.

Examining written sources using human coders has been useful in the sociological analysis of text content [ 14 – 16 ]. Unfortunately, this approach is limited by its inability to scale to large corpora and by low intercoder reliability when examining subtle themes. Computational content analysis techniques circumvent some of the limitations of content analysis using human raters by permitting the quantification of textual attributes in vast text corpora [ 17 , 18 ].

Modern machine learning language models constitute an important tool for the automated analysis of text [ 13 , 19 – 21 ]. In particular, Transformer models [ 22 , 23 ] have achieved state-of-the-art performance in numerous Natural Language Processing (NLP) tasks. A Transformer model is a deep neural network that learns words' context, and thus meaning, by using a mechanism known as self-attention: a form of differentially weighting the significance of each part of the input sentence when constructing word embeddings. Transformer architectures have reached prediction accuracies that match human annotations for text classification tasks such as the labelling of sentiment polarity [ 23 ]. Thus, computational content analysis of large chronological corpora using state-of-the-art machine learning models can provide insight about the temporal dynamics of semantic content in vast textual corpora [ 19 ].
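The self-attention weighting described above can be sketched numerically. This is a single scaled dot-product attention step over toy token vectors, not a full Transformer layer (no learned projections, no multiple heads):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(q, k, v):
    """Scaled dot-product self-attention for one sequence.
    q, k, v: lists of equal-length vectors, one per token."""
    d = len(q[0])
    out = []
    for qi in q:
        # Similarity of this token's query to every token's key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        w = softmax(scores)  # attention weights over all tokens
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(wi * vj[t] for wi, vj in zip(w, v)) for t in range(len(v[0]))])
    return out
```

Each output vector is a weighted mixture of all token representations, with the weights ("attention") derived from query-key similarity.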

This work uses modern Transformer language models, fine-tuned for text classification, to automatically label the sentiment polarity and emotional charge of a large data set of news articles headlines (N = 23 million). The set of news outlets analyzed was derived from the AllSides Media Bias Chart 2019 v1.1 [ 24 ] which lists 47 of the most popular news media outlets in the United States. Leveraging the diachronic nature of the corpus (2000–2019), we carry out a longitudinal analysis of sentiment polarity and emotional payload over time. Using external labels of news media outlets political leanings from the AllSides organization [ 24 ], we also examine the sentiment and emotional dynamics of headlines controlling for the ideological orientation of news outlets.

Ethics approval

Institutional ethics approval for gathering from human raters the sentiment and emotion annotations of a subset of news media headlines was obtained from the University of Otago Ethics Committee (reference number for proposal: D21/234). The human raters recruited for the annotation of the headlines provided written informed consent to participate in the study.

Analysis scripts and data availability

The URL sources of the articles' headlines, the Transformer models used for sentiment/emotion predictions, the sentiment and emotion labels generated by the Transformer language models for each headline, the human sentiment/emotion annotations for a small subset of headlines used as ground truth to evaluate the models' performance, and the analysis scripts are available in the following repository: https://doi.org/10.5281/zenodo.5144113.

Headlines data

The set of news media outlets analysed was derived from the AllSides organization 2019 Media Bias Chart v1.1 [ 24 ]. The human ratings of outlets’ ideological leanings were also taken from this chart. The AllSides Media Bias Chart has been used previously in the literature as a representative sample of popular U.S. news media outlets and as a ground truth of news outlets ideological leanings [ 6 , 12 , 25 ].

In total, we analyzed 23+ Million headlines from 47 news media outlets over the period 2000–2019. Average headline length in number of characters was 58.3. Average headline length in number of tokens (i.e. unigrams) was 9.4. See S1 File for detailed histograms about these metrics.

News article headlines from the set of outlets listed in Fig 1 are available in the outlets' online domains and/or public cache repositories such as The Internet Wayback Machine, Google cache and Common Crawl. Article headlines were located in the articles' raw HTML using outlet-specific XPath expressions.

Fig 1 caption: The shaded area indicates the 95% confidence interval around the mean. A statistical test for the null hypothesis of zero slope is shown on the bottom left of the plot. The percentage change in average yearly sentiment across outlets between 2000 and 2019 is shown on the top left of the plot.

https://doi.org/10.1371/journal.pone.0276367.g001

To avoid unrepresentative samples, we established an inclusion criterion of at least 100 outlet headlines in any given year in order for that year to be included in the outlet's time series. The temporal coverage of headlines across news outlets is not uniform. For some media organizations, news article availability in online domains or Internet cache repositories becomes sparse for earlier years. Furthermore, some news outlets popular in 2019, such as The Huffington Post or Breitbart, did not exist in the early 2000s. Hence, our data set is sparser in headline sample size and representativeness for earlier years in the 2000–2019 range. Nevertheless, 18 outlets in our data set have chronologically continuous availability of headlines fulfilling our inclusion criterion since the year 2000. This smaller subset, with a total of 12.5 million headlines, was used to replicate our experiments and confirm the validity of the results when using a fixed set of outlets over time; see S1 File for a detailed report about the number of headlines per outlet/year in our analysis.

Using a Transformer language model to predict the sentiment of headlines

Automated sentiment polarity annotation refers to the usage of computational tools to predict the sentiment polarity (positive or negative) of a text instance. Although the sentiment polarity of individual instances of text can sometimes be ambiguous, and humans can occasionally disagree about the sentiment of a particular piece of text, aggregating sentiment polarity over a large set of text instances provides a robust measurement of overall sentiment in a corpus, since the accuracy of individual automated annotations is well above chance.
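The aggregation argument can be checked with a small simulation (the accuracy and prevalence values below are arbitrary assumptions, not the paper's): an annotator correct with probability a > 0.5 measures a positive share of p·a + (1 − p)·(1 − a), which still tracks the true share p.

```python
import random

def simulate_aggregate(true_share: float, accuracy: float,
                       n: int, seed: int = 0) -> float:
    """Measured positive share when each of n items (true positive share
    `true_share`) is labelled by an annotator correct with prob. `accuracy`."""
    rng = random.Random(seed)
    positive = 0
    for _ in range(n):
        truth = rng.random() < true_share
        # The annotator flips the true label with probability 1 - accuracy.
        predicted = truth if rng.random() < accuracy else not truth
        positive += predicted
    return positive / n
```

With a = 0.9 and p = 0.3 the expected measurement is 0.3·0.9 + 0.7·0.1 = 0.34; the measurement is biased toward 0.5 but monotone in p, so trends over time survive the annotation noise.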

In recent years, Transformer models have reached state-of-the-art results for automated sentiment polarity detection in natural language text [ 23 ]. In this work we use SiEBERT, a public checkpoint of a RoBERTa-large Transformer architecture [ 26 ] previously fine-tuned and evaluated for sentiment analysis on 15 data sets from diverse text sources to enhance generalization of sentiment annotations across different types of text [ 27 ]. Due to the heterogeneity of sources used for fine-tuning, SiEBERT outperforms the accuracy of a DistilBERT-based model fine-tuned solely on the popular Stanford Sentiment Treebank 2 (SST-2) data set by more than 15 percentage points (93.2 vs. 78.1 percent) [ 28 ]. The fine-tuning hyperparameters of SiEBERT were: learning rate = 2×10⁻⁵, number of training epochs = 3.0, number of warmup steps = 500, weight decay = 0.01 [ 27 , 28 ].

To validate the usage of the Transformer model for estimating headline sentiment, we measured the performance of the fine-tuned SiEBERT model on a random sample of 1,120 headlines from our data set that we had manually annotated for positive/negative sentiment using raters recruited through Mechanical Turk. We used these labels as ground truth to measure the performance of the SiEBERT model when predicting the sentiment of news media headlines. Only individuals over 18 years old and residents of the United States of America were allowed to take part. In total, 71 individuals (measured as independent IP addresses) took part in the headline sentiment annotation task. The SiEBERT model fine-tuned for sentiment annotation reached an accuracy of 75% on this task. Note that human intercoder agreement on the same task was 80% (Cohen's Kappa: 0.59). These results hint at the validity of the Transformer model to, on aggregate, measure the sentiment of news media headlines on par with human annotations.
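Cohen's Kappa, used above to quantify intercoder agreement, is observed agreement corrected for the agreement expected by chance; a minimal sketch for categorical labels:

```python
def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa between two annotators' label sequences."""
    assert len(a) == len(b) and a
    n = len(a)
    labels = set(a) | set(b)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    p_chance = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (p_observed - p_chance) / (1 - p_chance)
```

Kappa is 1 for perfect agreement, 0 when agreement is exactly what chance predicts, and negative when raters agree less than chance.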

We used the SiEBERT model fine-tuned for sentiment classification to automatically annotate the sentiment of every headline in our data set. We then averaged the sentiment scores of all headlines of each news outlet in any given year to obtain time series of yearly headline sentiment polarity for each outlet. Headlines with more than 32 tokens were truncated prior to automated annotation for GPU memory efficiency. To further validate our results, we replicated our experiments using the popular DistilBERT-based model fine-tuned on the SST-2 data set [ 29 ].

Using a Transformer language model to predict the emotion of headlines

Machine learning language models can also be used to detect the emotionality of text by generating emotion category annotations for instances of natural language text. We used a public Transformer DistilRoBERTa-base checkpoint previously fine-tuned on 6 different emotion data sets for recognizing Ekman's 6 basic emotions (anger, disgust, fear, joy, sadness, and surprise) plus neutral [ 28 , 30 , 31 ]. The fine-tuning hyperparameters of this model were: learning rate = 5×10⁻⁵, number of training epochs = 3.0, number of warmup steps = 500, weight decay = 0.01 [ 31 ].

The data sets used for fine-tuning represent a diverse collection of text types, such as Twitter, Reddit, student self-reports and TV dialogues. The heterogeneity of data sets used for fine-tuning was intended by the original authors to enhance the generalization of emotion predictions across different types of text.

To validate the ability of the model to generate accurate emotion annotations of headlines in our data set, we used the DistilRoBERTa-base model fine-tuned for emotion recognition on a random sample of 5,353 headlines from our data set that we had annotated through Mechanical Turk for Ekman's 6 basic emotion types plus neutral, and that we used as ground truth to estimate the model's performance. Only individuals over 18 years old and residents of the United States of America were allowed to take part. In total, 143 individuals (measured as independent IP addresses) took part in the headline emotion annotation task.

The DistilRoBERTa model achieved 39% classification accuracy on the task of classifying the headlines for which we had human-generated classification labels and which we used as ground truth (random guessing would be expected to reach 14%). Note that human interrater agreement on this task was also very low, 36%. See S1 File for detailed analysis. Also, since the emotion classes are not balanced in the data set of human-annotated headline emotionality, the accuracy metric is not particularly informative. Thus, we report the weighted precision, recall and F1 scores of the model as 0.37, 0.39 and 0.36 respectively; see S1 File for detailed reporting for each emotional category and corresponding confusion matrices. Cohen's kappa between model predictions and ground truth was 0.16. The Matthews correlation coefficient between model predictions and ground truth was 0.16. Both metrics are relatively low but above the 0 level indicative of weighted random guessing. The performance of the model was above chance guessing for all emotional categories except surprise. Thus, in the Results section we drop this category for all subsequent analyses.
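The Matthews correlation coefficient used above generalizes to the multiclass case; for intuition, a sketch of the binary form from confusion-matrix counts:

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion-matrix counts.
    Ranges from -1 (total disagreement) through 0 (chance) to +1 (perfect)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike plain accuracy, MCC stays near 0 for a classifier that merely tracks class frequencies, which is why it is informative on imbalanced label sets like the emotion annotations here.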

Interrater agreement between human raters for the emotion annotation task was 36% (Cohen’s Kappa = 0.16). Thus, interrater agreement was better than chance but relatively low. This is suggestive of the emotional annotation task being inherently ambiguous and/or subjective. For all emotional categories except surprise , interrater agreement between pairs of humans and between humans and the model was very similar. Thus, the performance of the model is mostly on par with human annotations. When using such a model to annotate a large number of headlines aggregated by year, yearly central tendency estimations should be more robust than noisy individual headline predictions.

To confirm that the automated model can detect overall trends in the emotional valence of headlines over time, we carried out a simulation using the true positive and false positive rates of the model for the different emotion categories to generate simulated annotations of illustrative hardcoded trends (see S1 File for details), and averaging those simulated predictions per year. When averaging a small set of simulated headline emotion predictions per year (N = 100), the resulting average is unable to capture the underlying dynamics of headline emotionality. However, when aggregating a larger set of simulated headline emotion predictions per year (N = 2,000), the resulting average is able to loosely capture the emotional dynamics of most emotion categories. When aggregating an even larger set of simulated headline emotion predictions per year (N = 10,000 or N = 100,000), the resulting average is able to capture the emotional dynamics of all emotion categories except surprise with moderate to very high correlation. The underperformance in the simulation of the surprise category was expected, since the prediction accuracy of the model on this particular category was on par with chance guessing. Note also that our data set contains a very large number of headlines per year: a minimum of more than 300,000 for the year 2000, and more than 1 million headlines per year since 2009 (see S1 File for a detailed breakdown by outlet and year), allowing yearly central tendencies to reliably determine the emotional dynamics of headlines. A word cloud of the most prevalent words in each emotional category of headlines is included as S1 File to provide further support for the accuracy of the automated annotation method.

Chronological analysis of sentiment in news articles headlines

Fig 1 shows the average yearly sentiment of news article headlines across the 47 popular news outlets analyzed. A pattern of increasing negative sentiment in headlines over time is apparent. A linear regression t-test to determine whether the slope of the regression line differs significantly from zero was conducted: t(18) = -9.63, p<10⁻⁷. The percentage change in the average sentiment of headlines from the year 2000 to the year 2019 is -314%. The slope of growing negativity appears to increase post-2010. A Chow Test [ 32 ] for structural break detection in 2010 is significant (F = 28.83, p<10⁻⁵).
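The zero-slope t-test used here can be sketched with ordinary least squares; this computes the slope estimate and its t-statistic for H0: slope = 0 (the data in the test are synthetic, not the paper's):

```python
import math

def slope_t_statistic(xs, ys):
    """OLS slope and its t-statistic for H0: slope = 0 (n - 2 df)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(residual_ss / (n - 2) / sxx)  # standard error of the slope
    return slope, slope / se
```

The t-statistic is then compared against a t-distribution with n − 2 degrees of freedom, matching the paper's t(18) for 20 yearly observations.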

A potential confound in Fig 1 is that more recent years aggregate a larger number of outlets. Thus, the pattern in Fig 1 could be due to a qualitatively different mix of outlets over time. However, redoing the analysis in Fig 1 using 12.5 million headlines from the 18 news media outlets in the data set with continuous availability of news articles headlines since the year 2000 also shows a pattern of declining sentiment in headlines; see S1 File for details.

We replicated the analysis in Fig 1 using a different Transformer model (DistilBERT) fine-tuned on the SST-2 sentiment data set. This variation of the analysis produces very similar results to those reported in Fig 1; see S1 File for details.

Sentiment of news articles headlines by ideological leanings of news outlets

Aggregating the sentiment of headlines according to the ideological leanings of news outlets, using human ratings of outlet political leanings from the 2019 AllSides Media Bias Chart v1.1 [ 24 ], shows that the pattern of increasing negativity in news headlines is consistent across left-leaning and right-leaning outlets; see Fig 2. Both right-leaning and left-leaning news outlets display increasing negative sentiment in their headlines since the year 2000. There is a high degree of correlation in the sentiment of headlines between right-leaning and left-leaning outlets (r = 0.82). On average, right-leaning news outlets have historically tended to use more negative headlines than left-leaning news outlets and continue to do so in 2019. Centrist news outlets appear to use less negative headlines than both right- and left-leaning news outlets, but the small set of outlets (N = 7) classified as centrist by the 2019 AllSides Media Bias Chart v1.1 warrants caution when interpreting the external validity of the centrist outlets' trendline. Replicating this analysis using only the 18 media outlets with news article headlines available since the year 2000 shows similar trends to those in Fig 2, with the caveat that the declining sentiment trend for right-leaning outlets is milder (see S1 File ).
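The correlation between the left- and right-leaning sentiment series (r = 0.82) is an ordinary Pearson correlation over the two yearly time series; a minimal sketch:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Here xs and ys would be the average yearly sentiment of left-leaning and right-leaning outlets, respectively, over 2000–2019.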

Fig 2 caption: The figure displays the standard error bars of the average yearly sentiment for outlets within each color-coded political orientation category. For each ideological grouping, statistical tests for the null hypothesis of zero slope are shown on the bottom left of the plot.

https://doi.org/10.1371/journal.pone.0276367.g002

Chronological analysis of emotion in news articles headlines

Next, we analyze the emotional charge of headlines using the emotion predictions of the DistilRoBERTa-base Transformer model fine-tuned for emotion labelling. The aggregation of the average yearly prevalence of emotion labels across the 47 popular news outlets analyzed is shown in Fig 3. Linear regression t-tests to determine whether the slope of the regression line differs significantly from zero were conducted for each emotion (see Fig 3 for each test's results). Reported p-values have been Bonferroni-corrected for multiple comparisons.
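The Bonferroni correction applied to these p-values simply scales each p-value by the number of tests and caps the result at 1; a one-function sketch:

```python
def bonferroni(p_values: list[float]) -> list[float]:
    """Bonferroni correction: multiply each p-value by the number of tests,
    capping at 1. Controls the family-wise error rate across m comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

With seven emotion categories tested, a raw p-value must be below α/7 to remain significant at level α after correction.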

Fig 3 caption: The shaded gray area indicates the 95% confidence interval around the mean. Note the different scale of the Y axes for the different emotion types. For each emotional category, statistical tests for the null hypothesis of zero slope are shown on the bottom left of each subplot. Reported p-values have been Bonferroni-corrected for multiple comparisons. The percentage changes between 2000 and 2019 are shown on the top left of each subplot.

https://doi.org/10.1371/journal.pone.0276367.g003

An increase of 104% in the prevalence of headlines denoting anger since the year 2000 is apparent in Fig 3. There are also substantial increases in the prevalence of headlines denoting fear (+150%), disgust (+29%) and sadness (+54%) in the 2000–2019 studied time range. In contrast, the prevalence of headlines with neutral emotion has experienced a continuous decrease (-30%) since the year 2000. The joy emotional category shows a curvilinear pattern, with an increasing proportion of headlines denoting joy from 2000 to 2010 and a decreasing trend from 2010 to 2019. Chow Tests [ 32 ] (Bonferroni-corrected for multiple comparisons) for structural break detection in 2010 are significant for anger (F = 29.07, p<10⁻⁴), disgust (F = 27.97, p<10⁻⁴), joy (F = 23.69, p<10⁻⁴), sadness (F = 6.48, p<0.05) and neutral (F = 7.64, p<0.05). Notice the different scale of the Y-axes for the different emotion types, which might exaggerate the apparent temporal dynamics of emotion categories with low prevalence such as disgust. To confirm that the patterns shown in Fig 3 are not the result of a different qualitative composition of outlets between the year 2000 and the year 2019, we replicate the experiment using only the 18 outlets in the data set with continuous online availability of headlines since the year 2000 (N = 12.5 million). Results show very similar trends to those displayed in Fig 3; see S1 File. Replicating the previous analysis with the 12 news outlets with more than 2,000 headlines per year since 2000 (N = 12 million) shows very similar trends. Another replication with the six news outlets with more than 10,000 headlines per year since 2000 (N = 8 million) shows very similar results to those reported in Fig 3 (see S1 File for details).

Emotionality of news article headlines by ideological leaning of news outlets

Aggregating the emotionality of headlines according to the ideological leanings of the outlets, using political bias ratings from the 2019 AllSides Media Bias Chart v1.1 [24], shows that the increasing prevalence of headlines denoting anger is apparent in both right-leaning and left-leaning news outlets; see Fig 4. Centrist news outlets follow a similar trend over the studied time frame. Anger-denoting headlines appear more prevalent in right-leaning outlets than in left-leaning outlets over the entire studied time period. Fear- and sadness-denoting headlines are also increasing across the entire ideological spectrum. The decreasing prevalence of headlines with neutral emotional valence appears to be consistent in left-leaning, centrist and right-leaning outlets. The degree of correlation between the emotionality of headlines in left-leaning and right-leaning news outlets is substantial for most emotion types. Replicating this analysis using only the 18 news outlets with headlines available since the year 2000 shows similar trends; see S1 File for details.
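The left–right correlation described above amounts to a Pearson correlation between the yearly prevalence series of the two ideological groupings. A toy illustration with NumPy (the prevalence numbers are invented for demonstration and do not come from the study's data):

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(2000, 2020)

# Hypothetical yearly prevalence (%) of anger-denoting headlines:
# right-leaning outlets at a higher level, both trending upward.
left  = 1.0 + 0.05 * (years - 2000) + rng.normal(0, 0.05, years.size)
right = 1.4 + 0.06 * (years - 2000) + rng.normal(0, 0.05, years.size)

# Pearson correlation between the two ideological groupings:
r = np.corrcoef(left, right)[0, 1]
```

Two series sharing a common upward trend correlate strongly even when their levels differ, which is the pattern the aggregated analysis reports.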


Note the different scale of the Y axes for the different emotion types. Only statistical tests within each ideological grouping for which the null hypothesis of zero slope was rejected (after Bonferroni correction for multiple comparisons) are shown on the bottom left of each plot.

https://doi.org/10.1371/journal.pone.0276367.g004

The results of this work show an increase in sentiment negativity in headlines across news media outlets popular in the United States since at least the year 2000. The sentiment of headlines in right-leaning news outlets has been, on average, more negative than the sentiment of headlines in left-leaning news outlets for the entirety of the 2000–2019 studied time interval. Also, since at least the year 2008, there has been a substantial increase in the prevalence of headlines denoting anger across popular news media outlets. Here as well, right-leaning news media appear, on average, to have used a higher proportion of anger-denoting headlines than left-leaning news outlets. The prevalence of headlines denoting fear and sadness has also increased overall during the 2000–2019 interval. Within the same temporal period, the proportion of headlines with neutral emotional valence has markedly decreased across the entire news media ideological spectrum.

The higher prevalence of negativity and anger in right-leaning news media is noteworthy. Perhaps this is due to right-leaning news media simply using more negative language than left-leaning news media to describe the same phenomena. Alternatively, the higher negativity and anger undertones in headlines from right-leaning news media could be driven by differences in topic coverage between both types of outlets. Clarifying the underlying reasons for the different sentiment and emotional undertones of headlines between left-leaning and right-leaning news media could be an avenue for relevant future research.

The structural break in the sentiment polarity and the emotional payload of headlines around 2010 is intriguing, although the short length of the time series under investigation (just 20 years of observations) makes the reliability of this break estimate uncertain. Due to the methodological limitations of our observational study, we can only speculate about its potential causes.

In the year 2009, the social media giants Facebook and Twitter added the like and retweet buttons, respectively, to their platforms [33]. These features allowed those social media companies to collect information about how to capture users' attention and maximize engagement through algorithmically determined personalized feeds. Information about which news articles diffused most widely through social media percolated back to news outlets via user-tracking systems such as browser cookies and social media virality metrics. In the early 2010s, media companies also began testing news media headlines across dozens of variations to determine the version that generated the highest click-through ratio [34]. Thus, a perverse incentive might have emerged in which news outlets, judging by the larger reach/popularity of their articles with negative/emotional headlines, drifted towards increasing usage of negative sentiment/emotions in their headlines.

A limitation of this work is the frequent semantic overloading of the sentiment/emotion annotation task. The negative sentiment category, for instance, often conflates under the same umbrella notion of negativity text that describes suffering and/or being at the receiving end of mistreatment, as in “the Prime Minister has been a victim of defamation”, with text that denotes negative behavior or character traits, as in “the Prime Minister is selfish”. Thus, it is uncertain whether the increasing prevalence of headlines with negative connotations emphasizes victimization, negative behavior/judgment, or a mixture of the two.

An additional limitation of this work is the frequent ambiguity of the sentiment/emotion annotation task. The sentiment polarity and particularly the emotional payload of a text instance can be highly subjective, and intercoder agreement is generally low, especially for the latter, albeit above chance guessing. For this reason, automated annotations for single instances of text can be noisy and thus unreliable. Yet, as shown in the simulation experiments (see S1 File for details), when aggregating the emotional payload over a large number of headlines, the average signal rises above the noise to provide a robust proxy of overall emotion in large text corpora. Reliable annotations at the individual headline level, however, would require more overdetermined emotional categories.
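The intuition that aggregation averages out annotation noise can be illustrated with a short simulation (the prevalences and noise rate below are invented for illustration and are not the paper's simulation parameters):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical true anger prevalence in two years.
p_2000, p_2019 = 0.02, 0.04
n_headlines = 100_000

def noisy_annotations(p_true, n, flip=0.2):
    """Simulate an annotator/classifier that mislabels each headline
    with probability `flip` (symmetric label noise)."""
    truth = rng.random(n) < p_true
    flips = rng.random(n) < flip
    return np.where(flips, ~truth, truth)

# Each individual label is unreliable (20% error rate), yet the
# aggregate prevalence estimates still preserve the between-year
# ordering and trend once averaged over many headlines.
est_2000 = noisy_annotations(p_2000, n_headlines).mean()
est_2019 = noisy_annotations(p_2019, n_headlines).mean()
```

With symmetric noise the observed prevalence is a fixed affine transform of the true one (p·(1 − f) + (1 − p)·f), so differences and trends survive aggregation even though single labels do not.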

The imbalanced nature of the emotion labels also represents a challenge for the classification analysis. For that reason, we used performance metrics that are recommended when handling imbalanced data, such as confusion matrices, precision, recall and F1 scores. Different algorithms, such as decision trees, are often recommended when working with imbalanced data. However, since Transformer models represent the state of the art for NLP text classification, we circumscribed our analysis to their usage. Other techniques for dealing with imbalanced data, such as oversampling the minority class or undersampling the majority class, could also have been used. However, our relatively small number of human-annotated headlines (1,124 for sentiment and 5,353 for emotion) constrained our ability to trim the human-annotated data set.
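Why accuracy misleads on imbalanced labels, and why per-class precision, recall and F1 are preferred, can be shown with a small self-contained example (toy labels, not the study's data):

```python
import numpy as np

def prf(y_true, y_pred, positive):
    """Precision, recall and F1 for one class — the metrics
    recommended over raw accuracy when labels are imbalanced."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Imbalanced toy labels: 90 "neutral" headlines, 10 "anger".
y_true = ["neutral"] * 90 + ["anger"] * 10
# A degenerate classifier that always predicts the majority class
# scores 90% accuracy yet never finds the minority class:
y_pred = ["neutral"] * 100
p, r, f1 = prf(y_true, y_pred, positive="anger")
```

Here accuracy is 0.9 while recall and F1 on the minority class are 0, which is exactly the failure mode the per-class metrics expose.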

Another limitation of this work is the potential biases of the human raters that annotated the sentiment and emotion of news media headlines. It is conceivable that our sample of human raters, recruited through Mechanical Turk, is not representative of the general US population. For instance, the distribution of socioeconomic status among raters active on Mechanical Turk might not match the distribution of the entire US population. The impact of such potential sample bias on headline sentiment/emotion estimation is uncertain.

A final limitation of our work is the small number of outlets falling into the centrist political orientation category according to the AllSides Media Bias Chart v1.1. Such a small sample size limits representativeness and constrains the external validity of the centrist-outlet results reported here.

An important question raised by this work is whether the sentiment and emotionality embedded in news media headlines reflect a wider societal mood or if instead they just reflect the sentiment and emotionality prevalent or pushed by those creating news content. Financial incentives to maximize click-through ratios could be at play in increasing the sentiment polarity and emotional charge of headlines over time. Conceivably, the temptation of shaping the sentiment and emotional undertones of news headlines to advance political agendas could also be playing a role. Deciphering these unknowns is beyond the scope of this article and could be a worthy goal for future research.

To conclude, we hope this work paves the way for further exploration of the potential impact on public consciousness of the growing emotionality and sentiment negativity of news media content, and of whether such trends are conducive to sustaining public well-being. We hope that future research sheds light on the potential psychological and social impact of public consumption of news media diets with increasingly negative sentiment and anger/fear/sadness undertones embedded within them.

Supporting information

https://doi.org/10.1371/journal.pone.0276367.s001

  • 14. Krippendorff K. Content Analysis: An Introduction to Its Methodology. 3rd ed. Los Angeles; London: SAGE Publications, 2012.
  • 15. Neuendorf K. A. The Content Analysis Guidebook. 1st ed. Thousand Oaks, CA: SAGE Publications, 2001.
  • 20. Raza S., Ding C. “News Recommender System Considering Temporal Dynamics and News Taxonomy,” in 2019 IEEE International Conference on Big Data (Big Data), Dec. 2019, pp. 920–929. https://doi.org/10.1109/BigData47090.2019.9005459
  • 21. Raza S., Ding C. “Deep Neural Network to Tradeoff between Accuracy and Diversity in a News Recommender System,” in 2021 IEEE International Conference on Big Data (Big Data), Dec. 2021, pp. 5246–5256. https://doi.org/10.1109/BigData52589.2021.9671467
  • 24. AllSides. “AllSides Media Bias Ratings,” 2019. https://www.allsides.com/blog/updated-allsides-media-bias-chart-version-11 (accessed May 10, 2020).
  • 28. Heitmann M., Siebert C., Hartmann J., Schamp C. “More than a Feeling: Benchmarks for Sentiment Analysis Accuracy,” SSRN Scholarly Paper ID 3489963, Jul. 2020. https://doi.org/10.2139/ssrn.3489963

Springer Nature – PMC COVID-19 Collection


A review on sentiment analysis and emotion detection from text

Pansy Nandwani

Computer Science and Engineering Department, Punjab Engineering College, Chandigarh, India

Rupali Verma

Social networking platforms have become an essential means for communicating feelings to the entire world due to rapid expansion in the Internet era. Several people use textual content, pictures, audio, and video to express their feelings or viewpoints. Text communication via Web-based networking media, on the other hand, can be somewhat overwhelming. Every second, a massive amount of unstructured data is generated on the Internet due to social media platforms. The data must be processed as rapidly as it is generated to comprehend human psychology, and this can be accomplished using sentiment analysis, which recognizes polarity in texts. It assesses whether the author has a negative, positive, or neutral attitude toward an item, administration, individual, or location. In some applications, sentiment analysis is insufficient and hence requires emotion detection, which determines an individual’s emotional/mental state precisely. This review paper provides insight into the levels of sentiment analysis, various emotion models, and the process of sentiment analysis and emotion detection from text. Finally, it discusses the challenges faced during sentiment and emotion analysis.

Introduction

Human language understanding and human language generation are the two aspects of natural language processing (NLP). The former is more challenging due to ambiguities present in natural language. Speech recognition, document summarization, question answering, speech synthesis, machine translation, and other applications all employ NLP (Itani et al. 2017). Sentiment analysis and emotion recognition are two critical areas of natural language processing. Even though these two names are sometimes used interchangeably, they differ in a few respects. Sentiment analysis is a means of assessing whether data is positive, negative, or neutral.

In contrast, emotion detection is a means of identifying distinct human emotion types such as furious, cheerful, or depressed. “Emotion detection,” “affective computing,” “emotion analysis,” and “emotion identification” are all phrases that are sometimes used interchangeably (Munezero et al. 2014). People are using social media to communicate their feelings since Internet services have improved. On social media, people freely express their feelings, arguments, and opinions on a wide range of topics. In addition, many users give feedback on and review various products and services on e-commerce sites. Users’ ratings and reviews on multiple platforms encourage vendors and service providers to enhance their current systems, goods, or services. Today, almost every industry or company is undergoing some digital transition, resulting in vast amounts of structured and unstructured data. The enormous task for companies is to transform unstructured data into meaningful insights that can help them in decision-making (Ahmad et al. 2020).

For instance, in the business world, vendors use social media platforms such as Instagram, YouTube, Twitter, and Facebook to broadcast information about their products and efficiently collect client feedback (Agbehadji and Ijabadeniyi 2021). People’s active feedback is valuable not only for business marketers to measure customer satisfaction and keep track of the competition but also for consumers who want to learn more about a product or service before buying it. Sentiment analysis assists marketers in understanding their customers’ perspectives better so that they may make necessary changes to their products or services (Jang et al. 2013; Al Ajrawi et al. 2021). In both advanced and emerging nations, the impact of business and client sentiment on stock market performance may be witnessed. In addition, the rise of social media has made it easier and faster for investors to interact in the stock market. As a result, investors’ sentiments impact their investment decisions, which can swiftly spread and magnify over the network, and the stock market can be altered to some extent (Ahmed 2020). Sentiment and emotion analysis has thus changed the way we conduct business (Bhardwaj et al. 2015).

In the healthcare sector, online social media like Twitter have become essential sources of health-related information provided by healthcare professionals and citizens. For example, people have been sharing their thoughts, opinions, and feelings on the Covid-19 pandemic (Garcia and Berton 2021 ). Patients were directed to stay isolated from their loved ones, which harmed their mental health. To save patients from mental health issues like depression, health practitioners must use automated sentiment and emotion analysis (Singh et al. 2021 ). People commonly share their feelings or beliefs on sites through their posts, and if someone seemed to be depressed, people could reach out to them to help, thus averting deteriorated mental health conditions.

Sentiment and emotion analysis plays a critical role in the education sector, both for teachers and students. The efficacy of a teacher is decided not only by academic credentials but also by enthusiasm, talent, and dedication. Taking timely feedback from students is the most effective technique for a teacher to improve teaching approaches (Sangeetha and Prabha 2020). Open-ended textual feedback is difficult to observe, and it is also challenging to derive conclusions from it manually. The findings of sentiment and emotion analysis assist teachers and organizations in taking corrective action. Since social media’s inception, educational institutes have increasingly relied on platforms like Facebook and Twitter for marketing and advertising purposes. Students and guardians conduct considerable online research to learn more about a potential institution, its courses and its professors. They use blogs and other discussion forums to interact with students who share similar interests and to assess the quality of possible colleges and universities. Thus, applying sentiment and emotion analysis can help a student select the best institute or teacher during registration (Archana Rao and Baglodi 2017).

Sentiment and emotion analysis has a wide range of applications and can be done using various methodologies. There are three types of sentiment and emotion analysis techniques: lexicon based, machine learning based, and deep learning based. Each has its own set of benefits and drawbacks. Despite different sentiment and emotion recognition techniques, researchers face significant challenges, including dealing with context, ridicule, statements conveying several emotions, spreading Web slang, and lexical and syntactical ambiguity. Furthermore, because there are no standard rules for communicating feelings across multiple platforms, some express them with incredible effect, some stifle their feelings, and some structure their message logically. Therefore, it is a great challenge for researchers to develop a technique that can efficiently work in all domains.

In this review paper, Sect. 2 introduces sentiment analysis and its various levels, emotion detection, and psychological models. Section 3 discusses the multiple steps involved in sentiment and emotion analysis, including datasets, pre-processing of text, feature extraction techniques, and various sentiment and emotion analysis approaches. Section 4 addresses multiple challenges faced by researchers during sentiment and emotion analysis. Finally, Sect. 5 concludes the work.

Sentiment analysis

Many people worldwide now use blogs, forums, and social media sites such as Twitter and Facebook to share their opinions with the rest of the globe. Social media has become one of the most effective communication media available. As a result, an ample amount of data is generated, called big data, and sentiment analysis was introduced to analyze this big data effectively and efficiently (Nagamanjula and Pethalakshmi 2020). It has become exceptionally crucial for an industry or organization to comprehend the sentiments of its users. Sentiment analysis, often known as opinion mining, is a method for detecting whether an author’s or user’s viewpoint on a subject is positive or negative. Sentiment analysis is defined as the process of obtaining meaningful information and semantics from text using natural language processing techniques and determining the writer’s attitude, which might be positive, negative, or neutral (Onyenwe et al. 2020). Although the purpose of sentiment analysis is to determine polarity and categorize opinionated texts as positive or negative, the class range of a dataset is not restricted to just positive or negative; it can be agree or disagree, good or bad. It can also be quantified on a 5-point scale: strongly disagree, disagree, neutral, agree, or strongly agree (Prabowo and Thelwall 2009). For instance, Ye et al. (2009) applied sentiment analysis to reviews of European and US destinations labeled on a scale of 1 to 5. They associated 1-star and 2-star reviews with negative polarity and reviews above 2 stars with positive polarity. Gräbner et al. (2012) built a domain-specific lexicon consisting of tokens with their sentiment values. These tokens were gathered from customer reviews in the tourism domain to classify sentiment into 5-star ratings from terrible to excellent. Salinca (2015) applied machine learning algorithms to the Yelp dataset, which contains reviews of service providers rated from 1 to 5. Sentiment analysis can be performed at three levels, discussed in the following section.
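A rating-to-polarity scheme like the one used by Ye et al. (2009) is straightforward to express in code. The helper below is a hypothetical illustration of that mapping, not code from any of the cited studies:

```python
def stars_to_polarity(stars: int) -> str:
    """Map a 1-5 star review to a sentiment label, following the
    scheme of Ye et al. (2009): 1-2 stars negative, above 2 positive."""
    if not 1 <= stars <= 5:
        raise ValueError("star rating must be between 1 and 5")
    return "negative" if stars <= 2 else "positive"

labels = [stars_to_polarity(s) for s in [1, 2, 3, 4, 5]]
# → ['negative', 'negative', 'positive', 'positive', 'positive']
```

Collapsing an ordinal scale into binary polarity discards intensity information, which is why some of the works above instead keep the full 5-point range as the label set.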

Levels of sentiment analysis

Sentiment analysis is possible at three levels: sentence level, document level, and aspect level. In sentence-level (or phrase-level) sentiment analysis, documents or paragraphs are broken down into sentences, and each sentence’s polarity is identified (Meena and Prabhakar 2007; Arulmurugan et al. 2019; Shirsat et al. 2019). At the document level, the sentiment is detected from the entire document or record (Pu et al. 2019). Document-level sentiment analysis is needed to extract global sentiment from long texts that contain redundant local patterns and lots of noise. The most challenging aspect of document-level sentiment classification is taking into account the link between words and phrases and the full context of semantic information to reflect document composition (Rao et al. 2018; Liu et al. 2020a). It necessitates a deeper understanding of the intricate internal structure of sentiments and dependent words (Liu et al. 2020b). In aspect-level sentiment analysis, the opinion about a specific aspect or feature is determined. For instance, in “the speed of the processor is high, but this product is overpriced”, speed and cost are two aspects or viewpoints. Speed is mentioned in the sentence and is hence called an explicit aspect, whereas cost is an implicit aspect. Aspect-level sentiment analysis is a bit harder than the other two, as implicit aspects are hard to identify. Devi Sri Nandhini and Pradeep (2020) proposed an algorithm to extract implicit aspects from documents based on the frequency of co-occurrence of an aspect with its feature indicator and by exploiting the relation between opinionated words and explicit aspects. Ma et al. (2019) addressed two issues concerning aspect-level analysis: multiple aspects in a single sentence having different polarities, and the explicit position of context in an opinionated sentence. The authors built a two-stage model based on LSTM with an attention mechanism to solve these issues. They proposed this model based on the assumption that context words near an aspect are more relevant and need greater attention than farther context words. At stage one, the model exploits multiple aspects in a sentence one by one with a position attention mechanism. At the second stage, it identifies (aspect, sentence) pairs according to the position of the aspect and the context around it and calculates the polarity of each pair simultaneously.
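The position intuition behind such models — nearer context words matter more — can be sketched in a few lines. This toy weighting is a deliberate simplification, not the attention mechanism of Ma et al. (2019):

```python
import numpy as np

def position_weights(n_tokens, aspect_idx):
    """Toy position weighting: context words closer to the aspect
    term get larger weights (a simplified stand-in for a position
    attention mechanism, not any published model)."""
    positions = np.arange(n_tokens)
    dist = np.abs(positions - aspect_idx)
    w = 1.0 - dist / n_tokens          # linear decay with distance
    return w / w.sum()                 # normalise to a distribution

sentence = "the speed of the processor is high but this product is overpriced".split()
w = position_weights(len(sentence), aspect_idx=sentence.index("speed"))
# The aspect word itself receives the largest weight.
```

In a real attention layer these position weights would modulate learned token representations rather than being the final scores.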

As stated earlier, sentiment analysis and emotion analysis are often used interchangeably by researchers. However, they differ in a few ways. In sentiment analysis, polarity is the primary concern, whereas in emotion detection the emotional or psychological state or mood is detected. Sentiment analysis is exceptionally subjective, whereas emotion detection is more objective and precise. Section 2.2 describes emotion detection in detail.

Emotion detection

Emotions are an inseparable component of human life. These emotions influence human decision-making and help us communicate with the world in a better way. Emotion detection, also known as emotion recognition, is the process of identifying a person’s various feelings or emotions (for example, joy, sadness, or fury). Researchers have been working hard to automate emotion recognition for the past few years. Physiological signals such as heart rate, trembling hands, sweating, and voice pitch also convey a person’s emotional state (Kratzwald et al. 2018), but emotion detection from text is quite hard. In addition, various ambiguities and the new slang or terminologies introduced with each passing day make emotion detection from text more challenging. Furthermore, emotion detection is not restricted to identifying the primary psychological states (happy, sad, angry); instead, it can extend to 6-scale or 8-scale label sets depending on the emotion model.

Emotion models/emotion theories

In English, the word 'emotion' came into existence in the seventeenth century, derived from the French word 'émotion', meaning a physical disturbance. Before the nineteenth century, passion, appetite, and affections were categorized as mental states. In the nineteenth century, 'emotion' came to be considered a psychological term (Dixon 2012). In psychology, complex states of feeling that lead to changes in thoughts, actions, behavior, and personality are referred to as emotions. Broadly, psychological or emotion models are classified into two categories: dimensional and categorical.

Dimensional Emotion model This model represents emotions based on three parameters: valence, arousal, and power (Bakker et al. 2014). Valence means polarity, and arousal means how exciting a feeling is. For example, delighted is more exciting than happy. Power, or dominance, signifies restriction over emotion. These parameters decide the position of psychological states in the two-dimensional space illustrated in Fig. 1.
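As a concrete, deliberately simplified illustration, emotions under a dimensional model can be treated as points in valence–arousal–dominance space and compared by distance. The coordinates below are invented for demonstration and are not taken from any published VAD lexicon:

```python
import math

# Illustrative valence-arousal-dominance coordinates in [0, 1]
# (toy values only). Note "delighted" shares valence with "happy"
# but has higher arousal, matching the example in the text.
VAD = {
    "happy":     (0.80, 0.55, 0.60),
    "delighted": (0.85, 0.75, 0.60),
    "sad":       (0.15, 0.30, 0.30),
    "angry":     (0.15, 0.80, 0.70),
}

def nearest_emotion(point):
    """Classify a VAD point by its nearest labelled emotion."""
    return min(VAD, key=lambda e: math.dist(VAD[e], point))

label = nearest_emotion((0.82, 0.70, 0.60))
```

A high-valence, high-arousal point lands nearest to "delighted" rather than "happy", which is exactly the distinction the arousal axis encodes.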

Fig. 1: Dimensional model of emotions

Categorical Emotion model

In the categorical model, emotions are defined discretely, such as anger, happiness, sadness, and fear. Depending upon the particular categorical model, emotions are categorized into four, six, or eight categories.

Table 1 demonstrates numerous emotion models, both dimensional and categorical. In the realm of emotion detection, most researchers adopt Ekman’s or Plutchik’s emotion model. The emotional states defined by the models make up the set of labels used to annotate the sentences or documents. Batbaatar et al. (2019), Becker et al. (2017) and Jain et al. (2017) adopted Ekman’s six basic emotions. Sailunaz and Alhajj (2019) used the Ekman model for annotating tweets. Some researchers used customized emotion models, extending a model with one or two additional states. Roberts et al. (2012) extended the Ekman model with a 'love' state to annotate tweets. Ahmad et al. (2020) adopted Plutchik’s wheel of emotions for labeling Hindi sentences with nine different Plutchik model states, reducing semantic confusion between words. States from the Plutchik and Ekman models are also utilized in various handcrafted lexicons like the WordNet-Affect (Strapparava et al. 2004) and NRC (Mohammad and Turney 2013) word–emotion lexicons. Laubert and Parlamis (2019) referred to the Shaver model because of its three-level hierarchical structure of emotions: valence or polarity is presented at the first level, followed by a second level consisting of five emotions, while the third level contains 24 discrete emotion states. Some researchers did not refer to any model and classified their datasets into three basic feelings: happy, sad, or angry.
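Handcrafted word–emotion lexicons such as NRC are typically applied by counting, per text, how many words map to each emotion. The sketch below uses a tiny invented lexicon; the entries are hypothetical, not real NRC rows:

```python
from collections import Counter

# A tiny hand-made word-emotion lexicon in the spirit of NRC
# (toy entries for illustration; the real lexicon is far larger).
LEXICON = {
    "attack":  ["anger", "fear"],
    "crisis":  ["fear", "sadness"],
    "victory": ["joy"],
    "fraud":   ["anger", "disgust"],
}

def emotion_counts(headline):
    """Count lexicon emotion hits in a lower-cased headline."""
    counts = Counter()
    for word in headline.lower().split():
        counts.update(LEXICON.get(word.strip(".,!?"), []))
    return counts

c = emotion_counts("Fraud crisis sparks attack")
```

The resulting counts (here anger and fear dominate) can serve directly as features or be normalized into an emotion distribution per text.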

Table 1: Emotion models defined by various psychologists

| Emotion model | Type of model | No. of states | Psychological states | Representation | Discussion |
|---|---|---|---|---|---|
| Ekman model (Ekman) | Categorical | 6 | Anger, disgust, fear, joy, sadness, surprise | — | Ekman’s model consists of six emotions, which act as a base for other emotion models like the Plutchik model |
| Plutchik Wheel of Emotions (Plutchik) | Dimensional | — | Joy, pensiveness, ecstasy, acceptance, sadness, fear, interest, rage, admiration, amazement, anger, vigilance, boredom, annoyance, submission, serenity, apprehension, contempt, surprise, disapproval, distraction, grief, loathing, love, optimism, aggressiveness, remorse, anticipation, awe, terror, trust, disgust | Wheel | Plutchik considered two types of emotions: basic (Ekman model + trust + anticipation) and mixed (combinations of basic emotions), represented on a colored wheel |
| Izard model (Izard) | — | 10 | Anger, contempt, disgust, anxiety, fear, guilt, interest, joy, shame, surprise | — | — |
| Shaver model (Shaver et al.) | Categorical | 6 | Sadness, joy, anger, fear, love, surprise | Tree | Shaver represented primary, secondary and tertiary emotions hierarchically; the top level of the tree presents these six emotions |
| Russell’s circumplex model (Russell) | Dimensional | — | Sad, satisfied, afraid, alarmed, frustrated, angry, happy, gloomy, annoyed, tired, relaxed, glad, aroused, astonished, at ease, tense, miserable, content, bored, calm, delighted, excited, depressed, distressed, serene, droopy, pleased, sleepy | Circumplex | Emotions are presented over the circumplex model |
| Tomkins model (Tomkins and McCarter) | Categorical | 9 | Disgust, surprise-startle, anger-rage, anxiety, fear-terror, contempt, joy, shame, interest-excitement | — | Tomkins identified nine different emotions, six of which are negative; most are defined as a pair |
| Lövheim model (Lövheim) | Dimensional | 8 | Anger, contempt, distress, enjoyment, terror, excitement, humiliation, startle | Cube | Lövheim arranged the emotions according to the amounts of three substances (noradrenaline, dopamine and serotonin) on a 3-D cube |

Figure 2 depicts the numerous emotional states found in various models. These states are plotted on four axes, taking the Plutchik model as the base model. The emotion states most commonly used across models include anger, fear, joy, surprise, and disgust. The figure also shows that emotions on the two sides of an axis are not always opposites of each other: sadness and joy are opposites, but anger is not the opposite of fear.

Fig. 2: Illustration of various emotional models with some psychological states

Process of sentiment analysis and emotion detection

The process of sentiment analysis and emotion detection proceeds through several stages: dataset collection, pre-processing, feature extraction, model development, and evaluation, as shown in Fig. 3.

Fig. 3: Basic steps to perform sentiment analysis and emotion detection

Datasets for sentiment analysis and emotion detection

Table 2 lists numerous sentiment and emotion analysis datasets that researchers have used to assess the effectiveness of their models. The most common datasets in the field of sentiment and emotion analysis are SemEval, the Stanford Sentiment Treebank (SST), and the International Survey of Emotional Antecedents and Reactions (ISEAR). The SemEval and SST datasets have various variants which differ in domain, size, etc. ISEAR was collected from multiple respondents who felt one of the seven emotions (mentioned in the table) in some situation. As the table shows, the datasets mainly comprise tweets, reviews, feedback, stories, etc. The EmoBank dataset, collected from news, blogs, letters, etc., uses the dimensional valence–arousal–dominance (VAD) model. Many studies in the literature have acquired data from social media sites such as Twitter, YouTube, and Facebook and had it labeled by language and psychology experts. Data crawled from posts, blogs, and e-commerce sites on various social media platforms is usually unstructured and thus needs to be processed into a structured form, using the pre-processing steps outlined in the following section.

| Dataset | Data size | Sentiment/emotion analysis | Sentiments/emotions | Range | Domain |
|---|---|---|---|---|---|
| Stanford Sentiment Treebank (Chen et al.) | 11,855 reviews in SST-1 | Sentiment analysis | Very positive, positive, negative, very negative and neutral | 5 | Movie reviews |
| | 9,613 reviews in SST-2 | Sentiment analysis | Positive and negative | 2 | Movie reviews |
| SemEval tasks (Ma et al.; Ahmad et al.) | SemEval-2014 (Task 4): 5,936 reviews for training and 1,758 reviews for testing | Sentiment analysis | Positive, negative and neutral | 3 | Laptop and restaurant reviews |
| | SemEval-2018 (Affects in Tweets task): 7,102 tweets in Emotion and Intensity ordinal classification (EI-oc) | Emotion analysis | Anger, joy, sadness and fear | 4 | Tweets |
| Thai fairy tales (Pasupa and Ayutthaya) | 1,964 sentences | Sentiment analysis | Positive, negative and neutral | 3 | Children’s tales |
| SS-Tweet (Symeonidis et al.) | 4,242 tweets | Sentiment analysis | Positive strength and negative strength | 1 to 5 for positive and −1 to −5 for negative | Tweets |
| EmoBank (Buechel and Hahn) | 10,548 texts | Emotion analysis | Valence–arousal–dominance (VAD) model | — | News, blogs, fiction, letters, etc. |
| International Survey of Emotional Antecedents and Reactions (ISEAR) (Seal et al.) | Around 7,500 sentences | Emotion analysis | Guilt, joy, shame, fear, sadness, disgust | 7 | Incident reports |
| Alm gold standard data set (Agrawal and An) | 1,207 sentences | Emotion analysis | Happy, fearful, sad, surprised and angry-disgusted (combined) | 5 | Fairy tales |
| EmoTex (Hasan et al.) | 134,100 sentences | Emotion analysis | Circumplex model | — | Twitter |
| Text Affect (Chaffar and Inkpen) | 1,250 sentences | Emotion analysis | Ekman | 6 | Google news |
| Neviarouskaya dataset (Alswaidan and Menai) | Dataset 1: 1,000 sentences; Dataset 2: 700 sentences | Emotion analysis | Izard | 10 | Stories and blogs |
| Aman’s dataset (Hosseini) | 1,890 sentences | Emotion analysis | Ekman with neutral class | 7 | Blogs |

Pre-processing of text

On social media, people usually communicate their feelings and emotions in effortless ways. As a result, the data obtained from posts, audits, comments, remarks, and criticisms on these platforms are highly unstructured, making sentiment and emotion analysis difficult for machines. Pre-processing is therefore a critical data-cleaning stage, since data quality significantly affects the approaches that follow it. Organizing a dataset requires pre-processing steps such as tokenization, stop-word removal, POS tagging, etc. (Abdi et al. 2019 ; Bhaskar et al. 2015 ). Some of these pre-processing techniques can result in the loss of information that is crucial for sentiment and emotion analysis, which must be addressed.

Tokenization is the process of breaking down a document, paragraph, or sentence into chunks of words called tokens (Nagarajan and Gandhi 2019 ). For instance, the sentence “this place is so beautiful” becomes, after tokenization, 'this', 'place', 'is', 'so', 'beautiful'. It is also essential to normalize the text to achieve uniformity in the data, e.g., by converting the text into a standard form and correcting the spelling of words (Ahuja et al. 2019 ).
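As a minimal illustration, tokenization and case normalization can be sketched in a few lines of Python (the regex-based `tokenize` helper below is a toy, not a production tokenizer):

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

print(tokenize("This place is SO beautiful"))
# ['this', 'place', 'is', 'so', 'beautiful']
```

Real tokenizers also handle punctuation, contractions, emojis, and hashtags, which matter on social media text.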

Unnecessary words such as articles and some prepositions that do not contribute toward emotion recognition and sentiment analysis must be removed. For instance, stop words like "is," "at," "an," and "the" have nothing to do with sentiment, so they are removed to avoid unnecessary computation (Bhaskar et al. 2015 ; Abdi et al. 2019 ). Part-of-speech (POS) tagging identifies the different parts of speech in a sentence. This step is helpful for finding the various aspects of a sentence, which are generally described by nouns or noun phrases, while sentiments and emotions are conveyed by adjectives (Sun et al. 2017 ).
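A stop-word filter can be sketched as follows (the `STOP_WORDS` set here is a tiny hand-made list for illustration; real systems use larger curated lists):

```python
# Toy stop-word list; real lists contain a few hundred entries.
STOP_WORDS = {"is", "at", "an", "the", "a", "of", "to"}

def remove_stop_words(tokens):
    """Drop tokens that carry no sentiment signal."""
    return [t for t in tokens if t not in STOP_WORDS]

print(remove_stop_words(["this", "place", "is", "so", "beautiful"]))
# ['this', 'place', 'so', 'beautiful']
```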

Stemming and lemmatization are two further crucial pre-processing steps. In stemming, words are converted to their root form by truncating suffixes; for example, the terms "argued" and "argue" are both reduced to the same stem. This reduces unwanted computation over sentences (Kratzwald et al. 2018 ; Akilandeswari and Jothi 2018 ). Lemmatization involves morphological analysis that removes inflectional endings from a token to recover its base word, the lemma (Ghanbari-Adivi and Mosleh 2019 ). For instance, the term "caught" is converted into "catch" (Ahuja et al. 2019 ). Symeonidis et al. ( 2018 ) examined the performance of four machine learning models with a combination and ablation study of various pre-processing techniques on two datasets, namely SS-Tweet and SemEval. The authors concluded that removing numbers and lemmatization enhanced accuracy, whereas removing punctuation did not affect it.
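The idea of suffix truncation can be illustrated with a toy stemmer (`toy_stem` is a deliberately crude sketch; real stemmers such as Porter's apply ordered rule sets with many more conditions):

```python
def toy_stem(word):
    """Very rough suffix-stripping stemmer, for illustration only."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(toy_stem("argued"))   # 'argu'
print(toy_stem("walking"))  # 'walk'
```

Note how the stem need not be a dictionary word ('argu'); a lemmatizer would instead return the base word 'argue'.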

Feature extraction

Machines understand text in terms of numbers. The process of converting or mapping text or words to real-valued vectors is called word vectorization or word embedding. It is a feature extraction technique in which a document is broken down into sentences that are further broken down into words, after which a feature map or matrix is built. In the resulting matrix, each row represents a sentence or document, each column represents a word in the dictionary, and the cell values generally signify the count of that word in the sentence or document.

One of the most straightforward feature extraction methods is the 'Bag of Words' (BOW), in which a fixed-length count vector is defined where each entry corresponds to a word in a pre-defined dictionary. A word in a sentence is assigned a count of 0 if it is not present in the pre-defined dictionary, and otherwise a count of 1 or more depending on how many times it appears in the sentence; the length of the vector is therefore always equal to the number of words in the dictionary. The advantage of this technique is its easy implementation, but it has significant drawbacks: it leads to a sparse matrix, loses the order of words in the sentence, and does not capture the meaning of a sentence (Bandhakavi et al. 2017 ; Abdi et al. 2019 ). For example, the representation of the text “are you enjoying reading” over the pre-defined dictionary {I, Hope, you, are, enjoying, reading} would be (0, 0, 1, 1, 1, 1). These representations can be improved by pre-processing the text and by using n-grams or TF-IDF.
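The BOW example above can be reproduced directly (the `bow_vector` helper is an illustrative name):

```python
from collections import Counter

def bow_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return tuple(counts[w.lower()] for w in vocabulary)

vocab = ["I", "Hope", "you", "are", "enjoying", "reading"]
print(bow_vector("are you enjoying reading", vocab))
# (0, 0, 1, 1, 1, 1)
```

The vector length always equals the vocabulary size, which is why BOW matrices grow sparse as the dictionary grows.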

The n-gram method is a good option for preserving word order in sentence vector representations. In an n-gram representation, the text is represented as a collection of unique n-grams, i.e., groups of n adjacent terms or words, where n can be any natural number. For example, for the sentence “to teach is to touch a life forever”, n = 3 (a trigram) generates 'to teach is', 'teach is to', 'is to touch', 'to touch a', 'touch a life', 'a life forever'. In this way, the order of the sentence is maintained (Ahuja et al. 2019 ). N-gram features perform better than the BOW approach as they capture syntactic patterns that carry critical information (Chaffar and Inkpen 2011 ). However, though n-grams maintain word order, they suffer from high dimensionality and data sparsity (Le and Mikolov 2014 ).
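The trigram example can be generated with a short sketch (`ngrams` is a hypothetical helper name):

```python
def ngrams(tokens, n):
    """Return all groups of n adjacent tokens, joined as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

words = "to teach is to touch a life forever".split()
print(ngrams(words, 3))
# ['to teach is', 'teach is to', 'is to touch', 'to touch a',
#  'touch a life', 'a life forever']
```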

Term frequency-inverse document frequency, usually abbreviated TF-IDF, is another method commonly used for feature extraction. It represents text in matrix form, where each number quantifies how much information a term carries in a given document. It is built on the premise that rare terms carry much information about a text document (Liu et al. 2019 ). Term frequency (TF) is the number of times a word w appears in a document divided by the total number of words W in the document, and IDF is log(N/n), where N is the total number of documents and n is the number of documents in which word w appears (Songbo and Jin 2008 ). Ahuja et al. ( 2019 ) implemented six pre-processing techniques and compared two feature extraction techniques to identify the best approach. They applied six machine learning algorithms, using n-grams with n = 2 and TF-IDF for feature extraction over the SS-Tweet dataset, and concluded that TF-IDF performs better than n-grams.
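The TF and IDF definitions above translate directly into code (a minimal sketch over toy documents, with each document represented as a token list):

```python
import math

def tf_idf(word, document, corpus):
    """tf = count(w in d) / len(d);  idf = log(N / n_w)."""
    tf = document.count(word) / len(document)
    n_w = sum(1 for doc in corpus if word in doc)
    idf = math.log(len(corpus) / n_w)
    return tf * idf

docs = [["good", "movie"], ["bad", "movie"], ["good", "good", "acting"]]
# For "good" in docs[2]: tf = 2/3, idf = log(3/2)
print(round(tf_idf("good", docs[2], docs), 4))  # 0.2703
```

Library implementations (e.g., scikit-learn's `TfidfVectorizer`) use smoothed variants of the same formula.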

The availability of vast volumes of data allows a deep learning network to discover good vector representations. Feature extraction with word embedding based on neural networks is more informative: words with the same semantics, or words related to each other, are represented by similar vectors. This is popular in word prediction because it retains the semantics of words. Google's research team, headed by Tomas Mikolov, developed a model named Word2Vec for word embedding. With Word2Vec, a machine can capture relationships such as the vector of “king” minus “male” plus “female” being close to the vector representation of “queen” (Souma et al. 2019 ).
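The analogy can be mimicked with toy vectors (the 3-dimensional "embeddings" below are invented for illustration; real Word2Vec vectors are learned from large corpora, have hundreds of dimensions, and are usually compared with cosine similarity):

```python
# Hand-made 3-d vectors chosen so that king - male + female lands on queen.
vectors = {
    "king":   (0.9, 0.8, 0.1),
    "queen":  (0.9, 0.1, 0.8),
    "male":   (0.1, 0.9, 0.0),
    "female": (0.1, 0.2, 0.7),
    "apple":  (0.0, 0.0, 0.0),
}

def add(u, v): return tuple(a + b for a, b in zip(u, v))
def sub(u, v): return tuple(a - b for a, b in zip(u, v))

def nearest(target, exclude):
    """Return the word whose vector is closest (squared Euclidean)."""
    def dist(w):
        return sum((a - b) ** 2 for a, b in zip(vectors[w], target))
    return min((w for w in vectors if w not in exclude), key=dist)

target = add(sub(vectors["king"], vectors["male"]), vectors["female"])
print(nearest(target, exclude={"king", "male", "female"}))  # 'queen'
```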

Other examples of neural word embedding models include GloVe, developed by researchers at Stanford University, and FastText, introduced by Facebook. GloVe vectors are faster to train than Word2Vec, while FastText vectors achieve better accuracy than Word2Vec vectors on several measures. Yang et al. ( 2018 ) showed that the choice of an appropriate neural word embedding can lead to significant improvements even in the case of out-of-vocabulary (OOV) words. The authors compared various word embeddings, trained on Twitter and Wikipedia corpora, with TF-IDF word embedding.

Techniques for sentiment analysis and emotion detection

Figure  4 presents the various techniques for sentiment analysis and emotion detection, which are broadly classified into lexicon-based, machine learning-based, and deep learning-based approaches. The hybrid approach combines statistical and machine learning approaches to overcome the drawbacks of both. Transfer learning, also a subset of machine learning, allows a pre-trained model to be reused in other, similar domains.


Sentiment analysis techniques

Lexicon-based approach This method maintains a word dictionary in which each positive and negative word is assigned a sentiment value. The sum or mean of the sentiment values is then used to calculate the sentiment of the entire sentence or document. Jurek et al. ( 2015 ), however, tried a different approach, a normalization function, to calculate the sentiment value more accurately than this basic summation or mean. The dictionary-based approach and the corpus-based approach are the two types of lexicon-based approach. In general, a dictionary maintains the words of a language systematically, whereas a corpus is a random sample of text in a language, and the same distinction applies here. In the dictionary-based approach, a dictionary of seed words is maintained (Schouten and Frasincar 2015 ). To create this dictionary, a first small set of sentiment words, possibly with very short contexts like negations, is collected along with their polarity labels (Bernabé-Moreno et al. 2020 ). The dictionary is then expanded by looking up synonyms (words with the same polarity) and antonyms (words with the opposite polarity). The accuracy of sentiment analysis via this approach depends on the expansion algorithm; however, the technique is not domain-specific. The corpus-based approach addresses this limitation of the dictionary-based approach by including domain-specific sentiment words, where the polarity label is assigned to a sentiment word according to its context or domain. It is a data-driven approach in which sentiment words can be accessed along with their context, and it can be combined with rule-based methods and NLP parsing techniques. The corpus-based approach therefore tends to generalize poorly but can attain excellent performance within a particular domain.
Since the dictionary-based approach does not consider the context around a sentiment word, it is less effective. Cho et al. ( 2014 ) therefore explicitly handled contextual polarity to make dictionaries adaptable across multiple domains with a data-driven approach. They took a three-step strategy: merge various dictionaries, remove the words that do not contribute toward classification, and switch the polarity according to the particular domain.
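The basic summation approach described above can be sketched as follows (the tiny `LEXICON` is hand-made for illustration; real lexicons such as SentiWordNet contain thousands of scored entries):

```python
# Toy sentiment lexicon: positive scores for positive words, negative
# scores for negative words.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}

def sentence_sentiment(tokens):
    """Sum the lexicon scores of all tokens (basic summation approach)."""
    return sum(LEXICON.get(t, 0.0) for t in tokens)

score = sentence_sentiment("the food was great but service was bad".split())
print(score)  # 1.0 -> overall positive
```

Note how the plain sum ignores context: negations ("not good") and domain-specific meanings are exactly the weaknesses the normalization and corpus-based approaches try to address.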

SentiWordNet (Esuli and Sebastiani 2006 ) and the Valence Aware Dictionary and Sentiment Reasoner (VADER) (Hutto and Gilbert 2014 ) are popular sentiment lexicons. Jha et al. ( 2018 ) tried to extend lexicon application to multiple domains by creating a sentiment dictionary named the Hindi Multi-Domain Sentiment Aware Dictionary (HMDSAD) for document-level sentiment analysis. This dictionary can be used to annotate reviews as positive or negative. The proposed method labeled 24% more words than the traditional general lexicon Hindi SentiWordNet (HSWN), a domain-specific lexicon. Traditional lexicons do not examine the semantic relationships between words, although doing so could improve sentiment classification performance. On this premise, Viegas et al. ( 2020 ) updated a lexicon by including additional terms after utilizing word embeddings to discover sentiment values for these words automatically. These sentiment values were derived from the embeddings of “nearby” words already existing in the lexicon.

Machine learning-based approach In the machine learning approach, the entire dataset is divided into two parts, a training set and a testing set. The training set is used to train the model by supplying the characteristics of different instances of an item, and the testing set is then used to see how well the model has been trained. The machine learning algorithms used for sentiment analysis generally fall under supervised classification; algorithms used for sentiment classification include Naïve Bayes, support vector machines (SVM), decision trees, etc., each with its pros and cons. Gamon ( 2004 ) applied a support vector machine to 40,884 customer feedback items collected from surveys, implemented various feature-set combinations, and achieved accuracy up to 85.47%. Ye et al. ( 2009 ) applied SVM, an n-gram model, and Naïve Bayes to sentiment reviews of seven popular destinations in Europe and the USA, collected from Yahoo.com, and achieved accuracy up to 87.17% with the n-gram model. Bučar et al. ( 2018 ) created a lexicon called JOB 1.0 and a labeled news corpus called SentiNews 1.0 for sentiment analysis of Slovene texts. JOB 1.0 consists of 25,524 headwords with sentiment scores ranging from −5 to +5, based on the AFINN model. To construct the corpus, data were scraped from various news Web media; after cleaning and pre-processing, annotators rated 10,427 documents on a 1–5 scale, where 1 means very negative and 5 means very positive. The documents were then labeled positive, negative, or neutral according to their average rating. The authors observed that Naïve Bayes performed better than the support vector machine (SVM), achieving an F1 score above 90% in binary classification and above 60% in three-class sentiment classification. Tiwari et al. ( 2020 ) implemented three machine learning algorithms, SVM, Naïve Bayes, and maximum entropy, with the n-gram feature extraction method on the Rotten Tomatoes dataset. The training and testing sets each contained 1600 reviews. The authors observed a decrease in accuracy for higher values of n in the n-grams, such as n = 4, 5, and 6. Soumya and Pramod ( 2020 ) classified 3184 Malayalam tweets into positive and negative opinions using different feature vectors such as BOW and unigrams with SentiWordNet. The authors implemented machine learning algorithms such as random forest and Naïve Bayes and observed that random forest, with an accuracy of 95.6%, performs better with unigram SentiWordNet when negation words are considered.
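A minimal multinomial Naïve Bayes classifier of the kind used in these studies can be written from scratch (a sketch with add-one smoothing and a toy training set; real experiments use far larger corpora and proper feature extraction):

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)   # label -> word counts
        self.label_counts = Counter(labels)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
            self.vocab.update(doc)

    def predict(self, doc):
        def log_prob(label):
            prior = math.log(self.label_counts[label] / sum(self.label_counts.values()))
            total = sum(self.word_counts[label].values()) + len(self.vocab)
            likelihood = sum(
                math.log((self.word_counts[label][w] + 1) / total) for w in doc
            )
            return prior + likelihood
        return max(self.label_counts, key=log_prob)

# Toy labeled reviews, already tokenized.
train = [["great", "movie"], ["loved", "it"], ["terrible", "film"], ["hated", "it"]]
labels = ["pos", "pos", "neg", "neg"]
clf = NaiveBayes()
clf.fit(train, labels)
print(clf.predict(["great", "movie"]))  # 'pos'
```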

Deep learning-based approach In recent years, deep learning algorithms have come to dominate traditional approaches to sentiment analysis. These algorithms detect sentiments or opinions in text without manual feature engineering. Multiple deep learning architectures, notably recurrent neural networks and convolutional neural networks, can be applied to sentiment analysis and give more accurate results than machine learning models: deep learning models extract the features or patterns themselves, freeing humans from constructing features manually. Jian et al. ( 2010 ) used a neural network-based model for categorizing sentiments, consisting of sentimental features, feature weight vectors, and a prior knowledge base, and applied it to the Cornell movie review data. Their experimental results showed that the accuracy of their model was far higher than that of HMM and SVM. Pasupa and Ayutthaya ( 2019 ) ran five-fold cross-validation on a Thai children's tale dataset and compared three deep learning models, CNN, LSTM, and Bi-LSTM. These models were applied with and without three features: POS tagging (a pre-processing technique to identify parts of speech), Thai2Vec (word embedding trained from Thai Wikipedia), and sentic features (to capture the sentiment of a word). The authors observed the best performance with the CNN model using all three features. As stated earlier, social media platforms are a significant source of data for sentiment analysis, but data collected from these sites contain a lot of noise owing to users' free writing style. Arora and Kansal ( 2019 ) therefore proposed a model named Conv-char-Emb that can handle noisy data and uses a small memory space for embedding: a convolutional neural network (CNN) performs the embedding with fewer parameters in feature representation. Dashtipour et al. ( 2020 ) proposed a deep learning framework for sentiment analysis in the Persian language and concluded that deep neural networks such as LSTM and CNN outperformed existing machine learning algorithms on hotel and product review datasets.

Transfer learning approach and hybrid approach Transfer learning is also a part of machine learning: a model trained on large datasets to solve one problem can be applied to other related problems. Re-using a pre-trained model on a related domain as a starting point can save time and produce more efficient results. Zhang et al. ( 2012 ) proposed a novel instance-learning method that directly models the distribution between different domains, classifying Amazon product reviews and a Twitter dataset into positive and negative sentiments. Tao and Fang ( 2020 ) extended recent aspect-based sentiment classification methods to multi-label classification. The authors also built transfer learning models based on XLNet and BERT and evaluated the proposed approach on datasets from different domains: Yelp, wine reviews, and Rotten Tomatoes. Deep learning and machine learning approaches yield good results, but a hybrid approach can do better because it overcomes the limitations of each individual model. Mladenović et al. ( 2016 ) proposed a feature reduction technique within a hybrid framework made of a sentiment lexicon and the Serbian WordNet, expanding both lexicons with morphological sentiment words to avoid losing critical information during stemming. Al Amrani et al. ( 2018 ) compared their hybrid model of SVM and random forest, RFSVM, on Amazon product reviews: over a dataset of 1000 reviews, RFSVM reached an accuracy of 83.4%, against 82.4% for SVM and 81% for random forest individually. Alqaryouti et al. ( 2020 ) proposed a hybrid of the rule-based approach and domain lexicons for aspect-level sentiment detection to understand people's opinions of government smart applications, concluding that the proposed technique outperforms other lexicon-based baseline models by 5%. Ray and Chakrabarti ( 2020 ) combined a rule-based approach for extracting aspects with a 7-layer deep CNN model for tagging each aspect. The hybrid model achieved 87% accuracy, whereas the individual models achieved 75% (rule-based) and 80% (CNN).

Table  3 describes various machine learning and deep learning algorithms used for analyzing sentiments in multiple domains. Many researchers evaluated their proposed models on datasets collected from Twitter and other social networking sites, and compared them with existing baseline models and on different datasets. It can be observed from the table that the accuracy of the various models ranges from roughly 80 to 90%.

Work on sentiment analysis

| Reference | Level | Technique | Feature extraction | Learning algorithm | Domain | Dataset | Results |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Songbo and Jin ( ) | Sentence | Machine learning | – | Centroid classifier, K-nearest classifier, Winnow classifier, Naïve Bayes, SVM | House, movie and education | Chn-SentiCorp | Micro-F1 = 90.60% with SVM and IG; macro-F1 = 90.43% |
| Moraes ( ) | Aspect | Machine learning and deep learning | Bag of words | Artificial neural network (ANN), Naïve Bayes, SVM | Movies, books, GPS, cameras | – | Accuracy with ANN = 86.5% (movies), 87.3% (GPS), 81.8% (books), 90.6% (cameras) |
| Tang et al. ( ) | Document | Deep learning | Word embeddings to dense document vector | UPNN (user product neural network) based on CNN | Movies | Yelp and IMDB | Accuracy = 58.5% with UPNN (no UP) and 60.8% with UPNN on Yelp 2014 |
| Dahou et al. ( ) | – | Deep learning | Word embedding built from Arabic corpus | Convolutional neural network (CNN) | Books, movies, restaurants, etc. | LABR book reviews, Arabic sentiment tweet dataset, etc. | Accuracy = 91.7% and 89.6% on the unbalanced HTL and LABR datasets, respectively |
| Ahuja et al. ( ) | Sentence | Machine learning | TF-IDF, n-gram | KNN, SVM, logistic regression, NB, random forest | – | SS-Tweet | Accuracy = 57% with TF-IDF and logistic regression; 51% with n-gram and random forest |
| Untawale and Choudhari ( ) | – | Machine learning | – | Naïve Bayes and random forest | Movie reviews | Rotten Tomatoes, reviews from Times of India, etc. | Naïve Bayes required more time and memory than random forest |
| Shamantha et al. ( ) | – | Machine learning | – | Naïve Bayes, SVM and random forest | Twitter | Twitter | Accuracy above 80% with Naïve Bayes (3 features) on 200 tweets |
| Goularas and Kamis ( ) | – | Machine learning | – | Random forest and SVM | – | – | Accuracy = 95% with random forest |
| Nandal et al. ( ) | Aspect | Machine learning | – | SVM with linear, radial basis function (RBF), and polynomial kernels | – | Amazon reviews | Mean square error = 0.04 with the RBF kernel and 0.11 with the linear kernel |
| Sharma and Sharma ( ) | – | Machine learning and deep learning | – | Deep artificial neural network and SVM | Twitter | Twitter | Positive emotion rate = 87.5 with the proposed algorithm |
| Mukherjee et al. ( ) | Sentence | Machine learning and deep learning with negation prediction | TF-IDF | Naïve Bayes, SVM, artificial neural network (ANN), recurrent neural network (RNN) | Cellphone reviews | Amazon reviews | Accuracy = 95.30% with RNN + negation and 95.67% with ANN + negation |

Emotion detection techniques

Lexicon-based approach The lexicon-based approach is a keyword-based search approach that searches for emotion keywords assigned to psychological states (Rabeya et al. 2017 ). The popular lexicons for emotion detection are WordNet-Affect (Strapparava et al. 2004 ) and the NRC word-emotion lexicon (Mohammad and Turney 2013 ). WordNet-Affect is an extended form of WordNet consisting of affective words annotated with emotion labels. The NRC lexicon consists of 14,182 words, each annotated with emotions and two sentiments. These are categorical lexicons that tag each word with an emotional state for emotion classification. However, by ignoring the intensity of emotions, such traditional lexicons are less informative and less adaptable. Li et al. ( 2021 ) therefore suggested an effective strategy for obtaining word-level emotion distributions that assign emotions with intensities to sentiment words, by merging in a dimensional dictionary, the NRC valence-arousal-dominance lexicon. EmoSenticNet (Poria et al. 2014 ) likewise assigns both qualitative and quantitative labels to a large number of words. Generally, researchers generate their own lexicons and apply them directly for emotion analysis, but lexicons can also be used for feature extraction. Abdaoui et al. ( 2017 ) took advantage of online translation tools to create a French lexicon called FEEL (French Expanded Emotion Lexicon), consisting of more than 14,000 words with both polarity and emotion labels. This lexicon was created by expanding the NRC emotion lexicon through semi-automatic translation with six online translators; entries returned by at least three translators were considered pre-validated and were then validated by a human translator. Bandhakavi et al. ( 2017 ) applied a domain-specific lexicon for feature extraction in emotion analysis and concluded that features derived from their proposed lexicon outperformed the other baseline features. Braun et al. ( 2021 ) constructed a multilingual corpus called MEmoFC (Multilingual Emotional Football Corpus), consisting of football reports from English, Dutch, and German Web sites and match statistics crawled from Goal.com. The corpus was built around two metadata tables: one giving the details of a match, such as its date, place, and participating teams, and a second containing abbreviations of football clubs. The authors analyzed the corpus with various approaches to study the influence of game outcomes on the reports.

Machine learning-based techniques Emotion detection or classification may use different types of machine learning models such as Naïve Bayes, support vector machines, decision trees, etc. Jain et al. ( 2017 ) extracted emotions from multilingual texts collected from three different domains. The authors used a novel approach called rich site summary for data collection and applied the SVM and Naïve Bayes machine learning algorithms for emotion classification of Twitter text; an accuracy of 71.4% was achieved with the Naïve Bayes algorithm. Hasan et al. ( 2019 ) evaluated machine learning algorithms such as Naïve Bayes, SVM, and decision trees for identifying emotions in text messages. The task was divided into two subtasks: task 1 covered collecting a dataset from Twitter, automatically labeling it using hashtags, and training models; task 2 developed a two-stage EmotexStream that separates emotionless tweets at the first stage and identifies emotions in the text using the models trained in task 1. The authors observed an accuracy of 90% in classifying emotions. Asghar et al. ( 2019 ) applied multiple machine learning models to the ISEAR dataset to find the best classifier and found that logistic regression performed better than the other classifiers, with a recall of 83%.

Deep learning and hybrid techniques Deep learning is the part of machine learning that processes information or signals much as the human brain does. Deep learning models contain multiple layers of neurons; thousands of interconnected neurons speed up processing in a parallel fashion. Chatterjee et al. ( 2019 ) developed a model called sentiment and semantic-based emotion detection (SSBED) that feeds sentiment and semantic representations to two LSTM layers, respectively; these representations are then concatenated and passed to a mesh network for classification. The novel approach is based on the probability of multiple emotions being present in a sentence and utilizes both semantic and sentiment representations for better emotion classification. Results were evaluated on their own constructed dataset of tweet conversation pairs, and their model was compared with other baseline models. Xu et al. ( 2020 ) extracted emotion features from video and text using two hybrid models, 3D convolutional-long short-term memory (3DCLS) and CNN-RNN, respectively, while implementing SVM for audio-based emotion classification. They fused audio and video features at the feature level with the MKL fusion technique and further combined the results with the text-based emotion classification results, obtaining better accuracy than other multimodal fusion techniques. Intending to analyze the sentiments of drug reviews written by patients on social media platforms, Basiri et al. ( 2020 ) proposed two models using three-way decision theory: a three-way fusion of one deep learning model with a traditional learning method (3W1DT), and a three-way fusion of three deep learning models with a traditional learning method (3W3DT). The results on the Drugs.com dataset revealed that both frameworks performed better than traditional deep learning techniques, and the first fusion model performed considerably better than the second in terms of accuracy and F1 score. Recently, social media platforms have been flooded with posts related to COVID-19. Singh et al. ( 2021 ) applied emotion detection to COVID-19 tweets collected worldwide and from India alone, using the Bidirectional Encoder Representations from Transformers (BERT) model, and achieved approximately 94% accuracy.

Transfer learning approach In traditional approaches, the common presumption is that the dataset comes from a single domain, so a new model is needed whenever the domain changes. The transfer learning approach allows existing pre-trained models to be reused in the target domain. For example, Ahmad et al. ( 2020 ) used transfer learning because of the lack of resources for emotion detection in Hindi. The researchers pre-trained a model on two different English SemEval-2018 sentiment analysis datasets and one Hindi dataset with positive, neutral, conflict, and negative labels. They achieved an F1 score of 0.53 with transfer learning, against 0.47 using only the base CNN and Bi-LSTM models with cross-lingual word embeddings. Hazarika et al. ( 2020 ) created a TL-ERC model, pre-trained on multi-turn source conversations and then transferred to an emotion classification task on exchanged messages. With this inductive transfer learning framework, the authors addressed issues such as the lack of labeled data in multi-turn conversations.

Table  4 shows that most researchers implemented models combining machine learning and deep learning techniques with various feature extraction techniques. Most of the datasets are in English, although some researchers constructed datasets in their regional languages. For example, Sasidhar et al. ( 2020 ) created a Hindi-English code-mixed dataset with three basic emotions, happy, sad, and angry, and observed that CNN-BiLSTM performed better than the other models.

Work on emotion detection

| Reference | Approach | Feature extraction | Models | Datasets | Emotion model | No. of emotions | Results |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Chaffar and Inkpen ( ) | Machine learning | Bag of words, n-grams, WordNet-Affect | Naïve Bayes, decision tree, and SVM | Multiple datasets | Ekman with neutral class, Izard | 10 | Accuracy = 81.16% on Aman's dataset and 71.69% on global dataset |
| Kratzwald et al. ( ) | Deep learning with transfer learning | Customized embeddings, GloVe | Sent2Affect | Literary tales, election tweets, ISEAR, headlines, general tweets | – | – | F1 = 68.8% on literary dataset with pre-trained Bi-LSTM |
| Sailunaz and Alhajj ( ) | Machine learning | NAVA (noun, adverb, verb and adjective) | SVM, random forest, Naïve Bayes | ISEAR | Guilt, joy, shame, fear, sadness, disgust | 6 | Accuracy = 43.24% on NAVA text with Naïve Bayes |
| Shrivastava et al. ( ) | Deep learning | Word2Vec | Convolutional neural network | TV show transcripts | – | 7 | Training accuracy = 80.41% and 77.54% with CNN (7 emotions) |
| Batbaatar et al. ( ) | Deep learning | Word2Vec, GloVe, FastText, EWE | SENN | ISEAR, EmoInt, electoral tweets, etc. | – | – | Accuracy = 98.8% with GloVe+EWE and SENN on emotion cause dataset |
| Ghanbari-Adivi and Mosleh ( ) | Deep learning | Doc2Vec | Ensemble classifier, tree-structured Parzen estimator (TPE) for parameter tuning | OANC, CrowdFlower, ISEAR | Wonder, anger, hate, happiness, sadness, and fear | 6 | Accuracy = 99.49% on regular sentences |
| Xu et al. ( ) | Deep learning-based hybrid | – | 3DCLS model for visual, CNN-RNN for text, and SVM for audio | MOUD and IEMOCAP | Happy, sad, angry, neutral | 4 | Accuracy = 96.75% by fusing audio and visual features at feature level on MOUD dataset |
| Adoma et al. ( ) | Pre-trained transfer models (machine learning and deep learning) | – | BERT, RoBERTa, DistilBERT, and XLNet | ISEAR | Shame, anger, fear, disgust, joy, sadness, and guilt | 7 | Accuracy = 74%, 79%, 69% for RoBERTa, BERT, respectively |
| Chowanda et al. ( ) | Machine learning and deep learning | SentiStrength, n-gram and TF-IDF | Generalized linear model, Naïve Bayes, fast large-margin, etc. | Affective tweets | Anger, fear, sadness, joy | 4 | Accuracy = 92% and recall = 90% with the generalized linear model |
| Dheeraj and Ramakrishnudu ( ) | Deep learning | GloVe | Multi-head attention with bidirectional long short-term memory and convolutional neural network (MHA-BCNN) | Patient-doctor interactions from WebMD and HealthTap platforms | Anxiety, addiction, obsessive cleaning disorder (OCD), depression, etc. | 6 | Accuracy = 97.8% using MHA-BCNN with Adam optimizer |

Model assessment

Finally, the model is compared with baseline models on various parameters, which requires evaluation metrics to quantify performance. A confusion matrix provides the counts of correct and incorrect predictions against the known actual values: true positives (TP), false negatives (FN), false positives (FP), and true negatives (TN) for the positive and negative classes. Based on these values, researchers evaluate their models with metrics such as accuracy, precision, recall, and F1-score, summarized in Table  5 .
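These counts can be computed directly from a model's predictions; a minimal sketch (the label names and the toy prediction lists are arbitrary):

```python
# Build binary confusion-matrix counts (TP, FP, FN, TN) from
# gold labels and predicted labels, as described above.
from collections import Counter

def confusion_counts(y_true, y_pred, positive="pos"):
    """Count TP/FP/FN/TN for a binary task with the given positive class."""
    c = Counter()
    for t, p in zip(y_true, y_pred):
        if p == positive:
            c["TP" if t == positive else "FP"] += 1
        else:
            c["FN" if t == positive else "TN"] += 1
    return c

# Toy example: 5 samples, 3 predicted correctly.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
cm = confusion_counts(y_true, y_pred)   # {'TP': 2, 'FN': 1, 'TN': 1, 'FP': 1}
```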

Evaluation metrics

| Evaluation metric | Description | Equation |
|---|---|---|
| Accuracy | Summarizes how well the model performs across all classes; helpful when all classes are equally important. The ratio of correct predictions to the total number of predictions. | (TP+TN)/(TP+TN+FP+FN) |
| Precision | Measures the model's accuracy in labelling a sample as positive: the ratio of correctly classified positive samples to all samples classified as positive (correctly or incorrectly). | TP/(TP+FP) |
| Recall | Assesses the model's ability to identify positive samples: the ratio of correctly classified positive samples to the total number of actual positive samples. | TP/(TP+FN) |
| F-measure | The harmonic mean of precision and recall. | (2×Precision×Recall)/(Precision+Recall) = 2TP/(2TP+FP+FN) |
| Sensitivity | The proportion of actual positives correctly detected; quantifies how well the positive class is predicted (equivalent to recall). | TP/(TP+FN) |
| Specificity | The true negative rate, which summarizes how well the negative class is predicted. For imbalanced classification, sensitivity is often of more interest than specificity. | TN/(FP+TN) |
| Geometric mean (G-mean) | Combines sensitivity and specificity into a single value that balances both objectives. | √(Sensitivity×Specificity) |
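The formulas above translate directly into code; a small sketch (the example TP/FP/FN/TN counts are invented, and the G-mean uses the conventional √(sensitivity × specificity) form):

```python
# Compute the evaluation metrics of Table 5 from confusion-matrix counts.
import math

def metrics(tp, fp, fn, tn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)            # also called sensitivity
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": specificity,
        "g_mean": math.sqrt(recall * specificity),
    }

# Invented counts: 40 TP, 10 FP, 20 FN, 30 TN.
m = metrics(tp=40, fp=10, fn=20, tn=30)
```

With these counts, accuracy is 70/100 = 0.7 and F1 equals 2·40/(2·40+10+20) = 80/110, matching the equivalent 2TP/(2TP+FP+FN) form of the F-measure.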

Challenges in sentiment analysis and emotion analysis

In the Internet era, people generate a lot of data in the form of informal text. Social networking sites present various challenges, as shown in Fig.  5 , including spelling mistakes, new slang, and incorrect grammar. These challenges make it difficult for machines to perform sentiment and emotion analysis, and sometimes individuals do not express their emotions clearly. For instance, in the sentence “Y have u been soooo late?”, 'why' is misspelled as 'y,' 'you' as 'u,' and 'soooo' is stretched for emphasis. Moreover, the sentence does not reveal whether the person is angry or worried. Sentiment and emotion detection from real-world data is therefore full of challenges for several reasons (Batbaatar et al. 2019 ).


Challenges in sentiment analysis and emotion detection

One of the challenges in emotion recognition and sentiment analysis is the lack of resources. For example, some statistical algorithms require a large annotated dataset, and while gathering data is not difficult, manually labeling a large dataset is time-consuming and less reliable (Balahur and Turchi 2014 ). The other problem regarding resources is that most are available only in English. Sentiment analysis and emotion detection in languages other than English, primarily regional languages, therefore pose a great challenge and an opportunity for researchers. Furthermore, some corpora and lexicons are domain specific, which limits their reuse in other domains.

Another common problem, usually seen in Twitter, Facebook, and Instagram posts and conversations, is Web slang. For example, the younger generation uses words like 'LOL' (laughing out loud) to express laughter and 'FOMO' (fear of missing out) to convey anxiety. The growing dictionary of Web slang is a massive obstacle for existing lexicons and trained models.
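A pre-processing step often tackles such slang and character stretching with a lookup table; a toy sketch (the dictionary here covers only the examples above — a real system would need a far larger, continuously updated lexicon):

```python
# Normalize Web slang and stretched spellings before analysis.
import re

# Hypothetical, deliberately tiny slang lexicon.
SLANG = {
    "y": "why",
    "u": "you",
    "lol": "laughing out loud",
    "fomo": "fear of missing out",
}

def normalize(text):
    # Collapse character stretching ("soooo" -> "soo"), then expand slang tokens.
    collapsed = re.sub(r"(.)\1{2,}", r"\1\1", text.lower())
    return " ".join(SLANG.get(tok, tok) for tok in collapsed.split())
```

For example, `normalize("Y have u been soooo late?")` yields "why have you been soo late?", bringing the sentence closer to the vocabulary that lexicons and trained models actually cover.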

People often express anger or disappointment through sarcasm and irony, which are hard to detect (Ghanbari-Adivi and Mosleh 2019 ). For instance, in the sentence “This story is excellent to put you in sleep,” the word 'excellent' signals positive sentiment, but the reviewer actually found the story quite dull. Sarcasm detection has therefore become a tedious task in sentiment and emotion detection.

Another challenge is the expression of multiple emotions in a single sentence: it is difficult to determine the various aspects and their corresponding sentiments or emotions in a multi-opinionated sentence. For instance, the sentence “view at this site is so serene and calm, but this place stinks” expresses two emotions, 'disgust' and 'soothing,' toward different aspects. It is also hard to detect polarity in comparative sentences. Consider 'Phone A is worse than Phone B' and 'Phone B is worse than Phone A': the word 'worse' signals negative polarity in both, yet the two sentences state opposite opinions (Shelke 2014 ).

This paper has presented a review of existing techniques for both emotion and sentiment detection. The review shows that lexicon-based techniques perform well in both sentiment and emotion analysis: the dictionary-based approach is adaptable and straightforward to apply, whereas the corpus-based method is built on rules that function effectively in a particular domain, so corpus-based approaches are more accurate but lack generalization. The performance of machine learning and deep learning algorithms depends on pre-processing and the size of the dataset. Nonetheless, machine learning models sometimes fail to extract implicit features or aspects of the text. Where the dataset is vast, deep learning outperforms machine learning. Recurrent neural networks, especially the LSTM model, are prevalent in sentiment and emotion analysis because they capture long-term dependencies and extract features well, and RNNs combined with attention networks perform even better. At the same time, lexicon-based and machine learning approaches (the traditional approaches) are also evolving and have obtained better outcomes. Pre-processing and feature extraction techniques likewise have a significant impact on the performance of all these approaches.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Pansy Nandwani, Email: pansynandwani1992@gmail.com .

Rupali Verma, Email: rupali@pec.edu.in .

  • Abdaoui A, Azé J, Bringay S, Poncelet P. Feel: a French expanded emotion lexicon. Lang Resour Eval. 2017;51(3):833–855. doi: 10.1007/s10579-016-9364-5.
  • Abdi A, Shamsuddin SM, Hasan S, Piran J. Deep learning-based sentiment classification of evaluative text based on multi-feature fusion. Inf Process Manag. 2019;56(4):1245–1259. doi: 10.1016/j.ipm.2019.02.018.
  • Adoma AF, Henry N-M, Chen W (2020) Comparative analyses of BERT, RoBERTa, DistilBERT, and XLNet for text-based emotion recognition. In: 2020 17th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP), IEEE, pp 117–121. 10.1109/ICCWAMTIP51612.2020.9317379
  • Agbehadji IE, Ijabadeniyi A (2021) Approach to sentiment analysis and business communication on social media. In: Fong S, Millham R (eds) Bio-inspired algorithms for data streaming and visualization, big data management, and fog computing, Springer Tracts in Nature-Inspired Computing. Springer, Singapore. 10.1007/978-981-15-6695-0_9
  • Agrawal A, An A (2012) Unsupervised emotion detection from text using semantic and syntactic relations. In: 2012 IEEE/WIC/ACM international conferences on web intelligence and intelligent agent technology, pp 346–353. 10.1109/WI-IAT.2012.170
  • Ahmad Z, Jindal R, Ekbal A, Bhattachharyya P. Borrow from rich cousin: transfer learning for emotion detection using cross lingual embedding. Expert Syst Appl. 2020;139:112851. doi: 10.1016/j.eswa.2019.112851.
  • Ahmed WM. Stock market reactions to domestic sentiment: panel CS-ARDL evidence. Res Int Bus Finance. 2020;54:101240. doi: 10.1016/j.ribaf.2020.101240.
  • Ahuja R, Chug A, Kohli S, Gupta S, Ahuja P. The impact of features extraction on the sentiment analysis. Procedia Comput Sci. 2019;152:341–348. doi: 10.1016/j.procs.2019.05.008.
  • Akilandeswari J, Jothi G. Sentiment classification of tweets with non-language features. Procedia Comput Sci. 2018;143:426–433. doi: 10.1016/j.procs.2018.10.414.
  • Al Ajrawi S, Agrawal A, Mangal H, Putluri K, Reid B, Hanna G, Sarkar M (2021) Evaluating business Yelp's star ratings using sentiment analysis. Materials Today: Proceedings. 10.1016/j.matpr.2020.12.137
  • Al Amrani Y, Lazaar M, El Kadiri KE. Random forest and support vector machine based hybrid approach to sentiment analysis. Procedia Comput Sci. 2018;127:511–520. doi: 10.1016/j.procs.2018.01.150.
  • Alqaryouti O, Siyam N, Monem AA, Shaalan K (2020) Aspect-based sentiment analysis using smart government review data. Appl Comput Inf. 10.1016/j.aci.2019.11.003
  • Alswaidan N, Menai MEB. A survey of state-of-the-art approaches for emotion recognition in text. Knowl Inf Syst. 2020;62(8):1–51. doi: 10.1007/s10115-020-01449-0.
  • Archana Rao PN, Baglodi K (2017) Role of sentiment analysis in education sector in the era of big data: a survey. Int J Latest Trends Eng Technol 22–24
  • Arora M, Kansal V. Character level embedding with deep convolutional neural network for text normalization of unstructured data for Twitter sentiment analysis. Soc Netw Anal Min. 2019;9(1):1–14. doi: 10.1007/s13278-019-0557-y.
  • Arulmurugan R, Sabarmathi K, Anandakumar H. Classification of sentence level sentiment analysis using cloud machine learning techniques. Cluster Comput. 2019;22(1):1199–1209. doi: 10.1007/s10586-017-1200-1.
  • Asghar MZ, Subhan F, Imran M, Kundi FM, Shamshirband S, Mosavi A, Csiba P, Várkonyi-Kóczy AR (2019) Performance evaluation of supervised machine learning techniques for efficient detection of emotions from online content. arXiv preprint arXiv:1908.01587
  • Bakker I, Van Der Voordt T, Vink P, De Boon J. Pleasure, arousal, dominance: Mehrabian and Russell revisited. Curr Psychol. 2014;33(3):405–421. doi: 10.1007/s12144-014-9219-4.
  • Balahur A, Turchi M. Comparative experiments using supervised learning and machine translation for multilingual sentiment analysis. Comput Speech Lang. 2014;28(1):56–75. doi: 10.1016/j.csl.2013.03.004.
  • Bandhakavi A, Wiratunga N, Padmanabhan D, Massie S. Lexicon based feature extraction for emotion text classification. Pattern Recogn Lett. 2017;93:133–142. doi: 10.1016/j.patrec.2016.12.009.
  • Basiri ME, Abdar M, Cifci MA, Nemati S, Acharya UR. A novel method for sentiment classification of drug reviews using fusion of deep and machine learning techniques. Knowl-Based Syst. 2020;198:105949. doi: 10.1016/j.knosys.2020.105949.
  • Batbaatar E, Li M, Ryu KH. Semantic-emotion neural network for emotion recognition from text. IEEE Access. 2019;7:111866–111878. doi: 10.1109/ACCESS.2019.2934529.
  • Becker K, Moreira VP, dos Santos AG. Multilingual emotion classification using supervised learning: comparative experiments. Inf Process Manag. 2017;53(3):684–704. doi: 10.1016/j.ipm.2016.12.008.
  • Bernabé-Moreno J, Tejeda-Lorente A, Herce-Zelaya J, Porcel C, Herrera-Viedma E. A context-aware embeddings supported method to extract a fuzzy sentiment polarity dictionary. Knowl-Based Syst. 2020;190:105236. doi: 10.1016/j.knosys.2019.105236.
  • Bhardwaj A, Narayan Y, Dutta M, et al. Sentiment analysis for Indian stock market prediction using Sensex and Nifty. Procedia Comput Sci. 2015;70:85–91. doi: 10.1016/j.procs.2015.10.043.
  • Bhaskar J, Sruthi K, Nedungadi P. Hybrid approach for emotion classification of audio conversation based on text and speech mining. Procedia Comput Sci. 2015;46:635–643. doi: 10.1016/j.procs.2015.02.112.
  • Braun N, van der Lee C, Gatti L, Goudbeek M, Krahmer E. MEmoFC: introducing the multilingual emotional football corpus. Lang Resour Eval. 2021;55(2):389–430. doi: 10.1007/s10579-020-09508-2.
  • Bučar J, Žnidaršič M, Povh J. Annotated news corpora and a lexicon for sentiment analysis in Slovene. Lang Resour Eval. 2018;52(3):895–919. doi: 10.1007/s10579-018-9413-3.
  • Buechel S, Hahn U (2017) EmoBank: studying the impact of annotation perspective and representation format on dimensional emotion analysis. In: Proceedings of the 15th conference of the European chapter of the Association for Computational Linguistics: volume 2, short papers, pp 578–585
  • Chaffar S, Inkpen D (2011) Using a heterogeneous dataset for emotion analysis in text. In: Butz C, Lingras P (eds) Advances in artificial intelligence. Canadian AI 2011. Lecture notes in computer science, vol 6657. Springer, Berlin, Heidelberg. 10.1007/978-3-642-21043-3_8
  • Chatterjee A, Gupta U, Chinnakotla MK, Srikanth R, Galley M, Agrawal P. Understanding emotions in text using deep learning and big data. Comput Hum Behav. 2019;93:309–317. doi: 10.1016/j.chb.2018.12.029.
  • Chen T, Xu R, He Y, Wang X. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Syst Appl. 2017;72:221–230. doi: 10.1016/j.eswa.2016.10.065.
  • Cho H, Kim S, Lee J, Lee JS. Data-driven integration of multiple sentiment dictionaries for lexicon-based sentiment classification of product reviews. Knowl-Based Syst. 2014;71:61–71. doi: 10.1016/j.knosys.2014.06.001.
  • Chowanda A, Sutoyo R, Tanachutiwat S, et al. Exploring text-based emotions recognition machine learning techniques on social media conversation. Procedia Comput Sci. 2021;179:821–828. doi: 10.1016/j.procs.2021.01.099.
  • Dahou A, Xiong S, Zhou J, Haddoud MH, Duan P (2016) Word embeddings and convolutional neural network for Arabic sentiment classification. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, pp 2418–2427
  • Dashtipour K, Gogate M, Li J, Jiang F, Kong B, Hussain A. A hybrid Persian sentiment analysis framework: integrating dependency grammar based rules and deep neural networks. Neurocomputing. 2020;380:1–10. doi: 10.1016/j.neucom.2019.10.009.
  • Devi Sri Nandhini M, Pradeep G. A hybrid co-occurrence and ranking-based approach for detection of implicit aspects in aspect-based sentiment analysis. SN Comput Sci. 2020;1:1–9. doi: 10.1007/s42979-020-00138-7.
  • Dheeraj K, Ramakrishnudu T (2021) Negative emotions detection on online mental-health related patients texts using the deep learning with MHA-BCNN model. Expert Syst Appl 182:115265
  • Dixon T. “Emotion”: the history of a keyword in crisis. Emot Rev. 2012;4(4):338–344. doi: 10.1177/1754073912445814.
  • Ekman P. An argument for basic emotions. Cognit Emot. 1992;6(3–4):169–200. doi: 10.1080/02699939208411068.
  • Esuli A, Sebastiani F. SentiWordNet: a publicly available lexical resource for opinion mining. LREC, Citeseer. 2006;6:417–422.
  • Gamon M (2004) Sentiment classification on customer feedback data: noisy data, large feature vectors, and the role of linguistic analysis. In: COLING 2004: Proceedings of the 20th international conference on computational linguistics, pp 841–847
  • Garcia K, Berton L. Topic detection and sentiment analysis in Twitter content related to COVID-19 from Brazil and the USA. Appl Soft Comput. 2021;101:107057. doi: 10.1016/j.asoc.2020.107057.
  • Ghanbari-Adivi F, Mosleh M. Text emotion detection in social networks using a novel ensemble classifier based on Parzen tree estimator (TPE). Neural Comput Appl. 2019;31(12):8971–8983. doi: 10.1007/s00521-019-04230-9.
  • Goularas D, Kamis S (2019) Evaluation of deep learning techniques in sentiment analysis from Twitter data. In: 2019 International conference on deep learning and machine learning in emerging applications (Deep-ML), IEEE, pp 12–17
  • Gräbner D, Zanker M, Fliedl G, Fuchs M, et al. (2012) Classification of customer reviews based on sentiment analysis. In: ENTER, Citeseer, pp 460–470
  • Hasan M, Rundensteiner E, Agu E (2014) Emotex: detecting emotions in twitter messages. In: 2014 ASE BIGDATA/SOCIALCOM/CYBERSECURITY Conference. Stanford University, Academy of Science and Engineering (ASE), USA, ASE, pp 1–10
  • Hasan M, Rundensteiner E, Agu E. Automatic emotion detection in text streams by analyzing Twitter data. Int J Data Sci Anal. 2019;7(1):35–51. doi: 10.1007/s41060-018-0096-z.
  • Hazarika D, Poria S, Zimmermann R, Mihalcea R. Conversational transfer learning for emotion recognition. Inf Fusion. 2020;65:1–12. doi: 10.1016/j.inffus.2020.06.005.
  • Hosseini AS. Sentence-level emotion mining based on combination of adaptive meta-level features and sentence syntactic features. Eng Appl Artif Intell. 2017;65:361–374. doi: 10.1016/j.engappai.2017.08.006.
  • Hutto C, Gilbert E (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the international AAAI conference on web and social media, vol 8
  • Itani M, Roast C, Al-Khayatt S. Developing resources for sentiment analysis of informal Arabic text in social media. Procedia Comput Sci. 2017;117:129–136. doi: 10.1016/j.procs.2017.10.101.
  • Izard CE (1992) Basic emotions, relations among emotions, and emotion-cognition relations. Psychol Rev 99(3):561–565
  • Jain VK, Kumar S, Fernandes SL. Extraction of emotions from multilingual text using intelligent text processing and computational linguistics. J Comput Sci. 2017;21:316–326. doi: 10.1016/j.jocs.2017.01.010.
  • Jang HJ, Sim J, Lee Y, Kwon O. Deep sentiment analysis: mining the causality between personality-value-attitude for analyzing business ads in social media. Expert Syst Appl. 2013;40(18):7492–7503. doi: 10.1016/j.eswa.2013.06.069.
  • Jha V, Savitha R, Shenoy PD, Venugopal K, Sangaiah AK. A novel sentiment aware dictionary for multi-domain sentiment classification. Comput Electr Eng. 2018;69:585–597. doi: 10.1016/j.compeleceng.2017.10.015.
  • Jian Z, Chen X, Wang HS. Sentiment classification using the theory of ANNs. J China Univ Posts Telecommun. 2010;17:58–62. doi: 10.1016/S1005-8885(09)60606-3.
  • Jurek A, Mulvenna MD, Bi Y. Improved lexicon-based sentiment analysis for social media analytics. Secur Inform. 2015;4(1):1–13. doi: 10.1186/s13388-015-0024-x.
  • Kratzwald B, Ilić S, Kraus M, Feuerriegel S, Prendinger H. Deep learning for affective computing: text-based emotion recognition in decision support. Decis Support Syst. 2018;115:24–35. doi: 10.1016/j.dss.2018.09.002.
  • Laubert C, Parlamis J. Are you angry (happy, sad) or aren't you? Emotion detection difficulty in email negotiation. Group Decis Negot. 2019;28(2):377–413. doi: 10.1007/s10726-018-09611-4.
  • Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp 1188–1196
  • Li Z, Xie H, Cheng G, Li Q (2021) Word-level emotion distribution with two schemas for short text emotion classification. Knowledge-Based Syst 227:107163
  • Liu Y, Wan Y, Su X. Identifying individual expectations in service recovery through natural language processing and machine learning. Expert Syst Appl. 2019;131:288–298. doi: 10.1016/j.eswa.2019.04.063.
  • Liu F, Zheng J, Zheng L, Chen C. Combining attention-based bidirectional gated recurrent neural network and two-dimensional convolutional neural network for document-level sentiment classification. Neurocomputing. 2020;371:39–50. doi: 10.1016/j.neucom.2019.09.012.
  • Liu S, Lee K, Lee I. Document-level multi-topic sentiment classification of email data with BiLSTM and data augmentation. Knowl-Based Syst. 2020;197:105918. doi: 10.1016/j.knosys.2020.105918.
  • Lövheim H. A new three-dimensional model for emotions and monoamine neurotransmitters. Med Hypoth. 2012;78(2):341–348. doi: 10.1016/j.mehy.2011.11.016.
  • Ma X, Zeng J, Peng L, Fortino G, Zhang Y. Modeling multi-aspects within one opinionated sentence simultaneously for aspect-level sentiment analysis. Future Gener Comput Syst. 2019;93:304–311. doi: 10.1016/j.future.2018.10.041.
  • Meena A, Prabhakar TV (2007) Sentence level sentiment analysis in the presence of conjuncts using linguistic analysis. In: Amati G, Carpineto C, Romano G (eds) Advances in information retrieval. ECIR 2007. Lecture notes in computer science, vol 4425. Springer, Berlin, Heidelberg. 10.1007/978-3-540-71496-5_53
  • Mladenović M, Mitrović J, Krstev C, Vitas D. Hybrid sentiment analysis framework for a morphologically rich language. J Intell Inf Syst. 2016;46(3):599–620. doi: 10.1007/s10844-015-0372-5.
  • Mohammad SM, Turney PD. Crowdsourcing a word-emotion association lexicon. Comput Intell. 2013;29(3):436–465. doi: 10.1111/j.1467-8640.2012.00460.x.
  • Moraes R, Valiati JF, Gavião Neto WP (2013) Document-level sentiment classification: an empirical comparison between SVM and ANN. Expert Syst Appl 40(2):621–633
  • Mukherjee P, Badr Y, Doppalapudi S, Srinivasan SM, Sangwan RS, Sharma R. Effect of negation in sentences on sentiment analysis and polarity detection. Procedia Comput Sci. 2021;185:370–379. doi: 10.1016/j.procs.2021.05.038.
  • Munezero M, Montero CS, Sutinen E, Pajunen J. Are they different? Affect, feeling, emotion, sentiment, and opinion detection in text. IEEE Trans Affect Comput. 2014;5(2):101–111. doi: 10.1109/TAFFC.2014.2317187.
  • Nagamanjula R, Pethalakshmi A. A novel framework based on bi-objective optimization and LAN2FIS for Twitter sentiment analysis. Soc Netw Anal Min. 2020;10:1–16. doi: 10.1007/s13278-020-00648-5.
  • Nagarajan SM, Gandhi UD. Classifying streaming of Twitter data based on sentiment analysis using hybridization. Neural Comput Appl. 2019;31(5):1425–1433. doi: 10.1007/s00521-018-3476-3.
  • Nandal N, Tanwar R, Pruthi J. Machine learning based aspect level sentiment analysis for Amazon products. Spat Inf Res. 2020;28(5):601–607. doi: 10.1007/s41324-020-00320-2.
  • Onyenwe I, Nwagbo S, Mbeledogu N, Onyedinma E. The impact of political party/candidate on the election results from a sentiment analysis perspective using #anambradecides2017 tweets. Soc Netw Anal Min. 2020;10(1):1–17. doi: 10.1007/s13278-020-00667-2.
  • Pasupa K, Ayutthaya TSN. Thai sentiment analysis with deep learning techniques: a comparative study based on word embedding, POS-tag, and sentic features. Sustain Cities Soc. 2019;50:101615. doi: 10.1016/j.scs.2019.101615.
  • Plutchik R (1982) A psychoevolutionary theory of emotions. 10.1177/053901882021004003
  • Poria S, Gelbukh A, Cambria E, Hussain A, Huang GB. EmoSenticSpace: a novel framework for affective common-sense reasoning. Knowl-Based Syst. 2014;69:108–123. doi: 10.1016/j.knosys.2014.06.011.
  • Prabowo R, Thelwall M. Sentiment analysis: a combined approach. J Inform. 2009;3(2):143–157. doi: 10.1016/j.joi.2009.01.003.
  • Pu X, Wu G, Yuan C. Exploring overall opinions for document level sentiment classification with structural SVM. Multim Syst. 2019;25(1):21–33. doi: 10.1007/s00530-017-0550-0.
  • Rabeya T, Ferdous S, Ali HS, Chakraborty NR (2017) A survey on emotion detection: a lexicon based backtracking approach for detecting emotion from Bengali text. In: 2017 20th international conference of computer and information technology (ICCIT), IEEE, pp 1–7
  • Rao G, Huang W, Feng Z, Cong Q. LSTM with sentence representations for document-level sentiment classification. Neurocomputing. 2018;308:49–57. doi: 10.1016/j.neucom.2018.04.045.
  • Ray P, Chakrabarti A (2020) A mixed approach of deep learning method and rule-based method to improve aspect level sentiment analysis. Appl Comput Inform
  • Roberts K, Roach MA, Johnson J, Guthrie J, Harabagiu SM. EmpaTweet: annotating and detecting emotions on Twitter. LREC, Citeseer. 2012;12:3806–3813.
  • Russell JA. A circumplex model of affect. J Pers Soc Psychol. 1980;39(6):1161. doi: 10.1037/h0077714.
  • Sailunaz K, Alhajj R. Emotion and sentiment analysis from Twitter text. J Comput Sci. 2019;36:101003. doi: 10.1016/j.jocs.2019.05.009.
  • Salinca A (2015) Business reviews classification using sentiment analysis. In: Proceedings of the 2015 17th international symposium on symbolic and numeric algorithms for scientific computing (SYNASC), IEEE, pp 247–250
  • Sangeetha K, Prabha D (2020) Sentiment analysis of student feedback using multi-head attention fusion model of word and context embedding for LSTM. J Ambient Intell Hum Comput 12:4117–4126
  • Sasidhar TT, Premjith B, Soman K. Emotion detection in Hinglish (Hindi + English) code-mixed social media text. Procedia Comput Sci. 2020;171:1346–1352. doi: 10.1016/j.procs.2020.04.144.
  • Schouten K, Frasincar F. Survey on aspect-level sentiment analysis. IEEE Trans Knowl Data Eng. 2015;28(3):813–830. doi: 10.1109/TKDE.2015.2485209.
  • Seal D, Roy UK, Basak R (2020) Sentence-level emotion detection from text based on semantic rules. In: Information and communication technology for sustainable development. Springer, pp 423–430
  • Shamantha RB, Shetty SM, Rai P (2019) Sentiment analysis using machine learning classifiers: evaluation of performance. In: Proceedings of the 2019 IEEE 4th international conference on computer and communication systems (ICCCS), IEEE, pp 21–25
  • Sharma P, Sharma A (2020) Experimental investigation of automated system for Twitter sentiment analysis to predict the public emotions using machine learning algorithms. Mater Today Proc
  • Shaver P, Schwartz J, Kirson D, O'Connor C (1987) Emotion knowledge: further exploration of a prototype approach. J Pers Soc Psychol 52(6):1061
  • Shelke NM. Approaches of emotion detection from text. Int J Comput Sci Inf Technol. 2014;2(2):123–128.
  • Shirsat VS, Jagdale RS, Deshmukh SN (2019) Sentence level sentiment identification and calculation from news articles using machine learning techniques. In: Computing, communication and signal processing. Springer, pp 371–376
  • Shrivastava K, Kumar S, Jain DK. An effective approach for emotion detection in multimedia text data using sequence based convolutional neural network. Multim Tools Appl. 2019;78(20):29607–29639. doi: 10.1007/s11042-019-07813-9.
  • Singh M, Jakhar AK, Pandey S. Sentiment analysis on the impact of coronavirus in social life using the BERT model. Soc Netw Anal Min. 2021;11(1):1–11. doi: 10.1007/s13278-021-00737-z.
  • Songbo T, Jin Z. An empirical study of sentiment analysis for Chinese documents. Expert Syst Appl. 2008;34(4):2622–2629. doi: 10.1016/j.eswa.2007.05.028.
  • Souma W, Vodenska I, Aoyama H. Enhanced news sentiment analysis using deep learning methods. J Comput Soc Sci. 2019;2(1):33–46. doi: 10.1007/s42001-019-00035-x.
  • Soumya S, Pramod KV. Sentiment analysis of Malayalam tweets using machine learning techniques. ICT Express. 2020;6(4):300–305. doi: 10.1016/j.icte.2020.04.003.
  • Strapparava C, Valitutti A, et al. (2004) WordNet-Affect: an affective extension of WordNet. In: LREC, Citeseer, vol 4, pp 1083–1086
  • Sun S, Luo C, Chen J. A review of natural language processing techniques for opinion mining systems. Inf Fusion. 2017;36:10–25. doi: 10.1016/j.inffus.2016.10.004.
  • Symeonidis S, Effrosynidis D, Arampatzis A. A comparative evaluation of pre-processing techniques and their interactions for Twitter sentiment analysis. Expert Syst Appl. 2018;110:298–310. doi: 10.1016/j.eswa.2018.06.022.
  • Tang D, Qin B, Liu T (2015) Learning semantic representations of users and products for document level sentiment classification. In: Proceedings of the 53rd annual meeting of the Association for Computational Linguistics and the 7th international joint conference on natural language processing (volume 1: long papers), pp 1014–1023
  • Tao J, Fang X. Toward multi-label sentiment analysis: a transfer learning based approach. J Big Data. 2020;7(1):1–26. doi: 10.1186/s40537-019-0278-0.
  • Tiwari P, Mishra BK, Kumar S, Kumar V (2020) Implementation of n-gram methodology for Rotten Tomatoes review dataset sentiment analysis. In: Cognitive analytics: concepts, methodologies, tools, and applications, IGI Global, pp 689–701
  • Tomkins SS, McCarter R. What and where are the primary affects? Some evidence for a theory. Percept Mot Skills. 1964;18(1):119–158. doi: 10.2466/pms.1964.18.1.119.
  • Untawale TM, Choudhari G (2019) Implementation of sentiment classification of movie reviews by supervised machine learning approaches. In: Proceedings of the 2019 3rd international conference on computing methodologies and communication (ICCMC), IEEE, pp 1197–1200
  • Viegas F, Alvim MS, Canuto S, Rosa T, Gonçalves MA, Rocha L. Exploiting semantic relationships for unsupervised expansion of sentiment lexicons. Inf Syst. 2020;94:101606. doi: 10.1016/j.is.2020.101606.
  • Xu G, Li W, Liu J. A social emotion classification approach using multi-model fusion. Future Gen Comput Syst. 2020;102:347–356. doi: 10.1016/j.future.2019.07.007.
  • Yang X, Macdonald C, Ounis I. Using word embeddings in Twitter election classification. Inf Retriev J. 2018;21(2–3):183–207. doi: 10.1007/s10791-017-9319-5.
  • Ye Q, Zhang Z, Law R. Sentiment classification of online reviews to travel destinations by supervised machine learning approaches. Expert Syst Appl. 2009; 36 (3):6527–6535. doi: 10.1016/j.eswa.2008.07.035. [ CrossRef ] [ Google Scholar ]
  • Zhang D, Si L, Rego VJ. Sentiment detection with auxiliary data. Inf Retriev. 2012; 15 (3–4):373–390. doi: 10.1007/s10791-012-9196-x. [ CrossRef ] [ Google Scholar ]


A Hybrid Approach to Dimensional Aspect-Based Sentiment Analysis Using BERT and Large Language Models


1. Introduction

  • We introduce solutions based on BERT and LLMs for the dimABSA tasks, along with a variety of strategies to optimize their effectiveness.
  • We compare the strengths of the BERT-based and LLM-based methods across different tasks and devise a hybrid approach that combines the strengths of both.
  • We conduct comprehensive experiments on the dimABSA benchmark. Our results demonstrate that our hybrid approach achieves state-of-the-art performance. Furthermore, ablation studies confirm the effectiveness of each component of our approach. We also provide detailed discussions to offer deeper insights into our findings.

2. Background

2.1. Aspect-Based Sentiment Analysis

2.2. Dimensional Sentiment Analysis

3. Task Definition

  • Subtask 1: Intensity Prediction. This task aims to predict the sentiment intensities of given aspect terms in the valence–arousal dimensions. The input includes a sentence S = [w1, w2, …, wT] consisting of T words, along with a predefined aspect term a, which is a substring of the sentence. The output is the sentiment intensity, denoted as val-aro. As illustrated in Figure 1, given the sentence “吐柴主艺文总店除了餐点好吃之外, 这里的用餐环境也很特别” (in English: “Besides the tasty meals at the main art store of Tuchai, the dining environment here is also quite special”) and two aspect terms “餐点” (meals) and “用餐环境” (dining environment), this subtask requires systems to predict valence–arousal scores of 6.5#5.75 and 6.5#6.0, respectively.
  • Subtask 2: Triplet Extraction. This task focuses on identifying aspect-level sentiments and opinions from given review sentences and outputting them as sets of triplets. The input is a sentence, and the corresponding output is the set of all identified triplets. Each triplet consists of an aspect term a, an opinion term o, and a sentiment intensity val-aro. For example, given the sentence “吐柴主艺文总店除了餐点好吃之外, 这里的用餐环境也很特别” in Figure 1 (in English: “Besides the tasty meals at the main art store of Tuchai, the dining environment here is also quite special”), this subtask requires systems to produce the triplets {(餐点, 好吃, 6.5#5.75), (用餐环境, 很特别, 6.5#6.0)} (in English: {(meals, tasty, 6.5#5.75), (dining environment, quite special, 6.5#6.0)}).
  • Subtask 3: Quadruple Extraction. This task builds on Subtask 2 by additionally requiring the identification of the aspect category, thus forming a quadruple. The aspect category falls within a predefined classification space, including 餐厅#概括 (restaurant#general), 餐厅#价格 (restaurant#prices), 餐厅#杂项 (restaurant#miscellaneous), 食物#价格 (food#prices), 食物#品质 (food#quality), 食物#份量与款式 (food#style&options), 饮料#价格 (drinks#prices), 饮料#品质 (drinks#quality), 饮料#份量与款式 (drinks#style&options), 氛围#概括 (ambience#general), 服务#概括 (services#general), and 地点#概括 (location#general). The specific meaning of each category can be found in the guideline [87]. For example, given the sentence in Figure 1, this subtask requires systems to produce the quadruples {(餐点, 食物#品质, 好吃, 6.5#5.75), (用餐环境, 氛围#概括, 很特别, 6.5#6.0)} (in English: {(meals, food#quality, tasty, 6.5#5.75), (dining environment, ambience#general, quite special, 6.5#6.0)}).
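To make the three output formats concrete, a Subtask 3 quadruple (which subsumes the Subtask 2 triplet and the Subtask 1 intensity) can be sketched as a small data structure. This is an illustrative sketch only: the `parse_intensity` helper and the `Quadruple` class are hypothetical names, not from the paper; the `valence#arousal` string format follows the examples above.

```python
from dataclasses import dataclass


def parse_intensity(s: str) -> tuple[float, float]:
    """Parse a 'valence#arousal' intensity string, e.g. '6.5#5.75'."""
    val, aro = s.split("#")
    return float(val), float(aro)


@dataclass
class Quadruple:
    aspect: str     # aspect term, a substring of the review sentence
    category: str   # predefined category, e.g. "food#quality" (Subtask 3 only)
    opinion: str    # opinion term, a substring of the review sentence
    valence: float  # positive-negative dimension of the sentiment
    arousal: float  # calm-excited dimension of the sentiment


# One gold quadruple from the example sentence (English gloss of the terms)
val, aro = parse_intensity("6.5#5.75")
quad = Quadruple("meals", "food#quality", "tasty", val, aro)
assert (quad.valence, quad.arousal) == (6.5, 5.75)
```

Dropping the `category` field recovers a Subtask 2 triplet, and keeping only the intensity pair recovers the Subtask 1 output.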

4.1. BERT-Based Method

4.1.1. Domain-Adaptive Pre-Training

4.1.2. Aspect–Opinion Extraction

4.1.3. Aspect–Opinion Pairing and Category Classification

4.1.4. Intensity Prediction

4.2. LLM-Based Method

4.3. Ensemble Strategy

5. Experiments

5.1. Experimental Setup

5.2. Experimental Results

  • Firstly, the hybrid approach outperforms the individual approaches on the majority of metrics, indicating that it effectively leverages the strengths of both the BERT-based and LLM-based methods. Note that the A-Q-F1 metrics for the hybrid approach are slightly lower than those for BERT-CLS, indicating that the advantage of the LLM-based methods on arousal scores is relatively weak, as also reflected in the A-T-F1.
  • Secondly, despite having significantly fewer parameters (296M) than the LLM-based method (7B), the BERT-based method exhibits superior performance across all metrics. We attribute this advantage to two main limitations of LLMs: (1) LLMs lack specific structures or designs to model the interactions among sentiment elements, or between sentiment elements and context, which hinders their ability to learn task-specific representations. (2) The mapping from representations to dimABSA labels in LLMs is unnatural: representing continuous valence–arousal scores as text discards the semantic information inherent in the numerical values.
  • Thirdly, within the BERT-based approaches, the regression model performs better on Subtask 1, while the classification model excels on Subtasks 2 and 3. This suggests that the regression model is more advantageous for fine-grained intensity assessment, whereas the classification model is more effective for coarse-grained intensity assessment.
  • Finally, among the LLM-based methods, representing scores as decimals (LLM-DEC) yields better results on Subtask 1, while integer representations (LLM-INT) are more effective on Subtasks 2 and 3. This mirrors the conclusions drawn from the BERT-based methods.
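The contrast between decimal and integer score representations can be illustrated with a toy mapping from a continuous valence–arousal score to the text label an LLM would be asked to generate. The 0.25-point grid below is an assumption for illustration only; the paper's actual label granularity is determined by the dimABSA annotation scheme.

```python
def to_decimal_label(score: float, step: float = 0.25) -> str:
    """Fine-grained text label: snap the score to a decimal grid.

    The 0.25 step is a hypothetical choice for this sketch.
    """
    return f"{round(score / step) * step:.2f}"


def to_integer_label(score: float) -> str:
    """Coarse-grained text label: round the score to the nearest integer."""
    return str(round(score))


# A valence of 6.4 under each representation scheme
assert to_decimal_label(6.4) == "6.50"
assert to_integer_label(6.4) == "6"
```

The decimal labels preserve more of the numeric signal (useful for the fine-grained Subtask 1), while the integer labels give the model a smaller, easier-to-generate vocabulary (which lines up with their stronger results on Subtasks 2 and 3).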

6. Discussion

6.1. Analysis of Ensemble Strategy

6.2. Ablation Study

6.3. Effect of Pre-Trained Language Models

6.4. Error Analysis

7. Conclusions and Future Works

Author Contributions

Data Availability Statement

Conflicts of Interest

  • Medhat, W.; Hassan, A.; Korashy, H. Sentiment analysis algorithms and applications: A survey. Ain Shams Eng. J. 2014 , 5 , 1093–1113. [ Google Scholar ] [ CrossRef ]
  • Liu, B. Sentiment analysis and subjectivity. In Handbook of Natural Language Processing ; Routledge: Abingdon-on-Thames, UK, 2010; Volume 2, pp. 627–666. [ Google Scholar ]
  • Pontiki, M.; Galanis, D.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S.; AL-Smadi, M.; Al-Ayyoub, M.; Zhao, Y.; Qin, B.; De Clercq, O.; et al. SemEval-2016 Task 5: Aspect Based Sentiment Analysis. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), San Diego, CA, USA, 16–17 June 2016; pp. 19–30. [ Google Scholar ] [ CrossRef ]
  • Cai, H.; Xia, R.; Yu, J. Aspect-Category-Opinion-Sentiment Quadruple Extraction with Implicit Aspects and Opinions. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 340–350. [ Google Scholar ] [ CrossRef ]
  • Zhang, W.; Deng, Y.; Li, X.; Yuan, Y.; Bing, L.; Lam, W. Aspect Sentiment Quad Prediction as Paraphrase Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online, 7–11 November 2021; pp. 9209–9219. [ Google Scholar ] [ CrossRef ]
  • Lee, L.H.; Yu, L.C.; Wang, S.; Liao, J. Overview of the SIGHAN 2024 shared task for Chinese dimensional aspect-based sentiment analysis. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 165–174. [ Google Scholar ]
  • Russell, J.A. A circumplex model of affect. J. Personal. Soc. Psychol. 1980 , 39 , 1161. [ Google Scholar ] [ CrossRef ]
  • Xu, H.; Zhang, D.; Zhang, Y.; Xu, R. HITSZ-HLT at SIGHAN-2024 dimABSA Task: Integrating BERT and LLM for Chinese Dimensional Aspect-Based Sentiment Analysis. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 175–185. [ Google Scholar ]
  • Li, P.; Sun, T.; Tang, Q.; Yan, H.; Wu, Y.; Huang, X.; Qiu, X. CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 15339–15353. [ Google Scholar ] [ CrossRef ]
  • Li, Z.; Zeng, Y.; Zuo, Y.; Ren, W.; Liu, W.; Su, M.; Guo, Y.; Liu, Y.; Li, X.; Hu, Z.; et al. KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 11–16 August 2024; pp. 8758–8779. [ Google Scholar ]
  • Dettmers, T.; Pagnoni, A.; Holtzman, A.; Zettlemoyer, L. QLoRA: Efficient Finetuning of Quantized LLMs. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2023; Volume 36, pp. 10088–10115. [ Google Scholar ]
  • Wang, Y.; Huang, M.; Zhu, X.; Zhao, L. Attention-based LSTM for Aspect-level Sentiment Classification. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 606–615. [ Google Scholar ] [ CrossRef ]
  • Ma, D.; Li, S.; Zhang, X.; Wang, H. Interactive attention networks for aspect-level sentiment classification. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19–25 August 2017; pp. 4068–4074. [ Google Scholar ]
  • Liu, J.; Zhang, Y. Attention Modeling for Targeted Sentiment. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, 3–7 April 2017; pp. 572–577. [ Google Scholar ]
  • Ma, Y.; Peng, H.; Cambria, E. Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM. Proc. AAAI Conf. Artif. Intell. 2018 , 32 , 5876–5883. [ Google Scholar ] [ CrossRef ]
  • Tang, D.; Qin, B.; Feng, X.; Liu, T. Effective LSTMs for Target-Dependent Sentiment Classification. In Proceedings of the COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 3298–3307. [ Google Scholar ]
  • Vo, D.T.; Zhang, Y. Target-dependent twitter sentiment classification with rich automatic features. In Proceedings of the 24th International Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015; pp. 1347–1353. [ Google Scholar ]
  • Zhang, M.; Zhang, Y.; Vo, D.T. Gated Neural Networks for Targeted Sentiment Analysis. Proc. AAAI Conf. Artif. Intell. 2016 , 30 , 3087–3093. [ Google Scholar ] [ CrossRef ]
  • Tang, D.; Qin, B.; Liu, T. Aspect Level Sentiment Classification with Deep Memory Network. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 214–224. [ Google Scholar ] [ CrossRef ]
  • Fan, C.; Gao, Q.; Du, J.; Gui, L.; Xu, R.; Wong, K.F. Convolution-based Memory Network for Aspect-based Sentiment Analysis. In Proceedings of the SIGIR’18: 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, New York, NY, USA, 8–12 July 2018; pp. 1161–1164. [ Google Scholar ] [ CrossRef ]
  • Xue, W.; Li, T. Aspect Based Sentiment Analysis with Gated Convolutional Networks. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 2514–2523. [ Google Scholar ] [ CrossRef ]
  • Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [ Google Scholar ] [ CrossRef ]
  • Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019 , arXiv:1907.11692. [ Google Scholar ]
  • Sun, C.; Huang, L.; Qiu, X. Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 380–385. [ Google Scholar ] [ CrossRef ]
  • Zhang, K.; Zhang, K.; Zhang, M.; Zhao, H.; Liu, Q.; Wu, W.; Chen, E. Incorporating Dynamic Semantics into Pre-Trained Language Model for Aspect-based Sentiment Analysis. In Proceedings of the Findings of Annual Meeting of the Association for Computational Linguistics—ACL, Dublin, Ireland, 22–27 May 2022; pp. 3599–3610. [ Google Scholar ]
  • Xu, H.; Liu, B.; Shu, L.; Philip, S.Y. BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 2324–2335. [ Google Scholar ]
  • Li, Z.; Zou, Y.; Zhang, C.; Zhang, Q.; Wei, Z. Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online, 7–11 November 2021; pp. 246–256. [ Google Scholar ]
  • Zhang, Y.; Yang, Y.; Liang, B.; Chen, S.; Qin, B.; Xu, R. An Empirical Study of Sentiment-Enhanced Pre-Training for Aspect-Based Sentiment Analysis. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 9633–9651. [ Google Scholar ] [ CrossRef ]
  • Liang, B.; Luo, W.; Li, X.; Gui, L.; Yang, M.; Yu, X.; Xu, R. Enhancing aspect-based sentiment analysis with supervised contrastive learning. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, New York, NY, USA, 1–5 November 2021; pp. 3242–3247. [ Google Scholar ]
  • Cao, J.; Liu, R.; Peng, H.; Jiang, L.; Bai, X. Aspect is not you need: No-aspect differential sentiment framework for aspect-based sentiment analysis. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, WA, USA, 10–15 July 2022; pp. 1599–1609. [ Google Scholar ]
  • Wang, K.; Shen, W.; Yang, Y.; Quan, X.; Wang, R. Relational Graph Attention Network for Aspect-based Sentiment Analysis. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics—ACL, Online, 5–10 July 2020; pp. 3229–3238. [ Google Scholar ]
  • Chen, C.; Teng, Z.; Wang, Z.; Zhang, Y. Discrete Opinion Tree Induction for Aspect-based Sentiment Analysis. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 2051–2064. [ Google Scholar ] [ CrossRef ]
  • Wang, Z.; Xie, Q.; Feng, Y.; Ding, Z.; Yang, Z.; Xia, R. Is ChatGPT a good sentiment analyzer? A preliminary study. arXiv 2023 , arXiv:2304.04339. [ Google Scholar ]
  • Xu, H.; Wang, Q.; Zhang, Y.; Yang, M.; Zeng, X.; Qin, B.; Xu, R. Improving In-Context Learning with Prediction Feedback for Sentiment Analysis. arXiv 2024 , arXiv:2406.02911. [ Google Scholar ]
  • Fei, H.; Li, B.; Liu, Q.; Bing, L.; Li, F.; Chua, T.S. Reasoning Implicit Sentiment with Chain-of-Thought Prompting. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 1171–1182. [ Google Scholar ] [ CrossRef ]
  • Simmering, P.F.; Huoviala, P. Large language models for aspect-based sentiment analysis. arXiv 2023 , arXiv:2310.18025. [ Google Scholar ]
  • Šmíd, J.; Priban, P.; Kral, P. LLaMA-Based Models for Aspect-Based Sentiment Analysis. In Proceedings of the 14th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis, Bangkok, Thailand, 15 August 2024; pp. 63–70. [ Google Scholar ]
  • Wang, Q.; Ding, K.; Liang, B.; Yang, M.; Xu, R. Reducing Spurious Correlations in Aspect-based Sentiment Analysis with Explanation from Large Language Models. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; pp. 2930–2941. [ Google Scholar ] [ CrossRef ]
  • Yin, Y.; Wei, F.; Dong, L.; Xu, K.; Zhang, M.; Zhou, M. Unsupervised word and dependency path embeddings for aspect term extraction. In Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, New York, NY, USA, 9–15 July 2016; pp. 2979–2985. [ Google Scholar ]
  • Xu, H.; Liu, B.; Shu, L.; Yu, P.S. Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Melbourne, Australia, 15–20 July 2018; pp. 592–598. [ Google Scholar ] [ CrossRef ]
  • Hu, M.; Peng, Y.; Huang, Z.; Li, D.; Lv, Y. Open-Domain Targeted Sentiment Analysis via Span-Based Extraction and Classification. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 537–546. [ Google Scholar ] [ CrossRef ]
  • Wei, Z.; Hong, Y.; Zou, B.; Cheng, M.; Yao, J. Don’t Eclipse Your Arts Due to Small Discrepancies: Boundary Repositioning with a Pointer Network for Aspect Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3678–3684. [ Google Scholar ] [ CrossRef ]
  • Wang, Q.; Wen, Z.; Zhao, Q.; Yang, M.; Xu, R. Progressive Self-Training with Discriminator for Aspect Term Extraction. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online and Punta Cana, Dominican Republic, 7–11 November 2021; pp. 257–268. [ Google Scholar ] [ CrossRef ]
  • Wang, W.; Pan, S.J.; Dahlmeier, D.; Xiao, X. Recursive Neural Conditional Random Fields for Aspect-based Sentiment Analysis. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, TX, USA, 1–5 November 2016; pp. 616–626. [ Google Scholar ] [ CrossRef ]
  • Wang, W.; Pan, S.J.; Dahlmeier, D.; Xiao, X. Coupled Multi-Layer Attentions for Co-Extraction of Aspect and Opinion Terms. Proc. AAAI Conf. Artif. Intell. 2017 , 31 , 3316–3322. [ Google Scholar ] [ CrossRef ]
  • Li, X.; Lam, W. Deep Multi-Task Learning for Aspect Term Extraction with Memory Interaction. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, 9–11 September 2017; pp. 2886–2892. [ Google Scholar ] [ CrossRef ]
  • Li, X.; Bing, L.; Li, P.; Lam, W.; Yang, Z. Aspect term extraction with history attention and selective transformation. In Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018; pp. 4194–4200. [ Google Scholar ]
  • Fan, Z.; Wu, Z.; Dai, X.Y.; Huang, S.; Chen, J. Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 2509–2518. [ Google Scholar ] [ CrossRef ]
  • Peng, H.; Xu, L.; Bing, L.; Huang, F.; Lu, W.; Si, L. Knowing What, How and Why: A Near Complete Solution for Aspect-Based Sentiment Analysis. Proc. AAAI Conf. Artif. Intell. 2020 , 34 , 8600–8607. [ Google Scholar ] [ CrossRef ]
  • Chen, S.; Wang, Y.; Liu, J.; Wang, Y. Bidirectional Machine Reading Comprehension for Aspect Sentiment Triplet Extraction. Proc. AAAI Conf. Artif. Intell. 2021 , 35 , 12666–12674. [ Google Scholar ] [ CrossRef ]
  • Mao, Y.; Shen, Y.; Yu, C.; Cai, L. A Joint Training Dual-MRC Framework for Aspect Based Sentiment Analysis. Proc. AAAI Conf. Artif. Intell. 2021 , 35 , 13543–13551. [ Google Scholar ] [ CrossRef ]
  • Zhai, Z.; Chen, H.; Feng, F.; Li, R.; Wang, X. COM-MRC: A COntext-Masked Machine Reading Comprehension Framework for Aspect Sentiment Triplet Extraction. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 3230–3241. [ Google Scholar ] [ CrossRef ]
  • Li, Y.; Lin, Y.; Lin, Y.; Chang, L.; Zhang, H. A span-sharing joint extraction framework for harvesting aspect sentiment triplets. Knowl.-Based Syst. 2022 , 242 , 108366. [ Google Scholar ] [ CrossRef ]
  • Chen, Y.; Keming, C.; Sun, X.; Zhang, Z. A Span-level Bidirectional Network for Aspect Sentiment Triplet Extraction. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 4300–4309. [ Google Scholar ] [ CrossRef ]
  • Xu, L.; Chia, Y.K.; Bing, L. Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 4755–4766. [ Google Scholar ] [ CrossRef ]
  • Wu, Z.; Ying, C.; Zhao, F.; Fan, Z.; Dai, X.; Xia, R. Grid Tagging Scheme for Aspect-oriented Fine-grained Opinion Extraction. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; pp. 2576–2585. [ Google Scholar ] [ CrossRef ]
  • Chen, H.; Zhai, Z.; Feng, F.; Li, R.; Wang, X. Enhanced Multi-Channel Graph Convolutional Network for Aspect Sentiment Triplet Extraction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 2974–2985. [ Google Scholar ] [ CrossRef ]
  • Zhang, Y.; Yang, Y.; Li, Y.; Liang, B.; Chen, S.; Dang, Y.; Yang, M.; Xu, R. Boundary-Driven Table-Filling for Aspect Sentiment Triplet Extraction. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 6485–6498. [ Google Scholar ] [ CrossRef ]
  • Yan, H.; Dai, J.; Ji, T.; Qiu, X.; Zhang, Z. A Unified Generative Framework for Aspect-based Sentiment Analysis. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021; pp. 2416–2429. [ Google Scholar ] [ CrossRef ]
  • Lu, Y.; Liu, Q.; Dai, D.; Xiao, X.; Lin, H.; Han, X.; Sun, L.; Wu, H. Unified Structure Generation for Universal Information Extraction. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Dublin, Ireland, 22–27 May 2022; pp. 5755–5772. [ Google Scholar ] [ CrossRef ]
  • Zhang, W.; Li, X.; Deng, Y.; Bing, L.; Lam, W. Towards Generative Aspect-Based Sentiment Analysis. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), Online, 1–6 August 2021; pp. 504–510. [ Google Scholar ] [ CrossRef ]
  • Zhou, J.; Yang, H.; He, Y.; Mou, H.; Yang, J. A Unified One-Step Solution for Aspect Sentiment Quad Prediction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 12249–12265. [ Google Scholar ] [ CrossRef ]
  • Qin, Y.; Lv, S. Generative Aspect Sentiment Quad Prediction with Self-Inference Template. Appl. Sci. 2024 , 14 , 6017. [ Google Scholar ] [ CrossRef ]
  • Bao, X.; Wang, Z.; Jiang, X.; Xiao, R.; Li, S. Aspect-based Sentiment Analysis with Opinion Tree Generation. In Proceedings of the 31st International Joint Conference on Artificial Intelligence—IJCAI, Vienna, Austria, 23–29 July 2022; Volume 2022, pp. 4044–4050. [ Google Scholar ]
  • Mao, Y.; Shen, Y.; Yang, J.; Zhu, X.; Cai, L. Seq2Path: Generating Sentiment Tuples as Paths of a Tree. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, 22–27 May 2022; pp. 2215–2225. [ Google Scholar ] [ CrossRef ]
  • Hu, M.; Wu, Y.; Gao, H.; Bai, Y.; Zhao, S. Improving Aspect Sentiment Quad Prediction via Template-Order Data Augmentation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 7889–7900. [ Google Scholar ] [ CrossRef ]
  • Gou, Z.; Guo, Q.; Yang, Y. MvP: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Toronto, ON, Canada, 9–14 July 2023; pp. 4380–4397. [ Google Scholar ] [ CrossRef ]
  • Zhang, W.; Zhang, X.; Cui, S.; Huang, K.; Wang, X.; Liu, T. Adaptive Data Augmentation for Aspect Sentiment Quad Prediction. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 11176–11180. [ Google Scholar ] [ CrossRef ]
  • Yu, Y.; Zhao, M.; Zhou, S. Boosting Aspect Sentiment Quad Prediction by Data Augmentation and Self-Training. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–8. [ Google Scholar ] [ CrossRef ]
  • Wang, A.; Jiang, J.; Ma, Y.; Liu, A.; Okazaki, N. Generative Data Augmentation for Aspect Sentiment Quad Prediction. In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), Toronto, ON, Canada, 13–14 July 2023; pp. 128–140. [ Google Scholar ] [ CrossRef ]
  • Zhang, Y.; Zeng, J.; Hu, W.; Wang, Z.; Chen, S.; Xu, R. Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Bangkok, Thailand, 11–16 August 2024; pp. 11862–11875. [ Google Scholar ]
  • Hu, M.; Bai, Y.; Wu, Y.; Zhang, Z.; Zhang, L.; Gao, H.; Zhao, S.; Huang, M. Uncertainty-Aware Unlikelihood Learning Improves Generative Aspect Sentiment Quad Prediction. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, Toronto, ON, Canada, 9–14 July 2023; pp. 13481–13494. [ Google Scholar ] [ CrossRef ]
  • Xu, X.; Zhang, J.D.; Xiao, R.; Xiong, L. The Limits of ChatGPT in Extracting Aspect-Category-Opinion-Sentiment Quadruples: A Comparative Analysis. arXiv 2023 , arXiv:2310.06502. [ Google Scholar ]
  • Kim, J.; Heo, R.; Seo, Y.; Kang, S.; Yeo, J.; Lee, D. Self-Consistent Reasoning-based Aspect-Sentiment Quad Prediction with Extract-Then-Assign Strategy. arXiv 2024 , arXiv:2403.00354. [ Google Scholar ]
  • Warriner, A.B.; Kuperman, V.; Brysbaert, M. Norms of valence, arousal, and dominance for 13,915 English lemmas. Behav. Res. Methods 2013 , 45 , 1191–1207. [ Google Scholar ] [ CrossRef ]
  • Preoţiuc-Pietro, D.; Schwartz, H.A.; Park, G.; Eichstaedt, J.; Kern, M.; Ungar, L.; Shulman, E. Modelling valence and arousal in facebook posts. In Proceedings of the 7th workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, San Diego, CA, USA, 16 June 2016; pp. 9–15. [ Google Scholar ]
  • Buechel, S.; Hahn, U. EmoBank: Studying the Impact of Annotation Perspective and Representation Format on Dimensional Emotion Analysis. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, 3–7 April 2017; pp. 578–585. [ Google Scholar ]
  • Yu, L.C.; Lee, L.H.; Hao, S.; Wang, J.; He, Y.; Hu, J.; Lai, K.R.; Zhang, X. Building Chinese affective resources in valence-arousal dimensions. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 540–545. [ Google Scholar ]
  • Lee, L.H.; Li, J.H.; Yu, L.C. Chinese EmoBank: Building valence-arousal resources for dimensional sentiment analysis. ACM Trans. Asian-Low-Resour. Lang. Inf. Process. 2022 , 21 , 1–18. [ Google Scholar ] [ CrossRef ]
  • Wu, C.; Wu, F.; Huang, Y.; Wu, S.; Yuan, Z. Thu_ngn at ijcnlp-2017 task 2: Dimensional sentiment analysis for chinese phrases with deep lstm. In Proceedings of the IJCNLP 2017, Shared Tasks, Taipei, Taiwan, 27 November–1 December 2017; pp. 47–52. [ Google Scholar ]
  • Xie, H.; Lin, W.; Lin, S.; Wang, J.; Yu, L.C. A multi-dimensional relation model for dimensional sentiment analysis. Inf. Sci. 2021 , 579 , 832–844. [ Google Scholar ] [ CrossRef ]
  • Wang, J.; Yu, L.C.; Lai, K.R.; Zhang, X. Dimensional sentiment analysis using a regional CNN-LSTM model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Berlin, Germany, 7–12 August 2016; pp. 225–230. [ Google Scholar ]
  • Wang, J.; Yu, L.C.; Lai, K.R.; Zhang, X. Tree-structured regional CNN-LSTM model for dimensional sentiment analysis. IEEE/ACM Trans. Audio Speech Lang. Process. 2019 , 28 , 581–591. [ Google Scholar ] [ CrossRef ]
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017 , 30 , 6000–6010. [ Google Scholar ]
  • Deng, Y.C.; Wang, Y.R.; Chen, S.H.; Lee, L.H. Towards Transformer Fusions for Chínese Sentiment Intensity Prediction in Valence-Arousal Dimensions. IEEE Access 2023 , 11 , 109974–109982. [ Google Scholar ] [ CrossRef ]
  • Wang, J.; Yu, L.C.; Zhang, X. SoftMCL: Soft Momentum Contrastive Learning for Fine-grained Sentiment-aware Pre-training. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024; pp. 15012–15023. [ Google Scholar ]
  • Pontiki, M.; Galanis, D.; Papageorgiou, H.; Androutsopoulos, I.; Manandhar, S.; Al-Smadi, M.; Al-Ayyoub, M.; Zhao, Y.; Qin, B.; De Clercq, O.; et al. SemEval 2016 Task 5 Aspect Based Sentiment Analysis (ABSA-16) Annotation Guidelines. 2016. Available online: https://alt.qcri.org/semeval2016/task5/data/uploads/absa2016_annotationguidelines.pdf (accessed on 14 September 2024).
  • Li, T. Restaurant Review Data on Dianping.com. 2018. Available online: https://opendata.pku.edu.cn/dataset.xhtml?persistentId=doi:10.18170/DVN/GCIUN4 (accessed on 15 September 2024).
  • Cui, Y.; Che, W.; Liu, T.; Qin, B.; Yang, Z. Pre-Training With Whole Word Masking for Chinese BERT. IEEE/ACM Trans. Audio Speech, Lang. Process. 2021 , 29 , 3504–3514. [ Google Scholar ] [ CrossRef ]
  • Che, W.; Li, Z.; Liu, T. LTP: A Chinese language technology platform. In Coling 2010: Demonstrations ; Coling 2010 Organizing Committee: Beijing, China, 2010; pp. 13–16. [ Google Scholar ]
  • Ramshaw, L.A.; Marcus, M.P. Text Chunking Using Transformation-Based Learning. In Natural Language Processing Using Very Large Corpora ; Armstrong, S., Church, K., Isabelle, P., Manzi, S., Tzoukermann, E., Yarowsky, D., Eds.; Springer: Dordrecht, The Netherlands, 1999; pp. 157–176. [ Google Scholar ] [ CrossRef ]
  • Deotte, C. The Magic of No Dropout. 2021. Available online: https://www.kaggle.com/competitions/commonlitreadabilityprize/discussion/260729 (accessed on 1 August 2024).
  • Wang, X.; Zhou, W.; Zu, C.; Xia, H.; Chen, T.; Zhang, Y.; Zheng, R.; Ye, J.; Zhang, Q.; Gui, T.; et al. InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction. arxiv 2023 , arXiv:2304.08085. [ Google Scholar ]
  • Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. In Proceedings of the International Conference on Learning Representations, Online, 25–29 April 2022. [ Google Scholar ]
  • Sun, Y.; Wang, S.; Feng, S.; Ding, S.; Pang, C.; Shang, J.; Liu, J.; Chen, X.; Zhao, Y.; Lu, Y.; et al. Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation. arXiv 2021 , arXiv:2107.02137. [ Google Scholar ]
  • Guo, D.; Zhu, Q.; Yang, D.; Xie, Z.; Dong, K.; Zhang, W.; Chen, G.; Bi, X.; Wu, Y.; Li, Y.; et al. DeepSeek-Coder: When the Large Language Model Meets Programming–The Rise of Code Intelligence. arXiv 2024 , arXiv:2401.14196. [ Google Scholar ]
  • Meng, L.a.; Zhao, T.; Song, D. DS-Group at SIGHAN-2024 dimABSA Task: Constructing In-context Learning Structure for Dimensional Aspect-Based Sentiment Analysis. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 127–132. [ Google Scholar ]
  • Wang, Z.; Zhang, Y.; Wang, J.; Xu, D.; Zhang, X. YNU-HPCC at SIGHAN-2024 dimABSA Task: Using PLMs with a Joint Learning Strategy for Dimensional Intensity Prediction. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 96–101. [ Google Scholar ]
  • Kang, X.; Zhang, Z.; Zhou, J.; Wu, Y.; Shi, X.; Matsumoto, K. TMAK-Plus at SIGHAN-2024 dimABSA Task: Multi-Agent Collaboration for Transparent and Rational Sentiment Analysis. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 88–95. [ Google Scholar ]
  • Jiang, Y.; Lu, H.Y. JN-NLP at SIGHAN-2024 dimABSA Task: Extraction of Sentiment Intensity Quadruples Based on Paraphrase Generation. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 121–126. [ Google Scholar ]
  • Zhu, S.; Zhao, H.; Wxr, W.; Jia, Y.; Zan, H. ZZU-NLP at SIGHAN-2024 dimABSA Task: Aspect-Based Sentiment Analysis with Coarse-to-Fine In-context Learning. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 112–120. [ Google Scholar ]
  • Tong, Z.; Wei, W. CCIIPLab at SIGHAN-2024 dimABSA Task: Contrastive Learning-Enhanced Span-based Framework for Chinese Dimensional Aspect-Based Sentiment Analysis. In Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10), Bangkok, Thailand, 11–16 August 2024; pp. 102–111. [ Google Scholar ]
  • Zhang, J.; Gan, R.; Wang, J.; Zhang, Y.; Zhang, L.; Yang, P.; Gao, X.; Wu, Z.; Dong, X.; He, J.; et al. Fengshenbang 1.0: Being the Foundation of Chinese Cognitive Intelligence. arXiv 2022 , arXiv:2209.02970. [ Google Scholar ]


| Subtask | Dataset | #Sent | #Char | #Tuple | Aspect #NULL | Aspect #Unique | Aspect #Repeat | Opinion #Unique | Opinion #Repeat |
|---|---|---|---|---|---|---|---|---|---|
| ST1 | train | 6050 | 85,769 | 8523 | 169 | 6430 | 1924 | – | – |
| ST1 | dev | 100 | 1,109 | 115 | 0 | 115 | 0 | – | – |
| ST1 | test | 2000 | 34,002 | 2658 | 0 | 2658 | 0 | – | – |
| ST2 and ST3 | train | 6050 | 85,769 | 8523 | 169 | 6430 | 1924 | 7986 | 537 |
| ST2 and ST3 | dev | 100 | 1,280 | 150 | 0 | 78 | 72 | 143 | 7 |
| ST2 and ST3 | test | 2000 | 39,014 | 3566 | 52 | 1693 | 1821 | 3263 | 303 |
Subtask 1 columns: V-MAE↓, V-PCC↑, A-MAE↓, A-PCC↑; Subtask 2 columns: V-T-F1↑, A-T-F1↑, VA-T-F1↑; Subtask 3 columns: V-Q-F1↑, A-Q-F1↑, VA-Q-F1↑. "–" indicates a result not reported.

| Methods | V-MAE↓ | V-PCC↑ | A-MAE↓ | A-PCC↑ | V-T-F1↑ | A-T-F1↑ | VA-T-F1↑ | V-Q-F1↑ | A-Q-F1↑ | VA-Q-F1↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| yangnan | 1.032 | 0.877 | 1.095 | 0.097 | – | – | – | – | – | – |
| DS-Group | 0.460 | 0.858 | 0.501 | 0.490 | – | – | – | – | – | – |
| YNU-HPCC | 0.294 | 0.917 | 0.318 | 0.771 | – | – | – | – | – | – |
| TMAK-Plus | – | – | – | – | 0.269 | 0.307 | 0.157 | – | – | – |
| USTC-IAT | – | – | – | – | – | – | – | 0.438 | 0.437 | 0.312 |
| SUDA-NLP | – | – | – | – | 0.475 | 0.448 | 0.326 | 0.487 | 0.444 | 0.336 |
| BIT-NLP | – | – | – | – | 0.490 | 0.450 | 0.342 | 0.470 | 0.434 | 0.329 |
| JN-NLP | – | – | – | – | – | – | – | 0.482 | 0.439 | 0.331 |
| ZZU-NLP | – | – | – | – | 0.542 | 0.507 | 0.389 | 0.522 | 0.489 | 0.376 |
| CCIIPLab | 0.294 | 0.916 | 0.309 | 0.766 | 0.573 | 0.522 | 0.403 | 0.555 | 0.507 | 0.389 |
| Ours | | | | | | | | | | |
Subtask 1 columns: V-MAE↓, V-PCC↑, A-MAE↓, A-PCC↑; Subtask 2 columns: V-T-F1↑, A-T-F1↑, VA-T-F1↑; Subtask 3 columns: V-Q-F1↑, A-Q-F1↑, VA-Q-F1↑.

| Methods | V-MAE↓ | V-PCC↑ | A-MAE↓ | A-PCC↑ | V-T-F1↑ | A-T-F1↑ | VA-T-F1↑ | V-Q-F1↑ | A-Q-F1↑ | VA-Q-F1↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT | 0.287 | 0.930 | 0.311 | 0.773 | 0.574 | 0.526 | 0.405 | 0.555 | 0.511 | 0.393 |
| BERT | | 0.930 | 0.316 | 0.766 | 0.583 | 0.543 | 0.425 | 0.564 | | 0.411 |
| LLM | 0.367 | 0.884 | 0.394 | 0.683 | 0.530 | 0.498 | 0.392 | 0.512 | 0.482 | 0.379 |
| LLM | 0.294 | 0.919 | 0.331 | 0.738 | 0.457 | 0.437 | 0.312 | 0.443 | 0.426 | 0.302 |
| Hybrid approach | | | | | | | | | 0.526 | |
| Methods | Type | V-Q-F1 | A-Q-F1 | VA-Q-F1 |
|---|---|---|---|---|
| Voting1 | BERT | 0.557 | 0.509 | 0.393 |
| Voting2 | BERT&LLM | 0.563 | 0.526 | 0.413 |
| Replace | BERT&LLM | 0.565 | | 0.416 |
| Pipeline | BERT&LLM | | | |
Subtask 1 columns: V-MAE↓, V-PCC↑, A-MAE↓, A-PCC↑; Subtask 2 columns: V-T-F1↑, A-T-F1↑, VA-T-F1↑; Subtask 3 columns: V-Q-F1↑, A-Q-F1↑, VA-Q-F1↑. "–" indicates a result not reported.

| Methods | V-MAE↓ | V-PCC↑ | A-MAE↓ | A-PCC↑ | V-T-F1↑ | A-T-F1↑ | VA-T-F1↑ | V-Q-F1↑ | A-Q-F1↑ | VA-Q-F1↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| BERT | 0.287 | 0.930 | 0.311 | 0.773 | 0.574 | 0.526 | 0.405 | 0.555 | 0.511 | 0.393 |
| w/o pre-training | 0.294 | 0.924 | 0.313 | 0.771 | 0.565 | 0.520 | 0.401 | 0.544 | 0.502 | 0.386 |
| w/o disabling-dropout | 0.337 | 0.933 | 0.348 | | 0.537 | 0.503 | 0.365 | 0.521 | 0.487 | 0.354 |
| w/o negative-pair | – | – | – | – | 0.567 | 0.518 | 0.399 | 0.549 | 0.502 | 0.387 |
Subtask 1 columns: V-MAE↓, V-PCC↑, A-MAE↓, A-PCC↑; Subtask 2 columns: V-T-F1↑, A-T-F1↑, VA-T-F1↑; Subtask 3 columns: V-Q-F1↑, A-Q-F1↑, VA-Q-F1↑.

| Methods | V-MAE↓ | V-PCC↑ | A-MAE↓ | A-PCC↑ | V-T-F1↑ | A-T-F1↑ | VA-T-F1↑ | V-Q-F1↑ | A-Q-F1↑ | VA-Q-F1↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| LLM | 0.367 | 0.884 | 0.394 | 0.683 | 0.530 | 0.498 | 0.392 | 0.512 | 0.482 | 0.379 |
| w/o multi-task | 0.381 | 0.876 | 0.406 | 0.632 | | 0.481 | 0.381 | | 0.464 | 0.367 |
| w/o code prompt | 0.367 | 0.882 | 0.394 | 0.672 | 0.515 | 0.472 | 0.373 | 0.495 | 0.454 | 0.358 |
| w/o beam search | 0.377 | 0.880 | | 0.670 | 0.531 | 0.489 | 0.388 | 0.511 | 0.472 | 0.374 |
| Model | Params | Valence MAE↓ | Valence PCC↑ | Arousal MAE↓ | Arousal PCC↑ |
|---|---|---|---|---|---|
| chinese-roberta-wwm-ext | 102M | 0.300 | 0.918 | 0.310 | 0.766 |
| ernie-3.0-base-zh | 118M | 0.300 | 0.915 | 0.313 | 0.762 |
| ernie-3.0-xbase-zh | 296M | 0.286 | 0.926 | | |
| erlangshen-deberta-v2-320m-chinese | 320M | | | 0.310 | 0.774 |
| chinese-roberta-ext-large | 326M | 0.289 | 0.923 | 0.314 | 0.769 |
| Error type | Aspect | Opinion | Pairing | Category | Valence | Arousal |
|---|---|---|---|---|---|---|
| Error proportion | 18.68% | 21.78% | 2.34% | 4.48% | 25.90% | 26.82% |

Share and Cite

Zhang, Y.; Xu, H.; Zhang, D.; Xu, R. A Hybrid Approach to Dimensional Aspect-Based Sentiment Analysis Using BERT and Large Language Models. Electronics 2024 , 13 , 3724. https://doi.org/10.3390/electronics13183724




Title: A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges

Abstract: As an important fine-grained sentiment analysis problem, aspect-based sentiment analysis (ABSA), aiming to analyze and understand people's opinions at the aspect level, has been attracting considerable interest in the last decade. To handle ABSA in different scenarios, various tasks are introduced for analyzing different sentiment elements and their relations, including the aspect term, aspect category, opinion term, and sentiment polarity. Unlike early ABSA works focusing on a single sentiment element, many compound ABSA tasks involving multiple elements have been studied in recent years for capturing more complete aspect-level sentiment information. However, a systematic review of various ABSA tasks and their corresponding solutions is still lacking, which we aim to fill in this survey. More specifically, we provide a new taxonomy for ABSA which organizes existing studies from the axes of concerned sentiment elements, with an emphasis on recent advances of compound ABSA tasks. From the perspective of solutions, we summarize the utilization of pre-trained language models for ABSA, which improved the performance of ABSA to a new stage. Besides, techniques for building more practical ABSA systems in cross-domain/lingual scenarios are discussed. Finally, we review some emerging topics and discuss some open challenges to outlook potential future directions of ABSA.
Subjects: Computation and Language (cs.CL)


Sentiment analysis using Twitter data: a comparative application of lexicon- and machine-learning-based approach

  • Original Article
  • Open access
  • Published: 09 February 2023
  • Volume 13, article number 31 (2023)


  • Yuxing Qi
  • Zahratu Shabrina


1 Introduction

Social media platforms such as Twitter provide a space where users share their thoughts and opinions as well as connect, communicate, and contribute to certain topics using short posts (originally limited to 140 characters) known as tweets . This can be done through text, pictures, videos, etc., and users can interact through likes, comments, and reposts. According to Twitter ( https://investor.twitterinc.com ), the platform had more than 206 million daily active users in 2022, defined as the number of logged-in accounts that can be identified by the platform and to which ads can be shown. As more people contribute to social media, the analysis of information available online can be used to reflect on changes in people's perceptions, behavior, and psychology (Alamoodi et al. 2021 ). Hence, using Twitter data for sentiment analysis has become a popular trend. The growing interest in social media analysis has brought more attention to Natural Language Processing (NLP) and Artificial Intelligence (AI) technologies related to text analysis.

Using text analysis, it is possible to determine the sentiments and attitudes of certain target groups. Much of the available literature focuses on texts in English, but there is growing interest in multilanguage analysis (Arun and Srinagesh 2020a ; Dashtipour et al. 2016 ; Lo et al. 2017 ). Text analysis can be performed by extracting subjective comments toward a certain topic and assigning sentiment labels such as Positive, Negative, and Neutral (Arun and Srinagesh 2020b ). One topical interest is the Coronavirus (Covid-19), a novel disease first identified in late 2019. The rapid spread of Covid-19 worldwide affected many countries, leading to changes in people’s lifestyles, such as wearing masks on public transportation and maintaining social distancing. Sentiment analysis can be applied to social media data to explore changes in people’s behavior, emotions, and opinions, for example by dividing the spread of Covid-19 into three stages and exploring people’s negative sentiments toward Covid-19 based on topic modeling and feature extraction (Boon-Itt and Skunkan 2020 ). Previous studies have retrieved tweets based on certain hashtags (#) used to categorize content by topic, such as “#stayathome” and “#socialdistancing”, and measured their frequency (Saleh et al. 2021 ). Another study used the Word2Vec technique and machine learning models, such as Naive Bayes, SVC, and Decision Tree, to explore the sentiment changes of students during online learning, as various learning activities moved online due to the pandemic (Mostafa 2021 ).

In this paper, we apply social media data analysis to explore sentiments toward Covid-19 in England. The paper aims to examine the sentiments of tweets using various methods, including lexicon and machine learning approaches, during the third lockdown period in England as a case study. Readers new to NLP should be able to use this paper to help select an appropriate method for their own analysis. Empirically, the case study also contributes to our understanding of the sentiments related to the UK national lockdown. In many countries, the implementation of policies and plans related to Covid-19 often sparked widespread discussion on Twitter. Tweet data can reflect public sentiment on the Covid-19 pandemic, therefore providing an alternative source for guiding government policy. The UK has experienced three national lockdowns since the outbreak of Covid-19, and people have expressed their opinions on Covid-19-related topics such as social restrictions, vaccination plans, and school reopening, all of which are worth exploring and analyzing. In addition, few existing studies focus on the UK or England, especially the change in people’s attitudes toward Covid-19 during the third lockdown.

2 Sentiment analysis approaches

In applying sentiment analysis, the key process is classifying extracted data into sentiment polarities such as positive, neutral, and negative classes. A wider range of emotions can also be considered, which is the focus of the emerging fields of affective computing and sentiment analysis (Cambria 2016 ). There are various ways to separate sentiments according to different research topics; for example, in political debates, sentiments can be divided further into satisfied and angry (D’Andrea et al. 2015 ). Sentiment analysis with ambivalence handling can be incorporated to produce finer-grained results and characterize emotions into detailed categories such as anxiety, sadness, anger, excitement, and happiness (Wang et al. 2015 , 2020 ).

Sentiment analysis is generally applied to text data, although it can also be used to analyze data from devices that capture audio or audio-visual formats, such as webcams, to examine expressions, body movements, or sounds; this is known as multimodal sentiment analysis (Soleymani et al. 2017 ; Yang et al. 2022 ; Zhang et al. 2020 ). Multimodal sentiment analysis expands text-based analysis into something more complex and opens possibilities for using NLP for various purposes. NLP itself is also advancing rapidly, driven by research in areas such as neural networks (Kim 2014 ; Ray and Chakrabarti 2022 ). An example is Neurosymbolic AI, which combines deep learning and symbolic reasoning and is considered a promising method in NLP for understanding reasoning (Sarker et al. 2021 ). This indicates the wide range of possible directions for NLP research.

There are three main methods to detect and classify emotions expressed in text, which are lexicon-based, machine-learning-based approaches, and hybrid techniques. The lexicon-based approach uses the polarity of words, while the machine learning method sees texts as a classification problem and can be further divided into unsupervised, semi-supervised, and supervised learning (Aqlan et al. 2019 ). Figure  1 shows the classification of methods that can be used for sentiment analysis, and in practical applications, machine learning methods and lexicon-based methods could be used in combination.

Figure 1: Sentiment analysis approaches

When dealing with large text data such as that from Twitter, it is important to pre-process the data before starting the analysis. This includes converting upper-case letters to lower case, removing useless words or links, expanding contractions, removing non-alphabetical characters or symbols, removing stop words, and removing duplicate records. Beyond this basic cleaning, further steps are usually applied, including tokenization, stemming, lemmatization, and Part-of-Speech (POS) tagging. Tokenization splits texts into smaller units and turns them into a list of tokens, which makes it convenient to calculate the frequency of each word in the text and analyze its sentiment polarity. Stemming and lemmatization replace words with their root word. For example, the words “feeling” and “felt” can be mapped to their stem “feel”. Lemmatization, unlike stemming, uses the context of the words. This can reduce the dimensionality and complexity of a bag of words, which also improves the efficiency of looking words up in the lexicon when applying the lexicon-based method. POS tagging automatically tags the part of speech of words in the text, such as nouns, verbs, and adjectives, which is useful for feature selection and extraction (Usop et al. 2017 ).
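The cleaning steps above can be sketched in pure Python. The stop-word list and suffix rules below are tiny illustrative stand-ins for what NLTK's full stop-word corpus, PorterStemmer, and POS tagger provide:

```python
import re

# Tiny illustrative stop-word list; NLTK/spaCy ship complete lists.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "and", "of", "in", "on"}

def naive_stem(token: str) -> str:
    """Crude suffix stripping, e.g. 'feeling' -> 'feel' (real work would use NLTK's PorterStemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(tweet: str) -> list:
    text = tweet.lower()                          # replace upper-case letters
    text = re.sub(r"https?://\S+", " ", text)     # remove links
    text = re.sub(r"[^a-z\s]", " ", text)         # remove non-alphabetical characters
    tokens = text.split()                         # tokenization
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]

print(preprocess("The masks are ANNOYING but helping https://example.com"))
# ['mask', 'annoy', 'but', 'help']
```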

2.1 Lexicon-based approach

The core idea of the lexicon-based method is to (1) split the sentences into a bag of words, then (2) compare them with the words in the sentiment polarity lexicon and their related semantic relations, and (3) calculate the polarity score of the whole text. These methods can effectively determine whether the sentiment of the text is positive, negative, or neutral (Zahoor and Rohilla 2020 ). The lexicon-based approach tags words with semantic orientation using either dictionary-based or corpus-based techniques. The former is simpler: the polarity score of words or phrases in the text is determined using a sentiment dictionary of opinion words.

2.1.1 Lexicon-based approaches with built-in library

Examples of the most popular lexicon-based sentiment analysis models in Python are TextBlob and VADER. TextBlob is a Python library based on the Natural Language Toolkit (NLTK) that calculates a sentiment score for texts. An averaging technique is applied over the words to obtain the sentiment polarity score for the entire text (Oyebode and Orji 2019 ). The words recorded in the TextBlob lexicon have corresponding polarity, subjectivity, and intensity scores. There may be several records for the same word, in which case the word’s sentiment score is the average polarity over all of its records. The resulting sentiment polarity scores lie in [− 1, 1], where − 1 refers to negative sentiment and + 1 refers to positive sentiment.
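The averaging idea can be illustrated with a toy lexicon; the words and scores below are made up for illustration and are not TextBlob's actual lexicon entries:

```python
# Toy polarity lexicon in [-1, 1]; entries are illustrative, not TextBlob's real data.
LEXICON = {"great": 0.8, "happy": 0.8, "bad": -0.7, "terrible": -1.0}

def textblob_style_polarity(text: str) -> float:
    """Average the polarity of every lexicon word found, as TextBlob does."""
    scores = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(round(textblob_style_polarity("great service but terrible queues"), 2))  # -0.1
```

A text with no lexicon words scores 0.0, i.e. neutral.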

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based tool for sentiment analysis with a well-established sentiment lexicon (Hutto and Gilbert 2014 ). Compared to the TextBlob library, its lexicon contains more vocabulary from the language of social media, so it may work better on social-media-style text that often contains non-formal language. From the results, the positive, negative, neutral, and compound values of tweets are presented, and the sentiment orientation is determined based on the compound score. There are several main steps in the compound score calculation. Firstly, each word in the sentiment lexicon is given a valence score ranging from − 4 (most negative) to 4 (most positive). Heuristic rules are then applied when handling punctuation, capitalization, degree modifiers, contrastive conjunctions, and negations, which adjust the valence scores of the affected words. The sum of the scores of all words in the text is then standardized to (− 1, 1) using the formula below:

compound = x / √(x² + α)

where x represents the sum of the valence scores of the sentiment words, and α is a normalization constant. The resulting compound score lies in the range of − 1 (most negative) to 1 (most positive). The specific classification criteria for both TextBlob and VADER are shown in Table 1 .
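The normalization step can be checked numerically; α = 15 below is the default constant used in the reference VADER implementation:

```python
import math

def vader_normalize(x: float, alpha: float = 15.0) -> float:
    """VADER's normalization: squashes the raw valence sum x into (-1, 1)."""
    return x / math.sqrt(x * x + alpha)

# The mapping is monotonic and saturates toward +/-1 for extreme raw sums.
for raw in (-8.0, -2.0, 0.0, 2.0, 8.0):
    print(raw, round(vader_normalize(raw), 3))
```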

2.1.2 Lexicon-based approach with SentiWordNet

SentiWordNet is a lexical opinion resource that operates on the WordNet database, in which lemmas are grouped into sets of synonyms called “synsets” (Baccianella et al. 2010 ). Each synset is assigned positive and negative polarity scores, Pos( s ) and Neg( s ), each ranging between 0 and 1. The process of SentiWordNet analysis is shown in Fig.  2 .

Figure 2: Process of SentiWordNet-based approaches

There are several steps in applying the SentiWordNet-based approach. The first is data pre-processing, including basic data cleaning, tokenization, stemming, and POS tagging. These steps reduce the time spent searching for words in the SentiWordNet database. For a given lemma that has n meanings in the tweet, only the polarity score of the most common meaning (the first synset, s₁) is considered. The formula is as follows:

Score( w ) = Pos( s₁ ) − Neg( s₁ )

We can count the positive and negative terms in each tweet and calculate their sentiment polarity scores (Guerini et al. 2013 ). The sentiment score of each word or specific term in the SentiWordNet lexicon can be calculated by applying Eq. ( 4 ):

SynsetScore( w ) = |max Pos( w )| − |max Neg( w )|

The SynsetScore thus takes the absolute value of the maximum positive score and the maximum negative score of the word. For a term containing several synsets, the calculation is as follows:

TermScore( t ) = (1/ n ) · Σᵢ₌₁ᵏ SynsetScore( sᵢ )

where n is a count number; the total score is recorded as 0 if the term is not in SentiWordNet. The symbol k indicates how many synsets the term contains, and if there is a negation in front of the term, its sentiment value is reversed. Finally, we can add the sentiment scores of all terms to get the sentiment score of the tweet using the formula below:

SentiScore( p ) = PosScore( p ) + NegScore( p )

where p is a clean tweet with m positive terms and n negative terms. PosScore( p ) is the summed score of all the positive terms, while NegScore( p ) is the summed score of the negative terms, and SentiScore( p ) is the final sentiment score of the tweet (Bonta et al. 2019 ).
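The first-sense scoring and per-tweet summation can be sketched with a toy stand-in for the SentiWordNet table; the words and scores below are illustrative, not actual SentiWordNet entries (real code would use `nltk.corpus.sentiwordnet`):

```python
# Toy stand-in for SentiWordNet: word -> list of (pos_score, neg_score) synsets,
# ordered from most to least common sense. Scores are illustrative only.
SWN = {
    "good":  [(0.75, 0.0), (0.5, 0.0)],
    "bad":   [(0.0, 0.875)],
    "plain": [(0.0, 0.0)],
}

def first_sense_score(word: str) -> float:
    # Only the most common sense (the first synset) is considered.
    senses = SWN.get(word)
    if not senses:
        return 0.0            # words missing from the lexicon score 0
    pos, neg = senses[0]
    return pos - neg

def tweet_score(tokens) -> float:
    # Sum the per-term scores to obtain the tweet-level sentiment score.
    return sum(first_sense_score(t) for t in tokens)

print(tweet_score(["good", "bad", "plain"]))  # 0.75 - 0.875 = -0.125
```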

2.2 Machine learning approach

Machine learning approaches construct classifiers that complete sentiment classification by extracting feature vectors. The main steps are collecting and cleaning the data, extracting features, training a classifier, and analyzing the results (Adwan et al. 2020 ). When using machine learning methods, the dataset needs to be divided into a training set and a test set. The training set enables the classifier to learn the text features, and the test set evaluates the performance of the classifier.

The role of classifiers (e.g., the Naïve Bayes classifier, Support Vector Machine, Logistic classifier, and Random Forest classifier) is to classify text into defined classes. As one of the most common methods for text classification, machine learning is widely used by researchers. The performance of the same classifier can differ greatly across different types of text, so feature vectors should be trained separately for each text type. To increase the robustness of the model, a two-stage support vector machine classifier can be used, which can effectively handle the influence of noisy data on classification (Barbosa and Feng 2010 ). In the subsequent process, the tweet data are vectorized, the labeled tweets are divided into a training set (80%) and a test set (20%), and the sentiment labels are then predicted by training different classification models. The overall process is shown in Fig.  3 below:

Figure 3: Main process of machine-learning-based approaches
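The 80/20 split described above can be sketched with scikit-learn's `train_test_split`; the tweets and labels here are placeholders:

```python
from sklearn.model_selection import train_test_split

tweets = [f"placeholder tweet {i}" for i in range(10)]
labels = ["positive"] * 5 + ["negative"] * 5

# 80% of the labeled tweets train the classifier; the held-out 20% evaluates it.
X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.2, random_state=42, stratify=labels)

print(len(X_train), len(X_test))  # 8 2
```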

2.2.1 Feature representation

The common methods of text feature representation can be divided into two categories: frequency-based embeddings (e.g., count vectors, hashing vectorizers, and TF–IDF) and pre-trained word embeddings (e.g., Word2Vec, GloVe, and BERT) (Naseem et al. 2021 ). In this paper, the following three feature representation models are mainly used:

Bag of words ( BoW ) converts textual data to numerical data with a fixed-length vector by counting the frequency of each word in tweets. In Python, CountVectorizer() calculates term frequencies and builds a sparse matrix over the clean tokens.

Term frequency–inverse document frequency ( TF–IDF ) measures the relevance between a word and the entire text and evaluates the importance of the word in the tweet dataset. In Python, TfidfVectorizer() can obtain a TF–IDF matrix by calculating the product of the word frequency metric and inverse document frequency metric of each word from clean tweets.

Word2Vec generates a vector space from the whole tweet corpus, and each word is represented as a vector in this space. Words with similar meanings lie closer together in the space, so this method is more effective for dealing with semantic relations. In Python, this text embedding method can be implemented with the Word2Vec model in the Gensim library, and many hyperparameters can be adjusted to optimize the word embedding model, such as the training corpus (sentences), the training algorithm (skip-gram, sg), and the maximum distance between the current word and the predicted word in a sentence (window).
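The first two representations can be produced with the scikit-learn vectorizers named above (a Word2Vec model would additionally need the Gensim library, omitted here); the tweets are placeholders:

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

tweets = [
    "lockdown extended again",
    "vaccine rollout going well",
    "lockdown rules and vaccine news",
]

bow = CountVectorizer()              # bag of words: raw term counts
X_bow = bow.fit_transform(tweets)    # sparse document-term matrix

tfidf = TfidfVectorizer()            # reweights counts by inverse document frequency
X_tfidf = tfidf.fit_transform(tweets)

print(X_bow.shape)                   # (num tweets, vocabulary size)
print(sorted(bow.vocabulary_))
```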

2.2.2 Classification models

Sentiment classification is the process of predicting users’ tweets as positive, negative, or neutral based on the feature representation of the tweets. Classifiers in supervised machine learning, such as a random forest, can classify and predict unlabeled text after being trained on a large number of sentiment-labeled tweets. The classification models used in this paper are as follows:

2.2.2.1 Random forest

The random forest algorithm bases its results on the predictions of multiple decision trees, and the classification of a new data point is determined by a voting mechanism (Breiman 2001 ). Increasing the number of trees can increase the accuracy of the results. There are several steps in applying random forests to text processing (Kamble and Itkikar 2018 ). First, we draw n random samples of tweet records from the dataset and build a decision tree for each sample. Each decision tree then produces a predicted classification, and we take the majority vote over the trees’ predictions: the sentiment orientation is assigned to the category with the most votes. To evaluate the results, we can split the dataset into a training part to build the forest and a test part to calculate the error rate (al Amrani et al. 2018 ).
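The voting procedure can be sketched with scikit-learn on a toy labeled set (texts and labels below are made up for illustration):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

texts = ["love this policy", "hate the new rules",
         "love the community support", "hate waiting in queues"]
labels = ["positive", "negative", "positive", "negative"]

vec = CountVectorizer()
X = vec.fit_transform(texts)

# 100 decision trees, each grown on a bootstrap sample of the data;
# the predicted class is the majority vote across trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X, labels)

print(forest.score(X, labels))   # training accuracy on the toy set
```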

2.2.2.2 Multinomial Naïve Bayes

This model is based on the Naïve Bayes theorem, which calculates the probability of each of several categories from the observations, and the category with the maximum probability is assigned to the text. Hence, the model can effectively solve text classification problems with multiple classes. The formula using Bayes’ theorem to predict the category label from text features (Kamble and Itkikar 2018) is:

p(label | features) = p(label) × p(features | label) / p(features)

where p(label) represents the prior probability of the label, and p(features | label) is the likelihood of the features given the label. To implement this technique, we first calculate the prior probability of each known category label. Then, we obtain the likelihood of each feature for the different categories and compute the posterior probability with Bayes’ theorem. Lastly, we select the category with the highest posterior probability as the label of the input tweet.
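These three steps can be sketched in plain Python; the toy tweets and the add-one (Laplace) smoothing are illustrative choices, not details from the paper:

```python
from collections import Counter

# Toy training data: tokenized tweets with sentiment labels (illustrative).
train = [(["good", "vaccine"], "positive"), (["great", "news"], "positive"),
         (["bad", "lockdown"], "negative"), (["awful", "rules"], "negative")]

all_labels = [y for _, y in train]
vocab = {w for words, _ in train for w in words}

# Step 1: prior probability p(label) from label frequencies.
priors = {y: n / len(train) for y, n in Counter(all_labels).items()}

# Step 2: likelihood p(word | label) with add-one (Laplace) smoothing.
counts = {y: Counter() for y in priors}
for words, y in train:
    counts[y].update(words)

def likelihood(word, y):
    total = sum(counts[y].values())
    return (counts[y][word] + 1) / (total + len(vocab))

# Step 3: posterior (up to the constant p(features)) and argmax over labels.
def classify(words):
    def score(y):
        p = priors[y]
        for w in words:
            p *= likelihood(w, y)
        return p
    return max(priors, key=score)

print(classify(["good", "news"]))    # → "positive"
```

Dividing by p(features) is skipped because it is the same for every label and does not change the argmax.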

2.2.2.3 Support vector classification (SVC)

The purpose of this model is to determine a linear separator in the vector space that separates the different categories of input vectors. Once the hyperplane is obtained, the extracted text features can be fed to the classifier to predict the results. The core idea is to find the hyperplane that lies farthest from the nearest training points, the support vectors. The steps in implementing SVC include calculating the distance between the hyperplane and the nearest support vectors (the margin), maximizing the margin to obtain an optimal hyperplane for the given data, and using this hyperplane as a decision boundary to separate the classes.
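A minimal sketch of such a linear separator using scikit-learn's LinearSVC; the toy tweets, labels, and the TF–IDF features are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

# Toy tweets with 1 = positive, -1 = negative (invented for illustration).
texts = ["love this vaccine", "great progress", "hope at last",
         "hate the lockdown", "terrible rules", "awful situation"]
labels = [1, 1, 1, -1, -1, -1]

vec = TfidfVectorizer()
X = vec.fit_transform(texts)

# Fit a maximum-margin linear separator; the learned hyperplane becomes
# the decision boundary between the two sentiment classes.
clf = LinearSVC().fit(X, labels)
print(clf.predict(vec.transform(["great vaccine"])))   # → [1]
```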

2.2.3 Hyperparameters optimization

Hyperparameters can be considered the settings of a machine learning model, and they need to be tuned to ensure good model performance. There are many approaches to hyperparameter tuning, including Grid Search, Random Search, and automated hyperparameter optimization; in this study, Grid Search and Random Search are considered. The result may not be the global optimum for a classification model, but it is the optimal choice within the range of the searched grid values.

In Grid Search, we build a grid of hyperparameter values, train a model with each combination of hyperparameter values, and evaluate every position on the grid. In Random Search, we build a grid of hyperparameter values and train models on randomly selected combinations, which means not all combinations are tried. For this paper, the latter approach is more feasible: although Grid Search might yield more accurate results, it is inefficient and costs more time compared with Random Search.
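The contrast between the two search strategies can be sketched with scikit-learn; the model, the hyperparameter grid, and the n_iter value below are illustrative, not the settings used in the paper:

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Toy data standing in for vectorized tweets.
X, y = make_classification(n_samples=60, random_state=0)

# Grid Search: every combination on the grid is trained (3 x 3 = 9 candidates).
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [5, 25, 50], "max_depth": [2, 5, None]},
    cv=3,
).fit(X, y)

# Random Search: only n_iter randomly drawn combinations are trained,
# so not every value is tried -- cheaper, but possibly less exhaustive.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": randint(5, 50), "max_depth": [2, 5, None]},
    n_iter=4, cv=3, random_state=0,
).fit(X, y)

print(grid.best_params_)   # best of all 9 combinations
print(rand.best_params_)   # best of the 4 sampled combinations
```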

3 Data and methods

This paper focuses on tweets that were geotagged in the main UK cities during the third national Covid-19 lockdown. The cities are Greater London, Bristol, Southampton, Birmingham, Manchester, Liverpool, Newcastle, Leeds, Sheffield, and Nottingham. Since the total number of tweets in each city is positively correlated with urban population size and density, the number of tweets varies widely among these cities. To collect enough tweets to represent the perception of most people in England toward the Covid-19 pandemic, the selection criteria for the major cities are based on total population and density to improve the validity of the data (Jiang et al. 2016).

We divide the data collection time frame into the three different stages of the third national lockdown in 2021. The timeline of the third national lockdown in England runs from 6 January 2021 to 18 July 2021, as can be seen in Fig. 4. During this period, we selected several critical time points for analysis, in stages following the plan for lifting the lockdown in England; the duration of each stage is about two months. The stages are: Stage 1, from 6 January to 7 March 2021, when England entered the third national lockdown; Stage 2, from 8 March to 16 May 2021, when the government implemented steps 1 and 2 of lifting the lockdown; and Stage 3, from 17 May to 18 July 2021, when the government implemented step 3 of lifting the lockdown and eased most Covid-19 restrictions in the UK.

figure 4

Detailed timeline of the third national lockdown in 2021

The tweets are extracted using Twint and the Twitter Academic API, as these scraping tools facilitate the collection of geolocated tweets, which enables geographical analysis. However, users who are willing to disclose their geographic location when tweeting account for only about 1% of all users (Sloan and Morgan 2015), and the location-sharing option is off by default. Therefore, the data collected by Twint and the Twitter Academic API are merged to obtain more tweets.

To filter the tweets related to Covid-19, we used keywords including “corona” or “covid” in the search configuration of Twint or the query field of the Twitter Academic API, thus extracting the tweets and hashtags containing the search terms. In Twint, up to 1000 tweets can be fetched per city per day, which avoids large bias in sentiment analysis due to uneven data distribution; in most cases, however, the number of tweets from a city in one day does not reach this upper limit. Moreover, the cities in the major-cities list are used as a condition for filtering tweets from different geographic regions.

A total of 77,332 unique tweets were collected across the three stages, crawled from January 6 to July 18, 2021 (stage 1: 29,923; stage 2: 24,689; and stage 3: 22,720 tweets). The distribution of the number of tweets in each city is shown in Fig. 5a. Most of the tweets originate from London, Manchester, Birmingham, and Liverpool, and there are far more tweets in London (37,678) than in other cities. The number of tweets obtained in some cities, such as Newcastle, is much lower, with only 852 tweets collected in six months. Figure 5 shows the distribution of data at each stage: the first stage has the most data, while the third stage has the least. Additionally, at each stage, London has the largest proportion of data and Newcastle the least, in line with the total population and density of each area.

figure 5

Distribution of collected tweets based on the selected cities and different stages

Since most raw tweets are unstructured and informal, which may affect the word polarity or text feature extraction, the data were pre-processed before sentiment analysis (Naseem et al. 2021 ). We implemented a basic data-cleaning process as follows:

Converting upper-case letters to lower case to avoid recognizing the same word as different words because of capitalization.

Removing hashtags (#topic), mentioned usernames (@username), and all links that start with “www,” “http,” or “https.”

Removing stop words and short words (fewer than two characters). Stop words are very common in text but hardly carry any sentiment polarity. However, in sentiment analysis, “not” and “no” should not be listed as stop words, because removing these negations would change the real meaning of entire sentences.

Reducing repeated characters in some words. Some users type repeated characters to express strong emotions, so these out-of-lexicon words should be converted into their correct forms; for example, “sooooo goooood” becomes “so good.”

Expanding contractions in tweets such as “isn't” or “don't,” as these become meaningless letters or words once punctuation has been removed. Therefore, all contractions in the tweets are expanded into their formal forms, e.g., “isn't” becomes “is not.”

Clearing all non-alphabetical characters or symbols including punctuation, numbers, and other special symbols that may affect the feature extraction of the text.

Removing duplicated or empty tweets and creating a clean dataset.

Converting emojis to their textual meaning, as many Twitter users use emojis in their tweets to express their sentiments and emotions. Hence, using the demojize() function in the emoji module of Python to transform emojis into their textual meaning may improve the accuracy of the sentiment analysis (Tao and Fang 2020).

In addition, for some sentiment analysis approaches, such as SentiWordNet-based analysis, further cleaning is essential, including stemming and POS Tagging.
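The basic cleaning steps above can be sketched as a single function; the contraction table, stop-word list, and regular expressions are deliberately tiny illustrations (the paper's pipeline uses fuller resources, e.g. emoji's demojize(), and maps repeated-character words to dictionary words rather than just truncating runs):

```python
import re

# Tiny illustrative contraction table and stop-word list; "not"/"no" are
# deliberately kept because they carry negation.
CONTRACTIONS = {"isn't": "is not", "don't": "do not", "can't": "can not"}
STOP_WORDS = {"the", "a", "an", "is", "to"}

def clean_tweet(text: str) -> str:
    text = text.lower()                                  # lower-case
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # links
    text = re.sub(r"[@#]\w+", " ", text)                 # hashtags, mentions
    for c, full in CONTRACTIONS.items():                 # expand contractions
        text = text.replace(c, full)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)           # truncate char runs
    text = re.sub(r"[^a-z\s]", " ", text)                # non-alphabetical chars
    return " ".join(t for t in text.split()              # stop/short words
                    if t in {"no", "not"} or (t not in STOP_WORDS and len(t) > 1))

print(clean_tweet("Sooooo goooood!!! Don't miss https://example.com #covid @user"))
# → "soo good do not miss"
```

Note that the run-truncation heuristic leaves “soo” rather than “so”; recovering the dictionary form requires a lexicon lookup, as described in the text.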

In this study, strategies for text cleaning, polarity calculation, and sentiment classification are designed and optimized using two different approaches to sentiment analysis: lexicon-based and machine-learning-based techniques. We then compare the different methods in terms of their output and prediction accuracy. The machine-learning-based approaches require labeled tweet data, but manually annotating a large amount of data takes considerable time. Hence, 3000 tweets are randomly sampled in this paper, with about 1000 tweets per sentiment category on average. To save labeling time, the classification results of the TextBlob or VADER method are used as the labels of the sample data (Naseem et al. 2021); we then manually check whether the VADER or TextBlob classification is correct and modify it when necessary.

4 Results and discussion

4.1 Lexicon-based approach

From Fig. 6, the results obtained by the TextBlob and VADER tools are similar, showing that positive sentiments appear more often than negative sentiments. However, the number of neutral sentiments from the VADER method is lower. This might be because the VADER lexicon efficiently handles the type of language used by social media users, for example by considering slang, Internet buzzwords, and abbreviations, whereas TextBlob works better with formal language. Moreover, the results of the SentiWordNet-based analysis show a high proportion of negative sentiments. This might be because some social media expressions of positive emotion are not comprehensively recorded in the dictionary. Additionally, due to its limited coverage of domain-specific words, some words may be assigned wrong scores, causing large deviations in the sentiment scores. Only the most common meaning of each word is considered in the SentiWordNet-based calculation, so some large bias might occur. Consequently, the results of the VADER method are the most convincing in this experiment.

Comparing public sentiment toward “Covid-19” and the “Covid-19 vaccine,” the classification results of all three approaches show that more people have positive sentiments than negative, indicating that most people expect the vaccine to have a good impact on Covid-19.

figure 6

a Sentiment classification statistics, b vaccine sentiment statistics

After using the lexicon-based approaches with TextBlob, VADER, and SentiWordNet-based methods, the sentiment scores and their classification results were obtained for each tweet. In this study, the three sentiment categories of positive, negative, and neutral sentiment correspond to 1, − 1, and 0, respectively, and we filter out the tweets in each city with their corresponding sentiment values (positive: 1, negative: − 1; and neutral: 0). The proportion of positive and negative sentiments in each city at each stage was calculated to compare how the sentiments change and to examine the differences in people’s perception of Covid-19 between these different cities.
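The mapping from continuous sentiment scores to the three classes can be sketched as below. The ±0.05 threshold follows VADER's conventional compound-score cut-offs; the score values themselves are illustrative stand-ins for TextBlob polarity or VADER compound scores:

```python
def to_label(score: float, threshold: float = 0.05) -> int:
    """Map a continuous sentiment score to 1 (positive), -1 (negative), 0 (neutral)."""
    if score > threshold:
        return 1
    if score < -threshold:
        return -1
    return 0

# Illustrative compound/polarity scores for three tweets.
scores = [0.62, -0.41, 0.01]
labels = [to_label(s) for s in scores]
print(labels)                                  # → [1, -1, 0]

# Per-city proportions of positive and negative tweets, as used in Fig. 7.
pos_share = labels.count(1) / len(labels)
neg_share = labels.count(-1) / len(labels)
print(pos_share, neg_share)
```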

Figure 7a shows the results of using TextBlob across the three stages. In most cities, the proportion of positive sentiments at each stage is between 38 and 50%. Southampton and Manchester show a steady decline, while Sheffield is the only city where the proportion of positive sentiments increased across all three stages. Over the entire period, Newcastle has the largest proportion of positive sentiments, peaking in the second stage (about 50%), and Southampton the lowest. For negative sentiments, Sheffield's trend differs from the other cities, rising first and then falling. In addition, for most cities, the proportion of negative sentiments is lowest in the second stage and lies between 20 and 30%.

figure 7

Results of the various lexicon-based approaches

The results of VADER shown in Fig. 7b are similar to those of TextBlob. The proportion of positive sentiment in most cities is 40–50%, increasing first and then falling, except in Sheffield. Additionally, negative sentiments mostly account for between 30 and 40%. The changes in the proportion of positive sentiments in Manchester and Leeds are relatively flat, and the proportion of negative sentiments in Manchester also changes smoothly. However, Nottingham shows a large change in positive sentiments at each stage, with a difference of about 6% between the highest and lowest values, and Newcastle shows a wide range in its negative-sentiment proportion.

Based on the results of the SentiWordNet-based approach shown in Fig. 7c, the proportion of negative sentiments in each city is higher than with the previous two methods. Most of the negative sentiments are in the range of 40–50%, while the proportion of positive sentiments is mostly between 36 and 46%. In terms of trends, the percentage of Birmingham's positive sentiments is declining, while Liverpool's positive-sentiment trend is the opposite of the other cities', decreasing first and then increasing.

Overall, according to the results of the three approaches, for most cities the proportion of positive sentiments first rises and then decreases, in contrast with the proportion of negative sentiments, which declines from the first stage to the second and then starts to increase. The number of Covid-19 deaths and confirmed cases can serve as an indicator of the severity of the pandemic. Meanwhile, an increase in the number of people vaccinated against Covid-19 can slow the spread of the virus among the population, thereby reducing the impact of the pandemic on people's lives.

Figure 8 shows the changes in the number of deaths, confirmed cases, and new vaccinations. After peaking at the beginning of the third national lockdown, the number of deaths began to decline and became stable after April 2021. In addition, the number of newly confirmed cases in 2021 shows a downward trend from January to May but has increased significantly since June. Moreover, from the perspective of vaccination, the peak period of vaccination in 2021 is mainly in April and May, while after June the vaccination volume drops greatly. Combined with the previous sentiment analysis results, the positive sentiment proportion increases in most cities from the first stage to the second stage. This might be related to the improved situation of the Covid-19 pandemic as well as the increased number of vaccinations. However, there is a drop in positive sentiments from stage two to stage three, and the negative proportion increases. This might be due to the overall sentiment toward the vaccine's protection rate and the large number of newly confirmed cases at the time. Overall, it might be that the public feels that the third lockdown policy and vaccination have not achieved the expected effect on the control of Covid-19 in England; hence, negative sentiments trend upward after the second stage. More analysis is needed to explain the change in sentiment trends more accurately.

figure 8

Trend of deaths, confirmed cases, and vaccines

4.2 Machine-learning-based approach

In this paper, supervised learning approaches also need to be considered because unsupervised lexicon-based approaches cannot be evaluated quantitatively. This part shows the classification performance of the three models (with a train/test split of 8:2) under different feature representation models (BoW, TF–IDF, and Word2Vec), together with the optimization training of the models.
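The 8:2 split with a TF–IDF feature representation can be sketched with scikit-learn; the toy tweets, the choice of LinearSVC for this snippet, and the parameter values are illustrative, not the paper's actual setup:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Toy labeled tweets (1 = positive, -1 = negative), invented for illustration.
texts = ["love the vaccine", "awful lockdown", "hope is here",
         "terrible news", "great progress", "hate these rules"] * 10
labels = [1, -1, 1, -1, 1, -1] * 10

# 8:2 train/test split, as in the paper.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels)

# TF-IDF features fitted on the training part only, to avoid leakage.
vec = TfidfVectorizer()
clf = LinearSVC().fit(vec.fit_transform(X_train), y_train)
print(clf.score(vec.transform(X_test), y_test))
```

Swapping TfidfVectorizer for CountVectorizer gives the BoW representation; a Word2Vec representation would instead average the word vectors of each tweet.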

4.2.1 The hyperparameters of classification models

Each classification model needs to extract the text features of tweets and vectorize them before training, and feature vectors of different forms may perform differently in the same classification model. Therefore, before training on the feature vectors, RandomizedSearchCV() is used to optimize the hyperparameters of each classifier. In the optimization process, the hyperparameters to be optimized are specified with a set of candidate values, and the result is the optimal solution within that hyperparameter grid. Table 2(a) presents the optimal hyperparameters of the random forest classifier, and Table 2(b) shows those of the Multinomial Naïve Bayes (MNB) and support vector (SVC) classifiers.

4.2.2 The evaluation results of classifiers

These models classify all tweets into three categories: negative, positive, and neutral. Table 3 shows their performance with different feature representations.

In this paper, Accuracy, Precision, Recall, and the F1 score are selected as evaluation indicators to measure the performance of each classification model. Before calculating them, the values of the confusion matrix need to be known: TP (true positives), TN (true negatives), FP (false positives), and FN (false negatives). Accuracy is the proportion of correct predictions among all observations:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Precision is the proportion of predicted positive observations that are actually positive:

Precision = TP / (TP + FP)

Recall is the proportion of actual positive observations that are identified correctly:

Recall = TP / (TP + FN)

The F1 score is a comprehensive evaluation that balances precision and recall:

F1 = 2 × Precision × Recall / (Precision + Recall)
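Under illustrative confusion-matrix counts, the four measures can be computed directly in Python:

```python
# Illustrative confusion-matrix counts for one class (not the paper's results).
TP, TN, FP, FN = 40, 35, 10, 15

accuracy  = (TP + TN) / (TP + TN + FP + FN)    # 75 / 100 = 0.75
precision = TP / (TP + FP)                     # 40 / 50  = 0.80
recall    = TP / (TP + FN)                     # 40 / 55  ≈ 0.727

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, round(recall, 3), round(f1, 3))
```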

According to the classification results of the three models, these classifiers perform poorly on tweets with negative labels, especially the Random Forest classifier, which has a low ability to recognize negative tweets even though its prediction precision is high. A possible reason is that the labels were annotated with the help of unsupervised lexicon-based methods, whose output can differ from the real sentiment expressed in the tweets. For the overall prediction, the SVC model has the best predictive ability, with an accuracy of 0.71. Additionally, the F1 values of each label show that the SVC model classifies the three sentiment categories well.

The accuracy of the three models is relatively high with the TF–IDF method, all above 60%. However, similar to the experimental results using the BoW feature representation, in Random Forest Classifier, the recall value of the negative category is very low, indicating that there are many negative tweets in the test dataset that have not been identified. This may be caused by the imbalanced distribution of data in each category, or the category contains some wrong data that would affect the training results. Moreover, these three models have the best predictive effect on the positive category, with an F1 score above 0.7. In summary, the performance of the SVC model is the best and the accuracy is higher than 70% in our study.

The prediction results of the three classifiers with Word2Vec are not as good as with the previous two feature representation models, especially for the identification of negative sentiments. The reason for the poor performance is that the Word2Vec embedding method needs to group semantically similar words, which requires a large amount of data, and it is difficult to extract sufficient text feature vectors from a small dataset. Moreover, compared with the Multinomial Naïve Bayes classifier, the SVC model and Random Forest classifier have better prediction performance, with accuracies of 0.56 and 0.53, respectively.

5 Conclusion

In conclusion, this paper extracts Twitter data regarding Covid-19 from people in the main cities of England and separates it into three different stages. First, we perform data cleaning and use unsupervised lexicon-based approaches to classify the sentiment orientations of the tweets at each stage. Then, we apply supervised machine learning approaches, using a sample of annotated data to train the Random Forest classifier, the Multinomial Naïve Bayes classifier, and the SVC, respectively. From the lexicon-based approaches, the three stages of public sentiment change about the Covid-19 pandemic can be identified. For most cities, the proportion of positive sentiments increases first and then drops, while the proportion of negative sentiments moves in the opposite direction. In addition, by analyzing the number of deaths and confirmed cases as well as the vaccination situation, it could be concluded that the increase in confirmed cases and the decrease in vaccination volume might be the reason for the increase in negative sentiments, although further research is needed to confirm this inference.

For supervised machine learning classifiers, the Random Search method is applied to optimize the hyperparameters of each model. The SVC results using BoW and TF–IDF feature models have the best performance, and their classification accuracy is as high as 71%. Due to the insufficiency of training data, the prediction accuracy of classifiers with the Word2Vec embedding method is low. Consequently, applying machine learning approaches to sentiment analysis can accurately extract text features without being restricted by lexicons.

It is important to note that this paper only collects the opinions about Covid-19 of people in England on Twitter; thus, the results should be interpreted with this limitation in mind. To obtain a more convincing conclusion, we could increase the data size by incorporating a longer timeline or wider geographies, or by collecting data via other social media platforms while also considering data protection policies. In addition, large-scale manually annotated datasets could be created to train the machine learning models and improve their classification ability. Moreover, deep learning approaches could be used for model training and compared with the different machine learning models. Furthermore, the Random Search method can only find the optimal parameters within a certain range, so exploring how to select model hyperparameters efficiently could further improve the stability of machine learning models. Despite these limitations, this study has contributed to advancing our understanding of the use of various NLP methods.

For lexicon-based approaches, the existing lexicon can be modified to better fit the language habits of modern social media, improving the accuracy of this approach. Additionally, an annotated dataset can be created to compare the difference between predicted and real results. Research on Covid-19 can be based on time series so that changes in people's attitudes and perceptions can be analyzed over time. Moreover, further studies can combine the sentiment classification results with other factors, such as deaths and vaccination rates, and establish a regression model to analyze which factors contribute to sentiment changes. Overall, the paper has showcased different methods of conducting sentiment analysis, with SVC using BoW or TF–IDF achieving the best overall model accuracy.

6 The code of the project

The main code of this project has been uploaded to GitHub: https://github.com/Yuxing-Qi/Sentiment-analysis-using-Twitter-data

Adwan OY, Al-Tawil M, Huneiti AM, Shahin RA, Abu Zayed AA, Al-Dibsi RH (2020) Twitter sentiment analysis approaches: a survey. Int J Emerg Technol Learn. https://doi.org/10.3991/ijet.v15i15.14467


al Amrani Y, Lazaar M, el Kadirp KE (2018) Random forest and support vector machine based hybrid approach to sentiment analysis. Proc Comput Sci. https://doi.org/10.1016/j.procs.2018.01.150

Alamoodi AH, Zaidan BB, Zaidan AA, Albahri OS, Mohammed KI, Malik RQ, Almahdi EM, Chyad MA, Tareq Z, Albahri AS, Hameed H, Alaa M (2021) Sentiment analysis and its applications in fighting COVID-19 and infectious diseases: a systematic review. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2020.114155

Aqlan AAQ, Manjula B, Lakshman Naik R (2019) A study of sentiment analysis: Concepts, techniques, and challenges. In Lecture notes on data engineering and communications technologies, vol 28. https://doi.org/10.1007/978-981-13-6459-4_16

Arun K, Srinagesh A (2020a) Multi-lingual Twitter sentiment analysis using machine learning. Int J Electr Comput Eng. https://doi.org/10.11591/ijece.v10i6.pp5992-6000


Baccianella S, Esuli A, Sebastiani F (2010) SENTIWORDNET 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In: Proceedings of the 7th international conference on language resources and evaluation, LREC 2010

Barbosa L, Feng J (2010) Robust sentiment detection on twitter from biased and noisy data. In: Coling 2010—23rd international conference on computational linguistics, proceedings of the conference, 2

Bonta V, Kumaresh N, Janardhan N (2019) A comprehensive study on Lexicon based approaches for sentiment analysis. Asian J Comput Sci Technol 8(S2):1–6. https://doi.org/10.51983/ajcst-2019.8.s2.2037

Boon-Itt S, Skunkan Y (2020) Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study. JMIR Public Health and Surveillance, 6(4), e21978. https://doi.org/10.2196/21978

Breiman L (2001) Random forests. Mach Learn. https://doi.org/10.1023/A:1010933404324

Cambria E (2016) Affective computing and sentiment analysis. IEEE Intell Syst. https://doi.org/10.1109/MIS.2016.31

D’Andrea A, Ferri F, Grifoni P, Guzzo T (2015) Approaches, tools and applications for sentiment analysis implementation. Int J Comput Appl. https://doi.org/10.5120/ijca2015905866

Dashtipour K, Poria S, Hussain A, Cambria E, Hawalah AYA, Gelbukh A, Zhou Q (2016) Multilingual sentiment analysis: state of the art and independent comparison of techniques. Cogn Comput. https://doi.org/10.1007/s12559-016-9415-7

Guerini M, Gatti L, Turchi M (2013) Sentiment analysis: how to derive prior polarities from SentiWordNet. In: EMNLP 2013—2013 conference on empirical methods in natural language processing, proceedings of the conference

Hutto CJ, Gilbert E (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. In: Proceedings of the 8th international conference on weblogs and social media, ICWSM 2014. https://doi.org/10.1609/icwsm.v8i1.14550

Jiang B, Ma D, Yin J, Sandberg M (2016) Spatial distribution of city Tweets and their densities. Geogr Anal. https://doi.org/10.1111/gean.12096

Kamble SS, Itkikar PAR (2018) Study of supervised machine learning approaches for sentiment analysis. Int Res J Eng Technol (IRJET) 05(04)

Kim Y (2014) Convolutional neural networks for sentence classification. In: EMNLP 2014—2014 conference on empirical methods in natural language processing, proceedings of the conference. https://doi.org/10.3115/v1/d14-1181

Lo SL, Cambria E, Chiong R, Cornforth D (2017) Multilingual sentiment analysis: from formal to informal and scarce resource languages. Artif Intell Rev. https://doi.org/10.1007/s10462-016-9508-4

Mostafa L (2021) Egyptian student sentiment analysis using Word2vec during the coronavirus (Covid-19) pandemic. In Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2020 (pp. 195-203). Springer International Publishing. https://doi.org/10.1007/978-3-030-58669-0_18

Naseem U, Razzak I, Khushi M, Eklund PW, Kim J (2021) COVIDSenti: a large-scale benchmark Twitter data Set for COVID-19 sentiment analysis. IEEE Trans Comput Soc Syst. https://doi.org/10.1109/TCSS.2021.3051189

Oyebode O, Orji R (2019) Social media and sentiment analysis: the Nigeria presidential election 2019. In: 2019 IEEE 10th annual information technology, electronics and mobile communication conference, IEMCON 2019. https://doi.org/10.1109/IEMCON.2019.8936139

Ray P, Chakrabarti A (2022) A mixed approach of deep learning method and rule-based method to improve aspect level sentiment analysis. Appl Comput Inform. https://doi.org/10.1016/j.aci.2019.02.002

Saleh SN, Lehmann CU, McDonald SA, Basit MA, Medford RJ (2021) Understanding public perception of coronavirus disease 2019 (COVID-19) social distancing on Twitter. Infect Control Hosp Epidemiol 42(2):131–138. https://doi.org/10.1017/ice.2020.406

Sarker MK, Zhou L, Eberhart A, Hitzler P (2021) Neuro-symbolic artificial intelligence. AI Commun. https://doi.org/10.3233/AIC-210084


Sloan L, Morgan J (2015) Who tweets with their location? Understanding the relationship between demographic characteristics and the use of geoservices and geotagging on twitter. PLoS ONE. https://doi.org/10.1371/journal.pone.0142209

Soleymani M, Garcia D, Jou B, Schuller B, Chang SF, Pantic M (2017) A survey of multimodal sentiment analysis. Image vis Comput. https://doi.org/10.1016/j.imavis.2017.08.003

Tao J, Fang X (2020) Toward multi-label sentiment analysis: a transfer learning based approach. J Big Data. https://doi.org/10.1186/s40537-019-0278-0

Usop ES, Isnanto RR, Kusumaningrum R (2017) Part of speech features for sentiment classification based on Latent Dirichlet allocation. In: Proceedings—2017 4th international conference on information technology, computer, and electrical engineering, ICITACEE 2017, 2018-January. https://doi.org/10.1109/ICITACEE.2017.8257670

Wang Z, Ho SB, Cambria E (2020) Multi-level fine-scaled sentiment sensing with ambivalence handling. Int J Uncertain Fuzziness Knowl-Based Syst. https://doi.org/10.1142/S0218488520500294

Wang Z, Joo V, Tong C, Chan D (2015) Issues of social data analytics with a new method for sentiment analysis of social media data. In: Proceedings of the international conference on cloud computing technology and science, CloudCom, 2015-February(February). https://doi.org/10.1109/CloudCom.2014.40

Yang B, Shao B, Wu L, Lin X (2022) Multimodal sentiment analysis with unidirectional modality translation. Neurocomputing. https://doi.org/10.1016/j.neucom.2021.09.041

Zahoor S, Rohilla R (2020) Twitter sentiment analysis using lexical or rule based approach: a case study. In: ICRITO 2020—IEEE 8th international conference on reliability, Infocom technologies and optimization (trends and future directions). https://doi.org/10.1109/ICRITO48877.2020.9197910

Zhang Y, Rong L, Song D, Zhang P (2020) A survey on multimodal sentiment analysis. In Moshi Shibie yu Rengong Zhineng/pattern recognition and artificial intelligence, vol 33, issue 5. https://doi.org/10.16451/j.cnki.issn1003-6059.202005005


Author information

Authors and affiliations.

Centre for Urban Science and Progress, King’s College London, London, UK

Department of Geography, King’s College, London, UK

Zahratu Shabrina

Regional Innovation, Graduate School, Universitas Padjadjaran, Bandung, Indonesia


Contributions

Z.S. and Y.Q. conceived the presented idea. Y.Q. conducted the data gathering, analysis, and drafted the main manuscript. Z.S. wrote and edited the final version of the manuscript and supervised the project. All authors provided critical feedback and helped shape the research, analysis, and manuscript.

Corresponding author

Correspondence to Zahratu Shabrina .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Qi, Y., Shabrina, Z. Sentiment analysis using Twitter data: a comparative application of lexicon- and machine-learning-based approach. Soc. Netw. Anal. Min. 13, 31 (2023). https://doi.org/10.1007/s13278-023-01030-x

Download citation

Received: 01 June 2022

Revised: 16 January 2023

Accepted: 20 January 2023

Published: 09 February 2023

DOI: https://doi.org/10.1007/s13278-023-01030-x




COMMENTS

  1. Recent advancements and challenges of NLP-based sentiment analysis: A

    RQ3: Which research papers on sentiment analysis were published from 2020 to 2023, and what application domains do they cover? RQ4: Which datasets are commonly used for sentiment analysis experiments? ... 2019), which is a left-to-right, autoregressive, generative transformer-based language model, and scaled it up to 530 billion parameters. The ...

  2. More than a Feeling: Accuracy and Application of Sentiment Analysis

This research addresses these shortcomings by considering about 70 times more sentiment datasets than prior work (Hartmann et al., 2019). Specifically, we conduct a comprehensive meta-analytical assessment, covering 217 publications, more than 1,100 experimental results, nearly 12 million sentiment-labeled text documents, and 272 unique datasets. Based on this data, we can study and explain ...

  3. Sentiment Analysis in Social Media and Its Application: Systematic

    This paper focuses to provide a better understanding of the application of sentiment analysis in social media platform by examining related literature published between 2014 to 2019. Sentiment analysis is an approach that uses Natural Language Processing (NLP) to extract, convert and interpret opinion from a text and classify them into positive ...

  4. A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research

    Sentiment analysis is a critical subfield of natural language processing that focuses on categorizing text into three primary sentiments: positive, negative, and neutral. With the proliferation of online platforms where individuals can openly express their opinions and perspectives, it has become increasingly crucial for organizations to comprehend the underlying sentiments behind these ...

  5. A review on sentiment analysis and emotion detection from text

    In sentiment analysis, polarity is the primary concern, whereas, in emotion detection, the emotional or psychological state or mood is detected. Sentiment analysis is exceptionally subjective, whereas emotion detection is more objective and precise. Section 2.2 describes all about emotion detection in detail.

  6. Sentiment Analysis in Health and Well-Being: Systematic Review

    Sentiment analysis (SA) is a subfield of natural language processing whose aim is to automatically classify the sentiment expressed in a free text. It has found practical applications across a wide range of societal contexts including marketing, economy, and politics. This review focuses specifically on applications related to health, which is ...

  7. Sentiment Analysis

    **Sentiment Analysis** is the task of classifying the polarity of a given text. For instance, a text-based tweet can be categorized into either "positive", "negative", or "neutral". Given the text and accompanying labels, a model can be trained to predict the correct sentiment. **Sentiment Analysis** techniques can be categorized into machine learning approaches, lexicon-based approaches, and ...

  8. Revisiting Sentiment Analysis for Software Engineering in the Era of

    Zulfadzli Drus and Haliyana Khalid. 2019. Sentiment analysis in social media and its application: Systematic literature review. ... Daniela Girardi, and Filippo Lanubile. 2018. A benchmark study on sentiment analysis for software engineering research. In Proceedings of the 15th International Conference on Mining Software Repositories. 364-375 ...

  9. Longitudinal analysis of sentiment and emotion in news media ...

    This work describes a chronological (2000-2019) analysis of sentiment and emotion in 23 million headlines from 47 news media outlets popular in the United States. We use Transformer language models fine-tuned for detection of sentiment (positive, negative) and Ekman's six basic emotions (anger, disgust, fear, joy, sadness, surprise) plus neutral to automatically label the headlines.

  10. A survey on sentiment analysis methods, applications, and challenges

    The rapid growth of Internet-based applications, such as social media platforms and blogs, has resulted in comments and reviews concerning day-to-day activities. Sentiment analysis is the process of gathering and analyzing people's opinions, thoughts, and impressions regarding various topics, products, subjects, and services. People's opinions can be beneficial to corporations, governments ...

  11. Systematic reviews in sentiment analysis: a tertiary study

    With advanced digitalisation, we can observe a massive increase of user-generated content on the web that provides opinions of people on different subjects. Sentiment analysis is the computational study of analysing people's feelings and opinions for an entity. The field of sentiment analysis has been the topic of extensive research in the past decades. In this paper, we present the results of ...

  12. A Study of Sentiment Analysis: Concepts, Techniques, and Challenges

Abstract: Sentiment analysis (SA) is a process of extensive exploration of data stored on the Web to identify and categorize the views expressed in a part of the text. The intended outcome of ...

  13. An Analysis of Sentiment: Methods, Applications, and Challenges

    The study of sentiment is a fast-growing and dynamic research field with numerous applications. The purpose is to enhance the analysis of sentiment performance and address challenges related to this topic. Based on different viewpoints, it is important to integrate the current techniques into sentiment analysis.

  14. A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research

    This paper offers an overview of the latest advancements in sentiment analysis, including preprocessing techniques, feature extraction methods, classification techniques, widely used datasets, and ...

  15. A systematic review of social media-based sentiment analysis: Emerging

    Among the 40 papers investigated by this review paper, 29 of them use datasets for sentiment analysis in English. Research on sentiment analysis in English has yielded significant achievements, advancing not only to adapt the state-of-the-art theories in the fields of lexicon approaches [1], [19], [29] and ML approaches [15], [35], [39], [41 ...

  16. [1801.07883] Deep Learning for Sentiment Analysis : A Survey

    Lei Zhang, Shuai Wang, Bing Liu. View a PDF of the paper titled Deep Learning for Sentiment Analysis : A Survey, by Lei Zhang and 2 other authors. Deep learning has emerged as a powerful machine learning technique that learns multiple layers of representations or features of the data and produces state-of-the-art prediction results.

  17. A review on sentiment analysis and emotion detection from text

Levels of sentiment analysis: Sentiment analysis is possible at three levels: sentence level, document level, and aspect level. At the sentence-level or phrase-level sentiment analysis, documents or paragraphs are broken down into sentences, and each sentence's polarity is identified (Meena and Prabhakar 2007; Arulmurugan et al. 2019; Shirsat et al. 2019).

  18. (PDF) Sentiment Analysis

Sentiment analysis of social media data related to research papers at scale can give a sense of what people think about research and how they are engaging with it [21-23].

  19. A Hybrid Approach to Dimensional Aspect-Based Sentiment Analysis ...

    Dimensional aspect-based sentiment analysis (dimABSA) aims to recognize aspect-level quadruples from reviews, offering a fine-grained sentiment description for user opinions. A quadruple consists of aspect, category, opinion, and sentiment intensity, which is represented using continuous real-valued scores in the valence-arousal dimensions. To address this task, we propose a hybrid approach ...

  20. Exploring Sentiment Analysis Techniques in Natural Language Processing

    Sentiment analysis is the process of recognizing and extracting subjective information from textual data. It includes analyzing opinions, attitudes, emotions, and feelings articulated in a text and categorizing them as positive, negative, or neutral sentences [1]. SA has gained a lot of popularity in recent years due to the abundance of user ...

  21. [2203.01054] A Survey on Aspect-Based Sentiment Analysis: Tasks

    Wenxuan Zhang, Xin Li, Yang Deng, Lidong Bing, Wai Lam. View a PDF of the paper titled A Survey on Aspect-Based Sentiment Analysis: Tasks, Methods, and Challenges, by Wenxuan Zhang and 4 other authors. As an important fine-grained sentiment analysis problem, aspect-based sentiment analysis (ABSA), aiming to analyze and understand people's ...

  22. Sentiment analysis methods, applications, and ...

The types of sentiment we can find include positive, neutral and negative, and can be further divided into surprise, trust, anticipation, anger, fear, sadness, disgust, joy and so on (Bose et al., 2020). From a language perspective, sentiment analysis research can use various types of natural languages, such as Chinese (Peng et al., 2017), English (Rodríguez-Ibánez et al., 2023), Arabic ...

  23. Survey on sentiment analysis: evolution of research methods and topics

    Sentiment analysis, one of the research hotspots in the natural language processing field, has attracted the attention of researchers, and research papers on the field are increasingly published. ... Zunic et al. selected 86 papers from 299 papers retrieved in the period 2011-2019 to discuss the application of sentiment analysis techniques in ...

  24. Sentiment analysis: A survey on design framework, applications and

    Sentiment analysis is a solution that enables the extraction of a summarized opinion or minute sentimental details regarding any topic or context from a voluminous source of data. Even though several research papers address various sentiment analysis methods, implementations, and algorithms, a paper that includes a thorough analysis of the process for developing an efficient sentiment analysis ...

  25. A Review of Sentiment Analysis in Social Media Perspectives

researchers. Since the year 2004, sentiment analysis research ... The paper is organized as ... about Oman Tourism," 2019 4th MEC International ...

  26. SmartRAN: Smart Routing Attention Network for multimodal sentiment analysis

    Sentiment analysis, also known as sentiment detection or emotion recognition, is an important research topic in natural language processing (NLP) [1, 2]; it is designed to identify the sentiment polarity contained in information and is widely used in fields such as social media analysis, marketing, and consumer behavior analysis.With the rapid development of online platforms such as YouTube ...

  27. Bangla Sentiment Analysis On Highly Imbalanced Data ...

    [Show full abstract] This paper aims to provide a comprehensive overview of the conducted research on sentiment analysis and sarcasm detection, focusing on the time from 2018 to 2023. It explores ...

  28. Sentiment analysis using Twitter data: a comparative application of

In this paper, we implement social media data analysis to explore sentiments toward Covid-19 in England. ... We selected several critical time points for research and analysis in stages according to the plan of lifting the lockdown in England, and the duration of each stage is about two months. ... Lakshman Naik R (2019) A study of sentiment ...
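Several of the entries above, including the featured article, contrast lexicon-based with machine-learning approaches. As a rough illustration of the lexicon side, here is a minimal sketch of a polarity scorer; the word lists are invented for the example, whereas real systems use curated lexicons such as VADER or SentiWordNet and handle negation, intensifiers, and tokenization far more carefully.

```python
# Toy polarity lexicons -- illustrative only, not a real sentiment lexicon.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def classify(text: str) -> str:
    """Label text positive/negative/neutral by counting lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(classify("I love this great phone"))    # positive
print(classify("what a terrible awful day"))  # negative
print(classify("the phone arrived monday"))   # neutral
```

The appeal of this family of methods is that it needs no labeled training data, at the cost of missing context-dependent polarity.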
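For the machine-learning side of the same comparison, a classic baseline is a bag-of-words Naive Bayes classifier trained on labeled examples. The sketch below is a minimal from-scratch version with Laplace smoothing; the four training documents are made up for the example, and real experiments would use a benchmark corpus and a tuned model.

```python
from collections import Counter, defaultdict
import math

def train(docs):
    """docs: list of (text, label) pairs. Returns (label counts, per-label word counts, vocab)."""
    labels = Counter(label for _, label in docs)
    counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        for w in text.lower().split():
            counts[label][w] += 1
            vocab.add(w)
    return labels, counts, vocab

def predict(model, text):
    """Pick the label maximizing log P(label) + sum of log P(word|label)."""
    labels, counts, vocab = model
    total = sum(labels.values())
    best, best_lp = None, float("-inf")
    for label in labels:
        lp = math.log(labels[label] / total)
        n = sum(counts[label].values())
        for w in text.lower().split():
            # Laplace (add-one) smoothing so unseen words don't zero out the score
            lp += math.log((counts[label][w] + 1) / (n + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train([("i love this movie", "positive"),
               ("great film loved it", "positive"),
               ("i hate this movie", "negative"),
               ("terrible film hated it", "negative")])
print(predict(model, "i loved this great movie"))  # positive
```

Unlike the lexicon approach, the polarity of each word is learned from the labeled data rather than fixed in advance, which is why the surveyed comparisons treat the two as complementary.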