Twitter Sentiment Analysis
\section{Introduction}
Many new forms of communication, such as text messaging, have emerged in recent decades and have become very popular and important. They carry a wide range of information and are commonly used to share feelings and opinions about different events and topics. We worked on the following task:
\begin{itemize}
\item Given a message, classify whether it expresses positive, negative, or neutral sentiment. For messages conveying both positive and negative sentiment, choose the stronger sentiment.
\end{itemize}
\pagebreak
\section{Motivation}
We encounter many challenges when working with informal texts like tweets, as opposed to traditional texts such as newswire data. Tweets are generally short and to the point: they usually run to a sentence or two. This makes language use very informal, with many newly coined spellings, slang, and new abbreviations such as tc for ``take care'', gr8 for ``great'', and so on. On top of this come hashtags, which serve to tag Twitter posts. Recently, the task of handling such challenges and automatically understanding the opinions conveyed by these tweets has become very popular and has been the subject of research.

An important aspect of tweets is that they contain highly structured data about different aspects of real communication, such as location, language, individuals, and time. Twitter keeps track of various relevant information in JSON format, and we can model this information and put it to good use. This associated information is useful for a variety of purposes, including but not ...... middle of paper ...... on training tweets. Our method achieves good accuracy with a relatively small data set.
\pagebreak
\section{Future Work}
\begin{itemize}
\item We have covered most of the features relevant to our classification. However, we did not study the effect of the following features on classification accuracy:
\begin{itemize}
\item Handling the emotions conveyed by abbreviations.
\item Analyzing whether later sentences in a tweet are more important (for example, giving more weight to the $2^{nd}$ line in a 2-line tweet).
\end{itemize}
\item Although it is clear from the work done by others on the same problem that SVM tends to perform better than other classifiers, it would be interesting to see how a hybrid of other classifiers (such as the naive Bayes classifier) with SVM would perform. (In our work, we tried a hybrid of bag-of-words with SVM, which improved accuracy.)
\end{itemize}
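
The two unexplored features listed under Future Work (abbreviation handling and position-dependent weighting) could be prototyped roughly as follows. This is only a minimal sketch under stated assumptions, not the method used in this work: the lookup table \texttt{ABBREVIATIONS} is hypothetical and contains just the two examples mentioned in the Motivation section, and the function names, the tokenizer pattern, and the \texttt{later\_weight} parameter are illustrative choices.

```python
import re
from collections import Counter

# Hypothetical lookup table: only "tc" and "gr8" come from the text;
# a real system would need a much larger, curated dictionary.
ABBREVIATIONS = {"tc": "take care", "gr8": "great"}


def normalize(text):
    """Lowercase, tokenize, and expand known abbreviations."""
    # Keep letters, digits, apostrophes, and the Twitter markers # and @.
    tokens = re.findall(r"[a-z0-9#@']+", text.lower())
    expanded = []
    for tok in tokens:
        # An abbreviation may expand to several words, e.g. "tc".
        expanded.extend(ABBREVIATIONS.get(tok, tok).split())
    return expanded


def weighted_bag_of_words(tweet, later_weight=2.0):
    """Bag-of-words counts in which the final sentence of a
    multi-sentence tweet counts later_weight times.

    The weighting scheme is an assumption made for illustration.
    """
    sentences = [s for s in re.split(r"[.!?\n]+", tweet) if s.strip()]
    bag = Counter()
    for i, sentence in enumerate(sentences):
        is_last_of_many = i > 0 and i == len(sentences) - 1
        weight = later_weight if is_last_of_many else 1.0
        for tok in normalize(sentence):
            bag[tok] += weight
    return dict(bag)


if __name__ == "__main__":
    print(normalize("Gr8 game, tc!"))
    print(weighted_bag_of_words("The start was slow. The ending was gr8!"))
```

The resulting token-to-weight dictionaries could then be vectorized and passed to a classifier such as an SVM, which would allow the effect of \texttt{later\_weight} on accuracy to be measured empirically.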