This Is Auburn: Electronic Theses and Dissertations

Predictive Text Analytics and Text Classification Algorithms


Metadata Field | Value | Language
dc.contributor.advisor | Carpenter, Mark | en_US
dc.contributor.author | Yucel, Ahmet | en_US
dc.date.accessioned | 2016-05-23T18:52:53Z
dc.date.available | 2016-05-23T18:52:53Z
dc.date.issued | 2016-05-23
dc.identifier.uri | http://hdl.handle.net/10415/5216
dc.description.abstract | This dissertation presents three research studies based mainly on text analysis. In the first study, sentiment analysis is performed to extract and identify the general rating of customer reviews for certain products. Classifying the sentiment of online product reviews is important because it extracts critical information that can be used to improve quality. Machine learning (ML) algorithms can be used effectively to analyze and thus automatically classify the reviews. The objective of this study is to develop a numerical composite variable from unstructured data for estimating the star ratings of customer reviews from different domains, employing popular tree-based ML algorithms with five-fold cross-validation incorporated into the models. In the second study, a specialized text classification is used to extract and identify the subjective content of customer reviews. Classifying people’s feedback on a particular subject is vital for analysts seeking to understand public behavior. Especially for organizations dealing with large bodies of data consisting of people’s reviews, understanding the reviews’ contents and classifying them by their subjective information is very important. Although information technology has modernized the process of data gathering, state-of-the-art methods are required to handle the available big data. Traditional methods, on the other hand, are not capable of delivering profound insights into unstructured feedback. Therefore, institutions are seeking novel methods for text analysis. Text mining (TM) is a machine-learning approach for dealing with people’s reviews that can provide valuable insights into their feedback. This study proposes the creation of composite variables for the learning process and uses a Multilayer Perceptron-based artificial neural network.
In the third study, a Turkish TM algorithm is developed for automatically grading written exam papers. Algorithms based on Turkish grammar and natural language processing are built from the answer key prepared by the grader and then applied to the students’ answer papers. The main idea of this study is to build a Turkish-language TM tool that grades exam papers written in Turkish. | en_US
dc.subject | Mathematics and Statistics | en_US
dc.title | Predictive Text Analytics and Text Classification Algorithms | en_US
dc.type | Dissertation | en_US
dc.embargo.status | NOT_EMBARGOED | en_US
