This paper is published in Volume-2, Issue-6, 2016
Area
Data Science
Author
Shirdi Wazeed Baba, Reddi Sanjeev Kumar
Org/Univ
Indian Institute of Information Technology, Design and Manufacturing, Jabalpur, (M.P), India
Pub. Date
03 December, 2016
Paper ID
V2I6-1175
Publisher
Keywords
Apriori algorithm, Association rule, confidence, support, Naïve Bayes classifier, Text classification.

Citations

IEEE
Shirdi Wazeed Baba, Reddi Sanjeev Kumar. Data Mining : Text Classification System for Classifying Abstracts of Research Papers, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Shirdi Wazeed Baba, Reddi Sanjeev Kumar (2016). Data Mining : Text Classification System for Classifying Abstracts of Research Papers. International Journal of Advance Research, Ideas and Innovations in Technology, 2(6) www.IJARIIT.com.

MLA
Shirdi Wazeed Baba, Reddi Sanjeev Kumar. "Data Mining : Text Classification System for Classifying Abstracts of Research Papers." International Journal of Advance Research, Ideas and Innovations in Technology 2.6 (2016). www.IJARIIT.com.

Abstract

Text classification is the process of assigning documents to predefined categories based on their content. It is a primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as producing summaries, answering questions, or extracting data. We propose a text classification system for classifying the abstracts of research papers. In this system, keywords are extracted using a tokenizer and the Porter stemmer. Word sets are then formed from the derived keywords using association rules and the Apriori algorithm. The probability of each word set is calculated using a naive Bayes classifier, and a new abstract entered by the user is then classified as belonging to one of the classes. The accuracy of the system is found to be satisfactory, and it requires less training data than other classification systems.
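The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the stopword list, the toy training abstracts, the class labels, the `min_support` threshold, and all function names are assumptions made for illustration. Stemming is omitted to keep the example dependency-free, and the naive Bayes step is assumed to be a Bernoulli model with Laplace smoothing over word-set presence, which the abstract does not specify.

```python
import math
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical stopword list; a real system would use a fuller one plus Porter stemming.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords; returns a set of keywords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def frequent_word_sets(docs, min_support=0.5, max_size=2):
    """Apriori-style level-wise search for word sets with support >= min_support."""
    n = len(docs)
    candidates = {frozenset([w]) for d in docs for w in d}
    frequent = {}
    size = 1
    while candidates and size <= max_size:
        level = {}
        for c in candidates:
            support = sum(1 for d in docs if c <= d) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        size += 1
        # Join: merge frequent sets into candidates one item larger.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Prune: every (size - 1)-subset of a candidate must itself be frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent

def train(labeled_abstracts, min_support=0.5):
    """Mine word sets per class, then estimate P(word set present | class)."""
    by_class = defaultdict(list)
    for text, label in labeled_abstracts:
        by_class[label].append(tokenize(text))
    features = set()
    for docs in by_class.values():
        features |= set(frequent_word_sets(docs, min_support))
    total = sum(len(docs) for docs in by_class.values())
    model = {}
    for label, docs in by_class.items():
        prior = len(docs) / total
        # Laplace smoothing keeps every probability strictly between 0 and 1.
        likelihood = {f: (sum(1 for d in docs if f <= d) + 1) / (len(docs) + 2)
                      for f in features}
        model[label] = (prior, likelihood)
    return model

def classify(model, text):
    """Return the class with the highest log posterior under Bernoulli naive Bayes."""
    words = tokenize(text)
    scores = {}
    for label, (prior, likelihood) in model.items():
        score = math.log(prior)
        for f, p in likelihood.items():
            score += math.log(p if f <= words else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data (invented for this sketch).
training = [
    ("Mining association rules with the Apriori algorithm", "data_mining"),
    ("Frequent itemsets and association rules in transaction data", "data_mining"),
    ("Routing protocols for wireless sensor networks", "networks"),
    ("Energy efficient routing in wireless networks", "networks"),
]
model = train(training)
print(classify(model, "Association rules for market basket data"))
```

With only two abstracts per class, a support threshold of 0.5 makes every word and every co-occurring pair frequent, so the threshold would need tuning on a realistic corpus.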