This paper is published in Volume-7, Issue-1, 2021
Area
Computer Science Engineering
Author
Raktim Chatterjee, Sukanya Bhattacharya, Soumyajeet Kabi
Org/Univ
Guru Nanak Institute of Technology, Kolkata, West Bengal, India
Pub. Date
17 February, 2021
Paper ID
V7I1-1269
Keywords
Linguistic, Classification, Features, Profane, Learning

Citations

IEEE
Raktim Chatterjee, Sukanya Bhattacharya, Soumyajeet Kabi. Profanity detection in social media text using a hybrid approach of NLP and machine learning, International Journal of Advance Research, Ideas and Innovations in Technology, www.IJARIIT.com.

APA
Raktim Chatterjee, Sukanya Bhattacharya, Soumyajeet Kabi (2021). Profanity detection in social media text using a hybrid approach of NLP and machine learning. International Journal of Advance Research, Ideas and Innovations in Technology, 7(1) www.IJARIIT.com.

MLA
Raktim Chatterjee, Sukanya Bhattacharya, Soumyajeet Kabi. "Profanity detection in social media text using a hybrid approach of NLP and machine learning." International Journal of Advance Research, Ideas and Innovations in Technology 7.1 (2021). www.IJARIIT.com.

Abstract

Profanity is socially offensive language, also called cursing, cussing, swearing, or expletives. Now that so much of daily life is managed digitally, people use a large number of online platforms and forums. On a social media platform such as Twitter, for example, the privacy policy indicates that users may not share or write obscene or vulgar language in public. Several corporate and research organizations study how such content can be found and controlled: computer vision research has developed methods to detect illegal practices in public spaces, and NLP has progressed to detect profanity in social media text. However, existing profanity detection systems remain flawed because of various factors. In this paper, we define and analyze a system that uses NLP and machine learning to address this problem. Profanity detection is usually framed as a supervised learning problem. Generic features such as bag-of-words or embeddings consistently deliver fair classification performance. Combining lexical resources with models such as a linear Support Vector Machine (SVM), and adding features that model specific linguistic constructs, makes classification more effective.
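
The combination described in the abstract (bag-of-words features, a lexical resource, and a linear SVM) can be illustrated with a minimal sketch. This is not the authors' implementation. The lexicon, the toy training sentences, the function names, and the hyperparameters are all assumptions made for illustration, and a small pure-Python subgradient trainer on the hinge loss stands in for a library SVM.

```python
import re
from collections import Counter

# Hypothetical placeholder lexicon (assumption); a real system would load
# a curated profanity word list instead of these mild stand-ins.
LEXICON = {"darn", "heck", "crud"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def featurize(text):
    """Sparse features: bag-of-words counts, a lexicon-hit count, and a bias."""
    tokens = tokenize(text)
    feats = Counter("bow=" + t for t in tokens)
    feats["lexicon_hits"] = sum(t in LEXICON for t in tokens)
    feats["bias"] = 1
    return feats

def score(weights, feats):
    """Linear decision value w . x over the sparse feature dict."""
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())

def train_linear_svm(samples, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularized hinge loss (a linear SVM).

    Labels are +1 (profane) and -1 (clean). Hyperparameters are assumptions.
    """
    weights = {}
    data = [(featurize(text), label) for text, label in samples]
    for _ in range(epochs):
        for feats, label in data:
            margin = label * score(weights, feats)
            # Shrink all weights: the gradient step for the L2 penalty.
            for k in list(weights):
                weights[k] *= 1 - lr * reg
            # Hinge-loss subgradient step, only when the margin is violated.
            if margin < 1:
                for k, v in feats.items():
                    weights[k] = weights.get(k, 0.0) + lr * label * v
    return weights

def predict(weights, text):
    """Return +1 for profane, -1 for clean."""
    return 1 if score(weights, featurize(text)) > 0 else -1

# Toy supervised data (assumption), far too small for any real evaluation.
TRAIN = [
    ("what a darn mess", 1),
    ("heck no you crud", 1),
    ("this is darn awful you crud", 1),
    ("have a nice day", -1),
    ("thanks for the help", -1),
    ("what a lovely photo", -1),
]

weights = train_linear_svm(TRAIN)
print(predict(weights, "darn crud"), predict(weights, "have a nice day"))
```

The `lexicon_hits` feature lets the classifier generalize from the word list to lexicon words it saw rarely in training, while the bag-of-words features capture context that the lexicon alone misses. In practice, a library SVM and a much larger labeled corpus would replace the toy trainer and data.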