Arabic Text-Based Chat Topic Classification Using Discrete Wavelet Transform


Arwa Diwali, Mahmod Kamel and Mohmmed Dahab

Research studies on topic classification of chat conversations use the Vector Space Model (VSM) extensively. The VSM represents documents and queries as vectors of features, where the features are the terms that occur within the collection. The VSM's limitation is that it does not consider the proximity of the terms in the document. Proximity is an important factor because it reflects the positions of the terms in the document. Another model used in information retrieval systems is the Spectral-Based Information Retrieval Method (SBIRM), which employs the Discrete Wavelet Transform (DWT) to rank documents according to document scores. This method's advantage is that it not only counts the frequency of the terms used in the document, but also considers their proximity by comparing the query terms in their spectral domain rather than their spatial domain. Based on these considerations, the objective of this research is to build a framework for Arabic Chat Classification (ACC) that can help detect illegal topics in Arabic text-based chat conversations. This framework combines Information Retrieval (IR) and Machine Learning (ML) methods. The ACC implements two methods: first, the SBIRM using the DWT, and second, the Naïve Bayes method. Two experiments were conducted to test the ACC framework, one for root-based and the other for stem-based chat conversations. The results showed that the former outperformed the latter in terms of accuracy, precision, and F-measure. The recall results were the same for both experiments.

Keywords: Chat Classification, Discrete Wavelet Transform, Naïve Bayes, Term Signal, Spectral Based Retrieval Method
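
To make the contrast between term counting and spectral comparison concrete, the following is a minimal sketch and not the authors' implementation. It assumes a term signal built by counting a term's occurrences in 8 equal-width bins of the document, a hand-written orthonormal Haar DWT, and a simplified sign-agreement score; the function names, the bin count, and the scoring rule are all illustrative assumptions, and the tokens "a", "b", and "x" stand in for root or stem tokens.

```python
import math

def term_signal(tokens, term, bins=8):
    """Count occurrences of `term` in each of `bins` equal-width segments."""
    signal = [0.0] * bins
    n = len(tokens)
    for i, tok in enumerate(tokens):
        if tok == term:
            signal[i * bins // n] += 1.0
    return signal

def haar_dwt(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    out = list(signal)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail  # coarse part first, details after it
        n = half
    return out

def sign(x):
    return (x > 0) - (x < 0)

def spectral_score(tokens, query_terms, bins=8):
    """Hypothetical score: per coefficient, summed magnitude weighted by
    how strongly the query terms agree in sign (an assumed simplification)."""
    spectra = [haar_dwt(term_signal(tokens, t, bins)) for t in query_terms]
    score = 0.0
    for coeffs in zip(*spectra):
        magnitude = sum(abs(c) for c in coeffs)
        agreement = abs(sum(sign(c) for c in coeffs)) / len(query_terms)
        score += magnitude * agreement
    return score

# Two 16-token documents with identical term counts but different proximity.
doc_near = ["a", "b"] + ["x"] * 14          # query terms adjacent
doc_far = ["a"] + ["x"] * 14 + ["b"]        # query terms at opposite ends

score_near = spectral_score(doc_near, ["a", "b"])
score_far = spectral_score(doc_far, ["a", "b"])
print(score_near > score_far)
```

Both documents contain each query term exactly once, so a pure term-frequency vector cannot tell them apart. Under the assumed scoring rule, the document in which the terms occur close together receives the higher score because their wavelet coefficients agree in sign at every resolution.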


ABOUT THE AUTHORS

Arwa Diwali
Arwa Diwali received an MSc degree from King Abdul Aziz University in Jeddah. She works as a Lecturer in the Information Systems Department, Faculty of Computing and Information Technology.

Mahmod Kamel
Dr. Mahmod Kamel received a PhD degree from Al-Azhar University, Egypt. At present, he is an Assistant Professor in the Information Systems Department, King Abdul Aziz University. He has more than 12 years of teaching and training experience in the educational sector. He has been supervising Master's students for the past 8 years.

Mohmmed Dahab
Dr. Mohmmed Dahab received a PhD in Computer Science from Cairo University, Egypt, in 2007. Currently, he is an Assistant Professor at King Abdul Aziz University. His research interests include applications of Expert Systems, Natural Language Processing, Text Mining, Information Retrieval, and Machine Learning.


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
