Lazy fine-tuning algorithms for naïve Bayesian text classification
Khalil El-Hindi, Reem Al Julidan, Huessien AlSalman. 2020
The naïve Bayes (NB) learning algorithm is widely applied in many fields, particularly in text classification. However, its performance degrades in domains where its naïve independence assumption is violated, or when the training set is too small to yield accurate estimates of the probabilities. In this study, we propose a lazy fine-tuning naïve Bayes (LFTNB) method to address both problems. We propose a local fine-tuning algorithm that uses the nearest neighbors of a query instance to fine-tune the probability terms used by NB. Using only the nearest neighbors makes the independence assumption more likely to hold, whereas the fine-tuning algorithm yields more accurate estimates of the probability terms. The performance of the LFTNB approach was evaluated on 47 UCI datasets. The results show that the LFTNB method outperforms the classical NB, eager FTNB, and k-nearest neighbor algorithms. We also propose eager and lazy fine-tuning versions of powerful NB-based text classification algorithms, namely, multinomial NB, complement NB, and one-versus-all NB. Empirical results on 18 UCI text classification datasets show that the proposed methods outperform the untuned versions of these algorithms.
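The abstract's core idea can be illustrated with a minimal sketch of the "lazy" step only: classify a query by fitting naïve Bayes on just its k nearest neighbors, so the independence assumption need hold only locally. This is an assumption-laden illustration, not the authors' method; in particular, the paper's iterative fine-tuning of the probability terms is not reproduced here, and the function name, distance metric, and Laplace smoothing are choices made for the sketch.

```python
import math
from collections import Counter

def local_nb_predict(X, y, query, k=5):
    """Sketch of a lazy/local naive Bayes: build a categorical NB model
    from only the k training instances nearest to the query.
    X: list of discrete feature tuples, y: list of class labels."""
    # Rank training instances by squared Euclidean distance to the query.
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], query)))
    idx = order[:k]
    labels = [y[i] for i in idx]
    prior = Counter(labels)

    best, best_lp = None, -math.inf
    for c in prior:
        members = [X[i] for i in idx if y[i] == c]
        # Log prior estimated from the neighborhood only.
        lp = math.log(prior[c] / k)
        for j, v in enumerate(query):
            vals = [m[j] for m in members]
            # Distinct values of feature j seen among the neighbors,
            # used as the category count for Laplace smoothing.
            nvals = len({X[i][j] for i in idx})
            lp += math.log((vals.count(v) + 1) / (len(members) + nvals))
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

For example, on a toy two-cluster dataset, `local_nb_predict(X, y, (0, 0), k=3)` estimates all probability terms from the three instances nearest `(0, 0)`, so a class that dominates the other cluster never distorts the local model. The paper's contribution sits on top of such a local model: an eager or lazy fine-tuning pass that adjusts the probability terms themselves.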