Chinese Text Feature Classification Based on Distributed Framework

ZHANG Hui-fang, ZONG Cai-le, ZHANG Xiao-lin

Computer & Telecommunication ›› 2019, Vol. 1 ›› Issue (5) : 1-7.

Abstract

This study uses the Fudan Chinese text corpus and the Sogou Chinese document corpus as research data, with the aims of improving the accuracy and recall rate of Chinese text classification and determining the best contribution value of the feature words. Building on the naive Bayes classification method, with improved TF-IDF keyword extraction and weight calculation, the TNBIF model classification method is proposed and implemented on the Spark platform. The experimental results show that Chinese text classification with the TNBIF model reaches an accuracy of 95.49%, which is 5.41% higher than the traditional text classification method, and that the recall rate is increased by 6.64%. The optimal contribution value obtained in this study is 0.95.
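
The general idea of combining TF-IDF term weights with a naive Bayes classifier can be illustrated with a minimal single-machine sketch. This is not the TNBIF model itself: the abstract does not specify the improved weighting, the contribution value, or the Spark implementation, so the sketch uses a standard smoothed TF-IDF and a multinomial naive Bayes trained on those weights. The function names, the smoothing parameter `alpha`, and the toy documents are assumptions for illustration, and the input is assumed to be already word-segmented.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (docs are token lists)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed inverse document frequency.
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def train_naive_bayes(vectors, labels, alpha=1.0):
    """Fit a multinomial naive Bayes on TF-IDF weights instead of raw counts."""
    class_docs = Counter(labels)
    term_weight = defaultdict(Counter)  # label -> summed weight per term
    vocab = set()
    for vec, label in zip(vectors, labels):
        term_weight[label].update(vec)
        vocab.update(vec)
    model = {}
    for label, n in class_docs.items():
        total = sum(term_weight[label].values()) + alpha * len(vocab)
        log_prior = math.log(n / len(labels))
        # Additive smoothing keeps unseen terms from producing log(0).
        log_lik = {t: math.log((term_weight[label][t] + alpha) / total)
                   for t in vocab}
        model[label] = (log_prior, log_lik)
    return model


def predict(model, tokens):
    """Return the label with the highest log posterior; unknown terms are skipped."""
    counts = Counter(tokens)

    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(c * log_lik[t]
                               for t, c in counts.items() if t in log_lik)

    return max(model, key=score)


# Toy, pre-segmented documents (hypothetical data, not from the Fudan or Sogou corpora).
docs = [["足球", "比赛", "进球"], ["球队", "比赛", "夺冠"],
        ["股票", "市场", "上涨"], ["银行", "利率", "市场"]]
labels = ["sports", "sports", "finance", "finance"]
model = train_naive_bayes(tfidf_vectors(docs), labels)
print(predict(model, ["足球", "比赛"]))
```

On a distributed platform such as Spark, the per-document weighting and the per-class aggregation would be expressed as parallel map and reduce steps over the corpus; the sketch keeps everything in memory only to show the data flow.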

Key words

TNBIF model / massive data set / Spark / feature classification / parallel classification

Cite this article

ZHANG Hui-fang, ZONG Cai-le, ZHANG Xiao-lin. Chinese Text Feature Classification Based on Distributed Framework[J]. Computer & Telecommunication, 2019, 1(5): 1-7.
