Computer & Telecommunication
Research on Personal Microblog Clustering Based on CBOW Model
SONG Tian-shu, LI Jiang-yu, ZHANG Qin-zhe
Inner Mongolia University of Science and Technology
Abstract  Personal microblogs are a popular social tool, but the sheer number of posts can make browsing confusing for users. This paper clusters microblog posts with high semantic similarity to make browsing easier. The main work is as follows: (1) use the jieba library in Python to segment personal microblog posts into words and remove stopwords; (2) train word vectors on the segmented dataset with the CBOW model; (3) represent each post as a sentence vector built from its word vectors; (4) treat the sentence vectors as points in space and compute the distance between them, i.e. the similarity between individual posts, with a modified Manhattan distance algorithm; (5) cluster the posts with a modified CLARANS algorithm. Experiments show that this method clearly improves on traditional clustering algorithms such as partitioning, hierarchical, and density-based methods.
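Steps 3 to 5 of the abstract can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the Manhattan distance or CLARANS were modified, so the sketch uses the plain Manhattan (L1) distance and the standard CLARANS randomized medoid search, and it builds a sentence vector as the simple average of its word vectors. Segmentation and CBOW training (steps 1 and 2) are assumed to have been done already, for example with jieba and with gensim's `Word2Vec` using `sg=0`. The function names, parameters, and the toy `word_vectors` table are hypothetical.

```python
import random

def sentence_vector(tokens, word_vectors):
    """Average the vectors of known tokens (an assumed pooling choice)."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return None  # no token has a trained vector
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def manhattan(a, b):
    """Plain Manhattan (L1) distance; the paper's modification is not given."""
    return sum(abs(x - y) for x, y in zip(a, b))

def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(manhattan(p, points[m]) for m in medoids) for p in points)

def clarans(points, k, num_local=5, max_neighbor=50, seed=0):
    """Standard CLARANS: restart num_local times, and in each restart keep
    swapping one medoid for a random non-medoid while that lowers the cost."""
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < number of points")
    rng = random.Random(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(num_local):
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        failures = 0
        while failures < max_neighbor:
            pos = rng.randrange(k)  # which medoid to replace
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:pos] + [candidate] + current[pos + 1:]
            cost = total_cost(points, neighbor)
            if cost < current_cost:
                current, current_cost = neighbor, cost
                failures = 0  # improvement found, reset the counter
            else:
                failures += 1
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost
    # Assign every point to the index of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: manhattan(p, points[best_medoids[j]]))
        for p in points
    ]
    return best_medoids, labels

# Toy data standing in for trained CBOW vectors and segmented posts.
word_vectors = {
    "cat": [0.0, 0.1], "dog": [0.1, 0.0], "pet": [0.05, 0.05],
    "stock": [5.0, 5.1], "market": [5.1, 5.0], "fund": [5.05, 5.05],
}
posts = [["cat", "dog"], ["dog", "pet"], ["cat", "pet"],
         ["stock", "market"], ["market", "fund"], ["stock", "fund"]]
points = [sentence_vector(p, word_vectors) for p in posts]
medoids, labels = clarans(points, k=2)
print(labels)
```

On the toy data the three pet-related posts should share one label and the three finance-related posts the other. With real posts, any post whose tokens all lack a trained vector returns `None` from `sentence_vector` and would need to be filtered out before clustering.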
Key words: individual microblog; semantic; clustering; machine learning
ZTFLH:  TP391  

Cite this article:

SONG Tian-shu, LI Jiang-yu, ZHANG Qin-zhe. Research on Personal Microblog Clustering Based on CBOW Model. Computer & Telecommunication, 2018, 1(4): 69-72.

URL:

http://www.computertelecom.com.cn/EN/     OR     http://www.computertelecom.com.cn/EN/Y2018/V1/I4/69
