Computer & Telecommunication
Research on Personal Microblog Clustering Based on CBOW Model
SONG Tian-shu, LI Jiang-yu, ZHANG Qin-zhe
Inner Mongolia University of Science and Technology
Abstract  Personal microblogs are a popular social tool, but the sheer number of posts can be confusing for users. This article clusters microblogs with high semantic similarity to make browsing easier. The main work of this paper is as follows: 1. Use jieba segmentation in Python to segment the text of personal microblogs and remove stopwords; 2. Train word vectors on the segmented dataset using the CBOW model; 3. Represent each personal microblog as a sentence vector built from the word vectors; 4. Treat the sentence vectors as points in space and compute the distances between them, i.e. the similarities between individual microblogs, using a modified Manhattan distance algorithm; 5. Cluster the microblogs using a modified CLARANS algorithm. Experiments show that the proposed method clearly improves on traditional clustering algorithms such as partitioning, hierarchical, and density-based methods.
Key words: individual microblog; semantic; clustering; machine learning
CLC number: TP391
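
The pipeline outlined in the abstract (sentence vectors from word vectors, Manhattan distance, CLARANS clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not describe their modifications to the Manhattan distance or to CLARANS, so the plain versions are used here. The `WORD_VECTORS` table is a hypothetical toy stand-in for CBOW-trained embeddings, the posts are assumed to be already tokenized, averaging is only one possible way to build a sentence vector, and the parameter names `num_local` and `max_neighbor` are illustrative. In practice, tokens might come from `jieba.lcut` and vectors from gensim's `Word2Vec` with `sg=0` (CBOW).

```python
import random

# Hypothetical toy word vectors standing in for CBOW-trained embeddings.
WORD_VECTORS = {
    "football": [0.90, 0.10, 0.00],
    "match":    [0.80, 0.20, 0.10],
    "goal":     [0.85, 0.15, 0.05],
    "noodles":  [0.10, 0.90, 0.00],
    "dinner":   [0.00, 0.80, 0.20],
    "recipe":   [0.05, 0.85, 0.10],
}


def sentence_vector(tokens, vectors):
    """Average the vectors of known tokens (an assumed, simple composition rule)."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a word vector")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def manhattan(a, b):
    """Plain (unmodified) Manhattan distance between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def total_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(manhattan(p, points[m]) for m in medoids) for p in points)


def clarans(points, k, num_local=4, max_neighbor=20, seed=0):
    """Plain CLARANS: randomized search over medoid sets via single swaps."""
    n = len(points)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(points)")
    rng = random.Random(seed)
    best_medoids, best_cost = None, float("inf")
    for _ in range(num_local):
        current = rng.sample(range(n), k)
        current_cost = total_cost(points, current)
        tried = 0
        while tried < max_neighbor:
            # A neighbor differs from the current set in exactly one medoid.
            slot = rng.randrange(k)
            candidate = rng.choice([i for i in range(n) if i not in current])
            neighbor = current[:slot] + [candidate] + current[slot + 1:]
            neighbor_cost = total_cost(points, neighbor)
            if neighbor_cost < current_cost:
                current, current_cost = neighbor, neighbor_cost
                tried = 0  # restart the neighbor count after an improvement
            else:
                tried += 1
        if current_cost < best_cost:
            best_medoids, best_cost = current, current_cost
    # Label each point with the index of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: manhattan(p, points[best_medoids[j]]))
        for p in points
    ]
    return best_medoids, labels


# Usage: six already-tokenized toy posts, three per topic.
posts = [
    ["football", "match"], ["goal", "match"], ["football", "goal"],
    ["noodles", "dinner"], ["recipe", "dinner"], ["noodles", "recipe"],
]
points = [sentence_vector(tokens, WORD_VECTORS) for tokens in posts]
medoids, labels = clarans(points, k=2)
print(labels)
```

CLARANS searches over medoids (actual data points) rather than centroids, so it only needs a pairwise distance function; that is why it pairs naturally with a Manhattan-style distance on sentence vectors.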

Cite this article:

SONG Tian-shu, LI Jiang-yu, ZHANG Qin-zhe. Research on Personal Microblog Clustering Based on CBOW Model. Computer & Telecommunication, 2018, 1(4): 69-72.

URL:

https://www.computertelecom.com.cn/EN/ OR https://www.computertelecom.com.cn/EN/Y2018/V1/I4/69
