摘要
随着互联网内容的快速增长,对于网络内容的快速识别压力越来越大。本文进行基于聚类算法的内容识别研究,为维护网络安全、网络内容健康,具有非常重要的意义。目前的互联网内容识别方式主要以关键字检索方法进行识别,但是面对日益丰富的网络内容和不同方式存储在服务器的内容,这种方式已经无法满足实际的需求。从实际问题出发针对互联网内容中以图形、图像、音频等非结构化数据形式存储在服务器中的内容进行识别,依据互联网内容的发展规律对现有的聚类算法进行改进,以求能够最大程度地对互联网内容进行筛选和甄别,维护互联网安全。
Abstract
With the rapid growth of Internet content, the pressure to identify that content quickly keeps increasing. This paper studies content recognition based on clustering algorithms, which is important for maintaining network security and keeping online content healthy. Current Internet content recognition relies mainly on keyword retrieval, but this approach can no longer meet practical needs in the face of increasingly rich network content and content stored on servers in different forms. Starting from this practical problem, the paper addresses the recognition of content stored on servers as unstructured data such as graphics, images, and audio. The existing clustering algorithm is improved according to the way Internet content develops, with the aim of screening and discriminating Internet content as thoroughly as possible and thereby maintaining Internet security.
关键词
数据挖掘 / 内容识别 / 聚类分析 / K-MEANS聚类算法改进
Key words
data mining / content recognition / clustering analysis / improvement of the K-MEANS clustering algorithm
徐勇. 基于聚类算法的内容识别研究[J]. 电脑与电信. 2016, 1(11): 39-41
Xu Yong. Research on Content Recognition Based on Clustering Algorithm[J]. Computer & Telecommunication. 2016, 1(11): 39-41