Research on the K-means Clustering Algorithm Based on the Hadoop Platform

WANG Yi-bai

Computer & Telecommunication (电脑与电信), 2018, Vol. 1, Issue (4): 18-20.

Abstract

As data volumes keep growing, data mining on a single machine becomes inefficient. To address this problem, this paper studies the K-means clustering algorithm on the Hadoop platform. It first describes the architecture and setup of the Hadoop platform in detail, then analyzes the K-means algorithm in detail, and finally presents an implementation of the algorithm together with an experimental analysis.
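
As a rough illustration of how one K-means iteration can be split into a map step and a reduce step, here is a minimal single-process Python sketch. It is not the paper's implementation and does not use any Hadoop API. The function names (`map_phase`, `reduce_phase`, `kmeans`), the convergence tolerance, and the rule that an empty cluster keeps its previous centroid are all assumptions made for this example.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce: average the points grouped under each centroid index."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    # Assumption: a centroid that received no points keeps its old position.
    new_centroids = list(centroids)
    for idx, members in groups.items():
        new_centroids[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centroids


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Repeat map + reduce until no centroid moves by more than tol."""
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        shift = max(dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift <= tol:
            break
    return centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

On a real cluster the map step would typically run in parallel over blocks of the input, and each iteration would be submitted as a separate job that reads the centroids produced by the previous one. The sketch only mirrors that data flow in a single process.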