Abstract: The Internet has become the largest knowledge base and contains a great deal of valuable information, presented
in many different forms. Discovering valuable pages is the top priority of information extraction and the foundation
of knowledge base construction. Based on the Internet model, this article studies how to discover valuable pages using the PageRank
algorithm on the Hadoop platform while saving time and space, in order to provide a solution for knowledge base construction.