With the continued development of big data technology, web data collection has become an active research field. Python data collection libraries such as Urllib, Requests, and Selenium are inefficient and easily blocked by target sites, and current data collection and analysis platforms consist of independent functional modules that do not form a closed loop, which results in a poor user experience. To address these problems, this paper proposes a data collection and analysis platform. First, the Scrapy framework is used to collect data, and the Kettle tool is then used to clean the collected data. The processed results are saved to a MySQL database. Finally, the Flask framework is combined with ECharts to build a web system that visualizes the data analysis results. This paper uses data from the Beijing Public Transport website as the test case for the crawler. Bus line types, bus routes, and other information are collected, analyzed, and displayed, and the analysis results offer some guidance for urban public transport planning. The platform is also stable, reliable, and easy to operate.
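The storage and visualization stages described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes an in-memory SQLite database as a stand-in for MySQL, a hypothetical `bus_line` table, and invented sample records, and it only shows how cleaned records might be aggregated by line type and turned into an ECharts bar chart option that a Flask view could return as JSON.

```python
import json
import sqlite3

# Hypothetical cleaned records (line name, line type); not data from the paper.
SAMPLE_ROUTES = [
    ("Line 1", "regular"),
    ("Line 2", "regular"),
    ("Night 1", "night"),
    ("Express 1", "express"),
    ("Night 2", "night"),
    ("Line 3", "regular"),
]


def store_routes(conn, routes):
    """Save cleaned records to the database (SQLite here, standing in for MySQL)."""
    conn.execute("CREATE TABLE IF NOT EXISTS bus_line (name TEXT, line_type TEXT)")
    conn.executemany("INSERT INTO bus_line (name, line_type) VALUES (?, ?)", routes)
    conn.commit()


def count_by_type(conn):
    """Return (line_type, count) pairs, sorted by line type."""
    cur = conn.execute(
        "SELECT line_type, COUNT(*) FROM bus_line "
        "GROUP BY line_type ORDER BY line_type"
    )
    return cur.fetchall()


def build_bar_option(counts):
    """Build an ECharts option dict for a bar chart of lines per type."""
    return {
        "title": {"text": "Bus lines by type"},
        "xAxis": {"type": "category", "data": [t for t, _ in counts]},
        "yAxis": {"type": "value"},
        "series": [{"type": "bar", "data": [n for _, n in counts]}],
    }


def main():
    conn = sqlite3.connect(":memory:")
    store_routes(conn, SAMPLE_ROUTES)
    option = build_bar_option(count_by_type(conn))
    # In a web system, a Flask view could return this dict as JSON
    # for the front-end ECharts instance to render.
    print(json.dumps(option, indent=2))
    conn.close()


if __name__ == "__main__":
    main()
```

Keeping the aggregation in SQL and the chart option in a plain dictionary means the same functions could be reused unchanged behind a web endpoint; only the database connection would need to be swapped for a MySQL one.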