Research on Third-party Model Integration and GPU Acceleration Strategies in Mediapipe Framework

ZHANG Gang, YUAN Ting, XIAO Ning-jie, YANG Hong-kai, YANG Zong-jun

Computer & Telecommunication ›› 2025, Vol. 1 ›› Issue (6) : 37-41.
Applied Technology and Research

Abstract

This study addresses efficient third-party model integration and GPU acceleration strategies in the Mediapipe framework. Mediapipe, Google's open-source mobile AI framework, uses a pipeline architecture to achieve low-latency, high-precision real-time processing on mobile devices. However, its support for integrating third-party models is clearly limited. To address this, we propose a novel model integration layer design and use it to integrate three models: YOLOv11, YOLOv11-Pose, and RTMPose. For GPU acceleration, we examine two aspects, model inference parameter optimization and inference result parsing, and propose a complete performance optimization scheme. Experimental results show that, on the Android platform, the integration scheme significantly improves model execution efficiency while keeping deployment convenient.
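The abstract names inference result parsing as one of the two optimization aspects but does not describe the implementation. As a rough illustration of what such parsing involves for a YOLO-style detection head, the sketch below decodes a channel-major output tensor into corner-format boxes. It is not the authors' method. The tensor layout (4 box rows followed by one row per class, one column per candidate), the function name `parse_yolo_output`, and the default threshold are all assumptions made for illustration, and non-maximum suppression is left out.

```python
from typing import List, Tuple

# x1, y1, x2, y2, score, class_id
Detection = Tuple[float, float, float, float, float, int]


def parse_yolo_output(
    output: List[List[float]],
    score_threshold: float = 0.25,
) -> List[Detection]:
    """Decode a channel-major detection tensor into corner-format boxes.

    Assumed layout (hypothetical, not taken from the paper): `output` has
    4 + num_classes rows and one column per candidate box. Rows 0-3 hold
    cx, cy, w, h; the remaining rows hold per-class scores.
    """
    if len(output) < 5:
        raise ValueError("expected at least 4 box rows and 1 class row")
    num_candidates = len(output[0])
    class_rows = output[4:]
    detections: List[Detection] = []
    for i in range(num_candidates):
        # Pick the highest-scoring class for this candidate.
        best_class, best_score = 0, class_rows[0][i]
        for c in range(1, len(class_rows)):
            if class_rows[c][i] > best_score:
                best_class, best_score = c, class_rows[c][i]
        # Drop weak candidates before doing any box arithmetic.
        if best_score < score_threshold:
            continue
        cx, cy, w, h = output[0][i], output[1][i], output[2][i], output[3][i]
        # Convert center/size format to corner format.
        detections.append(
            (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2, best_score, best_class)
        )
    return detections
```

Filtering on the score before converting the box skips the arithmetic for most candidates, since a detection head usually emits far more candidates than real objects. A full pipeline would follow this step with non-maximum suppression.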

Key words

Mediapipe / YOLOv11 / RTMPose / mobile AI / TfLite

Cite this article

ZHANG Gang, YUAN Ting, XIAO Ning-jie, YANG Hong-kai, YANG Zong-jun. Research on Third-party Model Integration and GPU Acceleration Strategies in Mediapipe Framework[J]. Computer & Telecommunication. 2025, 1(6): 37-41
CLC number: TP391.4

Funding

National Natural Science Foundation of China, project no. 81373883; 2025 College Student Innovation Training Program, project no. xj2025118450147
