01 Introduction
Download the example code repository and set up the development environment for running YOLOv8 with the OpenVINO™ inference engine. Use the following command to clone the repository:
git clone https://gitee.com/ppov-nuc/yolov8_openvino.git
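The exact dependency list is documented in the repository; as a rough sketch, the rest of this walkthrough relies on the ultralytics and openvino-dev Python packages (standard PyPI names, assumed here rather than taken from the repo), which can be installed with pip:
pip install ultralytics openvino-dev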
02 Exporting YOLOv8 Object Detection OpenVINO™ IR Model
YOLOv8 provides five object detection models of different sizes (n, s, m, l, x), all pre-trained on the COCO dataset.
Start by exporting yolov8n.pt to ONNX format using the command:
yolo export model=yolov8n.pt format=onnx
This will generate the yolov8n.onnx model.
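Equivalently, the export can be done from a Python script with the ultralytics API (a short sketch using the same yolov8n.pt weights as above):

```python
# Sketch: export YOLOv8n to ONNX via the ultralytics Python API,
# equivalent to the `yolo export` CLI call above
from ultralytics import YOLO

model = YOLO("yolov8n.pt")   # downloads the weights on first use if missing
model.export(format="onnx")  # writes yolov8n.onnx alongside the weights
```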
Next, optimize and export the OpenVINO™ IR format model with FP16 precision using the command:
mo -m yolov8n.onnx --compress_to_fp16
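The same conversion can also be done in Python through the Model Optimizer API; a minimal sketch, assuming a recent OpenVINO™ release that ships openvino.tools.mo:

```python
# Sketch: convert the ONNX model to an FP16 OpenVINO IR in Python
from openvino.tools import mo
from openvino.runtime import serialize

ov_model = mo.convert_model("yolov8n.onnx", compress_to_fp16=True)
serialize(ov_model, "yolov8n.xml")  # also writes the companion yolov8n.bin
```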
03 Testing the Inference Performance of YOLOv8 Object Detection Model with benchmark_app
benchmark_app is a performance testing tool provided by the OpenVINO™ toolkit for evaluating the inference performance of AI models. It measures the pure inference performance of a model, with no pre- or post-processing, in either synchronous or asynchronous mode.
Use the command:
benchmark_app -m yolov8n.xml -d GPU
This will provide the asynchronous inference performance of the yolov8n.xml model on the integrated GPU of the AIxBoard.
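By default the tool runs in throughput-oriented asynchronous mode; a latency-oriented run can be requested with the -hint option (an illustrative variant, not part of the original walkthrough; see benchmark_app -h for all options):
benchmark_app -m yolov8n.xml -d GPU -hint latency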
04 Writing YOLOv8 Object Detection Model Inference Program with OpenVINO™ Python API
Open yolov8n.onnx using Netron, as shown in the figure below. The input shape of the model is [1,3,640,640], and the output shape is [1,84,8400]. The “84” represents cx, cy, w, h plus the scores for the 80 COCO classes. “8400” is the number of output cells produced by the three detection heads of YOLOv8 at an input size of 640 (80×80 + 40×40 + 20×20 = 8400).
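These shapes can also be checked programmatically with the OpenVINO™ Python API (a quick sketch, assuming the IR files from step 02 are in the working directory):

```python
# Sketch: print the model's input/output shapes without opening Netron
from openvino.runtime import Core

model = Core().read_model("yolov8n.xml")
print(model.input(0).partial_shape)   # expected: [1,3,640,640]
print(model.output(0).partial_shape)  # expected: [1,84,8400]
```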
Here is an example program for running the YOLOv8 object detection model with the OpenVINO™ Python API:
yolov8_od_ov_sync_infer_demo.py
The core source code is as follows:
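The full script lives in the repository; the following is a minimal sketch of its core steps (the model path, test image name, and thresholds are illustrative assumptions, and preprocessing is simplified to a plain resize rather than letterboxing):

```python
# Sketch: synchronous YOLOv8 inference with the OpenVINO Python API
import cv2
import numpy as np
from openvino.runtime import Core

MODEL_PATH = "yolov8n.xml"        # IR produced in step 02
IMAGE_PATH = "test.jpg"           # hypothetical test image
CONF_THRES, IOU_THRES = 0.25, 0.7

# 1. Load and compile the IR model on the integrated GPU (use "CPU" as a fallback)
core = Core()
compiled = core.compile_model(core.read_model(MODEL_PATH), "GPU")
output_layer = compiled.output(0)

# 2. Preprocess: resize to 640x640, BGR->RGB, HWC->NCHW, scale to [0,1]
image = cv2.imread(IMAGE_PATH)
h0, w0 = image.shape[:2]
blob = cv2.resize(image, (640, 640))
blob = cv2.cvtColor(blob, cv2.COLOR_BGR2RGB).transpose(2, 0, 1)[None].astype(np.float32) / 255.0

# 3. Synchronous inference: [1,3,640,640] -> [1,84,8400]
preds = compiled([blob])[output_layer]

# 4. Post-process: transpose to [8400,84], split boxes and class scores, run NMS
preds = preds[0].T                 # (8400, 84)
boxes_cxcywh = preds[:, :4]
scores = preds[:, 4:]
class_ids = scores.argmax(axis=1)
confidences = scores.max(axis=1)

# Convert cx,cy,w,h -> x,y,w,h and scale back to the original image size
x = (boxes_cxcywh[:, 0] - boxes_cxcywh[:, 2] / 2) * w0 / 640
y = (boxes_cxcywh[:, 1] - boxes_cxcywh[:, 3] / 2) * h0 / 640
w = boxes_cxcywh[:, 2] * w0 / 640
h = boxes_cxcywh[:, 3] * h0 / 640
xywh = np.stack([x, y, w, h], axis=1)

keep = cv2.dnn.NMSBoxes(xywh.tolist(), confidences.tolist(), CONF_THRES, IOU_THRES)
for i in np.array(keep).flatten():
    bx, by, bw, bh = xywh[i].astype(int)
    cv2.rectangle(image, (bx, by), (bx + bw, by + bh), (0, 255, 0), 2)
    cv2.putText(image, f"{class_ids[i]}: {confidences[i]:.2f}", (bx, by - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
cv2.imwrite("result.jpg", image)
```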
The result of running `yolov8_od_ov_sync_infer_demo.py` is shown in the following image:
05 Conclusion
By leveraging the integrated GPU of the AIxBoard and utilizing OpenVINO™, impressive performance can be achieved with the YOLOv8 object detection model. Asynchronous processing and the use of AsyncInferQueue can further improve the utilization of the compute device and increase the throughput of AI inference programs.
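As an illustration of that pattern, here is a minimal sketch of AsyncInferQueue usage (the frame source and the number of parallel requests are placeholders, not measured settings from this article):

```python
# Sketch: throughput-oriented inference with AsyncInferQueue on the integrated GPU
import numpy as np
from openvino.runtime import Core, AsyncInferQueue

compiled = Core().compile_model("yolov8n.xml", "GPU")

results = {}
def on_done(request, frame_id):
    # Store the raw [1,84,8400] prediction; post-process later or right here
    results[frame_id] = request.get_output_tensor(0).data.copy()

queue = AsyncInferQueue(compiled, 4)   # 4 parallel infer requests (placeholder value)
queue.set_callback(on_done)

# Stand-in frames; in a real pipeline these come from preprocessed video or images
frames = [np.random.rand(1, 3, 640, 640).astype(np.float32) for _ in range(16)]
for i, frame in enumerate(frames):
    queue.start_async({0: frame}, userdata=i)  # returns immediately; callback fires when done
queue.wait_all()
print(f"collected {len(results)} results")
```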