

news 2025/12/31 17:04:33

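Overview

A previous post covered YOLOv8 from environment setup through training. This section explains how ONNX Runtime works and how to use it to speed up inference, with implementations in both Python and C++.

https://blog.csdn.net/MariLN/article/details/143924548?spm=1001.2014.3001.5501

1. ONNX Runtime

ONNX Runtime is a cross-platform machine learning model accelerator released by Microsoft. It runs models in the ONNX format and is suitable for desktop, server, and mobile devices.

Multi-framework support: models trained in common frameworks such as PyTorch, TensorFlow, Keras, and scikit-learn can be converted to ONNX and run with ONNX Runtime, which makes it easier to share and move models between frameworks.

Cross-platform compatibility: it runs on Linux, Windows, and macOS, and can be deployed in cloud, edge, web, and mobile environments.

Hardware optimization: it is optimized for CPUs, GPUs, and various AI accelerators (through libraries such as Intel MKL, cuDNN, and TensorRT). On a GPU, for example, computation runs in parallel, which speeds up inference considerably.

Efficient memory management: zero-copy techniques and memory pooling reduce data transfer overhead, which matters most when processing large amounts of data.

Dynamic shape support: input dimensions may change at run time and the model still handles them correctly, so one model can serve inputs of different sizes.

2. Model conversion

2.1 .pt and .onnx models

2.1.1 .pt models

A .pt file is a common storage format for PyTorch models. When a neural network is trained in PyTorch, its parameters (weights, biases, and so on) can be saved to disk as a .pt file. For example, after training an image classifier such as ResNet, torch.save() writes the trained model to a .pt file.

The file is binary. Depending on how it was saved, it holds the parameters alone (a state_dict) or the serialized model object together with its parameters. The structure covers the number of layers, the type of each layer (linear, convolution, pooling, and so on), and the activation functions. The parameters are the values learned during training, and they determine how the model processes its input.

2.1.2 .onnx models

ONNX (Open Neural Network Exchange) is an open exchange format for neural networks, and a .onnx file is a model stored in that format. It exists to solve the problem of converting and reusing models across deep learning frameworks. Many frameworks, including PyTorch and TensorFlow, can export to ONNX. In PyTorch, torch.onnx.export() converts a model to .onnx.

A .onnx file is also a structured file. It stores the model's computation graph as an intermediate representation: the operations (addition, multiplication, convolution, and so on), the connections between them, and the model's inputs and outputs. This common representation lets models trained in different frameworks be converted and run in one format.

.onnx models are used mainly for cross-framework deployment and inference. Because several inference engines support the format (ONNX Runtime, TensorRT, and others), a model trained in one framework can be exported to .onnx and then run efficiently elsewhere. In production, for example, a model may be trained in PyTorch but deployed on an engine with stricter performance requirements; exporting to .onnx and serving it with ONNX Runtime handles that case. The format also helps teams that use different frameworks share and integrate models.

2.2 Converting .pt to .onnx

The trained YOLOv8 .pt model can be converted to .onnx with the ultralytics library, either from the command line:

    yolo task=detect mode=export model=./runs/detect/train/weights/best.pt format=onnx

or from Python:

    from ultralytics import YOLO

    # Load a custom-trained model
    model = YOLO("./runs/detect/train/weights/best.pt")

    # Export the model to ONNX
    success = model.export(format="onnx")

3. Model inference

3.1 Python implementation

3.1.1 Environment setup

Install onnxruntime, numpy, and OpenCV (cv2). For GPU inference, install onnxruntime-gpu; the ONNX Runtime documentation recommends keeping only one of the two packages in a given environment.

    pip install onnxruntime        # CPU build
    pip install onnxruntime-gpu    # GPU build, for GPU inference
    pip install opencv-python
    pip install numpy
    pip install gradio

3.1.2 Inference steps

1) Image preprocessing

Read the image and convert its color space from BGR to RGB. OpenCV uses BGR by default, while many deep learning models, including this ONNX model, expect RGB input.
Resize the image to the input size the model requires, for example 640x640.
Normalize the pixel values to the range [0, 1].
Reorder the axes from HWC (height, width, channel) to CHW and add a batch dimension, giving NCHW, where N is the batch size (usually 1).

    import cv2
    import numpy as np

    def prepare_input(image, input_width, input_height):
        # cv2.imread returns BGR images, so convert to RGB
        input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        # Resize to the width and height the model expects
        input_img = cv2.resize(input_img, (input_width, input_height))
        # Normalize to [0, 1]
        input_img = input_img / 255.0
        # Reorder axes HWC -> CHW, then add a batch dimension (np.newaxis) to get NCHW
        input_img = input_img.transpose(2, 0, 1)
        input_tensor = input_img[np.newaxis, :, :, :].astype(np.float32)
        return input_tensor

    image_path = "test.jpg"
    image = cv2.imread(image_path)
    input_tensor = prepare_input(image, 640, 640)

2) Model inference

Create an onnxruntime.InferenceSession object to load the converted .onnx model, pass the preprocessed image to it, and collect the outputs.

    import time
    import onnxruntime

    def inference(model_path, input_tensor):
        # perf_counter is a high-resolution clock meant for measuring intervals;
        # it is usually far more precise than time.time().
        # Note that the measured time here includes loading the model.
        start = time.perf_counter()
        # Load the ONNX model
        session = onnxruntime.InferenceSession(model_path, providers=onnxruntime.get_available_providers())
        # Get the input and output names
        input_names = [model_inputs.name for model_inputs in session.get_inputs()]
        output_names = [model_outputs.name for model_outputs in session.get_outputs()]
        # Run inference
        outputs = session.run(output_names, {input_names[0]: input_tensor})
        print(f"Inference time: {(time.perf_counter() - start)*1000:.2f} ms")
        return outputs

3) Post-processing

Remove the batch dimension from the model output.
For each candidate box, take the class with the highest confidence and discard boxes whose confidence is below the threshold.
Convert coordinates: rescale the predicted boxes to the original image size and convert them from center format (x_center, y_center, w, h) to corner format (x1, y1, x2, y2).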
Apply non-maximum suppression (NMS) to remove boxes that overlap too much, which yields the final detections.

    def xywh2xyxy(x):
        # Convert boxes from (x_center, y_center, w, h) to (x1, y1, x2, y2)
        y = np.copy(x)
        # Top-left corner
        y[..., 0] = x[..., 0] - x[..., 2] / 2  # x1 = x_center - w / 2
        y[..., 1] = x[..., 1] - x[..., 3] / 2  # y1 = y_center - h / 2
        # Bottom-right corner
        y[..., 2] = x[..., 0] + x[..., 2] / 2  # x2 = x_center + w / 2
        y[..., 3] = x[..., 1] + x[..., 3] / 2  # y2 = y_center + h / 2
        return y

    def multiclass_nms(boxes, scores, class_ids, iou_threshold):
        # All distinct class indices
        unique_class_ids = np.unique(class_ids)
        keep_boxes = []  # indices of the boxes that are finally kept
        for class_id in unique_class_ids:
            # Indices of the boxes that belong to this class (np.where returns a tuple)
            class_indices = np.where(class_ids == class_id)[0]
            # Boxes and scores of this class
            class_boxes = boxes[class_indices, :]
            class_scores = scores[class_indices]
            # Run NMS within the class
            class_keep_boxes = nms(class_boxes, class_scores, iou_threshold)
            # Map the kept indices back to the original indices
            keep_boxes.extend(class_indices[class_keep_boxes])
        return keep_boxes

    def nms(boxes, scores, iou_threshold):
        # Sort box indices by score, highest first ([::-1] reverses the order)
        sorted_indices = np.argsort(scores)[::-1]
        keep_boxes = []
        while sorted_indices.size > 0:
            # Keep the box with the highest score
            box_id = sorted_indices[0]
            keep_boxes.append(box_id)
            # IoU of that box with all remaining boxes
            ious = compute_iou(boxes[box_id, :], boxes[sorted_indices[1:], :])
            # Keep only boxes whose IoU is below the threshold
            keep_indices = np.where(ious < iou_threshold)[0]
            # keep_indices is relative to sorted_indices[1:], so shift by 1
            sorted_indices = sorted_indices[keep_indices + 1]
        return keep_boxes

    def compute_iou(box, boxes):
        # Intersection rectangle: (xmin, ymin) is its top-left corner, (xmax, ymax) its bottom-right
        xmin = np.maximum(box[0], boxes[:, 0])
        ymin = np.maximum(box[1], boxes[:, 1])
        xmax = np.minimum(box[2], boxes[:, 2])
        ymax = np.minimum(box[3], boxes[:, 3])
        # If the boxes do not overlap, width or height is negative; np.maximum clamps the area to 0
        intersection_area = np.maximum(0, xmax - xmin) * np.maximum(0, ymax - ymin)
        # Area of each box
        box_area = (box[2] - box[0]) * (box[3] - box[1])
        boxes_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        # Union area
        union_area = box_area + boxes_area - intersection_area
        # IoU = intersection area / union area
        iou = intersection_area / union_area
        return iou

    def process_output(outputs, conf_threshold, iou_threshold, input_width, input_height, img_width, img_height):
        # Drop the batch dimension and transpose: (1, N, M) -> (M, N)
        predictions = np.squeeze(outputs[0]).T
        # Highest class confidence of each candidate box (max along each row)
        scores = np.max(predictions[:, 4:], axis=1)
        # Discard boxes below the confidence threshold
        predictions = predictions[scores > conf_threshold, :]
        scores = scores[scores > conf_threshold]
        if len(scores) == 0:
            return [], [], []
        # Class index with the highest confidence
        class_ids = np.argmax(predictions[:, 4:], axis=1)
        # Extract the boxes
        boxes = predictions[:, :4]
        # Box coordinates are relative to the network input size: normalize them to [0, 1] ...
        input_shape = np.array([input_width, input_height, input_width, input_height])
        boxes = np.divide(boxes, input_shape, dtype=np.float32)
        # ... then scale them to the original image size
        boxes *= np.array([img_width, img_height, img_width, img_height])
        # Convert to xyxy format
        boxes = xywh2xyxy(boxes)
        # Non-maximum suppression
        indices = multiclass_nms(boxes, scores, class_ids, iou_threshold)
        return boxes[indices], scores[indices], class_ids[indices]

3.1.3 Complete code

utils.py

    import numpy as np
    import cv2

    class_names = ["person", "head", "helmet"]

    # Create a list of colors for each class where each color is a tuple of 3 values
    rng = np.random.default_rng(3)
    colors = rng.uniform(0, 255, size=(len(class_names), 3))

    def nms(boxes, scores, iou_threshold):
        # Sort by score, highest first
        sorted_indices = np.argsort(scores)[::-1]
        keep_boxes = []
        while sorted_indices.size > 0:
            # Pick the box with the highest score
            box_id = sorted_indices[0]
            keep_boxes.append(box_id)
            # Compute IoU of the picked box with the rest
            ious = compute_iou(boxes[box_id, :], boxes[sorted_indices[1:], :])
            # Remove boxes with IoU over the threshold
            keep_indices = np.where(ious < iou_threshold)[0]
            sorted_indices = sorted_indices[keep_indices + 1]
        return keep_boxes

    def multiclass_nms(boxes, scores, class_ids, iou_threshold):
        unique_class_ids = np.unique(class_ids)
        keep_boxes = []
        for class_id in unique_class_ids:
            class_indices = np.where(class_ids == class_id)[0]
            class_boxes = boxes[class_indices, :]
            class_scores = scores[class_indices]
            class_keep_boxes = nms(class_boxes, class_scores, iou_threshold)
            keep_boxes.extend(class_indices[class_keep_boxes])
        return keep_boxes

    def compute_iou(box, boxes):
        # Compute xmin, ymin, xmax, ymax of the intersection
        xmin = np.maximum(box[0], boxes[:, 0])
        ymin = np.maximum(box[1], boxes[:, 1])
        xmax = np.minimum(box[2], boxes[:, 2])
        ymax = np.minimum(box[3], boxes[:, 3])
        # Compute intersection area
        intersection_area = np.maximum(0, xmax - xmin) * np.maximum(0, ymax - ymin)
        # Compute union area
        box_area = (box[2] - box[0]) * (box[3] - box[1])
        boxes_area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        union_area = box_area + boxes_area - intersection_area
        # Compute IoU
        iou = intersection_area / union_area
        return iou

    def xywh2xyxy(x):
        # Convert bounding box (x, y, w, h) to bounding box (x1, y1, x2, y2)
        y = np.copy(x)
        y[..., 0] = x[..., 0] - x[..., 2] / 2
        y[..., 1] = x[..., 1] - x[..., 3] / 2
        y[..., 2] = x[..., 0] + x[..., 2] / 2
        y[..., 3] = x[..., 1] + x[..., 3] / 2
        return y

    def draw_detections(image, boxes, scores, class_ids, mask_alpha=0.3):
        det_img = image.copy()
        img_height, img_width = image.shape[:2]
        font_size = min([img_height, img_width]) * 0.0006
        text_thickness = int(min([img_height, img_width]) * 0.001)
        det_img = draw_masks(det_img, boxes, class_ids, mask_alpha)
        # Draw bounding boxes and labels of detections
        for class_id, box, score in zip(class_ids, boxes, scores):
            color = colors[class_id]
            draw_box(det_img, box, color)
            label = class_names[class_id]
            caption = f"{label} {int(score * 100)}%"
            draw_text(det_img, caption, box, color, font_size, text_thickness)
        return det_img

    def draw_box(image: np.ndarray, box: np.ndarray, color: tuple[int, int, int] = (0, 0, 255),
                 thickness: int = 2) -> np.ndarray:
        x1, y1, x2, y2 = box.astype(int)
        return cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness)

    def draw_text(image: np.ndarray, text: str, box: np.ndarray, color: tuple[int, int, int] = (0, 0, 255),
                  font_size: float = 0.001, text_thickness: int = 2) -> np.ndarray:
        x1, y1, x2, y2 = box.astype(int)
        (tw, th), _ = cv2.getTextSize(text=text, fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                                      fontScale=font_size, thickness=text_thickness)
        th = int(th * 1.2)
        # Filled background rectangle behind the label
        cv2.rectangle(image, (x1, y1), (x1 + tw, y1 - th), color, -1)
        return cv2.putText(image, text, (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, font_size,
                           (255, 255, 255), text_thickness, cv2.LINE_AA)

    def draw_masks(image: np.ndarray, boxes: np.ndarray, classes: np.ndarray,
                   mask_alpha: float = 0.3) -> np.ndarray:
        mask_img = image.copy()
        # Draw a filled rectangle per detection in the mask image
        for box, class_id in zip(boxes, classes):
            color = colors[class_id]
            x1, y1, x2, y2 = box.astype(int)
            cv2.rectangle(mask_img, (x1, y1), (x2, y2), color, -1)
        # Blend the mask image with the original
        return cv2.addWeighted(mask_img, mask_alpha, image, 1 - mask_alpha, 0)

target_detection.py

    import time
    import cv2
    import numpy as np
    import onnxruntime

    from detection.utils import xywh2xyxy, draw_detections, multiclass_nms

    class TargetDetection:
        def __init__(self, path, conf_thres=0.7, iou_thres=0.5):
            self.conf_threshold = conf_thres
            self.iou_threshold = iou_thres
            # Initialize model
            self.initialize_model(path)

        def __call__(self, image):
            return self.detect_objects(image)

        def initialize_model(self, path):
            self.session = onnxruntime.InferenceSession(path, providers=onnxruntime.get_available_providers())
            # Get model info
            self.get_input_details()
            self.get_output_details()

        def detect_objects(self, image):
            input_tensor = self.prepare_input(image)
            # Perform inference on the image
            outputs = self.inference(input_tensor)
            self.boxes, self.scores, self.class_ids = self.process_output(outputs)
            return self.boxes, self.scores, self.class_ids

        def prepare_input(self, image):
            self.img_height, self.img_width = image.shape[:2]
            input_img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            # Resize input image
            input_img = cv2.resize(input_img, (self.input_width, self.input_height))
            # Scale input pixel values to 0 to 1
            input_img = input_img / 255.0
            input_img = input_img.transpose(2, 0, 1)
            input_tensor = input_img[np.newaxis, :, :, :].astype(np.float32)
            return input_tensor

        def inference(self, input_tensor):
            start = time.perf_counter()
            outputs = self.session.run(self.output_names, {self.input_names[0]: input_tensor})
            # print(f"Inference time: {(time.perf_counter() - start)*1000:.2f} ms")
            return outputs

        def process_output(self, output):
            predictions = np.squeeze(output[0]).T
            # Filter out object confidence scores below threshold
            scores = np.max(predictions[:, 4:], axis=1)
            predictions = predictions[scores > self.conf_threshold, :]
            scores = scores[scores > self.conf_threshold]
            if len(scores) == 0:
                return [], [], []
            # Get the class with the highest confidence
            class_ids = np.argmax(predictions[:, 4:], axis=1)
            # Get bounding boxes for each object
            boxes = self.extract_boxes(predictions)
            # Apply non-maxima suppression to suppress weak, overlapping bounding boxes
            # indices = nms(boxes, scores, self.iou_threshold)
            indices = multiclass_nms(boxes, scores, class_ids, self.iou_threshold)
            return boxes[indices], scores[indices], class_ids[indices]

        def extract_boxes(self, predictions):
            # Extract boxes from predictions
            boxes = predictions[:, :4]
            # Scale boxes to original image dimensions
            boxes = self.rescale_boxes(boxes)
            # Convert boxes to xyxy format
            boxes = xywh2xyxy(boxes)
            return boxes

        def rescale_boxes(self, boxes):
            # Rescale boxes to original image dimensions
            input_shape = np.array([self.input_width, self.input_height, self.input_width, self.input_height])
            boxes = np.divide(boxes, input_shape, dtype=np.float32)
            boxes *= np.array([self.img_width, self.img_height, self.img_width, self.img_height])
            return boxes

        def draw_detections(self, image, draw_scores=True, mask_alpha=0.4):
            return draw_detections(image, self.boxes, self.scores, self.class_ids, mask_alpha)

        def get_input_details(self):
            model_inputs = self.session.get_inputs()
            self.input_names = [model_inputs[i].name for i in range(len(model_inputs))]
            self.input_shape = model_inputs[0].shape
            self.input_height = self.input_shape[2]
            self.input_width = self.input_shape[3]
            print(self.input_width, self.input_height)

        def get_output_details(self):
            model_outputs = self.session.get_outputs()
            self.output_names = [model_outputs[i].name for i in range(len(model_outputs))]

ATDetector.py

    import cv2
    from detection.target_detection import TargetDetection
    from detection.utils import draw_detections

    # YOLOv8 ONNX model inference
    class ATDetector:
        def __init__(self):
            super().__init__()
            self.model_path = "../yolov8s_best.onnx"
            self.detector = TargetDetection(self.model_path, conf_thres=0.5, iou_thres=0.3)

        def detect_image(self, input_image, output_image):
            cv_img = cv2.imread(input_image)
            boxes, scores, class_ids = self.detector.detect_objects(cv_img)
            cv_img = draw_detections(cv_img, boxes, scores, class_ids)
            cv2.namedWindow("output", cv2.WINDOW_NORMAL)
            cv2.imwrite(output_image, cv_img)
            cv2.imshow("output", cv_img)
            cv2.waitKey(0)

        def detect_video(self, input_video, output_video):
            cap = cv2.VideoCapture(input_video)
            fps = int(cap.get(cv2.CAP_PROP_FPS))
            videoWriter = None
            while True:
                _, cv_img = cap.read()
                if cv_img is None:
                    break
                boxes, scores, class_ids = self.detector.detect_objects(cv_img)
                cv_img = draw_detections(cv_img, boxes, scores, class_ids)
                # Create the video writer on the first frame, once the frame size is known;
                # after that it is no longer None and this branch is skipped
                if videoWriter is None:
                    fourcc = cv2.VideoWriter_fourcc("m", "p", "4", "v")
                    videoWriter = cv2.VideoWriter(output_video, fourcc, fps, (cv_img.shape[1], cv_img.shape[0]))
                videoWriter.write(cv_img)
                cv2.imshow("aod", cv_img)
                cv2.waitKey(5)
                # Stop when the window is closed with the X button
                if cv2.getWindowProperty("aod", cv2.WND_PROP_AUTOSIZE) < 1:
                    break
            cap.release()
            # The writer is never created if the video yields no frames
            if videoWriter is not None:
                videoWriter.release()
            cv2.destroyAllWindows()

    if __name__ == "__main__":
        det = ATDetector()
        # input_image = "../data/A_905.jpg"
        # output_image = "../data/output.jpg"
        # det.detect_image(input_image, output_image)
        input_video = r"E:\dataset\MOT\video\A13.mp4"
        output_video = "../data/output.mp4"
        det.detect_video(input_video, output_video)

3.2 C++ implementation

3.2.1 Why C++?

Python is an interpreted language: code is interpreted as it runs. The operators themselves (convolution, pooling, and so on) execute inside ONNX Runtime's native code, but the preprocessing, post-processing, and glue code around every inference call pass through the interpreter, which adds overhead. C++ is a compiled language: the code is compiled to machine code that the computer executes directly. For a compute-intensive task such as YOLOv8 inference, C++ avoids the interpreter overhead and runs faster.

Python code is fairly portable, but it can be limiting on special hardware platforms and embedded systems. On a resource-constrained embedded device, for example, the Python interpreter and its dependencies (NumPy, ONNX Runtime for Python, and so on) may take too much storage, and the interpreter itself needs resources to run. Python programs can also hit compatibility problems across operating systems because of dependency versions. C++
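is highly portable as well, and compiler options plus platform-specific code let it adapt to different hardware. When YOLOv8 inference has to be deployed on embedded devices, industrial controllers, or similar platforms, C++ is easier to optimize and customize. In an embedded vision system with tight performance and size constraints, for example, C++ compiles to efficient machine code and can be tuned to the device's hardware, such as by using hardware-accelerated instruction sets. In short, a C++ implementation can deliver better real-time performance.

3.2.2 Installing dependencies

1) Download ONNX Runtime

The author's environment is Windows 11, CUDA 11.7, and cuDNN 8.5, with VS2019 as the IDE. The ONNX Runtime CPU and GPU builds used are version 1.14.1. Download link: https://github.com/microsoft/onnxruntime/releases/tag/v1.14.1

2) Download OpenCV

The author uses OpenCV 4.7.0. Download link: https://opencv.org/releases/

3) Configure ONNX Runtime and OpenCV

After downloading, extract both packages and configure them in the project properties. First, add the ONNX Runtime and OpenCV header directories to the include directories. Next, add the directories containing their .lib files to the library directories. Then add the .lib file names to the linker's additional dependencies. Finally, copy the ONNX Runtime and OpenCV .dll files into the project's Release directory.

3.2.3 Inference steps

As in the Python implementation, deployment consists of three stages: preprocessing, model inference, and post-processing. This section focuses on the model inference flow in C++.

1) Image preprocessing

Color space conversion: OpenCV reads images as BGR, while YOLO models usually expect RGB.
Resize the image to the fixed size the network requires while keeping the original aspect ratio, and pad around the image.
Normalization: scale pixel values to [0, 1].
Layout conversion: HWC -> CHW.

2) Model inference

a. Include the header

    #include <onnxruntime_cxx_api.h>

b. Initialize the ONNX Runtime environment and session

Step 1: Create the ONNX Runtime environment

    env = Ort::Env(OrtLoggingLevel::ORT_LOGGING_LEVEL_WARNING, "YOLOV8");

Ort::Env is the environment object of ONNX Runtime. It is a global object that initializes and manages the runtime. It:

initializes the ONNX Runtime library;
controls the logging level and log output;
carries a name identifier, which helps with debugging and tracing.

Logging levels supported by ONNX Runtime:

ORT_LOGGING_LEVEL_VERBOSE: logs everything (most detailed).
ORT_LOGGING_LEVEL_INFO: logs general information.
ORT_LOGGING_LEVEL_WARNING: logs warnings.
ORT_LOGGING_LEVEL_ERROR: logs errors only.
ORT_LOGGING_LEVEL_FATAL: logs fatal errors only.

Step 2: Create the session options

Session options configure the session, for example GPU usage, optimization level, and execution mode.

    sessionOptions = Ort::SessionOptions();

The options control how the ONNX model behaves during inference, including the thread count (parallelism), the graph optimization level, CUDA usage (GPU acceleration), the memory allocator, and session logging.

    // Set the number of intra-op threads
    sessionOptions.SetIntraOpNumThreads(1);

    // Enable GPU inference.
    // OrtCUDAProviderOptions is a struct provided by ONNX Runtime for configuring
    // CUDA inference; CUDA-related parameters are passed through it.
    OrtCUDAProviderOptions cudaOption;
    sessionOptions.AppendExecutionProvider_CUDA(cudaOption);

    // Enable all graph optimizations (maximum optimization)
    sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);

In ONNX Runtime, SetGraphOptimizationLevel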
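sets the graph optimization level, which affects the efficiency and performance of model execution. Graph optimization helps increase inference speed and reduce memory consumption. Each level optimizes the nodes and the computation graph to a different degree. Common levels:

ORT_ENABLE_BASIC: the basic level. It enables basic optimizations of the computation graph, such as node fusion, constant folding, and removal of unused operations. Compared with no optimization, it already gives some speedup.

ORT_ENABLE_EXTENDED: enables more advanced strategies that apply more complex transformations to operations. It optimizes further and may reduce memory use and computation even more.

ORT_ENABLE_ALL: enables every available optimization, including the most aggressive ones. This is the maximum level and tries to get the most inference performance, covering node fusion, constant folding, redundant node removal, graph simplification, and more. It suits scenarios that need the highest performance but may lengthen model loading time, especially for complex models.

Why use ORT_ENABLE_ALL?

Performance: it applies more optimizations to the graph and can speed up inference considerably.
Memory: the optimized graph is usually smaller, so memory use drops as well.
Use cases: in production environments with high performance requirements, or when running large numbers of inferences, enabling all optimizations can noticeably cut execution time and memory use.

Step 3: Load the ONNX model file

Load the pretrained ONNX model file. An Ort::Session object is created from the environment, the session options, and the model.

    const wchar_t* modelPath = L"yolov8.onnx";
    Ort::Session session(env, modelPath, sessionOptions);

The second argument, modelPath, is the path to the model. On Windows it must be passed as a wide-character string (const wchar_t*), because Windows file paths generally use wide-character encoding. The c_str() method returns a pointer to the null-terminated contents of a string object, which makes it compatible with C-style functions and libraries (OpenCV, ONNX Runtime, and so on) that expect const char* or const wchar_t*:

OpenCV: cv::imread() accepts const char*.
ONNX Runtime (Windows): Ort::Session requires const wchar_t*.
For std::string, c_str() returns const char*.
For std::wstring, c_str() returns const wchar_t*.

If the model path is originally a std::string, convert it to std::wstring with a helper function, for example:

    std::wstring w_modelPath = utils::charToWstring(modelPath.c_str());

    std::wstring utils::charToWstring(const char *str)
    {
        // std::codecvt_utf8<wchar_t> is a conversion facet that converts between
        // UTF-8 strings and wchar_t wide strings.
        typedef std::codecvt_utf8<wchar_t> convert_type;
        // std::wstring_convert takes a conversion facet (such as std::codecvt_utf8)
        // and a wide character type (such as wchar_t).
        std::wstring_convert<convert_type, wchar_t> converter;
        return converter.from_bytes(str);
    }

c.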
获取模型输入/输出信息从 Ort::Session 对象中获取模型输入和输出的详细信息包括数量、名称、类型和形状。在 ONNX Runtime 中Ort::Session 提供了两种方法来获取模型输入/输出名称 GetInputName 使用用户提供的内存分配器如 Ort::AllocatorWithDefaultOptions。返回的是 char*指向分配的内存区域。需要用户确保分配的内存不会泄漏ONNX Runtime 不自动释放它。如果分配器没有释放功能可能导致内存泄漏。需要搭配 allocator.Free(inputName); // 释放名称内存 GetInputNameAllocated 直接返回一个 Ort::AllocatedStringPtr对象封装了分配的字符串指针和释放逻辑而不是简单的 char*。内存管理更为安全因为返回的 Ort::AllocatedStringPtr 是 RAII 风格的对象自动释放内存。 Ort::AllocatorWithDefaultOptions allocator; //ONNX Runtime 提供的一个默认内存分配器类用于管理内存资源特别是在获取模型输入/输出的元数据如名称、形状时非常有用// 获取输入信息 std::vectorconst char * inputNames; std::vectorOrt::AllocatedStringPtr input_names_ptr; std::vectorstd::vectorint64_t inputShapes; bool isDynamicInputShape{};size_t numInputNodes session.GetInputCount(); //输入数量 for (size_t i 0; i numInputNodes; i) {// 输入名称auto input_name session.GetInputNameAllocated(i, allocator);inputNames.push_back(input_name.get());//get 返回指向的原始字符串指针也就是 const char* 类型input_names_ptr.push_back(std::move(input_name)); // 输入类型和形状Ort::TypeInfo inputTypeInfo session.GetInputTypeInfo(i);std::vectorint64_t inputTensorShape inputTypeInfo.GetTensorTypeAndShapeInfo().GetShape();inputShapes.push_back(inputTensorShape);isDynamicInputShape false;// checking if width and height are dynamicif (inputTensorShape[2] -1 inputTensorShape[3] -1){std::cout Dynamic input shape std::endl;this-isDynamicInputShape true;} }// 获取输出信息 std::vectorconst char * outputNames; std::vectorOrt::AllocatedStringPtr output_names_ptr; std::vectorstd::vectorint64_t outputShapes; int classNums 3;size_t numOutputNodes session.GetOutputCount();//大于1分割 if (num_output_nodes 1) {hasMask true;std::cout Instance Segmentation std::endl; } elsestd::cout Object Detection std::endl; for (size_t i 0; i numOutputNodes; i) {// 输出名称auto output_name session.GetOutputNameAllocated(i, allocator);outputNames.push_back(output_name.get());output_names_ptr.push_back(std::move(output_name));// 输出类型和形状Ort::TypeInfo outputTypeInfo session.GetOutputTypeInfo(i);std::vectorint64_t 
outputTensorShape outputTypeInfo.GetTensorTypeAndShapeInfo().GetShape();outputShapes.push_back(outputTensorShape);if (i 0){if (!this-hasMask)classNums outputTensorShape[1] - 4;elseclassNums outputTensorShape[1] - 4 - 32;} } 查看模型的输入和输出层可以使用netron这个网站可视化直接导入onnx模型即可。输入层输出层 d. 创建输入张量 std::vectorOrt::Value inputTensors;Ort::MemoryInfo memoryInfo Ort::MemoryInfo::CreateCpu(OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);//表示输入张量数据存储在 CPU 内存中inputTensors.push_back(Ort::Value::CreateTensorfloat(memoryInfo, inputTensorValues.data(), inputTensorSize,inputTensorShape.data(), inputTensorShape.size()));//将数据创建为一个 ONNX Tensor CreateTensor参数解释 memoryInfo内存信息表示数据存储在 CPU 上。 inputTensorValues.data()指向 Tensor 数据的起始位置。 inputTensorSizeTensor 数据的元素个数。 inputTensorShape.data()Tensor 形状的指针。 inputTensorShape.size()Tensor 形状的维度数量。 e. 进行推理 std::vectorOrt::Value outputTensors session.Run(Ort::RunOptions{nullptr}, inputNames.data(), inputTensors.data(), 1, outputNames.data(), outputNames.size()); run 参数解释 Ort::RunOptions{nullptr}RunOptions 是 ONNX Runtime 执行配置对象这里传入 nullptr 使用默认配置。 inputNames.data()输入 Tensor 名称数组的指针指定模型输入的名称。 inputTensors.data()输入 Tensor 数据的指针指定输入数据。 1表示输入 Tensor 数量。 outputNames.data()输出 Tensor 名称数组的指针指定需要输出的节点名称。 outputNames.size()输出 Tensor 数量。 Run 返回一个包含输出 Tensor 的向量 std::vectorOrt::Value每个 Ort::Value 包含模型的一个输出。 3后处理从输出张量获取数据并通过 cv::Mat 转换为矩阵格式CHW → HWC。提取最高置信度的类别和对应的分数过滤低置信度目标。将中心坐标 (cx, cy) 和宽高 (w, h) 转换为左上角坐标 (x, y) 和尺寸格式。去除重叠度高的冗余检测框保留置信度最高的框。将检测框从网络输入尺寸映射回原图尺寸。 3.2.4 完整代码部署 utils.cpp #include utils.hsize_t utils::vectorProduct(const std::vectorint64_t vector) {if (vector.empty())return 0;size_t product 1;for (const auto element : vector)product * element;return product; }std::wstring utils::charToWstring(const char *str) {typedef std::codecvt_utf8wchar_t convert_type;//std::codecvt_utf8wchar_t 是一种转换类型用于将 UTF-8 字符串与 wchar_t 宽字符字符串之间进行相互转换。//在 Windows 系统中wchar_t 通常是 UTF-16 编码。//在 Linux / Unix 系统中wchar_t 通常是 UTF - 32 编码。std::wstring_convertconvert_type, 
wchar_t converter;//std::wstring_convert 需要一个编码转换类型如 std::codecvt_utf8和一个宽字符类型如 wchar_treturn converter.from_bytes(str); }std::vectorstd::string utils::loadNames(const std::string path) {// load class namesstd::vectorstd::string classNames;std::ifstream infile(path);if (infile.good()){std::string line;while (getline(infile, line)){if (line.back() \r)line.pop_back();classNames.emplace_back(line);}infile.close();}else{std::cerr ERROR: Failed to access class name path: path std::endl;}// set colorsrand(time(0));for (int i 0; i 2 * classNames.size(); i){int b rand() % 256;int g rand() % 256;int r rand() % 256;colors.push_back(cv::Scalar(b, g, r));}return classNames; }void utils::visualizeDetection(cv::Mat im, std::vectorYolov8Result results,const std::vectorstd::string classNames) {cv::Mat image im.clone();for (const Yolov8Result result : results){int x result.box.x;int y result.box.y;int conf (int)std::round(result.conf * 100);int classId result.classId;std::string label classNames[classId] 0. 
std::to_string(conf);int baseline 0;cv::Size size cv::getTextSize(label, cv::FONT_ITALIC, 0.4, 1, baseline);image(result.box).setTo(colors[classId classNames.size()], result.boxMask);cv::rectangle(image, result.box, colors[classId], 2);cv::rectangle(image,cv::Point(x, y), cv::Point(x size.width, y 12),colors[classId], -1);cv::putText(image, label,cv::Point(x, y - 3 12), cv::FONT_ITALIC,0.4, cv::Scalar(0, 0, 0), 1);}cv::addWeighted(im, 0.4, image, 0.6, 0, im); }void utils::letterbox(const cv::Mat image, cv::Mat outImage,const cv::Size newShape cv::Size(640, 640),const cv::Scalar color cv::Scalar(114, 114, 114),bool auto_ true,//是否根据步幅对填充尺寸进行自动调整bool scaleFill false,//是否强制将图像拉伸到目标尺寸忽略长宽比bool scaleUp true,//是否允许放大图像如果为 false图像只会缩小或保持原始尺寸int stride 32)//对齐步幅用于控制填充的边缘尺寸 {cv::Size shape image.size();//计算缩放比例float r std::min((float)newShape.height / (float)shape.height,(float)newShape.width / (float)shape.width);//如果 scaleUp 为 false缩放比例 r 被限制为 1.0确保图像不会被放大仅会缩小或保持原尺寸if (!scaleUp)r std::min(r, 1.0f);float ratio[2]{r, r};//调整图像尺寸int newUnpad[2]{(int)std::round((float)shape.width * r),(int)std::round((float)shape.height * r)};//计算填充大小auto dw (float)(newShape.width - newUnpad[0]);auto dh (float)(newShape.height - newUnpad[1]);if (auto_){dw (float)((int)dw % stride);dh (float)((int)dh % stride);}else if (scaleFill){dw 0.0f;dh 0.0f;newUnpad[0] newShape.width;newUnpad[1] newShape.height;ratio[0] (float)newShape.width / (float)shape.width;ratio[1] (float)newShape.height / (float)shape.height;}dw / 2.0f;dh / 2.0f;if (shape.width ! newUnpad[0] shape.height ! 
newUnpad[1]){cv::resize(image, outImage, cv::Size(newUnpad[0], newUnpad[1]));}int top int(std::round(dh - 0.1f));int bottom int(std::round(dh 0.1f));int left int(std::round(dw - 0.1f));int right int(std::round(dw 0.1f));//添加填充cv::copyMakeBorder(outImage, outImage, top, bottom, left, right, cv::BORDER_CONSTANT, color); }void utils::scaleCoords(cv::Rect coords,cv::Mat mask,const float maskThreshold,const cv::Size imageShape,const cv::Size imageOriginalShape) {float gain std::min((float)imageShape.height / (float)imageOriginalShape.height,(float)imageShape.width / (float)imageOriginalShape.width);//计算缩放比例int pad[2] {(int)(((float)imageShape.width - (float)imageOriginalShape.width * gain) / 2.0f),(int)(((float)imageShape.height - (float)imageOriginalShape.height * gain) / 2.0f)};//计算填充边距 coords.x (int)std::round(((float)(coords.x - pad[0]) / gain));//还原到原始图像坐标coords.x std::max(0, coords.x);coords.y (int)std::round(((float)(coords.y - pad[1]) / gain));coords.y std::max(0, coords.y);coords.width (int)std::round(((float)coords.width / gain));coords.width std::min(coords.width, imageOriginalShape.width - coords.x);coords.height (int)std::round(((float)coords.height / gain));coords.height std::min(coords.height, imageOriginalShape.height - coords.y);mask mask(cv::Rect(pad[0], pad[1], imageShape.width - 2 * pad[0], imageShape.height - 2 * pad[1]));//裁剪掩码并去掉边缘填充cv::resize(mask, mask, imageOriginalShape, cv::INTER_LINEAR);mask mask(coords) maskThreshold; } template typename T T utils::clip(const T n, const T lower, const T upper) {return std::max(lower, std::min(n, upper)); } predictor.cpp #include yolov8Predictor.hYOLOPredictor::YOLOPredictor(const std::string modelPath,const bool isGPU,float confThreshold,float iouThreshold,float maskThreshold) {this-confThreshold confThreshold;this-iouThreshold iouThreshold;this-maskThreshold maskThreshold;//初始化一个 ONNX 运行时环境 env并设置日志级别为警告。env Ort::Env(OrtLoggingLevel::ORT_LOGGING_LEVEL_WARNING, YOLOV8);//创建一个会话选项sessionOptions 
Ort::SessionOptions();//获取当前 ONNX 运行时支持的执行提供程序并检查是否支持 CUDA 执行提供程序。std::vectorstd::string availableProviders Ort::GetAvailableProviders();std::cout -------------------- std::endl;for (int i 0; i availableProviders.size(); i){std::cout availableProviders.at(i) std::endl;}auto cudaAvailable std::find(availableProviders.begin(), availableProviders.end(), CUDAExecutionProvider);//在指定的范围内搜索第一个等于给定值的元素并返回一个指向该元素的迭代器。//如果未找到匹配的元素std::find 返回指向范围末尾的迭代器即 end()。OrtCUDAProviderOptions cudaOption;//OrtCUDAProviderOptions 是 ONNX Runtime 提供的一个结构体用于配置 CUDA GPU 推理选项当在 GPU 上使用 ONNX Runtime 时需要通过该结构体指定 CUDA 相关参数。//根据是否使用 GPU 和 CUDA 提供程序是否可用选择相应的执行提供程序并输出相应的推断设备信息。if (isGPU (cudaAvailable availableProviders.end()))//end()指向容器末尾的下一个位置的迭代器{std::cout GPU is not supported by your ONNXRuntime build. Fallback to CPU. std::endl;std::cout Inference device: CPU std::endl;}else if (isGPU (cudaAvailable ! availableProviders.end())){std::cout Inference device: GPU std::endl;sessionOptions.AppendExecutionProvider_CUDA(cudaOption);}else{std::cout Inference device: CPU std::endl;}#ifdef _WIN32//Windows 系统中的文件路径通常使用宽字符Unicode 编码wchar_tstd::wstring w_modelPath utils::charToWstring(modelPath.c_str());//c_str()将 std::string 或 std::wstring 转换为以 \0 结尾的 C 风格字符串方便与需要 const char* 或 const wchar_t* 类型的 C 风格函数或库如 OpenCV、ONNX Runtime 等兼容。//OpenCVcv::imread() 接收 const char*。//ONNX RuntimeWindows 平台Ort::Session 需要 const wchar_t* 。session Ort::Session(env, w_modelPath.c_str(), sessionOptions);//创建一个 Ort::Session 会话通过会话来执行推理任务。 #elsesession Ort::Session(env, modelPath.c_str(), sessionOptions); #endif//获取输入节点和输出节点的数量并判断是否存在掩码输出。const size_t num_input_nodes session.GetInputCount(); //1const size_t num_output_nodes session.GetOutputCount(); //1,2if (num_output_nodes 1){this-hasMask true;std::cout Instance Segmentation std::endl;}elsestd::cout Object Detection std::endl;Ort::AllocatorWithDefaultOptions allocator;//Ort::AllocatorWithDefaultOptions 是 ONNX Runtime 
    // a default memory allocator class, used to manage memory resources; it is especially
    // useful when reading model input/output metadata such as names and shapes.

    // Iterate over the input nodes: record each name and shape, and check whether the input shape is dynamic.
    for (int i = 0; i < num_input_nodes; i++)
    {
        // GetInputNameAllocated returns an Ort::AllocatedStringPtr rather than a plain char*
        // (the older GetInputName returned a C-style char* string).
        auto input_name = session.GetInputNameAllocated(i, allocator);
        // get() returns the raw string pointer it owns, stored here as const char*.
        this->inputNames.push_back(input_name.get());
        // Keep the owning pointer alive so the raw pointer stored above stays valid.
        input_names_ptr.push_back(std::move(input_name));

        Ort::TypeInfo inputTypeInfo = session.GetInputTypeInfo(i);
        std::vector<int64_t> inputTensorShape = inputTypeInfo.GetTensorTypeAndShapeInfo().GetShape();
        this->inputShapes.push_back(inputTensorShape);
        this->isDynamicInputShape = false;
        // checking if width and height are dynamic
        if (inputTensorShape[2] == -1 && inputTensorShape[3] == -1)
        {
            std::cout << "Dynamic input shape" << std::endl;
            this->isDynamicInputShape = true;
        }
    }

    // Iterate over the output nodes: record each name and shape, and derive the number of
    // classes from the first output and from whether a mask output exists.
    for (int i = 0; i < num_output_nodes; i++)
    {
        auto output_name = session.GetOutputNameAllocated(i, allocator);
        this->outputNames.push_back(output_name.get());
        output_names_ptr.push_back(std::move(output_name));

        Ort::TypeInfo outputTypeInfo = session.GetOutputTypeInfo(i);
        std::vector<int64_t> outputTensorShape = outputTypeInfo.GetTensorTypeAndShapeInfo().GetShape();
        this->outputShapes.push_back(outputTensorShape);
        if (i == 0)
        {
            if (!this->hasMask)
                classNums = outputTensorShape[1] - 4;      // 4 box values + n class scores
            else
                classNums = outputTensorShape[1] - 4 - 32; // plus 32 mask coefficients
        }
    }
}

void YOLOPredictor::getBestClassInfo(std::vector<float>::iterator it,
                                     float &bestConf,
                                     int &bestClassId,
                                     const int _classNums)
{
    // first 4 element are box
    bestClassId = 4;
    bestConf = 0;

    for (int i = 4; i < _classNums + 4; i++)
    {
        if (it[i] > bestConf)
        {
            bestConf = it[i];
            bestClassId = i - 4;
        }
    }
}

cv::Mat YOLOPredictor::getMask(const cv::Mat &maskProposals,
                               const cv::Mat &maskProtos)
{
    cv::Mat protos = maskProtos.reshape(0, {(int)this->outputShapes[1][1], (int)this->outputShapes[1][2] * (int)this->outputShapes[1][3]});

    cv::Mat matmul_res = (maskProposals * protos).t();
    cv::Mat masks = matmul_res.reshape(1, {(int)this->outputShapes[1][2], (int)this->outputShapes[1][3]});
    cv::Mat dest;

    // sigmoid
    cv::exp(-masks, dest);
    dest = 1.0 / (1.0 + dest);
    cv::resize(dest, dest, cv::Size((int)this->inputShapes[0][2], (int)this->inputShapes[0][3]), cv::INTER_LINEAR);
    return dest;
}

void YOLOPredictor::preprocessing(cv::Mat &image, float *&blob, std::vector<int64_t> &inputTensorShape)
{
    cv::Mat resizedImage, floatImage;
    cv::cvtColor(image, resizedImage, cv::COLOR_BGR2RGB); // BGR -> RGB
    // letterbox resizes the image to the network input size while keeping the original
    // aspect ratio. It pads the border with the color given by cv::Scalar(114, 114, 114),
    // which is commonly the default padding color for YOLO models.
    utils::letterbox(resizedImage, resizedImage, cv::Size((int)this->inputShapes[0][2], (int)this->inputShapes[0][3]),
                     cv::Scalar(114, 114, 114), this->isDynamicInputShape,
                     false, true, 32);

    inputTensorShape[2] = resizedImage.rows;
    inputTensorShape[3] = resizedImage.cols;

    // Normalize every pixel value to [0, 1].
    resizedImage.convertTo(floatImage, CV_32FC3, 1 / 255.0);
    // Allocate width x height x channels floats for the image data; the caller must delete[] it.
    blob = new float[floatImage.cols * floatImage.rows * floatImage.channels()];
    cv::Size floatImageSize{floatImage.cols, floatImage.rows};

    // hwc -> chw
    std::vector<cv::Mat> chw(floatImage.channels());
    for (int i = 0; i < floatImage.channels(); ++i)
    {
        // This cv::Mat does not copy data. It is a "view" onto the region of blob reserved
        // for channel i; the pointer arithmetic computes where that channel starts in blob.
        chw[i] = cv::Mat(floatImageSize, CV_32FC1, blob + i * floatImageSize.width * floatImageSize.height);
    }
    // Split the image by channel, writing each plane into the memory blob points to.
    cv::split(floatImage, chw);
}

std::vector<Yolov8Result> YOLOPredictor::postprocessing(const cv::Size &resizedImageShape,
                                                        const cv::Size &originalImageShape,
                                                        std::vector<Ort::Value> &outputTensors)
{
    // for box
    std::vector<cv::Rect> boxes;
    std::vector<float> confs;
    std::vector<int> classIds;

    // Pointer to the data of the first output tensor.
    float *boxOutput = outputTensors[0].GetTensorMutableData<float>();
    // [1, 4+n, 8400] => [1, 8400, 4+n] or [1, 4+n+32, 8400] => [1, 8400, 4+n+32]
    // Transpose so that each row holds one candidate detection.
    cv::Mat output0 = cv::Mat(cv::Size((int)this->outputShapes[0][2], (int)this->outputShapes[0][1]), CV_32F, boxOutput).t();
    float *output0ptr = (float *)output0.data;
    int rows = (int)this->outputShapes[0][2];
    int cols = (int)this->outputShapes[0][1];
    // std::cout << rows << cols << std::endl;
    // if hasMask
    std::vector<std::vector<float>> picked_proposals;
    cv::Mat mask_protos;

    for (int i = 0; i < rows; i++)
    {
        // Extract one row of data.
        std::vector<float> it(output0ptr + i * cols, output0ptr + (i + 1) * cols);
        float confidence;
        int classId;
        // Find the class with the highest confidence and its score.
        this->getBestClassInfo(it.begin(), confidence, classId, classNums);

        if (confidence > this->confThreshold) // drop low-confidence candidates
        {
            if (this->hasMask)
            {
                // Skip the 4 box coordinates and the classNums class scores to reach
                // the start of the mask coefficients.
                std::vector<float> temp(it.begin() + 4 + classNums, it.end());
                picked_proposals.push_back(temp);
            }
            // Convert the box from center format to top-left format (left, top, width, height)
            // and store it in boxes.
            int centerX = (int)(it[0]);
            int centerY = (int)(it[1]);
            int width = (int)(it[2]);
            int height = (int)(it[3]);
            int left = centerX - width / 2;
            int top = centerY - height / 2;
            boxes.emplace_back(left, top, width, height);
            confs.emplace_back(confidence);
            classIds.emplace_back(classId);
        }
    }

    // Apply non-maximum suppression to remove redundant, heavily overlapping boxes.
    std::vector<int> indices; // indices of the boxes that are kept
    cv::dnn::NMSBoxes(boxes, confs, this->confThreshold, this->iouThreshold, indices);

    if (this->hasMask)
    {
        float *maskOutput = outputTensors[1].GetTensorMutableData<float>();
        std::vector<int> mask_protos_shape = {1, (int)this->outputShapes[1][1], (int)this->outputShapes[1][2], (int)this->outputShapes[1][3]};
        mask_protos = cv::Mat(mask_protos_shape, CV_32F, maskOutput);
    }

    std::vector<Yolov8Result> results;
    for (int idx : indices)
    {
        Yolov8Result res;
        res.box = cv::Rect(boxes[idx]);
        if (this->hasMask)
            // A mask output exists: call getMask to build the instance segmentation mask.
            res.boxMask = this->getMask(cv::Mat(picked_proposals[idx]).t(), mask_protos);
        else
            res.boxMask = cv::Mat::zeros((int)this->inputShapes[0][2], (int)this->inputShapes[0][3], CV_8U);

        // Map the box and mask from network input size back to the original image coordinates.
        utils::scaleCoords(res.box, res.boxMask, this->maskThreshold, resizedImageShape, originalImageShape);
        res.conf = confs[idx];
        res.classId = classIds[idx];
        results.emplace_back(res);
    }

    return results;
}

std::vector<Yolov8Result> YOLOPredictor::predict(cv::Mat &image)
{
    float *blob = nullptr; // holds the preprocessed image data
    // -1, -1 are placeholders for the height and width; preprocessing fills them in
    // at runtime from the actual (letterboxed) image size.
    std::vector<int64_t> inputTensorShape{1, 3, -1, -1};
    this->preprocessing(image, blob, inputTensorShape); // preprocessing

    // Number of elements in the input tensor.
    size_t inputTensorSize = utils::vectorProduct(inputTensorShape);

    // Copy the preprocessed data into a vector; blob points to its first element.
    std::vector<float> inputTensorValues(blob, blob + inputTensorSize);

    std::vector<Ort::Value> inputTensors;

    // The tensor data lives in CPU memory.
    Ort::MemoryInfo memoryInfo = Ort::MemoryInfo::CreateCpu(
        OrtAllocatorType::OrtArenaAllocator, OrtMemType::OrtMemTypeDefault);

    // Wrap the data as an ONNX tensor:
    //   memoryInfo                 memory info; the data is stored on the CPU
    //   inputTensorValues.data()   start of the tensor data
    //   inputTensorSize            number of elements in the tensor
    //   inputTensorShape.data()    pointer to the tensor shape
    //   inputTensorShape.size()    number of dimensions in the shape
    inputTensors.push_back(Ort::Value::CreateTensor<float>(
        memoryInfo, inputTensorValues.data(), inputTensorSize,
        inputTensorShape.data(), inputTensorShape.size()));

    // Run inference:
    //   Ort::RunOptions{nullptr}   run configuration object; nullptr selects the defaults
    //   this->inputNames.data()    array of input tensor names
    //   inputTensors.data()        array of input tensors
    //   1                          number of input tensors
    //   this->outputNames.data()   array of output node names to fetch
    //   this->outputNames.size()   number of output tensors
    // Run returns a std::vector<Ort::Value>; each Ort::Value holds one model output.
    std::vector<Ort::Value> outputTensors = this->session.Run(Ort::RunOptions{nullptr},
                                                              this->inputNames.data(),
                                                              inputTensors.data(),
                                                              1,
                                                              this->outputNames.data(),
                                                              this->outputNames.size());

    // Size (width, height) of the tensor that was actually fed to the model.
    cv::Size resizedShape = cv::Size((int)inputTensorShape[3], (int)inputTensorShape[2]);
    std::vector<Yolov8Result> result = this->postprocessing(resizedShape,
                                                            image.size(),
                                                            outputTensors); // postprocessing

    delete[] blob;

    return result;
}

3.3 Inference Test

Image test
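Before running the predictor on test images, it can help to check the HWC to CHW repacking from preprocessing in isolation. The following is a minimal sketch that does not use OpenCV: the helper name hwcToChw and its signature are assumptions made for illustration and do not appear in the article's code. It mirrors the same two steps, scaling by 1/255 and writing each channel into its own contiguous plane.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the article's YOLOPredictor class):
// repack an interleaved HWC 8-bit image into a planar CHW float buffer,
// scaling every value to [0, 1].
std::vector<float> hwcToChw(const std::vector<std::uint8_t> &hwc,
                            int height, int width, int channels)
{
    assert(hwc.size() == static_cast<std::size_t>(height) * width * channels);

    std::vector<float> chw(hwc.size());
    const std::size_t planeSize = static_cast<std::size_t>(height) * width;

    for (int y = 0; y < height; ++y)
    {
        for (int x = 0; x < width; ++x)
        {
            for (int c = 0; c < channels; ++c)
            {
                // Source index: pixels are stored one after another, channels interleaved.
                const std::size_t src = (static_cast<std::size_t>(y) * width + x) * channels + c;
                // Destination index: channel c occupies its own plane of planeSize values.
                const std::size_t dst = c * planeSize + static_cast<std::size_t>(y) * width + x;
                chw[dst] = hwc[src] / 255.0f;
            }
        }
    }
    return chw;
}
```

For a 1x2 RGB image with pixels (255, 0, 51) and (0, 255, 102), the result holds the two R values first, then the two G values, then the two B values. This is the layout the [1, 3, H, W] input tensor expects.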

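The per-row decoding done in postprocessing can also be checked on its own. The sketch below assumes a row laid out as [cx, cy, w, h, score_0, ..., score_{n-1}], as described above. The names Box, Detection and decodeRow are hypothetical and introduced only for this illustration. It picks the best class, applies the confidence threshold, and converts the box from center format to top-left format. Non-maximum suppression and mask handling are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration; the article's code uses cv::Rect and Yolov8Result.
struct Box
{
    int left, top, width, height;
};

struct Detection
{
    Box box;
    int classId;
    float conf;
};

// Decode one candidate row: [cx, cy, w, h, score_0, ..., score_{n-1}].
// Returns false when the best class score does not exceed confThreshold.
bool decodeRow(const std::vector<float> &row, int numClasses,
               float confThreshold, Detection &out)
{
    assert(row.size() >= static_cast<std::size_t>(4 + numClasses));

    // Find the class with the highest score; the first 4 values are the box.
    int bestId = 0;
    float bestConf = 0.0f;
    for (int c = 0; c < numClasses; ++c)
    {
        if (row[4 + c] > bestConf)
        {
            bestConf = row[4 + c];
            bestId = c;
        }
    }
    if (bestConf <= confThreshold)
        return false;

    // Convert the center-format box to (left, top, width, height).
    const int cx = static_cast<int>(row[0]);
    const int cy = static_cast<int>(row[1]);
    const int w = static_cast<int>(row[2]);
    const int h = static_cast<int>(row[3]);
    out = Detection{Box{cx - w / 2, cy - h / 2, w, h}, bestId, bestConf};
    return true;
}
```

A row with center (100, 80), size 40x20 and scores {0.1, 0.7, 0.2} decodes to class 1 with a box whose top-left corner is (80, 70). The same row is rejected if no score exceeds the threshold.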