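YOLOS and DETR: apart from YOLOS having no convolutional layers, almost every operation is the same. See the official HF documentation.
Because an object detection model actually outputs hundreds or thousands of "boxes", computing the loss is fairly involved. The loss is the bipartite matching loss; see this blog.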
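A target is a dict made up of class_labels and boxes. Suppose one image has 5 target boxes. num_detection_tokens is the maximum number of boxes the model can produce for one image.

The loss computation, briefly:

1. The ViT model takes the preprocessed image as input and outputs the last hidden state, of shape [batch_size, seq_len, hidden_size].
2. The hidden states of the last num_detection_tokens tokens are kept, giving [batch_size, num_detection_tokens, hidden_size]. The class labels and boxes are predicted from these.
3. The model outputs num_detection_tokens boxes while the target has only 5, so predictions and targets have to be matched one-to-one.
4. Matching. First compute 3 cost matrices, each of shape [num_detection_tokens, num_target_box]. Element (i, j) is the cost between prediction i and target j, so every pred/target pair is scored once. The three matrices are the label cost (cross-entropy in the final loss; the official matcher below approximates it with the negative predicted probability of the target class), the coordinate cost (the L1 distance over the 4 values that describe a box), and the GIoU cost (GIoU between the two boxes). Their weighted sum is the overall cost matrix, also of shape [num_detection_tokens, num_target_box]. Running linear_sum_assignment on this matrix returns the matching with the lowest cost: it picks 5 elements, no two of them in the same row or the same column, whose sum is minimal. The matching is a list of index pairs of length min(num_detection_tokens, num_target_box), which is 5 in this example.
5. Using this matching, compute the loss between each matched pred and target (5 pairs in this example) and sum the results. The final loss is the weighted sum of the three losses described above.

There are actually two more losses:

"cardinality" loss: the L1 loss between the number of outputs (out of num_detection_tokens) whose class_label is not "no object" and num_target_box. Put simply, apart from the 5 boxes that have a real class, the remaining boxes should be classified as "no object" as far as possible, so that the model does not detect too many objects. This loss produces no gradient, though; it is only used for evaluation.

mask loss: its purpose is not yet clear to me.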
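The matching step above can be sketched in plain Python on a toy example. This is only an illustration under stated assumptions: all function names and the toy numbers are hypothetical, the class cost follows the official matcher (negative probability of the target class), and an exhaustive search over permutations stands in for scipy's linear_sum_assignment, which finds the same minimum far more efficiently. The last class index is assumed to be "no object".

```python
from itertools import permutations

def to_corners(box):
    """Convert (cx, cy, w, h) to (x0, y0, x1, y1)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def giou(box_a, box_b):
    """Generalized IoU of two non-degenerate boxes in (cx, cy, w, h) format."""
    ax0, ay0, ax1, ay1 = to_corners(box_a)
    bx0, by0, bx1, by1 = to_corners(box_b)
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    # Area of the smallest box enclosing both inputs.
    hull = (max(ax1, bx1) - min(ax0, bx0)) * (max(ay1, by1) - min(ay0, by0))
    return inter / union - (hull - union) / hull

def cost_matrix(probs, pred_boxes, target_ids, target_boxes,
                class_w=1.0, bbox_w=1.0, giou_w=1.0):
    """Weighted matching cost of shape [num_predictions, num_targets]."""
    rows = []
    for p, pb in zip(probs, pred_boxes):
        row = []
        for t, tb in zip(target_ids, target_boxes):
            class_cost = -p[t]                                   # -prob of target class
            bbox_cost = sum(abs(a - b) for a, b in zip(pb, tb))  # L1 distance
            giou_cost = -giou(pb, tb)
            row.append(class_w * class_cost + bbox_w * bbox_cost + giou_w * giou_cost)
        rows.append(row)
    return rows

def brute_force_match(cost):
    """Return sorted (pred_idx, target_idx) pairs with minimal total cost.

    Exhaustive search, assuming at least as many predictions as targets.
    Only suitable for tiny examples.
    """
    n_pred, n_tgt = len(cost), len(cost[0])
    best = min(permutations(range(n_pred), n_tgt),
               key=lambda rows: sum(cost[r][c] for c, r in enumerate(rows)))
    return sorted((r, c) for c, r in enumerate(best))

def cardinality_error(probs, num_targets, no_object_id):
    """|number of predictions not classified as 'no object' - number of targets|."""
    predicted = sum(
        1 for p in probs
        if max(range(len(p)), key=p.__getitem__) != no_object_id
    )
    return abs(predicted - num_targets)

# Toy data: 4 predictions, 3 classes (index 2 plays the role of "no object").
probs = [
    [0.10, 0.10, 0.80],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.30, 0.30, 0.40],
]
pred_boxes = [
    (0.50, 0.50, 0.2, 0.2),
    (0.25, 0.25, 0.2, 0.2),
    (0.75, 0.75, 0.3, 0.3),
    (0.10, 0.90, 0.1, 0.1),
]
# Two ground-truth boxes.
target_ids = [1, 0]
target_boxes = [(0.75, 0.75, 0.3, 0.3), (0.25, 0.25, 0.2, 0.2)]

cost = cost_matrix(probs, pred_boxes, target_ids, target_boxes)
matches = brute_force_match(cost)
print(matches)  # [(1, 1), (2, 0)]
print(cardinality_error(probs, len(target_ids), no_object_id=2))  # 0
```

Here prediction 2 is paired with target 0 and prediction 1 with target 1; predictions 0 and 3 stay unmatched and would be treated as "no object". The number of pairs equals min(num_predictions, num_targets), as in the description above.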
The official matching function (Hungarian algorithm):
import torch
from torch import nn
from scipy.optimize import linear_sum_assignment

# requires_backends, generalized_box_iou and center_to_corners_format are helpers
# available inside the transformers source file this class is taken from.


# Copied from transformers.models.detr.modeling_detr.DetrHungarianMatcher with Detr->Yolos
class YolosHungarianMatcher(nn.Module):
    """
    This class computes an assignment between the targets and the predictions of the network.

    For efficiency reasons, the targets don't include the no_object. Because of this, in general, there are more
    predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are
    un-matched (and thus treated as non-objects).

    Args:
        class_cost:
            The relative weight of the classification error in the matching cost.
        bbox_cost:
            The relative weight of the L1 error of the bounding box coordinates in the matching cost.
        giou_cost:
            The relative weight of the giou loss of the bounding box in the matching cost.
    """

    def __init__(self, class_cost: float = 1, bbox_cost: float = 1, giou_cost: float = 1):
        super().__init__()
        requires_backends(self, ["scipy"])

        self.class_cost = class_cost
        self.bbox_cost = bbox_cost
        self.giou_cost = giou_cost
        if class_cost == 0 and bbox_cost == 0 and giou_cost == 0:
            raise ValueError("All costs of the Matcher can't be 0")

    @torch.no_grad()
    def forward(self, outputs, targets):
        """
        Args:
            outputs (dict):
                A dictionary that contains at least these entries:
                * "logits": Tensor of dim [batch_size, num_queries, num_classes] with the classification logits
                * "pred_boxes": Tensor of dim [batch_size, num_queries, 4] with the predicted box coordinates.
            targets (List[dict]):
                A list of targets (len(targets) = batch_size), where each target is a dict containing:
                * "class_labels": Tensor of dim [num_target_boxes] (where num_target_boxes is the number of
                  ground-truth objects in the target) containing the class labels
                * "boxes": Tensor of dim [num_target_boxes, 4] containing the target box coordinates.

        Returns:
            List[Tuple]: A list of size batch_size, containing tuples of (index_i, index_j) where:
            - index_i is the indices of the selected predictions (in order)
            - index_j is the indices of the corresponding selected targets (in order)
            For each batch element, it holds: len(index_i) = len(index_j) = min(num_queries, num_target_boxes)
        """
        batch_size, num_queries = outputs["logits"].shape[:2]

        # We flatten to compute the cost matrices in a batch
        out_prob = outputs["logits"].flatten(0, 1).softmax(-1)  # [batch_size * num_queries, num_classes]
        out_bbox = outputs["pred_boxes"].flatten(0, 1)  # [batch_size * num_queries, 4]

        # Also concat the target labels and boxes
        target_ids = torch.cat([v["class_labels"] for v in targets])
        target_bbox = torch.cat([v["boxes"] for v in targets])

        # Compute the classification cost. Contrary to the loss, we don't use the NLL,
        # but approximate it in 1 - proba[target class].
        # The 1 is a constant that doesn't change the matching, it can be omitted.
        class_cost = -out_prob[:, target_ids]

        # Compute the L1 cost between boxes
        bbox_cost = torch.cdist(out_bbox, target_bbox, p=1)

        # Compute the giou cost between boxes
        giou_cost = -generalized_box_iou(center_to_corners_format(out_bbox), center_to_corners_format(target_bbox))

        # Final cost matrix
        cost_matrix = self.bbox_cost * bbox_cost + self.class_cost * class_cost + self.giou_cost * giou_cost
        cost_matrix = cost_matrix.view(batch_size, num_queries, -1).cpu()

        # Split the target axis per image and solve one assignment problem per batch element
        sizes = [len(v["boxes"]) for v in targets]
        indices = [linear_sum_assignment(c[i]) for i, c in enumerate(cost_matrix.split(sizes, -1))]
        return [(torch.as_tensor(i, dtype=torch.int64), torch.as_tensor(j, dtype=torch.int64)) for i, j in indices]

Object detection has many more details worth covering; I will update this post later.