IoU
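Intersection over Union (IoU) is an important evaluation metric in object detection. The first figure above marks the gt box and the predicted box; IoU is computed from the Intersection Area $I$ and the Union Area $U$ of these two boxes A and B:
$$IoU = \frac{I}{U} = \frac{|A \cap B|}{|A \cup B|}$$
However, most existing algorithms optimize this metric indirectly through distance losses (for example, the smooth_L1 loss in SSD). Frankly, "the optimal objective for a metric is the metric itself", so we could use IoU directly as the regression loss. Unfortunately, IoU cannot optimize bboxes that do not overlap.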
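To make the mismatch between a distance loss and the IoU metric concrete, here is a small sketch. It uses a plain L1 distance on the corner coordinates as a simplified stand-in for smooth_L1. The helper names `iou_xyxy` and `l1_distance` and the box coordinates are made up for illustration. The three predictions sit at the same L1 distance from the gt box, yet their IoU values differ.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def l1_distance(a, b):
    """Sum of absolute coordinate differences (plain L1, not smooth L1)."""
    return sum(abs(p - q) for p, q in zip(a, b))


gt = (0, 0, 10, 10)
preds = {
    "shifted": (2, 2, 12, 12),
    "stretched": (0, 0, 10, 18),
    "shrunk": (4, 4, 10, 10),
}
for name, pred in preds.items():
    # Every prediction has the same L1 distance to gt, but a different IoU.
    print(name, l1_distance(gt, pred), round(iou_xyxy(gt, pred), 4))
```

A distance loss would rate these three predictions as equally good, while the metric used for evaluation does not.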
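Using IoU as the loss ($\mathcal{L}_{IoU} = 1 - IoU$) has two advantages and one drawback:
1. IoU effectively compares the similarity of two arbitrary shapes.
2. IoU is scale-invariant.
3. If two shapes A and B do not overlap, their IoU is always 0, so IoU cannot tell whether A and B are very close or very far apart.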
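Properties 2 and 3 can be checked numerically. The sketch below is only an illustration: `iou_xyxy`, `scale` and the coordinates are hypothetical names and values, and boxes are assumed to be axis-aligned `(xmin, ymin, xmax, ymax)` tuples.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def scale(box, s):
    """Multiply every coordinate of a box by the factor s."""
    return tuple(s * v for v in box)


a, b = (0, 0, 4, 4), (2, 2, 6, 6)
# Scale invariance: enlarging both boxes by the same factor keeps the IoU.
print(iou_xyxy(a, b), iou_xyxy(scale(a, 10), scale(b, 10)))

near, far = (5, 0, 9, 4), (50, 0, 54, 4)
# No overlap: IoU is 0 whether the other box is 1 or 46 units away.
print(iou_xyxy(a, near), iou_xyxy(a, far))
```

With $\mathcal{L}_{IoU} = 1 - IoU$, both disjoint cases give a loss of exactly 1, so the loss carries no information about which way to move the box.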
GIoU
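GIoU introduces the smallest convex shape $C$ that encloses both A and B. The formula is:
$$GIoU = IoU - \frac{|C \setminus (A \cup B)|}{|C|}$$
The figure below shows two detection results, bad & better; it is easy to see that the farther the prediction is from the gt box, the larger $C$ becomes.
The loss can therefore be written as $\mathcal{L}_{GIoU} = 1 - GIoU$. Since $GIoU \in (-1, 1]$, $\mathcal{L}_{GIoU}$ takes values in $[0, 2)$.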
In summary, this generalization keeps the major properties of IoU while rectifying its weakness.
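As a quick numeric check, the sketch below assumes GIoU = IoU - (|C| - |A ∪ B|) / |C| with $C$ taken as the smallest enclosing axis-aligned box, which is what the `GIoU` function in the example code later in this post computes. The name `giou_xyxy` and the coordinates are made up for illustration.

```python
def giou_xyxy(a, b):
    """GIoU of two xyxy boxes, with C the smallest enclosing axis-aligned box."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    # Second term: share of C that is covered by neither box.
    return inter / union - (c_area - union) / c_area


gt = (0, 0, 4, 4)
near, far = (5, 0, 9, 4), (50, 0, 54, 4)
for name, pred in (("same", gt), ("near", near), ("far", far)):
    giou = giou_xyxy(gt, pred)
    # The loss 1 - GIoU grows as the disjoint box moves away from gt.
    print(name, round(giou, 4), round(1 - giou, 4))
```

Unlike IoU, the two disjoint predictions now get different values, so the loss still prefers the nearer one.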
DIoU & CIoU
The paper points out that GIoU loss still suffers from slow convergence and inaccurate regression.
In this paper, we propose a Distance-IoU (DIoU) loss by incorporating the normalized distance between the predicted box and the target box, which converges much faster in training than IoU and GIoU losses. Furthermore, this paper summarizes three geometric factors in bounding box regression, i.e., overlap area, central point distance and aspect ratio, based on which a Complete IoU (CIoU) loss is proposed, thereby leading to faster convergence and better performance. Moreover, DIoU can be easily adopted into non-maximum suppression (NMS) to act as the criterion, further boosting performance improvement.
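Analyzing GIoU loss, the authors found that GIoU first tries to enlarge the predicted box until it overlaps the target bbox, and only then uses the IoU term to maximize the overlap area, as shown in the left figure below:
Moreover, when one box contains the other, GIoU loss degenerates to IoU loss. Aligning the bounding boxes then becomes harder and convergence is slower.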
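The degenerate case is easy to reproduce. The sketch below is a hypothetical illustration (`iou_and_giou` and the coordinates are not from the paper) and takes $C$ as the smallest enclosing axis-aligned box, as the example code later in this post does. When one box lies inside the other, $C$ coincides with the outer box, the GIoU penalty term is 0, and the position of the inner box no longer matters.

```python
def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two xyxy boxes; C is the enclosing axis-aligned box."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    c_area = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    iou = inter / union
    return iou, iou - (c_area - union) / c_area


outer = (0, 0, 10, 10)
inner_center = (3, 3, 7, 7)  # 4x4 box in the middle of outer
inner_corner = (0, 0, 4, 4)  # the same 4x4 box pushed into a corner
for inner in (inner_center, inner_corner):
    iou, giou = iou_and_giou(outer, inner)
    # Both placements give IoU == GIoU, so GIoU loss cannot tell them apart.
    print(inner, iou, giou)
```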
In Distance-IoU (DIoU) loss, we simply add a penalty term on IoU loss to directly minimize the normalized distance between central points of two bounding boxes, leading to much faster convergence than GIoU loss.
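The authors argue that a good bbox regression loss should account for three important geometric measures: overlap area, central point distance, and aspect ratio. Combining these, they further propose the Complete IoU (CIoU) loss. DIoU can also replace IoU inside NMS, which makes detection more robust when objects are occluded.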
DIoU
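Referring to the figure above, the DIoU loss is:
$$\mathcal{L}_{DIoU} = 1 - IoU + \frac{d^2}{c^2}$$
Here $d$ is the Euclidean distance between the center points of the predicted box and the gt box, and $c$ is the diagonal length of the smallest convex shape mentioned in GIoU.
Advantages:
- Like GIoU loss, DIoU loss still gives the bounding box a direction to move in when it does not overlap the target box.
- DIoU loss directly minimizes the distance between the two boxes, so it converges much faster than GIoU loss.
- When one box contains the other, or when the two boxes are aligned horizontally or vertically, DIoU loss makes regression very fast, whereas GIoU loss almost degenerates to IoU loss.
- DIoU can also replace plain IoU as the criterion in NMS, giving more reasonable and effective NMS results.
Like $\mathcal{L}_{GIoU}$, $\mathcal{L}_{DIoU}$ also takes values in $[0, 2)$.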
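The points above can be illustrated with a short sketch. It assumes DIoU = IoU - d² / c², with d the distance between the box centers and c the diagonal of the smallest enclosing axis-aligned box, matching the `DIoU` function in the example code later in this post. `diou_xyxy` and the coordinates are hypothetical.

```python
def diou_xyxy(a, b):
    """DIoU of two xyxy boxes: IoU minus the normalized squared center distance."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Squared distance between the two box centers.
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest enclosing box.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return inter / union - d2 / c2


outer = (0, 0, 10, 10)
inner_center, inner_corner = (3, 3, 7, 7), (0, 0, 4, 4)
# Containment: same IoU (0.16), but the off-center box is penalized.
print(diou_xyxy(outer, inner_center), diou_xyxy(outer, inner_corner))

gt, near, far = (0, 0, 4, 4), (5, 0, 9, 4), (50, 0, 54, 4)
# No overlap: the nearer box still scores higher, so the loss has a direction.
print(diou_xyxy(gt, near), diou_xyxy(gt, far))
```

In the containment case the centered box keeps the full IoU of 0.16, while the corner box is pulled down by the d² / c² term, which is exactly the signal GIoU loss lacks there.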
CIoU
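$\mathcal{L}_{CIoU}$ builds on $\mathcal{L}_{DIoU}$ by also considering aspect ratios:
$$\mathcal{L}_{CIoU} = 1 - IoU + \frac{d^2}{c^2} + \alpha v$$
$$v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2, \qquad \alpha = \frac{v}{(1 - IoU) + v}$$
Well, this one... looks pretty darn complicated.
Here $v$ measures the consistency of the aspect ratio, and $\alpha$ is a positive trade-off parameter that is excluded from differentiation, i.e. treated as a constant during backpropagation.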
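To see what $v$ and $\alpha$ contribute, here is a scalar sketch that evaluates CIoU = IoU - d² / c² - αv directly, with $v$ and $\alpha$ as computed in the example code later in this post. It is an assumption-laden illustration: `ciou_xyxy` and the boxes are hypothetical, and it differs from the `CIoU` functions in the example code, which carry an extra surrogate term `ar` instead of using $v$ in the final expression.

```python
import math


def ciou_xyxy(pred, gt):
    """CIoU value of two xyxy boxes, evaluated as IoU - d^2/c^2 - alpha * v."""
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = inter_w * inter_h
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    iou = inter / (w * h + w_gt * h_gt - inter)
    d2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
        + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    c2 = (max(pred[2], gt[2]) - min(pred[0], gt[0])) ** 2 \
        + (max(pred[3], gt[3]) - min(pred[1], gt[1])) ** 2
    # v is 0 when both boxes have the same aspect ratio.
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    # alpha weights v; the guard avoids 0 / 0 for identical boxes.
    alpha = v / ((1.0 - iou) + v) if v > 0 else 0.0
    return iou - d2 / c2 - alpha * v


gt = (0, 0, 4, 4)
pred_square = (1, 1, 3, 3)  # same center and aspect ratio as gt
pred_wide = (0, 1, 4, 3)    # same center, but twice as wide as tall
print(ciou_xyxy(pred_square, gt), ciou_xyxy(pred_wide, gt))
```

Both predictions share the gt center, so the distance term vanishes; only the wide box pays the aspect-ratio penalty. In a training loss, $\alpha$ would additionally be detached from the gradient, which a plain scalar function cannot show.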
DIoU-NMS
Haven't tried this one yet; stay tuned...
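For reference, here is a minimal NumPy sketch of the idea mentioned earlier, namely using DIoU instead of IoU as the suppression criterion in NMS. It has not been validated in a real detector. The function names `diou_one_to_many` and `diou_nms`, the default threshold, and the sample boxes are all assumptions made for illustration, and boxes are assumed to be xyxy with positive width and height.

```python
import numpy as np


def diou_one_to_many(box, boxes):
    """DIoU between one xyxy box (shape (4,)) and an (N, 4) array of xyxy boxes."""
    inter_wh = np.maximum(
        np.minimum(box[2:], boxes[:, 2:]) - np.maximum(box[:2], boxes[:, :2]), 0.0)
    inter = inter_wh[:, 0] * inter_wh[:, 1]
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area + areas - inter)
    # Squared center distance and squared diagonal of the enclosing box.
    d2 = np.sum(((boxes[:, :2] + boxes[:, 2:]) / 2 - (box[:2] + box[2:]) / 2) ** 2, axis=1)
    enclose_wh = np.maximum(box[2:], boxes[:, 2:]) - np.minimum(box[:2], boxes[:, :2])
    c2 = np.sum(enclose_wh ** 2, axis=1)
    return iou - d2 / c2


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that drops a box when its DIoU with a kept box is >= threshold."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        diou = diou_one_to_many(boxes[best], boxes[rest])
        order = rest[diou < threshold]  # keep only boxes that are not suppressed
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
scores = [0.9, 0.8, 0.7]
print(diou_nms(boxes, scores))
```

Because the center-distance term lowers the score of boxes whose centers are far apart, two heavily overlapping boxes with distant centers are less likely to suppress each other than under plain IoU.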
Example
import numpy as np
import matplotlib.pyplot as plt
import math

epsilon = 1e-5  # keeps divisions finite in bbox_ciou_tf


def IoU(box1, box2, wh=False):
    # wh=True: boxes are (cx, cy, w, h); otherwise (xmin, ymin, xmax, ymax)
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
    # width and height of the intersection (negative if the boxes are disjoint)
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)
    # areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)
    cross = max(0, W) * max(0, H)  # intersection area
    iou = float(cross) / (SA + SB - cross)
    return iou

def GIoU(box1, box2, wh=False):
    if wh:
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
    iou = IoU(box1, box2, wh)
    # area of the smallest enclosing box C
    SC = (max(xmax1, xmax2) - min(xmin1, xmin2)) * (max(ymax1, ymax2) - min(ymin1, ymin2))
    # width and height of the intersection
    W = min(xmax1, xmax2) - max(xmin1, xmin2)
    H = min(ymax1, ymax2) - max(ymin1, ymin2)
    # areas of the two boxes
    SA = (xmax1 - xmin1) * (ymax1 - ymin1)
    SB = (xmax2 - xmin2) * (ymax2 - ymin2)
    cross = max(0, W) * max(0, H)  # intersection area
    add_area = SA + SB - cross  # union area of the two boxes
    end_area = (SC - add_area) / SC  # share of C that is covered by neither box
    giou = iou - end_area
    return giou

def DIoU(box1, box2, wh=False):
    if wh:
        # squared distance between the two centers
        inter_diag = (box1[0] - box2[0]) ** 2 + (box1[1] - box2[1]) ** 2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        # fixed: the x term was written as "/2 ** 2", which divides by 4 instead of squaring
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2
    iou = IoU(box1, box2, wh)
    # width and height of the smallest enclosing box
    enclose1 = max(max(xmax1, xmax2) - min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2) - min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)  # squared diagonal of the enclosing box
    diou = iou - 1.0 * inter_diag / outer_diag
    return diou

def CIoU(box1, box2, wh=False, normaled=False):
    if wh:
        w1, h1 = box1[2], box1[3]
        w2, h2 = box2[2], box2[3]
        # squared distance between the two centers
        inter_diag = (box1[0] - box2[0]) ** 2 + (box1[1] - box2[1]) ** 2
        xmin1, ymin1 = box1[0] - box1[2] / 2.0, box1[1] - box1[3] / 2.0
        xmax1, ymax1 = box1[0] + box1[2] / 2.0, box1[1] + box1[3] / 2.0
        xmin2, ymin2 = box2[0] - box2[2] / 2.0, box2[1] - box2[3] / 2.0
        xmax2, ymax2 = box2[0] + box2[2] / 2.0, box2[1] + box2[3] / 2.0
    else:
        xmin1, ymin1, xmax1, ymax1 = box1
        xmin2, ymin2, xmax2, ymax2 = box2
        w1, h1 = xmax1 - xmin1, ymax1 - ymin1
        w2, h2 = xmax2 - xmin2, ymax2 - ymin2
        center_x1 = (xmax1 + xmin1) / 2
        center_y1 = (ymax1 + ymin1) / 2
        center_x2 = (xmax2 + xmin2) / 2
        center_y2 = (ymax2 + ymin2) / 2
        # fixed: the x term was written as "/2 ** 2", which divides by 4 instead of squaring
        inter_diag = (center_x1 - center_x2) ** 2 + (center_y1 - center_y2) ** 2
    iou = IoU(box1, box2, wh)
    enclose1 = max(max(xmax1, xmax2) - min(xmin1, xmin2), 0.0)
    enclose2 = max(max(ymax1, ymax2) - min(ymin1, ymin2), 0.0)
    outer_diag = (enclose1 ** 2) + (enclose2 ** 2)  # squared diagonal of the enclosing box
    u = (inter_diag) / outer_diag  # normalized center distance
    arctan = math.atan(w2 / h2) - math.atan(w1 / h1)
    v = (4 / (math.pi ** 2)) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2
    S = 1 - iou
    alpha = v / (S + v)
    # same ar term as in bbox_ciou_torch below; note that ar is not the same quantity as v
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)
    return cious

def bbox_giou_np(boxes1, boxes2):
    # xywh -> xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    # area of C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU from its definition
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area
    return giou

# https://github.com/YunYang1994/TensorFlow2.0-Examples/blob/4d4a403d00e6e887ecb7229719b1407d2e132811/4-Object_Detection/YOLOV3/core/yolov3.py#L121
def bbox_giou_tf(boxes1, boxes2):
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    # area of C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU from its definition
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area
    return giou

def bbox_giou_torch(boxes1, boxes2):
    # boxes1, boxes2 = torch.tensor(boxes1, dtype=torch.float32), torch.tensor(boxes2, dtype=torch.float32)
    boxes1, boxes2 = torch.from_numpy(boxes1).float(), torch.from_numpy(boxes2).float()
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = torch.cat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], dim=-1)
    boxes2 = torch.cat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], dim=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = torch.cat([torch.min(boxes1[..., :2], boxes1[..., 2:]),
                        torch.max(boxes1[..., :2], boxes1[..., 2:])], dim=-1)
    boxes2 = torch.cat([torch.min(boxes2[..., :2], boxes2[..., 2:]),
                        torch.max(boxes2[..., :2], boxes2[..., 2:])], dim=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = torch.max(boxes1[..., :2], boxes2[..., :2])
    right_down = torch.min(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = torch.max(right_down - left_up, torch.tensor(0.0))
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = torch.min(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = torch.max(boxes1[..., 2:], boxes2[..., 2:])
    enclose = torch.max(enclose_right_down - enclose_left_up, torch.tensor(0.0))
    # area of C
    enclose_area = enclose[..., 0] * enclose[..., 1]
    # GIoU from its definition
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area
    return giou

# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L5
def bbox_diou_torch(bboxes1, bboxes2):
    # inputs are numpy arrays of xyxy boxes, compared row by row
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    dious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return dious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        dious = torch.zeros((cols, rows))
        exchange = True
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]  # intersection
    # squared distance between the two centers
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    # squared diagonal of the enclosing box
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area  # union
    dious = inter_area / union - (inter_diag) / outer_diag
    dious = torch.clamp(dious, min=-1.0, max=1.0)
    if exchange:
        dious = dious.T
    return dious

def bbox_diou_np(boxes1, boxes2, normaled=False):
    # squared distance between the two centers (inputs are xywh)
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    # squared diagonal of C
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU from its definition
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = np.clip(diou, a_min=-1.0, a_max=1.0)
    return diou

def bbox_diou_tf(boxes1, boxes2):
    # squared distance between the two centers (inputs are xywh)
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    # squared diagonal of C
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    # DIoU from its definition
    diou = iou - 1.0 * inter_diag / outer_diag
    diou = tf.clip_by_value(diou, clip_value_min=-1.0, clip_value_max=1.0)
    return diou

# https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/65b68b53f73173397937d4950ff916a41545c960/utils/box/box_utils.py#L47
def bbox_ciou_torch(bboxes1, bboxes2, normaled=False):
    # inputs are numpy arrays of xyxy boxes, compared row by row
    bboxes1, bboxes2 = torch.from_numpy(bboxes1).float(), torch.from_numpy(bboxes2).float()
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    cious = torch.zeros((rows, cols))
    if rows * cols == 0:
        return cious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        cious = torch.zeros((cols, rows))
        exchange = True
    w1 = bboxes1[:, 2] - bboxes1[:, 0]
    h1 = bboxes1[:, 3] - bboxes1[:, 1]
    w2 = bboxes2[:, 2] - bboxes2[:, 0]
    h2 = bboxes2[:, 3] - bboxes2[:, 1]
    area1 = w1 * h1
    area2 = w2 * h2
    center_x1 = (bboxes1[:, 2] + bboxes1[:, 0]) / 2
    center_y1 = (bboxes1[:, 3] + bboxes1[:, 1]) / 2
    center_x2 = (bboxes2[:, 2] + bboxes2[:, 0]) / 2
    center_y2 = (bboxes2[:, 3] + bboxes2[:, 1]) / 2
    inter_max_xy = torch.min(bboxes1[:, 2:], bboxes2[:, 2:])
    inter_min_xy = torch.max(bboxes1[:, :2], bboxes2[:, :2])
    out_max_xy = torch.max(bboxes1[:, 2:], bboxes2[:, 2:])
    out_min_xy = torch.min(bboxes1[:, :2], bboxes2[:, :2])
    inter = torch.clamp((inter_max_xy - inter_min_xy), min=0)
    inter_area = inter[:, 0] * inter[:, 1]
    # squared distance between the two centers
    inter_diag = (center_x2 - center_x1) ** 2 + (center_y2 - center_y1) ** 2
    outer = torch.clamp((out_max_xy - out_min_xy), min=0)
    # squared diagonal of the enclosing box
    outer_diag = (outer[:, 0] ** 2) + (outer[:, 1] ** 2)
    union = area1 + area2 - inter_area
    u = (inter_diag) / outer_diag
    iou = inter_area / union
    # everything inside no_grad is treated as a constant during backpropagation
    with torch.no_grad():
        arctan = torch.atan(w2 / h2) - torch.atan(w1 / h1)
        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
        S = 1 - iou
        alpha = v / (S + v)
        w_temp = 2 * w1
        distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = torch.clamp(cious, min=-1.0, max=1.0)
    if exchange:
        cious = cious.T
    return cious

def bbox_ciou_np(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    # squared distance between the two centers (inputs are xywh)
    inter_diag = np.sum(np.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = np.concatenate([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                             boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = np.concatenate([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                             boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = np.concatenate([np.minimum(boxes1[..., :2], boxes1[..., 2:]),
                             np.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = np.concatenate([np.minimum(boxes2[..., :2], boxes2[..., 2:]),
                             np.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = np.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = np.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = np.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = np.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = np.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = np.maximum(enclose_right_down - enclose_left_up, 0.0)
    # squared diagonal of C
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU terms, mirroring bbox_ciou_torch
    arctan = np.arctan(w2 / h2) - np.arctan(w1 / h1)
    v = (4 / (math.pi ** 2)) * np.square(np.arctan(w2 / h2) - np.arctan(w1 / h1))
    S = 1 - iou
    alpha = v / (S + v)
    w_temp = 2 * w1
    distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = np.clip(cious, a_min=-1.0, a_max=1.0)
    return cious

def bbox_ciou_tf(boxes1, boxes2, normaled=False):
    w1, h1 = boxes1[..., 2], boxes1[..., 3]
    w2, h2 = boxes2[..., 2], boxes2[..., 3]
    # squared distance between the two centers (inputs are xywh)
    inter_diag = tf.reduce_sum(tf.square(boxes1[..., :2] - boxes2[..., :2]), axis=-1)
    # pred_xywh, label_xywh -> pred_xyxy, label_xyxy
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)
    # make sure (xmin, ymin) <= (xmax, ymax)
    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)
    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    # IoU of the two bounding boxes
    iou = inter_area / union_area
    # top-left and bottom-right corners of the smallest enclosing box C
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    # squared diagonal of C
    outer_diag = (enclose[:, 0] ** 2) + (enclose[:, 1] ** 2)
    u = (inter_diag) / outer_diag
    # CIoU terms; epsilon keeps the ratios finite when a height is 0
    arctan = tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon))
    # tf.square instead of np.square, so the whole expression stays a TensorFlow op
    v = (4 / (math.pi ** 2)) * tf.square(tf.atan(w2 / (h2 + epsilon)) - tf.atan(w1 / (h1 + epsilon)))
    S = 1 - iou
    # stop_gradient: these terms are treated as constants during backpropagation
    alpha = tf.stop_gradient(v / (S + v))
    w_temp = tf.stop_gradient(2 * w1)
    distance = tf.stop_gradient(w1 ** 2 + h1 ** 2 + epsilon)
    ar = (8 / (math.pi ** 2)) * arctan * ((w1 - w_temp) * h1)
    if not normaled:
        cious = iou - (u + alpha * ar / distance)
    else:
        cious = iou - (u + alpha * ar)
    cious = tf.clip_by_value(cious, clip_value_min=-1.0, clip_value_max=1.0)
    return cious

img_width = 480.0
img_height = 320.0
gt_bboxes_xyxy = np.array([[50, 40, 200, 200], [270, 70, 400, 180]])  # xyxy
pre_bboxes_xyxy = np.array([[100, 100, 250, 300], [400, 180, 460, 300]])  # xyxy
# dtype=float: the np.float alias was removed in NumPy 1.24
gt_bboxes_xyxy_nomal = np.zeros(shape=gt_bboxes_xyxy.shape, dtype=float)
pre_bboxes_xyxy_nomal = np.zeros(shape=pre_bboxes_xyxy.shape, dtype=float)
# normalize x coordinates by the image width and y coordinates by the image height
gt_bboxes_xyxy_nomal[..., 0::2] = gt_bboxes_xyxy[..., 0::2] / img_width
gt_bboxes_xyxy_nomal[..., 1::2] = gt_bboxes_xyxy[..., 1::2] / img_height
pre_bboxes_xyxy_nomal[..., 0::2] = pre_bboxes_xyxy[..., 0::2] / img_width
pre_bboxes_xyxy_nomal[..., 1::2] = pre_bboxes_xyxy[..., 1::2] / img_height
gt_bboxes_xywh = np.array([[125, 120, 150, 160], [335, 125, 130, 110]])  # xywh
pre_bboxes_xywh = np.array([[175, 200, 150, 200], [430, 240, 60, 120]])  # xywh
gt_bboxes_xywh_nomal = np.zeros(shape=gt_bboxes_xywh.shape, dtype=float)
pre_bboxes_xywh_nomal = np.zeros(shape=pre_bboxes_xywh.shape, dtype=float)
gt_bboxes_xywh_nomal[..., 0::2] = gt_bboxes_xywh[..., 0::2] / img_width
gt_bboxes_xywh_nomal[..., 1::2] = gt_bboxes_xywh[..., 1::2] / img_height
pre_bboxes_xywh_nomal[..., 0::2] = pre_bboxes_xywh[..., 0::2] / img_width
pre_bboxes_xywh_nomal[..., 1::2] = pre_bboxes_xywh[..., 1::2] / img_height
# ================================================================ #
fig = plt.figure()
ax = fig.add_subplot(111)
currentAxis = plt.gca()
for idx, (gt, pt) in enumerate(zip(gt_bboxes_xywh, pre_bboxes_xywh)):
    iou = IoU(gt, pt, True)
    giou = GIoU(gt, pt, True)
    diou = DIoU(gt, pt, True)
    ciou = CIoU(gt, pt, True)
    currentAxis.text(gt[0] - gt[2] / 2, 20, 'iou={:.4f}, giou={:.4f}'.format(iou, giou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.text(gt[0] - gt[2] / 2, gt[1] + gt[3] / 2 + 20, 'diou={:.4f}, ciou={:.4f}'.format(diou, ciou),
                     bbox={'facecolor': 'yellow', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((gt[0] - gt[2] / 2, gt[1] - gt[3] / 2), gt[2], gt[3],
                                        fill=False, edgecolor='green', linewidth=2))
    currentAxis.text(gt[0] - gt[2] / 2, gt[1] - gt[3] / 2, 'g{}'.format(idx), bbox={'facecolor': 'green', 'alpha': 0.5})
    currentAxis.add_patch(plt.Rectangle((pt[0] - pt[2] / 2, pt[1] - pt[3] / 2), pt[2], pt[3],
                                        fill=False, edgecolor='red', linewidth=2))
    currentAxis.text(pt[0] - pt[2] / 2, pt[1] - pt[3] / 2, 'p{}'.format(idx), bbox={'facecolor': 'red', 'alpha': 0.5})
plt.xticks(np.arange(0, img_width + 1, 40))
plt.yticks(np.arange(0, img_height + 1, 40))
currentAxis.invert_yaxis()
plt.show()
# ================================================================ #
# The TensorFlow parts below use the 1.x graph API (tf.placeholder, tf.Session).
# Under TensorFlow 2.x the tf.compat.v1 equivalents with eager execution disabled are needed.
import tensorflow as tf
import torch

label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
label_bbox_normal = tf.placeholder(dtype=tf.float32, name='label_bbox_normal')
predic_bbox_normal = tf.placeholder(dtype=tf.float32, name='predic_bbox_normal')
# ================================================================ #
#                               GIoU                               #
# ================================================================ #
gious = np.expand_dims(bbox_giou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish giou: ', gious)
# ================================================================ #
gious = tf.expand_dims(bbox_giou_tf(predic_bbox, label_bbox), axis=-1)
with tf.Session() as sess:
    result = sess.run(gious, feed_dict={label_bbox: gt_bboxes_xywh,
                                        predic_bbox: pre_bboxes_xywh})
print('tensorflow publish giou: ', result)
# ================================================================ #
gious = bbox_giou_torch(gt_bboxes_xywh, pre_bboxes_xywh).unsqueeze(-1)
print('pytorch publish goiu: ', gious.numpy())
# ================================================================ #
#                               DIoU                               #
# ================================================================ #
dious = np.expand_dims(bbox_diou_np(gt_bboxes_xywh, pre_bboxes_xywh), axis=-1)
print('numpy publish diou : ', dious)
# ================================================================ #
# the torch versions take xyxy boxes, the numpy and TensorFlow versions take xywh boxes
dious = bbox_diou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy).unsqueeze(-1)
print('pytorch publish diou: ', dious.numpy())
# ================================================================ #
label_bbox = tf.placeholder(dtype=tf.float32, name='label_bbox')
predic_bbox = tf.placeholder(dtype=tf.float32, name='predic_bbox')
dious = tf.expand_dims(bbox_diou_tf(label_bbox, predic_bbox), axis=-1)
with tf.Session() as sess:
    result = sess.run(dious, feed_dict={label_bbox: gt_bboxes_xywh,
                                        predic_bbox: pre_bboxes_xywh})
print('tensorflow publish diou: ', result)
# ================================================================ #
#                               CIoU                               #
# ================================================================ #
cious = bbox_ciou_torch(gt_bboxes_xyxy, pre_bboxes_xyxy, False).unsqueeze(-1)
print('pytorch publish ciou unnormaled: ', cious.numpy())
cious = bbox_ciou_torch(gt_bboxes_xyxy_nomal, pre_bboxes_xyxy_nomal, True).unsqueeze(-1)
print('pytorch publish ciou normaled: ', cious.numpy())
# ================================================================ #
cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh, pre_bboxes_xywh, False), axis=-1)
print('numpy publish ciou unnormaled: ', cious)
cious = np.expand_dims(bbox_ciou_np(gt_bboxes_xywh_nomal, pre_bboxes_xywh_nomal, True), axis=-1)
print('numpy publish ciou normaled: ', cious)
# ================================================================ #
cious = tf.expand_dims(bbox_ciou_tf(label_bbox, predic_bbox, False), axis=-1)
cious_normal = tf.expand_dims(bbox_ciou_tf(label_bbox_normal, predic_bbox_normal, True), axis=-1)
with tf.Session() as sess:
    cious_tf, cious_tf_normal = sess.run([cious, cious_normal],
                                         feed_dict={label_bbox_normal: gt_bboxes_xywh_nomal,
                                                    predic_bbox_normal: pre_bboxes_xywh_nomal,
                                                    label_bbox: gt_bboxes_xywh,
                                                    predic_bbox: pre_bboxes_xywh})
print('tensorflow publish ciou unnormaled:', cious_tf)
print('tensorflow publish ciou normaled: ', cious_tf_normal)
# ================================================================ #
numpy publish giou: [[ 0.07342657]
[-0.50800915]]
tensorflow publish giou: [[ 0.07342657]
[-0.50800914]]
pytorch publish goiu: [[ 0.07342657]
[-0.50800914]]
numpy publish diou : [[ 0.14455897]
[-0.25 ]]
pytorch publish diou: [[ 0.14455898]
[-0.25 ]]
tensorflow publish diou: [[ 0.14455898]
[-0.25 ]]
pytorch publish ciou unnormaled: [[ 0.14428109]
[-0.2600825 ]]
pytorch publish ciou normaled: [[ 0.1392411 ]
[-0.25120372]]
numpy publish ciou unnormaled: [[ 0.14428107]
[-0.26008251]]
numpy publish ciou normaled: [[ 0.13924112]
[-0.25120372]]
tensorflow publish ciou unnormaled: [[ 0.14428109]
[-0.2600825 ]]
tensorflow publish ciou normaled: [[ 0.13924108]
[-0.25120363]]
A colleague's experiments gave the following results:
| method | GIoU | DIoU | CIoU |
| --- | --- | --- | --- |
| mAP | 81.37% | 81.46% | 82.36% |