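This post briefly covers how to change the way mmdetection draws predicted bounding boxes: the bbox line width, the label font color and size, the background color of the label area, and so on. The tutorial is based on mmdetection-2.27.0 and applies to 2.x releases in general; for 1.x it is not guaranteed to work.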
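Since everything below depends on the 2.x code layout, a quick check of the installed version can save some confusion. This is only a minimal sketch, assuming the version string has the usual MAJOR.MINOR.PATCH form. The helper name is_2x is made up for illustration and is not part of mmdetection.

```python
def is_2x(version: str) -> bool:
    """Return True if a version string such as '2.27.0' has major version 2."""
    major = version.strip().split('.')[0]
    return major.isdigit() and int(major) == 2


if __name__ == '__main__':
    # Hypothetical usage: with mmdetection installed you could pass
    # mmdet.__version__ instead of a literal string.
    print(is_2x('2.27.0'))
    print(is_2x('1.2.0'))
```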
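Changing how the bboxes look presupposes that your model has already been trained, so start by testing the model to see what the default boxes look like. Here is the original mmdetection bbox style: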
In tools/test.py you can find a function like this:
This is the function that handles drawing the boxes after prediction. Jump into it and you land in mmdetection-2.27.0/mmdet/apis/test.py, where you will see this function:
Click through to show_result() and you land in mmdetection-2.27.0/mmdet/models/detectors/base.py. Inside show_result(), the part to focus on is the call to imshow_det_bboxes().
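If you have only one class and its name gets in the way of viewing the result, comment it out as I did and give it an alias. The parameters here need no changes, because they were already set in the previous step and are simply passed through.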
Click into imshow_det_bboxes() and you land in mmdetection-2.27.0/mmdet/core/visualization/image.py.
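Go through the functions one by one. draw_bboxes is the one that draws the boxes. The four coordinate pairs in poly are (x1, y1), (x1, y2), (x2, y2), (x2, y1). In image coordinates, where y grows downward, these are the top-left, bottom-left, bottom-right and top-right corners.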
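To make the corner order concrete, here is a minimal pure-Python sketch of the same construction that draw_bboxes performs with NumPy. The helper name bbox_corners is hypothetical and not part of mmdetection. It assumes boxes in (x1, y1, x2, y2) form, optionally followed by a score.

```python
def bbox_corners(bbox):
    """Return the four corners of an (x1, y1, x2, y2) box in drawing order."""
    # Truncate to integers, mirroring bbox.astype(np.int32) in draw_bboxes.
    x1, y1, x2, y2 = (int(v) for v in bbox[:4])
    # (x1, y1) -> (x1, y2) -> (x2, y2) -> (x2, y1)
    return [[x1, y1], [x1, y2], [x2, y2], [x2, y1]]


corners = bbox_corners([10.7, 20.2, 110.0, 220.9, 0.95])
print(corners)
```

Walking the corners in this order traces the rectangle without crossing itself, which is what Polygon needs to draw a closed outline.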
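draw_labels is the one that draws the label information.
label_text holds the class name and the score; I removed the | separator here and used a space instead
facecolor --> fill color of the text background; I changed it to red
color --> text color
pos[1]-16.5 --> I moved the text background box up by 16.5 pixels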
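The settings above can be tried outside mmdetection with a few lines of matplotlib. The following is a minimal sketch, not the library's own code. The function name draw_one_label and its parameters y_offset and face_color are made up for illustration. It assumes image-style axes, where y grows downward, so subtracting from y moves the label up.

```python
import matplotlib
matplotlib.use('Agg')  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt


def draw_one_label(ax, pos, class_name, score=None, y_offset=16.5,
                   face_color='red', text_color='w', font_size=8):
    """Draw one label with a filled background, shifted up by y_offset."""
    label_text = class_name
    if score is not None:
        label_text += f' {score:.02f}'  # a space instead of '|' as separator
    return ax.text(
        pos[0],
        pos[1] - y_offset,  # smaller y is higher up on image-style axes
        label_text,
        bbox={'facecolor': face_color, 'alpha': 0.8, 'edgecolor': 'none'},
        color=text_color,
        fontsize=font_size,
        verticalalignment='top',
        horizontalalignment='left')


fig, ax = plt.subplots()
ax.set_xlim(0, 200)
ax.set_ylim(200, 0)  # invert the y axis, as plt.imshow does for images
text = draw_one_label(ax, (10, 20), 'cat', score=0.9)
fig.canvas.draw()  # render once to make sure the settings are accepted
plt.close(fig)
```

Changing face_color, text_color or y_offset here is a quick way to preview a style before editing image.py.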
Add the count_nums function below it (use my parameters as a reference), then call it from imshow_det_bboxes(), remembering to pass the arguments.
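Here are the complete modifications to image.py (the functions below rely on the imports already at the top of that file):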
def draw_bboxes(ax, bboxes, color='g', alpha=0.8, thickness=2):
    """Draw bounding boxes on the axes.

    Args:
        ax (matplotlib.Axes): The input axes.
        bboxes (ndarray): The input bounding boxes with the shape
            of (n, 4).
        color (list[tuple] | matplotlib.color): the colors for each
            bounding boxes.
        alpha (float): Transparency of bounding boxes. Default: 0.8.
        thickness (int): Thickness of lines. Default: 2.

    Returns:
        matplotlib.Axes: The result axes.
    """
    polygons = []
    for i, bbox in enumerate(bboxes):
        bbox_int = bbox.astype(np.int32)
        poly = [[bbox_int[0], bbox_int[1]], [bbox_int[0], bbox_int[3]],
                [bbox_int[2], bbox_int[3]], [bbox_int[2], bbox_int[1]]]
        np_poly = np.array(poly).reshape((4, 2))
        polygons.append(Polygon(np_poly))
    p = PatchCollection(
        polygons,
        facecolor='none',
        edgecolors=color,
        linewidths=thickness,
        alpha=alpha)
    ax.add_collection(p)

    return ax

def draw_labels(ax,
                labels,
                positions,
                scores=None,
                class_names=None,
                color='w',
                font_size=8,
                scales=None,
                horizontal_alignment='left'):
    """Draw labels on the axes.

    Args:
        ax (matplotlib.Axes): The input axes.
        labels (ndarray): The labels with the shape of (n, ).
        positions (ndarray): The positions to draw each labels.
        scores (ndarray): The scores for each labels.
        class_names (list[str]): The class names.
        color (list[tuple] | matplotlib.color): The colors for labels.
        font_size (int): Font size of texts. Default: 8.
        scales (list[float]): Scales of texts. Default: None.
        horizontal_alignment (str): The horizontal alignment method of
            texts. Default: 'left'.

    Returns:
        matplotlib.Axes: The result axes.
    """
    for i, (pos, label) in enumerate(zip(positions, labels)):
        label_text = class_names[
            label] if class_names is not None else f'class {label}'
        if scores is not None:
            label_text += f' {scores[i]:.02f}'
        text_color = color[i] if isinstance(color, list) else color

        font_size_mask = font_size if scales is None else font_size * scales[i]
        ax.text(
            pos[0],
            pos[1]-16.5,
            f'{label_text}',
            bbox={
                'facecolor': 'red',
                'alpha': 0.8,
                # 'pad': 0.7,
                'edgecolor': 'none'
            },
            color=text_color,
            fontsize=font_size_mask,
            verticalalignment='top',
            horizontalalignment=horizontal_alignment)

    return ax

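# Write the number of detected objects onto the detection image
def count_nums(ax, nums: int, txt_color=(0, 0, 0), font_size=13):
    # txt_color is a matplotlib color; the default (0, 0, 0) is black
    ax.text(5, 30, f'count:{nums}', fontsize=font_size, color=txt_color)
    return ax
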
def draw_masks(ax, img, masks, color=None, with_edge=True, alpha=0.8):
    """Draw masks on the image and their edges on the axes.

    Args:
        ax (matplotlib.Axes): The input axes.
        img (ndarray): The image with the shape of (3, h, w).
        masks (ndarray): The masks with the shape of (n, h, w).
        color (ndarray): The colors for each masks with the shape
            of (n, 3).
        with_edge (bool): Whether to draw edges. Default: True.
        alpha (float): Transparency of bounding boxes. Default: 0.8.

    Returns:
        matplotlib.Axes: The result axes.
        ndarray: The result image.
    """
    taken_colors = set([0, 0, 0])
    if color is None:
        random_colors = np.random.randint(0, 255, (masks.size(0), 3))
        color = [tuple(c) for c in random_colors]
        color = np.array(color, dtype=np.uint8)
    polygons = []
    for i, mask in enumerate(masks):
        if with_edge:
            contours, _ = bitmap_to_polygon(mask)
            polygons += [Polygon(c) for c in contours]

        color_mask = color[i]
        while tuple(color_mask) in taken_colors:
            color_mask = _get_bias_color(color_mask)
        taken_colors.add(tuple(color_mask))

        mask = mask.astype(bool)
        img[mask] = img[mask] * (1 - alpha) + color_mask * alpha

    p = PatchCollection(
        polygons, facecolor='none', edgecolors='w', linewidths=1, alpha=0.8)
    ax.add_collection(p)

    return ax, img

def imshow_det_bboxes(img,
                      bboxes=None,
                      labels=None,
                      segms=None,
                      class_names=None,
                      score_thr=0,
                      bbox_color='green',
                      text_color='green',
                      mask_color=None,
                      thickness=2,
                      font_size=8,
                      win_name='',
                      show=True,
                      wait_time=0,
                      out_file=None):
    """Draw bboxes and class labels (with scores) on an image.

    Args:
        img (str | ndarray): The image to be displayed.
        bboxes (ndarray): Bounding boxes (with scores), shaped (n, 4) or
            (n, 5).
        labels (ndarray): Labels of bboxes.
        segms (ndarray | None): Masks, shaped (n,h,w) or None.
        class_names (list[str]): Names of each classes.
        score_thr (float): Minimum score of bboxes to be shown. Default: 0.
        bbox_color (list[tuple] | tuple | str | None): Colors of bbox lines.
            If a single color is given, it will be applied to all classes.
            The tuple of color should be in RGB order. Default: 'green'.
        text_color (list[tuple] | tuple | str | None): Colors of texts.
            If a single color is given, it will be applied to all classes.
            The tuple of color should be in RGB order. Default: 'green'.
        mask_color (list[tuple] | tuple | str | None, optional): Colors of
            masks. If a single color is given, it will be applied to all
            classes. The tuple of color should be in RGB order.
            Default: None.
        thickness (int): Thickness of lines. Default: 2.
        font_size (int): Font size of texts. Default: 8.
        show (bool): Whether to show the image. Default: True.
        win_name (str): The window name. Default: ''.
        wait_time (float): Value of waitKey param. Default: 0.
        out_file (str, optional): The filename to write the image.
            Default: None.

    Returns:
        ndarray: The image with bboxes drawn on it.
    """
    assert bboxes is None or bboxes.ndim == 2, \
        f' bboxes ndim should be 2, but its ndim is {bboxes.ndim}.'
    assert labels.ndim == 1, \
        f' labels ndim should be 1, but its ndim is {labels.ndim}.'
    assert bboxes is None or bboxes.shape[1] == 4 or bboxes.shape[1] == 5, \
        f' bboxes.shape[1] should be 4 or 5, but its {bboxes.shape[1]}.'
    assert bboxes is None or bboxes.shape[0] <= labels.shape[0], \
        'labels.shape[0] should not be less than bboxes.shape[0].'
    assert segms is None or segms.shape[0] == labels.shape[0], \
        'segms.shape[0] and labels.shape[0] should have the same length.'
    assert segms is not None or bboxes is not None, \
        'segms and bboxes should not be None at the same time.'

    img = mmcv.imread(img).astype(np.uint8)

    if score_thr > 0:
        assert bboxes is not None and bboxes.shape[1] == 5
        scores = bboxes[:, -1]
        inds = scores > score_thr
        bboxes = bboxes[inds, :]
        labels = labels[inds]
        if segms is not None:
            segms = segms[inds, ...]

    img = mmcv.bgr2rgb(img)
    width, height = img.shape[1], img.shape[0]
    img = np.ascontiguousarray(img)

    fig = plt.figure(win_name, frameon=False)
    plt.title(win_name)
    canvas = fig.canvas
    dpi = fig.get_dpi()
    # add a small EPS to avoid precision lost due to matplotlib's truncation
    # (https://github.com/matplotlib/matplotlib/issues/15363)
    fig.set_size_inches((width + EPS) / dpi, (height + EPS) / dpi)

    # remove white edges by set subplot margin
    plt.subplots_adjust(left=0, right=1, bottom=0, top=1)
    ax = plt.gca()
    ax.axis('off')

    max_label = int(max(labels) if len(labels) > 0 else 0)
    text_palette = palette_val(get_palette(text_color, max_label + 1))
    text_colors = [text_palette[label] for label in labels]

    num_bboxes = 0
    if bboxes is not None:
        num_bboxes = bboxes.shape[0]
        bbox_palette = palette_val(get_palette(bbox_color, max_label + 1))
        colors = [bbox_palette[label] for label in labels[:num_bboxes]]
        draw_bboxes(ax, bboxes, colors, alpha=0.8, thickness=thickness)

        horizontal_alignment = 'left'
        positions = bboxes[:, :2].astype(np.int32) + thickness
        areas = (bboxes[:, 3] - bboxes[:, 1]) * (bboxes[:, 2] - bboxes[:, 0])
        scales = _get_adaptive_scales(areas)
        scores = bboxes[:, 4] if bboxes.shape[1] == 5 else None
        draw_labels(
            ax,
            labels[:num_bboxes],
            positions,
            scores=scores,
            class_names=class_names,
            color=text_colors,
            font_size=font_size,
            scales=scales,
            horizontal_alignment=horizontal_alignment)
        # Counting feature: write the number of boxes onto the image
        count_nums(ax,
                   nums=num_bboxes,
                   txt_color=(0, 0, 0), font_size=font_size)

    if segms is not None:
        mask_palette = get_palette(mask_color, max_label + 1)
        colors = [mask_palette[label] for label in labels]
        colors = np.array(colors, dtype=np.uint8)
        draw_masks(ax, img, segms, colors, with_edge=True)

        if num_bboxes < segms.shape[0]:
            segms = segms[num_bboxes:]
            horizontal_alignment = 'center'
            areas = []
            positions = []
            for mask in segms:
                _, _, stats, centroids = cv2.connectedComponentsWithStats(
                    mask.astype(np.uint8), connectivity=8)
                largest_id = np.argmax(stats[1:, -1]) + 1
                positions.append(centroids[largest_id])
                areas.append(stats[largest_id, -1])
            areas = np.stack(areas, axis=0)
            scales = _get_adaptive_scales(areas)
            draw_labels(
                ax,
                labels[num_bboxes:],
                positions,
                class_names=class_names,
                color=text_colors,
                font_size=font_size,
                scales=scales,
                horizontal_alignment=horizontal_alignment)

    plt.imshow(img)

    stream, _ = canvas.print_to_buffer()
    buffer = np.frombuffer(stream, dtype='uint8')
    if sys.platform == 'darwin':
        width, height = canvas.get_width_height(physical=True)
    img_rgba = buffer.reshape(height, width, 4)
    rgb, alpha = np.split(img_rgba, [3], axis=2)
    img = rgb.astype('uint8')
    img = mmcv.rgb2bgr(img)

    if show:
        # We do not use cv2 for display because in some cases, opencv will
        # conflict with Qt, it will output a warning: Current thread
        # is not the object's thread. You can refer to
        # https://github.com/opencv/opencv-python/issues/46 for details
        if wait_time == 0:
            plt.show()
        else:
            plt.show(block=False)
            plt.pause(wait_time)
    if out_file is not None:
        mmcv.imwrite(img, out_file)

    plt.close()

    return img

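Several files were modified above, so remember to reinstall the code with python setup.py install, or make the same changes to the source files inside your environment. Otherwise you may run into errors.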
For details on how to modify the source files, see my previous article.
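If you are unsure whether Python is picking up your edited copy or an older installed one, it can help to check which file is actually imported. Below is a minimal sketch using only the standard library. The name locate_module_file is made up here, and the mmdet module path in the usage part assumes the 2.x layout described above.

```python
import importlib.util


def locate_module_file(module_name: str):
    """Return the file Python would load for module_name, or None."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name cannot be imported.
        return None
    return spec.origin if spec is not None else None


if __name__ == '__main__':
    # With mmdetection 2.x installed, this should print a path either inside
    # your cloned repository or inside site-packages; otherwise it prints None.
    print(locate_module_file('mmdet.core.visualization.image'))
```

If the printed path points into site-packages rather than your repository, edits made in the repository will not be used until you reinstall or apply the same changes to that copy.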