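Introduction
Object detection can recognize multiple objects in a single image, locating each one (with a bounding box) and assigning it a class label. Put simply, it answers the "what" and "where" questions. As the saying goes, it is better to teach someone to fish than to hand them a fish, so this article does not walk through any particular family of algorithms (one-stage or two-stage). Instead, it offers the most comprehensive collection of object detection papers Amusi could assemble (continuously updated). To follow the trend (and show off a little), Amusi named the GitHub repository for this paper collection awesome-object-detection.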
CVer
Editor: Amusi
Proofreader: Amusi
Object Detection Wiki
Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.
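To make the "what" and "where" idea concrete, here is a minimal sketch of how a single detection result could be represented and how two boxes can be compared. The Detection class, its field names, and the (x_min, y_min, x_max, y_max) pixel convention are assumptions made for illustration; they do not come from any specific paper or library in the collection. Intersection over union (IoU) is the standard overlap measure used when evaluating detectors.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    label: str    # "what": the predicted class name
    score: float  # confidence of the prediction, in [0, 1]
    box: Box      # "where": the bounding box


def box_area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero area."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in [0, 1]."""
    # The overlap rectangle is bounded by the inner edges of the two boxes.
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


# One image can yield several detections, each with its own label and box.
detections = [
    Detection(label="person", score=0.92, box=(0.0, 0.0, 2.0, 2.0)),
    Detection(label="car", score=0.81, box=(1.0, 1.0, 3.0, 3.0)),
]
overlap = iou(detections[0].box, detections[1].box)
```

Here the two boxes share a 1 x 1 region out of a combined area of 7, so the overlap is 1/7.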
First, Amusi would like to recommend a website. Open the link below and you will see a page that gets the blood pumping.
link:
https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html
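When I first saw this URL I was surprised: it carries the date 2015/10/09, so I assumed the resource was old, but the content truly astonished me. The list lives on the personal homepage of the expert handong, and there is no standalone GitHub repository dedicated to Object Detection. Inspired by it, I took the liberty (I have not yet obtained the author's permission) of trimming and supplementing the Object Detection material that handong compiled, presumptuous as that may be, and created a GitHub repository named awesome-object-detection.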
Awesome-Object-Detection
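Next, a closer look at this very "copied" repository. The goal of awesome-object-detection is to provide a platform for learning object detection. Its defining feature is that it lists the latest papers and the latest code (updated as often as possible!). Since Amusi is still a beginner, there is no way yet to write up every paper, but in-depth paper walkthroughs will follow. Everyone is welcome to star and fork the repository and to submit pull requests with the latest object detection work you come across.
So let's see what useful material awesome-object-detection holds right now.
To save space, only the more important works are listed here:
The R-CNN trio (R-CNN, Fast R-CNN, and Faster R-CNN)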
Light-Head R-CNN
Cascade R-CNN
The YOLO trio (YOLOv1, YOLOv2, and YOLOv3)
SSD (SSD, DSSD, FSSD, ESSD, Pelee)
R-FCN
FPN
DSOD
RetinaNet
DetNet
...
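Everyone is surely familiar with the common R-CNN and YOLO series, and Amusi does not want to repeat them here, since that would not be very impressive. Instead, here are two brief paper recommendations that show what awesome-object-detection is for.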
Pelee
"Pelee: A Real-Time Object Detection System on Mobile Devices"
intro: ICLR 2018 workshop track
arxiv: https://arxiv.org/abs/1804.06882
github: https://github.com/Robert-JunWang/Pelee
Abstract: An increasing need of running Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resource encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and NASNet-A. However, all these models are heavily dependent on depthwise separable convolution which lacks efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves a higher accuracy by 0.6% (71.3% vs. 70.7%) and 11% lower computational cost than MobileNet, the state-of-the-art efficient architecture. Meanwhile, PeleeNet is only 66% of the model size of MobileNet. We then propose a real-time object detection system by combining PeleeNet with Single Shot MultiBox Detector (SSD) method and optimizing the architecture for fast speed. Our proposed detection system, named Pelee, achieves 76.4% mAP (mean average precision) on PASCAL VOC2007 and 22.4 mAP on MS COCO dataset at the speed of 17.1 FPS on iPhone 6s and 23.6 FPS on iPhone 8. The result on COCO outperforms YOLOv2 in consideration of a higher precision, 13.6 times lower computational cost and 11.3 times smaller model size. The code and models are open sourced.
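The abstract contrasts depthwise separable convolution with conventional convolution. As background, the following sketch counts multiply-accumulate operations (MACs) for one layer of each kind, which shows why architectures such as MobileNet lean on the separable form. The function names and the example layer size are made up for illustration, and the counts assume stride 1 with padding that keeps the output at h x w. It says nothing about how PeleeNet itself reaches its reported cost.

```python
def standard_conv_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs for a conventional k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h: int, w: int, c_in: int, c_out: int, k: int) -> int:
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


# Hypothetical layer: 14 x 14 feature map, 256 -> 256 channels, 3 x 3 kernel.
h, w, c_in, c_out, k = 14, 14, 256, 256, 3
standard = standard_conv_macs(h, w, c_in, c_out, k)
separable = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = separable / standard  # equals 1 / c_out + 1 / k**2
```

For this layer the separable form needs roughly 11.5% of the MACs of the conventional one. Fewer operations do not automatically mean faster inference, though: the abstract's point is that depthwise convolution was poorly optimized in most frameworks at the time.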
Quantization Mimic
"Quantization Mimic: Towards Very Tiny CNN for Object Detection"
Tsinghua University & The Chinese University of Hong Kong & SenseTime
arxiv: https://arxiv.org/abs/1805.02152
Note: take a look at the institutions behind this paper... It was posted to arXiv on 2018-05-06 (still piping hot).
Abstract: In this paper, we propose a simple and general framework for training very tiny CNNs for object detection. Due to limited representation ability, it is challenging to train very tiny networks for complicated tasks like detection. To the best of our knowledge, our method, called Quantization Mimic, is the first one focusing on very tiny networks. We utilize two types of acceleration methods: mimic and quantization. Mimic improves the performance of a student network by transferring knowledge from a teacher network. Quantization converts a full-precision network to a quantized one without large degradation of performance. If the teacher network is quantized, the search scope of the student network will be smaller. Using this property of quantization, we propose Quantization Mimic. It first quantizes the large network, then mimic a quantized small network. We suggest the operation of quantization can help student network to match the feature maps from teacher network. To evaluate the generalization of our hypothesis, we carry out experiments on various popular CNNs including VGG and Resnet, as well as different detection frameworks including Faster R-CNN and R-FCN. Experiments on Pascal VOC and WIDER FACE verify our Quantization Mimic algorithm can be applied on various settings and outperforms state-of-the-art model acceleration methods given limited computing resources.
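The intuition in the abstract (quantize the teacher's features, then let a quantized student match them) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the uniform rounding scheme, the step size, the flat lists standing in for feature maps, and the mean squared error loss are not taken from the paper's code, and the rounding shown here is not differentiable, so this only computes a loss value and is not a training procedure.

```python
import math
from typing import List


def quantize(features: List[float], step: float = 1.0) -> List[float]:
    """Uniformly quantize each value to the nearest multiple of `step` (assumed scheme)."""
    return [math.floor(v / step + 0.5) * step for v in features]


def mimic_loss(teacher: List[float], student: List[float], step: float = 1.0) -> float:
    """Mean squared error between the quantized teacher and quantized student features."""
    q_teacher = quantize(teacher, step)
    q_student = quantize(student, step)
    return sum((t - s) ** 2 for t, s in zip(q_teacher, q_student)) / len(q_teacher)


# Toy feature vectors standing in for flattened feature maps (hypothetical values).
teacher_features = [0.2, 1.4, 2.6]
close_student = [0.1, 1.3, 2.9]  # falls into the same quantization bins as the teacher
far_student = [1.2, 1.3, 2.9]    # first value lands in a different bin

loss_close = mimic_loss(teacher_features, close_student)
loss_far = mimic_loss(teacher_features, far_student)
```

With a step of 1.0, the close student quantizes to exactly the same values as the teacher, so its loss is zero even though the raw features differ. This is one way to read the abstract's claim that quantization shrinks the student's search scope: the student only has to hit the right bin, not the exact value.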
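Summary
The purpose of the awesome-object-detection repository is to present the latest object detection work (papers and code) as fully as possible. Since Amusi is still a beginner, please point out promptly anything that is poorly organized or inconsistent. Because the repository copies handong's content directly, I will delete or revise it immediately if it infringes any copyright (I am currently trying to contact handong).
If you find awesome-object-detection even a little bit helpful, you are welcome to star and fork it, and pull requests are even more welcome.
Open "Read the original" to go straight to awesome-object-detection
link: https://github.com/amusi/awesome-object-detection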