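Object Detection
R-CNN, Fast R-CNN, and Faster R-CNN form a single line of development. The two other main directions are YOLO and SSD. YOLO has iterated up to YOLOv3, and SSD's design has since been applied in many other directions.
Christian Szegedy (Google) also made an early attempt at object detection using AlexNet.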
[1] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013. pdf
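However, the work that achieved a real breakthrough and set off the wave of deep-learning-based object detection was R-CNN.
One way to locate a region is to treat it as a regression problem: predict the four parameters (x, y, w, h) and derive the box position from them. However, training such a regressor takes much longer to converge, so the regression problem is converted into a classification problem, solved in two steps:
Step 1: generate boxes of different sizes over the image.
Step 2: extract features from the contents of each box, pass them to a classifier, and select the box with the highest score as the object's localization box.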
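The two steps above can be sketched in a few lines of Python. This is only an illustration: `sliding_windows`, `best_box`, and `toy_score` are hypothetical names, and `toy_score` is a stand-in for the real "feature extraction + classifier" stage, which the notes do not specify.

```python
def sliding_windows(img_w, img_h, sizes, stride):
    """Step 1: enumerate square boxes (x, y, w, h) of several sizes."""
    boxes = []
    for size in sizes:
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                boxes.append((x, y, size, size))
    return boxes


def best_box(boxes, score_fn):
    """Step 2: score every box and keep the highest-scoring one."""
    return max(boxes, key=score_fn)


def toy_score(box, target=(40, 24, 32, 32)):
    """Hypothetical classifier score: boxes closer to `target` score higher."""
    return -sum(abs(a - b) for a, b in zip(box, target))


boxes = sliding_windows(img_w=128, img_h=96, sizes=(32, 64), stride=8)
print(len(boxes), best_box(boxes, toy_score))
```

Even this toy setting produces over a hundred boxes for a small image, which hints at why exhaustive window classification is expensive.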
Evaluation metrics: IoU (Intersection over Union) and mAP (mean Average Precision). Speed: frame rate (FPS).
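As a concrete reference for the IoU metric, here is a minimal sketch. It assumes boxes are given as corner coordinates (x1, y1, x2, y2); that box format and the function name `iou` are choices made for this example, not something the notes prescribe.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; mAP is then computed on top of those matches.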
Method
- [SPPNet]
[Two-Stage Object Detection]
- [R-CNN]
- [Fast R-CNN]
- [Faster R-CNN]
[Single-Shot Object Detection]
- [YOLO]
- [YOLOv2]
- [YOLOv3]
- [SSD]
- [RetinaNet]
[Great improvement]
- [R-FCN]
- Feature Pyramid Network (FPN)
Method
SPPNet He Kaiming / MSRA
- SPPNet: Spatial Pyramid Pooling
[3] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. pdf
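A CNN is usually followed by fully connected layers or a classifier, both of which require a fixed input size, so the input has to be cropped or warped; this preprocessing loses data or distorts geometry. SPP-Net brings the pyramid idea into the CNN and enables multi-scale input. The network input can now be of any size: in the SPP layer each pooling filter adjusts its size to the input, while the SPP output size stays fixed.
This broke with the earlier assumption that candidate boxes must first be proposed, resized to a fixed size, and only then passed through the CNN. Instead, the image passes through the CNN once to obtain features, the features for the different candidate boxes are extracted on the feature map, and they are pooled to the same size for the subsequent steps. This speeds up object detection.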
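The fixed-size output can be illustrated with a small sketch. It assumes a single-channel feature map stored as a list of lists and pyramid levels of 1x1, 2x2, and 4x4 bins; the function names and the bin-boundary rule (floor for the start, ceiling for the end) are choices made for this example rather than details taken from the paper.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D map into n x n bins for each level and concatenate.

    The output length is sum(n * n for n in levels), whatever the input size.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * h) // n, ceil_div((i + 1) * h, n)
            for j in range(n):
                c0, c1 = (j * w) // n, ceil_div((j + 1) * w, n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


small = [[r * 7 + c for c in range(7)] for r in range(5)]    # 5 x 7 map
large = [[r * 9 + c for c in range(9)] for r in range(13)]   # 13 x 9 map
print(len(spp_max_pool(small)), len(spp_max_pool(large)))
```

Both inputs yield a vector of 1 + 4 + 16 = 21 values, which is what lets a fully connected layer follow a feature map of arbitrary size.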
- intro: ECCV 2014 / TPAMI 2015
- keywords: SPP-Net
- arxiv: http://arxiv.org/abs/1406.4729
- github: https://github.com/ShaoqingRen/SPP_net
- notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/
Two-Stage Object Detection
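RCNN Ross B. Girshick (RBG) / UC Berkeley
- RCNN: The R-CNN framework replaces the sliding windows and hand-crafted features of traditional object detection with a CNN for feature extraction. It is an application of deep neural networks.
Traditional region proposal methods + CNN classifier
In other words, the second step is replaced by feature extraction with a deep neural network.
A linear SVM classifier then identifies the object's class, and a regression model tightens the bounding box.
Contribution: applying CNNs to object detection, which improved detection accuracy.
Drawback: selective search extracts about 2,000 candidate regions per image and the CNN extracts features for each region separately, so computation is repeated and detection is slow, taking 40-50 seconds per image.
R-CNN raised the detection result on PASCAL VOC2007 to 66% (mAP).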
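The "tighten the bounding box" step can be sketched as follows. The sketch assumes the box parameterization described in the R-CNN paper (center offsets scaled by the proposal size, log-space width and height); the function name `apply_box_deltas` and the center-format tuples are illustrative choices.

```python
import math


def apply_box_deltas(proposal, deltas):
    """Refine a proposal (cx, cy, w, h) with predicted deltas (dx, dy, dw, dh).

    Center shifts are relative to the proposal size; width and height are
    scaled in log space so the refined size always stays positive.
    """
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    return (px + pw * dx,
            py + ph * dy,
            pw * math.exp(dw),
            ph * math.exp(dh))


# Zero deltas leave the proposal unchanged; non-zero deltas shift and rescale it.
print(apply_box_deltas((50.0, 50.0, 20.0, 40.0), (0.0, 0.0, 0.0, 0.0)))
print(apply_box_deltas((50.0, 50.0, 20.0, 40.0), (0.1, -0.2, math.log(2), 0.0)))
```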
[2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. pdf
github: https://github.com/rbgirshick/rcnn
intro: R-CNN
arxiv: http://arxiv.org/abs/1311.2524
supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf
slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf
notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482
Fast RCNN Ross B. Girshick
- Fast RCNN
[4] Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.
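If the convolutional computation in R-CNN only had to be done once per image, detection time would drop substantially.
Ross Girshick applied the SPPNet approach to R-CNN and proposed a layer that can be viewed as a single-level SPPNet, called RoI Pooling, which maps inputs of different sizes to a fixed-size feature vector. The image is fed into the CNN to produce a convolutional feature map. Candidate regions are then extracted from this feature map in combination with a region proposal algorithm. Finally, the RoI pooling layer reshapes every candidate region to a fixed size so that it can be fed into the fully connected network.
1. Take the image as input.
2. Pass the image through a convolutional neural network to compute the convolutional features.
3. Extract RoIs with the same proposal method as before, apply the RoI pooling layer to every region of interest to bring it to a fixed size, and pass each region on to the fully connected layers.
4. A softmax layer on top of the fully connected network outputs the class. In parallel with the softmax layer, a linear regression layer outputs the bounding-box coordinates for the predicted class.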
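Step 3 can be illustrated with a minimal RoI max-pooling sketch. It assumes a single-channel feature map as a list of lists and an RoI already expressed in feature-map cells as (row0, col0, row1, col1) with exclusive ends; these conventions and the names are assumptions for the example, and real implementations also handle the image-to-feature-map scaling and multiple channels.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool one RoI of arbitrary size into an out_size x out_size grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        rs = r0 + (i * h) // out_size
        re = r0 + ceil_div((i + 1) * h, out_size)
        row = []
        for j in range(out_size):
            cs = c0 + (j * w) // out_size
            ce = c0 + ceil_div((j + 1) * w, out_size)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


fmap = [[r * 8 + c for c in range(8)] for r in range(6)]  # 6 x 8 feature map
print(roi_max_pool(fmap, (1, 2, 5, 7)))   # 4 x 5 region
print(roi_max_pool(fmap, (0, 0, 6, 8)))   # whole map
```

Both regions come out as 2 x 2 grids, so the convolutional features are computed once per image and each proposal only costs a cheap pooling step.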
Fast R-CNN
arxiv: http://arxiv.org/abs/1504.08083
slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
github: https://github.com/rbgirshick/fast-rcnn
github(COCO-branch): https://github.com/rbgirshick/fast-rcnn/tree/coco
webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
notes: http://blog.csdn.net/linj_m/article/details/48930179
github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
github: https://github.com/mahyarnajibi/fast-rcnn-torch
github: https://github.com/apple2373/chainer-simple-fast-rnn
github: https://github.com/zplizzi/tensorflow-fast-rcnn
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03414
paper: http://abhinavsh.info/papers/pdfs/adversarial_object_detection.pdf
github(Caffe): https://github.com/xiaolonw/adversarial-frcnn
Faster RCNN He Kaiming
- Faster RCNN
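Fast R-CNN still uses a traditional method for region proposal, whereas Faster R-CNN unifies the Region Proposal Network, feature extraction, object classification, and bounding-box regression in a single framework.
Faster R-CNN = Region Proposal Network + Fast R-CNN
Region proposal is performed by a CNN called the Region Proposal Network (RPN). With the RPN, the extra cost of region proposal is only a two-layer network.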
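The RPN scores and refines a fixed set of reference boxes ("anchors") at every feature-map location. The sketch below generates such anchors, assuming the 3 scales and 3 aspect ratios used in the Faster R-CNN paper and a feature stride of 16; the function names and the (x1, y1, x2, y2) output format are illustrative choices.

```python
import math


def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Anchors centered at (cx, cy); each has area scale**2 and h / w = ratio."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def grid_anchors(feat_w, feat_h, stride=16):
    """Place the anchor set at the center of every feature-map cell."""
    all_anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx, cy = (col + 0.5) * stride, (row + 0.5) * stride
            all_anchors.extend(make_anchors(cx, cy))
    return all_anchors


print(len(make_anchors(0, 0)), len(grid_anchors(feat_w=3, feat_h=2)))
```

Because the anchors are fixed, the network only has to output an objectness score and four box deltas per anchor, which is why the proposal stage adds so little cost.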
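Faster R-CNN designs the RPN to extract candidate regions, replacing the time-consuming Selective Search and greatly increasing detection speed. The table below compares the detection speed of R-CNN, Fast R-CNN, and Faster R-CNN: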
[5] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
- intro: NIPS 2015
- arxiv: http://arxiv.org/abs/1506.01497
- gitxiv: http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region
- slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
- github(official, Matlab): https://github.com/ShaoqingRen/faster_rcnn
- github: https://github.com/rbgirshick/py-faster-rcnn
- github(MXNet): https://github.com/msracver/Deformable-ConvNets/tree/master/faster_rcnn
- github: https://github.com/jwyang/faster-rcnn.pytorch
- github: https://github.com/mitmul/chainer-faster-rcnn
- github: https://github.com/andreaskoepf/faster-rcnn.torch
- github: https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
- github: https://github.com/smallcorgi/Faster-RCNN_TF
- github: https://github.com/CharlesShang/TFFRCNN
- github(C++ demo): https://github.com/YihangLou/FasterRCNN-Encapsulation-Cplusplus
- github: https://github.com/yhenon/keras-frcnn
- github: https://github.com/Eniac-Xie/faster-rcnn-resnet
- github(C++): https://github.com/D-X-Y/caffe-faster-rcnn/tree/dev
R-CNN minus R
- intro: BMVC 2015
- arxiv: http://arxiv.org/abs/1506.06981
Faster R-CNN in MXNet with distributed implementation and data parallelization
Contextual Priming and Feedback for Faster R-CNN
- intro: ECCV 2016. Carnegie Mellon University
- paper: http://abhinavsh.info/context_priming_feedback.pdf
- poster: http://www.eccv2016.org/files/posters/P-1A-20.pdf
An Implementation of Faster RCNN with Study for Region Sampling
- intro: Technical Report, 3 pages. CMU
- arxiv: https://arxiv.org/abs/1702.02138
- github: https://github.com/endernewton/tf-faster-rcnn
Interpretable R-CNN
- intro: North Carolina State University & Alibaba
- keywords: AND-OR Graph (AOG)
- arxiv: https://arxiv.org/abs/1711.05226
Light-Head R-CNN: In Defense of Two-Stage Object Detector
- intro: Tsinghua University & Megvii Inc
- arxiv: https://arxiv.org/abs/1711.07264
- github(official, Tensorflow): https://github.com/zengarden/light_head_rcnn
- github: https://github.com/terrychenism/Deformable-ConvNets/blob/master/rfcn/symbols/resnet_v1_101_rfcn_light.py#L784
Cascade R-CNN: Delving into High Quality Object Detection
- intro: CVPR 2018. UC San Diego
- arxiv: https://arxiv.org/abs/1712.00726
- github(Caffe, official): https://github.com/zhaoweicai/cascade-rcnn
Cascade R-CNN: High Quality Object Detection and Instance Segmentation
- https://arxiv.org/abs/1906.09756
- github(Caffe, official): https://github.com/zhaoweicai/cascade-rcnn
- github(official): https://github.com/zhaoweicai/Detectron-Cascade-RCNN
SMC Faster R-CNN: Toward a scene-specialized multi-object detector
Domain Adaptive Faster R-CNN for Object Detection in the Wild
- intro: CVPR 2018. ETH Zurich & ESAT/PSI
- arxiv: https://arxiv.org/abs/1803.03243
- github(official. Caffe): https://github.com/yuhuayc/da-faster-rcnn
Robust Physical Adversarial Attack on Faster R-CNN Object Detector
Auto-Context R-CNN
- intro: Rejected by ECCV18
- arxiv: https://arxiv.org/abs/1807.02842
Grid R-CNN
- intro: SenseTime
- arxiv: https://arxiv.org/abs/1811.12030
Grid R-CNN Plus: Faster and Better
- intro: SenseTime Research & CUHK & Beihang University
- arxiv: https://arxiv.org/abs/1906.05688
- github: https://github.com/STVIR/Grid-R-CNN
Few-shot Adaptive Faster R-CNN
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1903.09372
Libra R-CNN: Towards Balanced Learning for Object Detection
- intro: CVPR 2019
- arxiv: https://arxiv.org/abs/1904.02701
Rethinking Classification and Localization in R-CNN
- intro: Northeastern University & Microsoft
- arxiv: https://arxiv.org/abs/1904.06493
Reprojection R-CNN: A Fast and Accurate Object Detector for 360° Images
- intro: Peking University
- arxiv: https://arxiv.org/abs/1907.11830
Rethinking Classification and Localization for Cascade R-CNN
- intro: BMVC 2019
- arxiv: https://arxiv.org/abs/1907.11914
Single-Shot Object Detection
YOLO
- YOLO (You Only Look Once)
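YOLO's detection approach differs from that of the R-CNN family: it treats object detection as a regression task. The core idea of YOLO is to take the whole image as the network input and regress the positions of the bounding boxes and their classes directly at the output layer.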
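To make "regressing boxes at the output layer" concrete, the sketch below decodes one predicted box. It assumes the YOLOv1 conventions from the paper: an S x S grid, box centers predicted as offsets within a grid cell, and width and height predicted relative to the whole image. The function names and the 448 x 448 input size are assumptions for the example.

```python
def output_size(S=7, B=2, C=20):
    """Length of the YOLOv1 output: S x S cells, B boxes of 5 values, C classes."""
    return S * S * (B * 5 + C)


def decode_cell(pred, row, col, S=7, img_w=448, img_h=448):
    """Convert one cell prediction (x, y, w, h, conf) into image coordinates.

    x, y are offsets inside cell (row, col) in [0, 1];
    w, h are fractions of the full image size.
    """
    x, y, w, h, conf = pred
    cx = (col + x) / S * img_w
    cy = (row + y) / S * img_h
    bw, bh = w * img_w, h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, conf)


print(output_size())
print(decode_cell((0.5, 0.5, 0.25, 0.5, 0.9), row=3, col=3))
```

Every box comes out of a single forward pass with no proposal stage, which is the source of YOLO's speed.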
[6] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). pdf (YOLO, outstanding work, really practical)
PPT
C (official): https://pjreddie.com/darknet/yolo/ (v3)
https://pjreddie.com/darknet/yolov2/ (v2)
https://pjreddie.com/darknet/yolov1/ (v1)
PyTorch (Tencent), v1, v2, v3: https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet
For an introduction to YOLO, see the linked introduction.
- arxiv: http://arxiv.org/abs/1506.02640
- code: http://pjreddie.com/darknet/yolo/
- github: https://github.com/pjreddie/darknet
- blog: https://pjreddie.com/publications/yolo/
- slides: https://docs.google.com/presentation/d/1aeRvtKG21KHdD5lg6Hgyhx5rPq_ZOsGjG5rJ1HP7BbA/pub?start=false&loop=false&delayms=3000&slide=id.p
- reddit: https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
- github: https://github.com/gliese581gg/YOLO_tensorflow
- github: https://github.com/xingwangsfu/caffe-yolo
- github: https://github.com/frankzhangrui/Darknet-Yolo
- github: https://github.com/BriSkyHekun/py-darknet-yolo
- github: https://github.com/tommy-qichang/yolo.torch
- github: https://github.com/frischzenger/yolo-windows
- github: https://github.com/AlexeyAB/yolo-windows
- github: https://github.com/nilboy/tensorflow-yolo
darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++
- blog: https://thtrieu.github.io/notes/yolo-tensorflow-graph-buffer-cpp
- github: https://github.com/thtrieu/darkflow
Start Training YOLO with Our Own Data
- intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
- blog: http://guanghan.info/blog/en/my-works/train-yolo/
- github: https://github.com/Guanghan/darknet
YOLO: Core ML versus MPSNNGraph
- intro: Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
- blog: http://machinethink.net/blog/yolo-coreml-versus-mps-graph/
- github: https://github.com/hollance/YOLO-CoreML-MPSNNGraph
TensorFlow YOLO object detection on Android
- intro: Real-time object detection on Android using the YOLO network with TensorFlow
- github: https://github.com/natanielruiz/android-yolo
Computer Vision in iOS – Object Detection
- blog: https://sriraghu.com/2017/07/12/computer-vision-in-ios-object-detection/
- github: https://github.com/r4ghu/iOS-CoreML-Yolo
YOLOv2
YOLO9000: Better, Faster, Stronger
- arxiv: https://arxiv.org/abs/1612.08242
- code: http://pjreddie.com/yolo9000/
- github(Chainer): https://github.com/leetenki/YOLOv2
- github(Keras): https://github.com/allanzelener/YAD2K
- github(PyTorch): https://github.com/longcw/yolo2-pytorch
- github(Tensorflow): https://github.com/hizhangp/yolo_tensorflow
- github(Windows): https://github.com/AlexeyAB/darknet
- github: https://github.com/choasUp/caffe-yolo9000
- github: https://github.com/philipperemy/yolo-9000
darknet_scripts
- intro: Auxiliary scripts to work with (YOLO) darknet deep learning framework. AKA -> How to generate YOLO anchors?
- github: https://github.com/Jumabek/darknet_scripts
Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2
LightNet: Bringing pjreddie’s DarkNet out of the shadows
YOLO v2 Bounding Box Tool
- intro: Bounding box labeler tool to generate the training data in the format YOLO v2 requires.
- github: https://github.com/Cartucho/yolo-boundingbox-labeler-GUI
YOLOv3
YOLOv3: An Incremental Improvement
- project page: https://pjreddie.com/darknet/yolo/
- paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf
- arxiv: https://arxiv.org/abs/1804.02767
- github: https://github.com/DeNA/PyTorch_YOLOv3
- github: https://github.com/eriklindernoren/PyTorch-YOLOv3
Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving
YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers
Spiking-YOLO: Spiking Neural Network for Real-time Object Detection
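SSD (The Single Shot Detector): detailed explanation
- SSD: SSD is an object detection algorithm that directly predicts bounding-box coordinates and classes, with no proposal generation stage. It uses an object classification model, such as VGG16, as its base network.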
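Instead of proposals, SSD predicts offsets for a fixed set of default boxes on several feature maps. The sketch below computes their scales and shapes, assuming the linear scale rule from the SSD paper with s_min = 0.2 and s_max = 0.9 and six feature maps; the function names are hypothetical, and the extra scale the paper adds for aspect ratio 1 is omitted for brevity.

```python
import math


def default_box_scales(m=6, s_min=0.2, s_max=0.9):
    """Scale (relative to image size) for each of the m feature maps."""
    return [s_min + (s_max - s_min) * k / (m - 1) for k in range(m)]


def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """(width, height) pairs with area scale**2 and width / height = ratio."""
    return [(scale * math.sqrt(a), scale / math.sqrt(a)) for a in aspect_ratios]


scales = default_box_scales()
print([round(s, 2) for s in scales])
print(default_box_shapes(scales[0]))
```

Shallow, high-resolution maps get the small scales and deep, coarse maps get the large ones, so objects of different sizes are handled by different layers.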
[7] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015). pdf
TensorFlow source code: https://github.com/balancap/SSD-Tensorflow/blob/master/nets/ssd_vgg_300.py
- intro: ECCV 2016 Oral
- arxiv: http://arxiv.org/abs/1512.02325
- paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
- slides: http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf
- github(Official): https://github.com/weiliu89/caffe/tree/ssd
- video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973
- github: https://github.com/zhreshold/mxnet-ssd
- github: https://github.com/zhreshold/mxnet-ssd.cpp
- github: https://github.com/rykov8/ssd_keras
- github: https://github.com/balancap/SSD-Tensorflow
- github: https://github.com/amdegroot/ssd.pytorch
- github(Caffe): https://github.com/chuanqi305/MobileNet-SSD
What’s the difference in performance between this new code you pushed and the previous code?
DSSD : Deconvolutional Single Shot Detector
- intro: UNC Chapel Hill & Amazon Inc
- arxiv: https://arxiv.org/abs/1701.06659
- github: https://github.com/chengyangfu/caffe/tree/dssd
- github: https://github.com/MTCloudVision/mxnet-dssd
- demo: http://120.52.72.53/www.cs.unc.edu/c3pr90ntc0td/~cyfu/dssd_lalaland.mp4
Enhancement of SSD by concatenating feature maps for object detection
- intro: rainbow SSD (R-SSD)
- arxiv: https://arxiv.org/abs/1705.09587
Context-aware Single-Shot Detector
- keywords: CSSD, DiCSSD, DeCSSD, effective receptive fields (ERFs), theoretical receptive fields (TRFs)
- arxiv: https://arxiv.org/abs/1707.08682
Feature-Fused SSD: Fast Detection for Small Objects
FSSD: Feature Fusion Single Shot Multibox Detector
Weaving Multi-scale Context for Single Shot Detector
- intro: WeaveNet
- keywords: fuse multi-scale information
- arxiv: https://arxiv.org/abs/1712.03149
Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network
- keywords: ESSD
- arxiv: https://arxiv.org/abs/1801.05918
Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection
MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects
- intro: Zhengzhou University
- arxiv: https://arxiv.org/abs/1805.07009
Accurate Single Stage Detector Using Recurrent Rolling Convolution
- intro: CVPR 2017. SenseTime
- keywords: Recurrent Rolling Convolution (RRC)
- arxiv: https://arxiv.org/abs/1704.05776
- github: https://github.com/xiaohaoChen/rrc_detection
Residual Features and Unified Prediction Network for Single Stage Detection
FPN
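FPN (Feature Pyramid Networks) is a feature extraction method that fuses feature information from multiple layers and can be combined with a variety of deep neural networks.
SSD's multi-scale feature fusion has no upsampling step and does not use sufficiently low-level features (in SSD, the lowest-level feature is conv4_3 of the VGG network).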
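The upsampling step that SSD lacks is the core of FPN's top-down pathway: a coarse map is upsampled by 2 and added to the finer map from the layer below. The sketch shows just that merge on single-channel maps stored as lists of lists. The function names are illustrative, and the 1x1 lateral convolution and 3x3 smoothing convolution described in the FPN paper are deliberately left out.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(top, lateral):
    """Upsample the coarser map and add it element-wise to the finer one."""
    up = upsample2x(top)
    return [[u + l for u, l in zip(up_row, lat_row)]
            for up_row, lat_row in zip(up, lateral)]


coarse = [[1, 2], [3, 4]]                 # deep, semantically strong map
fine = [[1] * 4 for _ in range(4)]        # shallow, high-resolution map
print(top_down_merge(coarse, fine))
```

Repeating this merge level by level gives every scale access to both high-level semantics and low-level detail.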
Feature Pyramid Networks for Object Detection
- intro: Facebook AI Research
- arxiv: https://arxiv.org/abs/1612.03144
Action-Driven Object Detection with Top-Down Visual Attentions
- arxiv: https://arxiv.org/abs/1612.06704
Beyond Skip Connections: Top-Down Modulation for Object Detection
- intro: CMU & UC Berkeley & Google Research
- arxiv: https://arxiv.org/abs/1612.06851
Wide-Residual-Inception Networks for Real-time Object Detection
- intro: Inha University
- arxiv: https://arxiv.org/abs/1702.01243
Attentional Network for Visual Object Detection
- intro: University of Maryland & Mitsubishi Electric Research Laboratories
- arxiv: https://arxiv.org/abs/1702.01478
Learning Chained Deep Features and Classifiers for Cascade in Object Detection
- keywords: CC-Net
- intro: chained cascade network (CC-Net). 81.1% mAP on PASCAL VOC 2007
- arxiv: https://arxiv.org/abs/1702.07054
DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling
- intro: ICCV 2017 (poster)
- arxiv: https://arxiv.org/abs/1703.10295
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1704.03944
Spatial Memory for Context Reasoning in Object Detection
Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
https://arxiv.org/abs/1704.05775
LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems
- intro: Embedded Vision Workshop in CVPR. UC San Diego & Qualcomm Inc
- arxiv: https://arxiv.org/abs/1705.05922
Point Linking Network for Object Detection
- intro: Point Linking Network (PLN)
- arxiv: https://arxiv.org/abs/1706.03646
Perceptual Generative Adversarial Networks for Small Object Detection
https://arxiv.org/abs/1706.05274
Few-shot Object Detection
https://arxiv.org/abs/1706.08249
Yes-Net: An effective Detector Based on Global Information
https://arxiv.org/abs/1706.09180
Towards lightweight convolutional neural networks for object detection
https://arxiv.org/abs/1707.01395
RON: Reverse Connection with Objectness Prior Networks for Object Detection
- intro: CVPR 2017
- arxiv: https://arxiv.org/abs/1707.01691
- github: https://github.com/taokong/RON
Deformable Part-based Fully Convolutional Network for Object Detection
- intro: BMVC 2017 (oral). Sorbonne Universités & CEDRIC
- arxiv: https://arxiv.org/abs/1707.06175
Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1707.06399
Recurrent Scale Approximation for Object Detection in CNN
- intro: ICCV 2017
- keywords: Recurrent Scale Approximation (RSA)
- arxiv: https://arxiv.org/abs/1707.09531
- github: https://github.com/sciencefans/RSA-for-object-detection
DSOD: Learning Deeply Supervised Object Detectors from Scratch
- intro: ICCV 2017. Fudan University & Tsinghua University & Intel Labs China
- arxiv: https://arxiv.org/abs/1708.01241
- github: https://github.com/szq0214/DSOD
Object Detection from Scratch with Deep Supervision
https://arxiv.org/abs/1809.09294
CoupleNet: Coupling Global Structure with Local Parts for Object Detection
- intro: ICCV 2017
- arxiv: https://arxiv.org/abs/1708.02863
Incremental Learning of Object Detectors without Catastrophic Forgetting
- intro: ICCV 2017. Inria
- arxiv: https://arxiv.org/abs/1708.06977
Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection
https://arxiv.org/abs/1709.04347
StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
https://arxiv.org/abs/1709.05788
Dynamic Zoom-in Network for Fast Object Detection in Large Images
https://arxiv.org/abs/1711.05187
Zero-Annotation Object Detection with Web Knowledge Transfer
- intro: NTU, Singapore & Amazon
- keywords: multi-instance multi-label domain adaption learning framework
- arxiv: https://arxiv.org/abs/1711.05954
MegDet: A Large Mini-Batch Object Detector
- intro: Peking University & Tsinghua University & Megvii Inc
- arxiv: https://arxiv.org/abs/1711.07240
Receptive Field Block Net for Accurate and Fast Object Detection
- intro: RFBNet
- arxiv: https://arxiv.org/abs/1711.07767
- github: https://github.com/ruinmessi/RFBNet
An Analysis of Scale Invariance in Object Detection - SNIP
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1711.08189
- github: https://github.com/bharatsingh430/snip
Feature Selective Networks for Object Detection
https://arxiv.org/abs/1711.08879
Learning a Rotation Invariant Detector with Rotatable Bounding Box
- arxiv: https://arxiv.org/abs/1711.09405
- github(official, Caffe): https://github.com/liulei01/DRBox
Scalable Object Detection for Stylized Objects
- intro: Microsoft AI & Research Munich
- arxiv: https://arxiv.org/abs/1711.09822
Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids
Deep Regionlets for Object Detection
- keywords: region selection network, gating network
- arxiv: https://arxiv.org/abs/1712.02408
Training and Testing Object Detectors with Virtual Images
- intro: IEEE/CAA Journal of Automatica Sinica
- arxiv: https://arxiv.org/abs/1712.08470
Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video
- keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
- arxiv: https://arxiv.org/abs/1712.08832
Spot the Difference by Object Detection
- intro: Tsinghua University & JD Group
- arxiv: https://arxiv.org/abs/1801.01051
Localization-Aware Active Learning for Object Detection
Object Detection with Mask-based Feature Encoding
https://arxiv.org/abs/1802.03934
LSTD: A Low-Shot Transfer Detector for Object Detection
- intro: AAAI 2018
- arxiv: https://arxiv.org/abs/1803.01529
Pseudo Mask Augmented Object Detection
https://arxiv.org/abs/1803.05858
Revisiting RCNN: On Awakening the Classification Power of Faster RCNN
- intro: ECCV 2018
- keywords: DCR V1
- arxiv: https://arxiv.org/abs/1803.06799
- github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement
Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection
- keywords: DCR V2
- arxiv: https://arxiv.org/abs/1810.04002
- github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement
Learning Region Features for Object Detection
- intro: Peking University & MSRA
- arxiv: https://arxiv.org/abs/1803.07066
Object Detection for Comics using Manga109 Annotations
- intro: University of Tokyo & National Institute of Informatics, Japan
- arxiv: https://arxiv.org/abs/1803.08670
Task-Driven Super Resolution: Object Detection in Low-resolution Images
https://arxiv.org/abs/1803.11316
Transferring Common-Sense Knowledge for Object Detection
https://arxiv.org/abs/1804.01077
Multi-scale Location-aware Kernel Representation for Object Detection
- intro: CVPR 2018
- arxiv: https://arxiv.org/abs/1804.00428
- github: https://github.com/Hwang64/MLKP
Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors
- intro: National University of Defense Technology
- arxiv: https://arxiv.org/abs/1804.04606
DetNet: A Backbone network for Object Detection
- intro: Tsinghua University & Megvii Inc
- arxiv: https://arxiv.org/abs/1804.06215
AdvDetPatch: Attacking Object Detectors with Adversarial Patches
https://arxiv.org/abs/1806.02299
Attacking Object Detectors via Imperceptible Patches on Background
https://arxiv.org/abs/1809.05966
Physical Adversarial Examples for Object Detectors
- intro: WOOT 2018
- arxiv: https://arxiv.org/abs/1807.07769
Object detection at 200 Frames Per Second
- intro: United Technologies Research Center-Ireland
- arxiv: https://arxiv.org/abs/1805.06361
Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images
- intro: CVPR 2018 Deep Vision Workshop
- arxiv: https://arxiv.org/abs/1805.11778
SNIPER: Efficient Multi-Scale Training
- intro: University of Maryland
- keywords: SNIPER (Scale Normalization for Image Pyramid with Efficient Resampling)
- arxiv: https://arxiv.org/abs/1805.09300
- github: https://github.com/mahyarnajibi/SNIPER
Soft Sampling for Robust Object Detection
https://arxiv.org/abs/1806.06986
MetaAnchor: Learning to Detect Objects with Customized Anchors
- intro: Megvii Inc (Face++) & Fudan University
- arxiv: https://arxiv.org/abs/1807.00980
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
- intro: ECCV 2018. Middle East Technical University
- arxiv: https://arxiv.org/abs/1807.01696
- github: https://github.com/cancam/LRP
Pooling Pyramid Network for Object Detection
- intro: Google AI Perception
- arxiv: https://arxiv.org/abs/1807.03284
Modeling Visual Context is Key to Augmenting Object Detection Datasets
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1807.07428
Acquisition of Localization Confidence for Accurate Object Detection
- intro: ECCV 2018
- keywords: IoU-Net, PreciseRoIPooling
- arxiv: https://arxiv.org/abs/1807.11590
- github: https://github.com/vacancy/PreciseRoIPooling
CornerNet: Detecting Objects as Paired Keypoints
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.01244
- github: https://github.com/umich-vl/CornerNet
Unsupervised Hard Example Mining from Videos for Improved Object Detection
- intro: ECCV 2018
- arxiv: https://arxiv.org/abs/1808.04285
SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
https://arxiv.org/abs/1808.04974
A Survey of Modern Object Detection Literatur