AlphaTree: Object Detection

gingo 2021-04-27 10:22:55 Resource repository: object detection


Object Detection

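RCNN, Fast RCNN, and Faster RCNN form a single line of development. The other two main directions are YOLO and SSD. YOLO has iterated up to YOLO V3, and SSD's design has since been applied in many other areas.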

Christian Szegedy / Google also made an early attempt at object detection with AlexNet.

[1] Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013. pdf

However, the real breakthrough, the one that set off the wave of deep-learning-based object detection, was RCNN.

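Localizing a region can be treated as a regression problem: predict the four values (x, y, w, h) and derive the box position from them. Training such a regressor takes much longer to converge, so the regression problem is recast as a classification problem, solved in two steps:

Step 1: generate boxes of different sizes over the image.
Step 2: extract features from the contents of each box, score them with a classifier, and take the highest-scoring box as the object's localization box.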
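
The two steps above can be sketched as a plain sliding-window search. This is only an illustration under stated assumptions: `toy_score` is a hypothetical stand-in for the feature extractor plus classifier, and the window sizes and stride are arbitrary.

```python
def sliding_windows(img_w, img_h, sizes, stride):
    """Step 1: enumerate candidate boxes (x, y, w, h) of several sizes."""
    boxes = []
    for w, h in sizes:
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                boxes.append((x, y, w, h))
    return boxes


def best_box(boxes, score_fn):
    """Step 2: score every box and keep the highest-scoring one."""
    return max(boxes, key=score_fn)


def toy_score(box, target=(8, 8, 16, 16)):
    """Hypothetical classifier score: higher when the box is close to `target`."""
    return -sum(abs(a - b) for a, b in zip(box, target))


candidates = sliding_windows(32, 32, sizes=[(8, 8), (16, 16)], stride=4)
print(len(candidates), best_box(candidates, toy_score))
```

Even this tiny 32x32 example produces 74 candidate boxes, which hints at why exhaustive window search is expensive on real images.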

compare.png

Evaluation metrics: IoU (Intersection over Union); mAP (Mean Average Precision). Speed: frame rate (FPS).
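
As a concrete reference for the first metric, here is a minimal IoU sketch. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates; the function name and box format are illustrative choices, not taken from any particular library.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
```

A detection is usually counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; mAP is then computed from the resulting precision and recall values.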

link

Method
- [SPPNet]
[Two-Stage Object Detection]
- [R-CNN]
- [Fast R-CNN]
- [Faster R-CNN]
[Single-Shot Object Detection]
- [YOLO]
- [YOLOv2]
- [YOLOv3]
- [SSD]
- [RetinaNet]
[Great improvement]
- [R-FCN]
- [Feature Pyramid Network (FPN)]

Method

SPPNet He Kaiming (何凯明) / MSRA

SPPNet: Spatial Pyramid Pooling
[3] He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014. pdf

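A CNN is usually followed by fully connected layers or a classifier, both of which require a fixed input size, so the input has to be cropped or warped; this preprocessing loses data or distorts geometry. SPP Net brings the pyramid idea into the CNN and enables multi-scale input. The network input can now be of any size: in the SPP layer, each pooling filter adjusts its size to the input, while the SPP output size stays fixed.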

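This broke with the earlier assumption that detection boxes must be proposed first, resized to a fixed size, and only then passed through the CNN. Instead, the image goes through the CNN once to obtain a feature map, features for the different detection boxes are extracted on that feature map, and each is pooled to the same size for the subsequent steps. This speeds up object detection.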
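
The fixed-output property can be illustrated with a small sketch. It assumes a single-channel feature map stored as a nested list and pyramid levels of 1x1, 2x2, and 4x4 bins; the name `spp` and these defaults are illustrative, not the authors' code.

```python
import math


def spp(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    For each pyramid level n the map is split into an n x n grid whose
    bin size depends on the input size, so the output length is always
    sum(n * n for n in levels), whatever the input height and width.
    """
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * h) // n, math.ceil((i + 1) * h / n)
            for j in range(n):
                c0, c1 = (j * w) // n, math.ceil((j + 1) * w / n)
                out.append(max(feature_map[r][c]
                               for r in range(r0, r1)
                               for c in range(c0, c1)))
    return out


small = [[r * 5 + c for c in range(5)] for r in range(4)]    # 4 x 5 map
large = [[r * 9 + c for c in range(9)] for r in range(13)]   # 13 x 9 map
print(len(spp(small)), len(spp(large)))  # both 1 + 4 + 16 = 21
```

Because the vector length depends only on the pyramid levels, the fully connected layers that follow can keep a fixed input size.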

intro: ECCV 2014 / TPAMI 2015
keywords: SPP-Net
arxiv: http://arxiv.org/abs/1406.4729
github: https://github.com/ShaoqingRen/SPP_net
notes: http://zhangliliang.com/2014/09/13/paper-note-sppnet/

Two-Stage Object Detection

RCNN Ross B. Girshick(RBG) link / UC-Berkeley

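RCNN The R-CNN framework replaces the sliding windows and hand-crafted features of traditional object detection with CNN-based feature extraction. It is an application of deep neural networks.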

Traditional region proposal methods + CNN classifier

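In other words, the second step is replaced by feature extraction with a deep neural network.
A linear SVM classifier then identifies the object's class, and a regression model tightens the bounding box.
Innovation: applying a CNN to object detection, which improved detection accuracy.
Drawback: selective search extracts about 2,000 candidate regions per image and the CNN extracts features for every one of them, so computation is repeated and detection is slow, 40-50 seconds per image.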

R-CNN raised the detection result on PASCAL VOC2007 to 66% (mAP).

rcnn

[2] Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014. pdf

github: https://github.com/rbgirshick/rcnn

intro: R-CNN
arxiv: http://arxiv.org/abs/1311.2524
supp: http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf
slides: http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf
slides: http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf
notes: http://zhangliliang.com/2014/07/23/paper-note-rcnn/
caffe-pr(“Make R-CNN the Caffe detection example”): https://github.com/BVLC/caffe/pull/482

Fast RCNN Ross B. Girshick

Fast RCNN
[4] Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.

If R-CNN's convolutions only need to be computed once per image, the running time drops sharply.

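Ross Girshick applied the SPPNet approach to RCNN and proposed a layer that can be viewed as a single-level SPP layer, called RoI Pooling, which maps inputs of different sizes to a fixed-size feature vector. The image is fed into the CNN to produce a convolutional feature map. These feature maps are combined with a region proposal algorithm to extract candidate regions. The RoI pooling layer then reshapes every candidate region to a fixed size so that it can be fed into the fully connected network.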

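1. Take the image as input.
2. Pass the image through the convolutional neural network to compute the convolutional features.
3. Extract RoIs with the earlier proposal method, apply the RoI pooling layer to every region of interest, and resize the regions. Each region is then passed to the fully connected layers.
4. A softmax layer on top of the fully connected network outputs the class. In parallel with the softmax layer, a linear regression layer outputs the bounding-box coordinates for the predicted class.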
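
Step 3 can be sketched as follows. This is a simplified illustration, assuming a single-channel feature map and RoIs already expressed in feature-map cells as `(r0, c0, r1, c1)` with exclusive ends; the function name and the 2x2 output size are hypothetical choices.

```python
import math


def roi_pool(feature_map, roi, out_size=2):
    """Max-pool the region `roi` of a 2-D feature map to out_size x out_size."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_size):
        rs = r0 + (i * h) // out_size
        re = r0 + math.ceil((i + 1) * h / out_size)
        row = []
        for j in range(out_size):
            cs = c0 + (j * w) // out_size
            ce = c0 + math.ceil((j + 1) * w / out_size)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


# One shared feature map, computed once for the whole image.
fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
print(roi_pool(fmap, (0, 0, 4, 4)))  # 4 x 4 region
print(roi_pool(fmap, (1, 2, 6, 5)))  # 5 x 3 region, same output shape
```

Both regions come out as 2x2 grids even though their sizes differ, which is what lets every RoI share the same fully connected layers.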

fastrcnn

Fast R-CNN

arxiv: http://arxiv.org/abs/1504.08083
slides: http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf
github: https://github.com/rbgirshick/fast-rcnn
github(COCO-branch): https://github.com/rbgirshick/fast-rcnn/tree/coco
webcam demo: https://github.com/rbgirshick/fast-rcnn/pull/29
notes: http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/
notes: http://blog.csdn.net/linj_m/article/details/48930179
github(“Fast R-CNN in MXNet”): https://github.com/precedenceguo/mx-rcnn
github: https://github.com/mahyarnajibi/fast-rcnn-torch
github: https://github.com/apple2373/chainer-simple-fast-rnn
github: https://github.com/zplizzi/tensorflow-fast-rcnn

A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03414
paper: http://abhinavsh.info/papers/pdfs/adversarial_object_detection.pdf
github(Caffe): https://github.com/xiaolonw/adversarial-frcnn

Faster RCNN He Kaiming (何凯明)

Faster RCNN
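Fast RCNN still relies on a traditional method for region extraction, whereas Faster RCNN unifies the Region Proposal Network, feature extraction, object classification, and bounding-box regression in a single framework.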

Faster R-CNN = Region Proposal Network +Fast R-CNN

fasterrcnn1

fasterrcnn

fasterrcnn2

Region extraction is done by a CNN, called the Region Proposal Network. With the RPN, the extra cost of region proposal is only a two-layer network. For more on the RPN, see link
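
The text above does not describe how the RPN parameterizes its proposals. In the Faster R-CNN paper, the RPN scores and refines a fixed set of reference boxes ("anchors") at every feature-map position, using 3 scales and 3 aspect ratios by default. The sketch below only generates such anchors; the stride of 16 and the cell-center convention are assumptions made for illustration.

```python
def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors (cx, cy, w, h) for every feature-map cell.

    Each anchor keeps an area of scale ** 2; `ratio` is height / width.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Assumed convention: anchor centered on the cell center.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s / r ** 0.5
                    h = s * r ** 0.5
                    anchors.append((cx, cy, w, h))
    return anchors


anchors = make_anchors(2, 3)
print(len(anchors))  # 2 * 3 cells * 9 anchors per cell
```

The RPN then predicts, for each anchor, an objectness score and four box offsets, so proposals come from the shared feature map rather than from a separate algorithm.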

rpn

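Faster R-CNN designs the RPN, a network that extracts candidate regions, to replace the time-consuming Selective Search, which greatly increases detection speed. The table below compares the detection speed of R-CNN, Fast R-CNN, and Faster R-CNN: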

speed

[5] Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

intro: NIPS 2015
arxiv: http://arxiv.org/abs/1506.01497
gitxiv: http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region
slides: http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf
github(official, Matlab): https://github.com/ShaoqingRen/faster_rcnn
github: https://github.com/rbgirshick/py-faster-rcnn
github(MXNet): https://github.com/msracver/Deformable-ConvNets/tree/master/faster_rcnn
github: https://github.com//jwyang/faster-rcnn.pytorch
github: https://github.com/mitmul/chainer-faster-rcnn
github: https://github.com/andreaskoepf/faster-rcnn.torch
github: https://github.com/ruotianluo/Faster-RCNN-Densecap-torch
github: https://github.com/smallcorgi/Faster-RCNN_TF
github: https://github.com/CharlesShang/TFFRCNN
github(C++ demo): https://github.com/YihangLou/FasterRCNN-Encapsulation-Cplusplus
github: https://github.com/yhenon/keras-frcnn
github: https://github.com/Eniac-Xie/faster-rcnn-resnet
github(C++): https://github.com/D-X-Y/caffe-faster-rcnn/tree/dev

R-CNN minus R

intro: BMVC 2015
arxiv: http://arxiv.org/abs/1506.06981

Faster R-CNN in MXNet with distributed implementation and data parallelization

github: https://github.com/dmlc/mxnet/tree/master/example/rcnn

Contextual Priming and Feedback for Faster R-CNN

intro: ECCV 2016. Carnegie Mellon University
paper: http://abhinavsh.info/context_priming_feedback.pdf
poster: http://www.eccv2016.org/files/posters/P-1A-20.pdf

An Implementation of Faster RCNN with Study for Region Sampling

intro: Technical Report, 3 pages. CMU
arxiv: https://arxiv.org/abs/1702.02138
github: https://github.com/endernewton/tf-faster-rcnn

Interpretable R-CNN

intro: North Carolina State University & Alibaba
keywords: AND-OR Graph (AOG)
arxiv: https://arxiv.org/abs/1711.05226

Light-Head R-CNN: In Defense of Two-Stage Object Detector

intro: Tsinghua University & Megvii Inc
arxiv: https://arxiv.org/abs/1711.07264
github(official, Tensorflow): https://github.com/zengarden/light_head_rcnn
github: https://github.com/terrychenism/Deformable-ConvNets/blob/master/rfcn/symbols/resnet_v1_101_rfcn_light.py#L784

Cascade R-CNN: Delving into High Quality Object Detection

intro: CVPR 2018. UC San Diego
arxiv: https://arxiv.org/abs/1712.00726
github(Caffe, official): https://github.com/zhaoweicai/cascade-rcnn

Cascade R-CNN: High Quality Object Detection and Instance Segmentation

https://arxiv.org/abs/1906.09756
github(Caffe, official): https://github.com/zhaoweicai/cascade-rcnn
github(official): https://github.com/zhaoweicai/Detectron-Cascade-RCNN

SMC Faster R-CNN: Toward a scene-specialized multi-object detector

https://arxiv.org/abs/1706.10217

Domain Adaptive Faster R-CNN for Object Detection in the Wild

intro: CVPR 2018. ETH Zurich & ESAT/PSI
arxiv: https://arxiv.org/abs/1803.03243
github(official. Caffe): https://github.com/yuhuayc/da-faster-rcnn

Robust Physical Adversarial Attack on Faster R-CNN Object Detector

https://arxiv.org/abs/1804.05810

Auto-Context R-CNN

intro: Rejected by ECCV18
arxiv: https://arxiv.org/abs/1807.02842

Grid R-CNN

intro: SenseTime
arxiv: https://arxiv.org/abs/1811.12030

Grid R-CNN Plus: Faster and Better

intro: SenseTime Research & CUHK & Beihang University
arxiv: https://arxiv.org/abs/1906.05688
github: https://github.com/STVIR/Grid-R-CNN

Few-shot Adaptive Faster R-CNN

intro: CVPR 2019
arxiv: https://arxiv.org/abs/1903.09372

Libra R-CNN: Towards Balanced Learning for Object Detection

intro: CVPR 2019
arxiv: https://arxiv.org/abs/1904.02701

Rethinking Classification and Localization in R-CNN

intro: Northeastern University & Microsoft
arxiv: https://arxiv.org/abs/1904.06493

Reprojection R-CNN: A Fast and Accurate Object Detector for 360° Images

intro: Peking University
arxiv: https://arxiv.org/abs/1907.11830

Rethinking Classification and Localization for Cascade R-CNN

intro: BMVC 2019
arxiv: https://arxiv.org/abs/1907.11914

Single-Shot Object Detection

Yolo

Yolo(You only look once)

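YOLO's detection approach differs from that of the R-CNN family: it treats object detection as a regression task. The core idea of YOLO is to take the whole image as the network input and regress the positions of the bounding boxes and their classes directly at the output layer.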
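
To make "regressing boxes at the output layer" concrete, here is a sketch of decoding one grid cell's prediction into pixel coordinates. It follows the YOLO v1 parameterization (x, y as offsets inside the cell, w, h as fractions of the whole image); the function name and the default 7x7 grid on a 448x448 input are illustrative assumptions.

```python
def decode_cell(pred, row, col, grid=7, img_w=448, img_h=448):
    """Convert one cell's (x, y, w, h, conf) into pixel corner coordinates.

    x, y are offsets inside the cell in [0, 1]; w, h are fractions of the
    whole image. Returns (x1, y1, x2, y2, conf).
    """
    x, y, w, h, conf = pred
    cx = (col + x) / grid * img_w
    cy = (row + y) / grid * img_h
    bw, bh = w * img_w, h * img_h
    return (cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, conf)


# A box centered in the middle cell, a quarter of the image wide and half as tall.
print(decode_cell((0.5, 0.5, 0.25, 0.5, 0.9), row=3, col=3))
```

Every cell emits such numbers in a single forward pass, so no separate proposal stage is needed.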

[6] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015). pdf (YOLO, outstanding work, really practical)
PPT

C (official): https://pjreddie.com/darknet/yolo/ v3
https://pjreddie.com/darknet/yolov2/ v2
https://pjreddie.com/darknet/yolov1/ v1

PyTorch (Tencent) v1, v2, v3: https://github.com/TencentYoutuResearch/ObjectDetection-OneStageDet

For an introduction to YOLO, see this overview.

arxiv: http://arxiv.org/abs/1506.02640
code: http://pjreddie.com/darknet/yolo/
github: https://github.com/pjreddie/darknet
blog: https://pjreddie.com/publications/yolo/
slides: https://docs.google.com/presentation/d/1aeRvtKG21KHdD5lg6Hgyhx5rPq_ZOsGjG5rJ1HP7BbA/pub?start=false&loop=false&delayms=3000&slide=id.p
reddit: https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/
github: https://github.com/gliese581gg/YOLO_tensorflow
github: https://github.com/xingwangsfu/caffe-yolo
github: https://github.com/frankzhangrui/Darknet-Yolo
github: https://github.com/BriSkyHekun/py-darknet-yolo
github: https://github.com/tommy-qichang/yolo.torch
github: https://github.com/frischzenger/yolo-windows
github: https://github.com/AlexeyAB/yolo-windows
github: https://github.com/nilboy/tensorflow-yolo

darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++

Start Training YOLO with Our Own Data

intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
blog: http://guanghan.info/blog/en/my-works/train-yolo/
github: https://github.com/Guanghan/darknet

YOLO: Core ML versus MPSNNGraph

intro: Tiny YOLO for iOS implemented using CoreML but also using the new MPS graph API.
blog: http://machinethink.net/blog/yolo-coreml-versus-mps-graph/
github: https://github.com/hollance/YOLO-CoreML-MPSNNGraph

TensorFlow YOLO object detection on Android

intro: Real-time object detection on Android using the YOLO network with TensorFlow
github: https://github.com/natanielruiz/android-yolo

Computer Vision in iOS – Object Detection

YOLOv2

YOLO9000: Better, Faster, Stronger

arxiv: https://arxiv.org/abs/1612.08242
code: http://pjreddie.com/yolo9000/
github(Chainer): https://github.com/leetenki/YOLOv2
github(Keras): https://github.com/allanzelener/YAD2K
github(PyTorch): https://github.com/longcw/yolo2-pytorch
github(Tensorflow): https://github.com/hizhangp/yolo_tensorflow
github(Windows): https://github.com/AlexeyAB/darknet
github: https://github.com/choasUp/caffe-yolo9000
github: https://github.com/philipperemy/yolo-9000

darknet_scripts

intro: Auxiliary scripts to work with (YOLO) darknet deep learning framework. AKA -> How to generate YOLO anchors?
github: https://github.com/Jumabek/darknet_scripts

Yolo_mark: GUI for marking bounded boxes of objects in images for training Yolo v2

github: https://github.com/AlexeyAB/Yolo_mark

LightNet: Bringing pjreddie’s DarkNet out of the shadows

https://github.com//explosion/lightnet

YOLO v2 Bounding Box Tool

intro: Bounding box labeler tool to generate the training data in the format YOLO v2 requires.
github: https://github.com/Cartucho/yolo-boundingbox-labeler-GUI

YOLOv3

YOLOv3: An Incremental Improvement

project page: https://pjreddie.com/darknet/yolo/
paper: https://pjreddie.com/media/files/papers/YOLOv3.pdf
arxiv: https://arxiv.org/abs/1804.02767
github: https://github.com/DeNA/PyTorch_YOLOv3
github: https://github.com/eriklindernoren/PyTorch-YOLOv3

Gaussian YOLOv3: An Accurate and Fast Object Detector Using Localization Uncertainty for Autonomous Driving

https://arxiv.org/abs/1904.04620

YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers

https://arxiv.org/abs/1811.05588

Spiking-YOLO: Spiking Neural Network for Real-time Object Detection

https://arxiv.org/abs/1903.06530

SSD (The Single Shot Detector): detailed explanation

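SSD SSD is an object detection algorithm that directly predicts bounding-box coordinates and classes, with no proposal generation step. It uses an object classification model, such as the VGG16 network, as its base network.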

[7] Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015). pdf

TensorFlow source: https://github.com/balancap/SSD-Tensorflow/blob/master/nets/ssd_vgg_300.py

Caffe: https://github.com/weiliu89/caffe/tree/ssd

intro: ECCV 2016 Oral
arxiv: http://arxiv.org/abs/1512.02325
paper: http://www.cs.unc.edu/~wliu/papers/ssd.pdf
slides: http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf
github(Official): https://github.com/weiliu89/caffe/tree/ssd
video: http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973
github: https://github.com/zhreshold/mxnet-ssd
github: https://github.com/zhreshold/mxnet-ssd.cpp
github: https://github.com/rykov8/ssd_keras
github: https://github.com/balancap/SSD-Tensorflow
github: https://github.com/amdegroot/ssd.pytorch
github(Caffe): https://github.com/chuanqi305/MobileNet-SSD

What’s the difference in performance between this new code you pushed and the previous code?

Issue #327 · weiliu89/caffe

DSSD : Deconvolutional Single Shot Detector

intro: UNC Chapel Hill & Amazon Inc
arxiv: https://arxiv.org/abs/1701.06659
github: https://github.com/chengyangfu/caffe/tree/dssd
github: https://github.com/MTCloudVision/mxnet-dssd
demo: http://120.52.72.53/www.cs.unc.edu/c3pr90ntc0td/~cyfu/dssd_lalaland.mp4

Enhancement of SSD by concatenating feature maps for object detection

intro: rainbow SSD (R-SSD)
arxiv: https://arxiv.org/abs/1705.09587

Context-aware Single-Shot Detector

keywords: CSSD, DiCSSD, DeCSSD, effective receptive fields (ERFs), theoretical receptive fields (TRFs)
arxiv: https://arxiv.org/abs/1707.08682

Feature-Fused SSD: Fast Detection for Small Objects

https://arxiv.org/abs/1709.05054

FSSD: Feature Fusion Single Shot Multibox Detector

https://arxiv.org/abs/1712.00960

Weaving Multi-scale Context for Single Shot Detector

intro: WeaveNet
keywords: fuse multi-scale information
arxiv: https://arxiv.org/abs/1712.03149

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

keywords: ESSD
arxiv: https://arxiv.org/abs/1801.05918

Tiny SSD: A Tiny Single-shot Detection Deep Convolutional Neural Network for Real-time Embedded Object Detection

https://arxiv.org/abs/1802.06488

MDSSD: Multi-scale Deconvolutional Single Shot Detector for small objects

intro: Zhengzhou University
arxiv: https://arxiv.org/abs/1805.07009

Accurate Single Stage Detector Using Recurrent Rolling Convolution

intro: CVPR 2017. SenseTime
keywords: Recurrent Rolling Convolution (RRC)
arxiv: https://arxiv.org/abs/1704.05776
github: https://github.com/xiaohaoChen/rrc_detection

Residual Features and Unified Prediction Network for Single Stage Detection

https://arxiv.org/abs/1707.05031

FPN

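FPN (Feature Pyramid Networks) is a feature extraction method that fuses feature information from multiple layers and can be combined with various deep neural networks.
SSD's multi-scale feature fusion has no upsampling step and does not use sufficiently low-level features (in SSD, the lowest-level feature is conv4_3 of the VGG network).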
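
The upsampling step that SSD lacks is the core of FPN's top-down pathway. The sketch below shows only that merge, assuming single-channel maps stored as nested lists where each level is exactly half the size of the previous one; the 1x1 lateral and 3x3 smoothing convolutions of the FPN paper are omitted, and the function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbor 2x upsampling of a 2-D map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def top_down(pyramid):
    """Fuse maps ordered fine -> coarse.

    Starting from the coarsest map, upsample and add element-wise to the
    next finer map, so every output level carries coarse semantic
    information as well as its own finer detail.
    """
    fused = [pyramid[-1]]
    for fine in reversed(pyramid[:-1]):
        up = upsample2x(fused[0])
        merged = [[a + b for a, b in zip(fr, ur)] for fr, ur in zip(fine, up)]
        fused.insert(0, merged)
    return fused


c3 = [[1] * 4 for _ in range(4)]   # fine, low-level map
c4 = [[10] * 2 for _ in range(2)]
c5 = [[100]]                       # coarse, high-level map
p3, p4, p5 = top_down([c3, c4, c5])
print(p3[0], p4[0], p5[0])
```

Here the finest output `p3` ends up containing contributions from all three levels, which is the property that helps with small objects.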

fpn

Feature Pyramid Networks for Object Detection pdf

Feature Pyramid Networks for Object Detection

intro: Facebook AI Research
arxiv: https://arxiv.org/abs/1612.03144

Action-Driven Object Detection with Top-Down Visual Attentions

arxiv: https://arxiv.org/abs/1612.06704

Beyond Skip Connections: Top-Down Modulation for Object Detection

intro: CMU & UC Berkeley & Google Research
arxiv: https://arxiv.org/abs/1612.06851

Wide-Residual-Inception Networks for Real-time Object Detection

intro: Inha University
arxiv: https://arxiv.org/abs/1702.01243

Attentional Network for Visual Object Detection

intro: University of Maryland & Mitsubishi Electric Research Laboratories
arxiv: https://arxiv.org/abs/1702.01478

Learning Chained Deep Features and Classifiers for Cascade in Object Detection

keywords: CC-Net
intro: chained cascade network (CC-Net). 81.1% mAP on PASCAL VOC 2007
arxiv: https://arxiv.org/abs/1702.07054

DeNet: Scalable Real-time Object Detection with Directed Sparse Sampling

intro: ICCV 2017 (poster)
arxiv: https://arxiv.org/abs/1703.10295

Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries

intro: CVPR 2017
arxiv: https://arxiv.org/abs/1704.03944

Spatial Memory for Context Reasoning in Object Detection

arxiv: https://arxiv.org/abs/1704.04224

Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

https://arxiv.org/abs/1704.05775

LCDet: Low-Complexity Fully-Convolutional Neural Networks for Object Detection in Embedded Systems

intro: Embedded Vision Workshop in CVPR. UC San Diego & Qualcomm Inc
arxiv: https://arxiv.org/abs/1705.05922

Point Linking Network for Object Detection

intro: Point Linking Network (PLN)
arxiv: https://arxiv.org/abs/1706.03646

Perceptual Generative Adversarial Networks for Small Object Detection

https://arxiv.org/abs/1706.05274

Few-shot Object Detection

https://arxiv.org/abs/1706.08249

Yes-Net: An effective Detector Based on Global Information

https://arxiv.org/abs/1706.09180

Towards lightweight convolutional neural networks for object detection

https://arxiv.org/abs/1707.01395

RON: Reverse Connection with Objectness Prior Networks for Object Detection

Deformable Part-based Fully Convolutional Network for Object Detection

intro: BMVC 2017 (oral). Sorbonne Universités & CEDRIC
arxiv: https://arxiv.org/abs/1707.06175

Adaptive Feeding: Achieving Fast and Accurate Detections by Adaptively Combining Object Detectors

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1707.06399

Recurrent Scale Approximation for Object Detection in CNN

intro: ICCV 2017
keywords: Recurrent Scale Approximation (RSA)
arxiv: https://arxiv.org/abs/1707.09531
github: https://github.com/sciencefans/RSA-for-object-detection

DSOD: Learning Deeply Supervised Object Detectors from Scratch

intro: ICCV 2017. Fudan University & Tsinghua University & Intel Labs China
arxiv: https://arxiv.org/abs/1708.01241
github: https://github.com/szq0214/DSOD

Object Detection from Scratch with Deep Supervision

https://arxiv.org/abs/1809.09294

CoupleNet: Coupling Global Structure with Local Parts for Object Detection

intro: ICCV 2017
arxiv: https://arxiv.org/abs/1708.02863

Incremental Learning of Object Detectors without Catastrophic Forgetting

intro: ICCV 2017. Inria
arxiv: https://arxiv.org/abs/1708.06977

Zoom Out-and-In Network with Map Attention Decision for Region Proposal and Object Detection

https://arxiv.org/abs/1709.04347

StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection

https://arxiv.org/abs/1709.05788

Dynamic Zoom-in Network for Fast Object Detection in Large Images

https://arxiv.org/abs/1711.05187

Zero-Annotation Object Detection with Web Knowledge Transfer

intro: NTU, Singapore & Amazon
keywords: multi-instance multi-label domain adaption learning framework
arxiv: https://arxiv.org/abs/1711.05954

MegDet: A Large Mini-Batch Object Detector

intro: Peking University & Tsinghua University & Megvii Inc
arxiv: https://arxiv.org/abs/1711.07240

Receptive Field Block Net for Accurate and Fast Object Detection

intro: RFBNet
arxiv: https://arxiv.org/abs/1711.07767
github: https://github.com//ruinmessi/RFBNet

An Analysis of Scale Invariance in Object Detection - SNIP

intro: CVPR 2018
arxiv: https://arxiv.org/abs/1711.08189
github: https://github.com/bharatsingh430/snip

Feature Selective Networks for Object Detection

https://arxiv.org/abs/1711.08879

Learning a Rotation Invariant Detector with Rotatable Bounding Box

arxiv: https://arxiv.org/abs/1711.09405
github(official, Caffe): https://github.com/liulei01/DRBox

Scalable Object Detection for Stylized Objects

intro: Microsoft AI & Research Munich
arxiv: https://arxiv.org/abs/1711.09822

Learning Object Detectors from Scratch with Gated Recurrent Feature Pyramids

arxiv: https://arxiv.org/abs/1712.00886
github: https://github.com/szq0214/GRP-DSOD

Deep Regionlets for Object Detection

keywords: region selection network, gating network
arxiv: https://arxiv.org/abs/1712.02408

Training and Testing Object Detectors with Virtual Images

intro: IEEE/CAA Journal of Automatica Sinica
arxiv: https://arxiv.org/abs/1712.08470

Large-Scale Object Discovery and Detector Adaptation from Unlabeled Video

keywords: object mining, object tracking, unsupervised object discovery by appearance-based clustering, self-supervised detector adaptation
arxiv: https://arxiv.org/abs/1712.08832

Spot the Difference by Object Detection

intro: Tsinghua University & JD Group
arxiv: https://arxiv.org/abs/1801.01051

Localization-Aware Active Learning for Object Detection

arxiv: https://arxiv.org/abs/1801.05124

Object Detection with Mask-based Feature Encoding

https://arxiv.org/abs/1802.03934

LSTD: A Low-Shot Transfer Detector for Object Detection

intro: AAAI 2018
arxiv: https://arxiv.org/abs/1803.01529

Pseudo Mask Augmented Object Detection

https://arxiv.org/abs/1803.05858

Revisiting RCNN: On Awakening the Classification Power of Faster RCNN

intro: ECCV 2018
keywords: DCR V1
arxiv: https://arxiv.org/abs/1803.06799
github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement

Decoupled Classification Refinement: Hard False Positive Suppression for Object Detection

keywords: DCR V2
arxiv: https://arxiv.org/abs/1810.04002
github(official, MXNet): https://github.com/bowenc0221/Decoupled-Classification-Refinement

Learning Region Features for Object Detection

intro: Peking University & MSRA
arxiv: https://arxiv.org/abs/1803.07066

Object Detection for Comics using Manga109 Annotations

intro: University of Tokyo & National Institute of Informatics, Japan
arxiv: https://arxiv.org/abs/1803.08670

Task-Driven Super Resolution: Object Detection in Low-resolution Images

https://arxiv.org/abs/1803.11316

Transferring Common-Sense Knowledge for Object Detection

https://arxiv.org/abs/1804.01077

Multi-scale Location-aware Kernel Representation for Object Detection

Loss Rank Mining: A General Hard Example Mining Method for Real-time Detectors

intro: National University of Defense Technology
arxiv: https://arxiv.org/abs/1804.04606

DetNet: A Backbone network for Object Detection

intro: Tsinghua University & Megvii Inc
arxiv: https://arxiv.org/abs/1804.06215

AdvDetPatch: Attacking Object Detectors with Adversarial Patches

https://arxiv.org/abs/1806.02299

Attacking Object Detectors via Imperceptible Patches on Background

https://arxiv.org/abs/1809.05966

Physical Adversarial Examples for Object Detectors

intro: WOOT 2018
arxiv: https://arxiv.org/abs/1807.07769

Object detection at 200 Frames Per Second

intro: United Technologies Research Center-Ireland
arxiv: https://arxiv.org/abs/1805.06361

Object Detection using Domain Randomization and Generative Adversarial Refinement of Synthetic Images

intro: CVPR 2018 Deep Vision Workshop
arxiv: https://arxiv.org/abs/1805.11778

SNIPER: Efficient Multi-Scale Training

intro: University of Maryland
keywords: SNIPER (Scale Normalization for Image Pyramid with Efficient Resampling)
arxiv: https://arxiv.org/abs/1805.09300
github: https://github.com/mahyarnajibi/SNIPER

Soft Sampling for Robust Object Detection

https://arxiv.org/abs/1806.06986

MetaAnchor: Learning to Detect Objects with Customized Anchors