Recent Object Detection Techniques: A Survey

Authors: Diwakar, Deepa Raj

Journal: International Journal of Image, Graphics and Signal Processing @ijigsp

Published in issue: 2, vol. 14, 2022.

Open access

Object detection is one of the most fundamental, widely used, and challenging problems in computer vision. Over the last several decades, researchers have devoted great effort to it. The task is to detect objects in images or video. Around the beginning of the 21st century, much work was done in this field; methods such as HOG, SIFT, and SURF perform well but cannot be used efficiently for real-time detection that requires both speed and accuracy. In the deep learning era, convolutional neural networks brought rapid change and opened a new pathway, and much excellent work has been done to date, such as region-based convolutional networks, YOLO, SSD, and RetinaNet. This survey reviews many research papers covering popular traditional object detection methods and current deep learning-based methods, and presents the challenges, limitations, and methodologies used to detect objects, along with directions for future research.


Object detection, Convolutional Neural Network, deep learning techniques

Short URL: https://sciup.org/15018313

IDR: 15018313 | DOI: 10.5815/ijigsp.2022.02.05



Research article