
Object Detection: A Comprehensive Review of the State-of-the-Art Methods

Akhil Kumar, Arvind Kalia, Akashdeep Sharma


Object detection is the process by which a computer vision system localizes and classifies objects in a given image or sequence of images. Work in this area falls into two broad categories. The first comprises traditional methods, which detect an object in a single image with few or no deformations. The second comprises evolutionary methods, which detect multiple objects in a given image or a sequence of images containing deformations. The evolutionary methods address many core issues, such as fast detection, multi-view and multi-resolution detection, object-part relations, and deformations caused by moving objects and backgrounds. This work presents a survey of state-of-the-art object detection methods. The methods surveyed are Histogram of Oriented Gradients (HOG) features, the family of region-proposal-based Convolutional Neural Networks (R-CNN), the Spatial Pyramid Pooling Network (SPP-Net), the You Only Look Once (YOLO) family, and the Single Shot Detector (SSD). The work discusses the methods, training, and evaluation of evolutionary object detection methods based on Convolutional Neural Networks and deep learning. Finally, open research issues in object detection are discussed.
