Sky-CNN: a CNN-based learning approach for skyline scene understanding

Authors: Ameni Sassi, Wael Ouarda, Chokri Ben Amar, Serge Miguet

Journal: International Journal of Intelligent Systems and Applications @ijisa

Published in issue: No. 4, Vol. 11, 2019.

Free access

Skyline scenes are a subject of scientific interest for some geographers and urbanists, yet they have not been well handled in computer vision tasks. Understanding the context of a skyline scene could rely on approaches based on hand-crafted features combined with linear classifiers, but these have been somewhat sidelined in favor of approaches based on Convolutional Neural Networks (CNNs). In this paper, we propose a new CNN learning approach to categorize skyline scenes. The proposed model requires a pre-processing step that improves both the deep-learned features and the training time. To evaluate the suggested system, we constructed the SKYLINEScene database (DB), which contains 2000 images of urban and rural landscape scenes with a skyline view. To examine the performance of our Sky-CNN system, we carried out many fair comparisons using well-known CNN architectures, with the SKYLINEScene DB used for testing. Our approach shows its robustness in skyline context understanding and outperforms hand-crafted approaches based on global and local features.

Keywords: Convolutional Neural Network, deep learning, scene categorization, skyline, features representation, deep learned features

Short URL: https://sciup.org/15016584

IDR: 15016584 | DOI: 10.5815/ijisa.2019.04.02

Research article