Journal articles - Computer Optics

All articles: 2346

Transformer point net: cost-efficient classification of on-road objects captured by light ranging sensors on low-resolution conditions

Pamplona José Fernando, Madrigal Carlos Andrés, Herrera-Ramirez Jorge Alexis

Research article

Three-dimensional perception applications have been growing since Light Detection and Ranging (LiDAR) devices became more affordable. Among these applications, navigation and collision avoidance systems stand out for their importance in autonomous vehicles, which are currently drawing considerable attention. On-road object classification from three-dimensional information is a solid base for an autonomous vehicle perception system, but several factors make the task challenging: objects are captured from one side only, their shapes are highly variable, and occlusions are common. The greatest challenge is low resolution, which leads to a significant drop in the performance of classification methods. While most classification architectures tend to grow larger to obtain deeper features, we explore the opposite direction, contributing to the implementation of low-cost mobile platforms that could use low-resolution detection and ranging devices. In this paper, we propose an approach for on-road object classification under extremely low-resolution conditions. It uses three-dimensional point clouds directly as sequences in a transformer-convolutional architecture that could be useful on embedded devices. Our proposal reaches an accuracy of 89.74 % on objects represented with only 16 points extracted from the Waymo, Lyft Level 5 and KITTI datasets. It runs in real time (22 Hz) on a single 2.3 GHz processor core.
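As a rough illustration of the 16-point regime, the sketch below reduces an object's point cloud to a fixed-length sequence by uniform random sampling. The sampling strategy, the function name `subsample_points`, and the toy cloud are assumptions made for illustration; the abstract does not state how the 16 points are selected.

```python
import random

def subsample_points(points, n=16, seed=0):
    """Return exactly n points from a cloud.

    Samples without replacement when enough points exist; otherwise pads
    the cloud by repeating randomly chosen points.
    """
    if not points:
        raise ValueError("empty point cloud")
    rng = random.Random(seed)  # seeded for reproducibility
    if len(points) >= n:
        return rng.sample(points, n)
    padding = [rng.choice(points) for _ in range(n - len(points))]
    return list(points) + padding

# Toy cloud of 100 (x, y, z) points; real data would come from a LiDAR sweep.
cloud = [(i * 0.1, i * 0.05, 1.0) for i in range(100)]
sequence = subsample_points(cloud)        # 16 points, usable as a sequence
sparse = subsample_points(cloud[:5])      # padded up to 16 points
```

At the reported 22 Hz, the time budget per classification is roughly 1/22 s, or about 45 ms.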

Free

Tree-serial parametric dynamic programming with flexible prior model for image denoising

Thang Pham Cong, Kopylov Andrei Valerievich

Research article

We consider image denoising procedures based on computationally efficient tree-serial parametric dynamic programming, on different representations of an image lattice by a set of acyclic graphs, and on a new type of non-convex regularization that allows a priori preferences to be set flexibly. Experimental results in image denoising, as well as a comparison with related methods, are provided. The new extended version of the multi-quadratic dynamic programming procedures proposed here shows improved accuracy for images of different types.

Free

Tunable diffraction grating with transparent indium-tin oxide electrodes on a lithium niobate X-cut crystal

Paranin Vyacheslav Dmitrievich, Karpeev Sergei Vladimirovich, Tukmakov Konstantin Nickolaevich, Volodkin Boris Olegovich

Research article

A tunable diffraction grating based on an electro-optic X-cut lithium niobate crystal has been manufactured and experimentally analyzed. The electrode period is 290 μm, the electrode width is 117.5 μm, and the electrode thickness is 150-160 nm. The electrodes are made of transparent conducting indium-tin oxide, which also serves as an antireflection coating that increases the optical transmission. To prevent crystal polarization switching and electrical breakdown, an optimized electrode topology with an end ellipticity of 1:1 and an increased interelectrode gap is used. The optical diagram of the tunable grating with alternating electrode potentials is analyzed for various gap voltages. The intensity of the zero diffraction order is shown to decrease by 40 % at a voltage of 800 V. At the same time, new diffraction orders appear at angles of ±λ/(2d). Measurement of the forward-bias and reverse-bias regions of the modulation characteristic reveals no hysteresis, which confirms that the electrode topology design is correct.
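For a sense of scale, the small-angle expression ±λ/(2d) from the abstract can be evaluated, taking d to be the stated electrode period of 290 μm (an assumption about the symbol, which the abstract does not define). The wavelength is not given in the abstract either; 632.8 nm (a He-Ne laser line) is assumed here purely for illustration.

```python
import math

def new_order_angle_rad(wavelength_m, period_m):
    """Small-angle estimate of the new diffraction orders: lambda / (2 d)."""
    return wavelength_m / (2.0 * period_m)

d = 290e-6       # electrode period from the abstract, in metres
lam = 632.8e-9   # ASSUMED wavelength (He-Ne line), not stated in the source

theta = new_order_angle_rad(lam, d)
theta_deg = math.degrees(theta)
print(f"{theta * 1e3:.3f} mrad ({theta_deg:.4f} deg)")
```

Under this assumed wavelength the new orders sit at roughly 1.09 mrad from the zero order; a different source wavelength scales the angle proportionally.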

Free

Two calibration models for compensation of the individual elements properties of self-emitting displays

Basova Olga Andreevna, Gladilin Sergey Alexandrovich, Grigoryev Anton Sergeevich, Nikolaev Dmitry Petrovich

Research article

In this paper, we examine the applicability limits of different methods for compensating the individual properties of self-emitting displays with significant non-uniformity of chromaticity and maximum brightness. The aim of the compensation is to minimize the perceived image non-uniformity. It is based on minimizing the perceived distance between the target (ideally displayed) image and the simulated image displayed by the calibrated screen. The S-CIELAB model of the human visual system is used to estimate the perceived distance between two images. We compare the efficiency of the channel-wise and linear (with channel mixing) compensation models depending on the models of variation in the characteristics of display elements (subpixels). We found that even for a display with uniform chromatic subpixel characteristics, the linear model with channel mixing is superior in terms of compensation accuracy.
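The difference between the two compensation models can be sketched in a few lines: a channel-wise model scales each colour channel independently, while a linear model with channel mixing applies a full 3x3 matrix per display element. The function names, gains, and matrix values below are hypothetical and only illustrate the structure of the two models, not the calibration procedure of the paper.

```python
def channelwise(rgb, gains):
    """Channel-wise compensation: one independent gain per channel."""
    return tuple(g * v for g, v in zip(gains, rgb))

def linear_mixing(rgb, matrix):
    """Linear compensation with channel mixing: full 3x3 matrix per element."""
    return tuple(sum(m * v for m, v in zip(row, rgb)) for row in matrix)

pixel = (1.0, 0.5, 0.25)

# Hypothetical per-element calibration parameters.
gains = (0.9, 1.0, 1.1)
mix = [
    [0.90, 0.05, 0.00],
    [0.02, 1.00, 0.01],
    [0.00, 0.03, 1.10],
]

out_channelwise = channelwise(pixel, gains)
out_linear = linear_mixing(pixel, mix)

# A diagonal matrix reduces the linear model to the channel-wise one.
diag = [[gains[0], 0, 0], [0, gains[1], 0], [0, 0, gains[2]]]
out_diag = linear_mixing(pixel, diag)
```

The channel-wise model is the special case of the linear model with a diagonal matrix, which is why the linear model can never fit worse on the same data.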

Free

U-net-bin: hacking the document image binarization contest

Bezmaternykh Pavel Vladimirovich, Ilin Dmitrii Alexeevich, Nikolaev Dmitry Petrovich

Research article

Image binarization is still a challenging task in a variety of applications. In particular, the Document Image Binarization Contest (DIBCO) is organized regularly to track state-of-the-art techniques for historical document binarization. In this work, we present a binarization method that was ranked first in the DIBCO'17 contest. It is a convolutional neural network (CNN) based method that uses the U-Net architecture, originally designed for biomedical image segmentation. We describe our approach to training data preparation and contest ground truth examination and provide multiple insights on its construction (so-called hacking). This led to a more accurate problem statement for historical document binarization with respect to the challenges one could face in open-access datasets. A Docker container with the final network, along with all the supplementary data we used in the training process, has been published on GitHub.

Free

Unsupervised color texture segmentation based on multi-scale region-level Markov random field models

Song Xu, Wu Liang, Liu Guoying

Research article

In the field of color texture segmentation, the region-level Markov random field (RMRF) model has become a focus of attention because of its efficiency in modeling long-range spatial constraints. However, an RMRF defined on a single scale cannot describe the non-stationary nature of the image, which severely limits its robustness. Hence, by combining the wavelet transform and the RMRF model, we present in this paper a multi-scale RMRF (MsRMRF) model in the wavelet domain. Within the Bayesian framework, the proposed model seamlessly integrates the multi-scale information stemming from both the original image and the region-level spatial constraints. Therefore, the new model can accurately describe the characteristics of different kinds of texture. Based on MsRMRF, an unsupervised algorithm is designed for segmenting color texture images. Both synthetic color texture images and remote sensing images are used in the comparative experiments, and the results show that the proposed method produces more accurate segmentations than the competing methods.

Free

Vanishing point detection with direct and transposed fast hough transform inside the neural network

Sheshkus Alexander Vladimirovich, Chirvonaya Anastasiya Nikolaevna, Matveev Daniil Mikhailovich, Nikolaev Dmitry Petrovich, Arlazarov Vladimir Lvovich

Research article

In this paper, we suggest a new neural network architecture for vanishing point detection in images. The key element is the use of the direct and transposed fast Hough transforms separated by convolutional layer blocks with standard activation functions. This allows us to obtain the answer in the coordinates of the input image at the output of the network and thus to calculate the coordinates of the vanishing point by simply selecting the maximum. In addition, we prove that the transposed fast Hough transform can be calculated using the direct one. The use of integral operators enables the neural network to rely on global rectilinear features in the image, which makes it well suited for detecting vanishing points. To demonstrate the effectiveness of the proposed architecture, we use a set of images from a DVR and show its superiority over existing methods. Note, in addition, that the proposed neural network architecture essentially repeats the process of direct and back projection used, for example, in computed tomography.
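The final step described above, reading the vanishing point off the output map by selecting its maximum, can be sketched as follows. The nested-list response map and the function name are hypothetical stand-ins for the network output, which the abstract does not specify in detail.

```python
def vanishing_point_from_map(response):
    """Return (x, y) of the largest value in a 2D response map (list of rows)."""
    best_val, best_xy = float("-inf"), None
    for y, row in enumerate(response):
        for x, val in enumerate(row):
            if val > best_val:
                best_val, best_xy = val, (x, y)
    return best_xy

# Toy 3x3 map standing in for the network output in input-image coordinates.
heatmap = [
    [0.1, 0.2, 0.1],
    [0.3, 0.9, 0.2],
    [0.1, 0.4, 0.1],
]
vp = vanishing_point_from_map(heatmap)
```

Because the output is already expressed in the coordinates of the input image, no further coordinate conversion is needed after the maximum is found.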

Free

Vehicle wheel weld detection based on improved YOLO V4 algorithm

Liang Tian Jiao, Pan Wei Guo, Bao Hong, Pan Feng

Research article

In recent years, vision-based object detection has made great progress across different fields. For instance, in automobile manufacturing, weld detection is a key step of weld inspection in wheel production. Automatic detection and positioning of welded parts on wheels can improve the efficiency of wheel hub production. At present, there are few deep-learning-based methods for detecting vehicle wheel welds. In this paper, a method based on the YOLO v4 algorithm is proposed to detect vehicle wheel welds. The main contributions of the proposed method are the use of k-means to optimize anchor box sizes, a Distance-IoU loss to optimize the loss function of YOLO v4, and non-maximum suppression using Distance-IoU to eliminate redundant candidate bounding boxes. These steps improve detection accuracy. The experiments show that the improved method achieves high accuracy in vehicle wheel weld detection (4.92 percentage points higher than the baseline model with respect to AP75 and 2.75 percentage points higher with respect to AP50). We also evaluated the proposed method on the public KITTI dataset. The detection results show the improved method's effectiveness.
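The Distance-IoU measure mentioned above can be sketched as follows. This follows the commonly used definition (IoU minus the squared distance between box centres, normalised by the squared diagonal of the smallest enclosing box); it is an illustrative sketch with assumed function names and box format, not the authors' implementation.

```python
def diou(box_a, box_b):
    """Distance-IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centres.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2

    return iou - rho2 / c2 if c2 > 0 else iou

def diou_loss(pred, target):
    """Regression loss: zero for a perfect match, larger for distant boxes."""
    return 1.0 - diou(pred, target)

same = diou((0, 0, 2, 2), (0, 0, 2, 2))
apart = diou((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike plain IoU, the measure stays informative for non-overlapping boxes: the centre-distance term still distinguishes near misses from distant ones, which is what makes it usable both as a loss and as a suppression criterion.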

Free

Veiling glare removal: synthetic dataset generation, metrics and neural network architecture

Shoshin Alexey Valeryevich, Shvets Evgeny Alexandrovich

Research article

In photography, the presence of a bright light source often reduces the quality and readability of the resulting image. Light rays reflect and bounce off camera elements, the sensor or the diaphragm, causing unwanted artifacts. These artifacts are generally known as "lens flare" and may affect the photo in different ways: they reduce the contrast of the image (veiling glare), add circular or near-circular effects (ghosting flare), appear as bright rays spreading from the light source (starburst pattern), or cause aberrations. All these effects are generally undesirable, as they reduce the legibility and aesthetics of the image. In this paper, we address the problem of removing or reducing the effect of veiling glare on the image. There are no large-scale datasets available for this problem and no established metrics, so we start by (i) proposing a simple and fast algorithm for generating the synthetic veiling glare images necessary for training and (ii) studying metrics used in related image enhancement tasks (dehazing and underwater image enhancement). We select three such no-reference metrics (UCIQE, UIQM and CCF) and show that their improvement indicates better veil removal. Finally, we experiment with neural network architectures and propose a two-branch architecture and a training procedure that uses the structural similarity measure.

Free

Video images compression and restoration methods based on optimal sampling

Drynkin Vladimir Nikolaevich, Nabokov Sergey Alexeyevich, Tsareva Tatiana Igorevna

Research article

The study proposes video image compression and restoration methods based on multidimensional sampling theory. They provide four-fold video compression and subsequent real-time restoration with loss levels below the visually perceptible threshold. The proposed methods can be used on their own or together with any other video compression technique, in which case they provide an additional four-fold compression.

Free

Vortex beams in turbulent media: review

Soifer Victor Alexandrovich, Korotkova Olga, Khonina Svetlana Nikolaevna, Shchepakina Elena Anatolevna

Research article

The review covers publications on the propagation of laser beams through turbulent media described by the Kolmogorov theory and its generalizations, with a view to signal transmission in optical communication and detection systems. In this case, the turbulent medium is interpreted as an optical channel with random parameters. The optical signals considered include partially coherent beams, non-uniformly polarized vector beams, and specifically configured spatial laser beams. Special attention is given to vortex laser beams, which are shown to have a number of remarkable properties that give them an advantage over conventional Gaussian beams.

Free

Vortex-free laser beam with an orbital angular momentum

Kotlyar Victor Victorovich, Kovalev Alexey Andreevich

Research article

We show that if one cylindrical lens is placed in the Gaussian beam waist and another cylindrical lens is placed at some distance from the first one and rotated by some angle, then the laser beam after the second lens has an orbital angular momentum (OAM). An explicit analytical expression for the OAM of such a beam is obtained. Depending on the inter-lens distance, the OAM can be positive, negative, or zero. Such a laser beam has no isolated intensity nulls with a singular phase, so it is not an optical vortex, yet it has an OAM. By choosing the waist radius of the source Gaussian beam, the focal lengths of the lenses and the distance between them, it is possible to generate a vortex-free laser beam equivalent to an optical vortex with a topological charge of several hundred.

Free

Vulnerability analysis on Hyderabad city, India

Boori Mukesh Singh, Choudhary Komal, Kupriyanov Alexander Victorovich

Research article

City vulnerability is an assessment of priorities for implementation in a city. It is therefore imperative to determine vulnerable regions in the city in order to identify priority areas that may require immediate intervention. Several methods used for national, international and local vulnerability assessment are based on remote sensing and GIS technology. This paper aims to determine the vulnerability of Hyderabad city using a geospatial vulnerability index for the sustainable development of the city. We use an urbanization and vulnerability concept for the development of city policy measures. We assessed the city's vulnerability using a conceptual diagram composed of exposure, sensitivity and adaptive capacity. For exposure, we considered the elevation (contour), watershed, waterway, road, railway and airport thematic layers. For sensitivity, the built-up area, industry, managed systems such as farmland, and the land use/cover map from GIS data were used. To examine adaptive capacity, we considered the natural vegetation layer, economic points and infrastructure. The results show that the central and northern parts of the city are highly and extremely vulnerable owing to industry and intense socio-economic activity, compared with the southern part of the city. We divided the whole city into five vulnerability classes by share of city area: resilient 2.24 %, at risk 13.20 %, vulnerable 46.15 %, highly vulnerable 7.26 % and extremely vulnerable 31.15 %. The vegetation area (50.51 %) has the largest vulnerable area, and the vulnerable class covers the largest share (46.15 %) of the city. This information is indispensable and can be used to address management issues such as resource prioritization and optimization.

Free

Weighted combination of per-frame recognition results for text recognition in a video stream

O. Petrova, K. Bulatov, V.V. Arlazarov, V.L. Arlazarov

Article

The range of uses of automated document recognition has widened, and as a result, recognition techniques that do not require specialized equipment have become more relevant. Among such techniques, document recognition using mobile devices is of particular interest. However, it is not always possible to ensure controlled capture conditions and, consequently, high-quality input images. Unlike specialized scanners, mobile cameras allow a video stream to be used as input, yielding several images of the recognized object captured with various characteristics. This raises the problem of combining the information from multiple input frames. In this paper, we propose a weighting model for combining the per-frame recognition results, two approaches to the weighted combination of the text recognition results, and two weighting criteria. The effectiveness of the proposed approaches is tested on datasets of identity documents captured with a mobile device camera in different conditions, including perspective distortion of the document image and low lighting. The experimental results show that the weighted combination can improve the quality of text recognition in the video stream, and that the per-character weighting method with input image focus estimation as the base criterion achieves the best results on the datasets analyzed.
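To make the idea of per-character weighting concrete, the sketch below averages per-frame character probability distributions using one weight per frame, for example a focus score. It assumes the per-frame results are already aligned to the same length; the data layout, the function name `combine_frames`, and the weights are illustrative assumptions, not the model from the paper.

```python
def combine_frames(frame_results, weights):
    """Combine per-frame recognition results by weighted averaging.

    frame_results: list of frames; each frame is a list with one dict per
                   character position, mapping candidate char -> probability.
    weights:       one non-negative weight per frame (e.g. a focus estimate).
    Returns the most likely string after combination.
    """
    if not frame_results or len(frame_results) != len(weights):
        raise ValueError("need one weight per frame")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = []
    for pos in range(len(frame_results[0])):
        scores = {}
        for frame, w in zip(frame_results, weights):
            for ch, p in frame[pos].items():
                scores[ch] = scores.get(ch, 0.0) + w * p / total
        combined.append(max(scores, key=scores.get))
    return "".join(combined)

frames = [
    [{"A": 0.6, "4": 0.4}, {"B": 0.9, "8": 0.1}],  # sharp frame
    [{"A": 0.3, "4": 0.7}, {"B": 0.4, "8": 0.6}],  # blurry frame
]
weighted = combine_frames(frames, weights=[0.9, 0.1])
unweighted = combine_frames(frames, weights=[1.0, 1.0])
```

In this toy case the sharp frame dominates the weighted result, whereas equal weights let the blurry frame flip the first character.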

Free

Wescore: quality assessment method of multichannel image visualization with regard to angular resolution

Sidorchuk Dmitry Sergeevich

Research article

This work considers the problem of assessing the quality of multichannel image visualization methods. One approach to such an assessment, the Escore quality measure, is studied. This measure, initially proposed for evaluating decolorization methods, can be generalized to the assessment of hyperspectral image visualization methods. It is shown that Escore does not account for the loss of local contrast at the supra-pixel scale. Human sensitivity to this loss depends on the observation conditions, so we propose a modified measure, wEscore, whose parameters allow the local contrast scale to be adjusted to the angular resolution of the images. We also describe the adjustment of wEscore parameters for the evaluation of known decolorization algorithms applied to images from the COLOR250 and Cadik datasets under given observation conditions. When the results of these algorithms were ranked and the ranking was compared with one based on human perception, wEscore turned out to be more accurate than Escore.

Free

X-ray tomography: the way from layer-by-layer radiography to computed tomography

Arlazarov Vladimir Lvovich, Nikolaev Dmitry Petrovich, Arlazarov Vladimir Viktorovich, Chukalina Marina Valerievna

Research article

The methods of X-ray computed tomography allow us to study the internal morphological structure of objects in a non-destructive way. The evolution of these methods is similar in many respects to the evolution of photography, where complex optics were replaced by mobile phone cameras and the computers built into the phone took over the task of generating high-quality images. X-ray tomography originated as a hardware method for non-invasive imaging of a certain internal cross-section of the human body. Today, thanks to advanced reconstruction algorithms, the method makes it possible to reconstruct a digital 3D image of an object with submicron resolution. In this article, we analyze the tasks that the software part of a tomographic complex has to solve in addition to managing the data collection process. The issues that are still considered open are also discussed. The relationship between the spatial resolution of the method, its sensitivity and the radiation load is reviewed. An innovative approach to the organization of tomographic imaging, called "reconstruction with monitoring", is described. This approach makes it possible to reduce the radiation load on the object by a factor of at least 2-3. We show that as X-ray computed tomography moves towards higher spatial resolution and lower radiation load, the software part of the method becomes increasingly important.

Free

Second-order aberrations of a gradient-index medium: calculation methods

Ilyinsky R.E., Rovenskaya T.S.

Article

Free

Aberrations of synthesized diffractive lenses caused by fabrication errors

Greisukh G.I., Stepanov S.A.

Research article

The paper presents the results of a study of how errors in synthesizing the annular structure of diffractive lenses affect their aberrations for an on-axis point. The types of aberration distortion arising from the ellipticity of the zones of the diffractive structure and from systematic errors in their radii are determined. Technological tolerances on the lens structure parameters are derived on the basis of the Maréchal criterion.
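For reference, the Maréchal criterion is commonly stated as an RMS wavefront error of at most λ/14, which corresponds to a Strehl ratio of about 0.8. The sketch below checks that correspondence with the Maréchal approximation S ≈ 1 − (2πσ/λ)². The function names are illustrative, and the tolerance calculation of the paper itself is not reproduced here.

```python
import math

def strehl_marechal(rms_waves):
    """Maréchal approximation of the Strehl ratio.

    rms_waves is the RMS wavefront error in units of the wavelength.
    The approximation is only meaningful for small errors.
    """
    return 1.0 - (2.0 * math.pi * rms_waves) ** 2

def meets_marechal(rms_waves, limit=1.0 / 14.0):
    """True if the RMS wavefront error is within the lambda/14 limit."""
    return rms_waves <= limit

s_limit = strehl_marechal(1.0 / 14.0)   # Strehl ratio at the criterion limit
ok = meets_marechal(0.05)               # lambda/20 RMS passes
too_large = meets_marechal(0.10)        # lambda/10 RMS fails
```

A tolerance on a structural parameter then follows by finding the largest parameter error whose induced RMS wavefront error still stays within the limit.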

Free

Third-order aberrations of gradient-index optical systems with double symmetry

Ilyinsky Roman Evgenievich

Research article

Explicit expressions are obtained for the third-order geometric aberration coefficients of gradient-index optical systems with double symmetry.

Free

An abstract model of an artificial immune network based on a committee of classifiers and its use for recognizing keystroke dynamics patterns

Sulavko Alexey Evgenievich

Research article

An abstract model of an artificial immune network based on a committee of classifiers is proposed, together with two algorithms for training it (supervised and reinforcement learning), for classification problems characterized by small and poorly representative training samples. The model and the algorithms are evaluated on the task of keystroke dynamics authentication using three databases of biometric samples. The developed artificial immune network exhibits emergence, memory, double plasticity and learning stability. Experiments showed that the artificial immune network yields a lower or comparable error rate compared with some neural network architectures while using a much smaller training sample.

Free
