Hyperspectral imagers (HySI) are used on spacecraft and aircraft to capture fine characteristics of a target by imaging it in a large number of narrow, contiguous bands. HySI data are represented as a data cube, with two dimensions representing the spatial distribution and the third providing band information. The data are huge in volume and challenging to handle, so onboard compression becomes a necessity for optimal use of onboard storage and downlink bandwidth. The CCSDS recommended standard 123.0-B-1 [2] defines an onboard compression scheme for hyperspectral data. The scheme is based on the Fast Lossless algorithm and consists of two main functional blocks, the Predictor and the Encoder. The Predictor can be implemented in two modes, 'Full Neighborhood Oriented' and 'Reduced Column Oriented'. The Encoder also defines two options, 'sample-adaptive' and 'block-adaptive'. We have developed a MATLAB-based model that implements the compression scheme with all options defined by the standard. A decompression model has also been developed to recover the original data and to support end-to-end verification. Four sets of HySI data (AVIRIS, Hyperion, Chandrayaan-1 and FTIS) were applied as input to evaluate the model. The compression ratio achieved is between 2 and 3, and lossless compression is confirmed for each data set because the Mean Square Error (MSE) is zero for all hyperspectral images. The visual reconstruction of the decompressed data also matches the originals. In this paper we discuss the implementation methodology and the results.
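The end-to-end check described above (compression ratio plus a zero MSE between original and decompressed data) can be sketched as follows. This is only an illustration under stated assumptions: `zlib` is used as a stand-in lossless codec and is not the CCSDS 123.0-B-1 predictor and encoder, the sample data are synthetic, and the helper names are hypothetical.

```python
import zlib


def mean_square_error(original, restored):
    """Mean of squared sample differences; 0.0 means a lossless round trip."""
    if len(original) != len(restored):
        raise ValueError("sequences must have the same length")
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


def compression_ratio(original_size, compressed_size):
    """Ratio of uncompressed size to compressed size (larger is better)."""
    return original_size / compressed_size


# Synthetic 8-bit samples standing in for a hyperspectral cube (assumption).
samples = bytes((x * 3) % 256 for x in range(64)) * 32

# zlib is a stand-in lossless codec here, NOT the CCSDS 123.0-B-1 scheme.
compressed = zlib.compress(samples)
restored = zlib.decompress(compressed)

ratio = compression_ratio(len(samples), len(compressed))
error = mean_square_error(samples, restored)
print(f"compression ratio: {ratio:.2f}, MSE: {error}")
```

An MSE of exactly zero is the criterion the abstract uses to confirm losslessness. The ratio printed here reflects the synthetic data and says nothing about the ratios reported for real HySI data.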

Free

Solid Launcher Dynamical Analysis and Autopilot Design

Ping Sun

Research article

The dynamics of a small solid launch vehicle are investigated. The launcher consists of a liquid upper stage and three fundamental solid rocket boosters arranged in series. During the ascent phase, the flight control system uses lateral jets and grid fins to stabilize the attitude of the launcher. The launcher is a slender, aerodynamically unstable vehicle with sloshing tanks. A complete set of six-degrees-of-freedom dynamic models of the launcher is developed, incorporating its rigid body, aerodynamics, gravity, sloshing, mass change, actuators, and elastic body. The structural modes and the bifurcation locus are calculated on the basis of these models. The complete set of dynamic models is used in flight control system design. A methodology for employing numerical optimization to develop the attitude filters is presented. The design objectives include attitude tracking accuracy and robust stability with respect to rigid body dynamics, propellant slosh, and flex. A control approach for the flight control system is then presented that uses both the State Dependent Riccati Equation (SDRE) method and the Fast Output Sampling (FOS) technique. The dynamics and kinematics of the attitude stabilization problem are typically nonlinear. The SDRE technique has been applied successfully to this kind of highly nonlinear control problem, but in practice the system states it requires are sometimes difficult to obtain. The FOS method, which uses only output samples, is combined with SDRE to accommodate incomplete system state information. The control approach is thus more practical and easier to implement. The resulting autopilot can provide stable control of the vehicle.

Free

Sound Source Localization Ability in Hearing Aids: A Survey

Jyoti M. Katagi, Pandurangarao N. Kulkarni

Research article

The ability to locate a sound source is a prime function of the human auditory system. A sound has different spectral, temporal and intensity characteristics depending on where its source is located. To identify the location, listeners analyze these characteristics as they arrive from various directions in the horizontal and vertical planes. For individuals with sensorineural hearing loss, understanding speech against a noisy background is very difficult. Binaural hearing is used to distinguish sound sources reliably and to increase speech intelligibility in noisy conditions. Diffraction by the pinnae, head, shoulders and torso changes the pressure waveform as sound travels from the source to the listener's eardrums. Two transfer functions that specify the relation between the sound pressures at the listener's right and left eardrums capture these propagation effects. These spectral changes are recorded by Head Related Transfer Functions (HRTFs). Different hearing aid algorithms need to be studied to measure their effectiveness in improving speech perception, through a series of subjective evaluations involving subjects with different types of sensorineural hearing loss under different listening conditions. We investigated the various proposed approaches, weighed their benefits and drawbacks and, most importantly, examined whether and how the perceptual validity of the resulting HRTFs is evaluated. This paper surveys current research on sound source localization in hearing aids, including the use of HRTFs for generating spatial sounds in the elevation and azimuth planes, the effect of monaural and binaural hearing aid algorithms on source localization under different listening conditions for subjects with different hearing losses, and the effectiveness of localization with different types of hearing aid.

Free

Sparse representation and face recognition

M. Khorasani, S. Ghofrani, M. Hazari

Research article

Nowadays, applications of sparse representation are spreading widely in many fields, including face recognition. For this use, defining a dictionary and choosing a proper recovery algorithm play an important role in the accuracy of the method. In this paper, two types of dictionary are constructed: one based on input face images (the method named SRC) and one based on features extracted from the input (the method named MKD-SRC). SRC fails for partial face recognition, whereas MKD-SRC overcomes the problem. Three extensions of MKD-SRC are introduced and their performance is compared. To recommend a proper recovery algorithm, we focus on three greedy algorithms, MP, OMP and CoSaMP, and a further algorithm called Homotopy. Three standard data sets, AR, Extended Yale-B and Essex University, are used to assess which recovery algorithm responds efficiently for the proposed methods. The preferred recovery algorithm was chosen based on the accuracy achieved and the run time.

Free

Spatial-temporal shape and motion features for dynamic hand gesture recognition in depth video

Vo Hoai Viet, Nguyen Thanh Thien Phuc, Pham Minh Hoang, Liu Kim Nghia

Research article

Human-Computer Interaction (HCI) is one of the most interesting and challenging research topics in the computer vision community. Among the different HCI methods, hand gestures are the natural way of interacting with a computer and are the focus of many researchers. They allow people to use hand movements to interact with a machine easily and conveniently. With the advent of depth sensors, many new techniques have been developed and have achieved good results. In this work, we propose a set of features extracted from depth maps for dynamic hand gesture recognition. We extract HOG2 to represent the shape and appearance of the hand in a gesture. Moreover, to capture the movement of the hands, we propose a new feature named HOF2, which is extracted using an optical flow algorithm. These spatial-temporal descriptors are easy to comprehend and implement, yet perform very well in multi-class classification. They also have a low computational cost, so they are suitable for real-time recognition systems. Furthermore, we apply Robust PCA to reduce the feature dimension and build robust, compact gesture descriptors. The results are evaluated by a cross-validation scheme using an SVM classifier, which shows good outcomes on the challenging MSR Hand Gestures Dataset and VIVA Challenge Dataset, with accuracies of 95.51% and 55.95%, respectively.

Free

Spatiotemporal Data Fusion using Dictionary Learning and Temporal Edge Primitives

J. Malleswara Rao, C. V. Rao, A. Senthil Kumar, B. Gopala Krishna, V. K. Dadhwal

Research article

Technological limitations prevent space-borne global sensors from acquiring images at both high spatial and high temporal resolution. In this paper, we propose a novel technique for creating such images in a ground-based data processing system. Resourcesat-2 is one of the global missions of the Indian Space Research Organization (ISRO); it carries Linear Imaging and Self-Scanning Sensors (LISS III and LISS IV) and an Advanced Wide-Field Sensor (AWiFS). The spatial resolution of LISS III is 23.5 m and that of AWiFS is 56 m. The temporal resolution of LISS III is 24 days and that of AWiFS is 5 days. The objective of the paper is to create a synthetic LISS III image at 23.5 m spatial and 5-day temporal resolution. A synthetic LISS III image at time tk is created from an AWiFS image at time tk and a single AWiFS–LISS III image pair at time t0, acquired before or after the prediction time tk (t0≠tk). The proposed method involves three phases. The first is the super-resolution phase, in which two transition images are obtained for times t0 and tk by improving the AWiFS spatial resolution. The second is the high-pass modulation phase, in which the high-frequency details obtained from the difference between the LISS III image and the transition image at time t0 are proportionally injected into the transition image at time tk. When multi-temporal images of different spatial resolutions are composed, spurious spatial discontinuities are inevitable. In the third phase, these spurious discontinuities are identified and smoothed with the spatial-profile-averaging method. The proposed method achieves better prediction accuracy than state-of-the-art techniques.

Free

Speaker Emotion Recognition based on Speech Features and Classification Techniques

J. Sirisha Devi, Srinivas Yarramalle, Siva Prasad Nandyala

Research article

Speech processing has developed into one of the vital application areas of Digital Signal Processing. Speaker recognition is the process of automatically identifying who is speaking on the basis of individual characteristics contained in speech waves. This technique makes it possible to use the speaker's voice to verify their identity and to control access to services such as voice dialing, information services, voice mail, and security control for confidential information. A review of speaker recognition and emotion recognition is performed, based on the past ten years of research. So far, work has been done on text-independent and text-dependent speaker recognition. Many prosodic features of the speech signal reflect the emotion of a speaker. A detailed study of these issues is presented in this paper.

Free

Speaker Identification using SVM during Oriya Speech Recognition

Sanghamitra Mohanty, Basanta Kumar Swain

Research article

In this research paper, we have developed a system that identifies users by their voices and helps them retrieve information using voice queries. The system takes into account speaker identification as well as speech recognition, i.e. two pattern recognition techniques in the speech domain. Combining the speaker identification task with the speech recognition task provides a multitude of facilities compared with using either in isolation. Speaker identification is achieved using an SVM, whereas speech recognition is based on an HMM. We have used two different types of corpora to train the system. Gammatone cepstral coefficients and mel frequency cepstral coefficients are extracted for speaker identification and speech recognition, respectively. The accuracy of the system is measured from two perspectives: the accuracy of speaker identity and the accuracy of the speech recognition task. The accuracy of speaker identification is enhanced by applying speech recognition at the initial stage of speaker identification.

Free

Speaker Recognition in Mismatch Conditions: A Feature Level Approach

Sharada V Chougule, Mahesh S. Chavan

Research article

Mismatch in speech data is one of the major factors limiting the use of speaker recognition technology in real-world applications. Extracting speaker-specific features is a crucial issue in the presence of noise and distortions. The performance of a speaker recognition system depends on the characteristics of the extracted features. The devices used to acquire the speech, as well as the surrounding conditions in which the speech is collected, affect the extracted features and hence degrade the decision rates. In view of this, a feature-level approach is used to analyze the effect of sensor and environment mismatch on speaker recognition performance. The goal is to investigate the robustness of segmental features under speech data mismatch and degradation. A set of features derived from filter bank energies, namely Mel Frequency Cepstral Coefficients (MFCCs), Linear Frequency Cepstral Coefficients (LFCCs), Log Filter Bank Energies (LOGFBs) and Spectral Subband Centroids (SSCs), is used to evaluate robustness in mismatch conditions. A novel feature extraction technique named Normalized Dynamic Spectral Features (NDSF) is proposed to compensate for sensor and environment mismatch. A significant improvement in recognition results is obtained with the proposed feature extraction method.

Free

Speckle Reduction with Edge Preservation in B-Scan Breast Ultrasound Images

Madan Lal, Lakhwinder Kaur, Savita Gupta

Research article

Speckle is a multiplicative noise that degrades the quality of ultrasound images and makes visual inspection difficult. In addition, it limits the professional application of image processing techniques such as automatic lesion segmentation. Speckle reduction is therefore an essential step before further processing of ultrasound images. Numerous techniques have been developed to preserve edges while reducing speckle noise, but these filters avoid smoothing near the edges in order to preserve fine details. The objective of this work is to suggest a new technique that enhances B-scan breast ultrasound images by increasing the speckle reduction capability of an edge-sensitive filter. In the proposed technique, a local-statistics-based filter is applied in the non-homogeneous regions to the output of an edge-preserving filter, and an edge map is used to retain the original edges. Experiments are conducted using a synthetic test image and real-time ultrasound images. The effectiveness of the proposed technique is evaluated qualitatively by experts and quantitatively in terms of various quality metrics. The results indicate that the proposed method can reduce more noise while preserving important diagnostic edge information in breast ultrasound images.

Free

Spectral Subtractive-Type Algorithms for Enhancement of Noisy Speech: An Integrative Review

Navneet Upadhyay, Abhijit Karmakar

Research article

The spectral subtraction method is a classical approach to the enhancement of speech degraded by additive background noise. The basic principle of the method is to estimate the short-time spectral magnitude of speech by subtracting the estimated noise spectrum from the noisy speech spectrum. The same result can be achieved by multiplying the noisy speech spectrum by a gain function and then combining it with the phase of the noisy speech. Besides reducing the background noise, the method introduces an annoying, perceptible tonal artifact in the enhanced speech, known as remnant musical noise, which affects human listening. Several variations and implementations have been adopted over the past decades to address the limitations of the spectral subtraction method. These variations constitute a family of subtractive-type algorithms that operate in the frequency domain. The objective of this paper is to provide an extensive overview of spectral subtractive-type algorithms for the enhancement of noisy speech. The paper concludes by indicating a future direction for speech enhancement research from the spectral subtraction perspective.
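The basic principle described above can be illustrated with a minimal sketch that processes one frame's spectrum. The function names, the `floor` parameter, and the example values are assumptions made for illustration, not taken from the paper. The sketch shows plain magnitude subtraction with flooring and the equivalent gain-function form.

```python
import cmath


def spectral_subtract(noisy_spectrum, noise_magnitude, floor=0.0):
    """Subtract an estimated noise magnitude from each bin of one frame.

    The enhanced magnitude is recombined with the phase of the noisy bin.
    `floor` (hypothetical parameter) keeps at least that fraction of the
    noisy magnitude, so the result never goes negative.
    """
    enhanced = []
    for bin_value, noise_mag in zip(noisy_spectrum, noise_magnitude):
        magnitude = abs(bin_value)
        phase = cmath.phase(bin_value)
        clean_mag = max(magnitude - noise_mag, floor * magnitude)
        enhanced.append(cmath.rect(clean_mag, phase))
    return enhanced


def gain_function(noisy_spectrum, noise_magnitude, floor=0.0):
    """Equivalent per-bin gain: enhanced bin = gain * noisy bin."""
    gains = []
    for bin_value, noise_mag in zip(noisy_spectrum, noise_magnitude):
        magnitude = abs(bin_value)
        if magnitude == 0.0:
            gains.append(0.0)  # nothing to scale in an empty bin
        else:
            gains.append(max(1.0 - noise_mag / magnitude, floor))
    return gains


# Illustrative values only: three spectral bins and a noise estimate.
frame = [3 + 4j, 1 + 0j, -2j]
noise = [1.0, 2.0, 0.5]
print(spectral_subtract(frame, noise))
print(gain_function(frame, noise))
```

Bins where the noise estimate exceeds the noisy magnitude are clamped to the floor. Isolated bins that survive this clamping from frame to frame are one common explanation for the musical noise the abstract mentions.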

Free

Spectral and Time Based Assessment of Meditative Heart Rate Signals

Ateke Goshvarpour, Mousa Shamsi, Atefeh Goshvarpour

Research article

The objective of this article was to study the effects of Chi meditation on heart rate variability (HRV). For this purpose, statistical and spectral measures of HRV derived from the RR intervals were analyzed. The article is also concerned with finding adequate Auto-Regressive Moving Average (ARMA) model orders for spectral analysis of the time series formed from the RR intervals. Akaike's Final Prediction Error (FPE) was therefore taken as the basis for choosing the model order. The results showed that, overall, the model order chosen most frequently by FPE was p = 8 before meditation and p = 5 during meditation. The results suggest that the variety of orders in HRV models across different psychological states could be due to differences in the intrinsic properties of the system.
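Order selection by FPE can be sketched as follows, using the commonly stated form FPE = V(1 + d/N)/(1 - d/N), where V is the loss (residual variance) of the fitted model, d the number of estimated parameters, and N the number of samples. The function names and the loss values below are hypothetical and chosen only to illustrate the selection step; the model fitting itself is not shown.

```python
def final_prediction_error(loss, n_params, n_samples):
    """Akaike's FPE for a model with `n_params` estimated parameters."""
    if n_params >= n_samples:
        raise ValueError("need more samples than estimated parameters")
    ratio = n_params / n_samples
    return loss * (1 + ratio) / (1 - ratio)


def select_order(loss_by_order, n_samples):
    """Return the order whose fitted model has the smallest FPE.

    `loss_by_order` maps a candidate order (taken here as the parameter
    count, an assumption) to the loss obtained by fitting that order.
    """
    return min(
        loss_by_order,
        key=lambda order: final_prediction_error(
            loss_by_order[order], order, n_samples
        ),
    )


# Hypothetical losses for four candidate orders fitted to 100 RR intervals.
losses = {2: 1.00, 5: 0.80, 8: 0.79, 12: 0.785}
print(select_order(losses, n_samples=100))
```

The loss alone always decreases as the order grows, so the (1 + d/N)/(1 - d/N) factor is what penalizes over-parameterized models.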

Free

Speech Emotion Recognition based on SVM as Both Feature Selector and Classifier

Amirreza Shirani, Ahmad Reza Naghsh Nilchi

Research article

The aim of this paper is to use the Support Vector Machine (SVM) as both a feature selection and a classification technique for identifying human emotional states from audio signals. One of the major bottlenecks of common speech emotion recognition techniques is the use of a huge number of features per utterance, which can significantly slow down the learning process and may cause the problem known as "the curse of dimensionality". To ease this challenge, this paper aims to achieve a high-accuracy system with a minimal set of features. The proposed model uses two methods for feature dimension reduction, namely "SVM feature selection" and the common "Correlation-based Feature Subset Selection (CFS)". In addition, two different classifiers, a Support Vector Machine and a Neural Network, are adopted separately to identify the six emotional states of anger, disgust, fear, happiness, sadness and neutral. The method has been verified using Persian (Persian ESD) and German (EMO-DB) emotional speech databases, and it yields high recognition rates on both. The results show that the SVM feature selection method provides better emotional speech recognition performance than CFS and the baseline feature set. Moreover, the new system achieves a recognition rate of 99.44% on the Persian ESD and 87.21% on the Berlin Emotion Database for speaker-dependent classification. A promising result of 76.12% is also obtained for the speaker-independent case, which is among the best-known accuracies reported on the mentioned database relative to its small number of features.

Free

Speech Enhancement based on Wavelet Thresholding the Multitaper Spectrum Combined with Noise Estimation Algorithm

P.Sunitha, K.Satya Prasad

Research article

This paper presents a method to reduce the musical noise encountered with most frequency-domain speech enhancement algorithms. Musical noise is a phenomenon caused by random spectral peaks in each speech frame, which arise from the large variance and inaccurate estimation of the spectra of the noisy speech and noise signals. To obtain a low-variance spectral estimate, this paper uses a method based on wavelet thresholding of the multitaper spectrum, combined with a noise estimation algorithm that estimates the noise spectrum from the spectral average of past and present frames according to a predetermined weighting factor, in order to reduce the musical noise. To evaluate the performance of this method, sine multitapers were used and the spectral coefficients were thresholded using wavelet thresholding to obtain a low-variance spectrum. Both scale-dependent and scale-independent thresholding, with soft and hard thresholding using the Daubechies wavelet, were used to evaluate the proposed method in terms of objective quality measures under eight different types of real-world noise at three input SNR levels. To predict speech quality in the presence of noise, objective quality measures such as Segmental SNR, Weighted Spectral Slope Distance, Log Likelihood Ratio, Perceptual Evaluation of Speech Quality (PESQ) and composite measures are compared against those of wavelet de-noising techniques, Spectral Subtraction and Multiband Spectral Subtraction; the proposed method provides consistent performance across all eight noise types in most of the cases considered.

Free

Speech Enhancement through Implementation of Adaptive Noise Canceller Using FHEDS Adaptive Algorithm

Ch.D.Umasankar, M. Satya Sai Ram

Research article

Speech analysis involves modelling and estimating the different speech characteristics that are important for the criteria set by real-time applications. Speech enhancement is one such analysis task. This paper compares the performance of our proposed Fast Hybrid Euclidean Direction Search (FHEDS) algorithm with other adaptive algorithms such as the NHP and FEDS algorithms. These algorithms have been tested for adaptive noise cancellation of speech signals corrupted by different noises: Babble, Factory, Destroyer Engine, Car, Fire Engine and Train noise. The design criteria and the current limits of the database are taken into account in each phase of the design through a noise model, which improves performance. The comparison factors have been tabulated for each set of noisy and clean speech data with the proposed filter operation. The proposed model effectively reduces the noise and achieves better speech enhancement. It also achieves a higher Signal-to-Noise Ratio (SNR) than traditional models.

Free

Speech Feature Extraction for Gender Recognition

Anjali Pahwa, Gaurav Aggarwal

Research article

Speech recognition technology can be embedded in various real-time applications to increase human-computer interaction. From robotics to health care and aerospace, and from interactive voice response systems to mobile telephony and telematics, speech recognition technology has enhanced human-machine interaction. Gender recognition is an important component of applications that embed speech recognition, as it reduces the computational complexity of further processing in these applications. The paper involves the extraction of one of the most dominant and most researched speech features, the Mel coefficients, together with their first- and second-order derivatives. We extracted 13 values for each of these from a data set of 46 speech samples containing the Hindi vowels (आ, इ, ई, उ, ऊ, ऋ, ए, ऎ, ऒ, ऑ) and trained a combined model of SVM and neural network classification, using stacking, to determine the speaker's gender. The results showed an accuracy of 93.48% after taking the first Mel coefficient into consideration. The purpose of this study was to extract the correct features and to compare performance based on the first Mel coefficient.

Free

Spliced image classification and tampered region localization using local directional pattern

Surbhi Sharma, Umesh Ghanekar

Research article

In this paper the authors propose a spliced image detection algorithm based on the Local Directional Pattern (LDP). Many splicing detection techniques either classify spliced images against authentic ones or localize the spliced region, whereas the proposed algorithm can both classify the image and localize the spliced region. First, the original image (RGB color space) is converted to the YCbCr color space. The histogram of the LDP of the chrominance component of the suspect image is used for classification. For localization of the spliced region, the chrominance component of the input image is divided into overlapping blocks, and the LDP of each block is calculated. The standard deviation of each block is used as a clue to visualize the spliced region. The experimental results are reported in terms of accuracy, specificity (true negative rate), sensitivity (true positive rate) and error rate, and they demonstrate the effectiveness of the proposed algorithm. The accuracy of the proposed algorithm is 98.55%. The algorithm is also robust against post-splicing image processing operations such as Gaussian blur, additive white Gaussian noise, JPEG compression and scaling, whereas previous techniques have not considered these experimental conditions.
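The four evaluation measures named above follow directly from the counts of a binary confusion matrix. The sketch below uses their standard definitions, treating "spliced" as the positive class (an assumption). The function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: spliced images classified as spliced/authentic.
    tn/fp: authentic images classified as authentic/spliced.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "error_rate": (fp + fn) / total,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and error rate always sum to one, so reporting both mainly serves readability.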

Free

Stabilogram mPCA Decomposition and Effects Analysis of Several Entries on The Postural Stability

Dhouha MAATAR, Zied LACHIRI, Régis FOURNIER, Amine NAIT-ALI

Research article

This paper presents an analysis of the stabilogram using the modified Principal Component Analysis (mPCA) decomposition, which is employed to highlight the effects of different factors on human postural stability. The aim of this study is to analyze stabilogram center-of-pressure time series using the mPCA decomposition method. The mPCA is a decomposition method applied to a complex signal. It decomposes the stabilogram, considered as an additive model, into three components: trend, rambling and trembling. Studying the trace of the analytic trembling (respectively, rambling) in the complex plane highlights a unique rotation center. The phase is thus defined and two parameters are extracted: the area of the circle in which 95% of the trace's data points are located, and the angular frequency. In this study, 25 healthy volunteers (average age 31 ± 11 years) were required to stand upright on an electromagnetic platform with eyes either closed or open and with feet either apart or together. The experimental results show the efficiency of the area parameter in identifying the effect of visual, proprioceptive and directional inputs on postural stability. These results discriminate between the control and young groups and indicate a less well-controlled posture for control subjects (34.5 ± 7.5 y) relative to young subjects (22.5 ± 2.5 y). The results also show that female subjects are more stable than males, that heavier subjects are more stable than thin ones and that tall subjects are more stable than short ones.

Free

Statistical Image Classification for Image Steganographic Techniques

Seyyed Amin Seyyedi, Nick Ivanov

Research article

Steganography is the practice of hiding information. The free choice of cover image is a particular advantage of steganography over other information hiding techniques. The performance of a steganographic system can be improved by selecting a suitable cover image. This article presents a two-level unsupervised image classification algorithm based on the statistical characteristics of the image, which helps the sender make a reasonable selection of cover image to enhance the performance of the steganographic method for a specific purpose. Experiments demonstrate the effect of classification in satisfying steganography requirements.

Free

Statistical Texture Features Based Automatic Detection and Classification of Diabetic Retinopathy

Md. Rahat Khan, A. S. M. Shafi

Research article

Diabetes is a globally prevalent disease that can cause microvascular complications such as Diabetic Retinopathy (DR) in the human eye, which can become a significant cause of blindness. The present study aimed to develop an automatic detection and classification system for diagnosing diabetic retinopathy from digital fundus images. An automated diabetic retinopathy detection and classification system based on retinal images is proposed to reduce the workload of ophthalmologists. The work comprises three main stages. The proposed method first extracts the blood vessels from the color fundus image. Second, it detects whether the input image is normal or shows diabetic retinopathy, and it then presents an automatic diabetic retinopathy classification technique based on statistical texture features. It uses the Gray Level Co-occurrence Matrix (GLCM) and the Gray Level Run Length Matrix (GLRLM) to extract second-order and higher-order statistical texture features, which are fed into three well-known classifiers, namely K-Nearest Neighbor (KNN), Random Forest (RF) and Support Vector Machine (SVM). The evaluation on a dataset of 644 retinal images indicates that the proposed method based on the random forest classifier is effective, with a weighted sensitivity, precision, F1-score and accuracy of 95.53%, 96.45%, 95.38% and 95.19%, respectively, for the detection and classification of diabetic retinopathy. These outcomes suggest that the method could decrease the cost of screening and diagnosis while achieving higher than suggested performance, and that the system could be used in clinical assessments requiring better evaluation.

Free
