Search results



Number of results: 17

Abstract

In this paper, a new feature-extraction method is proposed to achieve robustness in speech recognition systems. The method combines the benefits of phase autocorrelation (PAC) with the bark wavelet transform. PAC uses the angle between a frame and its shifted copy, rather than the traditional autocorrelation measure, to quantify correlation, whereas the bark wavelet transform is a special type of wavelet transform designed specifically for speech signals. The features extracted by this combined method are called phase autocorrelation bark wavelet transform (PACWT) features. The speech recognition performance of the PACWT features is evaluated and compared to the conventional feature extraction method, mel-frequency cepstral coefficients (MFCC), using the TI-Digits database under different noise types and noise levels. The database has been divided into male and female data. The results show that the word recognition rate using the PACWT features for noisy male data (white noise at 0 dB SNR) is 60%, whereas it is 41.35% for the MFCC features under identical conditions.
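
A minimal sketch of the phase autocorrelation idea described above, assuming a framed speech signal in a NumPy array; the frame length and maximum lag are illustrative, and the bark wavelet stage of PACWT is not reproduced here.

```python
# Phase autocorrelation (PAC) sketch: the angle between a frame and its
# circularly shifted copy replaces the usual dot-product correlation measure.
import numpy as np

def phase_autocorrelation(frame: np.ndarray, max_lag: int) -> np.ndarray:
    energy = np.dot(frame, frame)
    pac = np.empty(max_lag)
    for k in range(max_lag):
        shifted = np.roll(frame, k)
        # a circular shift preserves the norm, so cos(theta) = r[k] / ||x||^2
        cos_theta = np.clip(np.dot(frame, shifted) / (energy + 1e-12), -1.0, 1.0)
        pac[k] = np.arccos(cos_theta)
    return pac

# toy usage on one 25 ms frame of noise at 8 kHz
frame = np.random.randn(200)
print(phase_autocorrelation(frame, max_lag=12))
```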

Authors and Affiliations

Sayf A. Majeed
Hafizah Husain
Salina A. Samad

Abstract

Individual identification of similar communication emitters in a complex electromagnetic environment has great research value and significance in both military and civilian fields. In this paper, a feature extraction method called HVG-NTE is proposed based on the idea of system nonlinearity. The degree distribution of the horizontal visibility graph (HVG) is extracted, and its shape is quantified with NTE to improve anti-noise performance. XGBoost is then used to build a classifier for communication emitter identification. Our method achieves better recognition performance than state-of-the-art technology on a transient signal data set of radio stations of the same plant, batch, and model, and is suitable for small sample sizes.
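
A hedged sketch of the pipeline's main pieces: a horizontal visibility graph degree sequence, a Tsallis-entropy summary of its degree distribution, and an XGBoost classifier. The entropic index q, the number of trees, and the toy data are assumptions, not the authors' settings.

```python
import numpy as np
from xgboost import XGBClassifier

def hvg_degrees(x: np.ndarray) -> np.ndarray:
    """Horizontal visibility graph: samples i and j are linked if every
    sample strictly between them is lower than both (naive O(n^2) scan)."""
    n = len(x)
    deg = np.zeros(n, dtype=int)
    for i in range(n):
        blocker = -np.inf                  # tallest sample seen between i and j
        for j in range(i + 1, n):
            if blocker < min(x[i], x[j]):
                deg[i] += 1
                deg[j] += 1
            blocker = max(blocker, x[j])
            if x[j] >= x[i]:               # the view from i is now blocked for good
                break
    return deg

def tsallis_entropy(degrees: np.ndarray, q: float = 2.0) -> float:
    """Tsallis entropy of the empirical degree distribution."""
    _, counts = np.unique(degrees, return_counts=True)
    p = counts / counts.sum()
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

# toy usage: one entropy feature per signal, then an XGBoost classifier
signals = np.random.randn(40, 500)          # stand-in for emitter signals
labels = np.random.randint(0, 4, size=40)   # stand-in emitter IDs
features = np.array([[tsallis_entropy(hvg_degrees(s))] for s in signals])
clf = XGBClassifier(n_estimators=100).fit(features, labels)
```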

Bibliography

  1.  J. Dudczyk, “Radar emission sources identification based on hierarchical agglomerative clustering for large data sets”, J. Sens. 2016, 1879327 (2016).
  2.  G. Manish, G. Hareesh, and M. Arvind, “Electronic Warfare: Issues and Challenges for Emitter Classification”, Def. Sci. J. 61(3), 228‒234 (2011).
  3.  J. Dudczyk and A. Kawalec, “Specific emitter identification based on graphical representation of the distribution of radar signal parameters”, Bull. Pol. Acad. Sci. Tech. Sci. 63(2), 391‒396 (2015).
  4.  Q. Xu, R. Zheng, W. Saad, and Z. Han, “Device Fingerprinting in Wireless Networks: Challenges and Opportunities”, IEEE Commun. Surv. Tutor. 18(1), 94‒104 (2016).
  5.  P.C. Adam and G.L. Dennis, “Identification of Wireless Devices of Users Who Actively Fake Their RF Fingerprints With Artificial Data Distortion”, IEEE Trans. Wirel. Commun. 14(11), 5889‒5899 (2015).
  6.  N. Zhou, L. Luo, G. Sheng, and X. Jiang, “High Accuracy Insulation Fault Diagnosis Method of Power Equipment Based on Power Maximum Likelihood Estimation”, IEEE Trans. Power Deliv. 34(4), 1291‒1299 (2019).
  7.  S. Guo, R.E. White, and M. Low, “A comparison study of radar emitter identification based on signal transients”, IEEE Radar Conference, Oklahoma City, 2018, pp. 286‒291.
  8.  Q. Wu, C. Feres, D. Kuzmenko, D. Zhi, Z. Yu, and X. Liu, “Deep learning based RF fingerprinting for device identification and wireless security”, Electron. Lett. 54(24), 1405‒1407 (2018).
  9.  A. Kawalec, R. Owczarek, and J. Dudczyk, “Karhunen-Loeve transformation in radar signal features processing”, International Conference on Microwaves, Krakow, 2006.
  10.  B. Danev and S. Capkun, “Transient-based identification of wireless sensor nodes”, Information Processing in Sensor Networks, San Francisco, 2009, pp. 25‒36.
  11.  R.W. Klein, M.A. Temple, M.J. Mendenhall, and D.R. Reising, “Sensitivity Analysis of Burst Detection and RF Fingerprinting Classification Performance”, International Conference on Communications, Dresden, 2009, pp. 641‒645.
  12.  C. Bertoncini, K. Rudd, B. Nousain, and M. Hinders, “Wavelet Fingerprinting of Radio-Frequency Identification (RFID) Tags”, IEEE Trans. Ind. Electron. 59(12), 4843‒4850 (2012).
  13.  Z. Shi, X. Lin, C. Zhao, and M. Shi, “Multifractal slope feature based wireless devices identification”, International Conference on Computer Science and Education, Cambridge, 2015, pp. 590‒595.
  14.  C.K. Dubendorfer, B.W. Ramsey, and M.A. Temple, “ZigBee Device Verification for Securing Industrial Control and Building Automation Systems”, International Conference on Critical Infrastructure Protection, Washington DC, 2013, pp. 47‒62.
  15.  D.R. Reising and M.A. Temple, “WiMAX mobile subscriber verification using Gabor-based RF-DNA fingerprints”, International Conference on Communications, Ottawa, 2012, pp. 1005‒1010.
  16.  Y. Li, Y. Zhao, L. Wu, and J. Zhang, “Specific emitter identification using geometric features of frequency drift curve”, Bull. Pol. Acad. Sci. Tech. Sci. 66, 99‒108 (2018).
  17.  Y. Yuan, Z. Huang, H. Wu, and X. Wang, “Specific emitter identification based on Hilbert-Huang transform-based time-frequency-energy distribution features”, IET Commun. 8(13), 2404‒2412 (2014).
  18.  T.L. Carroll, “A nonlinear dynamics method for signal identification”, Chaos Interdiscip. J. Nonlinear Sci. 17(2), 023109 (2007).
  19.  D. Sun, Y. Li, Y. Xu, and J. Hu, “A Novel Method for Specific Emitter Identification Based on Singular Spectrum Analysis”, Wireless Communications & Networking Conference, San Francisco, 2017, pp. 1‒6.
  20.  Y. Jia, S. Zhu, and G. Lu, “Specific Emitter Identification Based on the Natural Measure”, Entropy 19(3), 117 (2017).
  21.  L. Lacasa, B. Luque, J. Luque, and J.C. Nuno, “The visibility graph: A new method for estimating the Hurst exponent of fractional Brownian motion”, Europhys. Lett. 86(3), 30001‒30005 (2009).
  22.  M. Ahmadlou and H. Adeli, “Visibility graph similarity: A new measure of generalized synchronization in coupled dynamic systems”, Physica D 241(4), 326‒332 (2012).
  23.  S. Zhu and L. Gan, “Specific emitter identification based on horizontal visibility graph”, IEEE International Conference Computer and Communications, Chengdu, 2017, pp. 1328‒1332.
  24.  B. Luque, L. Lacasa, F. Ballesteros, and J. Luque, “Horizontal visibility graphs: Exact results for random time series”, Phys. Rev. E. 80(4), 046103 (2009).
  25.  W. Jiang, B. Wei, J. Zhan, C. Xie, and D. Zhou, “A visibility graph power averaging aggregation operator: A methodology based on network analysis”, Comput. Ind. Eng. 101, 260‒268 (2016).
  26.  M. Wajs, P. Kurzynski, and D. Kaszlikowski, “Information-theoretic Bell inequalities based on Tsallis entropy”, Phys. Rev. A. 91(1), 012114 (2015).
  27.  J. Liang, Z. Huang, and Z. Li, “Method of Empirical Mode Decomposition in Specific Emitter Identification”, Wirel. Pers. Commun. 96(2), 2447‒2461, (2017).
  28.  A.M. Ali, E. Uzundurukan, and A. Kara, “Improvements on transient signal detection for RF fingerprinting”, Signal Processing and Communications Applications Conference (SIU), Antalya, 2017, pp. 1‒4.
  29.  Y. Yuan, Z. Huang, H. Wu, and X. Wang, “Specific emitter identification based on Hilbert-Huang transform-based time-frequency-energy distribution features”, IET Commun. 8(13), 2404‒2412 (2014).
  30.  D.R. Kong and H.B. Xie, “Assessment of Time Series Complexity Using Improved Approximate Entropy”, Chin. Phys. Lett. 28(9), 90502‒90505 (2011).
  31.  T. Chen and C. Guestrin, “XGBoost: A Scalable Tree Boosting System”, Knowledge Discovery and Data Mining, San Francisco, 2016, pp. 785‒794.
  32.  G. Huang, Y. Yuan, X. Wang, and Z. Huang, “Specific Emitter Identification Based on Nonlinear Dynamical Characteristics”, Can. J. Electr. Comp. Eng.-Rev. Can. Genie Electr. Inform. 39(1), 34‒41 (2016).
  33.  D. Sun, Y. Li, Y. Xu, and J. Hu, “A Novel Method for Specific Emitter Identification Based on Singular Spectrum Analysis”, Wireless Communications and Networking Conference, San Francisco, 2017, pp. 1‒6.

Authors and Affiliations

Ke Li (1, 2, 3)
Wei Ge (1, 2)
Xiaoya Yang (1, 2)
Zhengrong Xu (1)

  1. School of Information and Computer, Anhui Agricultural University, Hefei, Anhui, 230036, China
  2. Anhui Provincial Engineering Laboratory for Beidou Precision Agriculture Information, Anhui Agricultural University, Hefei, Anhui, 230036, China
  3. Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai University, Shanghai, 200072, China

Abstract

This paper presents the classification of musical instruments using mel-frequency cepstral coefficients (MFCC) and higher order spectral features. MFCC, cepstral, temporal, spectral, and timbral features have been widely used for musical instrument classification. Since the musical sound signal is generated by non-linear dynamics, the non-linearity and non-Gaussianity of musical instruments are important features that have not been considered in the past. In this paper, a hybrid of MFCC and higher order spectral (HOS) based features is used for musical instrument classification. The HOS-based features provide instrument-specific information such as the non-Gaussianity and non-linearity of the instruments. The extracted features are presented to a counter propagation neural network (CPNN) to identify the instruments and their family. For experimentation, isolated sounds of 19 musical instruments from the McGill University Master Samples (MUMS) sound database have been used. The proposed features show a significant improvement in the classification accuracy of the system.
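
An illustrative sketch only: MFCCs plus simple higher-order statistics (skewness, kurtosis) as crude non-Gaussianity descriptors. The paper's HOS features are derived from higher order spectra and the classifier is a CPNN; here a k-nearest-neighbour classifier stands in, and 'flute.wav' is a hypothetical file name.

```python
import numpy as np
import librosa
from scipy.stats import skew, kurtosis
from sklearn.neighbors import KNeighborsClassifier

def instrument_features(path: str) -> np.ndarray:
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    hos_proxy = np.array([skew(y), kurtosis(y)])   # non-Gaussianity proxies
    return np.concatenate([mfcc, hos_proxy])

# X = np.vstack([instrument_features(p) for p in ["flute.wav", ...]])
# KNeighborsClassifier(n_neighbors=3).fit(X, instrument_labels)
```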


Authors and Affiliations

Daulappa Guranna Bhalke
C. B. Rama Rao
Dattatraya Bormane

Abstract

Based on recent advances in non-linear analysis, the surface electromyography (sEMG) signal has been studied from the viewpoints of self-affinity and complexity. In this study, we examine the use of the critical exponent analysis (CE) method, a fractal dimension (FD) estimator, to study properties of the sEMG signal and to deploy these properties to characterize different movements for gesture recognition. sEMG signals were recorded from thirty subjects performing seven hand movements, using eight muscle channels. Mean values and coefficients of variation of the CE from all experiments show that variation is large between hand movement types but small within the same type. The CE feature related to the self-affine property of the sEMG signal extracted from different activities lies in the range 1.855‒2.754. These results have also been evaluated by analysis of variance (p-values). The results show that the CE feature is more suitable as a learning parameter for a classifier than other representative features, including root mean square, median frequency and Higuchi's method. Most p-values of the CE feature were less than 0.0001. Thus, the FD computed by the CE method can be used as a feature for a wide variety of sEMG applications.
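
For illustration, a sketch of Higuchi's fractal dimension, which is one of the comparison features named in the abstract; the critical exponent estimator itself is not reproduced here, and kmax is an assumed parameter.

```python
import numpy as np

def higuchi_fd(x: np.ndarray, kmax: int = 8) -> float:
    """Higuchi fractal dimension: slope of log curve length vs. log(1/k)."""
    n = len(x)
    lk = []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)
            if len(idx) < 2:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            norm = (n - 1) / ((len(idx) - 1) * k)   # Higuchi's normalisation
            lengths.append(dist * norm / k)
        lk.append(np.mean(lengths))
    k_vals = np.arange(1, kmax + 1)
    slope, _ = np.polyfit(np.log(1.0 / k_vals), np.log(lk), 1)
    return slope

# toy usage on a surrogate sEMG window
print(higuchi_fd(np.random.randn(1000)))
```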


Authors and Affiliations

Angkoon Phinyomark
Montri Phothisonothai
Pornchai Phukpattaranont
Chusak Limsakul

Abstract

In the last decade of the 20th century, several academic centers launched intensive research programs on the brain-computer interface (BCI). The current state of research allows the use of certain properties of the electromagnetic waves (brain activity) produced by brain neurons, measured using electroencephalographic techniques (EEG recording involves reading from electrodes attached to the scalp - the non-invasive method - or from electrodes implanted directly into the cerebral cortex - the invasive method). A BCI system reads the user's "intentions" by decoding certain features of the EEG signal. Those features are then classified and "translated" on-line into commands used to control a computer, prosthesis, wheelchair or other device. In this article, the authors try to show that the BCI is a typical example of a measurement and control unit.
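
A toy illustration of the decode-classify-command loop described above: band-power features from one EEG channel feed a linear classifier whose output is mapped to a device command. The band edges, sampling rate and command map are assumptions for the example, not taken from the article.

```python
import numpy as np
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def band_power(eeg: np.ndarray, fs: float, lo: float, hi: float) -> float:
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return float(np.log(np.var(filtfilt(b, a, eeg))))

def features(epoch: np.ndarray, fs: float = 250.0) -> list:
    return [band_power(epoch, fs, 8, 12),    # mu band
            band_power(epoch, fs, 18, 26)]   # beta band

# X = [features(e) for e in training_epochs]; y = intention_labels
# clf = LinearDiscriminantAnalysis().fit(X, y)
# command = {0: "left", 1: "right"}[clf.predict([features(new_epoch)])[0]]
```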


Authors and Affiliations

Remigiusz J. Rak
Marcin Kołodziej
Andrzej Majkowski

Abstract

An array consisting of four commercial gas sensors with target specifications for hydrocarbons, ammonia, alcohol, and explosive gases has been constructed and tested. The sensors in the array operate in dynamic mode, with the operating temperature modulated between 350°C and 500°C. Changes in the sensor operating temperature lead to distinct resistance responses affected by the gas type, its concentration and the humidity level. The measurements are performed at various hydrogen (17-3000 ppm), methane (167-3000 ppm) and propane (167-3000 ppm) concentrations and at relative humidity levels of 0-75% RH. The measured dynamic response signals are further processed with the discrete Fourier transform. Absolute values of the DC component and the first five harmonics of each sensor are analysed by a feed-forward back-propagation neural network. The ultimate aim of this research is to achieve reliable hydrogen detection despite the interference of humidity and residual gases.
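
A sketch of the described processing chain under stated assumptions: the DFT of each sensor's dynamic response over one modulation period, keeping the DC magnitude and the first five harmonic magnitudes per sensor, classified by a feed-forward back-propagation network. Array shapes and network size are illustrative.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def harmonic_features(responses: np.ndarray) -> np.ndarray:
    """responses: (n_sensors, n_samples) resistance traces covering one
    temperature-modulation period; returns 6 magnitudes per sensor."""
    spectra = np.fft.rfft(responses, axis=1)
    return np.abs(spectra[:, :6]).ravel()      # |DC| + first 5 harmonics

# X = np.vstack([harmonic_features(cycle) for cycle in recorded_cycles])
# y = gas_labels   # e.g. hydrogen / methane / propane at various RH levels
# MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000).fit(X, y)
```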

Authors and Affiliations

Patryk Gwiżdż
Andrzej Brudnik
Katarzyna Zakrzewska

Abstract

Acoustical analysis of snoring provides a new approach to the diagnosis of obstructive sleep apnea-hypopnea syndrome (OSAHS). A classification method based on respiratory disorder events is presented to predict the apnea-hypopnea index (AHI) of OSAHS patients. The acoustical features of snoring were extracted from full-night recordings of 6 OSAHS patients, and regular snoring sounds and snoring sounds related to respiratory disorder events were classified using a support vector machine (SVM) method. The mean recognition rate for simple snoring sounds and snoring sounds related to respiratory disorder events is more than 91.14% when using grid search, a genetic algorithm and particle swarm optimization methods. The AHI predicted in the present study has a high correlation with the AHI from polysomnography, with a correlation coefficient of 0.976. These results demonstrate that the proposed method can classify the snoring sounds of OSAHS patients and can be used to provide guidance for the diagnosis of OSAHS.
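
A minimal sketch of one of the classifier-tuning routes mentioned above, a grid search over an RBF-kernel SVM; the acoustic feature extraction and the GA/PSO alternatives are not shown, and the feature matrix X, labels y and parameter grid are assumptions.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": [1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
# search.fit(X, y)   # y: regular snore vs. respiratory-disorder-related snore
# print(search.best_params_, search.best_score_)
```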


Authors and Affiliations

Can Wang
Jianxin Peng
Xiaowen Zhang

Abstract

This study proposes a method that combines histogram of oriented gradients (HOG) feature extraction and extreme gradient boosting (XGBoost) classification to address the challenges of concrete crack monitoring. The purpose of the study is to address the common issue of overfitting in machine learning models. The research uses a dataset of 40,000 images of concrete cracks and HOG feature extraction to identify relevant patterns. Classification is performed with the ensemble method XGBoost, with a focus on optimizing its hyperparameters. The study evaluates the efficacy of XGBoost in comparison to other ensemble methods such as random forest and AdaBoost. The results show that XGBoost outperforms the other algorithms in terms of accuracy, precision, recall, and F1-score. With optimized hyperparameters, the proposed method obtains an accuracy of 96.95%, a recall of 96.10%, a precision of 97.90%, and an F1-score of 97%. When the number-of-trees hyperparameter is optimized, 1200 trees yield the best performance. The results demonstrate the efficacy of HOG-based feature extraction and XGBoost for accurate and dependable classification of concrete cracks, overcoming the overfitting issues typically encountered in such tasks.
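
A hedged sketch of a HOG-plus-XGBoost pipeline of the kind described; the HOG parameters and image size are assumptions, and only the 1200-tree setting comes from the text.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from xgboost import XGBClassifier

def hog_features(image: np.ndarray) -> np.ndarray:
    """image: grayscale crack photo as a 2-D array."""
    image = resize(image, (128, 128), anti_aliasing=True)
    return hog(image, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

# X = np.vstack([hog_features(img) for img in crack_images]); y = crack_labels
clf = XGBClassifier(n_estimators=1200, learning_rate=0.1, max_depth=6)
# clf.fit(X_train, y_train); print(clf.score(X_test, y_test))
```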

Authors and Affiliations

Ida Barkiah (1)
Yuslena Sari (2)

  1. Department of Civil Engineering, Universitas Lambung Mangkurat, Indonesia
  2. Department of Information Technology, Universitas Lambung Mangkurat, Indonesia

Abstract

Copy-move forgery detection (CMFD) begins with preprocessing, which prepares the image for processing. The image features are then extracted using the scale-invariant feature transform (SIFT). The last step is feature matching using the generalized 2 nearest-neighbour (g2NN) method with varying threshold values. The problem addressed is which threshold value and number of keypoints give copy-move detection the highest accuracy; the optimal threshold value and number of keypoints were determined so that detection accuracy is maximised. The research was carried out on images without noise and with Gaussian noise.
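
A sketch of SIFT keypoint extraction and a g2NN-style ratio test for matching copied regions within the same image, assuming OpenCV with SIFT available (cv2.SIFT_create); the file name, the number of neighbours and the 0.5 threshold are illustrative assumptions.

```python
import cv2

img = cv2.imread("suspect.png", cv2.IMREAD_GRAYSCALE)   # hypothetical file
sift = cv2.SIFT_create()
kp, des = sift.detectAndCompute(img, None)

matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des, des, k=10)        # self-matching for copy-move

threshold = 0.5
pairs = []
for candidates in knn:
    neighbours = candidates[1:]               # skip the trivial self-match
    # g2NN: keep neighbours while the ratio d_i / d_(i+1) stays below the threshold
    for i in range(len(neighbours) - 1):
        if neighbours[i].distance / (neighbours[i + 1].distance + 1e-12) < threshold:
            pairs.append((candidates[0].queryIdx, neighbours[i].trainIdx))
        else:
            break
print(f"{len(pairs)} candidate copy-move matches")
```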


Authors and Affiliations

R. Rizal Isnanto
Ajub Ajulian Zahra
Imam Santoso
Muhammad Salman Lubis

Abstract

The individual identification of communication emitters is the process of identifying different emitters based on radio frequency fingerprint features extracted from the received signals. Due to the inherent non-linearity of the emitter power amplifier, these fingerprints provide distinguishing features for emitter identification. In this study, approximate entropy is introduced into variational mode decomposition: approximate entropy features are extracted from each mode decomposed from the reconstructed signal, while the local minimum removal method is used to filter out noise modes and improve the SNR. We propose a semi-supervised dimensionality reduction method named exponential semi-supervised discriminant analysis to reduce the high-dimensional feature vectors of the signals, and LightGBM is applied to build a classifier for communication emitter identification. The experimental results show that the method performs better than state-of-the-art individual communication emitter identification technology on a steady signal data set of radio stations of the same plant, batch and model.
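
A sketch of two pieces of this pipeline: an approximate entropy feature computed per decomposed mode and a LightGBM classifier. The VMD step and the exponential semi-supervised discriminant analysis reduction are assumed to have been applied already, and the parameters m and r are standard defaults rather than the paper's values.

```python
import numpy as np
from lightgbm import LGBMClassifier

def approximate_entropy(x: np.ndarray, m: int = 2, r_factor: float = 0.2) -> float:
    """Standard ApEn for a short window (pairwise distances held in memory)."""
    r = r_factor * np.std(x)
    def phi(order: int) -> float:
        templates = np.array([x[i:i + order] for i in range(len(x) - order + 1)])
        dists = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        counts = np.mean(dists <= r, axis=1)     # self-matches included, as usual
        return float(np.mean(np.log(counts)))
    return phi(m) - phi(m + 1)

# one ApEn value per decomposed mode of each received signal:
# X = np.vstack([[approximate_entropy(mode) for mode in modes] for modes in dataset])
# LGBMClassifier(n_estimators=200).fit(X, emitter_labels)
```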

Authors and Affiliations

Wei Ge (1, 2)
Lin Qi (1, 2)
Lin Tong (1, 2)
Jun Zhu (1, 2)
Jing Zhang (1, 2)
Dongyang Zhao (3)
Ke Li (1, 2)

  1. School of Information & Computer Science, Anhui Agricultural University, Hefei, Anhui, 230036, China
  2. Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, Anhui, 230601, China
  3. Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, Guangdong, 518000, China

Abstract

Marine mammal identification and classification for passive acoustic monitoring remains a challenging task. In particular, interspecific and intraspecific variations in calls within species and among different individuals of a single species make it more challenging. The variety of species, along with geographical diversity, further complicates accurate marine mammal classification using acoustic signatures. Prior classification methods focused on spectral features, which increases bias for contour-based classifiers in automatic detection algorithms. In this study, acoustic marine mammal classification is performed through the fusion of 1D local binary pattern (1D-LBP) and mel-frequency cepstral coefficient (MFCC) features. A multi-class support vector machine (SVM) classifier is employed to identify different classes of mammal sounds. Classification of six species, namely Tursiops truncatus, Delphinus delphis, Peponocephala electra, Grampus griseus, Stenella longirostris, and Stenella attenuata, is targeted in this research. The proposed model achieved 90.4% accuracy with a 70–30% training-testing split and 89.6% under 5-fold cross-validation.
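
A hedged sketch of the feature fusion: a simple one-dimensional local binary pattern histogram concatenated with mean MFCCs and classified by a multi-class SVM. The neighbourhood size, MFCC settings and SVM kernel are assumptions and may differ from the paper's exact 1D-LBP formulation.

```python
import numpy as np
import librosa
from sklearn.svm import SVC

def lbp_1d_histogram(x: np.ndarray, p: int = 4) -> np.ndarray:
    """Compare each sample with p neighbours on each side and histogram the
    resulting 2p-bit codes."""
    codes = np.zeros(len(x) - 2 * p, dtype=np.int64)
    center = x[p:len(x) - p]
    for offset in range(-p, p + 1):
        if offset == 0:
            continue
        neighbour = x[p + offset: len(x) - p + offset]
        codes = (codes << 1) | (neighbour >= center).astype(np.int64)
    hist, _ = np.histogram(codes, bins=2 ** (2 * p), range=(0, 2 ** (2 * p)))
    return hist / hist.sum()

def call_features(y: np.ndarray, sr: int) -> np.ndarray:
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)
    return np.concatenate([lbp_1d_histogram(y), mfcc])

# X = np.vstack([call_features(y, sr) for y, sr in recordings]); y = species_ids
# SVC(kernel="rbf", decision_function_shape="ovr").fit(X, y)
```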


Authors and Affiliations

Maheen Nadir
Syed Muhammad Adnan
Sumair Aziz
Muhammad Umar Khan

Abstract

In recent decades, the assessment of rolling bearing faults and their evolution over time has received much interest due to their crucial role in the condition-based maintenance (CBM) of rotating machinery. This paper investigates bearing fault diagnosis based on a classification approach using the Gaussian mixture model (GMM) and mel-frequency cepstral coefficient (MFCC) features. Throughout, a single criterion is defined for evaluating performance over the whole classification process: the average classification rate (ACR) obtained from the confusion matrix. In every test performed, the generated feature vectors are used to discriminate between four conditions: normal bearings, bearings with inner race faults, bearings with outer race faults, and bearings with ball faults. Many configurations were tested in order to determine the optimal values of the input parameters, such as the analysis frame length and the model order. The experimental application of the proposed method was based on vibration signals taken from the bearing data center website of Case Western Reserve University (CWRU). Results show that the proposed method can reliably classify the different fault conditions and achieves its highest classification performance under certain conditions.
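
A sketch of the classification scheme under stated assumptions: one Gaussian mixture model per bearing condition, trained on MFCC frames of the vibration signal, with the class decided by the highest average log-likelihood. The sampling rate, MFCC order and mixture order are illustrative, not the optimised values from the paper.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_frames(vibration: np.ndarray, fs: int = 12000) -> np.ndarray:
    return librosa.feature.mfcc(y=vibration.astype(float), sr=fs, n_mfcc=13).T

def train_models(signals_by_class: dict, n_components: int = 8) -> dict:
    models = {}
    for label, signals in signals_by_class.items():
        frames = np.vstack([mfcc_frames(s) for s in signals])
        models[label] = GaussianMixture(n_components=n_components,
                                        covariance_type="diag").fit(frames)
    return models

def classify(models: dict, signal: np.ndarray) -> str:
    frames = mfcc_frames(signal)
    # pick the condition whose GMM gives the highest mean log-likelihood
    return max(models, key=lambda lbl: models[lbl].score(frames))

# models = train_models({"normal": [...], "inner": [...], "outer": [...], "ball": [...]})
# print(classify(models, test_signal))
```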


Authors and Affiliations

Youcef Atmani
Said Rechak
Ammar Mesloub
Larbi Hemmouch

Abstract

The tomato crop is more susceptible to disease than any other vegetable, and it can be infected with over 200 diseases caused by different pathogens worldwide. Tomato plant diseases have become a challenge to food security globally. Currently, diagnosing and preventing tomato plant diseases is a challenge due to the lack of essential methods or tools. Traditional techniques of detecting plant disease are arduous and error-prone. Utilizing precise or automatic detection methods for spotting early plant disease can improve the quality of food production and reduce adverse effects. Deep learning has significantly increased the recognition accuracy of image classification and object detection systems in recent years. In this study, a 15-layer convolutional neural network is proposed as the backbone for a single shot detector (SSD) to improve the detection of healthy tomato fruit and three classes of tomato fruit diseases. The performance of the proposed model is compared with ResNet-50, AlexNet, VGG16, and VGG19 as backbones for the single shot detector. The findings of the experiment show that the proposed CNN-SSD achieved a higher detection accuracy of 98.87%, outperforming the state-of-the-art models.
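
An illustrative PyTorch sketch of a small convolutional backbone of the kind described (15 weight layers feeding an SSD head); the layer widths are assumptions, and the SSD detection head itself is not reproduced here.

```python
import torch.nn as nn

def conv_block(c_in: int, c_out: int) -> nn.Sequential:
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.BatchNorm2d(c_out),
                         nn.ReLU(inplace=True))

class SmallBackbone(nn.Module):
    """Five stages of three 3x3 convolutions each, i.e. 15 conv layers."""
    def __init__(self):
        super().__init__()
        widths = [3, 32, 64, 128, 256, 512]
        stages = []
        for c_in, c_out in zip(widths[:-1], widths[1:]):
            stages += [conv_block(c_in, c_out), conv_block(c_out, c_out),
                       conv_block(c_out, c_out), nn.MaxPool2d(2)]
        self.features = nn.Sequential(*stages)

    def forward(self, x):          # feature maps handed to an SSD head
        return self.features(x)
```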

Authors and Affiliations

Benedicta Nana Esi Nyarko (1)
Wu Bin (1)
Zhou Jinzhi (1)
Justice Odoom (1)

  1. School of Information Engineering, Southwest University of Science and Technology, Mianyang, Sichuan, China

Abstract

A fault diagnosis method for the rotating rectifier of a brushless three-phase synchronous aerospace generator is proposed in this article. The proposed diagnostic system includes three steps: data acquisition, feature extraction and fault diagnosis. Based on a dynamic fast Fourier transform (FFT), the method processes the output voltages of the aerospace generator continuously and monitors the continuous change trend of the main frequency in the spectrum before and after the fault. This trend can be used to perform the fault diagnosis task. The fault features of the rotating rectifier proposed in this paper can quickly and effectively distinguish single and double faulty diodes. In order to verify the proposed diagnosis system, simulations and practical experiments are carried out, and good results are achieved.
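
A sketch of the monitoring idea, assuming a sampled voltage record: a sliding ("dynamic") FFT over the generator output voltage that tracks how the dominant spectral line moves before and after a fault. The window length, hop size and sampling rate are assumptions.

```python
import numpy as np

def dominant_frequency_trend(voltage: np.ndarray, fs: float,
                             win: int = 1024, hop: int = 256) -> np.ndarray:
    """Main-frequency trend of the output voltage over sliding windows."""
    freqs = np.fft.rfftfreq(win, d=1.0 / fs)
    trend = []
    for start in range(0, len(voltage) - win + 1, hop):
        spectrum = np.abs(np.fft.rfft(voltage[start:start + win] * np.hanning(win)))
        spectrum[0] = 0.0                 # ignore the DC component
        trend.append(freqs[np.argmax(spectrum)])
    return np.array(trend)

# a change in this trend after the fault instant is what the diagnosis step
# inspects to separate single- and double-diode failure signatures
```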

Authors and Affiliations

Sai Feng (1)
Jiang Cui (1)
Zhuoran Zhang (1)

  1. Nanjing University of Aeronautics and Astronautics, College of Automation Engineering, Nanjing City, Jiangsu Province, 211100, China

Abstract

Speech emotion recognition (SER) is a complicated and challenging task in human-computer interaction because it is difficult to find a feature set that fully discriminates the emotional state. The FFT is commonly used to process the raw signal when extracting low-level descriptor features such as short-time energy, fundamental frequency, formants, MFCC (mel-frequency cepstral coefficients) and so on. However, these features are built in the frequency domain and ignore information from the temporal domain. In this paper, we propose a novel framework that combines a multi-layer wavelet sequence set from wavelet packet reconstruction (WPR) with a conventional feature set to form a mixed feature set for emotion recognition with recurrent neural networks (RNN) based on the attention mechanism. In addition, since silent frames have a detrimental effect on SER, we adopt autocorrelation-function-based voice activity detection to eliminate emotionally irrelevant frames. We show that the proposed algorithm significantly outperforms traditional feature sets in the prediction of spontaneous emotional states on the IEMOCAP corpus and the EMODB database, and achieves better classification in both speaker-independent and speaker-dependent experiments. It is noteworthy that we obtain accuracies of 62.52% and 77.57% in the speaker-independent (SI) setting and 66.90% and 82.26% in the speaker-dependent (SD) setting, respectively.
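
A minimal PyTorch sketch of attention-weighted pooling over RNN outputs, the mechanism referred to above; the input sizes, the GRU choice and the number of emotion classes are assumptions, and the wavelet packet feature extraction is not shown.

```python
import torch
import torch.nn as nn

class AttentiveRNN(nn.Module):
    def __init__(self, feat_dim: int, hidden: int = 128, n_classes: int = 4):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)        # frame-level attention score
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                            # x: (batch, frames, feat_dim)
        h, _ = self.rnn(x)                           # (batch, frames, 2*hidden)
        alpha = torch.softmax(self.score(h), dim=1)  # attention over frames
        utterance = (alpha * h).sum(dim=1)           # weighted pooling
        return self.out(utterance)

# logits = AttentiveRNN(feat_dim=60)(torch.randn(8, 300, 60))
```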

Bibliography

  1.  M. Gupta, et al., “Emotion recognition from speech using wavelet packet transform and prosodic features”, J. Intell. Fuzzy Syst. 35, 1541–1553 (2018).
  2.  M. El Ayadi, et al., “Survey on speech emotion recognition: Features, classification schemes, and databases”, Pattern Recognit. 44, 572–587 (2011).
  3.  P. Tzirakis, et al., “End-to-end speech emotion recognition using deep neural networks”, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018, pp. 5089‒5093, doi: 10.1109/ICASSP.2018.8462677.
  4.  J.M Liu, et al., “Learning Salient Features for Speech Emotion Recognition Using CNN”, 2018 First Asian Conference on Affective Computing and Intelligent Interaction (ACII Asia), Beijing, China, 2018, pp. 1‒5, doi: 10.1109/ACIIAsia.2018.8470393.
  5.  J. Kim, et al., “Learning spectro-temporal features with 3D CNNs for speech emotion recognition”, 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), San Antonio, USA, 2017, pp. 383‒388, doi: 10.1109/ACII.2017.8273628.
  6.  M.Y. Chen, X.J. He, et al., “3-D Convolutional Recurrent Neural Networks with Attention Model for Speech Emotion Recognition”, IEEE Signal Process Lett. 25(10), 1440‒1444 (2018), doi: 10.1109/LSP.2018.2860246.
  7.  V.N. Degaonkar and S.D. Apte, “Emotion modeling from speech signal based on wavelet packet transform”, Int. J. Speech Technol. 16, 1‒5 (2013).
  8.  T. Feng and S. Yang, “Speech Emotion Recognition Based on LSTM and Mel Scale Wavelet Packet Decomposition”, Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence (ACAI 2018), New York, USA, 2018, art. 38.
  9.  P. Yenigalla, A. Kumar, et al., “Speech Emotion Recognition Using Spectrogram & Phoneme Embedding”, Proc. Interspeech 2018, 2018, pp. 3688‒3692, doi: 10.21437/Interspeech.2018-1811.
  10.  J. Kim, K.P. Truong, G. Englebienne, and V. Evers, “Learning spectro-temporal features with 3D CNNs for speech emotion recognition”, 2017 Seventh International Conference on Affective Computing and Intelligent Interaction (ACII), San Antonio, USA, 2017, pp. 383‒388, doi: 10.1109/ACII.2017.8273628.
  11.  S. Jing, X. Mao, and L. Chen, “Prominence features: Effective emotional features for speech emotion recognition”, Digital Signal Process. 72, 216‒231 (2018).
  12.  L. Chen, X. Mao, P. Wei, and A. Compare, “Speech emotional features extraction based on electroglottograph”, Neural Comput. 25(12), 3294–3317 (2013).
  13.  J. Hook, et al., “Automatic speech based emotion recognition using paralinguistics features”, Bull. Pol. Acad. Sci. Tech. Sci. 67(3), 479‒488 (2019).
  14.  A. Mencattini, E. Martinelli, G. Costantini, M. Todisco, B. Basile, M. Bozzali, and C. Di Natale, “Speech emotion recognition using amplitude modulation parameters and a combined feature selection procedure”, Knowl.-Based Syst. 63, 68–81 (2014).
  15.  H. Mori, T. Satake, M. Nakamura, and H. Kasuya, “Constructing a spoken dialogue corpus for studying paralinguistic information in expressive conversation and analyzing its statistical/acoustic characteristics”, Speech Commun. 53(1), 36–50 (2011).
  16.  B. Schuller, S. Steidl, A. Batliner, F. Burkhardt, L. Devillers, C. Müller, and S. Narayanan, “Paralinguistics in speech and language—state- of-the-art and the challenge”, Comput. Speech Lang. 27(1), 4–39 (2013).
  17.  S. Mariooryad and C. Busso, “Compensating for speaker or lexical variabilities in speech for emotion recognition”, Speech Commun. 57, 1–12 (2014).
  18.  G. Trigeorgis, et al., “Adieu features? End-to-end speech emotion recognition using a deep convolutional recurrent network”, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 5200‒5204, doi: 10.1109/ICASSP.2016.7472669.
  19.  Y. Xie, et al., “Attention-based dense LSTM for speech emotion recognition”, IEICE Trans. Inf. Syst. E102.D, 1426‒1429 (2019).
  20.  F. Tao and G. Liu, “Advanced LSTM: A Study about Better Time Dependency Modeling in Emotion Recognition”, 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, Canada, 2018, pp. 2906‒2910, doi: 10.1109/ICASSP.2018.8461750.
  21.  Y.M. Huang and W. Ao, “Novel Sub-band Spectral Centroid Weighted Wavelet Packet Features with Importance-Weighted Support Vector Machines for Robust Speech Emotion Recognition”, Wireless Personal Commun. 95, 2223–2238 (2017).
  22.  Firoz Shah A. and Babu Anto P., “Wavelet Packets for Speech Emotion Recognition”, 2017 Third International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics (AEEICB), Chennai, 2017, pp. 479‒481, doi: 10.1109/AEEICB.2017.7972358.
  23.  K. Wang, N. An, and L. Li, “Speech Emotion Recognition Based on Wavelet Packet Coefficient Model”, The 9th International Symposium on Chinese Spoken Language Processing, Singapore, 2014, pp. 478‒482, doi: 10.1109/ISCSLP.2014.6936710.
  24.  S. Sekkate, et al., “An Investigation of a Feature-Level Fusion for Noisy Speech Emotion Recognition”, Computers 8, 91 (2019).
  25.  Varsha N. Degaonkar and Shaila D. Apte, “Emotion Modeling from Speech Signal based on Wavelet Packet Transform”, Int. J. Speech Technol. 16, 1–5 (2013).
  26.  F. Eyben, et al., “Opensmile: the munich versatile and fast open-source audio feature extractor”, MM ’10: Proceedings of the 18th ACM international conference on Multimedia, 2010, pp. 1459‒1462.
  27.  Ch.-N. Anagnostopoulos, T. Iliou, and I. Giannoukos, “Features and classifiers for emotion recognition from speech: a survey from 2000 to 2011,” Artif. Intell. 43(2), 155–177 (2015).
  28.  H. Meng, T. Yan, F. Yuan, and H. Wei, “Speech Emotion Recognition From 3D Log-Mel SpectrogramsWith Deep Learning Network”, IEEE Access 7, 125868‒125881 (2019).
  29.  G. Keren and B. Schuller, “Convolutional RNN: An enhanced model for extracting features from sequential data”, International Joint Conference on Neural Networks, 2016, pp. 3412‒3419.
  30.  C.W. Huang and S.S. Narayanan, “Deep convolutional recurrent neural network with attention mechanism for robust speech emotion recognition”, IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, 2017, pp. 583‒588, doi: 10.1109/ICME.2017.8019296.
  31.  S. Mirsamadi, E. Barsoum, and C. Zhang, “Automatic Speech Emotion Recognition using Recurrent Neural Networks with Local Attention”, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA, 2017, pp. 2227‒2231, doi: 10.1109/ICASSP.2017.7952552.
  32.  A. Vaswani, et al., “Attention Is All You Need”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, USA, 2017.
  33.  X.J. Wang, et al., “Dynamic Attention Deep Model for Article Recommendation by Learning Human Editors’ Demonstration”, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, Canada, 2017.
  34.  C. Busso, et al., “IEMOCAP: interactive emotional dyadic motion capture database,” Language Resources & Evaluation 42(4), 335 (2008).
  35.  F. Burkhardt, A. Paeschke, M. Rolfes, W.F. Sendlmeier, and B.Weiss, “A database of German emotional speech,” INTERSPEECH 2005 – Eurospeech, Lisbon, Portugal, 2005, pp. 1517‒1520.
  36.  D. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization”, International Conference on Learning Representations (ICLR), San Diego, USA, 2015.
  37.  F. Vuckovic, G. Lauc, and Y. Aulchenko. “Normalization and batch correction methods for high-throughput glycomics”, Joint Meeting of the Society-For-Glycobiology 2016, pp. 1160‒1161.

Authors and Affiliations

Hao Meng (1)
Tianhao Yan (1)
Hongwei Wei (1)
Xun Ji (2)

  1. Key laboratory of Intelligent Technology and Application of Marine Equipment (Harbin Engineering University), Ministry of Education, Harbin, 150001, China
  2. College of Marine Electrical Engineering, Dalian Maritime University, Dalian, 116026, China

Abstract

The condition monitoring of offshore wind power plants is an important topic that remains open. This monitoring aims to lower the maintenance cost of these plants. One of the main components of a wind power plant is the wind turbine foundation. This study describes a data-driven structural damage classification methodology applied to a wind turbine foundation. The vibration response of the structure was captured using an accelerometer network. After arranging the obtained data, a feature vector of 58 008 features was obtained. An ensemble approach of feature extraction methods was applied to obtain a new set of features: principal component analysis (PCA) and Laplacian eigenmaps were used as dimensionality reduction methods, each one separately, and the union of the resulting features forms a reduced feature matrix. The reduced feature matrix is used as input to train an extreme gradient boosting (XGBoost) machine-learning-based classification model. Four different damage scenarios were applied to the structure; therefore, considering the healthy structure, there were 5 classes in total to be classified. Five-fold cross-validation is used to obtain the final classification accuracy. As a result, 100% classification accuracy was obtained after applying the developed damage classification methodology to a wind-turbine offshore jacket-type foundation benchmark structure.
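
A sketch of the described ensemble of reductions: PCA and Laplacian eigenmaps (available in scikit-learn as SpectralEmbedding) are computed separately, their outputs concatenated, and the result passed to XGBoost under 5-fold cross-validation. The numbers of components and trees are assumptions, not the paper's values.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

def reduced_features(X: np.ndarray) -> np.ndarray:
    x_pca = PCA(n_components=20).fit_transform(X)
    x_le = SpectralEmbedding(n_components=20).fit_transform(X)  # Laplacian eigenmaps
    return np.hstack([x_pca, x_le])

# X: (n_experiments, 58008) accelerometer feature vectors, y: damage class 0..4
# scores = cross_val_score(XGBClassifier(n_estimators=300), reduced_features(X), y, cv=5)
# print(scores.mean())
```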

Authors and Affiliations

Jersson X. Leon-Medina (1, 2)
Núria Parés (3)
Maribel Anaya (4)
Diego A. Tibaduiza (4)
Francesc Pozo (1, 5)

  1. Control, Data, and Artificial Intelligence (CoDAlab), Department of Mathematics, Escola d’Enginyeria de Barcelona Est (EEBE),Campus Diagonal-Besòs (CDB), Universitat Politècnica de Catalunya (UPC), Eduard Maristany 16, 08019 Barcelona, Spain
  2. Programa de Ingeniería Mecatrónica, Universidad de San Buenaventura, Carrera 8H #172-20, Bogota, Colombia
  3. Laboratori de Càlcul Numèric (LaCàN), Department of Mathematics, Escola d’Enginyeria de Barcelona Est (EEBE), Campus Diagonal-Besòs
  4. Departamento de Ingeniería Eléctrica y Electrónica, Universidad Nacional de Colombia, Cra 45 No. 26-85, Bogotá 111321, Colombia
  5. Institute of Mathematics (IMTech), Universitat Politècnica de Catalunya (UPC), Pau Gargallo 14, 08028 Barcelona, Spain

Abstract

Since the induction motor operates in a complex environment, making its stator and rotor susceptible to damage that would have a significant impact on the whole system, efficient diagnostic methods are necessary to minimize the risk of failure. However, traditional fault diagnosis methods have limited applicability and accuracy in diagnosing the various types of stator and rotor faults. To address this issue, this paper proposes a stator-rotor fault diagnosis model based on time-frequency domain feature extraction and an extreme learning machine (ELM) optimized with golden jackal optimization (GJO) to achieve high-precision diagnosis of motor faults. The proposed method first establishes a platform for acquiring induction motor stator-rotor fault data. Next, wavelet threshold denoising is used to pre-process the fault current signal data, followed by feature extraction and time-frequency domain eigenvalue analysis. By comparison, the impulse factor is finally adopted as the feature vector of the diagnostic model. Finally, an induction motor fault diagnosis model is constructed by using GJO to optimize the ELM. Simulations are carried out in comparison with neural networks, and the results show that the proposed GJO-ELM model has the highest diagnostic accuracy, 94.5%. This finding indicates that the proposed method outperforms traditional methods in feature learning and classification for induction motor fault diagnosis, and has engineering application value.
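
A bare-bones extreme learning machine sketch for orientation: a random hidden layer with output weights fitted by least squares. In the paper the hidden weights and biases are tuned by golden jackal optimization; here they are simply drawn at random, so this is not the authors' GJO-ELM implementation.

```python
import numpy as np

class ELMClassifier:
    def __init__(self, n_hidden: int = 100, seed: int = 0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X: np.ndarray) -> np.ndarray:
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid hidden layer

    def fit(self, X: np.ndarray, y: np.ndarray):
        # in GJO-ELM, these random weights and biases would be optimised by GJO
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        targets = np.eye(y.max() + 1)[y]                       # y: integer labels 0..K-1
        self.beta = np.linalg.pinv(self._hidden(X)) @ targets  # least-squares output weights
        return self

    def predict(self, X: np.ndarray) -> np.ndarray:
        return np.argmax(self._hidden(X) @ self.beta, axis=1)

# clf = ELMClassifier().fit(X_train, y_train)
# accuracy = (clf.predict(X_test) == y_test).mean()
```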

Authors and Affiliations

Lingzhi Yi (1, 2)
Jiao Long (1)
Yahui Wang (1)
Tao Sun (3)
Jianxiong Huang (1)
Yi Huang (1)

  1. College of Automation and Electronic Information, Xiangtan University, Xiangtan, Hunan, 411105, China
  2. Hunan Engineering Research Center of Multi-Energy Cooperative Control Technology, Xiangtan, Hunan 411105, China
  3. State Grid Anhui Electric Power Ultra-High Voltage Company, Hefei, Anhui, 230000, China
