The main aim of this essay is to examine three philosophical narratives. One of them, Hegel’s master-slave dialectic, clearly inspired the other two: Marx’s reflections in his Economic and Philosophic Manuscripts of 1844 and the interpretation of the Odyssey in Horkheimer and Adorno’s Dialectic of Enlightenment. Whereas Hegel’s dialectic opens a perspective of mutual recognition between individuals, permanently codified in their fundamental rights, the other two narratives lead to entirely different conclusions. According to the young Marx, subjects not only fail to recognize one another mutually but, under the influence of economic relationships, even treat each other with disregard. In Adorno and Horkheimer’s view, too, the labor processes which for Hegel led towards the freedom of individuals distort interpersonal relations and strengthen a growing coercion. Finally, the proposal of Jürgen Habermas is taken into consideration: he argues that communicative acts, rather than labor processes, are the real emancipating factor.
This study examined whether differences in reverberation time (RT) between typical sound field test rooms used in audiology clinics have an effect on speech recognition in multi-talker environments. Separate groups of participants listened to target speech sentences presented simultaneously with 0-to-3 competing sentences through four spatially-separated loudspeakers in two sound field test rooms having RT = 0.6 sec (Site 1: N = 16) and RT = 0.4 sec (Site 2: N = 12). Speech recognition scores (SRSs) for the Synchronized Sentence Set (S3) test and subjective estimates of perceived task difficulty were recorded. The obtained results indicate that the change in room RT from 0.4 to 0.6 sec did not significantly influence SRSs in quiet or in the presence of one competing sentence. However, this small change in RT affected SRSs when 2 and 3 competing sentences were present, resulting in mean SRSs that were about 8-10% better in the room with RT = 0.4 sec. Perceived task difficulty ratings increased as the complexity of the task increased, with average ratings similar across test sites for each level of sentence competition. These results suggest that site-specific normative data must be collected for sound field rooms if clinicians wish to use two or more directional speech maskers during routine sound field testing.
A phoneme segmentation method based on the analysis of discrete wavelet transform spectra is described. The localization of phoneme boundaries is particularly useful in speech recognition, as it enables more accurate acoustic models: the length of phonemes provides additional information for parametrization. Our method relies on the values of power envelopes and their first derivatives in six frequency subbands. Specific scenarios that are typical of phoneme boundaries are searched for. Discrete times at which such events occur are noted and graded using a distribution-like event function, which represents the change of the energy distribution in the frequency domain; the exact definition of this function is given in the paper. The final decision on the localization of boundaries is made by analyzing the event function, so boundaries are extracted using information from all subbands. The method was developed on a small set of hand-segmented Polish words and tested on another, larger corpus containing 16 425 utterances. Recall and precision measures, adapted with fuzzy sets specifically to assess the quality of speech segmentation, yielded an F-score of 72.49%.
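The event-function idea above — subband power envelopes, their first derivatives, and a final decision over all subbands — can be illustrated with a minimal numpy sketch. This assumes a Haar wavelet cascade and illustrative names (`haar_dwt`, `event_function`); the actual subband definitions and grading rules are those specified in the paper, not reproduced here:

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar DWT: approximation and detail coefficients."""
    x = x[: len(x) // 2 * 2]
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def event_function(signal, levels=6, smooth=8):
    """Sum, over subbands, of the absolute first derivative of each
    subband's smoothed power envelope, resampled to a common time axis."""
    a = np.asarray(signal, dtype=float)
    n = len(a)
    ev = np.zeros(n)
    for _ in range(levels):
        a, d = haar_dwt(a)  # detail coefficients = one subband
        power = np.convolve(d ** 2, np.ones(smooth) / smooth, mode="same")
        deriv = np.abs(np.diff(power, prepend=power[0]))
        # stretch this subband's time axis back to the signal's length
        ev += np.interp(np.linspace(0, 1, n),
                        np.linspace(0, 1, len(deriv)), deriv)
    return ev  # peaks mark candidate phoneme boundaries
```

Candidate boundaries would then be taken at local maxima of the returned event function, e.g. where a sudden shift of energy between subbands produces a sharp peak.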
The article reports three experiments conducted to determine whether musicians possess a better ability to recognise the sources of natural sounds than non-musicians. The study was inspired by reports indicating that musical training develops not only musical hearing but also enhances various non-musical auditory capabilities. Recognition and detection thresholds were measured for recordings of environmental sounds presented in quiet (Experiment 1) and against a noise masker (Experiment 2). The listener’s ability to recognise sound sources was inferred from the recognition-detection threshold gap (RDTG), defined as the difference in signal level between the thresholds of sound recognition and sound detection. Contrary to what was expected from reports of enhanced auditory abilities of musicians, the RDTGs were not smaller for musicians than for non-musicians. In Experiment 3, detection thresholds were measured with an adaptive procedure comprising three interleaved stimulus tracks with different sounds. It was found that the threshold elevation caused by stimulus interleaving was similar for musicians and non-musicians. The lack of superiority of musicians over non-musicians in the auditory tasks explored in this study is explained in terms of a listening strategy known as the causal listening mode, which is a basis for auditory orientation in the environment.
Biometric identification systems, i.e. systems able to recognize humans by analyzing their physiological or behavioral characteristics, have gained a lot of interest in recent years. They can be used to raise the security level in certain institutions or can serve as a convenient replacement for PINs and passwords for regular users. Automatic face recognition is one of the most popular biometric technologies, widely used even by many low-end consumer devices such as netbooks. However, even the most accurate face identification algorithm would be useless if it could be cheated by presenting a photograph of a person instead of the real face. Therefore, proper liveness detection is extremely important. In this paper we present a method that differentiates between video sequences showing real persons and their photographs. First, we calculate the optical flow of the face region using the Farnebäck algorithm. Then, we convert the motion information into images and perform an initial data selection. Finally, we apply a Support Vector Machine to distinguish between real faces and photographs. The experimental results confirm that the proposed approach could be successfully applied in practice.
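The intuition behind the final classification stage is that a moved photograph produces a near-rigid (spatially uniform) flow field, while a live face moves non-rigidly. As a self-contained illustration — with synthetic flow fields and a hand-rolled hinge-loss linear classifier standing in for the Farnebäck flow and a full SVM; every name and parameter here is illustrative, not the authors' code — the decision stage might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_features(flow):
    """Summarize a (H, W, 2) flow field: rigid motion gives near-uniform
    flow (small spatial spread), non-rigid motion gives a varying field."""
    mag = np.linalg.norm(flow, axis=-1)
    return np.array([mag.mean(), mag.std(),
                     flow[..., 0].std(), flow[..., 1].std()])

def synthetic_flow(rigid):
    """Toy stand-in for a face-region optical flow field."""
    base = rng.normal(0, 1, 2)               # global head/photo motion
    field = np.tile(base, (16, 16, 1)).astype(float)
    if not rigid:                            # local, non-rigid facial motion
        field += rng.normal(0, 0.5, (16, 16, 2))
    return field + rng.normal(0, 0.05, (16, 16, 2))  # sensor noise

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Sub-gradient descent on the hinge loss (a minimal linear SVM)."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:
                w += lr * (yi * xi - lam * w); b += lr * yi
            else:
                w -= lr * lam * w
    return w, b

# +1 = photograph (rigid flow), -1 = real face (non-rigid flow)
X = np.array([flow_features(synthetic_flow(rigid=(i % 2 == 0)))
              for i in range(200)])
y = np.array([1 if i % 2 == 0 else -1 for i in range(200)])
w, b = train_linear_svm(X, y)
accuracy = float(np.mean(np.sign(X @ w + b) == y))
```

In practice the flow field would come from `cv2.calcOpticalFlowFarneback` on consecutive frames of the detected face region, and a kernel SVM would replace the linear sketch.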
The paper presents an application of a differential electronic nose to dynamic (on-line) volatile measurement. First, we compare a classical nose employing a single sensor array with its extension, a differential nose containing two sensor arrays working in differential mode. We show that the differential nose performs better under changing environmental conditions, especially temperature, and performs well in the dynamic mode of operation. We demonstrate its application in the recognition of different brands of tobacco.
This article focuses on the emotionality of belonging among European Union (EU) citizens in the context of the United Kingdom’s (UK) 2016 referendum and its result in favour of the UK leaving the EU, commonly referred to as Brexit. Drawing from testimonies of EU27 citizens in the UK (mainly mid- to long-term residents) published in a book and on blog and Twitter accounts by the not-for-profit and non-political initiative, the ‘In Limbo Project’, it explores a range of emotions which characterise the affective impact of Brexit and how they underpin two key processes disrupting the sense of belonging of EU citizens: the acquisition of ‘migrantness’ and the non-recognition of the contributions and efforts made to belong. The resulting narratives are characterised by senses of ‘unbelonging’, where processes of social bonding and membership are disrupted and ‘undone’. These processes are characterised by a lack of intersubjective recognition in the private, legal and communal spheres, with ambivalent impacts on EU citizens’ longer-term plans to stay or to leave and wider implications for community relations in a post-Brexit society.
Today’s human-computer interaction systems have a broad variety of applications in which automatic human emotion recognition is of great interest. The literature contains many different, more or less successful, forms of such systems. This work emerged as an attempt to clarify which speech features are the most informative, which classification structure is the most convenient for this type of task, and the degree to which the results are influenced by database size and quality and by the cultural characteristics of a language. The research is presented as a case study on Slavic languages.
This paper reviews biometric person identification based on features extracted from retinal fundus images. Retina recognition is claimed to be the best person identification method among biometric recognition systems, as the retina is practically impossible to forge; it is considered the most stable, reliable, and secure of all biometric modalities, since the retinal pattern is both unique and stable. The features used in the recognition process are either blood vessel features or non-blood vessel features, the vascular pattern being the feature most researchers utilize for retina-based person identification. This authentication process involves pre-processing, feature extraction, and feature matching. Bifurcation and crossover points are the most widely used blood vessel features; non-blood vessel features include luminance, contrast, and corner points. This paper summarizes and compares the different retina-based authentication systems. Researchers have tested their methods on publicly available databases such as DRIVE, STARE, VARIA, RIDB, ARIA, AFIO, DRIDB, and SiMES. Quantitative measures such as accuracy, recognition rate, false rejection rate, false acceptance rate, and equal error rate are used to evaluate the performance of the different algorithms. The DRIVE database yields 100% recognition for most of the methods; on the remaining databases, recognition accuracy exceeds 90%.
A variety of algorithms allows gesture recognition in video sequences. Such algorithms are of interest to hearing-impaired people, since they offer a great degree of self-sufficiency in communicating with non-signers without the need for interpreters. The current state of the art in this domain is capable of either real-time recognition of sign language in low-resolution videos or non-real-time recognition in high-resolution videos. This paper proposes a novel approach to real-time recognition of fingerspelled American Sign Language (ASL) alphabet letters in ultra-high-definition (UHD) video sequences. The proposed approach is based on adaptive Laplacian of Gaussian (LoG) filtering with local extrema detection using the Features from Accelerated Segment Test (FAST) algorithm, classified by a Convolutional Neural Network (CNN). The recognition rate of our algorithm was verified on real-life data.
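The full pipeline (adaptive LoG, FAST, CNN) is beyond an abstract-level sketch, but the core LoG filtering with local extrema detection can be illustrated in numpy alone. Kernel size and threshold choices below are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def log_kernel(sigma):
    """Laplacian-of-Gaussian kernel, zero-mean so flat regions give no response."""
    size = int(6 * sigma) | 1  # odd width covering about +/- 3 sigma
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    k = -(1 - r2 / (2 * sigma ** 2)) * np.exp(-r2 / (2 * sigma ** 2)) \
        / (np.pi * sigma ** 4)
    return k - k.mean()

def log_response(img, sigma):
    """Circular FFT convolution of the image with the LoG kernel."""
    k = log_kernel(sigma)
    size = k.shape[0]
    pad = np.zeros_like(img, dtype=float)
    pad[:size, :size] = k
    pad = np.roll(pad, (-(size // 2), -(size // 2)), axis=(0, 1))  # center kernel
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(pad)))

def local_extrema(resp, rel_thresh=0.5):
    """Pixels whose |response| beats all 8 neighbours and a global threshold."""
    m = np.abs(resp)
    keep = m > rel_thresh * m.max()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                keep &= m >= np.roll(np.roll(m, dx, axis=0), dy, axis=1)
    return np.argwhere(keep)  # (row, col) candidate keypoints
```

In the approach described above, such candidate points would instead be refined with FAST and fed to the CNN for letter classification.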
Perception takes into account the costs and benefits of possible interpretations of incoming sensory data. This should be especially pertinent to threat recognition, where minimising the cost of missing a real threat is of primary importance. We tested whether the recognition of threats has special characteristics that adapt this process to the task it fulfils. Participants were presented with images of threats and visually matched neutral stimuli, distorted by varying levels of noise. We found a threat superiority effect and a liberal response bias. Moreover, increasing the level of noise degraded the recognition of neutral images to a greater extent than that of threatening images. In summary, recognising threats is special in that it is more resistant to noise and to decline in stimulus quality, suggesting that threat recognition is a fast ‘all or nothing’ process in which the presence of a threat is either confirmed or negated.
This paper discusses five models and a methodology for constructing classifiers capable of recognizing, in real time, the type of fuel injected into a diesel engine cylinder with an accuracy acceptable in practical technical applications. Experimental research was carried out on a dynamic engine test facility. The in-cylinder and injection-line pressure signals of an internal combustion engine powered by mineral fuel, biodiesel, or blends of these two fuel types were evaluated using the vibro-acoustic method. Computational intelligence methods such as classification trees, particle swarm optimization, and random forests were applied.