This paper presents the current stage of development of EA-MOSGWA – a tool for identifying causal genes in Genome Wide Association Studies (GWAS). The main goal of GWAS is to identify chromosomal regions which are associated with a particular disease (e.g. diabetes, cancer) or with some quantitative trait (e.g. height or blood pressure). To this end, hundreds of thousands of Single Nucleotide Polymorphisms (SNPs) are genotyped. The aim is then to identify as many SNPs as possible which are associated with the trait in question, while at the same time minimizing the number of false detections.
The software package MOSGWA detects SNPs via variable selection using the criterion mBIC2, a modified version of the Schwarz Bayesian Information Criterion. MOSGWA tries to minimize mBIC2 using stepwise selection methods, whereas EA-MOSGWA applies advanced evolutionary algorithms to achieve the same goal. We present results from an extensive simulation study in which we compare the performance of EA-MOSGWA under different parameter settings. We also consider using a clustering procedure to relax the multiple testing correction in mBIC2. Finally, we compare results from EA-MOSGWA with the original stepwise search from MOSGWA, and show that the newly proposed algorithm has good properties in terms of minimizing the mBIC2 criterion, as well as in minimizing the misclassification rate of detected SNPs.
Air core solenoids, possibly single layer and with significant spacing between turns, are often used to ensure low stray capacitance, as they are used as part of many sensors and instruments. Correct estimation of the stray capacitance is relevant both during design and for validating measurement results; the expected value is so low that it is influenced by any stray capacitance of the external measurement instrument. A simplified method is proposed that does not perturb the stray capacitance of the solenoid under test; the method is based on resonance with an external capacitor and on the use of a linear regression technique.
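One way to picture the resonance-plus-regression idea is sketched below. It is a minimal illustration and not necessarily the procedure of the paper: it assumes an ideal lumped LC model in which the stray capacitance C_s appears in parallel with the external capacitor C_ext, so that 1/f² = 4π²L(C_ext + C_s). Fitting a straight line of 1/f² against C_ext then gives the inductance from the slope and C_s from the ratio intercept/slope. The function names and the synthetic component values are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_stray_capacitance(c_ext, f_res):
    """Estimate (inductance, stray capacitance) from resonance data.

    Assumed model (not taken from the source): 1/f^2 = 4*pi^2*L*(C_ext + C_s).
    c_ext: external capacitances in farads; f_res: resonance frequencies in hertz.
    """
    inv_f_sq = [1.0 / f ** 2 for f in f_res]
    slope, intercept = fit_line(c_ext, inv_f_sq)
    inductance = slope / (4 * math.pi ** 2)
    c_stray = intercept / slope
    return inductance, c_stray

# Synthetic, noise-free data generated from hypothetical values.
L_TRUE = 100e-6        # 100 uH, hypothetical
C_STRAY_TRUE = 2e-12   # 2 pF, hypothetical
c_ext = [10e-12, 22e-12, 47e-12, 100e-12]
f_res = [1 / (2 * math.pi * math.sqrt(L_TRUE * (c + C_STRAY_TRUE))) for c in c_ext]

L_est, c_est = estimate_stray_capacitance(c_ext, f_res)
print(f"L = {L_est:.3e} H, C_stray = {c_est:.3e} F")
```

With real measurements the points scatter around the line, and the uncertainty of the intercept determines how well such a small capacitance can be resolved.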
In this paper, the correlations between air temperature and electricity demand obtained by linear regression and by the Wavelet Coherence (WTC) approach are presented for three different European countries. The results show a very close relationship between air temperature and electricity demand for the selected power systems. In addition, the WTC approach reveals interesting dynamics of the correlation between air temperature and electricity demand across the time-frequency space and provides useful information for a more complete understanding of the related consumption.
The infiltration process plays an important role in the water balance concept, particularly in runoff analysis, groundwater recharge, and water conservation. Hence, increasing knowledge of the infiltration process is essential for water managers seeking effective solutions to water resources problems. This study employed multiple linear regression for estimating infiltration rate, with soil properties as the predictor variables and measured infiltration rate as the response variable. Field measurements were conducted at sixteen points to obtain the infiltration rate using a double ring infiltrometer, together with the soil properties, namely soil porosity, silt, clay, and sand content, degree of saturation, and water content. The results showed that the measured infiltration rate had an average initial infiltration rate (f0) of 6.92 mm∙min–1 and a final infiltration rate (fc) of 1.49 mm∙min–1. Soil porosity and sand content showed a positive correlation with infiltration rate (0.842 and 0.639, respectively), while silt, clay, water content, and degree of saturation exhibited a negative correlation (–0.631, –0.743, –0.66 and –0.49, respectively). Three types of regression equations were established based on the soil properties used as predictor variables. A model performance analysis was conducted for each equation, and the results show that the equation with five predictor variables, fMLR_3 = –62.014 + 1.142 soil porosity – 0.205 clay – 0.063 sand – 0.301 silt + 0.07 soil water content, with R2 = 0.87 and a Nash–Sutcliffe efficiency of 0.998, gave the best result for estimating infiltration rate. The study found that soil porosity contributes most to the regression equation, which indicates its great influence in controlling soil infiltration behavior.
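The five-predictor equation reported above can be written as a small function to show how such a model is applied. This is only a sketch: the coefficients are those quoted in the abstract, but the units of the predictors are not stated there, so the function name, the argument names, and the sample values below are hypothetical and serve only to show the call.

```python
def f_mlr3(porosity, clay, sand, silt, water_content):
    """Evaluate the five-predictor regression equation quoted in the abstract.

    Predictor units are not given in the source; callers must use the same
    units as the original study for the result to be meaningful.
    """
    return (
        -62.014
        + 1.142 * porosity
        - 0.205 * clay
        - 0.063 * sand
        - 0.301 * silt
        + 0.07 * water_content
    )

# Hypothetical sample, for illustration only (not data from the study).
HYPOTHETICAL_SAMPLE = {
    "porosity": 60.0,
    "clay": 10.0,
    "sand": 40.0,
    "silt": 20.0,
    "water_content": 30.0,
}

estimate = f_mlr3(**HYPOTHETICAL_SAMPLE)
print(f"estimated infiltration rate: {estimate:.3f}")
```

In the equation as quoted, the porosity coefficient has the largest magnitude and is the only positive one besides the small water content term.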
The purpose of the work was to predict selected product parameters of the dry separation process using a pneumatic sorter. From the perspective of using coal for energy purposes, it is essential to determine output parameters such as ash content, moisture content, sulfur content, and calorific value. Prediction was carried out using selected machine learning algorithms that have proved effective in forecasting the output of various technological processes in which the relationships between process parameters are non-linear. The data used in the work came from dry separation experiments on coal samples. Multiple linear regression was used as the baseline predictive technique. The results showed that this technique was sufficient for predicting moisture and sulfur content. More complex machine learning algorithms, namely support vector machine (SVM) and multilayer perceptron neural network (MLP), were used and analyzed for ash content and calorific value. In addition, the k-means clustering technique was applied. The role of cluster analysis was to obtain additional information about the coal samples used as feed material. Combining the multilayer perceptron neural network (MLP) or the support vector machine (SVM) with k-means allowed for the development of a hybrid algorithm. This approach significantly increased the effectiveness of the predictive models and proved to be a useful tool in modeling the coal enrichment process.
In recent years, smog and poor air quality have become a growing environmental problem. There is a need to continuously monitor the quality of the air. The lack of selectivity is one of the most important problems limiting the use of gas sensors for this purpose. In this study, the selectivity of six amperometric gas sensors is investigated. First, the sensors were calibrated in order to find a correlation between the concentration level and sensor output. Afterwards, the responses of each sensor to single or multicomponent gas mixtures with concentrations from 50 ppb to 1 ppm were measured. The sensors were studied under controlled conditions, a constant gas flow rate of 100 mL/min and 50 % relative humidity. Single Gas Sensor Response Interpretation, Multiple Linear Regression, and Artificial Neural Network algorithms were used to predict the concentrations of SO2 and NO2. The main goal was to study different interactions between sensors and gases in multicomponent gas mixtures and show that it is insufficient to calibrate sensors in only a single gas.
Streamflow modelling is a very important process in the management and planning of water resources. However, complex processes associated with the hydro-meteorological variables, such as non-stationarity, non-linearity, and randomness, make streamflow prediction chaotic. The study developed multiple linear regression (MLR) and back propagation neural network (BPNN) models to predict the streamflow of the Wadi Hounet sub-basin in north-western Algeria using monthly hydrometric data recorded between July 1983 and May 2016. The climatological input data are rainfall (P) and reference evapotranspiration (ETo) on a monthly scale. The outcomes of both the BPNN and MLR models were evaluated using three statistical measures: the Nash–Sutcliffe efficiency coefficient (NSE), the coefficient of correlation (R) and the root mean square error (RMSE). Predictive results revealed that the BPNN model exhibited better performance and accuracy in the prediction of streamflow than the MLR model during both the training and validation phases. The outcomes demonstrated that BPNN-4 is the best performing model, with values of 0.885, 0.941 and 0.05 for NSE, R and RMSE, respectively. The highest NSE and R values and the lowest RMSE for both training and validation indicate the best network. Therefore, the BPNN model provides better prediction of the Hounet streamflow due to its capability to deal with complex nonlinear processes.
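The three evaluation measures named above have standard textbook definitions, sketched below in plain Python. The abstract does not give the formulas, so these are the usual forms rather than a statement of the exact variants used in the study; the function names and the short observed and simulated series are hypothetical.

```python
import math

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def pearson_r(observed, simulated):
    """Pearson coefficient of correlation between observed and simulated series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov / math.sqrt(var_o * var_s)

def rmse(observed, simulated):
    """Root mean square error, in the units of the data."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

# Hypothetical series, for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]

print(f"NSE  = {nse(observed, simulated):.3f}")
print(f"R    = {pearson_r(observed, simulated):.3f}")
print(f"RMSE = {rmse(observed, simulated):.3f}")
```

NSE equals 1 for a perfect match and falls to 0 when the model is no better than predicting the observed mean, which is why it is often reported alongside R: a high correlation alone does not rule out a systematic bias.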
Statistical analysis helps in better understanding the processes which take place in agricultural ecosystems. Particular attention should be paid to the processes of crop productivity formation under the influence of natural and anthropogenic factors. The goal of our study was to provide new theoretical knowledge about the dependence of vegetable crop productivity on water supply and heat income. The study was conducted under irrigated conditions in the semi-arid cold Steppe zone on the fields of the Institute of Irrigated Agriculture of NAAS, Kherson, Ukraine. We studied historical productivity data for the three most common vegetable crops in the region: potato, tomato, and onion. The crops were cultivated using the agrotechnology generally accepted in the region. Historical yield and meteorological data for the period 1990–2016 were used to develop the models of vegetable crop productivity. We used two approaches: development of pair linear models in three categories (“yield – water use”, “yield – sum of the effective air temperatures above 10°C”); and development of complex linear regression models taking into account such factors as total water use and temperature regime during the crops’ vegetation period. Pair linear models of crop productivity showed that the water use index has the highest effect on the yields of potato and onion (R2 of 0.9350 and 0.9689, respectively), and the temperature regime on the yield of tomato (R2 of 0.9573). The results of the pair analysis were confirmed by the multiple regression analysis, which revealed the same tendencies in crop yield formation depending on the studied factors.