The goal of feature extraction is to characterize data by measurements whose values are very similar for objects in the same class and very different for objects in different classes. As well as providing discriminatory information, one of the most important functions of feature extraction is dimensionality reduction of the data. In this work, several features of the respiratory signal are extracted and used for apnea detection. Feature extraction plays a very important role, since the classification is based entirely on the values of the extracted features. Here, feature extraction is carried out through signal processing, a field that covers the theory, algorithms, architecture, implementation, and applications related to processing information contained in many different formats, broadly designated as signals.
Signal processing deals with operations on, or analysis of, signals in both discrete and continuous time, and it is used in systems engineering, electrical engineering and applied mathematics. Signals are time-varying or spatially varying physical quantities that carry information; they include sensor data, sound, time-varying measurement values, and images. Electrocardiograms, control system signals and telecommunication transmission signals are some of the many examples of signals that are widely processed in this area.
Here, second-order autoregressive (AR) modeling is used for the estimation of parameters. Since the amplitude of the respiratory signal is not constant and varies over time, the features cannot be predicted directly. The respiratory signal is therefore modeled by a second-order AR equation and the model coefficients are calculated. The features of the respiratory signal are then extracted using these coefficients.
3.2 BLOCK DIAGRAM
[Block diagram. Blocks shown: respiratory signal (MIT-BIH database); autoregressive (AR) modelling by the least squares method and the Burg method; feature extraction of energy index (EI), respiration rate (FZX), dominant frequency (FAR) and strength of dominant frequency (STR); classified features]
3.3 FEATURES
Feature extraction reduces the amount of resources required to describe a large set of data accurately. The fundamental features of the respiratory signal provide numerical values that are compared with threshold values to produce the classification result.
The fundamental features of respiratory signals are:
1. Energy Index (EI)
2. Respiration frequency (FZX)
3. Dominant frequency estimated by AR modeling (FAR)
4. Strength of the dominant frequency estimated by AR modeling (STR)
To compute these features, the mean value was first removed from a given section of respiration recordings.
3.3.1 Energy Index (EI)
Energy index is the maximum amount of energy present in the signal.
Given a continuous-time signal f(t), the energy contained over a finite time interval is defined as follows:
$$E(T_1, T_2) = \int_{T_1}^{T_2} |f(t)|^2 \, dt, \quad T_2 > T_1$$ (3.1)
$$E_f = \int_{-\infty}^{\infty} |f(t)|^2 \, dt$$ (3.2)
Equation (3.1) defines the energy contained in the signal over the time interval from T1 to T2, while equation (3.2) defines the total energy contained in the signal. If the total energy of a signal is finite and non-zero, the signal is classified as an energy signal. Typically, signals that are not periodic turn out to be energy signals.
The equation for computing the energy index is
$$EI = \frac{1}{N} \sum_{n=0}^{N-1} x(n)^2$$ (3.3)
where N is the total number of data samples.
The mean of a signal x is defined as the average value of its samples:
$$\mu_x = \frac{1}{N} \sum_{n=0}^{N-1} x(n)$$ (3.4)
The total energy of a signal x is defined as the sum of the squared moduli of its samples:
$$E_x = \sum_{n=0}^{N-1} |x(n)|^2$$ (3.5)
Energy is the ability to do work. In digital signal processing, physical units are routinely discarded and the signals are renormalized.
The average power of a signal x is defined as the energy per sample:
$$P_x = \frac{E_x}{N} = \frac{1}{N} \sum_{n=0}^{N-1} |x(n)|^2$$ (3.6)
When x is real, the average power is also called the mean square.
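The quantities defined in this subsection can be illustrated with a short Python sketch. It assumes that the energy index is the mean square of the mean-removed segment; the function names and the sample values are illustrative only and are not taken from the original program.

```python
import math

def remove_mean(x):
    """Subtract the sample mean so that the segment is centred on zero."""
    mu = sum(x) / len(x)
    return [v - mu for v in x]

def total_energy(x):
    """Sum of the squared moduli of the samples."""
    return sum(abs(v) ** 2 for v in x)

def average_power(x):
    """Energy per sample (the mean square when x is real)."""
    return total_energy(x) / len(x)

def energy_index(segment):
    """Assumption: EI is the average power of the mean-removed segment."""
    return average_power(remove_mean(segment))

# Illustrative segment: the mean is 2 and every deviation is +1 or -1.
segment = [1.0, 3.0, 1.0, 3.0]
ei = energy_index(segment)
baseline = math.sqrt(ei)  # square root of EI, used later as the crossing baseline
print(ei, baseline)       # prints 1.0 1.0
```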
3.3.2 Respiration Rate (FZX)
It is defined as the number of breaths a person takes during one minute. The average respiratory rate of a healthy adult at rest is 12-18 breaths per minute.
Average respiratory rates by age:
AGE                 BREATHS PER MINUTE
Newborns            30-40
Less than 1 year    30-40
1-3 years           23-35
3-6 years           20-30
6-12 years          18-26
12-17 years         12-20
Adults              8-20
Respiration frequency (FZX) was determined by counting the number of times that x(n) crosses a baseline, which is defined as the square root of EI.
FZX= (3.7)
This is also known as zero crossing, a term used in electronics, mathematics, and image processing.
Fig 3.1 Representation of Zero crossing
In mathematical terms, a zero crossing is a point at which the sign of a function changes from positive to negative or vice versa, represented by the graph crossing the zero value. Counting zero crossings is a method used in speech processing to estimate the fundamental frequency of speech. The interval between zero crossings gives a good estimate of the signal's frequency.
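A minimal Python sketch of crossing-based frequency estimation is given here. It is not the original implementation: it assumes that a roughly periodic signal crosses the baseline twice per cycle, so the frequency is taken as the crossing count divided by twice the segment duration. The function names and the test signal are hypothetical.

```python
import math

def count_crossings(x, baseline=0.0):
    """Count sign changes of (x - baseline) between consecutive samples."""
    crossings = 0
    prev = x[0] - baseline
    for v in x[1:]:
        cur = v - baseline
        if prev * cur < 0:
            crossings += 1
        if cur != 0:
            prev = cur  # samples lying exactly on the baseline are skipped
    return crossings

def crossing_frequency(x, fs, baseline=0.0):
    """Frequency estimate in Hz, assuming two crossings per cycle."""
    duration = len(x) / fs  # segment length in seconds
    return count_crossings(x, baseline) / (2.0 * duration)

# Illustrative 0.25 Hz sinusoid sampled at 10 Hz for 40 seconds.
fs = 10.0
x = [math.sin(2 * math.pi * 0.25 * n / fs + 0.1) for n in range(400)]
rms = math.sqrt(sum(v * v for v in x) / len(x))  # square root of the mean square
print(crossing_frequency(x, fs))       # zero as the baseline
print(crossing_frequency(x, fs, rms))  # square root of EI as the baseline
```

Both estimates should lie close to 0.25 Hz; a crossing that falls near the edge of the segment can be missed, which is one reason to use segments spanning several breaths.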
3.3.3 Dominant Frequency (FAR)
In order to obtain the features FAR and STR, the coefficients of a second-order AR model have to be estimated. The respiration signal can be modeled as a second-order autoregressive process as follows:
x(n) = a1x(n-1) + a2x(n-2) + e(n) (3.8)
where e(n) is the prediction error, i.e., the error between the actual value and the predicted value, and {a1, a2} are the AR model coefficients.
The frequency that occurs most often in a signal is called the dominant frequency of the signal. Using the second-order autoregressive model coefficients, one can determine the dominant frequency and the signal regularity strength as follows:
Dominant frequency (Freq AR) = Freq AR = arctan (3.9)
where fs is the sampling frequency and arctan gives the arc tangent of a1/a2, taking into account which quadrant the point (a1, a2) lies in. A sampling frequency of 250 Hz was used for the analysis. In addition to the features derived above, the average energy of a respiration segment is also calculated.
3.3.4 Strength of dominant Frequency (STR)
The AR coefficients were used to form a second-order auxiliary polynomial, and FAR and STR were determined from the location of its pair of complex conjugate roots, that is,
FAR = sampling frequency × angle / 360
STR = distance from the origin
where the angle of the root is measured in degrees. FAR and STR serve the same purpose as the power spectrum usually does, indicating the dominant frequency and its corresponding power level.
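The root-based definitions of FAR and STR can be sketched in Python as follows. The sketch assumes that the auxiliary polynomial is z^2 - a1 z - a2, which follows from the sign convention of equation (3.8); the function name and the example coefficients are illustrative only.

```python
import cmath
import math

def far_and_str(a1, a2, fs):
    """Return (FAR, STR) from the upper root of z^2 - a1*z - a2 = 0.

    Assumption: x(n) = a1*x(n-1) + a2*x(n-2) + e(n), as in equation (3.8).
    """
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)  # imaginary when the roots are complex
    root = (a1 + disc) / 2.0               # root with non-negative imaginary part
    angle_deg = math.degrees(cmath.phase(root))
    far = fs * angle_deg / 360.0           # FAR = sampling frequency * angle / 360
    strength = abs(root)                   # STR = distance of the root from the origin
    return far, strength

# Illustrative coefficients for a pole pair of radius 0.9 at 25 Hz, with fs = 250 Hz.
fs = 250.0
theta = 2.0 * math.pi * 25.0 / fs
a1 = 2.0 * 0.9 * math.cos(theta)
a2 = -(0.9 ** 2)
print(far_and_str(a1, a2, fs))  # approximately (25.0, 0.9)
```

When a1^2 + 4 a2 is not negative the roots are real, the angle is 0 or 180 degrees, and the model describes no oscillation.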
The classification of the signal is based on the parameters derived above. Of the other features extracted, the respiration frequency is obtained with a modified zero-crossing algorithm, and the thresholds must be properly initialized to allow accurate classification.
Signal strength (MagAR) = MagAR = (3.10)
3.4 AUTO REGRESSIVE (AR) MODELLING
A model based on both the inputs and the outputs of a system is called an autoregressive moving-average (ARMA) model. A model that depends only on the previous outputs of the system is called an autoregressive (AR) model, and a model that depends only on the inputs of the system is called a moving-average (MA) model. In an AR model, the previous outputs are therefore the main factor in finding the present output.
The autoregressive model predicts an output y[n] of a system from its previous outputs (y[n-1], y[n-2], …); when the inputs (x[n], x[n-1], …) are also used, the model becomes an ARMA model. It is one of a group of linear prediction formulas.
It is known as an infinite impulse response (IIR) filter or an all-pole filter in filter design, and as a maximum entropy model in physics applications.
The definition used here is
$$x_n = \sum_{i=1}^{p} a_i x_{n-i} + \epsilon_n$$
where a_i are the autoregression coefficients, x_n is the series, and p is the order of the filter, which is generally very much less than the length of the series. The noise term or residue, ε_n, is mostly assumed to be Gaussian white noise. The current term of the series is estimated as a linear weighted sum of the previous terms in the series, and the weights are the autoregression coefficients. AR models are widely used for the following reasons:
a) Each type of process (MA, AR and ARMA) can be converted to other types.
b) An AR model can be found by solving a linear set of equations, unlike the others.
c) An AR spectrum, calculated from a signal of length N·T, can have much better frequency resolution than the 1/(N·T) of classical estimators.
d) Under certain circumstances, an AR model for Pss (w) can maximize entropy.
e) An AR model can have far fewer coefficients than the corresponding MA model, just as a Butterworth IIR filter has far fewer coefficients than an FIR filter of similar performance.
AR spectral estimation gives a very significant improvement in frequency resolution compared to the traditional periodogram method as implemented by the FFT. The estimated AR spectrum is a continuous function of frequency, and it can be evaluated numerically at any number of frequencies, uniformly spaced or otherwise, in the interval 0 ≤ f ≤ 0.5fs, where fs is the sampling frequency.
The AR model framework assumes that an all-pole linear filter driven by a white noise signal describes the generation of the signal under consideration. The AR model therefore specifies the shape of the signal's spectrum, and it is used for analyzing stationary stochastic processes in applications such as radar, geophysics and economics.
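As an illustration of evaluating an AR spectrum at arbitrary frequencies, the following Python sketch uses the standard all-pole form P(f) = σ² / |1 - Σ a_k exp(-j2πfk/fs)|², with the sign convention of equation (3.8). The function name, the noise variance and the coefficients are assumptions made for the example.

```python
import cmath
import math

def ar_spectrum(coeffs, noise_var, fs, freqs):
    """Evaluate the all-pole AR spectrum at each frequency in freqs (Hz).

    coeffs = [a1, a2, ...] for x(n) = a1*x(n-1) + a2*x(n-2) + ... + e(n).
    """
    spectrum = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs  # normalized angular frequency
        denom = 1.0 - sum(a * cmath.exp(-1j * w * (k + 1))
                          for k, a in enumerate(coeffs))
        spectrum.append(noise_var / abs(denom) ** 2)
    return spectrum

# Illustrative second-order model with a pole pair of radius 0.95 near 25 Hz.
fs = 250.0
theta = 2.0 * math.pi * 25.0 / fs
coeffs = [2.0 * 0.95 * math.cos(theta), -(0.95 ** 2)]
freqs = [0.5 * i for i in range(251)]  # any grid within 0 <= f <= 0.5*fs
spec = ar_spectrum(coeffs, 1.0, fs, freqs)
peak_freq = freqs[spec.index(max(spec))]
print(peak_freq)  # lies close to the pole frequency
```

The frequency grid can be made as fine as desired, since the spectrum is a closed-form function of the coefficients rather than a set of FFT bins.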
3.4.1 TECHNIQUES USED FOR AR COEFFICIENTS CALCULATION
AR coefficients can be computed by a number of techniques; the two main categories are:
Least squares method
Burg method
3.4.1.1 Least Squares Method
The commonly used least squares method is based on the Yule-Walker equations and involves the use of autocorrelation or covariance estimates. In some special cases, however, Yule-Walker estimation leads to poor parameter estimates, even for moderately sized data samples, and it may lead to an unstable model.
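For a second-order model the Yule-Walker equations reduce to a 2×2 linear system in the autocorrelation values r0, r1 and r2. The Python sketch below assumes the biased autocorrelation estimate and the sign convention of equation (3.8); the function names and the test signal are illustrative and do not come from the original program.

```python
import math

def autocorr(x, lag):
    """Biased autocorrelation estimate at the given lag."""
    n = len(x)
    return sum(x[i] * x[i - lag] for i in range(lag, n)) / n

def yule_walker_ar2(x):
    """Solve r1 = a1*r0 + a2*r1 and r2 = a1*r1 + a2*r0 for (a1, a2)."""
    r0, r1, r2 = autocorr(x, 0), autocorr(x, 1), autocorr(x, 2)
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2

# Illustrative test signal: a sinusoid satisfies x(n) = 2cos(w)x(n-1) - x(n-2).
w = 0.2 * math.pi
x = [math.cos(w * n) for n in range(2000)]
a1, a2 = yule_walker_ar2(x)
print(a1, a2)  # close to 2*cos(w) and -1
```

The small deviation from the exact values comes from the finite record length, which biases the autocorrelation estimates.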
3.4.1.2 Burg Method
The Burg algorithm estimates the AR parameters by determining the reflection coefficients k that minimize the sum of the forward and backward residuals. The algorithm extends to segmented data by estimating the reflection coefficients so as to minimize the sum of the forward and backward residuals of all segments taken together. The weighted Burg algorithm additionally allows segments of different amplitudes to be combined.
The Burg algorithm finds a set of all-pole model parameters that minimizes the sum of the squares of the forward and backward prediction errors. To ensure that the model is stable, this minimization is performed sequentially with respect to the reflection coefficients. Since the Burg algorithm does not apply a window to the data, its estimates of the autoregressive parameters are more accurate than those obtained with the autocorrelation method.
Burg's technique has the following advantages:
It yields a stable AR model.
It is computationally efficient.
It has high frequency resolution.
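The AR coefficients can be calculated using Burg's algorithm as follows, for data samples x_0, …, x_N and with a_0 = 1:
Choose m, the number of wanted coefficients.
Initialize A_0 = [1].
Using
$$f_k(n) = \sum_{i=0}^{k} a_i x_{n-i} \quad \text{and} \quad b_k(n) = \sum_{i=0}^{k} a_i x_{n+i},$$
initialize all f_0(n) = b_0(n) = x_n.
Using
$$F_k = \sum_{n=k}^{N} \Big( x_n + \sum_{i=1}^{k} a_i x_{n-i} \Big)^2 = \sum_{n=k}^{N} \Big( \sum_{i=0}^{k} a_i x_{n-i} \Big)^2 = \sum_{n=k}^{N} \big( f_k(n) \big)^2$$
and
$$B_k = \sum_{n=0}^{N-k} \Big( x_n + \sum_{i=1}^{k} a_i x_{n+i} \Big)^2 = \sum_{n=0}^{N-k} \Big( \sum_{i=0}^{k} a_i x_{n+i} \Big)^2 = \sum_{n=0}^{N-k} \big( b_k(n) \big)^2,$$
compute F_0 and B_0.
Using $$D_k = F_k - f_k(k)^2 + B_k - b_k(N-k)^2,$$ compute D_0.
For k from 0 to m-1:
    Calculate µ, where µ is the reflection coefficient, using
    $$\mu = \frac{-2 \sum_{n=0}^{N-k-1} f_k(n+k+1) \, b_k(n)}{D_k}$$
    Update A_{k+1} using $$a'_n = a_n + \mu \, a_{k+1-n}$$
    Update f_{k+1}(n) for n ∈ [k+1, N] using $$f_{k+1}(n) = f_k(n) + \mu \, b_k(n-k-1)$$
    Update b_{k+1}(n) for n ∈ [0, N-k-1] using $$b_{k+1}(n) = b_k(n) + \mu \, f_k(n+k+1)$$
    Update D_{k+1} using $$D_{k+1} = (1 - \mu^2) D_k - f_{k+1}(k+1)^2 - b_{k+1}(N-k-1)^2$$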
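The recursion above can be sketched in Python as follows. This is an illustrative implementation under the stated update rules, not the original MATLAB program; the function name and the test signal are assumptions. Note that the returned vector [1, a_1, ..., a_m] follows the convention x_n ≈ -(a_1 x_{n-1} + ... + a_m x_{n-m}), so the coefficients of equation (3.8) are obtained by changing the sign of a_1 and a_2.

```python
import math

def burg(x, m):
    """Return [1, a_1, ..., a_m] estimated by Burg's recursion.

    Convention: x_n is predicted as -(a_1*x_{n-1} + ... + a_m*x_{n-m}).
    """
    N = len(x) - 1                 # samples are indexed 0..N
    a = [1.0] + [0.0] * m          # A_0 = [1], padded with zeros
    f = list(x)                    # f_0(n) = x_n
    b = list(x)                    # b_0(n) = x_n
    # D_0 = F_0 - f_0(0)^2 + B_0 - b_0(N)^2
    d = 2.0 * sum(v * v for v in x) - x[0] ** 2 - x[N] ** 2
    for k in range(m):
        # Reflection coefficient mu
        mu = -2.0 * sum(f[n + k + 1] * b[n] for n in range(N - k)) / d
        # Update A_{k+1}: a'_n = a_n + mu * a_{k+1-n}
        for n in range((k + 1) // 2 + 1):
            t1 = a[n] + mu * a[k + 1 - n]
            t2 = a[k + 1 - n] + mu * a[n]
            a[n], a[k + 1 - n] = t1, t2
        # Update forward and backward errors in place
        for n in range(N - k):
            t1 = f[n + k + 1] + mu * b[n]
            t2 = b[n] + mu * f[n + k + 1]
            f[n + k + 1], b[n] = t1, t2
        # Update D_{k+1}
        d = (1.0 - mu * mu) * d - f[k + 1] ** 2 - b[N - k - 1] ** 2
    return a

# Illustrative test: a sinusoid obeys x_n - 2cos(w)x_{n-1} + x_{n-2} = 0.
w = 0.2 * math.pi
x = [math.cos(w * n + 0.3) for n in range(2001)]
coeffs = burg(x, 2)
a1_doc, a2_doc = -coeffs[1], -coeffs[2]  # convert to the convention of (3.8)
print(coeffs, a1_doc, a2_doc)
```

The forward and backward error arrays are overwritten in place, so the memory use stays proportional to the length of the segment regardless of the model order.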
3.5 THRESHOLD SCHEME
The required thresholds can also be determined automatically by the following approach. Divide a typical normal respiration signal of interest into smaller segments and compute the average energy index.
Then 33% and 150% of the calculated energy index can be used as the low and the high energy thresholds, respectively. The normal breathing frequency of a human being is usually between 0.2 and 0.3 Hz, and the maximum frequency is unlikely to exceed 0.7-0.8 Hz. Hence 0.2 Hz and 0.7 Hz are used as the minimum and the maximum thresholds for the respiration rate.
Using the square root of the energy index as the baseline value for zero crossing, the number of times the signal crosses the baseline was recorded and the respiratory frequency was determined from it. A moving baseline was used to allow for changes in the mean respiration level.
The calculated energy index, respiration rate, dominant frequency and strength of the signal were compared with the set threshold values, and the signal was classified as normal respiration, apnea, or respiration with artifacts.
3.5.1 Threshold Values
FEATURES             RANGE
EI_MIN & EI_MAX      33% & 150% of average energy
FZX_MIN & FZX_MAX    0.2 Hz & 0.7 Hz of average rate
FAR_MIN & FAR_MAX    50% & 150% of dominant frequency
STR_MIN & STR_MAX    75% & 95% of average strength
Table 3.1 Threshold values used for classification of respiratory signal
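The threshold scheme of Table 3.1 can be sketched as a small Python function. The function name, the argument names and the dictionary layout are hypothetical; only the percentages and the 0.2 Hz and 0.7 Hz limits are taken from the table, and the input values in the example are made up for illustration.

```python
def derive_thresholds(ei_values, far_reference, str_values):
    """Build the threshold set of Table 3.1 from a normal respiration record.

    ei_values and str_values hold per-segment energy indices and strengths;
    far_reference is the dominant frequency of the normal record.
    """
    avg_ei = sum(ei_values) / len(ei_values)
    avg_str = sum(str_values) / len(str_values)
    return {
        "EI_MIN": 0.33 * avg_ei,
        "EI_MAX": 1.50 * avg_ei,
        "FZX_MIN": 0.2,  # Hz, fixed lower limit of the respiration rate
        "FZX_MAX": 0.7,  # Hz, fixed upper limit of the respiration rate
        "FAR_MIN": 0.50 * far_reference,
        "FAR_MAX": 1.50 * far_reference,
        "STR_MIN": 0.75 * avg_str,
        "STR_MAX": 0.95 * avg_str,
    }

# Made-up per-segment values, for illustration only.
thresholds = derive_thresholds([0.02, 0.04], 0.25, [0.9, 1.1])
for name in sorted(thresholds):
    print(name, thresholds[name])
```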
For the signal shown in , the threshold values obtained are
FEATURES             RANGE
EI_MIN & EI_MAX      33% & 150% of 0.0379
FZX_MIN & FZX_MAX    0.2 Hz & 0.7 Hz of 0.1948
FAR_MIN & FAR_MAX    50% & 150% of -41.4606
STR_MIN & STR_MAX    75% & 95% of 1.9822
Table 3.2 Threshold values for a real time respiratory signal
3.6 DATASETS
The datasets are created using the above-mentioned features, which are extracted from the respiratory signal. The polysomnography database is made available with a set of labels corresponding to each 30-second epoch of data. The labels define the events that occur within the epoch. The aim here is to distinguish between normal respiration and obstructive apnea; therefore, all samples labeled as either obstructive sleep apnea or normal are selected to construct the data set.
3.7 PROGRAM ROUTINE
The respiratory signal is analyzed and classified using the following procedure:
The human respiratory signal is taken as samples from the website www.physionet.com.
The respiratory signal is classified using a program designed in MATLAB 7.9.0.
The samples of the respiratory signal are given as input to the classifying program.
A mathematical model of the input respiratory signal is formed using second-order autoregressive modeling.
Four features are extracted from the signal to determine the threshold values.
Of these four features, the energy index and the respiratory rate are calculated using (1) and (5).
The other two features, the dominant frequency and the strength of the dominant frequency, are calculated using Burg's algorithm as mentioned in (6) and (7).
The threshold values are calculated from the features obtained above with reference to threshold table 1.
For each cycle of the input respiratory signal, the four features are extracted again and compared with the threshold values.
Based on this comparison, the respiratory signal is classified.
3.8 FLOWCHART
[Flowchart. Start; compute EI, FZX, FAR, STR. Decision nodes: eiy>EI_LOW; fzx1<0.2; str1>STR_HIGH, 0.2<fzx1<0.7, far1>0.2; str1>STR_LOW, eiy>EI_HIGH; 0.2>far1>0, 0.7<FZX; eiy>EI_HIGH, str1<STR_LOW; eiy<EI_LOW, fzx1<0.2. Outcomes: Normal, Apnea, Respiration with artifacts, Artifacts, Unclassified.]
3.9 ALGORITHM FOR FLOWCHART
Step 1 : Start the program.
Step 2 : Calculate EI, FZX, FAR and STR.
Step 3 : Check whether eiy is greater than EI_LOW.
If yes, check the condition str1>STR_HIGH, 0.2<fzx1<0.7, far1>0.2.
Else check fzx1<0.2.
Step 4 : If the condition fzx1<0.2 is true, then it is apnea.
Else it is unclassified.
Step 5 : If str1>STR_HIGH, 0.2<fzx1<0.7, far1>0.2 is satisfied, then the signal is normal.
Else check the next condition str1>STR_LOW, eiy>EI_HIGH.
Step 6 : If the above condition is true, then check the condition 0.2>far1>0, 0.7<fzx1 to confirm that the signal is a respiratory signal with artifacts.
Else it is an artifact.
Step 7 : If eiy>EI_HIGH, str1<STR_LOW, then it is an artifact.
If not, then check eiy<EI_LOW, fzx1<0.2.
Step 8 : If eiy<EI_LOW, fzx1<0.2 is true, then it is apnea; else it is normal.
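One possible reading of Steps 3 to 8 is sketched below in Python. Several points are assumptions rather than statements of the original program: conditions listed together in one step are joined with a logical AND, Steps 7 and 8 are reached when the second condition of Step 5 fails, and the thresholds are passed in a dictionary whose keys reuse the names from the steps. The threshold values in the example are made up.

```python
def classify(eiy, fzx1, far1, str1, th):
    """Classify one respiration cycle from its four features.

    th is a dictionary with the keys EI_LOW, EI_HIGH, STR_LOW and STR_HIGH.
    """
    if eiy > th["EI_LOW"]:                                   # Step 3
        if str1 > th["STR_HIGH"] and 0.2 < fzx1 < 0.7 and far1 > 0.2:
            return "normal"                                  # Step 5
        if str1 > th["STR_LOW"] and eiy > th["EI_HIGH"]:     # Step 6
            if 0.0 < far1 < 0.2 and fzx1 > 0.7:
                return "respiration with artifacts"
            return "artifact"
        if eiy > th["EI_HIGH"] and str1 < th["STR_LOW"]:     # Step 7
            return "artifact"
        # Step 8 as written; under the chaining assumed here eiy > EI_LOW
        # already holds, so the apnea test on this path cannot succeed.
        if eiy < th["EI_LOW"] and fzx1 < 0.2:
            return "apnea"
        return "normal"
    if fzx1 < 0.2:                                           # Step 4
        return "apnea"
    return "unclassified"

# Made-up thresholds and feature values, for illustration only.
th = {"EI_LOW": 0.0125, "EI_HIGH": 0.0569, "STR_LOW": 1.49, "STR_HIGH": 1.88}
print(classify(0.030, 0.25, 0.30, 1.95, th))  # normal
print(classify(0.005, 0.10, 0.30, 0.50, th))  # apnea
print(classify(0.080, 0.90, 0.10, 1.60, th))  # respiration with artifacts
```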
CONCLUSION:
This work focused on automatic feature extraction and classification of respiratory signals for the detection of sleep apnea and motion artifacts. Respiratory signal features are used for the classification, which is carried out using neural networks.