Ecg Signal Compression Using Wavelet Foveation Biology Essay

Published: November 2, 2015 Words: 4446

Foveation principles suggested by natural vision systems enable the construction of a proper mask that may modulate the coefficients given by the Discrete Wavelet Transform of an ECG record. The mask is spatially selective and provides maximum accuracy around specific regions of interest. Subsequent denoising and coefficient quantization are further combined with efficient coding techniques such as SPIHT and JPEG2000 in order to provide high compression ratios at low reconstruction errors. Both 1D and 2D cases are considered, and experimental results reported on a number of MIT-BIH records show improved performances over existing solutions.

ECG signal compression has been one of the major biomedical research topics during the last decades. Triggered by the availability of new theoretical tools, technological advances, and e-health initiatives, a plethora of original solutions have been reported and comparative tests on publicly available databases were conducted. Transform-based methods using a broad range of tools such as wavelet analysis, Discrete Cosinus Transform (DCT) or Principal Components Analysis (PCA) have been preferred over direct methods dealing with the raw ECG time series such as AZTEC or CORTES, and have been complemented by further signal processing stages based on optimal quantization and entropy encoding. Special attention has been paid to assessing the clinical quality of the lossy compressed data, hence specific distortion measures have been defined [1, 2, 3] in addition to the classical Euclidean-based ones.

One common aspect shared by many of the top performant compression methods refers to the use of the Discrete Wavelet Transform (DWT). It has been widely used in ECG signal analysis since this type of signals (short high-frequency impulses (R wave) superimposed on slowly varying waveforms (P, T, U waves), including artifacts and noise) favors processing tools with variable time-frequency resolution. Reported results indicate that DWT not only enables reliable identification of specific points and segments characterizing the ECG waveforms, but may also yield high compression ratios since most of the signal energy is concentrated in a limited number of significant coefficients.

An interesting line of research focuses on transforming the original one-dimensional ECG waveforms into 2D information, followed by further processing using various image processing tools. Examples include multiresolution analysis, vector quantization, and JPEG2000 coding. In this context, the present paper proposes a novel compression scheme inspired by natural vision systems, based on combining DWT analysis with the construction of a space-scale selective mask obtained following retina-like foveation principles [4]. Basically, foveation implements the design of the natural retina that provides variable spatial resolution : it has a very large number of cells in the central region (the fovea) and a continually decreasing number of cells towards its periphery. As a consequence, the details present in the original image are kept unaltered along the foveated region, while they smoothly disappear towards the margins of the visual field (anyway, in order to get a better picture of "interesting" regions in the inspected scene, the eyes successively switch between several distinct foveal points). Examples of foveated 1D and 2D signals are given in Fig. 1.

Using variable spatial resolution by means of foveation yields reduced data dimensionality, which may be exploited within a compression framework. Previous results have shown the possibility of treating foveation as an integral operator that may be conveniently approximated in order to provide a mask that will modulate the DWT coefficients of the analyzed signal, for both 1D and 2D cases [4]. As a consequence, the number of significant DWT coefficients will drop, and subsequent quantization will further decrease their number. Finally, the output data is obtained after applying a general-purpose lossless compression technique or more sophisticated methods especially efficient within the wavelet analysis framework such as EZW or SPIHT [5, 6].

Fig. 1.a) Original image (512x512 pixels); b) Foveated image (foveal point placed in the center of the image, along the right eye); c) Original (solid line) and foveated (circle line) ECG signal (foveal point placed on the R peak).

Recent studies revealed that a cardiologist would spend most of the time inspecting specific regions along the ECG waveform [7], hence in order to preserve critical diagnose information we shall construct the mask by combining the contributions of several distinct foveal points, placed on the R peak, and the middle of the P and T waves, respectively. As such, we may preserve the durations, the amplitudes, and the shapes of clinically significant features that will ultimately contribute to assessing the quality of the compressed waveforms by means of various performance criteria such as Percent Root-Mean Square Difference (PRD), Compression Ratio (CR), or more sophisticated Weighted Diagnostic Distortion (WDD) measure [1]. It is interesting to note that some of the top performant segmentation procedures aiming at identifying the characteristic points of ECG signals have also been introduced within the wavelet analysis framework [8], although we may still consider complementary methods rooted in general pattern recognition theory that are able to detect specific regions-of-interest (ROI) [9, 10].

The paper is organized as follows: in Section 2 we introduce the key elements of the wavelet foveation principle for both 1D and 2D cases. We focus on design issues within an ECG compression framework, including specific preprocessing procedures and implementation aspects. Experimental results on a number of MIT-BIH records for both 1D and 2D cases are given in Section 3, while discussions and conclusions are finally outlined in Sections 4 and 5.

Wavelet foveation: principles, methods, and implementation

The block diagram of the proposed approach is indicated in Fig. 2. It includes a preprocessing module that splits the original waveform into separate single-beat segments, identifies the corresponding characteristic points, and performs a padding procedure in order to equalize the lengths of the individual segments to a common value (that should be chosen as a power-of-two to accommodate subsequent DWT analysis). For the 2D case, the single-beat segments should be arranged into a matrix format, and specific sorting procedures will be applied to fully utilize both intra- and inter-correlation between the lines [11, 12]. Based on the characteristic points provided by the preprocessing stage, a separate module computes a (multi-) foveation mask, the parameters of which are mainly influenced by the widths of the regions-of-interest. For the 1D case we have a distinct mask for each individual segment (based on its specific PQRST complex), whereas in the 2D case the number of foveation points will drop significantly, since a single region-of-interest may cover a group of single-beat segments. We subsequently perform DWT decomposition of the ECG segments and modulate the resulting coefficients by multiplying those with the corresponding mask. As a result, the number of (masked) DWT coefficients will decrease considerably, and their number may further be reduced after applying one of the available denoising procedures. In order to obtain the final compressed information we may choose between the following alternatives:

a) quantize the masked DWT coefficients and apply a general-purpose lossless or lossy procedure. Anyway, we should keep in mind that both the amplitudes and the positions of the remaining wavelet coefficients are to be stored. Additionally, we should also transmit a small overhead indicating the actual length of each single-beat segment, and its mean value (if the segments have been previously zero-centered);

b) use an embedded scalar quantizer/compression method such as SPIHT that is especially suited to deal with wavelet-generated data (besides the original version of the algorithm we may even consider some of the recently introduced modified forms, available for both 1D and 2D cases [13, 14]).

The reconstruction procedure should first decompress the coded information, then perform inverse DWT in order to yield a foveated waveform (or image) to be compared against the original data.

Fig. 2. Block diagram of the wavelet foveation approach.

2.1 1D wavelet foveation

Going from a uniform resolution 1D or 2D signal to the fovea-like distribution exhibiting variable spatial resolution is called foveation. For the one-dimensional case, foveation of a signal has been modeled as space-variant smoothing according to [4]:

where defines a weighting function and is a smoothing function. The weight w(x) is modeled as and depends on three parameters: α is the rate by which the resolution decreases, γ is the fovea (the point with the highest resolution), and β is the foveal resolution (α and β must be non-negative). The smoothing function is normalized such that . Since we are typically interested in processing the available information using multiple foveal points, the global weighting function w(x) may be constructed from a set of individual ones {w1(x), w2(x), …} according to: .

From a more general perspective, foveation as given in equation may be also treated as an integral operator [4]:

where is the kernel of transformation T. Based on the ability of certain classes of wavelet bases in approximating integral operators [15], a number of important results have been established in [4] with direct application within the foveation framework. The main result states that we may compactly express the foveated waveform as:

where M represents a predetermined mask that is computed as a (diagonal) approximation of the two-dimensional DWT of the kernel function. Hence, once given the weighting function w(x), the smoothing function g(x), and the wavelet orthonormal basis, the values of the mask M can be precomputed. The foveated waveform can then simply be obtained by first computing the standard (uniform resolution) DWT transform, modulating it with the mask, and finally performing the inverse DWT.

Despite the relative simplicity of the approach, there are still a number of key points to be further discussed, including:

a) optimality of the wavelet basis: several studies oriented towards wavelet-based solutions for ECG signal compression have considered comparative analysis regarding the influence of the specific wavelet basis upon the compression performances [16, 17]. While not exhibiting dramatic different behavior from this perspective, we may still note the quality of the basis classes proposed by Villasenor [18], that were founded optimal for dealing with both one-dimensional and two-dimensional signals. Moreover, we should also note that since practical waveforms have usually compact support, biorthogonal basis may be preferred to orthogonal ones [16].

b) dealing with the end-effects: this aspect is shared by all transformations that imply periodic extension of finite-duration signals. We may typically choose between simple repetition of the given waveform (that is prone to errors if the amplitudes of the beginning and the end of the signal are quite different) and various versions of "mirroring", namely symmetric extensions applied at both edges of the data. Anyway, this procedure may be used only if the wavelets filters themselves are symmetric (that is common for biorthogonal wavelets but not for more familiar compactly supported orthogonal wavelets such as the Daubechies family [16]). This subject has been extensively studied in [19], and the analysis takes into consideration additional aspects such as the length of the analyzed waveform, and the type of transformation to be performed (direct or inverse).

c) denosing the (masked) DWT coefficients: since ECG signals carry significant levels of noise, that are typically related to wavelet coefficients having small amplitudes, we may use one of the many denoising methods described in the literature [20] in order to discard those. Options include various forms of hard or soft thresholding, whereas the threshold values may be set globally or scale-dependent. In the experiments to be reported in Section 3 we have used the Birgé-Massart method [21] that improves on the classical results of Donoho and Johnstone [22].

d) implementation aspects: the theoretical results rigorously presented in [4] are valid for the case where the weighting function takes the simple form . With reference to the more general definition , the same study reveals that a similar approximation of the masking function is still applicable for non-zero foveal point γ and foveal resolution β. Based on the properties of the wavelet transform it is further shown that computing the mask in the general case may be efficiently approximated by using a "look-up" procedure that amounts at storing the values of the mask for a set of distinct values of the α and β parameters. This makes the approach especially attractive from the implementation perspective in real-life applications, where it is expected that the foveal points (and associated parameters) change often and fast. Moreover, a further simplification is also suggested in [4], by considering a binary approximation of each entry of the mask after comparing it against a suitably chosen threshold.

2.2 2D wavelet foveation

One interesting line of research that has attracted much attention recently relies on transforming the original one-dimensional ECG waveforms into 2D images, and thus enabling further coding techniques to use both interbeat and intrabeat correlations. As such, classical general-purpose compression methods such as JPEG2000 or wavelet-oriented solutions such as SPIHT or EZW have shown impressive performance improvements over the one-dimensional case [11-13]. Typically, the 1D-2D reshaping is based on special preprocessing procedures involving cut-and-align successive single-beat intervals [23] (complemented with constant value padding), or period normalization [24]. Various improvements of the basic approaches have also been proposed in order to increase the correlation between successive lines of the resulting image [11, 12].

The foveation principle previously introduced for the one-dimensional case may be generalized for the 2D case by considering a two-dimensional smoothing kernel [4]:

and a corresponding two-dimensional weight function . Much similar to the 1D case, the kernel is also transformed via the (two-dimensional) DWT, showing again rapid decay away from the foveal point. The foveated image is also obtained using equation , whereas the actual expression of the 2D mask has to be computed distinctly along the diagonal, vertical, and horizontal directions, respectively. One key advantage of the method revealed in [4] indicates that the two-dimensional mask along those three directions may be obtained by taking products of two distinct one-dimensional masks according to:

As a consequence, the mask around each foveal point will have a star-shaped aspect, as illustrated in Fig. 3. An additional advantage is given by the fact that the widths of the weighting functions governing the masks along the two coordinates may be set independently, hence we may construct more general shaped masks.

Fig. 3. Example of foveation mask computed according to relations . Foveation points

were selected from Fig. 1a on the centers of the eyes, and the tip of the nose, respectively.

Experimental results

We have performed extensive computer simulations in order to validate the compression performances of the foveation approach. Tests were conducted using 10-minute long signals extracted from a number of records in the MIT-BIH Arrhythmia database [25], namely records 100, 101, 102, 103, 105, 106, 115, 117, 119, 202, 209, and 212. The data is sampled with 360 Hz, and is represented with 11 bits resolution. The records first undergo a preprocessing step that is merely driven by the 2D experiments. Specifically, we automatically detected the characteristic points of the ECG waveforms using the wavelet-based procedure introduced in [8]. It relies on an important theoretical result reported by S. Mallat that illustrates the possibility of characterizing the local shape of irregular structures by means of the evolution of DWT local maxima along successive dyadic scales [26]. A number of several such modulus-maxima based approaches have been reported [8], showing remarkable performances on several publicly available databases. Basically, a multiscale decomposition is first performed on the given ECG record, and afterwards the presence of min-max pairs of coefficients is identified on successive distinct resolution scales. Only pairs that are present on all scales are selected and further processed, the rest are treated as parasitic and eliminated, with the additional benefit of removing also many sources of noise and/or artifacts. We begin by first detecting the R peaks, since they yield modulus maxima with highest amplitudes. This enables the segmentation of the original ECG record into individual single-beat signals. In fact, following the procedure in [11], we cut the raw data at every 130th sample before each R peak in order to avoid large variations between the edge values. Next, we apply mean-padding of the individual beats to obtain a common length for all the segments by augmenting them with an appropriate number of constant values given by the average amplitude of each segment. The common length is given by the next power-of-two of the longest original single-beat segment in order to accommodate the data to the subsequent DWT processing (more specifically, the common length was set to 512 samples). The next step consists in detecting the onset and final points of the T and P waves, respectively, using the same modulus-maxima approach described earlier. Although the method has been used with remarkable success in practical applications, we should nevertheless note that the performances of this strategy rely on existing medical experience that provides statistical information on average segments durations, refractory periods, and typical artifacts. For the 2D case, individual (mean-padded, period normalized) segments are assembled into a square matrix to be foveated, as in Fig. 4. Following [11, 12], we increase the correlation between successive rows of the image by performing a sorting procedure. We experimented with both period length sorting as in [11], and complexity sorting as in [12], and obtained better results with the later approach.

Fig. 4. Preprocessing procedures for the 2D case (record 117): a) period sorting; b) complexity sorting.

DWT was performed using a biorthogonal basis, namely 7/9 Villesanor type [18], that proved optimal according to previous studies [16, 17]. The α parameter controlling the foveation rate was chosen inversely proportional with the width of the QRS complex, and the P and T waves, respectively. The values of the β parameter controlling the foveal resolution was selected at a common value of 1 for all the foveal points.

The compression performances were computed in terms of the Percent Root-Mean Square Difference (PRD) and the Compression Ratio (CR) defined as follows:

where denotes the original signal and the compressed version. Nbits defines the total number of bits used to encode the original and the compressed signal, respectively. Following common practice, we substracted an 1024 offset value from every sample before computing the PRD values.

We have further compressed the masked wavelet coefficients using both a general purpose compression algorithm, namely the Lempel-Ziv coding strategy implemented in the gzip tool [27], and the wavelet-oriented SPIHT algorithm [5]. In the former case, we first quantized the amplitudes of the coefficients using a 8-bit scalar quantizer (the compressed information includes both the amplitudes of the masked wavelet coefficients and their positions). As regarding the SPIHT algorithm, we used the code developed by the authors of the original method to deal with the 1D case (code available at http://www.cipr.rpi.edu/research/SPIHT/). For the 2D case we first constructed the corresponding foveated images and then compressed them with the JPEG2000 algorithm (implemented by the Kakadu software used by most of the existing references dealing with 2D ECG compression, available at http://www.kakadusoftware.com).

The 1D case

Experimental results for the one-dimensional case are indicated in Table 1. Besides PRD and CR values, we also indicate the normalized PRD performances (basically PRD after removing the average value of the waveforms), and the number of non-zero (masked) DWT coefficients resulted after quantization. A key point to note is that those results take into consideration only the actual lengths of the individual single-beat segments, not the common length obtained after padding (otherwise, the results would be over-optimistic). Augmenting foveation with the compression algorithms show low reconstruction errors at significantly high compression ratios.

Comparative performances with existing solutions are given in Table 2 for record 117, indicating the superiority of the proposed approach when using compression with both gzip and SPIHT method.

The dependence of CR performances versus quantization resolution of the DWT coefficients is indicated in Table 3 for record 117, showing smooth PRD degradation vs. significant CR increase. Examples of original and foveated waveforms (without further compression), along with corresponding (absolute value) error sequences are indicated in Fig. 5, showing better approximation along foveated regions of interest.

Fig. 5. Left - Original (black) and foveated waveforms (red); Right - Corresponding absolute error value sequences.

From first to last row, the records under study are 106, 117, and 212, respectively.

The 2D case

The full benefit of the foveation approach is obtained in the 2D case, since we do not only exploit both intra-beat and inter-beat correlations, but we may also significantly decrease the number of foveal points, since each point would extend its "region of influence" along both coordinates. Within this context, we will exploit the observation at the end of Section 2, by using distinct α parameters to define the weighting functions (and corresponding one-dimensional masks) along the horizontal and vertical coordinates. More specifically, while we keep the idea of choosing the α parameter along the horizontal direction inversely proportional to the durations of the foveal segments (QRS complex, and the P and T waves, respectively), we will set the α parameter along the vertical direction inversely proportional to the number of selected foveal points. As such, we may obtain a uniform covering of the foveal regions using a small number of actual foveal points. In Table 4 we report compression results for record 117 versus the number of selected foveal points, showing smooth variation of the performances. Comparative analysis with existing reported data is indicated in Table 5 for records 100, 117, and 119, showing improved performances of the foveation based approach when coupled with the JPEG2000 algorithm. The dimensions of the images were 512x286, 512x429, and 512x329 pixels, respectively, and number of foveation points was 16. End-effects were tackled by performing symmetric extensions along both coordinates.

In Fig. 6a are evidenced the regions-of-interest defined by the QRS complex, P and T waves, respectively, along with the actual positions of the foveal points, whereas in Fig. 6b we present the shape of the two-dimensional mask centered on the selected foveal points, showing uniform coverage of the clinically significant regions.

Examples of original and compressed waveforms, along with corresponding (absolute value) error sequences are indicated in Fig. 7, showing again minimization of the approximation error along the foveated regions of interest.

It is evident from Fig. 8 that the significant features of the data are preserved even at high compression ratios using JPEG2000, the main effect refers to progressive smoothing of the foveated waveforms.

Tables 1-5

Fig. 6. Wavelet foveation for the 2D case: a) Continuous line: onset and final points of the P waves (blue), QRS complexes (red), and T waves (green), respectively; Circle line: corresponding foveal points;

b) Foveation mask uniformly covering the regions-of-interest.

Fig. 7. Left - Original (black) and foveated waveforms (red); Right - Corresponding error sequences.

From first to last row, the records under study are 100, 117, and 119, respectively.

Fig. 8. Compressed foveated waveforms (record 117) using JPEG2000:

a) Original data; b) CR = 8; c) CR = 13; CR = 24.

Discussion

Experimental results presented in the previous section indicate encouraging compression performances of the foveation based approach when compared against existing top-performant solutions. Nevertheless, on critical aspect is related to the fact that the modulus-maxima DWT oriented segmentation of the ECG waveforms into clinically significant fragments may fail in some pathological cases. This is especially true when the QRS complex is severely distorted or the P and T waves shapes and durations show significant deviations relative to normal waveforms. In these cases the foveated regions of interest should be identified by alternative methods. Examples include the use of Hilbert Transform [30], time-frequency descriptors [31], or morphological transformations [32, 33]. A real-time implementation implying minimal preprocessing and reduced a priori information on the ECG waveforms has been reported in [34].

As an example, we present in Fig. 9 the results of detecting the boundaries of characteristic waves of some pathological ECG records using the Multiscale Morphological Detector (MMD) introduced in [32]. Those include right bundle branch block (RBBB), atrial premature contraction (APC), and premature ventricular contraction (PVC). Basically, the procedure relies of first choosing a moving window of length (2s + 1) samples, finding the maximum and minimum values of the analyzed waveform f(t) within the window, as well as the value of the signal at the central point f(x). Then, the MMD transform at the central point x can be computed as:

As pointed out in [32], local minima of the MMD transform correspond to positive peaks of the waveform under study such as the R peak, and local maxima of MMD identify the onsets and offsets of the P and T waves. Comparing local extrema of the MMD transformed signal against suitably chosen threshold values may reliably identify the characteristic waves even for severely distorted pathological ECG records.

Fig. 9: Upper plots: original ECG signals and detected characteristic segments (thick line). Bottom plots: MMD transformed signals with local extrema marked with circles (local maxima) and squares (local minima):

a) normal waveform; b) RBBB; c) APC; d) PVC.

Conclusions

Foveation has been long recognized as an efficient compression method inspired by natural vision systems. While this tool has been typically used within an image processing context, recent results have indicated its potential in dealing with one-dimensional signals too. Based on the observation that diagnostic useful information is mainly distributed around specific characteristic points and segments of ECG waveforms, we applied wavelet foveation principles for compression purposes with encouraging results. When coupled with modern compression tools such as SPIHT or JPEG2000 the performances improve over most of the previously reported results.

There are a number of advantages of the proposed method, such as: a) the approximation error is spread non-uniformly along the analyzed waveform, exhibiting lower values around segments of critical clinical importance; b) since it is basically a wavelet-oriented approach it may benefit of existing highly performant compression techniques such as SPIHT. In this context, an additional advantage is given by the possibility of progressive-quality transmission (most significant bits are transmitted first, hence the major features of the received signal are readily observable, then it keeps continually improving) that may be highly desirable in critical, alarm-triggering situations; c) as pointed out in [4], the foveation operator may be realized by using lookup tables, and further simplified by considering binary approximations, thus greatly smoothing the progress of the implementation step; d) to protect privacy of information, we may imagine altering the actual foveation mask values by multiplication with a known matrix acting as a private key.

Further work will be dedicated to dealing with pathological cases where the modulus-maxima approach to identifying the characteristic points of the ECG records fail, hence the foveal regions should be defined based on alternative methods, such as those based on morphological transformations or Empirical Mode Decomposition [35] for the one-dimensional case, or rooted in general pattern recognition theory that are able to detect specific regions-of-interest (ROI) [9, 10] for the two-dimensional case.