Denoising Medical Images Using Undecimated Wavelet Transform Engineering Essay

Published: November 21, 2015 Words: 3916

Abstract- In medical diagnosis, operations such as feature extraction and object recognition play a key role. These tasks become difficult if the images are corrupted with noise, so the development of effective algorithms for noise removal has become an important research area. Developing image denoising algorithms is a difficult task, since fine details embedding diagnostic information in a medical image should not be destroyed during noise removal. Many wavelet-based denoising algorithms use the DWT (Discrete Wavelet Transform) in the decomposition stage, which suffers from shift variance. To overcome this, in this paper we propose a denoising method that uses the Undecimated Wavelet Transform to decompose the image and a shrinkage operation to eliminate the noise from the noisy image. In the shrinkage step we used semi-soft and Stein thresholding operators along with the traditional hard and soft thresholding operators, and verified the suitability of different wavelet families for the denoising of medical images. The results show that images denoised with the UDWT (Undecimated Discrete Wavelet Transform) have a better balance between smoothness and accuracy than those denoised with the DWT. We used the SSIM (Structural Similarity Index Measure) along with PSNR to assess the quality of the denoised images.

Keywords: Discrete Wavelet Transform, Undecimated Wavelet Transform, Wavelet Shrinkage.

I. INTRODUCTION

Medical information, composed of clinical data, images and other physiological signals, has become an essential part of a patient's care, whether during screening, the diagnostic stage or the treatment phase. Data in the form of images and signals form part of each patient's medical file, and as such have to be stored and often transmitted from one place to another [18].

Over the past three decades, rapid developments in information technology (IT) and medical instrumentation have facilitated the development of digital medical imaging. This development has mainly concerned Computed Tomography (CT), Magnetic Resonance Imaging (MRI), the different digital radiological processes for vascular, cardiovascular and contrast imaging, mammography, diagnostic ultrasound imaging, and nuclear medical imaging with Single Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET). All these processes are producing ever-increasing quantities of images. These images differ from typical photographic images primarily because they reveal internal anatomy as opposed to an image of surfaces.

A. Specificities of medical images

In natural monochromatic or colour images, the pixel intensity corresponds to the reflection coefficient of natural light, whereas images acquired for clinical procedures reflect very complex physical and physiological phenomena of many different types; hence the wide variety of images.

Medical imaging was initiated and developed around a diversity of physical phenomena (X-rays, γ-rays, ultrasound waves, nuclear magnetic resonance). It was further developed with the increased use of computers in the acquisition process (real-time treatment of a large amount of information) as well as for image reconstruction (tomography). Each medical imaging modality (digital radiology, computerized tomography (CT), magnetic resonance imaging (MRI), ultrasound imaging (US)) has its own specific features corresponding to the physical and physiological phenomena studied, as shown below.

Figure 1: Sagittal slices of the brain by different imaging modalities: a) magnetic resonance imaging (MRI), b) computed tomography (CT), c) positron emission tomography (PET), d) ultrasound (US)

B. The different features of medical imaging formation processes

The pixel or voxel values depend on the chemical and physical characteristics of the tissues studied. These characteristics often correspond to a physiological phenomenon. The intensity corresponds to:

- an attenuation coefficient of X-rays for radiology and CT;

- a local concentration of radioactive markers in nuclear medicine giving for example access to information on the body glucose consumption or the metabolic activity within the body;

- a local concentration of contrast agent in radiology or MRIs;

- a density of protons or the speed of paramagnetic proton relaxation for MRI;

- an anisotropic movement of water molecules for diffusion MRI;

- a concentration change in oxyhemoglobin for functional MRI;

- a reflective and scattering coefficient for ultrasound;

- a local speed vector depicting blood flow during Doppler ultrasound, etc.

Imaging mechanisms based on spontaneous contrasts often provide anatomic information, while imaging mechanisms known as "functional" often use markers reflecting fluid motion or metabolic exchanges. These mechanisms efficiently depict important details on how well the body functions.

These medical images present their own unique set of challenges. Although our focus in this paper will be on two-dimensional images, three-dimensional (volume) images, time-varying two-dimensional images (movies), and time-varying three-dimensional images are commonly used clinically as imaging modalities become more sophisticated.

Spatial filters have long been used as the traditional means of removing noise from images and signals [1]. These filters usually smooth the data to reduce the noise but, in the process, also blur the data. In the last decade, several new techniques have been developed that improve on spatial filters by removing the noise more effectively while preserving the edges in the data. Some of these techniques borrow ideas from partial differential equations and computational fluid dynamics, such as level set methods, total variation methods [2], and non-linear isotropic and anisotropic diffusion. Other techniques combine impulse removal filters with local adaptive filtering in the transform domain to remove not only white and mixed noise, but also their mixtures [2,3].

An efficient technique for such non-stationary signal processing is the wavelet transform, which can be viewed as a decomposition of a signal in the time-frequency-scale plane [7,15,16]. Wavelet transforms have many application areas, such as sub-band coding, data compression, characteristic point detection and noise reduction. To reduce the noise in medical images, many techniques are available, including digital filters (FIR or IIR), adaptive methods and wavelet thresholding methods. However, digital filters and adaptive methods can in many cases only be applied to signals whose statistical characteristics are stationary. Recently the wavelet transform has proven to be a useful tool for non-stationary signal analysis.

In this paper we present a class of methods that exploits the decomposition of the data into a wavelet basis and shrinks the wavelet coefficients in order to denoise the data [6,9,10]. While this is typically done using the more memory-efficient decimated wavelet transform, the use of undecimated transforms minimizes the artifacts in the denoised data [12].

II. DISCRETE WAVELET TRANSFORM

Real-world signals, however, often have frequencies that change over time, or have pulses, anomalies, or other "events" at certain specific times. This type of signal can tell us where something is located on the planet, the health of a human heart, the position and velocity of a "blip" on a radar screen, stock market behaviour, or the location of underground oil deposits. For these signals, we will often do better with wavelets than with the Fourier transform [7, 15, 16].

Figure 2: DWT Single Stage Decomposition

The DWT of a signal x(n) is calculated by passing it through a series of filters. First the samples are passed through a low-pass filter with impulse response g(n), resulting in the convolution of the two:

y[n] = (x * g)[n] = Σ_k x[k]·g[n − k]    (1)

The signal is also decomposed simultaneously using a high-pass filter h. The outputs of the high-pass filter are the detail coefficients and the outputs of the low-pass filter are the approximation coefficients. It is important that the two filters are related to each other; together they are known as a quadrature mirror filter pair.

Since half the frequencies of the signal have now been removed from each branch, half the samples can be discarded according to Nyquist's rule. The filter outputs are therefore subsampled by 2:

y_low[n] = Σ_k x[k]·g[2n − k],   y_high[n] = Σ_k x[k]·h[2n − k]    (2)

This decomposition has halved the time resolution since only half of each filter output characterises the signal. However, each output has half the frequency band of the input so the frequency resolution has been doubled.
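The filter-and-subsample step above can be sketched in a few lines of numpy. The Haar filter pair used here is an illustrative assumption (the paper evaluates several families); any orthonormal pair would do:

```python
import numpy as np

# Haar analysis filters (an assumed, illustrative choice)
g = np.array([1.0, 1.0]) / np.sqrt(2.0)   # low-pass (approximation)
h = np.array([1.0, -1.0]) / np.sqrt(2.0)  # high-pass (detail)

def dwt_single_stage(x, g, h):
    """One DWT level: convolve with each filter, then keep every second sample."""
    approx = np.convolve(x, g)[1::2]  # equation (1) followed by subsampling by 2
    detail = np.convolve(x, h)[1::2]
    return approx, detail

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
cA, cD = dwt_single_stage(x, g, h)
# orthonormal filters preserve energy: sum(cA^2) + sum(cD^2) == sum(x^2)
```

Each output branch carries half as many samples as the input, matching the halved time resolution described above.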

A. Cascading and Filter banks

This decomposition is repeated to further increase the frequency resolution: the approximation coefficients are decomposed with high- and low-pass filters and then down-sampled. This is represented as a binary tree whose nodes represent sub-spaces with different time-frequency localisation. The tree is known as a filter bank.

Figure 3: DWT Multilevel Decomposition

At each level in the above diagram the signal is decomposed into low and high frequencies. Due to the decomposition process the input signal length must be a multiple of 2^n, where n is the number of levels.
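The cascade can be sketched by re-decomposing only the approximation branch at each level. Haar filters are again an assumed, illustrative choice:

```python
import numpy as np

g = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low-pass
h = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar high-pass

def wavedec(x, levels):
    """Filter-bank cascade: only the approximation is re-decomposed per level."""
    coeffs, a = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        coeffs.append(np.convolve(a, h)[1::2])  # detail coefficients of this level
        a = np.convolve(a, g)[1::2]             # approximation, fed to the next level
    coeffs.append(a)                            # final approximation
    return coeffs[::-1]                         # [cA_n, cD_n, ..., cD_1]

x = np.arange(8, dtype=float)  # length is a multiple of 2**3
c = wavedec(x, 3)
```

After three levels of an 8-sample signal, the final approximation holds a single coefficient, illustrating why the input length must be a multiple of 2^n.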

B. 2D DWT

The 2D DWT can be implemented by applying the 1D DWT along the rows of an image first and then along the columns. When a wavelet transform is applied to an image, which is a 2D signal, it decomposes the image into four subbands as shown in figure 4. The LL band contains the approximation coefficients, the LH band contains horizontal details, the HL band contains vertical details and the HH band contains the diagonal details.
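The separable rows-then-columns scheme can be sketched as follows (Haar filters assumed for illustration; the subband naming follows the convention used above):

```python
import numpy as np

g = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low-pass
h = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar high-pass

def analyse_rows(a, f):
    """Filter every row with f and subsample by 2 (the 1D step from above)."""
    return np.stack([np.convolve(row, f)[1::2] for row in a])

def dwt2(img):
    """Separable 2D DWT: 1D transform along rows, then along columns."""
    lo, hi = analyse_rows(img, g), analyse_rows(img, h)
    LL = analyse_rows(lo.T, g).T  # approximation
    LH = analyse_rows(lo.T, h).T  # horizontal details
    HL = analyse_rows(hi.T, g).T  # vertical details
    HH = analyse_rows(hi.T, h).T  # diagonal details
    return LL, LH, HL, HH

img = np.arange(16, dtype=float).reshape(4, 4)
LL, LH, HL, HH = dwt2(img)  # each subband is 2x2, a quarter of the image
```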

Figure 4: 2D DWT Decomposition of an image

C. Advantages of DWT

- The ability to compact most of the signal's energy into a few transform coefficients, which is called energy compaction.

- The ability to capture and represent effectively both low-frequency components and high-frequency transients.

- A variable-resolution decomposition with almost uncorrelated coefficients.

III. UNDECIMATED WAVELET TRANSFORM

Decimation of the wavelet coefficients is an intrinsic property of the discrete wavelet transform (DWT). The decimation step removes every other coefficient at the current level. Thus the computation of the wavelet transform is faster and more compact in terms of storage space. More importantly, the transformed signal can be perfectly reconstructed from the remaining coefficients. Unfortunately, the decimation causes the shift variance of the wavelet transform.

In order to achieve shift invariance, researchers from different fields, pursuing various goals, have invented several wavelet transform algorithms. This type of transform is known under the common name undecimated wavelet transform (UWT).

Figure 5: UDWT Multilevel Decomposition

Unlike the discrete wavelet transform (DWT), which downsamples the approximation coefficients and detail coefficients at each decomposition level, the undecimated wavelet transform (UWT) does not incorporate the downsampling operations. Thus, the approximation coefficients and detail coefficients at each level are the same length as the original signal. The UWT upsamples the coefficients of the lowpass and highpass filters at each level. The upsampling operation is equivalent to dilating wavelets. The resolution of the UWT coefficients decreases with increasing levels of decomposition.
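The no-downsampling idea can be sketched with the à trous scheme: the filters are dilated with holes at each level and the convolution is circular, so every coefficient array keeps the length of the input. The averaging filter pair below is an assumed, illustrative choice with a convenient property: the signal is recovered by summing the final approximation and all details:

```python
import numpy as np

g = np.array([0.5, 0.5])   # normalised Haar-like low-pass (assumed for illustration)
h = np.array([0.5, -0.5])  # matching high-pass; note g + h is a unit impulse

def uwt(x, levels):
    """A trous UWT sketch: no subsampling; the filters are dilated per level,
    so every coefficient array keeps the length of the input signal."""
    a, details = np.asarray(x, dtype=float), []
    for j in range(levels):
        gj = np.zeros(2**j + 1); gj[0], gj[-1] = g  # insert 2**j - 1 holes
        hj = np.zeros(2**j + 1); hj[0], hj[-1] = h
        F = lambda f: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(f, len(a))))
        details.append(F(hj))  # circular convolution: same length as x
        a = F(gj)
    return a, details

x = np.sin(np.linspace(0.0, 6.28, 64))
approx, details = uwt(x, 3)
# additive reconstruction: x == approx + sum(details)
```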

Compared with the DWT, the UWT has some unique features.

A. Translation-Invariant Property

Unlike the DWT, the UWT has the translation-invariant, or shift-invariant, property. If two signals are shifted versions of each other, the UWT results for the two signals also are shifted versions of each other. The translation-invariant property is important in feature-extraction applications [12].

B. Better Denoising Capability

Denoising with the UWT is also shift-invariant. The denoising result of the UWT has a better balance between smoothness and accuracy than that of the DWT. The DWT-based method is more computationally efficient than the UWT-based method; however, we cannot achieve both smoothness and accuracy with DWT-based denoising.

C. Better Peak Detection Capability

Peaks often imply important information about a signal. We can use the UWT to identify the peaks in a noise-contaminated signal [14].

The UWT-based peak detection method is less sensitive to noise than the DWT-based method, because the UWT-based method involves finding zero-crossings in the multiscale UWT coefficients. The UWT-based method first finds zero-crossings among the coefficients with coarse resolution and then finds zero-crossings among the coefficients with finer resolution. Finding zero-crossings among the coefficients with coarse resolution enables us to remove noise from a signal efficiently. Finding zero-crossings among the coefficients with finer resolution improves the precision with which we can find peak locations.
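The zero-crossing search that drives this peak detection can be sketched as:

```python
import numpy as np

def zero_crossings(c):
    """Indices i where the coefficient sequence changes sign between i and i+1."""
    s = np.sign(c)
    return np.where(s[:-1] * s[1:] < 0)[0]

c = np.array([0.4, 1.2, -0.3, -0.9, 0.1, 0.8])
idx = zero_crossings(c)  # sign changes between indices 1-2 and 3-4
```

In the coarse-to-fine scheme described above, this search would first run on coarse-level UWT coefficients and then be refined on finer levels near the detected locations.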

IV. EXPERIMENTAL SETUP AND DENOISING ALGORITHM

Figure 6: Medical Image Denoising System

A. Wavelet Shrinkage

Wavelet shrinkage [4] is a signal denoising technique based on the idea of thresholding the wavelet coefficients. Wavelet coefficients with small absolute values are considered to encode mostly noise and very fine details of the signal, while the important information is encoded by the coefficients with large absolute values. Removing the small-magnitude coefficients and then reconstructing the signal should therefore produce a signal with less noise. The wavelet shrinkage approach can be summarized as follows [6, 10, 11]:

Consider the standard univariate non-parametric regression setting

X_i = S_i + σ·ε_i,   i = 1, …, K    (3)

where the ε_i are independent standard normal N(0,1) random variables drawn from a zero-mean normal distribution, and the noise level σ may be known or unknown. The goal is to recover the underlying function S from the noisy data X without assuming any particular parametric structure for S.

1. Calculate the wavelet coefficients w by applying a wavelet transform W to the data:

w = W·X    (4)

2. Modify the detail coefficients of w to obtain the estimate ŵ of the wavelet coefficients of S:

ŵ = δ_T(w)    (5)

3. Inverse-transform the modified coefficients to obtain the denoised estimate:

Ŝ = W⁻¹·ŵ    (6)

The number of wavelet coefficients in w in equation (4) varies depending on the type of transform (decimated or undecimated) used. w consists of both scaling coefficients and wavelet coefficients. In the decimated wavelet transform the number of coefficients in w is the same as the number of data points.

The first step in denoising is the selection of the forward and inverse transforms W and W⁻¹. There are various types of wavelets that can be used, which differ in their support, symmetry and number of vanishing moments. In this paper we used wavelets from the Haar, Daubechies, Coiflets, Symlets, Biorthogonal and Reverse Biorthogonal families and evaluated their suitability for medical image denoising in terms of PSNR.

In addition to a wavelet, we also need to select the number of multiresolution levels. The maximum number of levels can be calculated as floor(log2(K)), where K is the total number of data points.

Thresholding methods can be grouped into two categories, global thresholds and level-dependent thresholds. The former chooses a single threshold value T to be applied globally to all empirical wavelet coefficients, while the latter uses different thresholds for different levels. In this work we have used the universal threshold, which depends only on the noise level and the size of the signal: T = σ·sqrt(2·log(K)), where K is the size of the signal and T is the threshold value. This threshold requires an estimate of the noise level σ. The usual standard deviation of the data values is clearly not a good estimator, unless the underlying function S is reasonably flat. Donoho and Johnstone considered estimating σ in the wavelet domain and suggested a robust estimate based only on the empirical wavelet coefficients at the finest resolution level. The reason for considering only the finest level is that the corresponding empirical wavelet coefficients tend to consist mostly of noise. Since some signal is present even at this level, Donoho and Johnstone proposed a robust estimate of the noise level (based on the median absolute deviation) given by

σ̂ = median(|w_0|, |w_1|, …) / 0.6745

where w_0, w_1, … are the detail coefficients at the finest level.
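The noise estimate and the universal threshold translate directly into code. In this sketch, synthetic pure-noise coefficients stand in for the finest-level details, so the estimate can be checked against a known σ:

```python
import numpy as np

def noise_sigma_mad(finest_details):
    """Donoho-Johnstone robust noise estimate from the finest-level detail
    coefficients: sigma_hat = median(|w_i|) / 0.6745."""
    return np.median(np.abs(finest_details)) / 0.6745

def universal_threshold(sigma, K):
    """Universal threshold T = sigma * sqrt(2 * log(K)) for K data points."""
    return sigma * np.sqrt(2.0 * np.log(K))

rng = np.random.default_rng(0)
w = rng.normal(0.0, 2.0, size=4096)  # stand-in pure-noise details with sigma = 2
sigma_hat = noise_sigma_mad(w)       # close to the true value 2
T = universal_threshold(sigma_hat, w.size)
```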

Shrinkage step: Let w denote a single detail coefficient and ŵ denote its shrunk version. Let T be the threshold, δ_T denote the shrinkage function which determines how the threshold is applied to the data, and σ̂ be the estimate of the standard deviation of the noise in equation (3). Then

ŵ = σ̂·δ_T(w / σ̂)    (7)

By dividing by σ̂ we standardise the coefficients, and to these standardised coefficients we apply the threshold operator. After thresholding, the resulting coefficients are multiplied by σ̂ to obtain ŵ. If σ̂ is built into the thresholding model, or if the data is normalised with respect to the noise standard deviation, the estimate simplifies to

ŵ = δ_T(w)    (8)

Another open question in the wavelet shrinkage algorithm is how to apply the threshold. The so-called hard thresholding method zeros the coefficients that are smaller than the threshold and leaves the other ones unchanged. In contrast, soft thresholding also scales the remaining coefficients in order to form a continuous distribution of the coefficients centered on zero. Several varieties of soft thresholding are described in the literature [4][9]. In our experiments we have used four thresholding techniques, namely hard, soft, semi-soft and Stein thresholding, and compared their efficiency in denoising medical images based on PSNR (Peak Signal to Noise Ratio).

Hard Thresholding:

δ_T(w) = w if |w| > T, 0 otherwise    (9)

Soft Thresholding:

δ_T(w) = sign(w)·max(|w| − T, 0)    (10)

Semi-soft thresholding is a family of non-linearities that interpolates between soft and hard thresholding. It uses both a main threshold T and a secondary threshold T1 = μ·T:

δ_T(w) = 0 if |w| ≤ T;  sign(w)·T1·(|w| − T)/(T1 − T) if T < |w| ≤ T1;  w if |w| > T1    (11)

When μ = 1, semi-soft thresholding performs a hard thresholding, whereas when μ → ∞ it performs a soft thresholding.

Stein Thresholding: Another way to achieve a trade-off between hard and soft thresholding is to use a soft-squared thresholding non-linearity, also named a Stein estimator:

δ_T(w) = w·max(1 − T²/w², 0)
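The four shrinkage functions can be sketched as vectorised numpy operations (the semi-soft sketch assumes μ > 1 so that the interpolation ramp is well defined):

```python
import numpy as np

def hard(w, T):
    """Zero the coefficients at or below T; keep the rest unchanged."""
    return np.where(np.abs(w) > T, w, 0.0)

def soft(w, T):
    """Shrink every surviving coefficient toward zero by T."""
    return np.sign(w) * np.maximum(np.abs(w) - T, 0.0)

def semi_soft(w, T, mu=2.0):
    """Semi-soft (firm) thresholding with secondary threshold T1 = mu*T, mu > 1."""
    T1, a = mu * T, np.abs(w)
    mid = np.sign(w) * T1 * (a - T) / (T1 - T)  # linear ramp for T < |w| <= T1
    return np.where(a <= T, 0.0, np.where(a > T1, w, mid))

def stein(w, T):
    """Stein (soft-squared) estimator: w * max(1 - T**2 / w**2, 0)."""
    safe = np.where(w == 0.0, 1.0, w)  # avoid 0/0; w == 0 still maps to 0
    return w * np.maximum(1.0 - (T / safe) ** 2, 0.0)

w = np.array([-3.0, -1.5, -0.5, 0.0, 0.5, 1.5, 3.0])  # sample coefficients, T = 1
```

With T = 1, hard keeps ±3 and ±1.5 untouched, soft shrinks them to ±2 and ±0.5, semi-soft (μ = 2) keeps ±3 but ramps ±1.5 to ±1, and Stein shrinks multiplicatively.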

Figure 7: Wavelet Shrinkage Functions

V. RESULTS & DISCUSSIONS

In this paper we used the universal threshold and applied it globally. We used the STD and MAD methods to estimate the noise level. Finally we used the hard, soft, semi-soft and Stein thresholding functions for the shrinkage of the wavelet coefficients and compared their efficiency in denoising the images based on PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity Index Measure).

A. MSE (Mean Squared error)

The mean squared error between a signal x and an approximation x̂ is the squared norm of the difference divided by the number of elements K in the signal:

MSE = (1/K)·Σ_i (x_i − x̂_i)²    (12)

B. PSNR (Peak Signal to Noise Ratio)

PSNR is the peak signal-to-noise ratio in decibels (dB). The PSNR is only meaningful for data encoded in terms of bits per sample, or bits per pixel. For example, an image with 8 bits per pixel contains integers from 0 to 255:

PSNR = 10·log10((2^B − 1)² / MSE)    (13)

where B represents bits per sample.
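Equations (12) and (13) translate directly into code; the constant-error image below makes the expected values easy to check by hand:

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error between a signal and its approximation."""
    return np.mean((np.asarray(x, float) - np.asarray(x_hat, float)) ** 2)

def psnr(x, x_hat, bits=8):
    """PSNR in dB; the peak value for B-bit data is 2**bits - 1."""
    peak = 2.0 ** bits - 1.0
    return 10.0 * np.log10(peak ** 2 / mse(x, x_hat))

clean = np.full((4, 4), 128.0)
noisy = clean + 5.0  # every pixel off by 5 grey levels, so MSE = 25
# psnr(clean, noisy) = 10*log10(255**2 / 25), about 34.15 dB
```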

C. Structural similarity Index measure (SSIM)

The structural similarity (SSIM) index is a method for measuring the similarity between two images [17]. The SSIM index is a full-reference metric; in other words, the measurement of image quality is based on an initial uncompressed or distortion-free image as reference. SSIM is designed to improve on traditional methods like peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which have proved to be inconsistent with human visual perception.

The SSIM metric is calculated on various windows of an image. The measure between two windows x and y of common size N×N is:

SSIM(x, y) = ((2·μx·μy + c1)·(2·σxy + c2)) / ((μx² + μy² + c1)·(σx² + σy² + c2))

with

- μx the average of x;

- μy the average of y;

- σx² the variance of x;

- σy² the variance of y;

- σxy the covariance of x and y;

- c1 = (k1·L)² and c2 = (k2·L)², two variables that stabilize the division when the denominator is weak;

- L the dynamic range of the pixel values (typically 2^(bits per pixel) − 1);

- k1 = 0.01 and k2 = 0.03 by default.

In order to evaluate image quality, this formula is applied only to luma. The resulting SSIM index is a decimal value between -1 and 1, and the value 1 is only reachable for two identical sets of data. Typically it is calculated on windows of size 8×8, and the window can be displaced pixel by pixel over the image.
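A single-window version of the SSIM formula can be sketched as follows (the sliding 8×8 window and the luma conversion are omitted here to keep the sketch short; a full implementation would average the per-window scores):

```python
import numpy as np

def ssim_global(x, y, L=255.0, k1=0.01, k2=0.03):
    """SSIM computed over one window covering the whole array."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()              # window means
    vx, vy = x.var(), y.var()                # window variances
    cxy = ((x - mx) * (y - my)).mean()       # covariance
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2    # stabilise weak denominators
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

a = np.arange(64, dtype=float).reshape(8, 8)
# identical windows score exactly 1; any distortion lowers the score
```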

The tables report σ vs PSNR and SSIM for various families of wavelets and the shrinkage functions. A high PSNR value alone does not mean that the image has been denoised well: even when the noise is removed, the result suffers from blurring and ringing artifacts when the DWT is used. These artifacts are eliminated by using the UDWT in place of the DWT. The denoised images are shown in the figures at the end of this section.

Table 1: Denoising using DWT & MAD

DWT, MAD (σ vs PSNR, SSIM index)

| Wavelet Family | σ  | Hard PSNR | Hard SSIM | Soft PSNR | Soft SSIM | Semi-Soft PSNR | Semi-Soft SSIM |
| Haar     | 10 | 42.0530 | 0.7529 | 40.0527 | 0.6652 | 41.5096 | 0.7199 |
| Haar     | 20 | 39.2609 | 0.6240 | 37.9295 | 0.5716 | 39.1448 | 0.6112 |
| Haar     | 30 | 37.6679 | 0.5478 | 36.7112 | 0.5210 | 37.8493 | 0.5501 |
| Db4      | 10 | 52.7756 | 0.7804 | 40.8682 | 0.6972 | 42.3132 | 0.7503 |
| Db4      | 20 | 39.9432 | 0.6533 | 38.6843 | 0.6012 | 39.6867 | 0.6414 |
| Db4      | 30 | 38.3625 | 0.5752 | 37.4499 | 0.5507 | 38.5872 | 0.5829 |
| Coif4    | 10 | 42.900  | 0.7867 | 41.0156 | 0.7038 | 42.4612 | 0.7572 |
| Coif4    | 20 | 40.0277 | 0.6591 | 38.7823 | 0.6050 | 39.9940 | 0.6465 |
| Coif4    | 30 | 38.4015 | 0.5792 | 37.5203 | 0.5526 | 38.6323 | 0.5839 |
| Sym4     | 10 | 42.8308 | 0.7820 | 40.9804 | 0.6993 | 42.3895 | 0.7525 |
| Sym4     | 20 | 40.0420 | 0.6596 | 38.8418 | 0.6050 | 40.0166 | 0.6445 |
| Sym4     | 30 | 38.4760 | 0.5824 | 37.6073 | 0.5552 | 38.7237 | 0.5851 |
| Bior6.8  | 10 | 38.1565 | 0.7621 | 35.5096 | 0.7207 | 35.1759 | 0.7514 |
| Bior6.8  | 20 | 34.3236 | 0.6285 | 35.2109 | 0.6255 | 34.5139 | 0.6439 |
| Bior6.8  | 30 | 33.6906 | 0.5457 | 35.0067 | 0.5743 | 34.2113 | 0.5869 |
| rBior6.8 | 10 | 37.7163 | 0.7573 | 36.4730 | 0.6554 | 37.4605 | 0.7153 |
| rBior6.8 | 20 | 36.7325 | 0.6367 | 35.4283 | 0.5699 | 36.5457 | 0.6145 |
| rBior6.8 | 30 | 36.0120 | 0.5695 | 34.7345 | 0.5263 | 35.9140 | 0.5614 |

Table 2: Denoising using UDWT & MAD

UDWT, MAD (σ vs PSNR, SSIM index)

| Wavelet Family | σ  | Hard PSNR | Hard SSIM | Soft PSNR | Soft SSIM | Semi-Soft PSNR | Semi-Soft SSIM |
| Haar     | 10 | 43.9256 | 0.7919 | 43.9256 | 0.7919 | 42.6955 | 0.7374 |
| Haar     | 20 | 41.4062 | 0.6936 | 41.4062 | 0.6936 | 40.4073 | 0.6471 |
| Haar     | 30 | 40.0669 | 0.6380 | 40.0669 | 0.6380 | 39.0866 | 0.5971 |
| Db4      | 10 | 44.1201 | 0.8059 | 44.1201 | 0.8059 | 42.8344 | 0.7521 |
| Db4      | 20 | 41.4460 | 0.7018 | 41.4460 | 0.7018 | 40.4451 | 0.6545 |
| Db4      | 30 | 40.0711 | 0.6435 | 40.0711 | 0.6435 | 39.1422 | 0.6015 |
| Coif4    | 10 | 44.0996 | 0.8063 | 44.0996 | 0.8063 | 42.8169 | 0.7515 |
| Coif4    | 20 | 41.3931 | 0.6991 | 41.3931 | 0.6991 | 40.4047 | 0.6508 |
| Coif4    | 30 | 39.9893 | 0.6382 | 39.9893 | 0.6382 | 39.1426 | 0.5990 |
| Sym4     | 10 | 44.1415 | 0.8049 | 44.1415 | 0.8049 | 42.8542 | 0.7497 |
| Sym4     | 20 | 41.5005 | 0.7013 | 41.5005 | 0.7013 | 40.5123 | 0.6533 |
| Sym4     | 30 | 40.1265 | 0.6429 | 40.1265 | 0.6429 | 39.2499 | 0.6028 |
| Bior6.8  | 10 | 35.8947 | 0.8176 | 35.8947 | 0.8176 | 35.7409 | 0.7725 |
| Bior6.8  | 20 | 35.3758 | 0.7212 | 35.3758 | 0.7212 | 35.2037 | 0.6775 |
| Bior6.8  | 30 | 35.0352 | 0.6614 | 35.0352 | 0.6614 | 34.9363 | 0.6260 |
| rBior6.8 | 10 | 37.9903 | 0.7573 | 37.9903 | 0.7573 | 37.5793 | 0.7019 |
| rBior6.8 | 20 | 37.1147 | 0.6087 | 37.1147 | 0.6087 | 36.3260 | 0.5948 |
| rBior6.8 | 30 | 36.0138 | 0.5628 | 36.0138 | 0.5628 | 36.0138 | 0.5628 |

Figure 8: Denoising using DWT

Figure 9: Denoising using UWT

The above figures show denoising performed using the DWT and the UWT for a noise level of σ = 30. The image denoised with the DWT loses fine details; this loss is overcome by using the UDWT.

VI. REFERENCES

[1] Rangaraj M. Rangayyan, "Biomedical Signal Analysis A Case study Approach", IEEE Press, 2005.

[2] Yang Wang and Haomin Zhou, "Total Variation Wavelet-Based Medical Image Denoising", International Journal of Biomedical Imaging, Hindawi Publication, Volume 2006 Article ID89095, pages 1-6, 2006.

[3] A. Buades, B. Coll and J. M. Morel, "A Review of Image Denoising Algorithms, with a New One", Multiscale Modeling and Simulation, vol. 4, no. 2, pp. 490-530, 2005.

[4] Stephane Mallat, "A Wavelet Tour of signal Processing", Elsevier, 2006.

[5] Leslie Cromwell, Fred J. Weibell, "Biomedical Instrumentation and Measurements", Prentice Hall India.

[6] D. L. Donoho and I. M. Johnstone, "Wavelet shrinkage: Asymptopia?", J. Roy. Stat. Soc., Ser. B, vol. 57, pp. 301-369, 1995.

[7] K.P Soman, K.I.Ramachandran, "Insight into Wavelets From theory to practice", Prentice Hall India, 2004.

[8] Jeny Rajan, M R Kaimal, "Image denoising using wavelet embedded anisotropic diffusion", IEE Int.Conf on visual Information Engineering, pp.589-593, 2006.

[9] D L Donoho, "De-Noising by Soft-Thresholding", IEEE Transactions on Information Theory, vol.41, No.3, May 1995.

[10] Imola K. Fodor, Chandrika Kamath, "Denoising through wavelet shrinkage: An empirical study", Center for Applied Computing, Lawrence Livermore National Laboratory, July 27, 2001.

[11] David L. Donoho and Iain M. Johnstone, "Adapting to unknown smoothness via wavelet shrinkage", Journal of the American Statistical Association, vol. 90, no. 432, pp. 1200-1224, December 1995.

[12] R. Coifman and D. Donoho, "Translation-invariant de-noising", in Lecture Notes in Statistics: Wavelets and Statistics, New York: Springer-Verlag, pp. 125-150, 1995.

[13] V. Strela. "Denoising via block Wiener filtering in wavelet domain". In 3rd European Congress of Mathematics, Barcelona, July 2000. Birkhäuser Verlag.

[14] S. G. Mallat and W. L. Hwang, "Singularity detection and processing with wavelets," IEEE Trans. Inform. Theory, vol. 38, pp. 617-643, Mar. 1992.

[15] S.G.Mallat "Wavelet Tour of Signal Processing", Elsevier, 2004.

[16] I.Daubechies, "Ten Lectures on Wavelets", SIAM Publishers, 1992.

[17] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.

[18] Tanguy, J.-Y., Jallet, P., Le Bozec, C. and Frija, G., "Relevance of Biomedical Data Compression", in Compression of Biomedical Images and Signals (eds A. Naït-Ali and C. Cavaro-Ménard), ISTE, London, UK, 2010. doi: 10.1002/9780470611159.ch1.