Memoryless Viterbi Decoder With Hybrid Information Technology Essay

Published: November 30, 2015 Words: 2100

Abstract- The problem of survival memory management of a Viterbi Decoder(VD) was solved by introducing a noval pointer implementation for the register exchange & hybrid register exchange method,where a pointer is assigned to each row of memory in the survivor memory unit(SMU).The content of the pointer which points to one row of memory is altered to point to another row of memory,instead of copying the contents of the first row to the second.In this paper,the one pointer VD is proposed,if the initial state of the convolutional encoder is known,the entire SMU is reduced to only one row.Because the decoded data bits are generated in the required order,even this row of memory is dispensable.Thus,the one pointer architecture,reffered to as memoryless VD,reduces the power consumption of a traditional traceback VD by approximately 50%,but has some performance degradation.A prototype of the MLVD with a one third convolutional code rate and a constraint length of nine is mapped into Xilinx 2v6000 chip,operating at 25MHz with a decoding throughput of more than 3 Mbps and a latency of two data bits.

Index Terms-Low power,memoryless,register exchange,hybrid register exchange,Viterbi Decoder,wireless.

1.Introduction

The Viterbi decoder (VD) is a kind of maximum likelihood decoder used for decoding convolutional codes. In VD, convolutionally coded received symbols that are usually corrupted by the noise are compared with all possible expected symbols, using Hamming distance. The possible expected symbols depend upon the data to be decoded and internal states of the encoder. The viterbi decoder attempts to find most probable set of states and the most probable sequence of possible inputs to the encoder.

The VD consists of three basic units; branch metric unit (BMU), add compare select unit (ACSU) and decoding unit called as trace back unit (TBU). The TBU performs its function by accessing the path memory four to five times the constraint length of convolutional code repeatedly, while BMU and ACSU performs their operation by executing a single operation. Therefore the TBU contributes more than half of the total power dissipation in viterbi decoder. Therefore minimization of the number of operations performed in the trace back operation is critical to the implementation of viterbi decoder.

For the implementation of viterbi decoder named as modified trace back method (MTBM), instead of decoding a bit at a time here blocks decoding of m (the number of shift register used in the encoder) bits are possible, so that the number of operations will be minimized. In this MTBM, block decoding is possible using single step trellis. Survivor path storage memory has to be modified, which consists of the previous state information for the current state instead of the decision bits at each stage. The REM is acceptable for trellises with only the small number of states. Since the partial data to be transferred from one location to another, the switching activity is also more. Therefore suggested a HREM Ease of Use which reduces the switching activity with decoding a dibit.

ACSU

BMU

SMU

ACS

PMM

I/P O/P

Fig 1:Diagram of Viterbi Decoder

2. VITERBI DECODER ARCHITECTURE

The viterbi decoding algorithm performs maximum likelihood sequence detection on data which have been convolutionally coded. The decoding uses the trellis diagram to determine the output data sequence path which is the most likely to the input sequences from encoder.

Viterbi decoder has three major processing units, branch metric unit (BMU), add-compare-select unit (ACSU), and survival memory unit (SMU) as shown in Fig. 1. In the decoding process, the BMU will first calculate with branch metrics (BM) of Hamming distance (for hard decision) or Euclidean distance (for soft decision) from received input codeword and also from the branch word [6]. The branch word value depends on the constraint length, the generator polynomial, and the coding rate of encoder. Then, the BMU produces 2n branch metric numbers which coding rate of 1/n. Next, the ACSU is used to find the path metric (PM) as well as decision bits (DB) from two branch metrics associated with two path metrics. The updated path metric will be the smallest previous metrics which has been accumulated by branch metric from BMU.

Table 1: The proposed Viterbi decoder specifications

Coding Rate

1/2,1/3

Constraint length

9

Generator

polynomials

561g, 753g , R =1/2

557g, 663g, 711g, R =1/3

Survivor path length

64

Decision level

Soft decision 8 level (3 bits)

Offset binary representation

Path Metric

8 bits Modulo Normalization

ACS unit

4

Data transmission

Continuous

DB

ADDERBM0n

Mux

Comp-aratorPM0n

BM1n

ADDER

PMn+1

Fig 2. ACS Diagram

Subsequently, it will be stored in the path metric memory (PMM) for calculating of the next path metrics and also decision bits. This ACS model is shown in Fig. 2 and described by

PM0n+1= min(PM0n + BM0n,PM1n + BM1n)

Where PMin+1(i=0,1) and BMin+1(i=0,1) are path metrics and branch metrics at time n respectively [7].

In ACSU, parallel viterbi decoder architecture uses 2k-1 ACS (K is constraint length) while serial architecture uses 2i ACS (i is 0, 1, 2, …). Obviously, that parallel structure takes large implemented area than that of the serial one. Finally, the SMU stores decision bits from ACSU, then, to find survivor path, and to decode the received data. There are two methods to find survivor path and decoded data. These are register exchange (RE) and track back (TB). The first method, it is the conceptual simplest and fast decoding, but it requires large chip area. This register exchange consists of array of multiplexers and registers which are connected to resemble the trellis diagram of convolution encoder. The depth of array of multiplexers and registers are required five times larger than the value of constraint length for without resulting in significant performance degradation [4]. The second method is the track back that the decision bits from ACSU will be stored in memory and used to trace backward as done similarly as in the method of RE. This scheme works with a higher complexity and with slower in speed than that of the RE method. But it requires smaller area for implementation as the advantage.

3. THE PROPOSED CONCEPT

In this paper, the viterbi decoder is designed and implemented aims for 3GPP standard. Its specifications are summarized in Table 1. The diagram of proposed architecture is shown in Fig. 3. It is composed of branch

metric unit (BMU), add compare select unit (ACSU),survival memory unit (SMU), and a control unit.

ACSU

Controller

BMU

4 ACS

S/P

Dual port RAM

SMU(RE)

Fig 3:The Proposed Viterbi Decoder

+

+

+

+

+

+

+

+

+

+

+

3

4

SMU Pointer

I0 Y0

Y1

Y2

Y3

I1

Y4

Y5

I2 Y6

Y7

Fig. 4: The 3 bit soft decision branch metric unit

3.1 Branch Metric Unit (BMU)

This BMU works for calculating branch metric from received input sequences. Its structure of the 8-level (3bits) soft decision and adaptive coding rate of both ½ and 1/3 are designed in this work. This is to suit with the

4 ACS architecture as the purpose. This BMU consists of 12 adders (8 bits), 8 multiplexers (8:1) and 8 dual rate of convolution encoder (1/2 and 1/3). The 8-bits adders calculate Euclidean distances from received input codeword following by

BM(b0,b1,b2) = −I0(b0) - I1(b1) - I2(b2) where I0, I1 and I2 are soft decision inputs, and BM(b0,b1,b2) is branch metric. b0, b1, and b2 are binary values varied from 0 to 7, and S(bi) represents the symmetric symbols of a or -a which corresponding to the logic value either "one" or "zero" [8]. For example b0, b1 ,and b2 represents to (0, 1, 0), then S(bi) is (-, +, -). Next, the branch metrics from adders are selected by dual rate of convolution encoder and multiplexers. In addition, this BMU is divided into two pipeline stages in order to increase its speed. The diagram of this 3 bit soft decision BMU is shown in Fig. 4.

3.2 Add Compare Select Unit (ACSU)

The ACSU work in serial architecture style with 4 ACS for having small implementation area. This BMU consists of 4 ACS, a 64 x 64 dual port RAM, and a serial to parallel (S/P) unit as shown in Fig. 3. The ACS takes two inputs respectively of branch metric from BMU and path metrics from path metric memory (PMM) which resides in dual port RAM. Then, the updated path metric and the decision bit are given as the output. These outputs are followed to calculate for the decision bit and for the updated path metric by using Equation 1. This ACS takes modulo normalization technique with modified comparison rule of [9] to avoid the path metric overflow. In addition, The ACS is divided into three pipeline stages for increasing speed. The diagram of to 3-pipelines ACS is shown in Fig. 5. ACSU computes by using 64 clock cycles for calculating one of information bit on 256 states of decoding. While the dual port RAM stores that updated path metrics to use for the next bit of computing process. At the last step, S/P unit converts 4 to 256 decision bits on each of one-bit decoding and then transfers them to

S MU.

3.3 Survival Memory Unit (SMU)

In SMU, register exchange (RE) method is used in this work. This RE is modified to suit with 4 ACS style for reducing implemented area. This modified RE method is shown in Fig. 6. Since, the original register exchange structure is large, it is modified in order to reduce the size of multiplexers (2:1) by replacing with 6K x 2K-1 shift registers and a column of 2K-1 multiplexers (2:1). The shift registers store decision bits and then the 2K-1 multiplexers (2:1) exchange

decision bits for decoding operation. It is done by using 6K clock cycles to calculate one of information.This proposed SMU structure consists of 64 x 256 shift registers and 256 multiplexers (2:1). The 256 decision bits from ACSU are saved to 256 shift registers on each of one-bit decoding (64 clock cycles). Then,

decision bits are exchanged by 256 multiplexes on each of clock-cycle of the decoding. The decoded bit is initiated after an equivalent time of 54 (K is 9) has elapsed without resulting in significant performance degradation. In this design, 64 of survivor paths length of decoding to suite for ACSU operation, is selected. The proposed SMU can decrease 63-colums out of 256 of the multiplexers and also reduce complexity of multiplexers interconnection. This SMU with modified register exchange

4. Memoryless Viterbi Decoder

The PVD keeps track of the current row position of the decoder in the memory.It makes use of the fact that the bit appended to each row of memory is exactly the bit that is shifted into the pointer to form the new pointer to that row of memory.To show the functionality,k=3 a(four states) and rate=1/3 convolutional encoder(G0=101G1=111,G2=111) is employed to encode the input sequence

of(110110010000).The code stream((111,011,100,100,011,011,100,101,101,111,000,000)is generated and transmitted over a noisy channel.The noisy code

stream,(111,110,101,100,001,011,011,110,101,111,000,000) for example.is received at the decoder.The underlined bits are incorrect because of the noise encountered during transmission.Applying the PVD method result in the successive values for the pointera and row of memory over time.

Pointer

The pointer block contains the current state of the decoder (8 bits). It is reset to zero (the initial state of the encoder) for each of bits decoded. Then, for each bit decoded, the pointer content is updated, by the output of the ACSTOSM module. The exact position of the bit that will be updated is determined by the MSB block, which acts as a circular pointer to the pointer block.Power reduction estimation and performance evaluation are conducted in the next section.

00

01

10

01

11

10

01

10

00

01

10

00

1

10

101

1011

10110

1011010

10110100

101101000

1101100010

110110010010

110110010000

Fig.5.MLVD approach with pointer mplementation (upper box carries the pointer and lower box carries the decode bits)

Fig 6 .example of hybrid registers exchange

method

ARCHITECTURE OF MLVDMLVD is an extra low-power design for the VD with the only restriction of resetting the encoder register at each of the encoded data bits and providing the necessary synchronization with the decoder. The block diagram of the MLVD, designed in VHDL, is shown in Fig.6.In order to have a built-in self-test design, a linear feedbackshift register (LFSR), a convolutional encoder and a comparator are added. The LFSR produces the random input for the encoder, whereas the comparator compares the delayed version of the LFSRs output with the decoded output of the MLVD. An output signal, status, indicates the correct functionality of the design.The following is a discussion of the different parts of the MLVD design and their functionality.