This chapter includes a detailed introduction regarding my thesis work. My thesis work is basically on stream cipher that is geffe generator. Stream ciphers are an important class of keystream generation algorithms. They usually encrypt individual characters (binary digits) of a plain text message one time at a time, using an encryption transformation which varies with time. Feedback shift registers particularly linear feedback shift registers are the basic components of many key stream generators.
Modern cryptology is for protecting sensitive information on the Internet from several forms of attacks. Cryptography is the study of designing cryptosystems, while cryptanalysis is the study of breaking these cryptosystems. Today the scope of cryptology is not limited to data encryption and decryption only. The main purposes of cryptology are threefold: (i) Confidentiality of message, (ii) Message authentication and (iii) Entity authentication.
According to the nature of the secret keys used in the encryption or decryption algorithms, cryptology can be divided into two broad classes. If both the encryption and decryption algorithms use the same shared key, then they come under symmetric cryptography otherwise asymmetric cryptography. In modern cryptography, stream ciphers are most useful in applications where information needs to be processed at high speed and when less computational resources are available.
Purpose of the document:
The purpose of my work is to analyze security of one of stream cipher that is Geffe generator against two different and powerful attacks that is the correlation attack and algebraic attack to make the data more secure in the network.
Applications of stream ciphers:
Stream ciphers are often used in applications where plaintext comes in quantities of unknowable length-for example, a secure wireless connection. If a block cipher were to be used in this type of application, the designer would need to choose either transmission efficiency or implementation complexity, since block ciphers cannot directly work on blocks shorter than their block size. For example, if a 128-bit block cipher received separate 32-bit bursts of plaintext, three quarters of the data transmitted would be padding. Block ciphers must be used in ciphertext stealing or residual block termination mode to avoid padding, while stream ciphers eliminate this issue by naturally operating on the smallest unit that can be transmitted (usually bytes).
Another advantage of stream ciphers in military cryptography is that the cipher stream can be generated in a separate box that is subject to strict security measures and fed to other devices, e.g. a radio set, which will perform the xor operation as part of their function. The latter device can then be designed and used in less stringent environments.
Objectives:
The aim of my research and implementation work is to evaluate performance of Geffe generator by performing/implementing its algorithm and then applying two different powerful attacks on the algorithm to enhance and check its performance against both types of attacks. And then I will compare impact of correlation attack and algebraic attack on the algorithm of Geffe generator which is non-linear combination generator (one type of stream ciphers). Objective is to check out that Geffe generator is more secure against which attack, correlation attack or algebraic attack.
Motivation:
Data Security is extremely enormous field. Basically stream/block ciphers are used to secure the data by applying encryption and decryption processes. As the data is the foremost asset for the organizations so there is a desperate need to protect the data from being pirate. Data protection is achieved by applying different algorithms to generate a powerful keystream so that attackers can't judge them.
Research Statement:
This document is basically comparison of two most common attacks i.e., algebraic attack and correlation attack by applying them on the stream cipher algorithm that is Geffe Generator and analyzing that Geffe Generator is more secure against which attack (correlation/algebraic).
Scope of Document:
Any project is investment, either financial, time, or knowledge investment. It is to be returned and though it is to be planned well. As the use of computers is so wide and is expanding day by day hence it becomes very necessary to protect and secure the data against attackers and other unauthorized third party.
The scope of the document is
To provide awareness about the strength of stream ciphers based on LFSRs
To show the impact of attacks by analyzing and evaluating existing algorithm of geffe generator with its strengths and weaknesses.
To analyze the strength of geffe generator against correlation attack and algebraic attack.
Outline:
The research work is arranged in the subsequent sections. Chapter 1 is a basic introduction of the research work. Chapter 2 covers the background knowledge regarding the stream ciphers. Chapter 3 entails the geffe generator algorithm, model, characteristics and weaknesses. Chapter 4 covers the attacks i.e., correlation attack and algebraic attack. Chapter 5 describes the analysis of the attacks by implementing them on geffe generator algorithm. Chapter 6 includes the implementation, testing and results respectively. Chapter 7 is concerned about the summary and conclusion.
BACKGROUND
This chapter is related to the background and origin of geffe generator in detail. Geffe generator is basically a stream cipher for generating key stream for encryption and decryption purposes.
Cipher:
In cryptography, cipher is basically an algorithm for performing encryption and decryption that is a series of well-defined steps that can be followed as a procedure. "Cipher" is alternatively spelled "cypher".
Figure:1 Cipher
When using a cipher the original information is known as plaintext, and the encrypted form as cipher text. The cipher text message contains all the information of the plaintext message, but is not in a format readable by a human or computer without the proper mechanism to decrypt it; it should resemble random garbage to those not intended to read it [1].
Categories of Ciphers:
As mentioned below in the figure 2, ciphers are divided into rotor machines, classical and modern ciphers. Here I will briefly discuss modern ciphers two most important subcategories that is stream ciphers and block ciphers.
Figure: 2 Ctegories of cipher.
Block cipher:
Block ciphers basically encrypts block of data of fixed size. In cryptography, a block cipher is a symmetric key cipher operating on fixed-length groups of bits, called blocks, with an unvarying transformation. A block cipher encryption algorithm might take (for example) a 128-bit block of plaintext as input, and output a corresponding 128-bit block of cipher text. The exact transformation is controlled using a second input, the secret key. Decryption is similar: the decryption algorithm takes, in this example, a 128-bit block of cipher text together with the secret key, and yields the original 128-bit block of plaintext. [3]
Stream Cipher:
Stream cipher is basically encryption of continuous streams of data. Stream ciphers can be either symmetric-key or public-key. In cryptography, a stream cipher is a symmetric key cipher where plaintext bits are combined with a pseudorandom cipher bit stream (key stream), typically by an exclusive-or operation. In a stream cipher the plaintext digits are encrypted one at a time, and the transformation of successive digits varies during the encryption. An alternative name of stream cipher is a state cipher, as the encryption of each digit is dependent on the current state. Practically, the digits are typically single bits or bytes [2].
Types of Stream Ciphers
Synchronous stream ciphers
In a synchronous stream cipher a stream of pseudo-random digits is generated independently of the plaintext and cipher text messages, and then combined with the plaintext (to encrypt) or the cipher text (to decrypt). In the most common form, binary digits are used (bits), and the key stream is combined with the plaintext using the exclusive or operation (XOR). This is termed a binary additive stream cipher. [2]
Self-Synchronizing stream ciphers
Another approach uses several of the previous N cipher text digits to compute the key stream. Such schemes are known as self-synchronizing stream ciphers, asynchronous stream ciphers or cipher text auto key (CTAK)[2]. The idea of self-synchronization was patented in 1946, and has the advantage that the receiver will automatically synchronize with the key stream generator after receiving N cipher text digits, making it easier to recover if digits are dropped or added to the message stream. Single-digit errors are limited in their effect, affecting only up to N plaintext digits. [2]
Stream Ciphers vs. Block ciphers
Stream ciphers are faster and smaller to implement than block ciphers, however, they have an important security gap. If the same key stream is used, certain types of attacks may cause the information to be revealed. A stream cipher consists of a state machine that outputs at each state transition one bit of information. This stream of output bits is commonly called the running key. The state machine is nothing more than a pseudo-random number generator [7]. Block ciphers encrypt information by breaking it down into blocks and encrypting data in each block [7]. A block cipher encrypts data in fixed sized blocks (commonly of 64 bits).
Linear Feed Back Shift Register (LFSR) based stream cipher
Feedback Shift Registers are a commonly used method of producing pseudo-random sequences. A linear feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state. The only linear function of single bits is xor, thus it is a shift register whose input bit is driven by the exclusive-or (xor) of some bits of the overall shift register value. [4] The initial value of the LFSR is called the seed, and because the operation of the register is deterministic, the stream of values produced by the register is completely determined by its current (or previous) state. Likewise, because the register has a finite number of possible states, it must eventually enter a repeating cycle. However, an LFSR with a well-chosen feedback function can produce a sequence of bits which appears random and which has a very long cycle. Applications of LFSRs include generating pseudo-random numbers, pseudo-noise sequences, fast digital counters, and whitening sequences [4]. Both hardware and software implementations of LFSRs are common.
When we use an LFSR in cryptology, the feedback polynomial and thus the feedback coefficient are known. The initial state however is not known and this is just the secret key. So it's clear that the use of 1 LFSR is not secure. Just use the linear relations and use some linear algebra to find the inital state.
John Koeter, in his paper named "What's an LFSR?" [5], explained what a Linear Feedback Shift Register (LFSR) and a Parallel Signature Analyzer (PSA) are and how to use them to test a TI Application-Specific Integrated Circuit (ASIC) using SCOPEE cells. According to John Koeter, an LFSR is a shift register that, when clocked, advances the signal through the register from one bit to the next most-significant bit. Linear feedback shift registers make extremely good pseudorandom pattern generators. When the outputs of the flip-flops are loaded with a seed value (anything except all 0s, which would cause the LFSR to produce all 0 patterns) and when the LFSR is clocked, it will generate a pseudorandom pattern of 1s and 0s. Note that the only signal necessary to generate the test patterns is the clock. Each output of the LFSR is multiplexed with an ASIC input and, when the device is placed in the LFSR (test) mode, the random, high-toggle-rate patterns produced are extremely good for generating high-fault coverage. To minimize the number of results that need to be compared to expected results, a PSA is used [5]. The PSA compresses multiple parallel patterns into a single pattern signature that is compared to the expected value. If the signatures match, it is assumed that the ASIC passed the test vectors applied and there are no manufacturing defects.
In the paper, named "Clock-Controlled Shift Registers for Key-Stream Generation" [6], author "Alexander Kholosha" estimated the period of the sequence generated by a clock-controlled LFSR with an irreducible feedback polynomial and an arbitrary structure of the control sequence, as well as some randomness properties of this sequence including element distribution and the autocorrelation function. Also he had constructed and analyzed a specific key-stream generator that applies clock-control [6]. Finally, Alexander presents a comprehensive survey of known correlation attacks on clock-controlled registers and their memory less combiners.
Properties of LFSR:
LFSRs have the following properties that are attractive in cryptography.
The period is large, and grows exponentially along with the size of the LFSR.
LFSR sequences have good statistical properties. The statistical distribution of the sequence generated is very close to the uniform distribution.
LFSR sequences are easy to construct and implement both in software and hardware.
It has a low linear complexity.
Applications of Linear feedback Shift Registers (LFSR)
LFSRs can be implemented in hardware, and this makes them useful in applications that require very fast generation of a pseudo-random sequence, such as direct-sequence spread spectrum radio. LFSRs have also been used for generating an approximation of white noise in various programmable sound generators. The Global Positioning System uses an LFSR to rapidly transmit a sequence that indicates high-precision relative time offsets [4].
Uses as counters
LFSR counters have simpler feedback logic than natural binary counters or Gray code counters, and therefore can operate at higher clock rates. However it is necessary to ensure that the LFSR never enters an all-zeros state, for example by presetting it at start-up to any other state in the sequence [4]. One can obtain any other period by adding to an LFSR that has a longer period some logic that shortens the sequence by skipping some states.
Uses in cryptography
LFSRs have long been used as pseudo-random number generators for use in stream ciphers (especially in military cryptography), due to the ease of construction from simple electromechanical or electronic circuits, long periods, and very uniformly distributed output streams. However, an LFSR is a linear system, leading to fairly easy cryptanalysis. For example, given a stretch of known plaintext and corresponding cipher text, an attacker can intercept and recover a stretch of LFSR output stream used in the system described, and from that stretch of the output stream can construct an LFSR of minimal size that simulates the intended receiver by using the Berlekamp-Massey algorithm [4]. This LFSR can then be fed the intercepted stretch of output stream to recover the remaining plaintext.
Three general methods are employed to reduce this problem in LFSR-based stream ciphers:
Non-linear combination of several bits from the LFSR state;
Non-linear combination of the output bits of two or more LFSRs (see also: shrinking generator); or
Irregular clocking of the LFSR, as in the alternating step generator.
Important LFSR-based stream ciphers include A5/1 and A5/2, used in GSM cell phones, E0, used in Bluetooth, and the shrinking generator. The A5/2 cipher has been broken and both A5/1 and E0 have serious weaknesses.
Uses in digital broadcasting and communications
To prevent short repeating sequences (e.g., runs of 0's or 1's) from forming spectral lines that may complicate symbol tracking at the receiver or interfere with other transmissions, linear feedback registers are often used to "randomize" the transmitted bit stream. This randomization is removed at the receiver after demodulation [4]. When the LFSR runs at the same rate as the transmitted symbol stream, this technique is referred to as scrambling. When the LFSR runs considerably faster than the symbol stream, expanding the bandwidth of the transmitted signal, this is direct-sequence spread spectrum [4].
According to the Alexander Kholosha [8], the principle part of any synchronous stream cipher is a key-stream generator. In order to satisfy the basic security requirements of long period, large linear complexity and uniform distribution properties for a key stream, the vast majority of key-stream generators are built following few basic design principles. The long period and uniformity are achieved due to the use of linear feedback shift registers (LFSR's). Nonlinearity is introduced either explicitly by applying nonlinear functions (e.g., combination and ¯lter generators) or implicitly by controlling the clock of an LFSR [8]. In the thesis we contribute both to the study of nonlinear cryptographic functions and to the theory of clock-controlled LFSR's [8].
Stream Ciphers using LFSRs:
As LFSRs are very easy to implement, that's why people have introduced many other types of steam ciphers. Two of the most widely used types are the filter generator and the combination generator.
Filter Generator
A filter generator consists of a single LFSR and uses a nonlinear Boolean function to produce the output.
Figure:3 Filter Generator
Combination Generator
A Combination generator constis of several linear LFSRs which are combined by a nonlinear BF (boolean function), also called the combiner or combination function.
Figure:4 Combinnation Generator
I also want to mention that the output of the combination generator can also be represented by an LFSR. The length of this LFSR is lesser or equal then the value evaluated by the BF with inputs the length of the combined LFSRs.
Non linear Combination Generator
One general technique for destroying the linearity inherent in LFSRs is to use several LFSRs in parallel. The keystream is generated as a nonlinear function f of the outputs of the component LFSRs. Such keystream generators are called nonlinear combination generators, and f is called the combining function.
Figure:5 Non-linear combination generator
Attacks:
Attack is actually either a successful or unsuccessful attempt at breaking part or all of a cryptosystem. This attack is on cryptographic encryption or key-generation.
Complexity of the Attack
Time: The number of primitive operations that are needed to execute the attack.
Memory: The amount of storage required to perform the attack.
Data: The amount of data required for the attack
Some Types of Attacks:
Brute Force Attack
Brute Force Attack is a form of attack in which each possibility is tried until success is obtained. Typically, a cipher text is deciphered under different keys until plaintext is recognized.
Algebraic attack
A method of cryptanalytic attack used against block ciphers that exhibit a significant amount of mathematical structure.
Birthday attack
A brute-force attack used to find collisions. [9] It gets its name from the surprising result that the probability of two or more people in a group of 23 sharing the same birthday is greater than 1/2.
Chosen ciphertext attack
An attack where the cryptanalyst may choose the ciphertext to be decrypted
Chosen plaintext attack
A form of cryptanalysis where the cryptanalyst may choose the plaintext to be encrypted [9]
Ciphertext-only attack
It's a form of cryptanalysis where the cryptanalyst has some cipher text but nothing else.
Correlation attack
Correlation attacks are a class of known plaintext attacks for breaking stream ciphers whose keystream is generated by combining the output of several linear feedback shift registers (LFSR) using a Boolean function. Correlation attacks are possible when there is a significant correlation between the output state of one individual LFSR in the keystream generator and the output of the Boolean function that combines the output state of all of the LFSRs. Combined with partial knowledge of the keystream (which is easily derived from partial knowledge of the plaintext, as the two are simply XORed together), this allows an attacker to brute-force the key for that individual LFSR and the rest of the system separately.
Dictionary attack
This attack is a brute force attack that tries passwords and or keys from a precompiled list of values [9]. This is often done as a pre-computation attack.
REVIEW OF LITERATURE
Shimin Wei (2006) [10] had describe the generalization of geffe generator by presenting a new construction of a pseudorandom generator based on a simple combination of q+1 LFSRs over GF(q), which is a generalization of geffe generator. They introduced a new shrinking generator, called Geffe's Shrinking generator over GF(q), a combination of geffe and shrinking generators. As geffe generator uses three LFSRs i.e., LFSR1, LFSR2 and LFSR3, Shimin Wei has also offered the advantage of being useful as a module of a superstructure of similar arrangements, i.e., the entire generator as given in the below figure could play the role of LFSR1 in the same arrangement with like generators. The complexity would escalate accordingly.
Figure:1 Superstructure of geffe genrator
Shimin Wei [10] proposed a new shrinking geffe generator. For this he assumed that s=(s0, s1, s2, L) be the output of an individual LFSR. At time k, he had consider the pair (sk, sk+1) of terms from the output of the LFSR. If sk=1, the term sk+1 is output by the self-shrinking generator. If sk=0, no term is output. For new shrinking generator over GF(q), he suggested to use q LFSR's over GF(q), say LFSRi, i∈GF(q), as basic components. The pseudorandom bits had produced as follows: If the control generator LFSR0 produces a s0j=j(≠0), then LFSRj is connected, i.e. the resulting generator outputs ti=sji; otherwise, no term is output, where σj=(sj0, sj1, sj2, L) were q-ary sequences outputted by LFSRj, respectively; τ=(t0, t1, t2, L) was the resulting sequence. Hence the resulting generator was called Geffe's shrinking generator over GF(q). The construction had attractive properties as simplicity (conceptual and implementation wise), scalability (hardware and security), proven minimal security conditions (exponential period, exponential linear complexity).
Saria Safdar (2008) [11] had implemented algorithms of clock-control generators to found that which one is more secure against correlation attack. Both the algorithms of clock-control generators were implemented by gradually increasing the lengths of initial input bits to LFSR's. Two clock-controlled based stream ciphers were presented in that study and their common weaknesses were analyzed. She presented alternating step generator and shrinking generator models, algorithms and their analysis through different case studies. At the end she compared both the clock controlled generators by applying correlation attack on them. In her study, she found that shrinking generator is more secure than alternating step generator against correlation attack. Saria analyzed that the increase in the initial input bits of LFSR's resulted in increase of the key length. In case of Shrinking Generator if L1, L2=l i.e GCD(L1,L2)=1 then shrinking generator has a security level approximately equal to 22L . Secure lengths of Shrinking Generator and Alternating Step Generator against the correlation attack were found 64 and 128. When initial inputs to LFSR's were 64 and 128, generated key length was very large, so the possibility of correlation attack was reduced. After comparing both the algorithms she found that Shrinking Generator was a better choice to generate a key stream because of its simple structure and efficient nature. Keystream sequence generators that produce sequences with large periods, high linear complexities and good statistical properties are very useful as building blocks for stream cipher applications. The use of clock-controlled generators in keystream generators appeared to be a good way of achieving sequences with these properties.
Nicolas and Willi Meier [12] described the algebraic attacks on stream ciphers with linear feedback shift registers, such that the only non-linear component is a filtering function. They had reduced their cryptanalysis to solve a system of algebraic equations. The main contribution of Nicolas and Willi was that if the non-linear function of a cipher with linear feedback uses only a small subset of state bits, the cipher will be insecure. As classical construction of stream ciphers is to combine several LFSRs and a highly non-linear Boolean function f so their security is usually analyzed in terms of correlation attacks, which can be seen as solving a system of multivariate linear equations, true with some probability. In this paper of [12] it has been shown that how to substantially lower the degree of equations by multiplying them by well-chosen multivariate polynomials. Nicolas and Willi had also considered a new general algebraic attack that had broken the stream ciphers by satisfying all the previously known design criteria in at most the square root of the complexity of the previously known generic attack. They only considered synchronous stream ciphers in paper because in synchronous stream ciphers, each state is generated from the previous state independent of the plaintext. They discussed over defined system for solving multivariate equations of degree d. Linearization, XL, and Gaussian reduction method was used in the paper [12] to solve multivariate equations. Cryptanalysis of toyocrypt, a stream cipher, was also done in the paper.
Martin Kreuzer (2009) [13] had surveyed on the current techniques of algebraic cryptanalysis. After introducing the basic setup of algebraic attacks and discussing several attack scenarios for symmetric cryptosystems, public key cryptosystems, and stream ciphers, he discussed a number of individual methods. The XL, XSL, and MutantXL attacks are based on linearization techniques for multivariate polynomial systems. The XL Attack uses the XL Algorithm which in turn is based on a technique called relinearization. Then he looked at Grobner basis and border bases methods. In the last section of the paper he introduced attacks based on integer programming techniques and tried them in some concrete cases.
Karthik N. et al. (2007) [14] considered nonlinear combination generators which are made up of a number of linear feedback shift registers whose outputs are combined by a nonlinear boolean function. Though only parity check equations of weight 3 are considered here, the method can be extended to higher weights as well. He took Molland's decoding algorithm and found a faster and more efficient way to generate many equations, improving both the actual computation and pre-computation complexity.
Thomas et al. (2002) [15] presented a theoretical analysis of a powerful algorithm for fast correlation attacks. The analysis resulted in a bound which determines the required number of observed keystream symbols needed in a successful attack. The bound was derived using random coding bounds for convolutional bounds. If he compares the theoretical bound with results obtained by simulations he observed that the bound was rather tight. Thus, Thomas concluded that the results from the theoretical analysis can be used as a tool for evaluating the security of stream ciphers based on LFSRs, where the shift register length makes simulations impossible.