The main focus of this thesis is the analysis of RC4 stream cipher and its implications in the design issues of shuffle-exchange paradigm of stream cipher.
The RC4 stream cipher has two components. These are the Key Scheduling Algorithm (KSA) and the Pseudo-Random Generation Algorithm (PRGA). The KSA uses a secret key $K[0\ldots l-1]$ of $l$ bytes to scramble a permutation $S[0\ldots N-1]$ of $N$ bytes using two indices $i$ and $j$. The PRGA uses this scrambled permutation and performs further shuffle-exchanges to produce keystream output bytes $z_1, z_2, z_3,\ldots$.

First, we perform a detailed theoretical analysis of RC4 KSA. We derive explicit formulae for the probabilities with which the permutation bytes $S[y]$ at any stage of the KSA are biased to the secret key. Theoretical proofs of these probabilities have been left open since Roos' observation (1995). Along the same line, we analyze a generalization of the RC4 KSA corresponding to a class of update functions of
the indices involved in the swaps and find that such weaknesses are intrinsic in shuffle-exchange kind of key scheduling. Moreover, for the first time we show that biases towards the secret key also exist in $S[S[y]], S[S[S[y]]]$, and so on, for initial values of $y$. We also study a weakness of the RC4 Key Scheduling Algorithm (KSA) that has already been noted by Mantin and Mironov. We present a simple proof that each permutation byte after the KSA is
significantly biased (either positive or negative) towards many values in the range $0, \ldots, N-1$. Further, we present a detailed empirical study over Mantin's work when the theoretical formulae vary significantly from
experimental results due to repetition of short keys in RC4.

Based on our analysis of the key scheduling, for the first time we show that the secret key of RC4 can be recovered from the state information in a time much less than the exhaustive search with good probability. Our research generated significant interest in the community on this problem and subsequently a series of works in this area has been published.

Next, we analyze the RC4 PRGA in detail and find many new biases of the keystream bytes towards the secret key. We show that the first byte of the keystream output of RC4 has non-negligible bias towards the sum of the first three bytes of the secret key. This result is based on our observation that the index, from where the
first byte of the keystream output is chosen, is approximately twice more likely to be 2 than any other value. Our technique is further used to
theoretically prove Roos' experimental observation (1995) related to weak keys. Based on our analysis of the KSA and the Glimpse bias (1996), we also present a complete framework is to show that many keystream output bytes of RC4 are significantly biased towards several linear combinations of the secret key bytes. We find new biases in the initial as well as in the 256-th and 257-th keystream output bytes and discuss how these biases may propagate further if the value of $j$ is known at certain points during the PRGA. Further, we show that the non-randomness of the permutation makes the
keystream non-random and this in turn leads to the construction of new distinguishers for RC4 keystream.

We then present a complete characterization of the RC4 PRGA evolution for one step. Considering all the permutations (we also keep in mind the
Finney states), we find that the distribution of $z$ is not uniform given $i, j$. A corollary of this result shows that information about $j$ is always leaked from $z$. Moreover, studying two consecutive steps of RC4 PRGA, we prove that the index $j$ is not produced uniformly at random given the value of $j$ two steps ago. We also provide additional evidence of $z$ leaking
information on $j$. Further, we present a novel distinguisher for RC4 which shows that under certain conditions the equality of two consecutive bytes is more probable than by random association. The most important fact about this analysis is that it holds regardless of the amount of initial keystream bytes
thrown away during the RC4 PRGA.

We also study the behaviour of RC4 when the index $j$ is stuck at a certain value not known to the attacker. This work presents the nontrivial issues involved in the analysis, identifying how the information regarding $S$
starts leaking with as low as 258 keystream output bytes. The leakage of information increases as more bytes are available and finally the complete $S$ is recovered with $2^{11}$ bytes in around $2^{25}$ time complexity. This study presents a nice combinatorial structure that is relevant to the
fault analysis of RC4.

Finally, we propose few additional layers over the RC4 KSA and RC4 PRGA. Analysis of the modified cipher (we call it RC4$^+$) shows that this new strategy avoids existing weaknesses of RC4.