miércoles, 24 de octubre de 2012

6th Assignment: Dragon Stream CIpher

For this week, I chose the Dragon stream cipher for the assignment. I will explain how it works, its advantages and the common attacks-vulnerabilities.

How does it work?

The Dragon Stream cipher explained in the PDF(link at the references) is used to generate a keystream using a key and an initialization vector as input. These two are provided at the end of the PDF, including both sboxes that can be used to test some examples. I used a 256-bit key and initialization vector.

F Function

The first thing defined in the PDF is the F function used both in the key setup and generation. As the PDF defines it: 

"The F function is a reversible mapping of 192 bits (six 32-bit words) to 192 bits. It takes six 32-bit words as input and produces six 32-bit words as output."

These six 32-bit input words are normally defined as a, b, c, d, e, f. And the output words are a', b', c', d', e', f'. Also the function has other components called G1, G2, G3, H1, H2, H3. These are used to provide high non-linearity. 

The main operations used in the F function are the xor(+) and the addition modulo 2^32[+]. In python we can do both with decimal integers, but normally we would need the binary representation for each number. For example:

 a = 29
 b = 21    
                   XOR                                                  ADDITION MODULO 2^32 

               a  (+) b  =  ?                                                           a [+] b = ?
               29  (+) 21 = ?                                                  (a + b) % 2^32 = ?
       << Conversion to Binaries >>                                (21 + 29) % 2^32 = ?
             11101 (+) 10101 = 01000                                 (50) % 2^32 = 32
       << Back to Decimal >>
               29 (+) 21 = 8
               a  (+) b = 8
This F function is described as follows:

The inputs as mentioned earlier are six 32-bit words and the outputs are also six 32-bit words, but mixed. 

Now the G and H functions are constructed using the 8x32 bit  S1 and S2 sboxes. These sboxes are provided at the end of the PDF as explained early. Also the 32-bit input used in the G and H functions is separated in four bytes: x0, x1, x2 and x3. Then the results for each G and H function are obtained as follows:

That was the F function, which is used in both the initialization and generation, explained bellow.


The initialization is used to generate an internal state W, used later in the generation. This internal state is initially filled by concatenating the key(K) and the initialization vector(IV). Then this is divided into eight 128-bit words (W0-W7). Then a 64-bit Memory component is defined.  The next thing is repeating for 16 rounds a series of operations involving W, so we can initialize the internal state, which is going to be used in the keystream generation.

The initialization algorithm as stated in the PDF is:

The input is the key and the initialization vector, and the output is  the eight 32-bit words that define the initialized internal state. Simply explained, the initialization algorithm involves:
  1. Concatenate K + (K xor IV) + ¬(K + IV) + IV which will be a 1024-bit word, and separate it into eight 128-bit words W0 - W7.
  2. Initialize the memory variable M = 0x0000447261676F6E (64-bit).

    Repeat 16 times
  3.  Calculate (W0 xor W6 xor W7) and divide it into four 32-bit words a, b, c, d.
  4. Divide M into two 32-bit words e, f.
  5. Calculate a', b', c', d', e', f' using the F function defined previously.
  6. Join a', b', c', d' and calculate xor with W4, this will be the new W0.
  7. Calculate the new Wi = Wi - 1, looping from i = 7 to 1.
  8. Join e' and f', this will be the new M.
Keystream Generation:

The keystream generation involves generating a keystream using the previously initialized internal state, and the last value of M as input. The internal state is now divided into thirty-two 32-bit words (B0-B31). During each round of the keystream generation, B0, B9, B16, B19, B30, B31 are used as the F function inputs a, b, c, d, e, f. 

Each round of the keystream generation the output is a 64-bit length keystream k, and an updated state B and M, which can be used again to generate more and more keystreams.
The process is described as follows:

The key generation algorithm simply explained:
  1. Split M into two 32-bit words ML (left part) and MR(right part).
  2. Obtain the six 32-bit words, making:
    a = B0, b = B9, c = B16, d = B19, e = B30 xor ML, f = B31 xor MR
  3. Calculate a', b', c', d', e', f' using the F function defined previously.
  4. Make the new B0 and B1 equal to b' and c' respectively.
  5. Calculate the new B's with Bi = Bi-2  looping from i = 32 to i = 2.
  6. Update M = M + 1.
  7. Get the 64-bit keystream joining a' and e'.
  8. Repeat using the last B's and M values.

For an example of this cipher I created python script that used the test data provided in the PDF(256-bit key and IV). The script reads from the sbox1.dat and sbox2.dat files the S1 and S2 sboxes used in the f function, and generates one round of a keystream with the said test data. I use a lot of string operations to convert stuff into binary strings and manipulate them easier (even though it is also possible to do it with decimals). The Python code:

Output(Generating 100 rounds):

Note: There was no information available that I could found about how the keystream is used to encrypt a plaintext, so I didn't create an example for that.


Time-Memory Trade-off Attack

Time-Memory tradeoff attacks  rely on pre-computation to reduce the effort required for a key recovery attack on a keystream. The attack comprises two steps.
  • The first, the preprocessing step, sees the attacker calculating a table of keys or internal states and corresponding keystream prefixes. The table is ordered upon the prefix. 
  • The second step involves observing keystreams and attempting to match each against a prefix in the table. If the match is successful, then with some likelihood the internal state is known by reading the opposing entry in the table
Guess and determine

To calculate keystream words from two rounds of Dragon, the attacker is required to guess more than 256 bits of the internal state. This is worse than exhaustive key search, and makes guess and determine attacks on Dragon unfeasible.

Distinguishing Attacks

If the output sequence of a stream cipher can be statistically distinguished from a random sequence, then the cipher is not strong enough for cryptographic applications. Dragon is designed with a large state and complex initialization and update function. It has no linear masking, and therefore immune to this type of distinguishing attacks


1 comentario: