Rocke Verser's DES key search method

This description is excerpted with permission from a message Rocke Verser sent me in April 1997.

- Darrell Kindred <dkindred@cs.cmu.edu>

[Darrell Kindred's bitslice DES info]


LoopA:  Each pass through LoopA performs 256 complementary pairs of
        keys.  (512 keys per iteration of LoopA.)

LoopB:  Each pass through LoopB performs 16 complementary pairs of
        keys.  (32 keys per iteration of LoopB.)  This loop is
        arranged so that the first 1+5/8 rounds only need to be
        computed once.  These first 1+5/8 rounds are invariant
        within LoopC.

LoopC:  Each pass through LoopC tests 1 key.  Each pass through
        this loop corresponds to a pass through the "PrecomputeRound1"
        function.  Since the last 1+6/8+1/8 round are precomputed,
        they do not need to be recomputed in LoopC.


LoopA:
  /* Perform a set of 256 complementary pairs of keys (512 keys per set) */

  Call Precompute-Round-1

  LoopB:
    /* Perform a set of 16 complementary pairs of keys (32 keys per set) */

    Call Precompute-Round-16

    LoopC:
      /* Test a single key */

      Fetch the "primary" or "complementary" result from "precompute-round-16"

      Apply S-boxes 2, 3, 7 of Round 15.

      Apply Rounds 14, 13, 12, 11, 10, 9, 8, 7, and 6.

      Apply S-box 7 of Round 5.

      Compare 4-bit result of Round 5 with stored result of Round 3
        from "precompute-round-1".  (I.e., compare S-box 7 outputs.)

      If no match, iterate LoopC.  
      [dk: this abort rarely happens in bitslice: 1.6% for 64-bit, 12.7% w/ 32]

      Apply S-boxes 3 and 4 of Round 5.

      Apply S-boxes 1 and 7 of Round 2 to previously stored result.

      Apply S-boxes 3 and 4 of Round 3 to previously stored result.

      Compare 8-bit results of Round 5 with 8-bit result of Round 3.
        (I.e., compare S-box 3 and 4 outputs.)

      If no match, iterate LoopC.

      /* We get here, on average 1 time in 4096 keys */
      /* [dk: with 64-bit bitslicing, we get here 1.6% of the time] */

      Apply S-boxes 1, 2, 5, 6, and 8 of Round 5.

      Apply S-boxes 1, 2, 5, 6, and 8 of Round 3.

      Compare 20-bit results of Round 5 with 20-bit result of Round 3.
        (I.e., compare S-box 1, 2, 5, 6, and 8 outputs.)

      If no match, iterate LoopC.

      /* We get here, on average 1 time in 2^32 keys. */

      /* This is a "half-match".  It must be logged and reported to
         the keyserver.  This is how the keyserver can inexpensively
         check that clients are actually doing their work!!!  Very
         important!!!!   What is reported to the server is the
         loop counter (0 to 2^31) at the time the half-match was
         detected.  The loop counter simply increments sequentially
         (not gray-code) with each iteration of LoopC. */

      Apply Round 4.

      Compare 32-bit output of Round 4 with 32-bit result stored result
        of Round 2.

      If a match, we probably have the winning key -- a 64-bit match!

      Iterate-LoopC:  Process keys in same order as table generated by
        "precompute-Round-1".  [4 key bits are toggled to every possible
        value.  Complements of tested on alternate passes through the
        loop.]

      End-LoopC

    Iterate-LoopB:  Toggle key bits 12, 7, 43, and 52 in gray-code
      order (bit 52 toggling most frequently).

    End-LoopB

  Print "." every 2^21 pairs of keys tested.

  Print CR/LF every 2^27 pairs of keys tested.

  Iterate-LoopA:  Toggle remaining key bits in gray-code order
    (bit 63 toggling most frequently) until entire keyspace is
    tested.

  End-LoopA

=========================================================================
Precompute Round 16 and part of Round 15:

Precompute and store 2 sets of values.  Each value is used 16 times
as we process the keys in this test.

1.  Perform Round 16 on ciphertext or complement of ciphertext.

2.  Save outputs.  [32 bits of the output will require a 2-element
    array.]

3.  Perform S-boxes 1, 4, 5, 6, and 8 of Round 15.

4.  Save outputs.  [20 bits of the output will require a 2-element
    array.]

5.  Holding all key bits constant, we perform these operations once
    using the primary ciphertext.  And then with the complementary
    ciphertext.

=========================================================================
Precompute Round 1 and part of Rounds 2 and 3:

Precompute and store 32 sets of values.  Each value may be used 16
times as we process the 512 keys in this set.

1.  Perform Round 1 on plaintext or complement of plaintext.

2.  Save outputs.  [32 bits of the output will require a 32-element
    array.  32 bits of the output will require a 2-element array.
    (bitslice code may not need the 2-element arrays.  They are simply
    permuted copies of 32 bits of plaintext and their complement.]

3.  Perform S-boxes 2, 3, 4, 5, 6, and 8 of Round 2.

4.  Save outputs of these 6 S-boxes.  [24 bits of the output will each
    require a 32-element array.]

5.  Perform S-box 7 of Round 3.

6.  Save outputs of this S-box.  [4 bits of the output will each require
    a 32-element array.]

7.  Holding all other key bits constant, this table shows which bits are
    toggled (bits 60, 51, 54, 58 count in gray-code order; the tables of
    precomputed values alternately store the results of using "primary"
    then "complementary" plaintext.

    60   51   54   58    C      <-- Key bit (1-64)  and  "complementary" flag
   ------------------------
     0    0    0    0    0
     0    0    0    0    1
     0    0    0    1    0
     0    0    0    1    1
     0    0    1    1    0
     0    0    1    1    1
     0    0    1    0    0
     0    0    1    0    1
     0    1    1    0    0
     0    1    1    0    1
     0    1    1    1    0
     0    1    1    1    1
     0    1    0    1    0
     0    1    0    1    1
     0    1    0    0    0
     0    1    0    0    1
     1    1    0    0    0
     1    1    0    0    1
     1    1    0    1    0
     1    1    0    1    1
     1    1    1    1    0
     1    1    1    1    1
     1    1    1    0    0
     1    1    1    0    1
     1    0    1    0    0
     1    0    1    0    1
     1    0    1    1    0
     1    0    1    1    1
     1    0    0    1    0
     1    0    0    1    1
     1    0    0    0    0
     1    0    0    0    1

========================================================================