The Chaocipher Challenge: Progress Report #3

Moshe Rubin (mosher@mountainvistasoft.com)
Last updated: 2 March 2009

Hypothesis

I would like to present evidence that the Chaocipher algorithm requires a 13- or 26-letter blocking.  This supports Greg Mellen's hypothesis that such a blocking was used.

A case for a 13- or 26-letter blocking factor

In Greg Mellen's seminal article [1] he hypothesized that the Chaocipher device works on blocks of 13 letters at a time.  He based this on the fact that he believed a pt/ct identity never occurred within a 13-letter block of text.  For example, if the Chaocipher text in Exhibit 1 is divided into blocks of 13 letters, starting with the first letter of the exhibit, we would have the following:
A L L G O O D Q Q U I C K
C L Y T Z P N Z K L D D Q
B R O W N F O X E S J U M
G F B O O T Y S N E P U A
P O V E R L A Z Y D O G T
G K I U N K N C R I N R C
O S A V E T H E I R P A R
V K J N H T O A F Q P D P

etc. . . .
The above division assumes a block begins with the first letter of the text ("offset 0").  If we assume the blocking begins with an offset of 1 we would get the follows:

                        A
                        C
L L G O O D Q Q U I C K B
L Y T Z P N Z K L D D Q G
R O W N F O X E S J U M P
F B O O T Y S N E P U A G
O V E R L A Z Y D O G T O
K I U N K N C R I N R C V
S A V E T H E I R P A R T
K J N H T O A F Q P D P N
Continuing in this fashion, there are in total 13 ways to divide the text into blocks of 13 letters.

Similarly, we can divide the text into blocks of 10, 11, 12, 14, 15, etc. letters.

I wrote a Perl script that counted the number of pt/ct identities in each block, for every block size and offset.  The raw results can be seen here.  For example, blocking size 13 shows the following:

Block sizeOffsetNumber Pt/Ct IdentitiesFirst identity offset
13086253
1 11 389
2 13 389
3 11 3689
4 13 3480
5 12 1347
6 14 3205
7 13 994
8 12 994
9 7 5868
10 9 1940
11 6 1445
12 5 8893

The results shows that 13-letter blocking with an offset of 0 resulted in 8 pt/ct identities, while an offset of 1 had 11 pt/ct identities.  The last column, for reference, denotes the offset within the text where the first pt/ct pair of the first identity occurred.

What happens when we graph the block size/offset results?

At this point I graphed each block size N, mapping all N offset results.  For example, mapping block size 12 gives the following graph:

Graphing blocksize 12

The x-axis denotes the offset within a block size of 12 (i.e., 0, 1, 2, ... 11), while the y-axis shows the number of pt/ct identities for that offset.  For this graph we see multiple nodes with possible skewing to the right.

I produced graphs for block sizes 12-14, 22-28, 39, and 52.  You can see all the graphs here.  The results are amazing: block sizes 13 and 26 are clearly normally distributed while the others are clearly not.  Block sizes 39 and 52 (i.e., multiples of 13/26) do not seem to show normal distribution.  Block sizes 13 and 26 are obviously causal while other block sizes introduce random results.

My explanation is:

Conclusion

References

[1] Mellen, Greg.  1979.  J. F. Byrne and the Chaocipher, Work in Progress.  Cryptologia, 3(3): 136-154.


Copyright (c) 2009 Moshe Rubin

Return to the home page