The Chaocipher Clearing House

Chaocipher hypothesis re "no hits < 7" is proven

Moshe Rubin (mosher@mountainvistasoft.com)

The first "Chaocipher Progress Report", published on this web site back in February 2009, highlighted an intriguing characteristic in Exhibit #1 for in John F. Byrne's chapter 21 in "Silent Years".  Greg Mellen, the author of the excellent article in Cryptologia (Volume 3 Number 3, July 1979, pp. 136-154) entitled “J. F. Byrne and the Chaocipher: Work in Progress”, noted that identical back-to-back plaintext letters were never enciphered into identical ciphertext letters:

        A LL G OO D QQ UICKBROWNFOXES...

Line 1: C LY T ZP N ZK LDDQGFBOOTYSNE...
Line 2: L TV F IC O TS SLWYYIHBICFUTH...
Line 3: T BZ X TM V GL TJXCSQXLNJTENC...
Line 4: H KY G QJ T OG YSDBNVDJOWHKEC...
Line 5: R IF F ZA Q NH SOMJPORWTJOIJI...
Line 6: J EO Z IF K JC FMETESYYHZUVLF...
Line 7: W EF R FY H KX OPKXRQSZKLCZKH...
Line 8: B UZ L AG D BC UAMFQLACRWWTUG...
Line 9: R UT K FK A SG VLVYYFVRAIYNIV...
Line10: U KQ A SX K GS PWHRYMTQSOQBAM...

As can be seen in this truncated table, the doubled letters (e.g., "LL", "OO", "QQ") are never enciphered into two identical ciphertext letters (the table here shows only the first ten lines, but it is true throughout the entire Exhibit #1).

Let's call two identical (pt,ct) pairs as a "hit".  We will  word the above observed phenomenon as "hits do not occur as a distance of one (1)".

Interestingly enough, searching through Exhibit #1, we find that "hits do not occur at a distance of two (2)".

Continuing this experimental research, we find that, in Exhibit #1, the closest distance between two hits is nine (9).  Here the four (4) instances in Exhibit #1 of hits at a distance fo nine (9):

Offset  Plain / Cipher
4207 [UMP] OVERLAZYDO [GTO]
[QFK] EHGUTNFVTE [UOQ]
7800 [EWO] ULDRELINQU [ISH]
[BVD] BFBWYZLOYB [SST]
7864 [NES] TIMABLETOT [HEM]
[IGY] ARLTYSNDFA [LTT]
9838 [THE] SECOLONIES [VFO]
[VZN] XYPJKPVESX [NUD]

This discovery fueled almost all of the Chaocipher research up until the cipher system was revealed.  This phenomenon had to be causal, and must have sprung from Chaocipher's internal algorithm.  A working assumption was that any Chaocipher model put forward needed to replicated this observation that "hits do not occur at a distance less than nine (9)".

Chaocipher Progress Report #26 announced the surprising fact that the lower bound of hits occurring was not nine (9), but rather seven (7).  In other words, the phenomenon could be described as:

"Hits never occur at a distance less than 7"

Additional research showed that, when exhaustively iterating through a subset of possible encipherings, seven (7) was indeed the theoretical minimum distance.  In other words, hits could never occur at a distance of 6, 5, 4, 3, 2, or 1.

A challenge for proving the assertion is issued

In February 2019 I uploaded a post for the readers of the Tapatalk Crypto Forum.  Entitled "Looking for mathematical reason why classic Chaocipher never shows pt/ct "hits" less than a distance of seven (7)", I challenged the readers to prove methematically that this result was indeed correct.  At the time it was not clear why this was seemingly inherent in the Classic Chaocipher algorithm, and how to go about proving the premise.  No one replied to the post, so the challenge remained unanswered.

I recently felt the urge to "bite the bullet" and prove this important assertion using mathematically or algorithmic proofs.  In short order, I was able to prove that, indeed, hits can never occur in less that a distance of seven (7).  The proof is interesting in its own right, and may provide a valuable method for proving other cryptanalytic problems.

Using the "roaming zenith" method for displaying successive pt/ct alphabets

As explained in the page "Two ways to permute and display Chaocipher alphabets" found in this progress report, there are two cryptographically equivalent ways to present successive Chaocipher left and right alphabets.  These are known as the "fixed zenith" and "roaming zenith" methods.  The "roaming zenith" method was developed specifically to analyze the "no hits < 7" problem.  It was believed that this presentation method would enable one to better understand and feel the gradual diffusion of the alphabets, and might provide the necessary insight to solving the problem.  And it did indeed.

Therefore, in the sections below, we will be using the "roaming zenith" method to present the proof.

Understanding how Chaocipher alphabets change in a single encipherment

Let's begin with a simple example to understand what we're up against.  We will assume that both the left (ciphertext) and the right (plaintext) alphabets are straight alphabets (i.e., A to Z):

Left  (ct): ABCDEFGHIJKLMNOPQRSTUVWXYZ
Right (pt): ABCDEFGHIJKLMNOPQRSTUVWXYZ

In our example, we want to study the pt/ct pair (A/A), observing how they move relative to each other based on encipherments.  Throughout our proof, we will be scrutizing the relative positions, or distance between, both the two letters, Act and Apt.

We define the distance between Act and Apt as the difference between the offsets of Act and Apt, modulo 26.  In the current state of the two alphabets, the distance is offset (Act) - offset (Apt) = 0 - 0 = 0.  We will refer to this alphabet juxtaposition as a "distance of zero (0)".

We now encipher a plaintext letter.  For this example, we choose to encipher the plaintext letter found at an offset of 0 to the right of Apt - in other words, we will encipher Apt itself.  Using the "roaming zenith" method of displaying the alphabets, we obtain the following alphabets after enciphering Apt:

Left  (ct): ACDEFGHIJKLMNBOPQRSTUVWXYZ
Right (pt): BCEFGHIJKLMNODPQRSTUVWXYZA

We see that we now have an alphabet juxtaposition of a "distance of 1".  This is because offset (Act) - offset (Apt) = 0 - 25 (mod 26) = -25 (mod 26) = 1.

Let's recap: enciphering the plaintext letter at offset 0 to Apt, using alphabets juxtaposed at a distance of 0, resuts in an alphabets juxtaposed at a distance of 1.

Now we'll go back to alphabets juxtaposed at a distance of 0, but will encipher the plaintext letter at an offset of 1 to the right of Apt, i.e., Bpt.  We now get the following alphabets:

Left  (ct): ABDEFGHIJKLMNOCPQRSTUVWXYZ
Right (pt): BCDFGHIJKLMNOPEQRSTUVWXYZA

Although the alphabets differ ever so slightly from the previous encipherment, the alphabet juxtaposition distance remains 1.

Enciphering Mpt using the alphabets juxtaposed at distance 0, we get alphabets at a distance of offset (Act) - offset (Apt) = 0 - 24 (mod 26) = -24 (mod 26) = 2::

Left  (ct): ABCDEFGHIJKLMOPQRSTUVWXYZN
Right (pt): BCDEFGHIJKLMNOQRSTUVWXYZAP

Enciphering Xpt results in an even larger alphabet juxtaposition distance of 25 - 10 = 15:

Left  (ct): BCDEFGHIJKYLMNOPQRSTUVWXZA
Right (pt): CDEFGHIJKLAMNOPQRSTUVWXYZB

If we enciphered each one of the 26 possible plaintext letters as we did above, we would obtain all 26 alphabet juxtaposition transitions possible beginning with alphabets at a distance of 0:

      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 0:   1  1  1  1  1  1  1  1  1  1  1  1  2  1  1  1  1  1  1  1  1  1  1 15  0 13

The "0:" at the left denotes that the alphabet juxtoposition distance is 0, while the numebrs 0-25 on top refer to the plaintext offset relative to Apt.  You can see that enciphering Apt (a distance of 0) transitions to a alphabet juxtaposition distance of 1, Mpt (a distance of 12) transitions to 2, Xpt (distance of 23) to 15, and Zpt (distance of 25) to 13.  All other plaintext letters transition to 1.

Summarizing all possible alphabet changes in a single table

In the previous section we concentrated on enciphering all possible plaintext letters starting with alphabets juxtaposed at a distance of 0.  We can do the same for the other 25 alphabets juxtaposed at distances of 1 through 25.  The following table summarizes the results of all 26 alphabet juxtapositions, for all 26 possible plaintext letters:

      0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
    ------------------------------------------------------------------------------
 0:   1  1  1  1  1  1  1  1  1  1  1  1  2  1  1  1  1  1  1  1  1  1  1 15  0 13
 1:  14  2  2  2  2  2  2  2  2  2  2  2  3  3  2  2  2  2  2  2  2  2  2 16  1  1
 2:   2 15  3  3  3  3  3  3  3  3  3  3  4  4  4  3  3  3  3  3  3  3  3 17  2  2
 3:   3  3 16  4  4  4  4  4  4  4  4  4  5  5  5  5  4  4  4  4  4  4  4 18  3  3
 4:   4  4  4 17  5  5  5  5  5  5  5  5  6  6  6  6  6  5  5  5  5  5  5 19  4  4
 5:   5  5  5  5 18  6  6  6  6  6  6  6  7  7  7  7  7  7  6  6  6  6  6 20  5  5
 6:   6  6  6  6  6 19  7  7  7  7  7  7  8  8  8  8  8  8  8  7  7  7  7 21  6  6
 7:   7  7  7  7  7  7 20  8  8  8  8  8  9  9  9  9  9  9  9  9  8  8  8 22  7  7
 8:   8  8  8  8  8  8  8 21  9  9  9  9 10 10 10 10 10 10 10 10 10  9  9 23  8  8
 9:   9  9  9  9  9  9  9  9 22 10 10 10 11 11 11 11 11 11 11 11 11 11 10 24  9  9
10:  10 10 10 10 10 10 10 10 10 23 11 11 12 12 12 12 12 12 12 12 12 12 12 25 10 10
11:  11 11 11 11 11 11 11 11 11 11 24 12 13 13 13 13 13 13 13 13 13 13 13  1 11 11
12:  12 12 12 12 12 12 12 12 12 12 12 25 14 14 14 14 14 14 14 14 14 14 14  2 13 12
13:  13 13 13 13 13 13 13 13 13 13 13 13  1 15 15 15 15 15 15 15 15 15 15  3 14 14
14:  15 14 14 14 14 14 14 14 14 14 14 14 15  2 16 16 16 16 16 16 16 16 16  4 15 15
15:  16 16 15 15 15 15 15 15 15 15 15 15 16 16  3 17 17 17 17 17 17 17 17  5 16 16
16:  17 17 17 16 16 16 16 16 16 16 16 16 17 17 17  4 18 18 18 18 18 18 18  6 17 17
17:  18 18 18 18 17 17 17 17 17 17 17 17 18 18 18 18  5 19 19 19 19 19 19  7 18 18
18:  19 19 19 19 19 18 18 18 18 18 18 18 19 19 19 19 19  6 20 20 20 20 20  8 19 19
19:  20 20 20 20 20 20 19 19 19 19 19 19 20 20 20 20 20 20  7 21 21 21 21  9 20 20
20:  21 21 21 21 21 21 21 20 20 20 20 20 21 21 21 21 21 21 21  8 22 22 22 10 21 21
21:  22 22 22 22 22 22 22 22 21 21 21 21 22 22 22 22 22 22 22 22  9 23 23 11 22 22
22:  23 23 23 23 23 23 23 23 23 22 22 22 23 23 23 23 23 23 23 23 23 10 24 12 23 23
23:  24 24 24 24 24 24 24 24 24 24 23 23 24 24 24 24 24 24 24 24 24 24 11 13 24 24
24:  25 25 25 25 25 25 25 25 25 25 25 24 25 25 25 25 25 25 25 25 25 25 25  0 25 25
25:   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 14 12  0

This table summarizes all alphabet transitions possible for Classic Chaocipher.  In the next section we will see how to convert this table to a directed graph.


Creating a directed graph from the alphabet change table

Converting the previous table data into a form accepted by Graphviz, we paste the DOT data file into an online Graphviz renderer, such as Edotor to create and export a directed graph:

Alphabet transition directed graph

Now we have a visual directed graph that illustrates transitioning between alphabet juxtaposition distances.  Assuming we begin with our pt/ct letters at a distance of 0, we need to find the shortest path from node 0 back to itself.  Here are a few observations about the directed graph:
You can play with the graph, trying to find the shortest path.  For example, the path:

0 → 1 → 3 → 4 → 19 → 21 → 22 → 24 → 25 → 0

produces a "hit" in nine (9) steps.  Our goal is to find the shortest path from node 0 back to itself.

Proving the "no hits < 7" assertion using Dijkstra's Shortest Path First (SPF) algorithm

The final step is find an algorithm that will find the shortest path between two nodes within a directed graph.  Fortunately, such an algorithm exists, and is known as Dijkstra's Shortest Path First (SPF) algorithm.  We use a Python implementation of the algorithm, with modifications made for our purposes, and keep the following in mind:
  1. We cannot tell the algorithm to find the shortest path from node 0 to node 0, as it will return a path size of 0, i.e., no transition is needed to travel from 0 to 0.  Therefore, we wil move to another node and then run the algorithm.
  2. We are only interested in paths that begin with the transition of "0 → 1"
  3. The algorithm will be directed to compute the shortest path from node 1 back to node 0.
  4. Adding one (1) to the shortest path found in step 3 will give us the shortest hit distance possible.
Here are the results of running the SPF algorithm:

1 --> 0                 6               1 3 5 7 9 24 0
1 --> 1                 0               1
1 --> 2                 1               1 2
1 --> 3                 1               1 3
1 --> 4                 2               1 2 4
1 --> 5                 2               1 3 5
1 --> 6                 2               1 16 6
1 --> 7                 3               1 3 5 7
1 --> 8                 3               1 16 6 8
1 --> 9                 4               1 3 5 7 9
1 --> 10                4               1 16 6 8 10
1 --> 11                4               1 16 6 21 11
1 --> 12                5               1 16 6 8 10 12
1 --> 13                5               1 16 6 21 11 13
1 --> 14                1               1 14
1 --> 15                2               1 2 15
1 --> 16                1               1 16
1 --> 17                2               1 2 17
1 --> 18                2               1 3 18
1 --> 19                3               1 2 4 19
1 --> 20                3               1 3 5 20
1 --> 21                3               1 16 6 21
1 --> 22                4               1 3 5 7 22
1 --> 23                4               1 16 6 8 23
1 --> 24                5               1 3 5 7 9 24
1 --> 25                5               1 16 6 8 10 25

The algorithm computed that the shortest path from node 1 to node 0 has a length of 6 (it shows one of many possible paths, in this case 1-3-5-7-9-24-0).  Adding one more because of the required initial transition from node 0 to node 1 gives us the desired result: the shortest distance to produce a "hit" in Classic Chaocipher is seven (7) steps.

Q.E.D.

Return to Progress Report #27


Copyright (c) 2020 Moshe Rubin
Created: 1 August 2020