Number One From Moscow
CIA HISTORICAL REVIEW PROGRAM
22 SEPT 93
At the trial in 1957 of Colonel Rudolf Abel for espionage on behalf of the Soviet Union, one of the exhibits in evidence was a bit of microfilm carrying ten columns, 21 rows, of five-figure groups. This cipher message, found inside a hollow nickel in 1953 and turned over to the FBI, had proved impregnable to solution until its key was made available four years later by the defection of Reino Hayhanen, Abel's erstwhile assistant, to whom the message had been addressed.2 The inability of government cryptanalysts to read it was no reflection upon their competence, for the cryptographic system used in the message was the finest and most advanced mnemonic cipher ever made public. Although not theoretically insoluble, it is effectually unbreakable without prior knowledge of the system and on the basis of a single message.
At the trial, although prosecuting attorney Kevin Maroney did a masterful job of leading Hayhanen, as state's witness, through the intricacies of the system, the cipher was so complicated that its description bored the jurors and the process could not be followed even by a cryptographer without the written program furnished the jury. A better look at it at leisure will be rewarding to anyone with an interest in cryptography.
If the cipher were to be given a technical name, it would be known as a "straddling bipartite monoalphabetic substitution superenciphered by modified double transposition." Four mnemonic keys--the Russian word for "snowfall," a snatch of popular song, the date of the Soviet V-J Day, and the agent's personal number--were used to derive the arrangement of the alphabet for the substitution and the order for the two transpositions. The system can most easily be illustrated by following through the encipherment of the exhibited message, Moscow's first to Hayhanen after his arrival in New York, much as some Soviet cipher clerk did it in a well-guarded office of the KGB on a wintry third of December, 1952. Translated, the message read:
We congratulate you on [your] safe arrival. We confirm the receipt of your letter to the address "V" and the reading of [your] letter No. 1.
For organization of cover we have given instructions to transmit to you three thousand in local [currency]. Consult with us prior to investing it in any kind of business, advising the character of the business.
According to your request, we will transmit the formula for the preparation of soft film and the news separately, together with [your] mother's letter.
[It is too] early to send you the gammas.3 Encipher short letters, but do the longer ones with insertions. All the data about yourself, place of work, address, etc., must not be transmitted in one cipher message. Transmit insertions separately.
The package was delivered to [your] wife personally. Everything is all right with [your] family. We wish [you] success. Greetings from the comrades.
No. 1, 3 December.
The Russian text was as follows:
The first major step in the encipherment of this text is substitution of one- and two-digit numbers for the Russian plaintext letters. For this purpose a table or "checkerboard" of 40 cells--ten across and four down--is set up as illustrated below.
The first seven letters, СНЕГОПА, of the Russian word for "snowfall" are inscribed in the first row, leaving the last three cells blank. The remaining 23 letters of the modern Russian alphabet, omitting diacritical marks, are inscribed in sequence vertically in the other three rows, skipping the third and fifth columns, which, with the last cell remaining in the last column, are then filled by seven symbols. These are a period, a comma, the symbol П/Л, whose meaning is undetermined, the abbreviation №, the letter-number switch sign Н/Ц, the "message starts" sign H/T, and the abbreviation ПВТ for "repeat." Along the top of the checkerboard are written the ten digits in a mixed sequence determined by a process to be described later. The last three digits in the sequence, which stand over the blank cells at the end of the first row, are repeated at the left of the second, third and fourth rows. These digits are known as coordinates.
Each plaintext letter in the first row of the checkerboard is enciphered by substituting the single coordinate above it. Each letter and symbol in the other rows is enciphered by substituting the coordinate at the left of its row followed by the coordinate at the top of its column. Numbers are enciphered by placing them within a pair of the letter-number switch signs and repeating them three times.
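The substitution just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coordinate order `5073894612` is a hypothetical stand-in for the sequence derived later, the 30-letter alphabet is assumed to drop Ё, Й and Ъ, and the handling of numbers is left out.

```python
KEYWORD = "СНЕГОПА"  # first seven letters of the word for "snowfall"
# Assumption: the 30-letter alphabet omits Ё, Й and Ъ.
ALPHABET = "АБВГДЕЖЗИКЛМНОПРСТУФХЦЧШЩЫЬЭЮЯ"
SYMBOLS = [".", ",", "П/Л", "№", "Н/Ц", "Н/Т", "ПВТ"]


def build_checkerboard(coords):
    """Map each letter or symbol to its one- or two-digit equivalent.

    coords is a string of the ten digits in their mixed order.
    """
    # First row: one digit per keyword letter.
    table = {ch: coords[i] for i, ch in enumerate(KEYWORD)}
    row_digits = coords[7:]  # the three digits over the blank cells
    letters = iter(ch for ch in ALPHABET if ch not in KEYWORD)
    symbols = iter(SYMBOLS)
    # Fill the lower three rows column by column, top to bottom.
    for col in range(10):
        for row in range(3):
            reserved = col in (2, 4) or (col == 9 and row == 2)
            item = next(symbols) if reserved else next(letters)
            table[item] = row_digits[row] + coords[col]
    return table


def encode(tokens, table):
    """Replace each token (a letter or a symbol) by its coordinates."""
    return "".join(table[t] for t in tokens)


board = build_checkerboard("5073894612")  # hypothetical digit order
print(encode("СНЕГ", board))  # first-row letters: one digit each
print(encode("ДОМ", board))   # mixes double and single coordinates
```

Because the three row digits never occur as single coordinates, a digit stream built this way can be divided unambiguously by anyone who holds the checkerboard.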
Before these substitutions are made, however, the plaintext is bisected--chopped at random into two parts--and the true start of the message is tacked onto the true end. This true start is indicated by the "message starts" sign H/T. In this encipherment, as illustrated below, the sign stands seventh in the fourth line from the bottom of page A19.
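A minimal sketch of the bisection, assuming the plaintext is held as a list of tokens and that the cut point is chosen uniformly at random (the source says only that the text is chopped "at random"). The function names are illustrative.

```python
import random


def bisect_message(tokens, start_sign="Н/Т"):
    """Cut the text at a random point and put the true start after the true end."""
    cut = random.randrange(1, len(tokens))  # needs at least two tokens
    return tokens[cut:] + [start_sign] + tokens[:cut]


def restore_message(tokens, start_sign="Н/Т"):
    """Undo the bisection by rotating the text back at the start sign."""
    i = tokens.index(start_sign)
    return tokens[i + 1:] + tokens[:i]


plain = list("ABCDEFGH")
shuffled = bisect_message(plain)
print(shuffled)
print(restore_message(shuffled) == plain)
```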
The sequence of coordinates resulting from the substitution--which by itself affords virtually no security--is then thoroughly jumbled by passing it through two transposition tableaux. The first tableau (Fig. 1) is a standard columnar transposition. The substituted coordinates are written in horizontally under a set of keynumbers (the second of the two rows heading Figure 1) whose derivation will be given presently. They are taken out vertically, the column under keynumber 1 first and the others following in key order.
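The first tableau can be sketched as an ordinary columnar transposition. The key `[2, 3, 1]` and the ten-digit input below are toy values chosen to keep the trace short; they are not taken from the message.

```python
def columnar_transpose(digits, keynumbers):
    """Write digits in rows under the keynumbers, read columns in key order."""
    width = len(keynumbers)
    rows = [digits[i:i + width] for i in range(0, len(digits), width)]
    out = []
    for n in range(1, width + 1):
        col = keynumbers.index(n)  # column standing under keynumber n
        # The last row may be short, so skip cells that do not exist.
        out.extend(row[col] for row in rows if col < len(row))
    return "".join(out)


print(columnar_transpose("1234567890", [2, 3, 1]))  # prints 3691470258
```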
This new sequence of digits is then inscribed into the second tableau (Fig. 2) which, however, has a complication. This consists of a series of step-like disruption (D) areas determined as follows. The first D area begins in the top row under keynumber 1 and runs to the right side of that row. In each of the following rows, it begins one column to the right. When the columns are exhausted, one row is skipped and another D area is started in the following row with the column under keynumber 2, and so forth for as many rows as are needed to accommodate all the cipher digits.
The cipher digits taken vertically from the first tableau are inscribed horizontally from left to right into the rows of the second tableau, but leaving the D areas blank. When the non-D portions of all rows have been filled, the remaining digits are written in from left to right in the D areas, starting with the top row. From the completed tableau the digits are then taken out vertically in the order indicated by the column keynumbers without any regard to D areas. This final sequence of digits, in the standard groups of five, comprises the cipher text. A keygroup is inserted at a predetermined point before the message is sent. The result is shown in Figure 3.
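The disrupted tableau is easier to follow in code than in words. The sketch below is one reading of the description, and it makes two assumptions: a D area ends in the row where it has shrunk to the last column alone, and the message is short enough that the keynumbers are not exhausted. The three-column key is a toy value.

```python
def disruption_starts(keynumbers, n_rows):
    """For each row, the column where its D area begins (None if no D area)."""
    width = len(keynumbers)
    starts = []
    next_key = 1
    while len(starts) < n_rows:
        col = keynumbers.index(next_key)
        while col < width and len(starts) < n_rows:
            starts.append(col)  # D area runs from col to the right edge
            col += 1            # and begins one column further right each row
        if len(starts) < n_rows:
            starts.append(None)  # one row is skipped between D areas
        next_key += 1
    return starts


def disrupted_transpose(digits, keynumbers):
    width = len(keynumbers)
    n_rows = -(-len(digits) // width)  # ceiling division
    starts = disruption_starts(keynumbers, n_rows)
    # Cells that exist, in row order; the last row may be short.
    cells = [(r, c) for r in range(n_rows) for c in range(width)][:len(digits)]

    def in_d_area(cell):
        r, c = cell
        return starts[r] is not None and c >= starts[r]

    # Non-D cells are filled first, then the D cells, both left to right.
    fill_order = [cell for cell in cells if not in_d_area(cell)]
    fill_order += [cell for cell in cells if in_d_area(cell)]
    grid = dict(zip(fill_order, digits))
    out = []
    for n in range(1, width + 1):
        col = keynumbers.index(n)
        out.extend(grid[(r, col)] for r in range(n_rows) if (r, col) in grid)
    return "".join(out)


print(disrupted_transpose("123456789", [2, 1, 3]))  # prints 735124896
```

In this toy case the tableau reads 1 7 8 / 2 3 9 / 4 5 6, so the digits 7, 8 and 9, which arrived last, sit in the D areas of the first two rows.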
We have seen that one of the four mnemonic keys, the word for "snowfall," develops the alphabetic arrangement in the checkerboard. The other three--a phrase from a popular song, the V-J date, and Hayhanen's personal number, 13--interact to generate a series of virtually random numbers that in turn yield the keynumbers across the top of the checkerboard and the two transposition tableaux.
In the derivation of these keys two devices are used repeatedly--chain addition and conversion to sequential numbers. Chain addition produces a series of numbers of any length from a few priming digits: the first two digits of the priming series are added together modulo 10 (without tens digits) and the result placed at the end of the series; then the second and third digits are added and the sum placed at the end; and so forth, using also the newly generated digits when the priming series is exhausted, until the desired length is obtained. To illustrate: with the priming series 3 9 6 4, 3 and 9 are added to get 2 (the 1 of the 12 being dropped), 9 and 6 yield 5, 6 and 4 add to 0. The series so far is 3 9 6 4 2 5 0; extended, it would run 3 9 6 4 2 5 0 6 7 5 6 3 2 1 . . . .
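Chain addition as described here is short enough to state as a Python sketch; the function name is illustrative, and the example reproduces the series given in the text.

```python
def chain_add(priming, length):
    """Extend a priming series of digits to the given length by chain addition."""
    series = list(priming)  # needs at least two priming digits
    i = 0
    while len(series) < length:
        # Add each adjacent pair modulo 10 and append the result.
        series.append((series[i] + series[i + 1]) % 10)
        i += 1
    return series[:length]


print(chain_add([3, 9, 6, 4], 14))
# prints [3, 9, 6, 4, 2, 5, 0, 6, 7, 5, 6, 3, 2, 1]
```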
Conversion to sequential numbers, or the generation of a sequential key, is an adaptation from the standard practice of deriving a numerical key from a literal one by assigning consecutive numbers to the letters of the key in their alphabetical order, numbering identical letters from left to right. The literal key BABY, for example, would generate the sequential numerical key 2 1 3 4. In the Hayhanen system a series of digits is used as the breeder key, and consecutive numbers are assigned to them in their numerical order (0 is last), numbering identical digits from left to right. For example, if the breeder key is 3 9 6 4 6, the sequential key would be 1 5 3 2 4.
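Both forms of the conversion can be sketched with one Python function. The name `sequential_key` and the `zero_last` switch are illustrative; the two calls reproduce the examples in the text.

```python
def sequential_key(breeder, zero_last=True):
    """Number the items of a key in sorted order, ties from left to right."""
    def rank(item):
        # For digit keys, 0 counts as the highest value.
        if zero_last and item == 0:
            return 10
        return item

    order = sorted(range(len(breeder)), key=lambda i: (rank(breeder[i]), i))
    key = [0] * len(breeder)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


print(sequential_key(list("BABY")))     # prints [2, 1, 3, 4]
print(sequential_key([3, 9, 6, 4, 6]))  # prints [1, 5, 3, 2, 4]
```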
The derivation of the checkerboard and transposition keys for this message begins with the date--September 3, 1945--that Russia achieved victory over Japan in World War II. It is written numerically in the Continental style: 3/9/1945. Its last digit, 5, indicates the position from the end of the message of an inserted arbitrary keygroup, presumably a different one for each message. In this message it is 2 0 8 1 8. The first five digits of the date, in Line B following, are subtracted from this keygroup (Line A) by modular arithmetic (without borrowing the tens digit). The result is Line C.
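The subtraction that yields Line C can be checked with a short sketch using the keygroup and date digits given above; the function name is illustrative.

```python
def subtract_mod10(minuend, subtrahend):
    """Subtract digit by digit, without borrowing."""
    return [(a - b) % 10 for a, b in zip(minuend, subtrahend)]


line_a = [2, 0, 8, 1, 8]  # the message keygroup
line_b = [3, 9, 1, 9, 4]  # first five digits of 3/9/1945
line_c = subtract_mod10(line_a, line_b)
print(line_c)  # prints [9, 1, 7, 2, 4]
```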
Then the first 20 letters of a line from the Russian popular song "The Lone Accordion" 6 are divided, in Line D, into two sections of ten letters, and sequential keys are derived for each part in Line E. Under the key for the first part is written, in Line F, the subtraction result of Line C, chain-added out to ten digits. Under the key for the second part is written a standard numerical sequence, 1, 2, 3, . . . 0. The first parts of Lines E and F are added modulo 10 to yield Line G.
Then each digit of Line G is located in the standard sequence of Line F and replaced by the number in Line E directly over it. The result of this substitution is Line H, which becomes the priming series for a chain addition that begins in Line K and proceeds--in rows of ten digits each--through Lines L, M, N, and P.
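Lines F through P can be traced with the sketch below. Line C is taken as 9 1 7 2 4, which is what subtracting 3 9 1 9 4 from 2 0 8 1 8 digit by digit gives. The two halves of Line E are hypothetical placeholders, since they depend on the song line, and the sketch assumes that a keynumber of 10 is written as 0.

```python
def chain_add(priming, length):
    """Extend a priming series of digits by adding adjacent pairs modulo 10."""
    series = list(priming)
    i = 0
    while len(series) < length:
        series.append((series[i] + series[i + 1]) % 10)
        i += 1
    return series[:length]


line_c = [9, 1, 7, 2, 4]
# Hypothetical sequential keys for the two ten-letter halves (0 stands for 10).
line_e1 = [6, 2, 0, 1, 8, 5, 3, 9, 7, 4]
line_e2 = [5, 9, 1, 0, 7, 3, 8, 2, 6, 4]

line_f1 = chain_add(line_c, 10)            # Line C chain-added to ten digits
line_f2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]   # the standard sequence
line_g = [(e + f) % 10 for e, f in zip(line_e1, line_f1)]
# Find each digit of G in the standard sequence, take the E digit above it.
line_h = [line_e2[line_f2.index(g)] for g in line_g]

# Line H primes a chain addition laid out in five rows of ten digits.
extended = chain_add(line_h, 60)
block = [extended[r * 10:(r + 1) * 10] for r in range(1, 6)]

print("F:", line_f1, line_f2)
print("G:", line_g)
print("H:", line_h)
for name, row in zip("KLMNP", block):
    print(name + ":", row)
```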
The widths of the two transposition tableaux are found by adding respectively the eighth and ninth numbers--or perhaps the last two dissimilar numbers--in Line P to the agent's personal number, in this case 13. The first tableau will therefore have 17 columns and the second 14.
The sequential key derived in Line J from Line H indicates the column sequence for a vertical transcription from the block formed by Lines K through P. The digits that result from this transcription, in Lines Q and R, become the breeder keys for the two transposition tableaux. They are repeated at the top of Figures 1 and 2 respectively, followed by the sequential keynumbers derived from them.
Finally, a sequential key is derived in Line S from Line P. This becomes the sequence of digits used as the coordinates for the checkerboard.
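The last stage of the derivation, from Line H to the tableau widths, the two breeder keys and the checkerboard coordinates, might be sketched as follows. The Line H used here is hypothetical. The sketch takes the eighth and ninth digits of Line P for the widths, which is only one of the two readings the text allows, and it assumes that the keynumber 10 in Line S is written as the coordinate 0.

```python
def chain_add(priming, length):
    """Extend a priming series of digits by adding adjacent pairs modulo 10."""
    series = list(priming)
    i = 0
    while len(series) < length:
        series.append((series[i] + series[i + 1]) % 10)
        i += 1
    return series[:length]


def sequential_key(breeder):
    """Number digits in ascending order, 0 last, ties from left to right."""
    order = sorted(range(len(breeder)), key=lambda i: ((breeder[i] - 1) % 10, i))
    key = [0] * len(breeder)
    for number, position in enumerate(order, start=1):
        key[position] = number
    return key


def derive_keys(line_h, personal_number):
    extended = chain_add(line_h, 60)
    block = [extended[r * 10:(r + 1) * 10] for r in range(1, 6)]  # Lines K to P
    line_p = block[-1]
    width1 = personal_number + line_p[7]  # eighth digit of Line P
    width2 = personal_number + line_p[8]  # ninth digit of Line P
    line_j = sequential_key(line_h)
    # Read the block column by column in the order given by Line J.
    stream = []
    for n in range(1, 11):
        col = line_j.index(n)
        stream.extend(row[col] for row in block)
    line_q = stream[:width1]                 # breeder key, first tableau
    line_r = stream[width1:width1 + width2]  # breeder key, second tableau
    line_s = [n % 10 for n in sequential_key(line_p)]  # checkerboard coordinates
    return width1, width2, line_q, line_r, line_s


hypothetical_h = [7, 1, 8, 1, 9, 7, 5, 2, 1, 2]
width1, width2, line_q, line_r, line_s = derive_keys(hypothetical_h, 13)
print(width1, width2)
print(sequential_key(line_q))  # keynumbers heading the first tableau
print(sequential_key(line_r))  # keynumbers heading the second tableau
print(line_s)
```

With a personal number of 13 the two widths can total at most 44, so the 50 digits of the five-row block always suffice for both breeder keys.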
In 1956 Hayhanen's personal number was changed from 13 to 20, so that the width of the transposition tableaux was increased and their reconstruction thereby made slightly more difficult. In addition, the chain-added block was deepened by one row to increase the randomness of the digits that become the breeder keys for the transposition tableaux.
What can be said of the cryptographic merits of this cipher? That it is eminently secure was demonstrated by the FBI's inability to solve the nickel message. The system derives its great strength from complications introduced into a combination of two basically simple methods, monoalphabetic substitution and columnar transposition.
The complication in the substitution is the straddling device in the checkerboard. Ordinary checkerboards, having no unkeyed rows, produce two-digit equivalents for all plaintext letters. Here the irregular alternation of single and double coordinates makes it hard for a cryptanalyst to divide the running list of numbers into the proper pairs and singletons, a division which is of course prerequisite to the reduction to plaintext. A division entirely into pairs would straddle the correct equivalents (whence the term "straddling" in the cipher's technical description). Furthermore, this irregularity undoubtedly increases the difficulty of reconstructing the transposition tableaux.
The complication in the transposition lies in the disruptions in the inscription of the second tableau. Their purpose is to block any attempt at reconstructing the first tableau. In the solution of ordinary double transposition, once the difficult job of reconstructing the second tableau is completed, the cryptanalyst can immediately proceed to the first with the premise that its columns will be found in the rows of the second. But the D areas forestall this direct attack here by mixing a part of one such column with a part of another. The cryptanalyst must sort out the columns before he can reconstruct the first tableau, and this sorting is a formidable task.
The keying method of this cipher adds to its cryptanalytic resistance. The long series of calculations performed in the key derivation results in a series of virtually random numbers whose lack of pattern makes it difficult for the cryptanalyst to reconstruct the original keys and thus get clues for the solution of subsequent messages. Even more important is the arbitrary five-digit group introduced at the start of the key derivation. It affects the derivation so strongly that keys with different groups would bear no apparent relation to one another. Since this group was apparently different for each message, and since each agent presumably had a different set of mnemonic keys, no two messages of all those sent out from Moscow by this system to secret agents all over the world would ever be keyed the same. Cryptanalysts, whose work becomes harder as they have less traffic in a single key, would have to attack each message separately.
Finally, the bisection of the message makes it harder for the cryptanalyst to find and exploit stereotyped beginnings and endings.
The system also has a number of operational advantages. First, the individual operations are easy and rapid, minimizing the chance of garbled messages. Second, the cipher text runs only about half again as long as the plain, not twice as long, as it often does in high-security pencil-and-paper systems. This reduction from the usual doubling is effected by the use of single coordinates in the checkerboard for high-frequency letters, for which the keyword is specially chosen.
The keyword includes the most frequent letter in Russian (O, with 11 percent), four other high-frequency letters (C, H, E, A) and two low-frequency letters. The seven account for 40 percent of normal Russian text, so that the cipher text should average 60 percent longer than the plain. (The nickel message is 62 percent longer.) The relative reduction means briefer communications, with consequent lowered risk of detection.
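The 60 percent figure follows from weighting the two coordinate lengths by how often each is used, assuming the 40 percent share holds for the text in question:

```latex
\[
0.40 \times 1 \;+\; 0.60 \times 2 \;=\; 1.60 \text{ digits per plaintext character}
\]
```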
Third, the most important and unusual operational advantage of the cipher is the way an entire encipherment can be developed from four easily memorized items. The agent must also know, of course, the procedure for deriving from these the final keys, but this does not appear very hard to remember. Each step seems to lead to the next in much the same way that one portion of a piano piece leads to another. No spy cipher of comparable security that achieves this feat of mnemonics is known. To a spy, who lives in fear of sudden raids and searches, a cipher system that requires no betraying memoranda is a boon. Ironically, however, Hayhanen -- or his superiors -- did not trust his memory: when he arrived in the United States he carried microfilm notes on his cipher in case he forgot what was so easy to remember!
For all the impressive security in this cipher, it is not theoretically impossible to reconstruct the second transposition tableau in correct form for deriving a first tableau whose rows would yield the required monoalphabetic frequency distribution, and when this were done the monoalphabetic substitution could be solved with relative ease. Once the system were known and with a large volume of traffic in it, an electronic computer might be able to run through the billions of trials needed for a solution. But that a single message could have been solved while the system itself remained unknown is highly unlikely. The weakening of frequency characteristics caused by the way it uses the numbers and the obliteration of repetitions by the thorough transpositions leave virtually no clues for the cryptanalyst.
4 The genitive case is apparently an error.
5 Apparently enciphered as a letter by error.
6 The full phrase is the following