The Solfa Cipher

Between May 19th and 21st, 2017, I participated in the NSEC 17 Capture-the-Flag (CtF) event held annually in Montreal, QC. As usual, the team and I had a blast spending days and nights solving challenges and drinking free beer. Among the challenges was a cryptographic puzzle printed on the first and last pages of the passport of Rao’s Intricate Kingdom, the country at the centre of the event’s storyline. The challenge was divided into two parts: the first was a Braille-encoded message and the second was encrypted using the Solfa cipher, which I had never heard of before. As such, I decided to learn more about it and complete a write-up for the challenge at the same time. I’ll first quickly cover the Braille part of the challenge, then move on to the Solfa part and its decryption process.

The Second Half of the Flag

Upon arriving at the NSEC competition this year, participants received a passport designed to be stamped based on events happening during the CtF. The back of the front cover contains a sequence of dotted symbols which can quickly be recognized as Braille, as shown in the figure below:

As most of you probably know, the Braille writing system was developed for blind and visually impaired individuals to be able to read using touch. Examples of braille can often be found on elevators. The system is based on a matrix of 3×2 dots which can be blank or filled. Each dot is numbered from 1 to 6 as shown below:

A matrix of the 6 dots representing an individual Braille character
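The same dot numbering is used by the Unicode Braille Patterns block, which starts at U+2800 and stores each raised dot as one bit, with dot 1 as the lowest bit. A minimal sketch (the helper name `braille_dots` is made up for this illustration) that recovers the dot numbers of a character:

```python
def braille_dots(cell: str) -> list[int]:
    """Return the raised dot numbers (1-6) of a single Unicode Braille cell."""
    offset = ord(cell) - 0x2800
    if not 0 <= offset < 64:
        raise ValueError(f"not a six-dot Braille cell: {cell!r}")
    # Bit 0 is dot 1, bit 1 is dot 2, ... bit 5 is dot 6.
    return [dot for dot in range(1, 7) if offset & (1 << (dot - 1))]


print(braille_dots("\u2813"))  # the cell for the letter h -> [1, 2, 5]
```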

Each character of a natural alphabet can then be associated with a specific matrix configuration. For example, a simple Braille-English translation is shown below:

The Braille Alphabet (figure from pharmabraille.com)

Additional “shortcut” symbols are used for specific sounds, punctuation, symbols and words. The figure below shows some examples of common words in Braille:

Braille for words and abbreviations (from the Tennessee Council of the Blind)

Additional abbreviations can be found in [1]. Going back to the passport, we can obtain the transcription using Unicode:

⠠⠮⠀⠎⠑⠒⠙⠀⠓⠁⠇⠋⠀⠷⠀⠮⠀⠋⠇⠁⠛⠀⠊⠎⠀⠮⠀⠘⠺
⠠⠏⠇⠁⠞⠽⠏⠥⠎⠲⠀⠠⠁⠙⠙⠀⠭⠀⠁⠋⠀⠮⠀⠋⠌⠀⠓⠁⠇⠋⠀⠞⠕
⠕⠃⠞⠁⠔⠀⠁⠀⠉⠕⠍⠏⠇⠑⠞⠑⠀⠋⠇⠁⠛⠲⠀⠠⠛⠇⠕⠗⠽⠀⠞⠕
⠀⠠⠠⠗⠁⠕

We then translate each symbol from Braille to English:

Putting everything together, we obtain the second half of the flag, as spelled out in the passport:

The second half of the flag is the word Platypus. Add x after the first half to obtain a complete flag. Glory to Rao
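The plain letters of the transcription can be decoded mechanically. The sketch below is only a partial decoder and rests on a few assumptions: it covers the 26 uncontracted letters, the blank cell and the capital sign (dot 6), and it prints "?" for every contraction or punctuation cell, which still has to be looked up by hand. The names `cell`, `LETTERS` and `decode_letters` are made up for this example.

```python
def cell(*dots: int) -> str:
    """Build the Unicode Braille character for the given dot numbers."""
    return chr(0x2800 + sum(1 << (d - 1) for d in dots))


# Dots of the first ten letters; k-t add dot 3, u/v/x/y/z add dots 3 and 6.
FIRST_DECADE = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4), "j": (2, 4, 5),
}

LETTERS = {}
for i, (letter, dots) in enumerate(FIRST_DECADE.items()):
    LETTERS[cell(*dots)] = letter
    LETTERS[cell(*dots, 3)] = "klmnopqrst"[i]
    if i < 5:
        LETTERS[cell(*dots, 3, 6)] = "uvxyz"[i]
LETTERS[cell(2, 4, 5, 6)] = "w"  # w does not follow the decade pattern

CAPITAL = cell(6)  # capitalizes the next letter
BLANK = cell()     # U+2800, used as a space


def decode_letters(text: str) -> str:
    """Decode uncontracted letters; unknown cells become '?'."""
    out, capitalize = [], False
    for ch in text:
        if ch == CAPITAL:
            capitalize = True
            continue
        plain = " " if ch == BLANK else LETTERS.get(ch, "?")
        out.append(plain.upper() if capitalize else plain)
        capitalize = False
    return "".join(out)


print(decode_letters("⠠⠏⠇⠁⠞⠽⠏⠥⠎"))  # Platypus
```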

First Half of the Flag

The second part of the challenge is much more obscure and less documented than the first one. The inside of the cover page contains a short score holding a total of 4 staves: the first one appears to be a chord while the last 3 are simply a sequence of notes. We noticed that the first staff contains a treble clef and the label “KEY-997”, and that it is shorter than the other staves. Furthermore, it shows the type of note used.

The revolutionary lullaby of Rao, NorthSec 2017
Scanned copy of the lullaby on the last page of Rao's passport.

Clearly, there is something in there, but how do we extract a flag out of this? My knowledge of music theory is extremely limited and was basically non-existent prior to this challenge. As such, feel free to correct me in the comments if I misrepresent a musical concept or term.

Googling for words relating to music and cryptography returns a limited set of relevant sites. The first relates to musical cryptograms, which is not quite what we are looking for at the moment. The second page is about the Solfa Cipher. Once you’ve found the latter website, you’re almost at the solution, but let’s take a better look at it.

The Solfa Cipher

The Solfa cipher is a substitution cipher, but rather than using an alphabet to encode keys and ciphertext, it uses musical notation. The encryption/decryption key is defined using a clef, a tonic, a mode and a rhythmic unit. The links will provide a much better definition of each different item than I could ever do in this article. However, be aware that the 4 elements mentioned above can have the following values:

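| Clefs | Tonic | Mode | Rhythmic Unit |
| --- | --- | --- | --- |
| Treble | C, C# | Minor | 1/4 (Quarter) |
| Alto | D♭, D | Major | 1/8 (Eighth) |
| Bass | E♭, E | Phrygian | 1/16 (Sixteenth) |
| | F, F# | Dorian | |
| | G♭, G | Lydian | |
| | A♭, A | Mixolydian | |
| | B♭, B | Locrian | |

Valid values for the properties of the Solfa key.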
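To make the four properties concrete, here is a small sketch of how such a key could be represented and validated in Python. The class name `SolfaKey` and its field names are made up for this illustration and do not come from any existing implementation; the allowed values are simply those listed in the table.

```python
from dataclasses import dataclass

CLEFS = {"treble", "alto", "bass"}
TONICS = {"C", "C#", "D♭", "D", "E♭", "E", "F", "F#", "G♭", "G", "A♭", "A", "B♭", "B"}
MODES = {"minor", "major", "phrygian", "dorian", "lydian", "mixolydian", "locrian"}
RHYTHMIC_UNITS = {"1/4", "1/8", "1/16"}


@dataclass(frozen=True)
class SolfaKey:
    """Hypothetical container for the four properties of a Solfa key."""
    clef: str
    tonic: str
    mode: str
    rhythmic_unit: str

    def __post_init__(self):
        checks = (("clef", CLEFS), ("tonic", TONICS),
                  ("mode", MODES), ("rhythmic_unit", RHYTHMIC_UNITS))
        for name, allowed in checks:
            value = getattr(self, name)
            if value not in allowed:
                raise ValueError(f"invalid {name}: {value!r}")


# A treble clef in C minor with an eighth note as the rhythmic unit.
key = SolfaKey(clef="treble", tonic="C", mode="minor", rhythmic_unit="1/8")
print(key)
```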

Like any symmetric cipher, a key is needed to encrypt the message, which will have to be shared with the intended recipients of the message. In this case, the key is in musical notation rather than a sequence of characters or bytes. The first staff of the page represents the key of the cipher, as the label clearly shows. The encryption key is composed of the four elements mentioned above: a treble clef, in C minor, using 1/8 as the rhythmic unit:

Solfa Cipher Key used in the passport at NSEC '17: a treble clef in C minor with a 1/8 rhythmic unit.

Each note is linked to one of the seven pitches of the solfege, i.e. Do (D), Re (R), Mi (M), Fa (F), Sol (S), La (L) and Si, also called Ti (T). The “KEY-554” is only a randomly generated label and has no significance in the algorithm. With the key known, this puzzle becomes a chain of translations: from musical notes to a list of tuples of pitches and times, then from those tuples to letters using the standard matrix below:

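| Row | Do (D) | Re (R) | Mi (M) | Fa (F) | So (S) | La (L) | Ti (T) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | T | I | A | S | E | N | O |
| 2 | K | Z | X | J | Å | Æ | |
| 3 | R | C | H | M | D | L | U |
| 4 | F | Y | G | P | W | B | V |

English language translation matrix normally used for the Solfa cipher.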
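The matrix can also be read in the other direction, from a letter to a (pitch, row) pair, which is what encryption needs. A minimal sketch under one stated assumption: row 2 is left out because the matrix as printed here lists only six letters for its seven columns, so their exact positions are unclear. The names `ENCODE` and `DECODE` are made up for this example.

```python
PITCHES = "DRMFSLT"  # Do, Re, Mi, Fa, So, La, Ti

# Rows 1, 3 and 4 of the matrix. Row 2 is deliberately omitted (assumption:
# its six printed letters cannot be assigned to seven columns with certainty).
ROWS = {1: "TIASENO", 3: "RCHMDLU", 4: "FYGPWBV"}

# Letter -> (pitch, row) for encryption, and the reverse for decryption.
ENCODE = {
    letter: (pitch, row)
    for row, letters in ROWS.items()
    for pitch, letter in zip(PITCHES, letters)
}
DECODE = {pair: letter for letter, pair in ENCODE.items()}

print([ENCODE[c] for c in "RAO"])  # [('D', 3), ('M', 1), ('T', 1)]
```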

The columns represent the pitch, while the rows represent the time unit within the measure (1, 2, 3 or 4) at which the note starts. Let’s go through a complete example to better understand the process. Consider the staff below:

The word "SOLFA" encrypted using the Solfa Cipher

In this case, we assume a 4/4 meter, meaning that each measure has a duration of 4 time units. The key used to generate this melody was C major, with a treble clef and a rhythmic unit of 1/4 (Quarter). The first note is Fa and starts at the first time unit, i.e. 1. Therefore the first note can be translated to (F, 1). The Fa is 4 units long, so it fills the whole measure and the second note, Ti, also starts at time 1, translating to (T, 1). This time, however, the note is only 2 time units long and thus the third note, La, starts at 3 and lasts only 1 time unit. The third note is therefore translated to (L, 3), while the fourth one is translated to (D, 4). Finally, the last note is a Mi that starts at 1 and is thus translated to (M, 1). Putting everything together we have (F, 1), (T, 1), (L, 3), (D, 4) and (M, 1). Using the matrix above, we obtain (F, 1) = “S”, (T, 1) = “O”, (L, 3) = “L”, (D, 4) = “F” and (M, 1) = “A”, and thus the plaintext is the word “SOLFA”. This process is better represented in the figure below:

Decryption of the word "SOLFA" by reading the notes and their duration.
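The timing part of this walkthrough can be sketched in a few lines: the row of each note is the time unit at which it starts, which follows from the durations of all the notes before it. The sketch below assumes a 4-unit measure as in the example and takes the third pitch to be La, the column in which row 3 of the matrix holds the letter L. The duration of the final note is an arbitrary value, since it does not affect any start time, and row 2 of the matrix is omitted because it is not needed here.

```python
from itertools import accumulate

PITCHES = "DRMFSLT"
ROWS = {1: "TIASENO", 3: "RCHMDLU", 4: "FYGPWBV"}  # row 2 not needed here


def start_units(durations, measure_length=4):
    """Return the 1-based time unit within the measure at which each note starts."""
    elapsed = [0, *accumulate(durations)][:-1]  # time elapsed before each note
    return [t % measure_length + 1 for t in elapsed]


pitches = "FTLDM"
durations = [4, 2, 1, 1, 4]  # last value is arbitrary for this illustration

starts = start_units(durations)
print(starts)  # [1, 1, 3, 4, 1]

plaintext = "".join(ROWS[s][PITCHES.index(p)] for p, s in zip(pitches, starts))
print(plaintext)  # SOLFA
```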

Going back to the NSEC challenge, we have a much larger melody to decrypt. Luckily, we have the key and the same process as the one we used to decrypt the ciphertext in figure 8 applies.

Solfa-encrypted Message from the Passport in NSEC '17

Let’s take the first 9 notes listed in the figure above. For each of the notes, we first determine its pitch (do, re, mi, …) and then its duration. The key specifies a 1/8 rhythmic unit; as such, an eighth note is worth 1 time unit, a quarter note is worth 2 time units and a half note is worth 4 time units. Unless specified otherwise, the meter is 4/4, i.e. a measure is 4 time units long.
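The conversion from note type to time units is just a division of the note value by the rhythmic unit of the key. A short sketch using exact fractions (the names `NOTE_VALUES` and `time_units` are made up for this example):

```python
from fractions import Fraction

NOTE_VALUES = {
    "eighth": Fraction(1, 8),
    "quarter": Fraction(1, 4),
    "half": Fraction(1, 2),
}


def time_units(note: str, rhythmic_unit: Fraction = Fraction(1, 8)) -> int:
    """Number of time units a note lasts for a given rhythmic unit."""
    units = NOTE_VALUES[note] / rhythmic_unit
    if units.denominator != 1:
        raise ValueError(f"{note} note is shorter than the rhythmic unit")
    return int(units)


for name in NOTE_VALUES:
    print(name, time_units(name))  # eighth 1, quarter 2, half 4
```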

The first 9 notes of the Solfa-encrypted message and the initial tempo of the melody.

Using the figure above, we can then extract the notes and their times from the score:

(d, 1) (m, 3) (s, 1) (d, 4) (r, 1) (d, 3) (f, 1) (d, 1) (m, 3) …

The note is used as the column and the time as the row to read the corresponding value defined in the translation matrix. We can put together a quick script that reads these notes and finds the corresponding characters. Using table 3, the notes above are translated to:

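| d, 1 | m, 3 | s, 1 | d, 4 | r, 1 | d, 3 | f, 1 | d, 1 | m, 3 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T | H | E | F | I | R | S | T | H |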
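A quick script along those lines might look like the sketch below. It is not the implementation linked in this post, only an illustration: it parses the tuples exactly as written above with a regular expression and looks each one up in rows 1, 3 and 4 of the matrix. Row 2 is omitted as an assumption of this sketch, since none of these nine notes uses it.

```python
import re

PITCHES = "drmfslt"
ROWS = {1: "TIASENO", 3: "RCHMDLU", 4: "FYGPWBV"}  # row 2 not needed here

NOTES = "(d, 1) (m, 3) (s, 1) (d, 4) (r, 1) (d, 3) (f, 1) (d, 1) (m, 3)"


def decrypt(notes: str) -> str:
    """Translate a string of '(pitch, time)' tuples into plaintext."""
    pairs = re.findall(r"\(([a-z]),\s*(\d)\)", notes)
    return "".join(ROWS[int(time)][PITCHES.index(pitch)] for pitch, time in pairs)


print(decrypt(NOTES))  # THEFIRSTH
```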

Applying this process to every note in figure 9, we obtain the following message:

THEFIRSTHALFOFTHEFLAGISTHEWORDSUBDERMALCONCATENATEWITHTHESECONDHALFTOOBTAINACOMPLETEFLAGGLORYTORAO

Or with spaces and punctuation added: “The first half of the flag is the word subdermal. Concatenate with the second half to obtain a complete flag. Glory to Rao”. Combining the 2 halves of the flag, we get the string “SUBDERMALxPlatypus” and get 5 points out of it.

In case you are wondering what the melody in figure 9 sounds like, you can download the resulting MIDI file here: A Revolutionary Lullaby.

Conclusion

Braille and Solfa are quick and fun ways to encode or encrypt data in unusual ways. While they obviously should not be used for any serious application, they could potentially be used as novel ways to exfiltrate data and bypass some filters. For example, a text file could be Base32 encoded, with the padding character (“=”) replaced by the number 1. The resulting string could then be encrypted using the Solfa cipher, transformed into a MIDI file and uploaded to a remote location. I highly suspect that most network security appliances would not pick up on the MIDI file being uploaded, although it would probably strike a careful analyst as suspicious. Feel free to experiment with it. A partial Python implementation can be found here.
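As a rough sketch of the first step of that idea only, the snippet below Base32-encodes some bytes with Python's standard `base64` module and swaps the padding character for "1", which is safe to reverse because "1" is not part of the standard Base32 alphabet. The function names are made up for this example, and the Solfa and MIDI steps are not covered.

```python
import base64


def to_padded_base32(data: bytes) -> str:
    """Base32-encode data and replace the '=' padding with '1'."""
    return base64.b32encode(data).decode("ascii").replace("=", "1")


def from_padded_base32(text: str) -> bytes:
    """Reverse to_padded_base32: restore the padding, then decode."""
    return base64.b32decode(text.replace("1", "="))


encoded = to_padded_base32(b"Glory to Rao")
print(encoded)
print(from_padded_base32(encoded))  # b'Glory to Rao'
```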

References

Additional Readings