This library extends the native codecs library (namely by adding new custom encodings and character mappings) and provides a myriad of new encodings (static or parametrized, like rot or xor), hence its name, which combines CODecs EXTension.
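
As a rough illustration of what "parametrized" means here, the sketch below registers a rot-like codec with the standard library alone, through `codecs.register`. The codec name `demorot-N` and all helper names are hypothetical and exist only for this example; this is not codext's implementation or API.

```python
import codecs
import re

# Hypothetical name pattern for this sketch: "demorot-N" with N in [1, 25].
# codecs.lookup() lowercases the name and, on recent Python versions,
# may turn hyphens into underscores, so both separators are accepted.
_NAME = re.compile(r"^demorot[-_]?(\d{1,2})$")


def _rotate(text, shift):
    """Rotate ASCII letters by `shift` positions, leaving other characters alone."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - 97 + shift) % 26 + 97))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - 65 + shift) % 26 + 65))
        else:
            out.append(ch)
    return "".join(out)


def _search(name):
    """Search function: return a CodecInfo if `name` matches, else None."""
    match = _NAME.match(name)
    if match is None:
        return None
    shift = int(match.group(1))
    if not 1 <= shift <= 25:
        return None

    def encode(text, errors="strict"):
        return _rotate(text, shift), len(text)

    def decode(text, errors="strict"):
        return _rotate(text, -shift), len(text)

    return codecs.CodecInfo(encode, decode, name=name)


codecs.register(_search)

print(codecs.encode("this is a test", "demorot-3"))  # wklv lv d whvw
```

The parameter is parsed out of the codec name, so a single search function serves every value of N.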


$ pip install codext

Note: Some encodings are available in Python 3 only.

Usage (CLI tool)

$ codext -i test.txt encode dna-1
$ echo -en "test" | codext encode morse
- . ... -

Python 3 (includes Ascii85, Base85, Base100 and braille):

$ echo -en "test" | codext encode braille
⠞⠑⠎⠞
$ echo -en "test" | codext encode base100
👫👜👪👫

Using codecs chaining:

$ echo -en "Test string" | codext encode reverse
gnirts tseT
$ echo -en "Test string" | codext encode reverse morse
--. -. .. .-. - ... / - ... . -
$ echo -en "Test string" | codext encode reverse morse dna-2
$ echo -en "Test string" | codext encode reverse morse dna-2 octal
test string
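
Conceptually, chaining feeds the output of each codec into the next one, left to right. The following sketch mimics the `reverse morse` chain with plain Python functions. The `chain` helper, the `MORSE` table and the two step functions are hypothetical stand-ins written for this example, not codext internals.

```python
from functools import reduce

# Minimal Morse table (letters only) for this sketch.
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}


def reverse(text):
    """Reverse the input string."""
    return text[::-1]


def morse(text):
    """Encode letters as Morse, space-separated, with '/' between words."""
    return " ".join("/" if ch == " " else MORSE[ch.lower()] for ch in text)


def chain(text, *steps):
    """Apply each encoding step in order, passing the result along."""
    return reduce(lambda acc, step: step(acc), steps, text)


print(chain("Test string", reverse))         # gnirts tseT
print(chain("Test string", reverse, morse))  # --. -. .. .-. - ... / - ... . -
```

Decoding a chain would apply the inverse steps in the opposite order.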

Usage (Python)

Getting the list of available codecs:

>>> import codext
>>> codext.list()
['ascii85', 'base85', 'base100', 'base122', ..., 'tomtom', 'dna', 'html', 'markdown', 'url', 'resistor', 'sms', 'whitespace', 'whitespace-after-before']

Usage examples:

>>> codext.encode("this is a test", "base58-bitcoin")
>>> codext.encode("this is a test", "base58-ripple")
>>> codext.encode("this is a test", "base58-url")
>>> import codecs
>>> codecs.encode("this is a test", "base100")
'👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫'
>>> codecs.decode("👫👟👠👪🐗👠👪🐗👘🐗👫👜👪👫", "base100")
'this is a test'
>>> for i in range(8):
        print(codext.encode("this is a test", "dna-%d" % (i + 1)))
'this is a test'
>>> codecs.encode("this is a test", "morse")
'- .... .. ... / .. ... / .- / - . ... -'
>>> codecs.decode("- .... .. ... / .. ... / .- / - . ... -", "morse")
'this is a test'
>>> with open("morse.txt", 'w', encoding="morse") as f:
	f.write("this is a test")
>>> with open("morse.txt", encoding="morse") as f:
	f.read()
'this is a test'
>>> codext.decode("""
   z      """, "whitespace-after+before")
>>> print(codext.encode("An example test string", "baudot-tape"))
   . *
*  .  
*  .* 
   . *
** .* 
** .**
*  .  
* *. *
* *.  
* *. *
*  .  
* *.  
* *. *
 * .* 

List of codecs

| Codec | Conversions | Comment |
|---|---|---|
| a1z26 | text <-> alphabet order numbers | keeps words whitespace-separated and uses a custom character separator |
| affine | text <-> affine ciphertext | aka Affine Cipher |
| ascii85 | text <-> ascii85 encoded text | Python 3 only |
| atbash | text <-> Atbash ciphertext | aka Atbash Cipher |
| bacon | text <-> Bacon ciphertext | aka Baconian Cipher |
| barbie-N | text <-> barbie ciphertext | aka Barbie Typewriter (N belongs to [1, 4]) |
| baseXX | text <-> baseXX | see base encodings |
| baudot | text <-> Baudot code bits | supports CCITT-1, CCITT-2, EU/FR, ITA1, ITA2, MTK-2 (Python 3 only), UK, … |
| bcd | text <-> binary coded decimal text | encodes characters from their (zero-left-padded) ordinals |
| braille | text <-> braille symbols | Python 3 only |
| citrix | text <-> Citrix CTX1 ciphertext | aka Citrix CTX1 password encoding |
| dna | text <-> DNA-N sequence | implements the 8 rules of DNA sequences (N belongs to [1,8]) |
| excess3 | text <-> XS3 encoded text | uses Excess-3 (aka Stibitz code) binary encoding to convert characters from their ordinals |
| gray | text <-> gray encoded text | aka reflected binary code |
| gzip | text <-> Gzip-compressed text | standard Gzip compression/decompression |
| html | text <-> HTML entities | implements entities according to this reference |
| ipsum | text <-> latin words | aka lorem ipsum |
| leetspeak | text <-> leetspeak encoded text | based on minimalistic elite speaking rules |
| letter-indices | text <-> text with letter indices | encodes consonants and/or vowels with their corresponding indices |
| manchester | text <-> manchester encoded text | XORes each bit of the input with 01 |
| markdown | markdown --> HTML | unidirectional |
| morse | text <-> morse encoded text | uses whitespace as a separator |
| navajo | text <-> Navajo | only handles letters (not full words from the Navajo dictionary) |
| octal | text <-> octal digits | dummy octal conversion (converts to 3-digit groups) |
| ordinal | text <-> ordinal digits | dummy character ordinals conversion (converts to 3-digit groups) |
| radio | text <-> radio words | aka NATO or radio phonetic alphabet |
| resistor | text <-> resistor colors | aka resistor color codes |
| rot | text <-> rot(N) ciphertext | aka Caesar cipher (N belongs to [1,25]) |
| rotate | text <-> N-bits-rotated text | rotates characters by the specified number of bits |
| scytale | text <-> scytale ciphertext | encrypts with L, the number of letters on the rod (belongs to [1,[) |
| shift | text <-> shift(N) ciphertext | shifts ordinals by N (belongs to [1,255]) |
| sms | text <-> phone keystrokes | also called T9 code; uses “-” as a separator for encoding, “-” or “_” or whitespace for decoding |
| southpark | text <-> Kenny’s language | converts letters to Kenny’s language from Southpark (whitespace is also handled) |
| tomtom | text <-> tom-tom encoded text | similar to morse, using slashes and backslashes |
| url | text <-> URL encoded text | aka URL encoding |
| xor | text <-> XOR(N) ciphertext | XOR with a single byte (N belongs to [1,255]) |
| whitespace | text <-> whitespaces and tabs | replaces bits with whitespaces and tabs |
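
To make a few of the rows above concrete, here is a minimal sketch of the transformations described for xor, octal and atbash. It follows only the table's one-line descriptions. The function names are hypothetical, and the exact input/output types and edge-case handling of the real codecs may differ.

```python
def xor_bytes(data, key):
    """XOR every byte with a single-byte key in [1, 255]; applying it twice restores the input."""
    if not 1 <= key <= 255:
        raise ValueError("key must be in [1, 255]")
    return bytes(b ^ key for b in data)


def octal_encode(data):
    """Write each byte as a zero-padded group of 3 octal digits."""
    return "".join("%03o" % b for b in data)


def octal_decode(digits):
    """Read groups of 3 octal digits back into bytes."""
    return bytes(int(digits[i:i + 3], 8) for i in range(0, len(digits), 3))


def atbash(text):
    """Mirror each ASCII letter within the alphabet (a <-> z, b <-> y, ...)."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr(219 - ord(ch)))  # ord("a") + ord("z") == 219
        elif "A" <= ch <= "Z":
            out.append(chr(155 - ord(ch)))  # ord("A") + ord("Z") == 155
        else:
            out.append(ch)
    return "".join(out)


print(octal_encode(b"test"))  # 164145163164
print(atbash("test"))         # gvhg
```

XOR and Atbash are their own inverses, so the same function serves for both encoding and decoding.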

A few variants are also implemented.

| Codec | Conversions | Comment |
|---|---|---|
| baudot-spaced | text <-> Baudot code groups of bits | groups of 5 bits are whitespace-separated |
| baudot-tape | text <-> Baudot code tape | outputs a string that looks like a perforated tape |
| bcd-extended0 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 0000 |
| bcd-extended1 | text <-> BCD-extended text | encodes characters from their (zero-left-padded) ordinals using prefix bits 1111 |
| manchester-inverted | text <-> manchester encoded text | XORes each bit of the input with 10 |
| octal-spaced | text <-> octal digits (whitespace-separated) | dummy octal conversion |
| ordinal-spaced | text <-> ordinal digits (whitespace-separated) | dummy character ordinals conversion |
| southpark-icase | text <-> Kenny’s language | same as southpark but case insensitive |
| whitespace_after_before | text <-> lines of whitespaces[letter]whitespaces | encodes characters as new characters with whitespaces before and after according to an equation described in the codec name (e.g. “whitespace+2*after-3*before”) |
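
The relation between manchester and its manchester-inverted variant can be sketched as follows: each input bit is XORed with both bits of a 2-bit pattern, `01` for the base codec and `10` for the variant. This is an illustration built from the table descriptions only. The function names and the choice of a bit string as output are assumptions, and the real codec's output format may differ.

```python
def manchester_encode(data, pattern="01"):
    """Expand every bit b of `data` (most significant bit first) into (b ^ p0, b ^ p1)."""
    p0, p1 = int(pattern[0]), int(pattern[1])
    out = []
    for byte in data:
        for pos in range(7, -1, -1):
            bit = (byte >> pos) & 1
            out.append(str(bit ^ p0))
            out.append(str(bit ^ p1))
    return "".join(out)


def manchester_decode(bits, pattern="01"):
    """Recover each source bit from the first bit of every pair, then repack into bytes."""
    p0 = int(pattern[0])
    source_bits = [int(bits[i]) ^ p0 for i in range(0, len(bits), 2)]
    out = bytearray()
    for i in range(0, len(source_bits), 8):
        byte = 0
        for bit in source_bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


print(manchester_encode(b"t"))                # 0110101001100101
print(manchester_encode(b"t", pattern="10"))  # 1001010110011010
```

With pattern `01`, a 0 bit becomes `01` and a 1 bit becomes `10`; the inverted variant swaps the two.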