

ASCII 7-bits, BS to overstrike

This charset is available in recode under the name ASCII-BS, with BS as an acceptable alias.

The file is straight ASCII, seven bits only. According to the definition of ASCII, diacritics are applied by a sequence of three characters: the letter, one BS, the diacritic mark. We deviate slightly from this by exchanging the diacritic mark and the letter so that, on a screen device, the diacritic disappears and leaves the letter alone. At recognition time, both orders are accepted.
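The recognition rule just described can be sketched in Python. This is only an illustration, not recode's implementation: the function name decode_overstrike and the COMBINING table are assumptions made for the example, and of the marks listed only the apostrophe for the acute accent is confirmed by this section.

```python
import unicodedata

BS = "\b"

# Hypothetical mark table: ASCII mark -> Unicode combining character.
# Only "'" (acute) is confirmed by this section; the rest are assumed.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
}


def decode_overstrike(text: str) -> str:
    """Fold 'mark BS letter' and 'letter BS mark' into accented letters."""
    out = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BS:
            first, third = text[i], text[i + 2]
            if first in COMBINING and third.isalpha():
                mark, letter = first, third      # diacritic first
            elif third in COMBINING and first.isalpha():
                mark, letter = third, first      # letter first
            else:
                mark = letter = None
            if mark is not None:
                # Compose letter + combining mark into one character if possible.
                out.append(unicodedata.normalize("NFC", letter + COMBINING[mark]))
                i += 3
                continue
        out.append(text[i])
        i += 1
    return "".join(out)
```

Both decode_overstrike("'\be") and decode_overstrike("e\b'") yield the same accented letter, which mirrors the statement that either order is acceptable on input.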

The French quotes are coded by the sequences < BS " or " BS < for the opening quote, and > BS " or " BS > for the closing quote. This artificial convention was inherited in straight ASCII-BS from habits around Bang-Bang entry, and is not well known. But we decided to stick to it so that the ASCII-BS charset will not lose French quotes.
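As a rough illustration of the four quote sequences, the following Python sketch maps them to the Latin-1 guillemets. The names GUILLEMETS and decode_french_quotes are hypothetical, and the plain substring replacement is a simplification that ignores ambiguous overlaps between adjacent sequences.

```python
# Hypothetical table: the four ASCII-BS sequences and the quotes they stand for.
GUILLEMETS = {
    '<\b"': "\u00ab",  # < BS "  -> opening French quote
    '"\b<': "\u00ab",  # " BS <  -> opening French quote
    '>\b"': "\u00bb",  # > BS "  -> closing French quote
    '"\b>': "\u00bb",  # " BS >  -> closing French quote
}


def decode_french_quotes(text: str) -> str:
    """Replace each three-character quote sequence with a single guillemet."""
    for sequence, quote in GUILLEMETS.items():
        text = text.replace(sequence, quote)
    return text
```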

The ASCII-BS charset is independent of ASCII, and different. The following examples demonstrate this, knowing in advance that `!2' is the Bang-Bang way of representing an e with an acute accent. Compare:

% echo \!2 | recode -v bang:us | od -bc
Bang-Bang -> ISO_8859-1:1987 -> RFC 1345 -> ANSI_X3.4-1968 (many to one)
Simplified to: Bang-Bang -> ISO_8859-1:1987 -> ANSI_X3.4-1968 (many to one)
0000000 351 012
        351  \n
0000002

with:

% echo \!2 | recode -v bang:bs | od -bc
Bang-Bang -> ISO_8859-1:1987 -> ASCII-BS (many to many)
0000000 047 010 145 012
          '  \b   e  \n
0000004

In the first case, the e with an acute accent is merely passed through by the Latin-1:ASCII mapping, which has no special recoding rule for it. In the Latin-1:ASCII-BS case, the acute accent is applied over the e with a backspace, because diacriticized characters have special rules. For the ASCII-BS charset, reversibility is still possible, but there might be difficult cases.
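The Latin-1 to ASCII-BS direction can be sketched the same way. This is not how recode does it internally; the function name encode_overstrike and the MARKS table are assumptions for the example, with only the apostrophe for the acute accent taken from the od output above. The sketch emits the diacritic first, then BS, then the letter.

```python
import unicodedata

# Hypothetical table: Unicode combining character -> ASCII mark.
# Only the acute accent mapping is confirmed by the od output in this section.
MARKS = {
    "\u0301": "'",  # acute
    "\u0300": "`",  # grave
    "\u0302": "^",  # circumflex
    "\u0308": '"',  # diaeresis
    "\u0303": "~",  # tilde
}


def encode_overstrike(text: str) -> bytes:
    """Encode accented letters as 'mark BS letter' in seven-bit ASCII."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if (len(decomposed) == 2
                and decomposed[1] in MARKS
                and decomposed[0].isascii()):
            # Diacritic first, so the letter is what remains visible on a screen.
            out.append(MARKS[decomposed[1]] + "\b" + decomposed[0])
        else:
            out.append(ch)
    # Anything still outside ASCII is replaced by '?' in this sketch.
    return "".join(out).encode("ascii", errors="replace")


if __name__ == "__main__":
    # Print the octal byte values, in the style of od -b.
    print(" ".join(format(b, "03o") for b in encode_overstrike("\u00e9\n")))
```

Run as a script, this prints 047 010 145 012, the same four bytes as in the bang:bs transcript.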

