This charset is available in recode under the name HTML, with w3 and
WWW as aliases.
HTML texts used by the World Wide Web limit themselves to 7-bit characters internally. Special sequences beginning with an ampersand & and ending with a semicolon ; are used to represent Latin-1 characters that have the 8th bit set.
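
The ampersand-and-semicolon representation described above can be sketched in Python. This is an illustration only, not how recode itself is implemented: the function name latin1_to_entities is hypothetical, and the entity names come from Python's standard html.entities table rather than from recode's own tables.

```python
from html.entities import codepoint2name


def latin1_to_entities(text: str) -> str:
    """Replace 8-bit Latin-1 characters with named HTML sequences.

    Hypothetical helper for illustration; recode's own tables may differ.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        # 0xA0..0xFF are the printable Latin-1 characters with the 8th bit set.
        if 0xA0 <= cp <= 0xFF and cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")
        else:
            # 7-bit characters (and anything without a name) pass through.
            out.append(ch)
    return "".join(out)


print(latin1_to_entities("café"))
```

The result contains only 7-bit characters, so it survives transports that are not 8-bit clean.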
When you recode from another charset to HTML, beware that all
occurrences of ampersands are usually translated into the string
`&amp;'. Similarly, left angle brackets < are translated
into `&lt;' and right angle brackets > are translated into
`&gt;'. However, in practice, people often use ampersands and
angle brackets in the other charset for introducing HTML commands,
which compromises it: the text is neither pure HTML nor pure other charset.
These three translations can be rather inconvenient; they may be
specifically inhibited through the command option -d.
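
The difference between the default behaviour and the inhibited one can be sketched in Python. This is a minimal model under stated assumptions, not recode's implementation: the function to_html and its escape_markup parameter are hypothetical, with escape_markup=False standing in for the effect the text attributes to -d.

```python
from html.entities import codepoint2name

# The three markup characters and their HTML translations.
MARKUP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def to_html(text: str, escape_markup: bool = True) -> str:
    """Convert Latin-1 text to 7-bit HTML.

    escape_markup=False is a hypothetical stand-in for the -d option:
    ampersands and angle brackets are then left untouched.
    """
    out = []
    for ch in text:
        if ch in MARKUP:
            out.append(MARKUP[ch] if escape_markup else ch)
        elif 0xA0 <= ord(ch) <= 0xFF:
            # 8-bit Latin-1 characters are always converted.
            out.append(f"&{codepoint2name[ord(ch)]};")
        else:
            out.append(ch)
    return "".join(out)


mixed = "<b>café</b> & thé"
print(to_html(mixed))                       # markup characters are escaped
print(to_html(mixed, escape_markup=False))  # markup characters are kept
```

With escaping inhibited, existing HTML commands in the input stay usable while the accented letters are still converted. On the command line, this would presumably correspond to something like `recode -d latin1..html`, assuming the input is Latin-1.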