While initializing all conversion modules, the main driver constructs a table of every available conversion routine (single step), recording for each its starting charset and its ending charset. If we consider these charsets as the nodes of a directed graph, each single step may be considered as an oriented arc from one node to another. A cost is attributed to each arc: for example, a high penalty is given to single steps which are prone to losing characters, a low penalty is given to those which need to examine more than one input character to produce an output character, etc.
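The table described above can be pictured as an adjacency list. The following is a minimal sketch in Python (not the language recode itself uses), assuming a hypothetical Step record; the charset pairs and cost values are invented for illustration and are not the penalties recode actually assigns.

```python
from collections import defaultdict
from typing import NamedTuple


class Step(NamedTuple):
    """One single step: an arc of the charset graph (hypothetical record)."""
    before: str  # starting charset
    after: str   # ending charset
    cost: int    # penalty attached to the arc; lower is better


# Illustrative table only: these steps and costs are assumptions.
STEPS = [
    Step("latin1", "rfc1345", 2),
    Step("rfc1345", "ibmpc", 2),
    Step("latin1", "ascii", 10),  # prone to losing characters: high penalty
    Step("ascii", "ibmpc", 1),
]


def build_graph(steps):
    """Index the step table by starting charset: node -> outgoing arcs."""
    graph = defaultdict(list)
    for step in steps:
        graph[step.before].append(step)
    return dict(graph)


GRAPH = build_graph(STEPS)
```

Each key of GRAPH is a node with at least one outgoing arc; a charset that only ever appears as a destination (here ibmpc) has no entry.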
Given a starting code and a goal code, recode computes the most
economical route through the elementary recodings, that is, the best
sequence of conversions that will transform the input charset into the
final charset. To speed up execution, recode looks for
subsequences of conversions which are simple enough to be merged; it
then dynamically creates new single steps for these mergings.
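Finding the most economical route is a shortest-path problem on the weighted graph. The text does not say which algorithm recode uses, so the sketch below assumes Dijkstra's algorithm as a standard choice; the function name cheapest_route and the step table are hypothetical.

```python
import heapq


def cheapest_route(steps, start, goal):
    """Return (total_cost, [charsets on the route]), or None if unreachable.

    `steps` is an iterable of (before, after, cost) triples with
    non-negative costs, as Dijkstra's algorithm requires.
    """
    arcs = {}
    for before, after, cost in steps:
        arcs.setdefault(before, []).append((after, cost))

    queue = [(0, start, [start])]  # (cost so far, current node, path taken)
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == goal:
            return total, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, cost in arcs.get(node, []):
            if nxt not in settled:
                heapq.heappush(queue, (total + cost, nxt, path + [nxt]))
    return None


# Illustrative table only: these steps and costs are assumptions.
STEPS = [
    ("latin1", "rfc1345", 2),
    ("rfc1345", "ibmpc", 2),
    ("latin1", "ascii", 10),  # lossy step, heavily penalised
    ("ascii", "ibmpc", 1),
]

print(cheapest_route(STEPS, "latin1", "ibmpc"))
# prints (4, ['latin1', 'rfc1345', 'ibmpc'])
```

With these made-up costs, the two-step route through rfc1345 (total 4) wins over the route through ascii (total 11), because the lossy step carries a high penalty.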
A double step is a sequence of two single steps, the output of the
first being the special charset rfc1345, the input of the second
single step also being rfc1345. Special machinery dynamically
produces efficient, reversible, merge-able single steps out of these
double steps.
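One way to picture how a double step collapses into a single step is table composition through the pivot charset. The sketch below is an assumption about the general idea, not recode's actual machinery: the function names are hypothetical, and the two-entry tables are a tiny illustrative excerpt using RFC 1345 mnemonics as pivot symbols.

```python
def merge_double_step(to_pivot, from_pivot):
    """Compose byte -> pivot and pivot -> byte maps into one byte -> byte table.

    Source bytes whose pivot symbol has no counterpart in the target
    charset map to None.
    """
    return {byte: from_pivot.get(symbol) for byte, symbol in to_pivot.items()}


def reverse_step(table):
    """Invert a merged table; this only works when the table is one-to-one."""
    reverse = {}
    for src, dst in table.items():
        if dst is None or dst in reverse:
            raise ValueError("step is not reversible")
        reverse[dst] = src
    return reverse


# Tiny excerpt for illustration: 'A' and e-acute in Latin-1 and IBM PC.
latin1_to_pivot = {0x41: "A", 0xE9: "e'"}
pivot_to_ibmpc = {"A": 0x41, "e'": 0x82}

merged = merge_double_step(latin1_to_pivot, pivot_to_ibmpc)
print(merged == {0x41: 0x41, 0xE9: 0x82})  # prints True
```

Once composed, the merged table converts in one lookup per byte without materialising the intermediate rfc1345 form, and a one-to-one table can be inverted to obtain the opposite step.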
The main part of recode is written in C, as are most single
steps. A few single steps need to recognize sequences of multiple
characters; these are often better written in flex.