
MZFORM et al. - handle the I/O characteristic

MZFORM may cause garbage collection.

The nature of the contents of any bank which is to be transported from one computer to another has to be indicated to ZEBRA, so that it can do the necessary transformations. In the simplest case, where all the data words of a bank are of the same type, this is easily indicated in the parameters to MZLIFT or MZBOOK. For anything more complicated the user specifies the "format" of the bank by calling MZIOBK or MZFORM, which encode the format into a variable number of words to be included in the system part of each bank as the "I/O characteristic".

Thus the content description is carried by each bank; this avoids the complicated logistics of finding bank descriptors elsewhere than in the bank itself. Complex bank contents require a relatively large number of extra system words. This could represent a substantial overhead in memory or file space, which the user can avoid through the design of his bank contents. In any case, the number of these extra descriptor words is limited to 16, and any descriptor which would need more is refused. Thus ZEBRA will not handle arbitrary bank contents via this basic procedure, but by using the concept of the "self-describing" sector (see below) the user can indeed store any mix of information, decided at execution time, into a bank and have it travel from one computer to another.

Sectors

The basic element for setting up an I/O characteristic is the sector, which is a number of words consecutive in the bank which are all of the same type. A sector is described in the format parameter to MZFORM et al. as a combination of its word-count "c" and its type "t" as "ct". For example, 24F is a sector of 24 single-precision floating-point numbers, 24D is a sector of 24 words holding 12 double-precision numbers, and 1I is a sector of one integer word.

The possible values for "t" are:

      t =  B   bit string of 32 bits, right justified
           I   integer
           F   floating-point
           D   double precision
           H   4-character Hollerith
           S   self-describing sector (see below)
A "static" sector is a sector of a fixed number of words, such as the 24F of the example above.

An "indefinite-length" sector is a sector whose end is defined by the end of the bank. This is written as -t, for example -F signals that the rest of the bank is all floating-point words.

A "dynamic" sector is a sector which is preceded in the bank by a single positive integer word indicating the sector length; if this number is zero, the rest of the bank is currently unused. This is written as *t, for example *F indicates a dynamic sector of type floating.

Thus the word-count "c" in the sector specification is written as:

      c =  n   numeric, n words:       static length sector
           -   all remaining words:    indefinite length sector
           *   dynamic length sector
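
As an illustration only, the two tables above can be mirrored by a small Python sketch that classifies one "ct" specification. The function name parse_sector and the tuple it returns are hypothetical and not part of ZEBRA, which does this decoding inside MZIOCH, MZIOBK and MZFORM.

```python
import re

def parse_sector(token):
    """Split one sector specification 'ct' into (kind, count, type).

    Hypothetical helper for illustration; it only mirrors the tables
    of "c" and "t" values given in the text.
    """
    m = re.fullmatch(r"(\d+|-|\*)([BIFDHS])", token)
    if m is None:
        raise ValueError("not a valid sector specification: %r" % token)
    c, t = m.groups()
    if c == "-":
        return ("indefinite", None, t)   # runs to the end of the bank
    if c == "*":
        return ("dynamic", None, t)      # length word precedes the sector
    return ("static", int(c), t)         # fixed number of words

print(parse_sector("24F"))   # ('static', 24, 'F')
print(parse_sector("-F"))    # ('indefinite', None, 'F')
print(parse_sector("*I"))    # ('dynamic', None, 'I')
```
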
A "self-describing" sector is a dynamic sector whose type is also encoded into the one word preceding the sector as
   word 0  =  16*NW + IT

      with  NW = length of the sector
            IT = numeric representation of the type
                 =  1  B bit string    2  I integer
                    3  F floating      4  D double precision
                    5  H Hollerith
                    6  (reserve)       7  (special)
The form "nS" is meaningless; the form "*S" indicates one particular sector; the form "-S" is special in that it indicates that the rest of the bank is filled by self-describing sectors, as many as there may be. (Thus the forms, for example, '4I 5F / *S' and '4I 5F -S' are equivalent, but the second form is more economic; the user may give either, the internal result will be the second form.)
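
The descriptor word of a self-describing sector, word 0 = 16*NW + IT, can be sketched in Python as follows. The function names are hypothetical; only the formula and the type codes 1 to 5 are taken from the table above, and the codes 6 and 7 are passed through as plain numbers because their meaning is not spelled out here.

```python
# Numeric type codes IT from the table above.
TYPE_CODES = {"B": 1, "I": 2, "F": 3, "D": 4, "H": 5}
TYPE_LETTERS = {code: letter for letter, code in TYPE_CODES.items()}

def pack_descriptor(nw, letter):
    """Return word 0 = 16*NW + IT for a sector of nw words (hypothetical helper)."""
    if nw < 0:
        raise ValueError("sector length must not be negative")
    return 16 * nw + TYPE_CODES[letter]

def unpack_descriptor(word):
    """Return (NW, type) from word 0; unknown codes stay numeric."""
    nw, it = divmod(word, 16)
    return nw, TYPE_LETTERS.get(it, it)

print(pack_descriptor(24, "F"))   # 387, that is 16*24 + 3
print(unpack_descriptor(387))     # (24, 'F')
```
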

Complete Characteristics

Looking now at the bank as a whole, we divide it into a "leading part" and a "trailing part", either of which may be empty.

The leading part consists of one region of maybe several sectors, occurring once at the beginning of the bank. This leading region may end with an indefinite-length sector, in which case the trailing part is empty.

The trailing part of the bank may be empty or it may consist of an indefinite number of regions which all have the same structure, such that the same format description is valid for all of them.

The symbol "/" marks the break between the leading region and the trailing regions in the format parameter to MZFORM et al.

Examples:

trailing part empty:

      '-F'         the whole bank is floating
      '3I -F'      the first 3 words are integer, the rest is F
      '*I -F'      the first word n=IQ(L+1) is a positive integer,
                   words 2 to n+1 are integers, the rest is F
      '3B *I -F'   the first sector consists of 3 words bit-string,
                   the second sector is dynamic of type integer,
                   the rest of the bank is floating
      '3I *F'      the first 3 words are integer, followed by a
                   dynamic sector of type F, the rest (if any) of
                   the bank is currently unused

both parts present

      '3B 7I / 2I 4F 16D'  the leading region has 3 B and 7 I words,
                           each trailing region consists of 2 integer
                           words, followed by 4 F words, followed
                           by 16 D words, ie. 8 double-precision numbers
      '4I / *H'    the bank starts with 4 integer words,
                   the rest is filled with dynamic Hollerith sectors
      '*I / 2I *F' the leading region is one dynamic I sector,
                   each trailing region consists of 2 integers
                   followed by a dynamic F sector
                   (ie. 3 integers plus a number of floating words
                    this number being indicated by the 3rd integer)

leading part empty

      '/ *H'       the bank is filled with dynamic Hollerith sectors
      '/ 4I 29F'   4 integers and 29 floating numbers alternate
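
To make the interplay of leading region, trailing regions and dynamic sectors concrete, here is a Python sketch that labels every data word of a bank with its type. It is not ZEBRA code: the function label_words, the pre-parsed list representation of the regions, and the marks 'n' (length word of a dynamic sector) and '.' (unused word) are all assumptions made for this illustration. Self-describing sectors are left out, static counts are assumed positive, and a sector cut short by the end of the bank is simply truncated.

```python
def label_words(words, leading, trailing):
    """Return one label per data word: the type letter, 'n' or '.'.

    leading and trailing are lists of (count, type) pairs, where count
    is an integer, '-' (indefinite) or '*' (dynamic).
    """
    labels = []
    total = len(words)

    def apply(region):
        # Returns False once the rest of the bank has been decided.
        for count, t in region:
            pos = len(labels)
            if pos >= total:
                return False
            if count == "-":              # indefinite: runs to end of bank
                labels.extend([t] * (total - pos))
                return False
            if count == "*":              # dynamic: length word comes first
                length = words[pos]
                labels.append("n")
                if length == 0:           # zero: rest of bank is unused
                    labels.extend(["."] * (total - len(labels)))
                    return False
            else:
                length = count            # static sector
            room = total - len(labels)
            labels.extend([t] * min(length, room))
        return len(labels) < total

    if apply(leading):
        while trailing:
            before = len(labels)
            if not apply(trailing) or len(labels) == before:
                break
    labels.extend(["."] * (total - len(labels)))   # anything left is unused
    return "".join(labels)

# '*I / 2I *F' applied to a 12-word bank
bank = [2, 10, 20, 1, 2, 3, 1.0, 2.0, 3.0, 5, 6, 0]
print(label_words(bank, [("*", "I")], [(2, "I"), ("*", "F")]))
```

For this bank the sketch yields nIIIInFFFIIn: a dynamic I sector of two words, then one full trailing region, then a second trailing region whose dynamic F sector has length zero.
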

Economic formats

It is in the interest of the user to design his bank contents such that the I/O characteristic is as simple as possible, because the number of system words in any bank increases with the complexity of the lay-out. "Simple" means: as few sectors to be described as possible.

For example:  '2B 2I 2B 2I 2B 2I -F' is much less simple than
               '6B 6I -F'.
Moreover, if the integers described by this format are sure to be positive integers, then one can use the even simpler form '12B -F'.

In the following we give an exhaustive list of the most economic bank formats, those requiring zero or one extra system word in the banks.

Zero extra I/O words

These bank formats can be described by the 16 bits of the I/O control-byte alone:

   (0)  '-t' or '*t'
        'ct -t'           if c < 64   (c=* is represented as c=0,
          or 'ct *t'                   hence  '*t -t'  is a sub-case)
   (1)  '*t *t -t'
   (2)  '*t *t *t'
   (3)  'ct / *t'         if c < 64
        '/ ct *t'         if c < 64
        '/ ct'            this is useful only if c=*
                          else the form '-t' is used
   (4)  '*t / *t *t'
        '*t *t / *t'
   (5)  '/ *t *t *t'

One extra I/O word

These bank formats can be described by the 16 bits of the I/O control-byte plus the 32 bits of one extra I/O word:

   (1)  'ct -t'
        'ct ct -t'        if c < 65536
        'ct ct ct -t'     if c < 1024
   (2)  'ct *t'
        'ct ct *t'        if c < 65536
        'ct ct ct *t'     if c < 1024
   (4)  'ct / ct'         if c < 65536
        'ct / ct ct'      if c < 1024
        'ct / ct ct ct'   if c < 256
        'ct ct / ct'      if c < 1024
        'ct ct / ct ct'   if c < 256
   (5)  '/ ct ct'         if c < 65536
        '/ ct ct ct'      if c < 1024        (remember:
        '/ ct ct ct ct'   if c < 256          c=0 means c=*)

Three routines are provided to mediate between the user specifying the bank format in a readable form and the highly encoded I/O characteristic to be included into any bank at creation time.

MZIOCH

analyses the format chFORM to convert and pack it into the output vector IOWDS. This is the basic routine, but it is usually called by the user only to specify formats of objects other than banks, like the user header vector for FZOUT.

To specify bank formats the following two routines serve more conveniently:

MZIOBK

is provided for the context of MZLIFT; like MZIOCH it analyses the format chFORM, but it stores the result as part of the bank-description vector NAME for MZLIFT.

MZFORM

again analyses the format chFORM, but it does not return the result to the user. Instead, it remembers the I/O characteristic in a system data-structure, returning to the user only the index to the characteristic in the system. The user may then either pass this index to MZBOOK (or MZLIFT) at bank creation time, or alternatively he may request MZBOOK (or MZLIFT) to search the system data-structure for the I/O characteristic associated to the Hollerith identifier IDH of the bank to be created.

The first word of the I/O characteristic delivered by MZIOCH or MZIOBK has the following redundant format:

          |       16       |    5    |     5     | 6 bits|
          |----------------------------------------------|
          |  control-byte  |  NWXIO  |  NWXIO+1  |     1 |
          |----------------|---------|-----------|-------|
The I/O index delivered by MZFORM has the following format:
          |       16       |    5    |     5     | 6 bits|
          |----------------------------------------------|
          |      index     |    0    |  NWXIO+1  |     2 |
          |----------------|---------|-----------|-------|
where NWXIO is the number of extra I/O words, ie. the total length of the characteristic is NWXIO+1.
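
The field layout of these words can be sketched in Python with shifts and masks. This is a reading aid rather than ZEBRA code: the function names are hypothetical, no range checking is done, and the sketch assumes that the fields in the diagrams are listed from the most significant bits (left) to the least significant bits (right) of a 32-bit word.

```python
def pack_characteristic_word(control_byte, nwxio):
    """First word as delivered by MZIOCH or MZIOBK (assumed bit order).

    Fields, left to right: 16 bits control-byte, 5 bits NWXIO,
    5 bits NWXIO+1, 6 bits holding the value 1.
    """
    return (control_byte << 16) | (nwxio << 11) | ((nwxio + 1) << 6) | 1

def unpack_word(word):
    """Return the four fields (16, 5, 5 and 6 bits) of either layout."""
    return (
        (word >> 16) & 0xFFFF,   # control-byte, or index for MZFORM
        (word >> 11) & 0x1F,     # NWXIO, or 0 for MZFORM
        (word >> 6) & 0x1F,      # NWXIO+1
        word & 0x3F,             # 1 for MZIOCH/MZIOBK, 2 for MZFORM
    )

word = pack_characteristic_word(0x0003, 0)
print(hex(word))           # 0x30041
print(unpack_word(word))   # (3, 0, 1, 1)
```
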

Typing rules for chFORM

The format should be typed giving the "ct" for each sector, in the order in which they occur in the bank, as shown in the examples. Leading, interspersed, and trailing blanks (and also commas or dots) may be used for readability; they are allowed and ignored.

Single-word sectors must be typed as '1t', 't' alone is illegal.

The c for double-precision sectors gives the number of words, thus 14D specifies 7 double-precision numbers; 7D is illegal.
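
The typing rules can be summarised by a small Python checker. This is a sketch under stated assumptions, not the analysis done by MZIOCH: the function split_format is hypothetical, an odd word-count for D is rejected on the reading that this is why 7D is illegal, and the form nS is rejected because it is described as meaningless.

```python
import re

TOKEN = re.compile(r"(\d+|-|\*)([BIFDHS])")

def split_format(form):
    """Return (leading, trailing) lists of 'ct' tokens or raise ValueError."""
    cleaned = re.sub(r"[ ,.]", "", form)   # blanks, commas and dots are ignored
    parts = cleaned.split("/")
    if len(parts) > 2:
        raise ValueError("more than one '/' in format")
    regions = []
    for part in parts:
        tokens = []
        pos = 0
        while pos < len(part):
            m = TOKEN.match(part, pos)
            if m is None:                  # e.g. 't' alone, without a count
                raise ValueError("cannot read sector at %r" % part[pos:])
            c, t = m.groups()
            if c.isdigit():
                if t == "S":
                    raise ValueError("the form nS is meaningless")
                if t == "D" and int(c) % 2:
                    raise ValueError("D sectors need an even word-count")
            tokens.append(c + t)
            pos = m.end()
        regions.append(tokens)
    leading = regions[0]
    trailing = regions[1] if len(regions) == 2 else []
    return leading, trailing

print(split_format("2I/2I 8F"))    # (['2I'], ['2I', '8F'])
print(split_format("3B *I -F"))    # (['3B', '*I', '-F'], [])
```
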

CALL MZIOCH (IOWDS*,NWIO,chFORM)

with

           IOWDS*  the I/O words to receive the result,
                   a vector dimensioned to NWIO

             NWIO  the maximum size of IOWDS, < 17

           chFORM  the format as a CHARACTER string

CALL MZIOBK (NAME*,NWMM,chFORM)

with

            NAME*  the bank description vector for MZLIFT,
                   the resulting characteristic will be stored
                   into the I/O words starting at NAME(5),
                   the IDH contained in NAME(1) will be used
                   if diagnostics are necessary,
                   a vector dimensioned to NWMM

             NWMM  the maximum size of NAME, < 21

           chFORM  the format as a CHARACTER string

CALL MZFORM (chIDH,chFORM,IXIO*)

with

            chIDH  the Hollerith IDH of the bank, type CHARACTER

           chFORM  the format as a CHARACTER string

            IXIO*  returns the index to the characteristic stored
                   in a system data-structure,
                   this can be passed to MZBOOK/MZLIFT,
                   in which case it must not be modified

Examples:

      DIMENSION    IOHEAD(4), MMLIFT(8)

      CALL MZIOCH (IOHEAD,4, '8I -F')             for an FZIN user header
      CALL MZIOBK (MMLIFT,8, '2I / 2I 8F')        for MZLIFT
      CALL MZFORM ('RCBC', '2I/2I 8F', IORCBC)    for reference by index

People creating data outside Zebra, but destined to be read by FZ of Zebra, will have to know the representation of the I/O characteristic stored into any bank:

The physically first word of any bank contains:

   right half:  NOFF = NIO + NL + 12
   (bits 1-16)   where  NIO: the number of extra
                              I/O descriptor words for the bank
                        NL: the number of links in the bank


   left half:   the I/O control byte, which controls the
   (bits 17-32)  interpretation of the I/O characteristic

In the simplest cases the I/O control byte alone specifies the nature of the data in the bank, without needing extra descriptor words (in which case NIO is zero). We give here the translation of some of these cases:

       -B:  0001             *B:  0009
       -I:  0002             *I:  000A
       -F:  0003             *F:  000B
       -D:  0004             *D:  000C
       -S:  0007

For example: suppose one were to prepare a bank with two links and 4000 data words which are all unsigned 32-bit integers (type B), a bank which is to travel in link-less mode such that all standard links are zero:

   word 1   0001 000E     -B | NOFF = 14
        2   zero          link 2
        3   zero               1
        4   zero          link next
        5   zero               up
        6   zero               origin
        7   IDN           numeric ID
        8   IDH           Hollerith ID
        9   2             number of links
       10   1, say        number of structural links
       11   4000          number of data words
       12   zero          status word
                          bits 19-22 give NIO, here zero
       13   data word 1
            ...
     4012   data word 4000

Note that the status word contains NIO on bits 19-22 to allow Zebra to reach the start-of-bank.
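
The layout of this example can be reproduced with a short Python sketch that builds the list of words for a link-less bank needing no extra I/O words. The function names are hypothetical, the control-byte values are copied from the table above, and IDN and IDH are taken as opaque integers supplied by the caller because the Hollerith encoding is not described here.

```python
# Control-byte values from the table above (hexadecimal).
CONTROL_BYTES = {
    "-B": 0x0001, "-I": 0x0002, "-F": 0x0003, "-D": 0x0004, "-S": 0x0007,
    "*B": 0x0009, "*I": 0x000A, "*F": 0x000B, "*D": 0x000C,
}

def first_word(form, nl, nio=0):
    """Control byte in the left half, NOFF = NIO + NL + 12 in the right half."""
    noff = nio + nl + 12
    return (CONTROL_BYTES[form] << 16) | noff

def linkless_bank(form, idn, idh, nl, ns, data):
    """Words of a bank with NIO = 0 and all links zero (illustrative sketch)."""
    words = [first_word(form, nl)]
    words += [0] * nl                             # links NL ... 1, all zero
    words += [0, 0, 0]                            # next, up, origin links
    words += [idn, idh, nl, ns, len(data), 0]     # status word zero: NIO = 0
    words += list(data)
    return words

bank = linkless_bank("-B", idn=1, idh=0, nl=2, ns=1, data=range(4000))
print("%08X" % bank[0])   # 0001000E
print(len(bank))          # 4012
```
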

It is impracticable to tabulate the translation of more complicated formats. There is a little program, DIOCHAR, which interactively takes a format chFORM, translates it and displays the result in hexadecimal. This is not yet properly installed on the CERN machines, but on the Apollo people at CERN can run it by giving the command /user/zoll/uty/diochar

The subroutine MZIOTC is provided to convert an encoded IO characteristic back into printable form. One may hand to this routine the address of a bank and receive its IO characteristic in a CHARACTER variable. Alternatively one may pass to it an integer array as delivered by MZIOCH for back-conversion to CHARACTER, for example the IO characteristic of a user-header vector read with FZIN.

CALL MZIOTC (IXST, !L, NCHTR*, chIOTR*)

or

CALL MZIOTC (IOWDS, 0, NCHTR*, chIOTR*)

with

             IXST  the index of the store holding the bank,
                          or of any of its divisions
               !L  the address of the bank

            IOWDS  the integer array with the encoded characteristic
                   (L must be zero in this case)

           NCHTR*  number of useful characters stored into chIOTR
                   = 0 if trouble

          chIOTR*  the CHARACTER variable to receive the characteristic

The routine returns zero in NCHTR if L is non-zero and not a valid bank address, or if chIOTR is not long enough.





Janne Saarela
Mon May 15 08:34:47 METDST 1995