(snip, I wrote)
For those who don't want to download the PDFs:
ASCII-8 had the same defined characters as ASCII-7, but remapped the
ranges relative to ASCII-7 (what we know as ASCII):
0..31 -> 0..31
32..63 -> 64..95
64..95 -> 160..191
96..127 -> 224..255
leaving gaps in between. This makes it incompatible with standard
ASCII. There doesn't seem to be any stated rationale for this
rather odd mapping. It doesn't represent more than 128 characters,
so I frankly don't see the point.
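A rough sketch of that remapping in Python, for anyone who wants to
play with it. The function name is mine, not anything from the PDFs,
and it assumes the four ranges listed above are the whole story:

```python
def ascii7_to_ascii8(code: int) -> int:
    """Map a 7-bit ASCII code to ASCII-8 using the four ranges above."""
    if not 0 <= code <= 127:
        raise ValueError("not a 7-bit ASCII code")
    # One offset per block of 32 codes:
    # 0..31 -> +0, 32..63 -> +32, 64..95 -> +96, 96..127 -> +128
    offsets = (0, 32, 96, 128)
    return code + offsets[code // 32]


print(hex(ascii7_to_ascii8(ord("0"))))  # 0x50
print(hex(ascii7_to_ascii8(ord("@"))))  # 0xa0
print(hex(ascii7_to_ascii8(ord("a"))))  # 0xe1
```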
I mostly don't see the point either. One thing, though. It is required
that 256 different code points map to 256 different card punch
combinations. It does seem like that could have been done
with ASCII-7, though.
Such an encoding would be suitable for a conforming C implementation,
assuming you work around the changes for '^' and '!'. Like EBCDIC
and ASCII, it keeps the decimal digits contiguous. Like ASCII,
but unlike EBCDIC, the lowercase letters are contiguous, as are
the uppercase letters (C doesn't require this). And like EBCDIC,
it would force a C compiler to make plain char unsigned.
UTF-8 is compatible with ASCII. A UTF-8-like encoding could be made
compatible with ASCII-8, but it would have to use a less elegant
encoding, and it would probably lose some of UTF-8's nice properties.
And it would be incompatible with ASCII-7.
It might have been that, if IBM had gotten ASCII-8 standardized,
other byte-oriented machines would have followed it. At least IBM
might have believed that.
But okay, the properties of EBCDIC that S/360 was designed around:
From pretty early in the punched card days, the top rows were called
zones and the bottom rows digits. (The top two rows are commonly called
the 12 and 11 row, though they don't have markings on them like rows
zero through nine.) In BCDIC, the top row was the '+' character and
the next row '-' (but there was also another code for '-'). In the
pre-computer punched card days one could "overpunch" the sign by
punching it over one of the digit columns. Using the electromechanical
card sorter, it would be one additional pass to separate plus from
minus cards.
For EBCDIC characters in memory, the top (MSB) half of each byte is
called the zone, and the bottom (LSB) the digit. Note that in ASCII-7,
ASCII-8, and EBCDIC the low hex digit of characters '0' through '9'
corresponds to the digit value.
The S/360 (and successor) PACK instruction will take from 1 to 16
bytes of zoned decimal (one digit per byte) and convert to packed
decimal (two BCD digits per byte, with the sign in the least
significant half of the rightmost (least significant) byte).
For a series of EBCDIC digits, the result is BCD digits with a sign
field of X'F'. Conveniently, X'F' counts as positive for the packed
decimal (BCD) instructions. When the ASCII mode bit is not set in
the PSW, decimal instructions generate X'C' for plus and X'D'
for minus. When unpacked with the UNPK instruction, positive values
with the rightmost digit 1 through 9 will convert to the EBCDIC
codes for 'A' through 'I' (that is, X'C1' through X'C9') and punch
as 12 punch plus digit 1 through 9. X'C0' is not a printable
EBCDIC character, but will punch as 12 and 0. Similarly, for
negative values, the low byte will be between X'D0' and X'D9',
and punch as 11 row plus digit 0 through 9, again with a non-printing
character for 0, and C'J' through C'R' for 1 through 9.
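To make the nibble shuffling concrete, here is a simplified Python
sketch of what PACK and UNPK do to the bytes. The function names are
mine, and it ignores the real instructions' length fields, truncation,
and padding rules; it only shows how the zone of the low byte becomes
the sign and back again. It uses Python's cp037 codec as a stand-in
for EBCDIC.

```python
def pack(zoned: bytes) -> bytes:
    """Simplified PACK: zoned decimal -> packed decimal.

    The zone and digit of the rightmost byte are swapped, so the zone
    ends up as the sign; every other byte contributes only its digit.
    """
    nibbles = [b & 0x0F for b in zoned[:-1]]
    last = zoned[-1]
    nibbles += [last & 0x0F, last >> 4]  # low digit, then sign
    if len(nibbles) % 2:
        nibbles.insert(0, 0)  # pad on the left to whole bytes
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))


def unpk(packed: bytes, zone: int = 0xF) -> bytes:
    """Simplified UNPK: packed decimal -> zoned decimal.

    Each digit gets the given zone, except the last, whose zone is
    the sign nibble.
    """
    nibbles = [n for b in packed for n in (b >> 4, b & 0x0F)]
    *digits, last_digit, sign = nibbles
    out = [(zone << 4) | d for d in digits]
    out.append((sign << 4) | last_digit)
    return bytes(out)


print(pack(bytes.fromhex("F1F2F3")).hex())          # 123f
print(unpk(bytes.fromhex("123c")).decode("cp037"))  # 12C
print(unpk(bytes.fromhex("123d")).decode("cp037"))  # 12L
```

So +123 comes back as "12C" and -123 as "12L", which is the
overpunched form described above.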
When the ASCII bit is set in the PSW, decimal instructions generate
X'A' for plus and X'B' for minus as the sign. UNPK will then
convert the low digit of positive numbers to bytes from
X'A0' through X'A9' and for negative values X'B0' through X'B9'.
Positive values convert to C'@' and C'A' through C'I' in ASCII-8,
and negative C'P' through C'Y'. With the appropriate punch code
for C'@' that works for positive numbers, but not negative
numbers. But maybe they could convince people to punch negative
numbers using C'P' through C'Y'.
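A quick Python check of which characters those low bytes land on,
assuming the ASCII-8 ranges quoted at the top of this post. The helper
names are mine:

```python
def ascii8_to_ascii7(code: int) -> int:
    """Invert the ASCII-8 remapping; codes in the gaps are rejected."""
    for start, offset in ((0, 0), (64, 32), (160, 96), (224, 128)):
        if start <= code < start + 32:
            return code - offset
    raise ValueError("unassigned ASCII-8 code")


def low_byte_char(sign: int, digit: int) -> str:
    """Character seen when the sign nibble is the zone of the low byte."""
    return chr(ascii8_to_ascii7((sign << 4) | digit))


print("".join(low_byte_char(0xA, d) for d in range(10)))  # @ABCDEFGHI
print("".join(low_byte_char(0xB, d) for d in range(10)))  # PQRSTUVWXY
```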
In any case, it is not hard to fix up the low byte using instructions
such as OI, NI, or XI (OR immediate, AND immediate, XOR immediate),
which OR, AND, or XOR one byte with an immediate value. Also,
one can use the TR (translate) instruction to convert from 1
to 256 characters using a 256 byte (or less) translate table.
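For illustration only, the same two fix-ups in Python rather than
assembler. The function names are made up, the immediate value X'F0'
and the table contents are just one plausible choice, and cp037 stands
in for EBCDIC:

```python
def fix_sign_oi(zoned: bytes) -> bytes:
    """Force the zone of the low byte to X'F', like OI with X'F0'."""
    out = bytearray(zoned)
    out[-1] |= 0xF0
    return bytes(out)


# A TR-style 256-byte table: identity, except that the overpunched
# codes X'C0'..X'C9' and X'D0'..X'D9' become plain digits.
TABLE = bytearray(range(256))
for d in range(10):
    TABLE[0xC0 + d] = 0xF0 + d
    TABLE[0xD0 + d] = 0xF0 + d


def fix_sign_tr(zoned: bytes) -> bytes:
    """Translate only the low byte through the table."""
    return zoned[:-1] + zoned[-1:].translate(bytes(TABLE))


print(fix_sign_oi(bytes.fromhex("F1F2C3")).decode("cp037"))  # 123
print(fix_sign_tr(bytes.fromhex("F1F2D3")).decode("cp037"))  # 123
```

Both throw the sign away, of course, so a real program would test it
first.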
Independent of the ASCII bit, decimal instructions accept X'B'
and X'D' as negative, X'A', X'C', X'E', and X'F' as positive.
Other sign values will generate an interrupt, as will digits
other than X'0' through X'9' in digit positions.
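Those sign rules, as a small Python sketch. The names are mine, and a
ValueError stands in for the interrupt the hardware would raise:

```python
NEGATIVE_SIGNS = {0xB, 0xD}
POSITIVE_SIGNS = {0xA, 0xC, 0xE, 0xF}


def packed_to_int(packed: bytes) -> int:
    """Decode packed decimal, rejecting bad digits and bad signs."""
    nibbles = [n for b in packed for n in (b >> 4, b & 0x0F)]
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit")
    value = 0
    for d in digits:
        value = value * 10 + d
    if sign in POSITIVE_SIGNS:
        return value
    if sign in NEGATIVE_SIGNS:
        return -value
    raise ValueError("invalid sign")


print(packed_to_int(bytes.fromhex("123f")))  # 123
print(packed_to_int(bytes.fromhex("123a")))  # 123
print(packed_to_int(bytes.fromhex("123b")))  # -123
```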
I don't know if that helps much. Presumably IBM could have built
card readers and card punches with either code.
-- glen