Trigraphs

  • Thread starter Christopher Benson-Manica

Villy Kruse

Glen said:
Even worse, there are multiple definitions for some of the characters. '['
and ']' have existed for many years on the TN print train, but somehow that
wasn't good enough to define them in EBCDIC.

What are TN and the TN print train?
Also, EBCDIC has both a solid and split vertical bar. Some ASCII
tables use one, and some the other, for the printable representation,
and I believe that has also been a problem with conversion tables.

I'm curious: Is there a similar problem when converting to Unicode,
or do people agree on which is which in that case?
Just to continue the confusion, EBCDIC has CR, NL, and LF control
characters, X'0D', X'15', and X'25' respectively. Which one should C
use as the '\n' character?

If CR is Carriage Return, I presume it should be \r.

What do EBCDIC NL and LF do?

By themselves, nothing; only when sent to a device may they do something.
In the EBCDIC world, and on the IBM/360 in particular, neither was used to
control the printer.

Newline means next line, at the beginning of the line. Line feed means next
line, same character position. In ASCII you emulate the newline function
using the sequence CR, LF. In the C language one of them has (arbitrarily)
been used as the '\n' character.
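
As a rough illustration of the CR, LF emulation, here is a minimal sketch. It assumes an ASCII system where '\n' is the line feed character, and the function name expand_newlines is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src to dst, writing CR LF wherever src has a line feed ('\n' on
   ASCII systems). Returns the length written, not counting the final '\0'.
   dst must have room for 2 * strlen(src) + 1 bytes. */
size_t expand_newlines(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; ++src) {
        if (*src == '\n') {
            dst[n++] = '\r';   /* carriage return: back to the first column */
        }
        dst[n++] = *src;       /* the line feed itself, or any other character */
    }
    dst[n] = '\0';
    return n;
}
```

Calling it on "ab\ncd\n" should give the eight characters a, b, CR, LF, c, d, CR, LF.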

Villy
 

Villy Kruse

The point of tri- and digraphs is to make C programming possible
on systems that do not support the ASCII character set but
do support ISO 646. If your system does support the
characters that the tri- and digraphs are replacing, then you
are very welcome to substitute them in all your code, but you
probably won't be able to send the code back to its originator
without replacing them again.
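
For anyone who has not seen the replacement in action, the following is a small sketch of what the compiler's first translation phase does with the nine trigraphs (??= ??( ??/ ??) ??' ??< ??! ??> ??- standing for # [ \ ] ^ { | } ~). It is only an illustration, not how any particular compiler implements it, and the names replace_trigraphs, trigraph_third and trigraph_char are invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Third characters of the nine trigraphs and the characters they stand for.
   The trigraphs themselves are never spelled out in this source, so the file
   compiles the same way whether or not the compiler has trigraphs enabled. */
static const char trigraph_third[] = "=(/)'<!>-";
static const char trigraph_char[]  = "#[\\]^{|}~";

/* Copy src to dst with every trigraph replaced by its single character.
   dst must have room for strlen(src) + 1 bytes. Returns the output length. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        const char *hit = NULL;

        /* Two question marks followed by one of the nine third characters. */
        if (src[0] == '?' && src[1] == '?' && src[2] != '\0') {
            hit = strchr(trigraph_third, src[2]);
        }
        if (hit != NULL) {
            dst[n++] = trigraph_char[hit - trigraph_third];
            src += 3;
        } else {
            dst[n++] = *src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Note that a caller who wants to pass a trigraph in a string literal has to write it with the \? escape (for example "?\?(" ), otherwise a compiler with trigraphs enabled replaces it before the function ever sees it.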


On all the Unix systems where this was relevant you would need all the
characters marked as national use in ISO 646 for the Unix shell
and the other Unix utilities anyway, so trigraphs weren't solving anything
there. MS-DOS, another widely used system with a C compiler,
never used national variations of ISO 646.

It is just a matter of learning that '{|}' prints as 'äöü' and these
characters are not letters.


Villy
 

Villy Kruse

I've been told that italian and danish keyboards don't support some of C's
common symbols. Personally, I wouldn't be using C if I had to use trigraphs.


They are available by using awkward keystrokes. When you add accented
letters to a 102-key keyboard something has to give, and symbols that are
little used in regular text, like {}[]|\, will be the first to go.


Villy
 

Richard Bos

Christian Bau said:
And before anyone tries to get rid of trigraphs, I would like to hear a
suggestion how to do this without breaking programs that use them.

Use digraphs instead. Trigraphs are an abomination that can bugger up
your output even on computers that don't need them. Digraphs are much
more programmer-friendly.
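
The usual example of output being damaged is a literal such as "What??!", which a compiler with trigraphs enabled turns into "What|". A hedged sketch of the standard workaround, the \? escape, follows; the function name surprised is made up for the illustration.

```c
/* Written with the \? escape so that no two question marks are adjacent in
   the source. The string holds the same seven characters (W h a t ? ? !)
   with or without trigraph processing. */
const char *surprised(void)
{
    return "What?\?!";
}
```

Splitting the literal into two adjacent pieces, each ending or starting with a single question mark, is another common way to get the same effect.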

Richard
 

Andreas Kahari

Richard Bos said:
Use digraphs instead. Trigraphs are an abomination that can bugger up
your output even on computers that don't need them. Digraphs are much
more programmer-friendly.

Not all trigraphs have digraph equivalents; the trigraphs for
'~' and '^', for example, have none.
 

Richard Bos

Tom Zych said:
Lew said:
More to the point, '[' and ']' aren't found in certain mainframe (EBCDIC)
character sets. Other characters are missing from EBCDIC-US as well.

Ecch. EBCDIC. COBOL. Trigraphs. gets(). void main. The ugly side of
computer science...

You seem to be confusing some things. Trigraphs, EBCDIC and COBOL are
ugly constructs, and not, from a C programmer's POV, very well thought
out, but they serve their purpose and in the case of trigraphs and
EBCDIC are even compatible with reliable C programming.
gets() and void main(), OTOH, are a different kettle of fish. They are
not a good idea, never were a good idea, and never will be a good idea,
no matter what kind of big iron you're trying to program on - and they
should be avoided by every serious C programmer.

Richard
 

Richard Bos

Andreas Kahari said:
Not all trigraphs have digraph equivalents; the trigraphs for
'~' and '^', for example, have none.

True, but AFAIK <iso646.h> does have alternatives for all of those.
Ugly, I'll grant you, but not as phenomenally disbeatific as trigraphs.
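
For reference, a small sketch of what those alternatives look like in use. The macros compl, xor, bitor and bitand come from <iso646.h>; the three function names are invented for this example.

```c
#include <iso646.h>

/* compl, xor, bitor and bitand are macros from <iso646.h> that expand to
   the operators ~, ^, | and & respectively. */

/* Clear every bit of value that is set in mask. */
unsigned clear_bits(unsigned value, unsigned mask)
{
    return value bitand compl mask;
}

/* Flip every bit of value that is set in mask. */
unsigned toggle_bits(unsigned value, unsigned mask)
{
    return value xor mask;
}

/* Set every bit of value that is set in mask. */
unsigned set_bits(unsigned value, unsigned mask)
{
    return value bitor mask;
}
```

These are ordinary macros, so they only help where an operator token is expected.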

Richard
 

LibraryUser

Glen said:
.... snip ...

The advantage of chain printers is that for smaller character
sets you can put more copies of each character on, and it will
print faster. There are even chains with only numbers and
number related punctuation. All the older high level languages
used upper case characters only. It was common for the printer
to map lower case to upper case, which made some debugging hard.
The compiler would complain about something that looked just fine
on the printout.

In those days I had a filter for my printouts, which would
overprint all upper case characters. When things like MX80
printers became available I would normally prefer them for the
listings, because they could handle lower case. Then we got some
sort of matrix line printer that basically had a comb, and
printed a matrix by shuttling the comb back and forth. Sounded
like a herd of screeching banshees, but at least it was faster.
 

Hallvard B Furuseth

Richard said:
True, but AFAIK <iso646.h> does have alternatives for all of those.

The only problem is \ inside strings and character constants. Digraphs
and macros don't work there. Though by the time trigraphs arrived, I
had long ago learned to read Ø as \ (or vice versa), and I still prefer
'Øn' over '??/n'. Not that it matters now, when latin-1 is here.
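
A minimal sketch of the difference, with made-up function names (sum3 and bracket_text): digraphs are recognised as tokens, so they can replace brackets and braces in code, but the same characters inside a string literal are left alone.

```c
/* Digraphs are tokens: here <: :> stand for [ ] and <% %> for { }. */
int sum3(const int v<: :>)
<%
    return v<:0:> + v<:1:> + v<:2:>;
%>

/* Inside a string literal the same two characters are just text, so the
   result is a two-character string, not a bracket. */
const char *bracket_text(void)
{
    return "<:";
}
```

Since there is no digraph for the backslash at all, escape sequences in strings and character constants are the one place where only a trigraph (or a keyboard that has the character) will do.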
 

Joona I Palaste

The only problem is \ inside strings and character constants. Digraphs
and macros don't work there. Though by the time trigraphs arrived, I
had long ago learned to read Ø as \ (or vice versa), and I still prefer
'Øn' over '??/n'. Not that it matters now, when latin-1 is here.

"Øn", isn't that Norwegian for "The Island" or something?

--
/-- Joona Palaste ([email protected]) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++|
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"To err is human. To really louse things up takes a computer."
- Anon
 

Andreas Kahari

Hallvard B Furuseth <h.b.furuseth(nospam)@usit.uio(nospam).no>
scribbled the following: [cut]
'Øn' over '??/n'. Not that it matters now, when latin-1 is here.

"Øn", isn't that Norwegian for "The Island" or something?

Hi Joona.

At least the Swedish word for it, using Norwegian (and Danish)
script.
 

lawrence.jones

Glen Herrmannsfeldt said:
I believe at least on the 1403 that the spacing of the characters on the
train is slightly larger than the print column spacing, so that can't
happen.

You're correct that the character spacing on the chain or train was
slightly wider than the print column spacing, but that wasn't done to
avoid chain breakage, it was done to reduce the peak current demand --
trying to fire all the hammers at exactly the same time would require an
enormous amount of energy. Nonetheless, printing the characters in the
order they appeared on the chain (which was called the "chain break
pattern" for obvious reasons) would still subject the chain to a great
deal of mechanical stress and could cause breakage.
I always thought that was just a Unix/C convention, and wasn't part of the
ASCII standard.

It definitely was (and is!) part of the ASCII standard. I think
allowing the vertical format effectors (LF, FF, and VT) to optionally
affect the horizontal position as well was added in the 1977 revision
and then deprecated in the 1986 revision (which is still in effect,
having just been reaffirmed last year).

-Larry Jones

In short, open revolt and exile is the only hope for change? -- Calvin
 

Hallvard B Furuseth

Joona said:
"Øn", isn't that Norwegian for "The Island" or something?

No, 'øya' (øy : island, -a: 'female' the).

However, 'ø' is 'island' in some dialects, and I wouldn't be surprised
if one of them uses '-n' as the "male 'the'" instead of '-en' in this
case. Someone close to Sweden, for example.
 

Tom Zych

You seem to be confusing some things. Trigraphs, EBCDIC and COBOL are
ugly constructs, and not, from a C programmer's POV, very well thought
out, but they serve their purpose and in the case of trigraphs and
EBCDIC are even compatible with reliable C programming.
gets() and void main(), OTOH, are a different kettle of fish. They are
not a good idea, never were a good idea, and never will be a good idea,
no matter what kind of big iron you're trying to program on - and they
should be avoided by every serious C programmer.

All true. I was just lumping them all under the more inclusive
category of "things I find revolting and intend to avoid forever" :)

Though I suppose I might use trigraphs if I ever try my hand at the
IOCCC ;)
 
