Difference between Cygwin and DOS handling of string input

ttkingdom

I have an elementary piece of code here that produces different results
in DOS and Cygwin. I'm puzzled and don't know what causes this.

This program counts the number of characters, spaces, lines, and tabs.
It is compiled with gcc.

#include <stdio.h>

int main(void)
{
    /* Counters are doubles, as in the original; they are printed with %.0f. */
    double countline = 1, countchar = 0, counttab = 0, countspace = 0;
    int c;   /* int, not char, so that EOF can be told apart from data */

    printf("Please type in anything, end with EOF (CTRL + D or Z): \n");
    while ((c = getchar()) != EOF)
    {
        if (c == '\n') { ++countline; }
        else if (c == ' ') { ++countspace; }
        else if (c == '\t') { ++counttab; }
        else { ++countchar; }
    }
    /* The format string stays on one line: a string literal cannot span lines. */
    printf("\nYou have typed in %.0f char(s), %.0f space(s), %.0f tab(s), %.0f line(s).",
           countchar, countspace, counttab, countline);
    return 0;
}

This program ends when the input reaches EOF, which is CTRL + Z in DOS and
CTRL + D in Cygwin (somewhere on the internet says so :D).

The problems are:
1/ It ends properly in DOS with a CTRL + Z, but in Cygwin it needs a CTRL
+ D and an ENTER.
2/ In DOS it counts the number of spaces and tabs incorrectly. Try entering
5 spaces and 3 tabs. In Cygwin it counts correctly.

Can anyone please explain to me what causes these discrepancies?
 
Keith Thompson

ttkingdom said:
The problems are:
1/ It ends well in DOS with a CTRL + Z, but in Cygwin it needs a CTRL
+ D and an ENTER.

The way you specify end-of-file on interactive input varies from
system to system. In Unix-like environments such as Cygwin, I
believe the sequence is either a control-D at the beginning of a
line, or a double control-D if you're not at the beginning of a line.
(The eof character can be configured.)

2/ In DOS it count the number of space and tab incorrectly. Try enter
5 spaces and 3 tabs. In Cygwin it counts correctly.

What does it do on DOS? (And are you really running DOS, or a command
window under Windows?)

Systems typically perform some translations on input. It wouldn't
surprise me if either DOS or a Windows command window did something to
input tab characters.

Please anyone explain to me what causes these discrepancies.

Please explain to us just what these discrepancies are.
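
One way to pin down what the discrepancies are is to look at the bytes the program actually receives, rather than the counts derived from them. The following is only a sketch: `dump_bytes` is a hypothetical helper name, not something from the original program, and it assumes nothing beyond standard C.

```c
#include <stdio.h>

/* Hypothetical helper: copy every byte from `in` to `out` as two hex
   digits followed by a space, and return how many bytes were seen. */
long dump_bytes(FILE *in, FILE *out)
{
    long n = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        fprintf(out, "%02X ", (unsigned)c);
        ++n;
    }
    fputc('\n', out);
    return n;
}
```

Calling `dump_bytes(stdin, stdout)` from `main` in both environments and typing the same keys should show whether the spaces (20) and tabs (09) ever reach the program, or whether the console drops or rewrites them first.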
 
Tom St Denis

Bill Gates is an asshole.

Nothing in MS-DOS is even remotely SysV, POSIX.1 or BSD/Linux
compatible. They use CTRL+Z for term (which is how you stop processes
in UNIX) as opposed to CTRL+D (which is how you disconnect a stream),
they use \r\n as opposed to \n, etc and so on.

Just write your app to allow both \r and \n [in any order, Macs used
to use *just* \r btw] and you're set.

Tom
 
ttkingdom

I'll explain it more clearly here :)

1/ I opened cmd.exe, cd'd to the folder, and ran the .exe file. I typed
in 5 spaces and 3 tabs, then CTRL + Z, and it said "you have typed in 0
space and 0 tabs".

Doing the same in Cygwin, with CTRL + D and ENTER, it correctly said
"you have typed in 5 space and 3 tabs".

2/ In cmd.exe the program stops at CTRL + Z and displays the results.

In Cygwin, it is supposed to stop at CTRL + D, but it does not. After
CTRL + D, I have to press ENTER for it to stop.

I'm not trying to make this program work; I just want to know what is
happening behind the scenes in the system. Please help or provide any
clue or suggestion.
 
Giacomo Degli Esposti

What happens if you press ENTER before ctrl+Z in DOS?
Just guessing, but input is usually line-buffered, so the chars are not
delivered to the application until you press ENTER.
It could be that when you press ^Z the chars still in the buffer are
dropped and not delivered to the program.
BTW, what DOS compiler are you using?

ciao
Giacomo
 
Richard Tobin

Tom St Denis said:
Nothing in MS-DOS is even remotely SysV, POSIX.1 or BSD/Linux
compatible. They use CTRL+Z for term (which is how you stop processes
in UNIX) as opposed to CTRL+D (which is how you disconnect a stream),
they use \r\n as opposed to \n, etc and so on.

I've been using Unix for over 25 years, and my end-of-file character
has always been ctrl-Z. Ctrl-D is merely a default.

And it's hardly surprising that MS-DOS was not compatible with Unix,
since it was derived (indirectly) from DEC operating systems such
as RSX-11 which used ctrl-Z for end of file and CR-LF for line
ends.

Just write your app to allow both \r and \n [in any order, Macs used
to use *just* \r btw] and you're set.

If you're reading text files, use text mode which should do the
appropriate conversions for you.
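
As a rough illustration of what text mode changes, the sketch below counts the '\r' bytes a program sees when the same file is opened in text mode ("r") and in binary mode ("rb"). `count_cr` is a hypothetical helper introduced here. The expectation that a Windows C runtime hides the CR of each CR-LF pair in text mode, while a Unix-like system reports the same count in both modes, is an assumption about typical implementations, not something the C standard spells out.

```c
#include <stdio.h>

/* Hypothetical helper: open `path` with the given fopen mode ("r" for
   text, "rb" for binary) and count the '\r' bytes the program sees.
   Returns -1 if the file cannot be opened. */
long count_cr(const char *path, const char *mode)
{
    FILE *f = fopen(path, mode);
    long n = 0;
    int c;

    if (f == NULL)
        return -1;
    while ((c = getc(f)) != EOF)
        if (c == '\r')
            ++n;
    fclose(f);
    return n;
}
```

Comparing `count_cr(path, "r")` with `count_cr(path, "rb")` on a file with CR-LF line ends shows whether the runtime is translating line ends for you.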

-- Richard
 
ttkingdom

Thanks for your suggestion. I'm using gcc but now I think that for DOS
I should use Borland C Compiler.
 
jacob navia

Richard wrote:
Just why do you think the originators of MS-DOS should have made
anything posix v1 compliant?

No. Don't answer that. You're clearly another one of these MS hating
lunatics.

Posix version 1: 1988 (http://fr.wikipedia.org/wiki/POSIX#Versions)
MSDOS version 1: 1982 (http://en.wikipedia.org/wiki/MS-DOS)

Bill Gates is an asshole because he did not write an OS compatible
with a standard that would be published 6 years later...

This is typical of this guy (Tom St Denis).

Just nonsense, but nonsense in the line of c.l.c: windows hater,
even if it is completely ridiculous.

jacob
 
Tom St Denis

In UNIX ^Z has been "stop process" for a very long time (hint: before
MS-DOS). At least they got ^C right...
You beat me to it. Tom is another time machine surfer where he would
have made everything old compatible with leading standards of today.

I just think if something works don't break it. Microsoft by design
does things very differently from others to isolate and corner off
their customers from the "real world." Skills I learn today under say
Redhat don't magically become irrelevant because I move to a job where
they use Fedora, or Ubuntu, or BSD. Whereas quite a bit of the
practical skills you learn under Windows are 100% useless anywhere
else. Not to forget to mention their interpretation of what C and C++
are and are not...

BillG and his cohorts are assholes because they waved bye-bye to
sound engineering way long ago in search of the almighty buck. And
here we have today, a man who has more money than God, and he STILL
actively pursues the path of least engineering correctness. He's
already won the fight and he's still kicking people while they're
down. He's an asshole. So we as the generation after him (I was born
when Microsoft was just getting into the stride of PC OSes) have to
deal with their bullshit everyday. I thank the almighty FSM and the
grace of his noodly appendage that I work at a place where Linux is
the norm.

As to the general theme of the question, usually I write parsers that
accept all forms of \r, \n, \r\n, \n\r and then my software "just
works." I don't assume the end of fgets is NUL or \n (like it would
be in Linux) because then my application doesn't work in MS-DOS,
Windows, or on older Macs, etc...
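
A minimal sketch of the "accept every line ending" idea for a line already read with fgets: cut the string at the first '\r' or '\n'. The name `chomp` is a hypothetical helper, not code from this thread. One limitation to keep in mind: on a system whose newline is '\n', fgets will not stop at a lone '\r', so an old Mac file may arrive as one long "line" and this helper would keep only its first line.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n', so
   "text\n", "text\r\n", "text\r" and "text\n\r" all become "text". */
char *chomp(char *line)
{
    line[strcspn(line, "\r\n")] = '\0';
    return line;
}
```

It would typically be used as `if (fgets(buf, sizeof buf, fp)) chomp(buf);`.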

Tom
 
Tom St Denis

Richard wrote:

Posix version 1: 1988 (http://fr.wikipedia.org/wiki/POSIX#Versions)
MSDOS version 1: 1982 (http://en.wikipedia.org/wiki/MS-DOS)

Bill Gates is an asshole because he did not write an OS compatible
with a standard that would be published 6 years later...

This is typical of this guy (Tom St Denis).

Just nonsense, but nonsense in the line of c.l.c: windows hater,
even if it is completely ridiculous.

And it would have been wholesale impossible for Microsoft to clean up
some of their behaviour over time? And/or not continue to make things
worse?

If I can run DOS applications on a Linux machine through DOSbox why
can't Microsoft and all their money do the same damn thing? That way
older apps still will run and newer apps will be more standards
compliant.

Answer: They don't care about being open to competition because
they're a bunch of greedy assholes.

Tom
 
Tom St Denis

Another person living in the past (witness lack of knowledge of modern
RCS systems) who blames everything else except himself and poor
tools. We've seen the type before.

So far it seems to really only be you going off on a tirade against my
ideas and posts. Maybe it's just you who has a problem with me?

Also I don't consider Windows "modern." I consider them way behind
the times. Sure their aero based GDI gui is fancy but where are the
multiple desktops, or remote shells, or hell a shell to begin with
(cmd.exe doesn't count). They still don't come with useful userland
tools, while NTFS has *real* file permissions their GUIs don't give
you access to them, their user security is somewhat of a joke, you
have to be root to install most apps, they poorly implemented things
like IPsec, they only run on x86 class processors [I don't really
consider WinCE real...], takes gobs of ram and disk space, etc...

In the same disk space as a blank Vista install I can get a very
complete Debian install with 100s of userland tools including
compilers, debuggers, shells, archivers, WYSIWYG editors, browsers,
media players, etc. I can boot a Linux box into a very small amount
of ram. I can easily write kernel drivers and use full access to free
and unfettered source to do so...

So it's not that I look to Windows as "omg it's too advanced, old is
better." It's that I look at it and think "why would anyone buy this
when OSS offers so much more." And I think of people like Microsoft
mgmt as assholes for peddling this mediocre shiny useless toy OS on
people...

Tom
 
jacob navia

Tom St Denis wrote:
So far it seems to really only be you going off on a tirade against my
ideas and posts. Maybe it's just you who has a problem with me?

It's not only Richard, it's me too...

Also I don't consider Windows "modern." I consider them way behind
the times. Sure their aero based GDI gui is fancy but where are the
multiple desktops,

In lcc-win you have the source code for a multiple desktops application under windows
since windows xp...

or remote shells,

Remote desktop allows you more than just a text mode shell and that since windows NT...


or hell a shell to begin with
(cmd.exe doesn't count).

And why doesn't it count? Obviously because you are a Windows hater.

They still don't come with useful userland
tools, while NTFS has *real* file permissions their GUIs don't give
you access to them,

Right-click on any file, then choose "Properties". You have been able to
do this since Windows NT (10 years ago).

their user security is somewhat of a joke, you
have to be root to install most apps,

Under Unix too

they poorly implemented things
like IPsec, they only run on x86 class processors [I don't really
consider WinCE real...], takes gobs of ram and disk space, etc...

In a word:

Saint Denis knows nothing about windows and displays his ignorance here.
 
Tom St Denis

Its not only Richard, its me too...

No offense, but since you took a jump off the sanity-wagon you don't
really count either.
In lcc-win you have the source code for a multiple desktops application under windows
since windows xp...

Last I checked you can't move windows from one desktop to another
(part of the problem being GDI windows need a parent and can't change
them). To me you don't have "real" multiple desktop support until you
can move one to the other.
or remote shells,

SSH is *NOT* part of a standard Windows 7 install.
Remote desktop allows you more than just a text mode shell and that since windows NT...

Sometimes text mode is better (hint: compiling something, turning on/
off a service, changing a system param). Specially over high latency
networks.
or hell a shell to begin with


And why it doesn't count? Obviously  because you are a windows hater

No, because it's incomplete, a bitch to script with, not even remotely
compatible with tcsh, bash, ash, sh, etc... Not to forget to mention
the lack of userland tools like sed, awk, grep, [ ], etc... that make
shell scripting remotely useful...
Right click in any file, the "properties". You can do this since
windows NT (10 years go).

Last I checked [admittedly I'm not at a windows box now] you can't
change, for example, the execute bit from inside the GUI. e.g.

chmod +x notepad.exe
Under Unix too

Um no, most *nix applications are designed so they can be installed
relative to your home directory.
like IPsec, they only run on x86 class processors [I don't really
consider WinCE real...], takes gobs of ram and disk space, etc...

In a word:

Saint Denis knows nothing about windows and displays his ignorance here.

I worked on an IPsec project last year. WinXP Pro by default only
supports 3DES, doesn't support AES, and doesn't support SHA-256 [to be
fair Linux doesn't support SHA-256 either... but at least they have
AES support].

Arguing with you people is like arguing with Christians. I can say
something like "December 25th is actually a pagan holiday celebrating
the winter solstice" and you'll just slap "na-huh, it's a jesus day
and you're a poopy head!!!"

Windows is *not* an advanced OS, and for what people pay they should
be getting a hell of a lot more. Anyone who thinks otherwise has
drunk the koolaid and is 100% not objective.

Tom
 
Seebs

Tom St Denis wrote:
or hell a shell to begin with
And why it doesn't count? Obviously because you are a windows hater

It doesn't count because it's not an actual stable programming language
with a real parser and basic features.
Under Unix too

Not really. You can install anything you want in your home directory,
and nearly anything would work.
Saint Denis knows nothing about windows and displays his ignorance here.

Maybe things have changed a lot, but most of that was true a while back.
I don't imagine it's likely to change much.

-s
 
Squeamizh

Its not only Richard, its me too...

No offense, but since you took a jump off the sanity-wagon you don't
really count either.
In lcc-win you have the source code for a multiple desktops application under windows
since windows xp...

Last I checked you can't move windows from one desktop to another
(part of the problem being GDI windows need a parent and can't change
them).  To me you don't have "real" multiple desktop support until you
can move one to the other.
or remote shells,

SSH is *NOT* part of a standard Windows 7 install.
Remote desktop allows you more than just a text mode shell and that since windows NT...

Sometimes text mode is better (hint: compiling something, turning on/
off a service, changing a system param).  Specially over high latency
networks.
or hell a shell to begin with
And why it doesn't count? Obviously  because you are a windows hater

No, because it's incomplete, a bitch to script with, not even remotely
compatible with tcsh, bash, ash, sh, etc...  Not to forget to mention
the lack of userland tools like sed, awk, grep, [ ], etc... that make
shell scripting remotely useful...

Download cygwin, and 10 minutes later all of the above problems are
solved. Why complain when an easy solution is already out there?
Right click in any file, the "properties". You can do this since
windows NT (10 years go).

Last I checked [admittedly I'm not at a windows box now] you can't
change, for example, the execute bit from inside the GUI.  e.g.

chmod +x notepad.exe
Under Unix too

Um no, most *nix applications are designed so they can be installed
relative to your home directory.
like IPsec, they only run on x86 class processors [I don't really
consider WinCE real...], takes gobs of ram and disk space, etc...
In a word:
Saint Denis knows nothing about windows and displays his ignorance here..

I worked on an IPsec project last year.  WinXP Pro by default only
supports 3DES, doesn't support AES, and doesn't support SHA-256 [to be
fair Linux doesn't support SHA-256 either... but at least they have
AES support].

Arguing with you people is like arguing with Christians.  I can say
something like "December 25th is actually a pagan holiday celebrating
the winter solstice" and you'll just slap "na-huh, it's a jesus day
and you're a poopy head!!!"

December 25th is a Christian holiday. There is a sliver of truth in
your claim, but as usual, you've gotten almost all the facts wrong,
and you were incredibly lame and annoying in the process.
Windows is *not* an advanced OS, and for what people pay they should
be getting a hell of a lot more.  Anyone who thinks otherwise has
drunk the koolaid and is 100% not objective.

Obviously Windows is different from UNIX. Windows is easy to use. If
you place a higher priority on "advanced" features, then, uh, don't
use Windows?
 
Flash Gordon

Tom said:
In UNIX ^Z has been "stop process" for a very long time (hint: before
MS-DOS). At least they got ^C right...

On VMS ^Z is exit, ^Y is interrupt, and ^C is cancel (from memory). Unix
was not the only game in town back then.
I just think if something works don't break it.

There are plenty of things wrong with Unix, including a number of things
which are wrong with the standard security model.
Microsoft by design
does things very differently from others to isolate and corner off
their customers from the "real world."

There have always been lots of different ways things are done.
Skills I learn today under say
Redhat don't magically become irrelevant because I move to a job where
they use Fedora, or Ubuntu, or BSD. Whereas quite a bit of the
practical skills you learn under Windows are 100% useless anywhere
else.

They would not have done you much good on VMS.
Not to forget to mention their interpretation of what C and C++
are and are not...

Their C compiler has been pretty good in its support for C89 for a long
time.
BillG and his co-horts are assholes because they waved bye-bye to

grace of his noodly appendage that I work at a place where Linux is
the norm.

If you want to rant about operating systems find an advocacy group.
As to the general theme of the question, usually I write parsers that
accept all forms of \r, \n, \r\n, \n\r and then my software "just
works."

Open a file in text mode and just read it and all you have to deal with
is \n. The only time you have to do something else is if you need to
deal with foreign format text files which have not been converted during
transfer (good file transfer tools will do the transformation to native
format of the destination for you).
I don't assume the end of fgets is NUL or \n (like it would
be in Linux) because then my application doesn't work in MS-DOS,
Windows, or on older Macs, etc...

They do if you open the file in text mode. That is what text mode is *for*!

Oh, and an advantage of \r\n as your line termination is that it matches
most of the text based internet protocols.
 
Richard Tobin

Tom St Denis said:
In UNIX ^Z has been "stop process" for a very long time (hint: before
MS-DOS).

MS-DOS got it from CP/M, which predates job control in Unix.

-- Richard
 
Nobody

This program ends when input is EOF, which is CTRL + Z in DOS, CTRL +
D in Cygwin (somewhere on the internet says so :D)

The concept of "EOF" for interactive input is quite different between DOS
and Unix.

For a text-mode stream, DOS treats a ^Z character as EOF. This even works
in files. The reason is that the filesystem used by CP/M (on which DOS is
based) didn't store the length of a file in bytes, only in blocks.

With binary formats, you could typically deduce the length of the data
from the data itself, so any padding to a block boundary didn't matter.
Text files used ^Z to indicate the end of the data.
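
To make the padding idea concrete, here is a small sketch of how a reader could recover the text length from a block-padded buffer by looking for the first Ctrl-Z (byte 0x1A). `text_length` and `CTRL_Z` are hypothetical names chosen for this illustration; this is not how any particular DOS or CP/M runtime is implemented.

```c
#include <stddef.h>
#include <string.h>

#define CTRL_Z 0x1A  /* ASCII SUB, used as the text end marker */

/* Hypothetical helper: length of the text in a block-padded buffer,
   i.e. the number of bytes before the first Ctrl-Z, or n if none. */
size_t text_length(const unsigned char *buf, size_t n)
{
    const unsigned char *z = memchr(buf, CTRL_Z, n);
    return z ? (size_t)(z - buf) : n;
}
```

For an 8-byte block holding "hi\n" followed by five Ctrl-Z bytes, the helper reports 3 bytes of text.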

Unix doesn't have a general EOF character. The "tty" (i.e. terminal)
device driver supports an EOF character; typically ^D (although not
always; historically, # has been a common EOF character), although it can
be configured.

When this character is read from the tty device, the driver reports EOF to
the process reading from the device. This state is transient; the process
can ignore the EOF and continue to read from the tty, in which case a
subsequent ^D will return another EOF.
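
The transient nature of EOF can be sketched in standard C with clearerr, which resets a stream's end-of-file indicator so that reading can be attempted again. `read_until_eof` is a hypothetical helper. Whether a stdio implementation would retry the read without the clearerr call varies, so the call is made explicitly here.

```c
#include <stdio.h>

/* Hypothetical helper: read from `in` until it reports EOF, then clear
   the stream's end-of-file indicator so a later call can try again.
   Returns the number of characters read by this call. */
long read_until_eof(FILE *in)
{
    long n = 0;

    while (getc(in) != EOF)
        ++n;
    clearerr(in);   /* forget the EOF (and error) state */
    return n;
}
```

On a terminal in canonical mode, calling `read_until_eof(stdin)` twice should let the user end the first batch with ^D and then type a second batch; on a regular file the second call simply reads nothing.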

This behaviour only applies to terminals (anything connected via a serial
port, plus programs like terminal emulators which pretend to be connected
via a serial port), and only if the terminal is in "canonical" mode (if
it's in "raw" mode ^D, ^C, ^Z etc are returned as normal characters).
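
On POSIX systems (Cygwin included, as far as I know) the configured EOF character and the canonical-mode flag can be inspected through the termios interface. The sketch below assumes a POSIX environment; `tty_eof_char` and its return conventions are hypothetical and only meant for illustration.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper for POSIX systems. Returns the terminal's EOF
   character (4 means Ctrl-D), -1 if `fd` is not a terminal, or -2 if
   the terminal is not in canonical mode, where no EOF character applies. */
int tty_eof_char(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) != 0)
        return -1;              /* not a tty, e.g. a pipe or a file */
    if (!(t.c_lflag & ICANON))
        return -2;              /* raw-style mode: ^D is ordinary data */
    return t.c_cc[VEOF];
}
```

Calling `tty_eof_char(STDIN_FILENO)` with input redirected from a file or pipe returns -1, which matches the point above that the EOF character only exists for terminals.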
 
Nick

Flash Gordon said:
Oh, and an advantage of \r\n as your line termination is that it
matches most of the text based internet protocols.

Can someone remind me exactly how this works? AIUI (and I could be
wrong), '\n' will be translated to whatever is appropriate for the
system. So it could generate a CR LF pair on DOS and its descendants, LF
on Unix, and something else entirely on other systems (like starting the
next record on IBMs with fixed-length records).

If I want to send the necessary characters for an Internet header,
irrespective of what OS my code will be running on (well, as far as Unix
and Windows goes anyway) what should I write? Is there a risk that "\r\n"
will generate two CRs? Should I assume ASCII and send 10 and 13
(probably in octal)?
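
A hedged sketch of one common approach: the translation of '\n' happens when characters pass through a text-mode stream, not inside the string itself, so the header can be built with explicit byte values and then written through something that does not translate (a stream opened in binary mode, or a socket). On Windows, I believe writing "\r\n" to a text-mode stream would indeed produce CR CR LF. `format_header` is a hypothetical helper, and the explicit 0x0D 0x0A bytes assume the wire format is ASCII.

```c
#include <stdio.h>

/* Hypothetical helper: format "Name: value" followed by CR LF into
   `out`. Returns the number of bytes written (excluding the '\0'),
   or 0 if the result does not fit in `cap` bytes. */
size_t format_header(char *out, size_t cap, const char *name, const char *value)
{
    int n = snprintf(out, cap, "%s: %s\x0D\x0A", name, value);

    if (n < 0 || (size_t)n >= cap)
        return 0;
    return (size_t)n;
}
```

The resulting buffer would then be passed to fwrite on a binary-mode stream or to a socket send call, neither of which should alter the bytes.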
 
Michael Foukarakis

Also I don't consider Windows "modern."  I consider them way behind
the times.
There's a word for that. 'Denial'.
Sure their aero based GDI gui is fancy but where are the
multiple desktops, there
or remote shells, or hell a shell to begin with
(cmd.exe doesn't count).
Powershell. And cmd *is* a shell.
 They still don't come with useful userland
tools,
Please point me to the Linux/UNIX analogue of OllyDbg. Then you can
talk shit again.
while NTFS has *real* file permissions their GUIs don't give
you access to them,
Of course they do. RTFM.
their user security is somewhat of a joke,
Yeah, UNIXoids are totally secure. Puh-lease..
you
have to be root to install most apps,
Have you tried apt-get install on Ubuntu lately?
they poorly implemented things
like IPsec,
As of this line, you are redirected to various bug and vulnerability
reports for the Linux kernel..
And I think of people like Microsoft
mgmt as assholes for peddling this mediocre shiny useless toy OS on
people...

MS management are very bright entrepreneurs. Assholes are, evidently,
everywhere.
 
