Command line arguments

glen herrmannsfeldt

(snip on quoting, globbing, and otherwise command line processing)
It seems to me that placing quotes around the argument *is* a good way
of handling it. Any program run from a command line and requiring
arguments will have to be specified and given that unix users are
inclined to read man pages and to ask a program itself for --help, it is
a simple thing to say "if you wish to use -f with wildcards, and don't
want the shell expansion, surround the argument with quotes". However,
_most_ unix users will know this already.

I wonder how many users of Apple OS X have never opened a command
window, and have no idea about globbing and quoting.

Even more, Mac users, more than users of other systems, like putting
spaces in file names, which requires escaping or quoting even without
globbing.

-- glen
 
BartC

The way command line expansion is done in current Windows is
still surprising to Unix users.

The most flexible approach would have been to supply a raw, unprocessed
command line to a C program. Then it would work exactly the same way as
every subsequent line of input.

With a library function to chop it up into parameters, and perhaps expand
them, if that is desired.

In fact WinMain() works exactly like that (apart from having such a function
as I suggested).
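
A minimal sketch of the kind of library function suggested above, assuming the simplest possible rule: split on whitespace only, with no quoting, escaping, or wildcard expansion. The name split_command_line and its interface are hypothetical and not part of any standard library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: split `line` in place on whitespace.
 * Stores up to max_args pointers in argv_out, followed by a null
 * pointer (so argv_out needs room for max_args + 1 entries), and
 * returns the number of arguments stored.
 * Quoting, escaping, and globbing are deliberately not handled. */
int split_command_line(char *line, char **argv_out, int max_args)
{
    int argc = 0;
    char *p = line;

    assert(line != NULL && argv_out != NULL && max_args >= 0);

    while (*p != '\0' && argc < max_args) {
        /* Skip whitespace before the next argument. */
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        argv_out[argc++] = p;          /* start of an argument */

        /* Advance to the end of the argument. */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        if (*p != '\0')
            *p++ = '\0';               /* terminate it in place */
    }
    argv_out[argc] = NULL;             /* mirrors argv[argc] == NULL */
    return argc;
}
```

Called on a writable buffer holding "my name is lew", this fills argv_out with four strings. Anything more shell-like (quotes, backslashes, wildcards) would have to be layered on top, which is where the real complexity lies.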
 
Lew Pitcher

The most flexible approach would have been to supply a raw, unprocessed
command line to a C program. Then it would work exactly the same way as
every subsequent line of input.

Perhaps, but then the C program would have to do its own main() argument
parsing. On Windows (at least through COMMAND.COM and CMD.EXE) and Unix
(through whichever shell you choose), the end-user enters one or more lines
of text, and it is the /command parser/ (the shell or CMD) that breaks the
text up into program names and arguments to main().

For instance, on my Unixish box, I type the following three lines
/bin/echo \
my name is \
lew
and the command interpreter assembles that into an invocation of
the /bin/echo program with
argc == 5
argv[0] -> "/bin/echo",
argv[1] -> "my",
argv[2] -> "name",
argv[3] -> "is",
argv[4] -> "lew", &
argv[5] == NULL

Obviously, for the echo program to get individual arguments, the invoker
(the commandline interpreter) must break up the "raw unprocessed command
line".
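
As a small illustration of the argv layout listed above, here is a sketch that counts arguments by walking the vector until the terminating null pointer. The function name count_args is made up for this example. For the argv handed to main(), the C standard guarantees argv[argc] is a null pointer, so the result equals argc.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries of a null-terminated argument vector.
 * For the argv passed to main(), the result equals argc. */
int count_args(char *const argv[])
{
    int n = 0;

    assert(argv != NULL);

    while (argv[n] != NULL)
        n++;
    return n;
}
```

Given the vector from the /bin/echo example ("/bin/echo", "my", "name", "is", "lew", then a null pointer), the count is 5.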

Another example, on my Unixish box, I type the following one line
/bin/echo my name is lew ; /bin/ls

Again, the command interpreter must intervene; the ";" and all that follows
are not part of the /bin/echo argument list, and should not be passed
to /bin/echo. Again, the environment must break up the "raw unprocessed
command line" into something a little more sensible.

Finally, in a Unixish C program (C + POSIX), I code
execlp("/bin/echo","echo","my","name","is","lew",(char *)NULL);
and the /bin/echo program again gets the same arguments as my first example.
Here, there is NO command line, not even a "raw unprocessed command line"
to provide to the C program.
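
A sketch of the same idea in a reusable form, assuming a POSIX system: the caller supplies the argument vector directly and no shell ever sees it, so nothing is split, globbed, or substituted. The name run_without_shell is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a program with an explicit argument vector, no shell involved.
 * Returns the child's exit status, or -1 if it could not be obtained. */
int run_without_shell(char *const argv[])
{
    pid_t pid;
    int status;

    assert(argv != NULL && argv[0] != NULL);

    pid = fork();
    if (pid < 0)
        return -1;                     /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);         /* returns only on error */
        _exit(127);                    /* shells use 127 for "not found" */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing the vector "echo", "*.c" (plus the terminating null pointer) makes echo print the literal string *.c, whatever files exist, because no command interpreter was there to expand it.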

HTH
 
Ben Bacarisse

Shells do all sorts of special processing before executing a program.
Are you making a special case for file globbing, or would you like a
program to be able to "reach out" and bypass all or any of the others
(IO redirection, parameter expansion, command substitution, the various
quotes and word splitting, and so on)?
While it might be difficult to do at this late date, requiring a shell
to support that wouldn't be a big deal.

You'd have to decide what constitutes the raw command. If you say it's
just what was typed, a program would need a complex parser just to work
out what its first argument is! If you mean the command just before
file globbing is done, then I'd ask again what makes globbing the special
case?
 
glen herrmannsfeldt

(snip)
When there is no wild-card match, the Unix shell passes the
actual as-typed string or strings as arguments to the program.

If by unix shell you mean the Bourne shell (sh) or its descendants,
then yes. csh and tcsh don't do that.

Therefore "ls *.a *.b" will pass "*.a" and "*.b" as the arguments to ls
if there are no files with .a or .b extensions.
However, if there is at least one match, then the shell passes
the match(es) as argument(s).

[gah@localhost ~/tmp]$ ls *.a
ls: No match.
[gah@localhost ~/tmp]$

-- glen
 
Ben Bacarisse

BartC said:
The most flexible approach would have been to supply a raw,
unprocessed command line to a C program. Then it would work exactly
the same way as every subsequent line of input.

What does each command here see as its raw unprocessed command line?

ls -lt $(find . -name data\*) | head -$1 >>out*.log

<snip>
 
glen herrmannsfeldt

(snip, someone wrote)
Shells do all sorts of special processing before executing a program.
Are you making a special case for file globbing, or would you like a
program to be able to "reach out" and bypass all or any of the others
(IO redirection, parameter expansion, command substitution, the various
quotes and word splitting, and so on)?

Well, with many shells I can type

!!

and run the previous command again. I certainly don't expect
the program to see the !! and know what the previous arguments
were.

Even more, I can type

ls !*

So, some processing has to be done before the program is run.

You'd have to decide what constitutes the raw command. If you say it's
just what was typed, a program would need a complex parser just to work
out what its first argument is! If you mean the command just before
file globbing is done, then I'd ask again what makes globbing the special
case?

Yes, one of the fun things about unix is programs that recognize the
name that they are called with and do things differently.

On my system, gzip, gunzip, and gzcat are all the same program.
With either a hard or symbolic link, argv[0] is the name actually
typed, but aliases are replaced before argv[0] is generated.
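
A sketch of how a program might choose its behaviour from the name it was invoked as. This is not taken from the gzip sources; the enum, the function name mode_from_argv0, and the exact name-to-mode mapping are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

enum mode { MODE_COMPRESS, MODE_DECOMPRESS, MODE_CAT };

/* Pick a mode from the name the program was invoked as.
 * Only the final path component of argv0 is examined. */
enum mode mode_from_argv0(const char *argv0)
{
    const char *base;

    assert(argv0 != NULL);

    base = strrchr(argv0, '/');
    base = (base != NULL) ? base + 1 : argv0;

    if (strcmp(base, "gunzip") == 0)
        return MODE_DECOMPRESS;
    if (strcmp(base, "gzcat") == 0)
        return MODE_CAT;
    return MODE_COMPRESS;              /* default, e.g. invoked as "gzip" */
}
```

A main() would typically call mode_from_argv0(argv[0]) once at startup. Since a hard or symbolic link changes argv[0], one binary can then serve under all three names.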

Some (not unix) command line parsers upper case the whole line
before processing it. Many programs need the line in the original
case.

-- glen
 
Ben Bacarisse

Fred K said:
When there is no wild-card match, the Unix shell passes the actual
as-typed string or strings as arguments to the program.

This has been stated more than once but it varies between shells
and it is even configurable in some. The term "the Unix shell" is
ambiguous, but what you describe is the default behaviour of the default
shell in many Unix systems. It's just neither universal nor fixed.
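
The two policies can be reproduced from C with the POSIX glob() function, whose GLOB_NOCHECK flag returns the pattern itself when nothing matches. In the sketch below, the function name expansion_count and the loose "sh-like" versus "csh-like" labelling are assumptions for illustration; real shells have further options that affect this.

```c
#include <assert.h>
#include <glob.h>
#include <stddef.h>

/* Return how many words `pattern` expands to, or -1 on error.
 * keep_unmatched != 0: sh-like, an unmatched pattern is passed
 *                      through as a single word.
 * keep_unmatched == 0: csh-like, an unmatched pattern yields no
 *                      words (the caller would report "No match"). */
int expansion_count(const char *pattern, int keep_unmatched)
{
    glob_t g;
    int rc, count;

    assert(pattern != NULL);

    rc = glob(pattern, keep_unmatched ? GLOB_NOCHECK : 0, NULL, &g);
    if (rc == GLOB_NOMATCH)
        return 0;                      /* nothing matched, nothing kept */
    if (rc != 0)
        return -1;                     /* out of memory or read error */

    count = (int)g.gl_pathc;
    globfree(&g);
    return count;
}
```

In a directory with no .a files, expansion_count("*.a", 1) gives 1 because the pattern itself is the single word, while expansion_count("*.a", 0) gives 0.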

<snip>
 
BartC

Ben Bacarisse said:
What does each command here see as its raw unprocessed command line?

ls -lt $(find . -name data\*) | head -$1 >>out*.log

On Windows, characters such as |, < and > have special meaning to
the 'shell', so the first command would see:

-lt $(find . -name data\*)

and the next -$1, which I guess has special meaning in Unix, perhaps like %1
in Windows BAT files. But this is starting to get into scripting now. A few
escapes are acceptable (I was going to say inescapable), and sometimes there
is nothing the invoked program can do with them anyway.

However, to add "*" to the end of \windows\system32\ and turn that one
parameter into 2745 parameters is something that is in a class of its own.

(And imagine that all you'd intended doing with that parameter was to write
it out again as a part of an argument to system(). Or perhaps changing a *.c
spec to a *.o one, which suddenly becomes a bit more difficult.)
 
Keith Thompson

BartC said:
The most flexible approach would have been to supply a raw, unprocessed
command line to a C program. Then it would work exactly the same way as
every subsequent line of input.

With a library function to chop it up into parameters, and perhaps expand
them, if that is desired.

I'd say the Unix approach is equally flexible. 99% of programs
don't care about wildcard expansion, because the shell does all
the expansion for them. Programs that need to deal with wildcards,
like "find", can be invoked with their arguments quoted to inhibit
expansion.
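
For a program that does want to handle wildcards itself, POSIX provides fnmatch() for matching a name against a shell-style pattern. A minimal sketch, assuming the pattern arrived unexpanded because the user quoted it; the wrapper name name_matches is hypothetical.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if `name` matches the shell-style `pattern`, 0 otherwise.
 * The pattern is assumed to have reached the program unexpanded,
 * e.g. because it was quoted on the command line. */
int name_matches(const char *pattern, const char *name)
{
    assert(pattern != NULL && name != NULL);

    return fnmatch(pattern, name, 0) == 0;
}
```

A program invoked as prog '*.c' could apply name_matches(argv[1], entry) to each directory entry it visits, which keeps the matching under its own control.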

That's why C's main() takes its arguments as a char** referring to
a list of strings, rather than as a single string. (See, it's at
least marginally topical.)

It's worked very well for 40 years. It can take some getting used
to if you're not accustomed to it, but I suggest that it's well
worth the effort.

(And even if you're right, and the rest of us have been doing it
wrong all this time, it's not going to change.)

One thing that *could* happen is that a new shell, or a new version
of an existing shell, could set an environment variable containing
the "raw" command line, before quoting and wildcard expansion.
But defining just what the "raw" command line is turns out to be
far from trivial -- and no currently existing programs make use of
the information.
 
glen herrmannsfeldt

This has been stated more than once but it varies between shells
and it is even configurable in some. The term "the Unix shell" is

Well, I suppose "the unix shell" does have some meaning, which would
be the (original) Bourne shell, back from when there was only one.

ambiguous, but what you describe is the default behaviour of the default
shell in many Unix systems. It's just neither universal nor fixed.

But yes, now there are plenty of shells to go around.

-- glen
 
Ben Bacarisse

BartC said:
On Windows, characters such as |, < and > have special meaning to
the 'shell', so the first command would see:

You said "the most flexible approach would have been to...". That
sounds like you are saying what would have been better for Unix to have
done. I.e. I was asking not what Windows does, but what you think Unix
shells should do.

-lt $(find . -name data\*)

and the next -$1, which I guess has special meaning in Unix, perhaps like %1
in Windows BAT files. But this is starting to get into scripting now. A few
escapes are acceptable (I was going to say inescapable), and sometimes there
is nothing the invoked program can do with them anyway.

However, to add "*" to the end of \windows\system32\ and turn that one
parameter into 2745 parameters is something that is in a class of
its own.

It's simply what it means to most Unix shells. I don't see why it's so
odd an idea. You can pass the string "/windows/system32/*" perfectly
easily if that's what the program (or the user) wants.

(And imagine that all you'd intended doing with that parameter was to
write it out again as a part of an argument to system(). Or perhaps
changing a *.c spec to a *.o one, which suddenly becomes a bit more
difficult.)

Just pass *.c to the program. What's the problem?
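
A sketch of the *.c to *.o case, assuming the spec reaches the program as a single string, either because it was quoted or because it is one already-expanded file name. The function name c_spec_to_o is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `spec` to `out`, changing a trailing ".c" to ".o".
 * Works the same on a quoted pattern ("*.c") and on an expanded
 * file name ("o.c"). Returns 0 on success, -1 if spec does not
 * end in ".c" or out is too small. */
int c_spec_to_o(const char *spec, char *out, size_t outsz)
{
    size_t len;

    assert(spec != NULL && out != NULL);

    len = strlen(spec);
    if (len < 2 || strcmp(spec + len - 2, ".c") != 0)
        return -1;
    if (len + 1 > outsz)
        return -1;

    memcpy(out, spec, len + 1);        /* includes the '\0' */
    out[len - 1] = 'o';
    return 0;
}
```

With the argument quoted, the program receives "*.c" and can derive "*.o" directly. With an unquoted argument it receives the matching names one per argv entry and can apply the same function to each.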
 
Les Cargill

BartC said:
I've just discovered that a single command line argument containing
wildcards, such as *.c, is expanded to a full list of matching files before
it gets to main().

This is a *shell* behavior - of bash, I think ( and perhaps others ) -
so something like invoking the program with csh might be helpful.

user@user-desktop:~$ echo *.c
o.c
user@user-desktop:~$ csh echo *.c
echo: No such file or directory.
user@user-desktop:~$
 
Les Cargill

BartC said:
Suppose a program expects two parameters, both of which contain wildcards.
The result will be a single list of files; how to tell where the first set
of files ends, and the next begins? Or the second parameter should be a
single file; how to tell whether that parameter was present? Etc.


I'll have a look later. If that works, that's good. But I can see problems:
it'll work on my system, but someone else running my program will also have
to do that set command. And it might stop other programs working
properly that expect the expansion.


So wrap it with a script.
 
glen herrmannsfeldt

(snip)
I'd say the Unix approach is equally flexible. 99% of programs
don't care about wildcard expansion, because the shell does all
the expansion for them. Programs that need to deal with wildcards,
like "find", can be invoked with their arguments quoted to inhibit
expansion.

I might have said slightly less, if you include all user written
programs, but pretty high.

That's why C's main() takes its arguments as a char** referring to
a list of strings, rather than as a single string. (See, it's at
least marginally topical.)

It's worked very well for 40 years. It can take some getting used
to if you're not accustomed to it, but I suggest that it's well
worth the effort.

TeX and associated programs like to have the command line.

I did once compile metafont from Pascal source with Sun Pascal.
I had to add a routine to put the arguments back together with
space in between before passing it to metafont to process.
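
A routine of that kind might look like the following sketch. It is not the code used for the metafont port; the name join_args is made up. Note that whatever quoting and spacing the user originally typed cannot be recovered this way.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Join argv[1..argc-1] into one newly allocated string with single
 * spaces in between. Returns NULL if allocation fails; the caller
 * frees the result. */
char *join_args(int argc, char *const argv[])
{
    size_t len = 1;                    /* room for the final '\0' */
    char *line;
    int i;

    assert(argc >= 0 && argv != NULL);

    for (i = 1; i < argc; i++)
        len += strlen(argv[i]) + 1;    /* argument plus a separator */

    line = malloc(len);
    if (line == NULL)
        return NULL;

    line[0] = '\0';
    for (i = 1; i < argc; i++) {
        if (i > 1)
            strcat(line, " ");
        strcat(line, argv[i]);
    }
    return line;
}
```

Called as join_args(argc, argv) from main(), it rebuilds something close to a command line for a program that insists on parsing one itself.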
(And even if you're right, and the rest of us have been doing it
wrong all this time, it's not going to change.)

Yep.

One thing that *could* happen is that a new shell, or a new version
of an existing shell, could set an environment variable containing
the "raw" command line, before quoting and wildcard expansion.
But defining just what the "raw" command line is turns out to be
far from trivial -- and no currently existing programs make use of
the information.

Would be interesting to see.

-- glen
 
glen herrmannsfeldt

(snip)
This is a *shell* behavior - of bash, I think ( and perhaps others ) -
so something like invoking the program with csh might be helpful.
user@user-desktop:~$ echo *.c
o.c
user@user-desktop:~$ csh echo *.c
echo: No such file or directory.
user@user-desktop:~$

Maybe

csh -c "echo *.c"

-- glen
 
JohnF

Joe Pfeiffer said:
Trying this with ls, I was surprised to learn that JohnF's
interpretation is actually correct:
snowball:515$ ls a*.c
/bin/ls: cannot access a*.c: No such file or directory
So 'a*.c' must have actually been passed verbatim to /bin/ls.

Yeah, that was exactly my thinking about the observed behavior
of echo and ls in the several situations enumerated above:
the shell's making a few more choices than you might naively
expect. If it's going to expand a*.c and there are no matching
files, then I'd naively expect to see an empty string,
i.e., I'd expect echo a*.c to emit a blank line. Or if it's going
to "pass verbatim" like you said, then it should always do that,
whether or not matching files exist. As it stands, the behavior is
somewhat unpredictable, depending on the contents of your pwd.
But I'm sure the people who wrote it that way have their reasons
why this is usually the most desirable behavior, if not the most
rational behavior.
 
James Kuyper

On 01/11/2013 06:46 PM, BartC wrote:
....
The most flexible approach would have been to supply a raw, unprocessed
command line to a C program. Then it would work exactly the same way as
every subsequent line of input.

With a library function to chop it up into parameters, and perhaps expand
them, if that is desired.

In fact WinMain() works exactly like that (apart from having such a function
as I suggested).

Your first message on this thread implies that you never knew about this
feature until yesterday, which implies that you must be very new to
unix-like systems. At the moment, your impression of this feature is
based primarily upon the fact that it differs from what you're used to.
I recommend getting the equivalent of at least several months full-time
experience with unix-like environments before making a judgment about
the value of this feature. Most experienced users of Unix-like systems
find it far more convenient than inconvenient. In the long run, you
might still dislike that feature, but at least at that point you'll have
enough experience with such systems to justify reaching a conclusion.
 
Philip Lantz

Les said:
This is a *shell* behavior - of bash, I think ( and perhaps others ) -
so something like invoking the program with csh might be helpful.

user@user-desktop:~$ echo *.c
o.c
user@user-desktop:~$ csh echo *.c
echo: No such file or directory.

I'm pretty sure this example doesn't demonstrate what you think it does.
(I know what it demonstrates, but I can't be sure what you intended.)
 
