The OS provides a function that the application can use to do that
pretty simply (CommandLineToArgvW).
Unfortunately, Microsoft's own applications, like CMD.EXE, do not use it.
The command processor processes quotes on arguments, so that
dir "file"
dir file
produce consistent results. But then it gets confused by
"dir" file.
The inability to read the unmangled and unglobbed string (and those
are two separate requirements, although they do sometimes overlap) is
a significant PITA on *nix.
But what you have is the ability to read the un-mangled-in-any-way
null-terminated argument strings which were prepared by the parent process!
At the shell level, you can quote in reliable ways to pass any string
you want to the process:
$ my-find-utility '*.foo' # receives *.foo argument
This is done in practice. There are programs that take patterns and interpret
them themselves:
# GNU diff: recursively compare two directories, but not .xml files.
$ diff -r --exclude='*.xml' dir-a dir-b
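To make the "interpreted by the program" part concrete, here is a minimal sketch of how a program might apply such a pattern itself once it has received it as a plain argument string. It uses POSIX fnmatch(3). The helper name name_is_excluded is made up for illustration, and this is not a claim about how GNU diff is actually implemented.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Hypothetical helper: true when `name` matches the glob `pattern`.
   The pattern arrives as an ordinary argv string; the program, not the
   shell, decides what the wildcard means. */
static bool name_is_excluded(const char *pattern, const char *name)
{
    /* fnmatch() returns 0 on a match and nonzero otherwise. */
    return fnmatch(pattern, name, 0) == 0;
}
```

With the pattern quoted on the command line, a call such as name_is_excluded("*.xml", "config.xml") sees the literal `*.xml` and matches, while "notes.txt" does not.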
Getting the input tokenized and globbed
is often what applications want, but there are enough exceptions to
make the *nix approach as wrong as the MS approach of leaving it far
too much up to the whim of the application.
But the Unix approach lets us pass a vector of arguments from one process to
another in a completely robust way. The Microsoft approach doesn't.
We can replace the Unix shell with some scripting language in which a fork/exec
can be done using a list of strings, and count on the list of strings being
accurately converted to arguments. We don't have that assurance in the Windows
environment.
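As a rough illustration of that assurance, the sketch below forks, execs a program with an explicit argument vector, and captures what the child prints. The helper name run_capture and the buffer handling are my own assumptions, and error handling is kept to a minimum. No shell sits between the two processes, so nothing gets a chance to re-tokenize or glob the strings.

```c
#include <string.h>     /* strcmp, for callers comparing the captured text */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given argument vector and
   copy its standard output into out (NUL-terminated, at most cap-1 bytes).
   Returns the child's exit status, or -1 on failure. */
static int run_capture(char *const argv[], char *out, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: send stdout into the pipe, then exec with the vector
           exactly as given. */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);     /* only reached if exec failed */
    }

    /* Parent: read until the buffer is full or the child closes its end. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < cap - 1 &&
           (n = read(fds[0], out + used, cap - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling it with the vector { "printf", "[%s]\n", "*.foo", "two words", "$HOME", NULL } should give back each string in brackets on its own line, byte for byte as it was passed: the star is not expanded, the space does not split the argument, and $HOME stays literal.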
The MS approach at least
has the advantage of being able to support any desired input syntax
(ignoring piping and command stacking), which the *nix approach does
not.
This is only a theoretical advantage in Windows, whereas it is actually
*done* in Unix.
A primary example of this is the standard awk language, which is used for
writing one-liners whereby the entire script is presented as a command line
argument, often wrapped using single quotes:
$ awk 'BEGIN { foo = 42 } /xyzzy/ { s[$1]++; } ...'
What makes this sort of thing possible is that there is a solid foundation
underneath. A program can call awk using fork/exec, and not have to worry
about quoting and escaping issues:
execlp("awk", "awk", "BEGIN ....", (char *) 0);
argv[0] is "awk" and argv[1] is a null-terminated string which awk takes to be
script source code, not messed up in any way between the exec call and awk.
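A slightly fuller sketch of the same call, wrapped in fork/waitpid so the caller gets awk's exit status back. The helper name awk_exit_status is invented for this example, and it assumes some awk is on PATH. Note that execlp takes the file to look up first and then the argument list starting with argv[0].

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `awk <script>` with no shell in between and
   return awk's exit status, or -1 if it did not exit normally. */
static int awk_exit_status(const char *script)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* The first "awk" is the file searched for on PATH, the second
           becomes argv[0], and the script becomes argv[1] untouched. */
        execlp("awk", "awk", script, (char *)0);
        _exit(127);     /* only reached if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For instance, passing the script `BEGIN { s = "a'b $1 *.c"; exit length(s) }` should yield status 10, the length of that ten-character string: the embedded single quote, dollar sign and star all reach awk without any escaping on the caller's side.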
In the end, it is the Unix environment which exhibits languages that can take
an entire script with arbitrary syntax as a command line argument, thanks
in part to clear quoting rules and robust argument passing between programs.