Xah's Edu Corner: The importance of syntax & notations.


Xah Lee

Xah's Edu Corner: The importance of syntax & notations.

http://www.stephenwolfram.com/publications/recent/mathml/mathml_abstract.html

this article should teach the coding sophomorons and computer
“science” idiotic authors who harbor the notion that syntax is not
important, picked up by all the elite i-reddit & twittering & hacker
news am-hip dunces.

Further readings:

• The TeX Pestilence
http://xahlee.org/cmaci/notation/TeX_pestilence.html

• A Notation for Plane Geometry
http://xahlee.org/cmaci/notation/plane_geometry_notation.html

• The Concepts and Confusions of Prefix, Infix, Postfix and Fully
Nested Notations
http://xahlee.org/UnixResource_dir/writ/notations.html

• The Problems of Traditional Math Notation
http://xahlee.org/cmaci/notation/trad_math_notation.html

Xah
∑ http://xahlee.org/

☄
 

Peter Keller

In comp.lang.scheme Xah Lee said:
Xah's Edu Corner: The importance of syntax & notations.

http://www.stephenwolfram.com/publications/recent/mathml/mathml_abstract.html

this article should teach the coding sophomorons and computer
“science” idiotic authors who harbor the notion that syntax is not
important, picked up by all the elite i-reddit & twittering & hacker
news am-hip dunces.

I must have really tweaked you with my "Syntax is not important, ideas are."
statement.

I read Wolfram's article carefully. He applies an intuitive sense to
why he does or doesn't like a particular notation, yet can't really
elucidate his feelings. I'm surprised that didn't tweak you worse. He
also goes so far as to mention that:

"But actually we still don't know a clean simple way to represent things
like geometrical diagrams in a kind of language-like notation. And
my guess is that actually of all the math-like stuff out there, only
a comparatively small fraction can actually be represented well with
language-like notation."

It is simply that the method by which the right brain categorizes
and processes visual information is not observable by the left
brain. Therefore no language can EVER be constructed by the left brain to
represent why the right brain "prefers" some visual layouts for languages
over others. I've done enough classical art training in my life to
understand the conflict between the powerful spatial/visual processor
the right brain has and the (in the context of drawing) meaningless
linguistics of trying to describe the process.

Only when we as human beings build the observation channels needed (with
physical connection between certain areas of the left and right sides of
the brain) will any sort of meaningful left brain language be created
for the visual understanding/gradation of the spatial relationship and
the method by which our right brain performs its processing.

If you want to design a better computer language, hire an artist.

Most semantic objects in programs stand in some spatio-temporal
relation to each other. If you deny that fact, then simply look at the
directed acyclic form/SSA form/CPS transform of any of your favorite
languages. Compiler go through *great* pains to transform left brain
scribblings into large spatio-temporal "2d images" where lots of algorithms
are done before converting them into assembly. This is because it is simply
easier to visually understand how to do the processing of those elements
than not.

Thank you.

-pete
 

Peter Keller

In comp.lang.scheme w_a_x_man said:
Compiler work real hard.
Compiler have heap big trouble.

That's a funny observation in the context of this thread--which I
appreciate, since syntax really is the cornerstone of meaning transferal
between people. The unintended connotation brought in by what I mistakenly
wrote underscores the value of syntax.

However, what we don't have is a means of measuring the effectiveness
and/or efficiency of expressing meaning for an arbitrary set of syntax
rules. Computer scientists can do this somewhat, in that the expressive
power of parsing is greater than that of regular expressions, and both can
be represented by a syntax. But in a single complexity class, the "black
art" of how to place a metric on a syntax is, at least at this time,
relegated to the right brain and how it visually sees (and visually
parses) the syntax and how our emotions relate to the syntax.

The Wolfram article, in fact, never does mention any metric other than
"this is hard to understand, this is less hard to understand". In a sense,
how is that useful at all? Instead of really trying to find a method
by which understanding can be placed upon a metric (or discovering that a
method *cannot* be found), he seems to anecdotally ascribe understanding
difficulty to various syntaxes.

The real frustrations of Xah Lee might be explained by his denial of the
right brain processing of syntax information. It is to be expected, since
most industrial cultures suppress right brain advancement (emotional
understanding/social interaction, drawing, music, spatial relations) in
favor of left brain processing (language and syntax, symbolic manipulation
(part, though not all, of the skill set of math), object naming). In
fact, his skill at communicating his ideas in a social setting, which
in my opinion is poor and stunted, is a red flag and the epitome of
this type of cultural viewpoint.

Thank you.

-pete
 

toby

In comp.lang.scheme Peter Keller said:
I must have really tweaked you with my "Syntax is not important, ideas are."
statement.

I read Wolfram's article carefully. He applies an intuitive sense to
why he does or doesn't like a particular notation, yet can't really
elucidate his feelings.

Exactly; and as far as I can determine, Knuth does the same. He
applies standards of *good taste* to his notation. (No surprise that
he's also singled out for vituperation by the OP.)

In my opinion Knuth believed in the value of literate programming for
similar reasons: To try to exploit existing cognitive training. If
your user base is familiar with English, or mathematical notation, or
some other lexicography, try to exploit the pre-wired associations.
Clearly this involves some intuition.
 

rjf

In comp.lang.scheme Peter Keller said:
I must have really tweaked you with my "Syntax is not important, ideas are."
statement.

I read Wolfram's article carefully. He applies an intuitive sense to
why he does or doesn't like a particular notation, yet can't really
elucidate his feelings. I'm surprised that didn't tweak you worse. He
also goes so far as to mention that:

"But actually we still don't know a clean simple way to represent things
like geometrical diagrams in a kind of language-like notation. And
my guess is that actually of all the math-like stuff out there, only
a comparatively small fraction can actually be represented well with
language-like notation."


For someone talking about notation and language, it is amusing that
Wolfram refuses to write grammatically correct sentences. Each of the
pseudo-sentences quoted above is a dependent clause. Wolfram writes
like that. He typically begins sentences with "but" or "and".

In the article cited, he also claims to be more-or-less the only
person alive to have thought about these issues, and that Mathematica
is the solution. This is not surprising, and it is nevertheless clear
that Wolfram has spent some time thinking about mathematical notation,
and some money developing fonts and such. His self-praise is perhaps
not deserved.

It is unfortunate that most users of Mathematica labor under a
misunderstanding of the meaning of the fundamental algorithmic
notation of that system, namely "function definition".

f[x_]:=x+1 is not a function definition, although most users think
so.

In fact, it is a pattern/replacement rule.

If you want to write the equivalent of the Lisp (defun f (x) (+ x 1))
you can do so in Mathematica, but it looks like this:

f=#1+1&

or alternatively,

f=Function[Plus[Slot[1],1]]

or alternatively,

f= Function[{x},x+1]
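
All three behave like the Lisp defun on a plain argument; a quick check of
my own:

f = Function[{x}, x + 1];
f[3]    (* 4, same as (f 3) for the defun above *)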


How do those choices grab you for the Emmy in "Excellence in
Notation"?

By the way, what is wrong with f[x_]:=x+1?

While you might think it maps x->x+1 for all x, it does so only in
the absence of other rules involving f. Thus the presence of another
definition (actually another rule) ... f[z_]:=z^2/;OddQ[z] changes
the definition when the argument is an odd integer.
An equivalent definition is
f[z_?OddQ]:=z^2
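
To see the interaction concretely, a small sketch (the results in the
comments are what I'd expect from a fresh kernel; giving the specific rule
first keeps the intended precedence however Mathematica orders them):

f[z_?OddQ] := z^2    (* specific rule: odd integers *)
f[x_] := x + 1       (* general rule *)

f[3]    (* 9 -- the OddQ rule matches *)
f[4]    (* 5 -- falls through to the general rule *)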

Now you could have two rules:
f[x_?Predicate1]:= 123
f[x_?Predicate2]:= 456

and it might not be clear which Predicate is a subset of the other.
(It is clear that the blank pattern "_" is a superset of "_?OddQ" ).

So how does Mathematica deal with this situation?
The predicates are tested in some order until one returns True.
What order is that?

"Whenever the appropriate ordering is not clear, Mathematica stores
rules in the order you give them. "
Oh, when is the ordering not clear?

Uh, "you should realize that this [ordering] is not always possible".
(quotes from the Mathematica documentation, TheOrderingOfDefinitions.)

While Wolfram has provided an admirable summary of the history of
notation, his solution is less so.

RJF
 

John Nagle

Xah said:
Xah's Edu Corner: The importance of syntax & notations.

http://www.stephenwolfram.com/publications/recent/mathml/mathml_abstract.html

this article should teach the coding sophomorons and computer
“science” idiotic authors who harbor the notion that syntax is not
important, picked up by all the elite i-reddit & twittering & hacker
news am-hip dunces.

Definitely read Wolfram's paper. He and his people actually had
to solve the problem.

Wolfram had to face up to the big problem: how do we input traditional
mathematics into a computer unambiguously? Version 1 was Mathematica
FullForm, which is very wordy but clear, like Ada. This worked, but
was too clunky. After a few redesigns, they came up with what he
calls StandardForm, which is Mathematica's variant on what he calls
"TraditionalForm", or mathematics as currently written in textbooks.
The differences are subtle; the main one is that precedence is more
explicit, and a "double strike" convention is used to disambiguate
certain meta-symbols, like the "d" used in derivatives and "i" used
as sqrt(-1).
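
For a sense of the gap between FullForm and the rendered forms, a small
illustration of my own (not from the paper):

FullForm[1/(1 + x^2)]
(* Power[Plus[1, Power[x, 2]], -1] -- wordy, Ada-like, but unambiguous *)

(* StandardForm renders the same expression essentially as the textbook
   fraction 1/(1+x^2), while keeping the parse unambiguous *)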

StandardForm is unambiguous. TraditionalForm is not, as anyone who
spends too much time reading other people's math publications realizes.
Cleverly, the Mathematica people have been able to produce a set of
heuristics which, Wolfram claims, allows converting traditional form
into the unambiguous StandardForm, getting it right most of the time.
(I've been struggling to read a paper on machine learning which has some
non-standard operator precedence problems, and I wish the paper was in
StandardForm.)

Wolfram would, I think, like to make "StandardForm" the standard for
publication, and in time, something like that will probably happen.
It's useful to be able to get math in and out of symbolic manipulation
systems.

Anyway, read the paper.

John Nagle
 

Peter Keller

In comp.lang.scheme toby said:
In my opinion Knuth believed in the value of literate programming for
similar reasons: To try to exploit existing cognitive training. If
your user base is familiar with English, or mathematical notation, or
some other lexicography, try to exploit the pre-wired associations.
Clearly this involves some intuition.

I've dabbled in the linguistics field a little bit and this thread made
me remember a certain topic delineated in this paper:

http://complex.upf.es/~ricard/SWPRS.pdf

This paper uses a concept that I would say needs more investigation:

The mapping of syntax onto metric spaces.

In fact, the paper literally makes a statement about how physically
close two words are when mapped onto a line of text, and how language
networks build graph topologies based upon the distance of words from
each other as they relate in syntactical form.

It seems the foundations for being able to construct a metric to "measure"
syntax are somewhat available. If you squint at the math in the above
paper, you can see how it can apply to multidimensional metric spaces
and syntax in different modalities (like the comparison between
reading written words and hearing spoken words).

For example, here is the same semantic idea in a regex defined three ways
(a quick check of what they match follows the three forms):

english-like:

Match the letter "a" followed by one or more letter "p"s followed by the
letter "l" then optionaly the letter "e".

scheme-like:

(regex (char-class #\a) (re+ (char-class #\p)) (char-class #\l)
       (re? (char-class #\e)))

perl-like:

/ap+le?/
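
As a quick check that all three forms denote the same thing (using
Mathematica's regular-expression support here only because its notation is
the subject of the thread; the sample strings are my own):

StringMatchQ[#, RegularExpression["ap+le?"]] & /@ {"apple", "aple", "appl", "ale"}
(* {True, True, True, False} -- "ale" fails because at least one "p" is required *)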

The perl-like one would be the one chosen by most programmers, but the
various reasons why it is chosen will fluctuate. Can we do better? Can
we give a predictable measurement of some syntactic quantities that
we can optimize and get a predictable answer?

Most people would say the perl-like form has the least amount of
"syntactic garbage" and is "short". How do we meaningfully define those
two terms? One might say "syntactic garbage" is syntax which doesn't
relate *at all* to the actual semantic objects, and "short" might mean
elimination of redundant syntax and/or transformation of explicit syntax
into implicit syntax already available in the metric space.

What do I mean by explicit versus implicit?

In the scheme-like form, we explicitly denote the evaluation and class
of the various semantic objects of the regex before applying the "regex"
function across the evaluated arguments. Whitespace is used to separate
the morphemes of only the scheme syntax. The embedding metric space
of the syntax (meaning the line of text indexed by character position)
does nothing to help or hinder the expression of the semantic objects.

In the perl-like form, we implicitly denote the evaluation of the regex
by using a prototyping-based syntax, meaning the inherent qualities of
the embedding metric space are utilized. This specifically means that
evaluation happens left to right in reading order and semantic objects
evaluate directly to themselves and take as arguments semantic objects as
related directly in the embedding metric space. For example, the ? takes
as an argument the semantic object at location index[?]-1. Grouping
parentheses act as a VERY simple tokenization system to group multiple
objects into one syntactical datum. Given the analysis, it seems the
perl-like regexes generally auto-quote themselves as an advantageous use
of the embedded metric space in which they reside (aka the line of text).

The english-like form is the worst for explicit modeling because it
abstracts the semantic objects into a meta-space that is then referenced
by the syntax in the embedding space of the line itself. In human
reasoning, that is what the quotes mean.

Out of the three syntax models, only the perl one has no redundant syntax
and takes up the smallest amount of the metric space into which it is
embedded--clearly seen by observation.

Consider if we have quantities "syntactic datums" versus "semantic objects",
then one might make an equation like this:

                   semantic objects
expressiveness  =  ----------------
                   syntactic datums

And of course, expressiveness rises as semantic objects begin to outweigh
syntactic datums. This seems a very reasonable, although simplistic,
model to me and would be a good start in my estimation. I say simplistic
because the semantic objects are not taken in relation to one another
on the metric space in which the syntactic datums are embedded; I'll get
to this in a bit.
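
As a toy illustration of the ratio (again sketched in Mathematica), here are
my own rough, hypothetical tallies for the three forms above; how to count
"objects" and "datums" is of course exactly the open question:

(* {semantic objects, syntactic datums} per form *)
english = {6, 24};  schemeLike = {6, 15};  perlLike = {6, 6};
expressiveness[{objs_, datums_}] := N[objs/datums]
expressiveness /@ {english, schemeLike, perlLike}
(* {0.25, 0.4, 1.} -- the perl-like form comes out on top, matching intuition *)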

Now, what do we do with the above equation? Well, we define what we can, and
then optimize the hell out of it. "Semantic objects" for a computer
language is probably a fixed quantity: there are only so many operators and
grouping constructs, and usually very few (often just one) function-description
semantic objects. So we are left with a free variable of "syntactic datums"
that we should minimize to be as small as possible.

If we don't take into consideration the topological mapping of
the semantic objects into the syntactic datum space, then the
number of syntactic datums at least equals the number of semantic
objects. Expressiveness is simply one in the best case, and this would
be pretty poor for sure. It could be worse--just add redundant syntactic
datums, and now it is worse!

If we take into consideration the mapping of the semantic objects onto the
metric space, then maybe this becomes the new equation:

                   distance(index[obj1], index[obj2], ..., index[objn])
expressiveness  =  -----------------------------------------------------
                                     syntactic datums

The distance() function in this new model is the centroid of the syntactic
datum which represent the semantic object. (Of course, there are other
models of expressiveness which would need to be explored based upon this
incremental idea I'm presenting.)

Larger distances between semantic objects, or larger syntactic constructs,
would mean less expressiveness. This is interesting, because it can give
a rough number that sorts the three syntactic models I provided for the
regex and it would follow conventional wisdom. This is a descriptivist
model of syntax measurement.

Obviously, the fun part is: can we search an optimization tree to try to
find an optimal set of syntactic datums to represent a known and finite set
of semantic objects, taking into consideration the features of the metric
space into which the syntactic datums are embedded?

Maybe for another day....

Later,
-pete
 

Xah Lee

Personally, a particularly interesting bit of info i've learned is that, for
all my trouble in the past decade expressing problems of traditional math
notation, i learned from his article this single-phrase summary:
“traditional math notation lacks a grammar”.

The article is somewhat disappointing though. I was expecting he'd go
into some details about the science of math notations, or, as he put
it aptly: “linguistics of math notations”. However, he didn't touch
the subject, except saying that it hasn't been studied.

Xah
 

Peter Keller

In comp.lang.scheme Peter Keller said:
The distance() function in this new model is the centroid of the syntactic
datum which represent the semantic object.

Oops.

I meant to say:

"The distance() function in this new model uses the centroid of each
individual syntactic datum (which represents the semantic object) as
the location for each semantic object."

Sorry.

-pete
 

Kaz Kylheku

["Followup-To:" header set to comp.lang.lisp.]
Oops.

I meant to say:

"The distance() function in this new model uses the centroid of each
individual syntactic datum (which represents the semantic object) as
the location for each semantic object."

Don't sweat it; either way it makes no sense. The rewrite does have a more
journal-publishable feel to it, though: the centroid of the whole aromatic
diffusion seems to hover more precisely above the site of the bovine waste from
which it apparently emanates.
 

Xah Lee

2009-08-17


upon a more detailed reading of Stephen's article, i discovered some
errors.

On this page:
http://www.stephenwolfram.com/publications/recent/mathml/mathml2.html

he mentions the Plimpton 322 tablet. It is widely taught in math
history books that this table is a list of Pythagorean triples.

On reading his article, i wanted to refresh my understanding of the
subject, so i looked up Wikipedia:
http://en.wikipedia.org/wiki/Plimpton_322

and behold!

apparently, in recent academic publications, it is suggested that this
is not Pythagorean triples, but rather “a list of regular reciprocal
pairs”.

Xah
∑ http://xahlee.org/

☄

 

Xah Lee

http://www.stephenwolfram.com/publications/recent/mathml/index.html

i was trying to find the publication date and context, but didn't find
it last time after a couple of minutes. Yesterday, on rereading, i did. The
article in question is:

«
Mathematical Notation: Past and Future (2000)

Stephen Wolfram
October 20, 2000
Transcript of a keynote address presented at
MathML and Math on the Web: MathML International Conference 2000
»

so, it's a speech for MathML conf in 2000.

so, this explains the error on the Plimpton 322. The latest findings
on that were published in 2002 and later.

the date of this speech also explains parts of the writings about some
mysterious “fundamental science work”, which now we know is his
controversial book A New Kind of Science (2002).

Xah
∑ http://xahlee.org/

☄

 
