[RFC] SRS Section 2
zhouhui at wam.umd.edu
Thu Feb 3 19:22:38 PST 2005
On Thu, Feb 03, 2005 at 10:33:48PM +0000, Matthew Burgess wrote:
>Going back to your previous example of a C compiler, how
>useful would it be if you gave the compiler invalid C code and it
>silently ignored those lines of code? I'm not talking about code that
>is syntactically valid but contains logic-errors or other classes of
>bugs, I'm talking about stuff like a missing ';'.
You are talking about missing end tags or a missing '>', right?
That is not what XML validation is about. That is syntactical parsing
to verify whether it is a well-formed XML document. XML validation
verifies whether a well-formed XML document is a valid document of a
certain type according to a DTD.
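
To make the distinction concrete, here is a minimal sketch using only
Python's standard xml.etree.ElementTree, which checks well-formedness
and nothing more (it does no DTD validation). The two profile fragments
and the <file> element are made up for illustration; only <download>
and <url> come from this discussion.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return (True, None) if text parses as XML, else (False, message)."""
    try:
        ET.fromstring(text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)

# Hypothetical fragments, not taken from a real nALFS profile.
missing_end_tag = "<download><url>http://example.org/foo.tar.gz</download>"
missing_url = "<download><file>foo.tar.gz</file></download>"

# A missing end tag is a syntax problem: the parser rejects it.
print(is_well_formed(missing_end_tag))
# A missing <url> is still well formed; only a DTD-level check could object.
print(is_well_formed(missing_url))
```

The second fragment passes because the question "is <url> required
under <download>?" belongs to validation, not to syntactical parsing.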
Is syntactical parsing useful?
Let's see what syntactical parsing is. With
and lex and yacc, one gets a C syntactical parser. What it does is
spit out OK or NOT on the first syntactic error.
A syntactic parser verifies that a stream is syntactically correct,
which by itself is meaningless. The effort of correcting syntactic
errors is pure wrestling with the language and contributes nothing to
your objective, conveying meaning.
A seasoned C programmer still makes quite a few syntactical errors. I
still routinely commit 20-40% syntactic errors, although I can correct
most of them on detection. But that still indicates the C syntax is not
quite intuitive or efficient. In Python, although with less than a
year's experience, I make less than 10% (I estimate much less)
syntax errors. I would consider you (Matt) a seasoned XML speaker. If
you still commit a large percentage of XML syntax errors, that's an
indication of a poor syntax, at least for humans. When I edit a nALFS
profile, I commit 99% syntax errors, so I abandoned nALFS altogether to
avoid XML. All that syntactical effort is just pure meaningless
struggle with poor language design. (Well, XML does make a good
language for machine communication.)
XML validation, as I see it, is the semantic layer of XML. The DTD
defines a document type which has some meaning. However, XML being
general, this semantic layer can't be very thorough, so it is at most a
partial semantic layer. It verifies, say, that it is a valid DocBook
document, but it doesn't do anything beyond that; it doesn't even tell
you how many chapters are in it or how deep the document structure is.
How useful is that!
To parse at the semantic level, it is necessary to parse at the
syntactical level first. So on validation, xmllint will spit out
syntactical errors first. The fact is, it has not reached the
validation part yet. With a nALFS profile, at least in my case, I
almost never make a DTD-level error (how would I forget to add a <url>
element under a <download> element!?), so it is 99% pure meaningless
syntax errors plus 1% semantic errors which DTD validation won't help
with. That's why I say DTD validation for alfs is purely useless.
After validation with xmllint, does it mean alfs doesn't need to do
syntax parsing again? NO. In fact, alfs has to parse the full semantic
layer, which means it must first parse the syntax layer again, then
parse the DTD-level semantics, then the actual logic (semantics) of the
profile. DTD validation won't save alfs anything!
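
The three layers can be sketched roughly as below. This is only an
illustration under stated assumptions, not alfs code: load_profile,
ProfileError, and the <profile> root element are hypothetical names,
the structural rule is the <url>-under-<download> example from above,
and the "logic" check (a non-empty URL) is a stand-in for whatever the
real program would verify.

```python
import xml.etree.ElementTree as ET

class ProfileError(Exception):
    """Hypothetical error type; the message names the failing layer."""

def load_profile(text):
    # Layer 1: syntax. The consumer has to parse the text itself,
    # whether or not an external validator looked at it earlier.
    try:
        root = ET.fromstring(text)
    except ET.ParseError as err:
        raise ProfileError("syntax: %s" % err)

    # Layer 2: DTD-level structure, hand-written here because the
    # standard library parser does not validate against a DTD.
    for download in root.iter("download"):
        if download.find("url") is None:
            raise ProfileError("structure: <download> without <url>")

    # Layer 3: the actual logic of the profile (assumed check).
    urls = [(u.text or "").strip() for u in root.iter("url")]
    for url in urls:
        if not url:
            raise ProfileError("logic: empty download URL")
    return urls
```

Feeding a profile straight into such a loader reports whichever layer
fails first, which is the workflow argued for here.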
By the way, have you ever hit a C parser that tells you your program is
missing a ';' somewhere? I know some good parsers do that, which shows
that the ';' at the end of a statement is just not necessary. It is
there to bait you into a meaningless struggle. Yes, if one writes
obfuscated C code, almost all those syntax elements are necessary. But
most programmers naturally use newlines and indentation, which the C
syntax ignores, and that shows the inefficiency. Python uses them;
that's why it feels so intuitive and less error-prone. Of course, the
semantic level of C is also quite cumbersome.
XML also doesn't utilize indentation and newlines, and it uses a double
matching syntax: '<' and '>'; open tag and close tag. Humans are not
good at matching, and those matching requirements in the syntax are my
#1 error source. That is why OGDL (google it if you don't know what it
is) feels so human friendly. When I edit my profile in OGDL, 100% of my
errors are at the logical level (like typos in package names).
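
As a rough illustration of why indentation removes the matching
problem, here is a toy parser for an indentation-nested outline. It is
NOT OGDL's actual grammar, just a sketch of the general idea; the
function name parse_indented and the sample profile text are made up.

```python
def parse_indented(text):
    """Parse an indentation-nested outline into (name, children) tuples."""
    root = ("root", [])
    stack = [(-1, root)]  # (indent width, node) for each open level
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines carry no structure
        indent = len(line) - len(line.lstrip(" "))
        node = (line.strip(), [])
        # Close every level that is not a parent of this line.
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][1][1].append(node)
        stack.append((indent, node))
    return root[1]

# Hypothetical profile text: nesting comes from indentation alone.
sample = """\
package
  name foo
  download
    url http://example.org/foo.tar.gz
"""
print(parse_indented(sample))
```

There is nothing to close and nothing to match, so the only mistakes
left to make are in the content itself.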
Let's use the C compiler example again (seems everyone loves it :).
There is never a notion of validation. After writing the source, you
go ahead, feed it into the compiler and correct the errors. What I am
saying for alfs is: forget all those DTDs and validations, just feed
the profile into alfs and let alfs tell you the errors. alfs needs to
do that whether or not the profile has been validated before.
Since I am in the mood and am talking about the useless DTD: Jeremy, I
think your effort on the SRS is useless. Disregarding the fact that
Neocool takes me as a joke, in many respects I am in total agreement
with him. (He seldom speaks outside IRC, so I have had little chance to
say it.) Writing an SRS hoping different coders will implement it
according to the SRS is very similar to making a DTD and a profile and
waiting for an actual program to use them. It will work like the DTD,
providing more hassle than help. The DTD demands writing multiple
elements for a simple bash command, which doubles the opening/closing
matching pairs and again doubles the < > pairs. Any writer of automated
build scripts or programs who is not constrained by this DTD will not
use a profile that does that. As for the SRS, specifying this detail
and that detail without actually writing the program is absurd. It was
different when Kevin was in charge: he had a code base to build on, and
the SRS was essentially a to-do list based on the existing
implementation. Now the program is being written from scratch, and the
SRS has little connection with Neocool's code (if that counts). I just
can't help thinking what a useless effort you are making.
Well, I love discussion and don't care much about the progress of alfs,
and most of all, I quite appreciate your energy and intention, so I
haven't spoken on this yet. Now I have spoken, just as a friend and
expressing my honest opinion with my reasons, hope you don't get too
Again on the DTD. The DTD has had conditionals for a long time, but
they never went into code; how useful is that? The download tag also
lagged for quite some time, IIRC. To the profile writers: do you really
enjoy reading a DTD compared to reading man smb.conf or httpd.conf?
Well, I thought I had more rantings, but I can't remember them now.
Someone must have been sighing or uhhh... quite a few times already,
sorry :) Hopefully it still makes for less reading than the XML spec or
even the DTD spec.