Re: [engelang] Xorban: Termsets

From: And Rosta <and.rosta@hidden.email>
Date: Mon, 08 Oct 2012 12:56:24 +0100
Subject: Re: [engelang] Xorban: Termsets
To: engelang@yahoogroups.com

Mike S., On 05/10/2012 20:34:

On Fri, Oct 5, 2012 at 6:22 AM, And Rosta <and.rosta@hidden.email> wrote:

On Oct 4, 2012 5:21 PM, "Mike S." <maikxlx@gmail.com> wrote:




On Thu, Oct 4, 2012 at 11:38 AM, And Rosta <and.rosta@hidden.email> wrote:


On Thu, Oct 4, 2012 at 4:10 PM, Mike S. <maikxlx@gmail.com> wrote:

Some initial comments.

1. I don't see a good rationale for having a load of special
rules to accommodate parentheticals -- at least not when Xorban
is trying not to be more complicated than it needs to be.



Do you prefer that parentheticals be removed from the grammar, or
handled another way?


It depends what we require of them and how simply that can be
accommodated. For the time being, I'd suggest leaving them out. That
doesn't preclude restoring them later.

No one has really been using them, but that's to be expected given
that we are all novice users at this point. Intuitively, I suspect
that parentheticals will find great use among fluent speakers, e.g.
as a way of adding interjectives, vocatives, topic switching,
nonrestrictive clauses and similar things. All of these things would
be useful right now if we were to move past the toy-sentence stage.


I agree, but each of these things can be considered in its own right. For example, interjections should be able to occur anywhere without problems, because they don't combine syntactically with anything. Nonrestrictive clauses could be signalled simply by a sentence-internal illocutionary operator marking a formula as outside the scope of the main illocutionary force. And so on.

2. It would be good to adopt a sounder less improvised formalism,
separating syntax from phonology, and able to accommodate
phonologically-null and semantically-interpreted syntax. What we
have at present is a constituent structure imposed on a
phonological string -- everything wrong with Lojban pseudogrammar
is wrong with this, albeit to a far lesser degree.



I am not sure how separating phonology (which is really just the
segment-terminals in these rules) from syntax would help anything,


It would remove cruft and avoid the delusion that syntactic terminal
nodes consist of phonological material. If there happened to be
perfect homomorphism between syntax and phonology, the annoyance
would be purely conceptual, but given the goal of usability and
brevity, there will not be this homomorphism; & we already are in
agreement about this.

You may have to jog my memory -- can you give an illustration
involving a hypothetical design in which it would be desirable for
that homomorphism not to hold?


The homomorphism affords simplicity of rules, but as a design criterion, concision is far more important than simplicity of rules -- I think we agree on that. We also agree that there may be elements in logical form that are implicit, i.e. present in logical form but not in phonological form.

3. It would be helpful for the rules for binding to be made
explicit. Do the rules for binding remain the same as they are
for basic Xorban?

--And.



To date there have been no binding rules in the formal grammar.


To date there has been no codified formal grammar. There exists an ad
hoc sketch by Jorge of some bits of grammar.

What I am going to write here and for the rest of this email is not
going to be fully responsive to your points, but rather will be
expositional of my own (and possibly others' to some degree) loglang
design perspective.

The design of a loglang is conceived as the design of two separate
but parallel formal modules: grammar (production rules) and semantics
(interpretation rules).  The grammatical module subsumes syntax,
morphology and the phoneme inventory, and the semantic module
provides an interpretation for each grammatical production in
parallel. Regarded purely grammatically, a _formal language_ over
some formal alphabet (i.e. the phonemes) is simply some subset of the
set of all possible finite strings composed of those phonemes.  A
_formal grammar_ of a language is simply a set of rules that defines
which strings (i.e. the sentences) are in that subset (i.e. the
language).  There is nothing more to a "formal grammar" than that.

This is a more technical overview:
http://en.wikipedia.org/wiki/Formal_grammar
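
To make the definition above concrete, here is a minimal Python sketch. The two-letter alphabet and the single toy rule (S -> a S b | empty) are assumptions made purely for illustration; they have nothing to do with Xorban's phonemes or rules. The point is only that a grammar, in this sense, is a membership test that picks a subset out of all finite strings.

```python
from itertools import product

# Hypothetical toy alphabet, chosen only for illustration.
ALPHABET = ("a", "b")


def in_language(s: str) -> bool:
    """Toy grammar S -> "a" S "b" | "": accepts strings of the form a^n b^n."""
    if s == "":
        return True
    return s[0] == "a" and s[-1] == "b" and in_language(s[1:-1])


def strings_up_to(n: int):
    """Enumerate every string over ALPHABET of length <= n, shortest first."""
    for length in range(n + 1):
        for letters in product(ALPHABET, repeat=length):
            yield "".join(letters)


# The "language" (restricted to a length bound) is the subset the grammar accepts.
candidates = list(strings_up_to(4))
language = [s for s in candidates if in_language(s)]
```

With the length bound of 4 there are 31 candidate strings, and the grammar admits three of them: the empty string, "ab" and "aabb".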

Jorge's grammars have been complete formal grammars under this
definition.

I think this is a good characterization of the difference between where you and I are coming from.

It's my impression that formal grammars are a staple of introductions to Computer Science and are considered important by Computer Scientists. But they don't feature in introductions to linguistics, or anywhere in linguistics apart from its computational corners. That is not because linguistics dislikes formalisms but because formal grammars are not suited for application to language: language doesn't work like that. "Formal grammar" in the sense under discussion seems, like Chomsky's 1950s work that is foundational to it, a naive, crude and erroneous attempt to model language formally (as any initial attempt at the birth of a discipline is likely to be). I have met linguists with a kind of 1950s-ish view of the mechanics of language, but they were rather unreflective by nature, or worked in other subdisciplines.

I see now that there are two ways of making a loglang.

(I) You start with the notion of "formal grammar" and create a set of rules that implement it.

(II) You consider how language works, and create a loglang version of it.

It had never occurred to me that anybody would have (I) as a goal, and hence when I saw evidence of (I), e.g. in Lojban, I had attributed it to naivety and ignorance, on the assumption that (I) would merely be a misguided attempt at (II). I suppose the attraction of aiming for (I) in the first place is that it is well-understood and readily formalizable, but I don't know how it would avoid the garbage-in-garbage-out problem (that is, is there virtue in formalized garbage?).

Under approach (II), syntax would be a structure of Argument and Binding relations, with lexical and inflectional rules mapping these structures to the sentential phonological form and with semantic rules (i) defining semantic equivalences between syntactic structures (-- something restricted to loglangs, I think) and (ii) defining semantic interpretation (truth-conditions, etc.). And above the word-level, phonological structure would probably be a mere concatenation of phonological words, certainly without any of the elaborately (and pointlessly) patterned structure of Lojban and its less egregious Xorban counterparts.

In the rest of what I say, I am supposing Xorban to be taking Approach (II), but if anyone wanted to take Approach (I) instead, I guess we could take a step back, agree to disagree in principle, maybe work on parallel projects, and then look to see where we have common ground.

But we can't intelligently use that as the basis for serious
discussion. Jorge's sketch is innocuous because the rules of syntax
are so simple that one groks them quite independently of the BNF
stuff. But for more complicated proposals it simply won't do. As
formulated, these termset rules merely impose pointless patterns on
phonological strings, & it's somewhat shameful that we are so
blithely recapitulating the inanities of Lojban. If the syntactic
rules don't generate logical formulas then throw them away; they're
junk.

If the grammar for formulas without termsets (i.e. trees) generates
logical formulas, then assuming no mistakes have been made I assure
you that the proposed grammar for formulas with termtrees also
generates logical formulas, because every formula that contains
termtrees can be interpreted unambiguously as a rewriting of a
certain formula that does not contain termtrees.


I'm certain that in your head there exists a real functioning grammar that generates formulas. What I was objecting to was the current explicit formulation, which doesn't. The current formulation appears to divide the phonological form of a sentence into labelled parts, which looks like junk to me, and says nothing at all about the essential stuff, which is the Argument and Binding relations.

The language with termtrees is a superset of the language without
them, but under the accompanying proposed semantics, the language
with termtrees has precisely zero greater expressive power than the
language without them. It has merely greater abbreviatory power.


Yes, I understand this.

There was a fair deal of work over two weeks coming up with a design
that is grammatically unambiguous. Given the skills that I have, I
would not have been able to contribute towards a meaningful result
without relying on the formalism of BNF or something similar.


That's fine. I don't object to BNF as an ad hoc aid for thinking or recording ideas. But if we actually had to start discussing Xorban syntax seriously, which I think the complications added by the termtrees mean we do, we should translate the BNF stuff into the real rules that generate the real structure of Argument and Binding relations.

The termset proposals themselves seem to me to be essentially a good
idea - a useful abbreviatory device. I mean not to criticize the
attempt to find workable abbreviatory devices, but only to criticize
the idea that these BNF rules have merit. The ideas that the BNF
rules are a misguided attempt to formalize surely do have merit, tho
until I understand them I can't comment on their adequacy.

The way I see it: Xorban is founded on FOPL, or is possibly a modest
extension of FOPL;


I agree with that.

the language of FOPL is aptly defined by BNF. Therefore, Xorban is
aptly defined by BNF.


My only objection to BNF per se is the weak objection that it is too unrestrictive. You can use BNF to write good rules and to write bad ones. I can vaguely see how to do FOPL in BNF with two orthogonal trees, one for Argument structure and one for Binding structure, on the presumption that there is some version of BNF neutral with regard to the linear ordering of parts, and on the presumption that mere use of BNF doesn't imply confusion between phonology and syntax, tho I can't see how to tie the trees together. Doing something similar for Core Xorban, for the non-binding tree, we'd want something like:
    complement := argument-terminal | phrase
    phrase := head complement*
[where the number of complements varies according to the identity of the head]. And for binding, we'd want:
    binding := binder bindee*
But I can't see how to capture the fact that a binder is a head and its bindees are contained within its complement. "*Aptly* described by BNF" seems like a bit of a stretch, for FOPL and for Xorban: "Described by BNF clumsily or perhaps not at all" seems closer to the mark, tho I'm basing that only on my own half-arsed attempts.
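
For what it's worth, the containment condition is easy to state once one leaves BNF for an ordinary data structure. The following Python sketch is only that, a sketch: the class name Node, the function names, and the labels "Q", "cat", "sleeps", "runs", "and" and "x" are all hypothetical, not Xorban vocabulary or anyone's actual proposal. The argument tree is the head-plus-complements structure; binding is a separate pointer from bindee to binder; and the check says that every bindee must lie somewhere inside its binder's complements.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by contents
class Node:
    """A node of the argument tree: a head label plus its complements."""
    label: str
    complements: List["Node"] = field(default_factory=list)
    binder: Optional["Node"] = None  # binding relation, kept apart from the tree


def descendants(node: Node) -> Iterator[Node]:
    """Yield every node properly contained in node's complements."""
    for child in node.complements:
        yield child
        yield from descendants(child)


def binding_ok(root: Node) -> bool:
    """True if every bound node sits inside the complements of its binder."""
    for node in [root, *descendants(root)]:
        b = node.binder
        if b is not None and not any(d is node for d in descendants(b)):
            return False
    return True


# Hypothetical example: a binder "Q" whose two complements each contain a bindee.
x1, x2 = Node("x"), Node("x")
q = Node("Q", [Node("cat", [x1]), Node("sleeps", [x2])])
x1.binder = q
x2.binder = q

# A bindee placed outside the binder's complements violates the condition.
stray = Node("x", binder=q)
top = Node("and", [q, Node("runs", [stray])])
```

Here binding_ok(q) holds, while binding_ok(top) fails because "stray" is bound by "Q" without being contained in it. This does not show that BNF can express the condition; it only shows the condition itself is simple.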

Nevertheless, I am not opposed to exploring other formalisms. I
definitely will revisit your "reformulating" thread if you continue
fleshing out your formal description. I am very interested in what
you come up with.


I think I'd done the rules for core Xorban. Possibly the binding rules are incomplete: I can't remember whether the rules I gave said, as they probably need to, that if X is subordinate to Y and Z is subordinate to X and Y binds Z but not X, then the form of X must be differently inflected from the forms of Y and Z.
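
The constraint just stated can be written as a small check. This is a rough Python sketch under loudly stated assumptions: "subordinate" is read as dominance in a tree (via a parent pointer), the inflection is reduced to a single string, and the class name Word, the function names and the forms "a" and "e" are hypothetical placeholders rather than actual Xorban inflections.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(eq=False)  # identity comparison
class Word:
    form: str                          # hypothetical inflection marker
    parent: Optional["Word"] = None    # the word this one is subordinate to
    binder: Optional["Word"] = None    # the word that binds this one, if any


def between(z: Word, y: Word) -> Optional[List[Word]]:
    """Words strictly between z and its ancestor y; None if y is no ancestor."""
    path = []
    w = z.parent
    while w is not None:
        if w is y:
            return path
        path.append(w)
        w = w.parent
    return None


def violations(words: List[Word]) -> List[Tuple[Word, Word, Word]]:
    """Return (Y, X, Z) triples where Y binds Z but not X, yet X shares a form."""
    bad = []
    for z in words:
        y = z.binder
        if y is None:
            continue
        for x in between(z, y) or []:
            if x.binder is not y and x.form in (y.form, z.form):
                bad.append((y, x, z))
    return bad


# Hypothetical chain Y > X > Z in which Y binds Z but not X.
y = Word("a")
x = Word("a", parent=y)            # same form as Y and Z: should be flagged
z = Word("a", parent=x, binder=y)
clash = violations([y, x, z])

x.form = "e"                       # differently inflected: constraint satisfied
fixed = violations([y, x, z])
```

In the first configuration the check reports the triple (y, x, z); after X is given a different form it reports nothing.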

One thing that you have made clear to me: In the long run, we will
need more intuitive and informal ways of describing the language to
our fellow humans than BNF. Plentiful illustrative usage examples,
which I apologize for being short on, are a must. The formal
definition is just that, a formal definition.


But hopefully the formal description describes the actual mechanics of the syntax too.

No one is meant to think in it when they use the language.


Given the goal of being ergonomic, I think it would be good to base the formal definition on what speakers are in fact meant to think in when they use the language. Well, I don't mean go all cognitive and mentalist, but at least base the formal definition on something consistent with a very simple incremental parsing algorithm.
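
By "a very simple incremental parsing algorithm" I have in mind something on the order of the following Python sketch. It assumes a head-first order in which each head's number of complements is known from the head itself, as in the "phrase := head complement*" rule earlier; the ARITY table and its entries are hypothetical stand-ins, not Xorban words, and any word not in the table is treated as an argument terminal.

```python
# Hypothetical lexicon: how many complements each head takes.
ARITY = {"gives": 3, "sees": 2, "sleeps": 1}


def parse(words):
    """Consume words left to right, keeping a stack of unfinished phrases."""
    stack = []   # each entry: [head, complements_so_far, complements_still_needed]
    done = None  # the completed top-level phrase, once there is one
    for word in words:
        if done is not None:
            raise ValueError(f"unexpected word after complete phrase: {word}")
        need = ARITY.get(word, 0)
        if need > 0:
            stack.append([word, [], need])  # open a new phrase
            continue
        finished = word                     # an argument terminal is complete
        while True:
            if not stack:
                done = finished
                break
            top = stack[-1]
            top[1].append(finished)         # attach to the innermost open phrase
            top[2] -= 1
            if top[2] > 0:
                break                       # that phrase still wants complements
            stack.pop()                     # phrase complete: pass it upward
            finished = (top[0], tuple(top[1]))
    if done is None:
        raise ValueError("input ended with phrases still open")
    return done


tree = parse(["sees", "x", "sleeps", "y"])
```

Each word is handled once, on arrival, and the only memory needed is the stack of phrases still waiting for complements. For the example input the result is ("sees", ("x", ("sleeps", ("y",)))).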

--And.

