Re: Self-segmenting words & the treatment of names

From: Martín Baldán <martinob@hidden.email>
Date: Sun, 07 May 2006 20:01:54 -0000
Subject: Re: Self-segmenting words & the treatment of names
To: engelang@yahoogroups.com

Hola :)

--- In engelang@yahoogroups.com, "Jorge Llambías" <jjllambias@...> wrote:

> On 5/6/06, Martín Baldán <martinob@...> wrote:
>>
>> I'm working on a logical language myself, but it's still less than
>> half-baked, and quickly evolving. To be brief, I'd like it to have a
>> lisp-like syntax, and I think I've thought of a minimally verbose way
>> to express deeply nested parentheses, by using three parenthetical
>> symbols (words) instead of two, with a bottom-up hierarchical markup.
>> I think that this avoids ever having to write/say two consecutive
>> parenthetical symbols, and also lets you pick a parenthesised
>> expression and build another one from it, without altering its inner
>> hierarchical markup. I can elaborate if someone is interested.

> Yes, please give some examples.

Ok, I'll use the following symbols for the sake of clarity
(in the actual language, reserved words would be used instead):

( : left parenthesis

) : right parenthesis

| : middle parenthesis, equivalent to ")("

] : marker for the end of the hierarchical number after each parenthesis.
Not required, but used here to enhance visual clarity.

Separator Levels:

Level 0: blank space
Level 1: unnumbered parentheses "(]", ")]", "|]"
Level 2: numbered parentheses "(2]", ")2]", "|2]"
...etc.

IN GENERAL:

If we build a parenthetical expression from a set of (already
reduced) expressions of levels "j1", "j2", ..., and jmax is
max(j1, j2, ...), then the level "i" of the expression we are
building is i = jmax + 1. A bare word counts as a level 0
expression, "(0] a )0]", and the level 0 separator is just the
blank space. After building the level "i" expression, we replace
every occurrence of ")n1] (n2]" inside the expression, where "n1"
and "n2" are two numbers lower than "i", with "|n3]", where
n3 = i-1. Then we replace every occurrence of "(i] (n4]" with "(i]"
and every occurrence of ")n5] )i]" with ")i]", where "n4" and "n5"
are two numbers lower than "i".

Examples:

Traditional notation: a
Proposed notation: "a" OR (0] a )0]

Traditional notation: (a b)
Proposed notation: (] a b )] OR (1] a b )1]

Traditional notation: (a (b c) d)
Proposed notation: (2] a |] b c |] d )2]

Traditional notation: (a (b c))
Proposed notation: (2] a |] b c )2]

Traditional notation: (a (b c) (d e))
Proposed notation: (2] a |] b c |] d e )2]

Traditional notation: (a (b c) d e)
Proposed notation: (2] a |] b c |] d |] e )2]

Traditional notation: ((a (b c) d e) (f g))
Proposed notation: (3] a |] b c |] d |] e |2] f g )3]

Traditional notation: ((a (b c) d e) f g)
Proposed notation: (3] a |] b c |] d |] e |2] f |2] g )3]

Notice that, when we build complex expressions from simpler ones, the
inner markup of the constituent expressions is preserved, and only
the outermost parentheses of each one of them are changed. This is a
form of encapsulation, which is one of my top-priority design
criteria.
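
For anyone who wants to experiment, the construction rule above can be
sketched in Python. This is only an illustration under a few
assumptions that are not part of the proposal itself: expressions are
non-empty nested Python lists of plain words, a bare word is treated
as the level 0 expression "(0] a )0]", the level 0 separator "|0]" is
printed as a blank space, and the names tag, is_sep, build and encode
are made up for the sketch.

```python
def tag(symbol, level):
    """Render a separator; level 1 is written without its number."""
    return f"{symbol}{'' if level == 1 else level}]"


def is_sep(token, symbol):
    """True if token is a separator of the given kind ("(", ")" or "|")."""
    return token.startswith(symbol) and token.endswith("]")


def build(expr):
    """Return (level, tokens) for expr, outer parentheses included."""
    if isinstance(expr, str):
        # Assumption: a bare word is the level 0 expression "(0] a )0]".
        return 0, ["(0]", expr, ")0]"]
    parts = [build(child) for child in expr]
    level = 1 + max(child_level for child_level, _ in parts)  # i = jmax + 1
    tokens = [tag("(", level)]
    for _, child_tokens in parts:
        tokens += child_tokens
    tokens.append(tag(")", level))

    # Step 1: every adjacent ")n1] (n2]" pair becomes "|n3]", n3 = i - 1.
    merged = []
    for token in tokens:
        if merged and is_sep(merged[-1], ")") and is_sep(token, "("):
            merged[-1] = tag("|", level - 1)
        else:
            merged.append(token)

    # Step 2: "(i] (n4]" becomes "(i]" and ")n5] )i]" becomes ")i]".
    if is_sep(merged[1], "("):
        del merged[1]
    if is_sep(merged[-2], ")"):
        del merged[-2]
    return level, merged


def encode(expr):
    """Proposed notation as a string; the level 0 separator is a blank."""
    level, tokens = build(expr)
    if level == 0:
        return expr
    return " ".join(t for t in tokens if t != "|0]")


print(encode(["a", ["b", "c"], "d"]))
print(encode([["a", ["b", "c"], "d", "e"], ["f", "g"]]))
```

Because build() only ever touches the outermost separators of each
constituent, the inner markup is carried over untouched, which is the
encapsulation property described above.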

>That means no other word can start with lia/lie/lio/liu right? So for
>example "liebre" could not be a word of the language?

That's right, they are reserved words.

> Short names can become very long: "Ann" would be "liekatannaka",
> goes from one to five syllables.

Yes, that's a drawback. I've thought of having yet another keyword
that lets you set the end-of-word keyword, which then stays in force
until you either change it or return to normal mode.

Let's say that "siei" is that keyword, and "siou" lets you come back
to normal mode. Then I could say:

Siei ka
lietannaka,..., lietalbionaka,..
Siei ku
lietannaku,..., lietalbionaku,..
Siou
liekatannaka,..., liekatalbionaka
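
A small Python sketch of how this mode switch might work. The word
shape used here, "lie" + end-of-word keyword + "t" + name + "a" +
end-of-word keyword, is only inferred from the examples in this
message and should be read as an assumption, as should the class and
method names.

```python
class NameWriter:
    """Toy model of the siei/siou mode switch for writing names."""

    def __init__(self):
        self.preset = None  # end-of-word keyword fixed by "siei", if any

    def siei(self, keyword):
        """Fix the end-of-word keyword until changed or cancelled."""
        self.preset = keyword

    def siou(self):
        """Return to normal mode."""
        self.preset = None

    def write(self, name, keyword="ka"):
        """Wrap a name; in normal mode the keyword is announced up front."""
        if self.preset is not None:
            # Keyword already known, so it is not repeated after "lie".
            return "lie" + "t" + name + "a" + self.preset
        return "lie" + keyword + "t" + name + "a" + keyword


writer = NameWriter()
writer.siei("ka")
print(writer.write("ann"), writer.write("albion"))
writer.siei("ku")
print(writer.write("ann"), writer.write("albion"))
writer.siou()
print(writer.write("ann"), writer.write("albion"))
```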

>Another approach that gives you shorter names in most cases might
>be something like this: Reserve two rare syllables for the start and
>the end of names, let's say they are written "qw" and "qy".

>For any name that does not contain "qy" (which will be the great
>majority) you just add "qw" at the beginning and "qy" at the end:
>qwalbionqy, qwannqy, etc.

>For a name that contains "qy" once, you start it with "qwqyqw"
>instead. So a hypothetical name "Aqybe" would be "qwqyqwAqybeqy".
>The name "Qy" would be "qwqyqwQyqy".

>If the name contains "qy" twice, you start it with "qwqyqyqw".

>If the name contains "qy" n times, you start it with "qw", n "qy"s
>and "qw".

>So a few rare names will be longer than with your system, but
>most names will be shorter. Uglier, but shorter. :)
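
The scheme quoted above can be sketched in Python as follows. This is
a written-text illustration only, with some assumptions of its own:
"qy" inside a name is matched regardless of case (so that "Qy" counts,
as in the example given), names are non-empty, and the function names
encode_name and decode_name are hypothetical.

```python
import re

START, END = "qw", "qy"
END_IN_NAME = re.compile(END, re.IGNORECASE)  # assumption: case-insensitive


def encode_name(name):
    """Wrap a name as "qw" [n "qy"s "qw"] name "qy", n = count of "qy"."""
    n = len(END_IN_NAME.findall(name))
    prefix = START + END * n + START if n else START
    return prefix + name + END


def decode_name(stream):
    """Read one wrapped name from the front of stream; return (name, rest)."""
    if not stream.startswith(START):
        raise ValueError("stream does not begin with the start marker")
    pos = len(START)
    n = 0
    while stream.startswith(END, pos):  # count the announced "qy"s
        n += 1
        pos += len(END)
    if n:
        if not stream.startswith(START, pos):
            raise ValueError("count prefix is not closed by the start marker")
        pos += len(START)
    # The name contains n "qy"s, so occurrence number n + 1 closes it.
    ends = list(END_IN_NAME.finditer(stream, pos))
    if len(ends) <= n:
        raise ValueError("closing marker not found")
    closing = ends[n]
    return stream[pos:closing.start()], stream[closing.end():]


for name in ("albion", "ann", "Aqybe", "Qy"):
    print(name, "->", encode_name(name))
```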

Granted, that's also a valid approach. My concerns with it are the
following:

1) I would have to decide which syllables are unusual, on average, in
foreign names. Since different languages have different phonological
systems, and I don't know the native language of potential speakers,
I don't know how to pick two syllables that are, at the same time,
unusual in every language and not very difficult to pronounce. Ease
of pronunciation is one of my goals.

2) I want my language to be "pause-free", that is, I want speakers to
be able to speak without making pauses or glottal stops, and still
emit a uniquely parseable stream of sounds.

Now, let's say we want to say "Iraq". In the method you described, it
would be "qwiraqqy". Two problems arise.

First, you would have to forbid, in your language, not just "qw" but
also every syllable that begins with "qw",
such as "qwi", and in general every combination of qw+vowels. 

Second, you would have to make a final pause. Otherwise, it may be
heard as "qwiraqy", rendering "ira" as the foreign word.
You would have the same problem at the beginning, with words like
"wall", and both at the beginning and the end, with words like "woq".
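
The mishearing can be illustrated with a toy model in Python. It is a
deliberately crude sketch: it assumes that, without a pause, a doubled
sound at the boundary between a marker and the name ("qw" + "w...",
"...q" + "qy") is heard as a single one, and that the listener ends
the name at the first "qy". The function names spoken and heard_name
are hypothetical.

```python
START, END = "qw", "qy"


def spoken(name):
    """Pause-free rendering of START + name + END under the toy model."""
    body = name.lower()
    if body.startswith(START[-1]):  # "qw" + "w..." blurs into "qw..."
        body = body[1:]
    if body.endswith(END[0]):       # "...q" + "qy" blurs into "...qy"
        body = body[:-1]
    return START + body + END


def heard_name(stream):
    """What a listener recovers: the text between START and the first END."""
    start = len(START)
    return stream[start:stream.find(END, start)]


for name in ("iraq", "wall", "woq", "albion"):
    print(name, "->", spoken(name), "->", heard_name(spoken(name)))
```

Under this model "iraq" comes back as "ira", "wall" as "all" and "woq"
as "o", while a name like "albion" survives intact.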

On the other hand, as a suggestion, you might consider reserving a
third word for long names where "qy" is present several times.
For instance, let's say you pick "qa" and the foreign name is
"begqyqyqyqyend". Then you would say "qwqa4qwbegqyqyqyqyendqy",
where 4 is the number of times "qy" occurs in the name.
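
As a rough Python sketch of this counted variant: the count is written
as a digit, as in the example, although in speech it would presumably
be a number word. Using the counted form whenever "qy" occurs at least
once is an assumption of the sketch (no threshold is fixed here), and
the function name encode_counted is hypothetical.

```python
def encode_counted(name, start="qw", end="qy", counter="qa"):
    """Wrap a name, announcing the number of inner "qy"s after "qa"."""
    n = name.count(end)
    if n == 0:
        # No "qy" inside the name: plain "qw" name "qy".
        return start + name + end
    return start + counter + str(n) + start + name + end


print(encode_counted("begqyqyqyqyend"))
print(encode_counted("albion"))
```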

>Saludos,
>Jorge

Saludos, compañero

Martin
