Re: Self-segmenting words & the treatment of names



 
--- In engelang@yahoogroups.com, Rick Morneau <ram@...> wrote:
>
> If you're trying to quote foreign words in a written text, just use
> special symbols that are reserved for this purpose, such as angle
> brackets or square brackets or even a combination of symbols.
> 
> If you're talking to a human, you don't need to worry about
> self-segregation.
> 
> However, if you're talking to a computer, how is the computer going to
> deal with the speech that appears between the spoken brackets?  Speech
> recognizers are designed to understand only one language.  The computer
> can, of course, simply record the foreign expression so that it can play
> it back later, but it won't be able to transcribe it for further
> processing.
> 
> 
> Regards,
> 
> Rick Morneau
> http://www.eskimo.com/~ram
>


Hi,

Quoting written text, without having to pronounce it, is easy, even if
it includes brackets or other arbitrary symbols. One possible method,
where the key symbol is any symbol that does not occur in the quoted
text, is:

[double quote][key symbol][text][key symbol]

For instance, let's say I want to quote the following:

"  The symbol ' " ' is called "double quote" "

The quotation method could be:

"# The symbol ' " ' is called "double quote" #



Another example:

" you're the #1! "

Quotation:

"$ you're the #1! $
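
The written method above can be sketched in a few lines of Python.
This is only an illustration: the function names and the list of
candidate key symbols are my own assumptions, and the only rule taken
from the examples is that the key symbol must not occur in the quoted
text.

```python
# Candidate key symbols, tried in order (an assumed list, not a fixed convention).
KEY_CANDIDATES = "#$%&@"


def quote(text, candidates=KEY_CANDIDATES):
    """Wrap text as [double quote][key symbol][text][key symbol]."""
    for key in candidates:
        if key not in text:
            return '"' + key + text + key
    raise ValueError("every candidate key symbol occurs in the text")


def unquote(quoted):
    """Return (quoted text, remainder after the closing key symbol)."""
    if len(quoted) < 3 or quoted[0] != '"':
        raise ValueError("not a key-symbol quotation")
    key = quoted[1]              # the symbol right after the double quote
    end = quoted.find(key, 2)    # its next occurrence closes the quotation
    if end == -1:
        raise ValueError("missing closing key symbol")
    return quoted[2:end], quoted[end + 1:]


print(quote(" you're the #1! "))
```

Since the reader learns the key symbol from the opening pair, the
quoted text itself may contain double quotes, brackets, or any other
symbol except the chosen key.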

The problem is spoken language, especially if one tries to keep it
concise. I find it unacceptable to require pauses, since they are the
first thing to vanish when one is not making an effort to speak slowly
(not to mention when one is in a hurry). Besides, a logical language
should be self-parsing, even if it is for human use, for several
reasons.

Native speakers tend to underestimate the difficulty of parsing their
language, but some of us who are "hard-eared" greatly appreciate this
feature in any language which aspires to be used as an auxiliary
language in some context (as is the case with most engelangs, I'd say).

Besides, while in most cases the context helps to decide how to parse
speech, there's always some occasion where two interpretations are
equally valid, and I don't think a logical language should take this
easily avoidable risk. After all, few people argue for the elimination
of spaces in written text, even if it's for human use.

The need for self-segregation is especially strong in the language I'm
designing, because of its high semantic density: every string of legal
phonemes that begins with a consonant and ends in a vowel parses into a
unique stream of words. Most (if not all) monosyllabic words have some
meaning, and many polysyllabic words are valid compound words, even
though they may not have an official meaning.

On how to treat the quoted spoken piece, a first draft of a policy
could be as follows:

1) Proper names that are intended to be actually used to refer to
people both in spoken and written text should have their phonology
adapted to my language. This process should go no further than
restricting the phonemes to those legal in my language. No change in
accentuation is required, and no phonemes need be destructively added
or removed. I haven't decided whether to respect tones in foreign
names, or how to mark them, but this should not be a problem, since my
language is not tonal. Optionally, the foreign written form can be
included in written text, parenthesised.

2) The real form of a foreign name could be specified by a particle
that tells the computer (or the listener) to parse the name using the
X-Sampa phoneme set, or something similar.

3) Another particle could indicate that there's no written convention
for the name or sound, and that it should be recorded if possible, and
declared as a non-writable sound in written text. This is not a problem,
since only case (1) should be used as a tagging system, so there's no
need for the computer to process the recorded sounds.
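
To make the three cases concrete, here is a rough Python sketch of how
a program might represent them and render each one in written text.
Everything named here is hypothetical: the class names, the
"<xsampa ...>" and "<non-writable sound>" markers, and the example
names are placeholders I made up, since the actual particles are not
decided yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdaptedName:
    """Case 1: name restricted to the language's own phonemes."""
    adapted: str
    foreign_written: Optional[str] = None  # optional original spelling


@dataclass
class XSampaName:
    """Case 2: real form of the name, given in X-SAMPA."""
    xsampa: str


@dataclass
class RecordedSound:
    """Case 3: no written convention; only a recording is kept."""
    recording_id: str


def render_written(item):
    """Render a foreign item in written text (placeholder markers)."""
    if isinstance(item, AdaptedName):
        if item.foreign_written:
            return f"{item.adapted} ({item.foreign_written})"
        return item.adapted
    if isinstance(item, XSampaName):
        return f"<xsampa {item.xsampa}>"
    if isinstance(item, RecordedSound):
        return "<non-writable sound>"
    raise TypeError("unknown kind of foreign item")


def usable_as_tag(item):
    """Only case 1 is meant to serve as a tag, so only it needs processing."""
    return isinstance(item, AdaptedName)
```

For example, render_written(AdaptedName("martin", "Martín")) gives
"martin (Martín)", while a RecordedSound is never inspected beyond
being declared non-writable.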



Regards,

            Martin Baldan