On 2009-01-31 Eric Christopherson wrote:
> On Jan 30, 2009, at 4:57 PM, old_astrologer wrote:
>
> > PS What encoding are you using? My browser can't work it
> > out, and I've tried half a dozen manually, with no luck.
>
> The accented characters came out as gibberish here too
> (I'm using OS X Mail), but it looked like UTF-8. However,
> David, yours also look like gibberish.

I think the Yahoo server double-converts posts, i.e. it tries to
convert what is already UTF-8 to UTF-8 again, since when I look at
my own post in ISO-8859-15, _cio'_ looks like ciÃ², which is what double-converted
UTF-8 typically looks like.

Unfortunately I know of no easy way to rectify this. In the old
days some philological mailing lists used an ASCII transliteration
using various punctuation marks to simulate diacritical marks.
Clearly we seem to need something similar:

Diacritic           HTML                ASCII
------------------  ------------------  -------------------
acute               á                   /a
grave               à                   \a
trema/diaeresis     ä                   :a
tilde               ñ                   ~n
circumflex          â                   <a
caron/hacek         č                   >c
macron              ā                   |a
breve               ă                   )a
tie                 a͡e                  (ae
dot above           ċ                   .c
dot below           ẹ                   !e
cedilla             ç                   ,c
ogonek/hook         ę                   ;e
superscript         c<sup>i</sup>       c^i
subscript           e<sub>1</sub>       e_1

This should cover most of our needs in terms of what we meet in our
sources. The benefit of putting the pseudo-diacritics in front of
the letters is that the punctuation characters seldom appear in that
position in normal text. However, there is still potential ambiguity,
with the characters /()_ in particular. I suggest enclosing words
containing pseudo-diacriticized letters in curly braces when there
is a risk of ambiguity, e.g. {n/ee}.

An added benefit is that, unlike with UTF-8, these can be freely
combined with arbitrary letters and with each other without worrying
about font and Unicode support. I've actually seen things like {/>g}
and {/)|;e} in old Romanist publications!

I've uploaded an HTML version of this post to the files section, and
hope to put up some kind of converter form on my site.

/BP 8^)>
--
Benct Philip Jonsson -- melroch atte melroch dotte se
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"C'est en vain que nos Josués littéraires crient
à la langue de s'arrêter; les langues ni le soleil
ne s'arrêtent plus. Le jour où elles se *fixent*,
c'est qu'elles meurent." (Victor Hugo)
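
For anyone who wants to experiment before a converter form exists, here is a rough Python sketch of how the pseudo-diacritic table in this post could be turned into Unicode. It is only an illustration, not the converter mentioned in the post. Several things are my own assumptions: the names `COMBINING`, `convert_word` and `to_unicode` are made up; only words in curly braces are converted and the braces are dropped; stacked marks are applied starting with the one nearest the letter; and `c^i` / `e_1` are left untouched, since plain text has no general superscript or subscript.

```python
import re
import unicodedata

# ASCII pseudo-diacritic -> Unicode combining mark, following the table in the post.
COMBINING = {
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    ":": "\u0308",   # trema/diaeresis
    "~": "\u0303",   # tilde
    "<": "\u0302",   # circumflex
    ">": "\u030C",   # caron/hacek
    "|": "\u0304",   # macron
    ")": "\u0306",   # breve
    ".": "\u0307",   # dot above
    "!": "\u0323",   # dot below
    ",": "\u0327",   # cedilla
    ";": "\u0328",   # ogonek/hook
}
TIE = "\u0361"  # combining double inverted breve, placed between the two tied letters

BRACED = re.compile(r"\{([^{}]*)\}")


def convert_word(word):
    """Convert the contents of one {...} group (hypothetical helper)."""
    out = []
    pending = []  # ASCII marks seen but not yet attached to a letter
    i = 0
    while i < len(word):
        ch = word[i]
        pair = word[i + 1:i + 3]
        if ch == "(" and len(pair) == 2 and pair.isalpha():
            # Tie: "(ae" becomes a + TIE + e; stray pending marks stay literal.
            out.append("".join(pending) + pair[0] + TIE + pair[1])
            pending.clear()
            i += 3
        elif ch in COMBINING:
            pending.append(ch)
            i += 1
        elif ch.isalpha():
            # Assumption: the mark written nearest the letter is applied first.
            marks = "".join(COMBINING[m] for m in reversed(pending))
            out.append(ch + marks)
            pending.clear()
            i += 1
        else:
            # Marks not followed by a letter are kept as ordinary punctuation.
            out.append("".join(pending) + ch)
            pending.clear()
            i += 1
    out.append("".join(pending))
    # NFC composes base + mark into a precomposed character where one exists.
    return unicodedata.normalize("NFC", "".join(out))


def to_unicode(text):
    """Convert every braced word in text; everything else is left alone."""
    return BRACED.sub(lambda m: convert_word(m.group(1)), text)


if __name__ == "__main__":
    print(to_unicode("{n/ee} {>c} {(ae} {/>g} {/)|;e}"))
```

Restricting conversion to braced words sidesteps the ambiguity with /()_ in running text, at the cost of requiring braces everywhere. Whether the stacked combinations display properly still depends on the reader's font.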