
Chicago Ted


"Friendship" is a magical-class noun.


On Transcribing the Albanian Language · 3:23am Nov 22nd, 2015

Until 22 November 1908, the Albanian language was written in a wide variety of alphabets. There was a Greek version, a Cyrillic version, an Arabic version, a few varying Latin versions-- and several home-grown scripts that never saw widespread use.

Which is why, on 14 November 1908, fifty delegates from the Albanian-speaking world gathered in Manastir (now Bitola, Macedonia) to decide on a single, standardized alphabet for the whole language. They made their final decision on 22 November, and the resulting 36-letter alphabet has gone unmodified ever since.

In this essay, I will present the various scripts that were used, compared to the modern alphabet-- and then, present my own revision thereof.


Quota fulfilled!

But first, a history lesson.

When the Phœnicians presented their abjad to the Greeks, the Greeks reinterpreted several consonantal letters as vowels (since abjads, by definition, do not have vowels), and lo, the world's first alphabet was created. Tosk Albanians (the ones who live south of the Shkumbin River), under the control of the Greek Orthodox Church, adapted it for their use. Any additional phonemes were represented either by digraphs or by adding a dot above the letter in question.

Meanwhile, north of the Shkumbin, the Gheg Albanians were under Roman (and, consequently, Catholic) control. They adopted the Latin alphabet, itself indirectly descended from the Greek alphabet. It should be pointed out, however, that digraphing and multiple spellings were rampant in this script, since Albanian contains more phonemes than Latin.

In fact, in a 1332 Latin manuscript, Brocardus Monacus (or Guillaume Adam-- the author's identity is still being debated) commented, "Licet Albānenses aliam omnīnō linguam ā latīna habeant et dīversam, tamen līteram latīnam habent in ūsū et in omnibus suīs librīs." (English: "The Albanians indeed have a language quite different from Latin; however, they use Latin letters in all their books.")

With the invasion (and subsequent annexation) by the Ottoman Empire, an Arabic abjad, modified into an alphabet, started spreading throughout Albania. Regardless of the dominant church (Roman Catholic or Greek Orthodox), this "elifbaja," as the script came to be known, was used to write Albanian on both sides of the Shkumbin.


The elifbaja. Not all letters are supported in Unicode.

A Cyrillic alphabet also existed, but I have been unable to locate a document written in it. I, therefore, shall make no further mention of this Cyrillic system.

The oldest surviving Albanian document is the Formula e Pagëzimit (Baptismal Formula), embedded in a letter otherwise written in Latin and dated 8 November 1462. At that time, Albania was being invaded by the Ottoman Empire, and the Formula was intended to be used if the family of a dying loved one couldn't make it to a church in the chaos. It was a single sentence: "Un'te paghesont' pr'emenit t'Atit e t'Birit e t'Spirit Senit." In modern Standard Albanian (which is based heavily on Tosk), this is rendered, "Unë të pagëzoj në emër të Atit, të Birit, e të Shpirtit të Shenjtë." (Note that some of the differences are dialectal rather than orthographic.) And in English, "I baptise thee in the name of the Father, the Son, and the Holy Spirit."


Excerpt of the letter in question. The Formula e Pagëzimit is underlined.

In the 19ᵗʰ century, there was a sort of renaissance of Albanian language and culture. And for scholar and interpreter Konstandin Kristoforidhi, this meant unifying the two main dialects, Gheg and Tosk, as called for in his Memorandum for the Albanian Language. Working from Constantinople (now İstanbul, Turkey), he created an alphabet, based on his translations of the New Testament into Gheg and Tosk Albanian. This is now known as the Istanbul Alphabet, referred to by the Albanians as the Stamboll Alphabet. It is as follows:

A a, B b, C c, Ç ç, D d, Ƃ δ, E ε, ♇ e, F f, G g, Γ γ, H h, I i, J j, K k, L l, Λ λ, M m, N n, И ŋ, O o, Π p, Q q, R r, Ρ ρ, S s, Ϲ σ, T t, Θ θ, U u, V v, X x, X̦ x̦, Y y, Z z, Z̧ z̧

As one can see, the Istanbul Alphabet has a perfect 1:1 phoneme-grapheme correspondence. Its disadvantages are threefold. It is optimized for the Tosk dialect, which served as a significant influence (Gheg vowels are more diverse, comparable to those of a Germanic language). Ten of its letters are improvised-- seven of which come from Greek (remember, Tosk was generally written in the Greek script). And some grapheme shapes are ambiguous-- C c and Ϲ σ are very close in appearance, as are Π p and Ρ ρ.

On the other hand, the Society for the Unity of the Albanian Language (Shoqnia e Bashkimit të Gjuhës Shqipe), with which Kristoforidhi had no affiliation, designed another alphabet, dubbed the Bashkimi Alphabet. Unlike the Istanbul Alphabet, the Bashkimi (Union) Alphabet was entirely Latin-based, without any new letters or Greek borrowings. In fact, it was optimized to be written using a French typewriter. It is as follows:

A a, B b, Ts ts, Ch ch, D d, Dh dh, É é, E e, F f, G g, Gh gh, H h, I i, J j, K k, L l, Ll ll, M m, N n, Gn gn, O o, P p, Q q, R r, Rr rr, S s, Sh sh, T t, Th th, U u, V v, Z z, Zh zh, Y y, X x, Xh xh

Note the number of digraphs the Bashkimi Alphabet uses-- there are 11 in total. In particular, Gn gn appears to be the result of Italian influence. É é is the only letter with a diacritic. And Z z and X x swap places around Y y compared to most other Latin alphabets.
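For anyone who would rather not count by hand, here is a minimal Python sketch that tallies the digraphs in the Bashkimi Alphabet exactly as listed above. The names `BASHKIMI` and `uppercase_forms` are purely illustrative, and the sketch assumes every entry is an uppercase-lowercase pair separated by a comma and a space.

```python
import unicodedata

# The Bashkimi Alphabet, copied from the listing above.
BASHKIMI = (
    "A a, B b, Ts ts, Ch ch, D d, Dh dh, É é, E e, F f, G g, Gh gh, H h, "
    "I i, J j, K k, L l, Ll ll, M m, N n, Gn gn, O o, P p, Q q, R r, Rr rr, "
    "S s, Sh sh, T t, Th th, U u, V v, Z z, Zh zh, Y y, X x, Xh xh"
)

def uppercase_forms(listing):
    """Return the uppercase form of each letter in an 'X x, Y y' listing."""
    # NFC makes sure an accented letter such as É is a single code point.
    listing = unicodedata.normalize("NFC", listing)
    return [pair.split()[0] for pair in listing.split(", ")]

letters = uppercase_forms(BASHKIMI)
# A digraph is any "letter" spelled with two characters.
digraphs = [letter for letter in letters if len(letter) == 2]

print(len(letters), "letters, of which", len(digraphs), "are digraphs")
print(digraphs)
```

Running the same function over the other alphabets in this essay is a quick way to compare how heavily each one leans on digraphs.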

Brothers Lazër and Ndre Mjeda left the Society to found another group, Agimi (The Dawn), which advocated another alphabet. Also entirely Latin, it was based more on Gaj's Alphabet, used for the various Serbo-Croatian dialects of the former Yugoslavia. This was the Agimi Alphabet. While I have been unable to locate the Agimi Alphabet, I believe it was something like this:

A a, B b, C c, Č č, D d, Đ đ, E e, Ë ë, F f, G g, Ǧ ǧ, H h, I i, J j, K k, L l, Ľ ľ, M m, N n, Ň ň, O o, P p, Q q, R r, Ř ř, S s, Š š, T t, Ŧ ŧ, U u, V v, X x, X̌ x̌, Y y, Z z, Ž ž

This combines the advantages of both the Istanbul and Bashkimi Alphabets-- 1:1 phoneme-grapheme correspondence, and no ambiguous graphemes.

In addition, there were several native scripts for Albanian. Like Greek and Armenian, Albanian constitutes its own branch in the Indo-European language family. Greek and Armenian also have their own scripts. Some individuals supposed that, given these, Albanian ought to have its own script as well, not based on Arabic, Cyrillic, Greek, Latin, or otherwise.

One of the earliest and best-known native scripts was the Elbasan Alphabet. Created in the village of Elbasan, it was used between there and nearby Berat. The Elbasan Gospel Manuscript is the only major document written in the Elbasan Alphabet. Numerals were Greek letters with a line above. At the time of writing, this is the only native script with Unicode support (U+10500-U+1052F).
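As a small illustration of what that Unicode range means in practice, the Python sketch below tests whether a character falls inside the Elbasan block. The function name `in_elbasan_block` is made up for this example, and whether `unicodedata.name` knows the character depends on the Unicode version bundled with your Python, hence the fallback string.

```python
import unicodedata

# Bounds of the Elbasan block, as quoted above.
ELBASAN_START = 0x10500
ELBASAN_END = 0x1052F

def in_elbasan_block(ch):
    """True if the single character ch lies within U+10500-U+1052F."""
    return ELBASAN_START <= ord(ch) <= ELBASAN_END

first = chr(ELBASAN_START)
# Older Unicode databases may not know the name, so supply a fallback.
print(in_elbasan_block(first), unicodedata.name(first, "<name unavailable>"))
print(in_elbasan_block("A"))
```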


A German-language chart of the Elbasan Alphabet.

Another proposition was the Vithkuqi Alphabet, developed by Naum Veqilharxhi. Named after his native village, it was designed to be as culturally neutral as possible.


The Vithkuqi Alphabet. Modern equivalents are given on the right.

Another script was the Todhri Alphabet. Also created in the village of Elbasan, by Todhri Haxhifilipi, it was adopted by Austrian consul Johann Georg von Hahn, who referred to it as the "original" Albanian alphabet. However, it was not ideal for writing Albanian. It bore no relation to the Elbasan Alphabet, and appears to be derived either from the Phœnician abjad or Roman handwriting.

Other scripts included one found only in the Codex of Berat (which appears to be influenced by Glagolitic), the Veso Bey alphabet (not found outside of Bey's correspondence with relatives), Jan Vellara's alphabet, and an undeciphered script in the Codex of Elbasan.

But with so many more alphabets in existence, there was just as much confusion as before-- instead of standardizing the language, they further destandardized it and hindered efforts at its unification.

This is when the Manastir Congress was held.

Fifty delegates from the Scutari, Janina, and Manastir Vilâyets gathered in Manastir on 14 November 1908 to finally discuss, and agree upon, a single standard script for their language. The chairman of the Congress was Gjergj Fishta, who would later write and publish the national epic of Albania, Lahuta e Malcís (The Highland Lute).

There, two points were agreed upon:

1. The alphabet should be Latin-based.
2. The alphabet should have as close to a 1:1 phoneme-grapheme correspondence as possible.

Given these points, three different alphabets were proposed and debated-- the Istanbul Alphabet, the Bashkimi Alphabet, and the Agimi Alphabet.

The Istanbul and Agimi Alphabets possessed a 1:1 phoneme-grapheme correspondence, as stated before. But the Bashkimi and Agimi were entirely Latin-based. And the Bashkimi was optimized for a French typewriter, so a newly-designed typewriter would not be required. It seemed impossible to achieve a single alphabet. Albanian would have to go on forever fractured by its own graphemes.

And yet, a compromise was reached. The Bashkimi Alphabet incorporated one letter from the Istanbul Alphabet-- Ç ç replaced Ch ch, since the former was nigh-universal on typewriters, given its use in Romance languages like French, Portuguese, and Catalan. It also switched around Z z and X x, putting it in line with the other two alphabets. The digraphs were kept, thus sacrificing the 1:1 correspondence, but three digraphs were revised, in accordance with the Agimi Alphabet-- C c replaced Ts ts, Nj nj replaced Gn gn, and Gj gj replaced Gh gh.

Gheg vowels presented another hindrance: none of the three alphabets made any accommodation whatsoever for Gheg's larger vowel system. For the first time, at Manastir, a standard for Gheg vowels was agreed upon. Besides the seven vowels of Tosk Albanian (A a, E e, Ë ë, I i, O o, U u, Y y), another four were adopted (À à, Ä ä, È è, Ò ò).

Most Gheg vowels also occurred long, and six occurred nasal as well. Long vowels were marked with an acute diacritic (Á á, É é, Í í, Ó ó, Ú ú, Ý ý). Nasals were marked with a tilde, inherited from Portuguese (Ã ã, Ẽ ẽ, Ĩ ĩ, Õ õ, Ũ ũ, Ỹ ỹ). In the case of À à, È è, and Ò ò, these were marked as long with a circumflex (Â â, Ê ê, Ô ô)-- again, from Portuguese. These three vowels did not occur nasally. And since neither Ä ä nor Ë ë occurred long or nasal, it was deemed acceptable for them to possess their umlauts.

The alphabet was finalized on 22 November.


Final decision of the Manastir Congress.

Even after Enver Hoxha's standardization of Albanian (by basing it on Tosk, to win support of the Tosk-speaking proletariat), the alphabet has held to this very day. This is the modern standard alphabet:

A a, B b, C c, Ç ç, D d, Dh dh, E e, Ë ë, F f, G g, Gj gj, H h, I i, J j, K k, L l, Ll ll, M m, N n, Nj nj, O o, P p, Q q, R r, Rr rr, S s, Sh sh, T t, Th th, U u, V v, X x, Xh xh, Y y, Z z, Zh zh

Now, I would like to present my own revision.

I praise the idea of a 1:1 phoneme-grapheme correspondence, detailed in the Istanbul and Agimi Alphabets. Why?

Take the digraph sh. In English, we have an instinct to pronounce it as one sound, as in the word "shoe." But, at times, if the digraph straddles a syllable break, it is pronounced as two sounds, as in the word "mishap."

Therefore I propose eliminating all digraphs from the Albanian language. But how to do this?

Bear in mind that the main reason for the modern design is that computers didn't exist in 1908. If something was to be written en masse, the only way to do this was to hire a printer. If something was to be written neatly and regularly, but only one or two copies were needed, a typewriter was a more economical means of doing this.

While printing is still a fundamental part of our society, today typewriters are considered obsolete. In their stead we have computers-- with various fonts that one can change on a whim, the change taking effect instantly. And with computers we have Unicode-- a single, standardized way of representing all the world's languages. A sort of modern-day Manastir Congress, if you will.

So, using only characters that are pre-composed in Unicode, I have taken the liberty of revising the Albanian alphabet. By no means consider it official-- all Albanian-language-related grammar, spelling, and so forth are regulated by the Social Sciences and Albanological Section of the Academy of Sciences of Albania (Akademia e Shkencave e Shqipërisë).

This is my alphabet:

A a, B b, C c, Ċ ċ, D d, Ḑ ḑ, E e, Ę ę, F f, G g, Ġ ġ, H h, I i, J j, K k, L l, Ł ł, M m, N n, Ṅ ṅ, O o, P p, Q q, R r, Ŕ ŕ, S s, Ṡ ṡ, T t, Ț ț, U u, V v, X x, Ẋ ẋ, Y y, Z z, Ż ż
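The "pre-composed" requirement can be checked mechanically. The Python sketch below is only an illustration: it glues a base letter to a combining mark and asks whether Unicode's NFC normalization folds the pair into a single code point. The helper name `composes_to_one` and the `CANDIDATES` list are invented for this example, and it assumes that Ḑ ḑ here is the code point Unicode calls D with cedilla (Unicode has no pre-composed D with comma below). Ł ł is left out because it is a single code point with no base-plus-mark decomposition at all.

```python
import unicodedata

# Combining marks used by the new letters.
DOT_ABOVE = "\u0307"
CEDILLA = "\u0327"
OGONEK = "\u0328"
ACUTE = "\u0301"
COMMA_BELOW = "\u0326"

def composes_to_one(base, mark):
    """True if base + combining mark normalizes (NFC) to one code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

# (base letter, combining mark) for each new letter except Ł.
CANDIDATES = [
    ("C", DOT_ABOVE), ("D", CEDILLA), ("E", OGONEK), ("G", DOT_ABOVE),
    ("N", DOT_ABOVE), ("R", ACUTE), ("S", DOT_ABOVE), ("T", COMMA_BELOW),
    ("X", DOT_ABOVE), ("Z", DOT_ABOVE),
]

for base, mark in CANDIDATES:
    for form in (base, base.lower()):
        composed = unicodedata.normalize("NFC", form + mark)
        print(composed, composes_to_one(form, mark))

# For contrast: K with a dot above stays as two code points.
print("K + dot above:", composes_to_one("K", DOT_ABOVE))
```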

What have I replaced?

  • Ç ç was replaced with Ċ ċ;
  • Dh dh was replaced with Ḑ ḑ;
  • Ë ë was replaced with Ę ę;
  • Gj gj was replaced with Ġ ġ;
  • Ll ll was replaced with Ł ł;
  • Nj nj was replaced with Ṅ ṅ;
  • Rr rr was replaced with Ŕ ŕ;
  • Sh sh was replaced with Ṡ ṡ;
  • Th th was replaced with Ț ț;
  • Xh xh was replaced with Ẋ ẋ; and
  • Zh zh was replaced with Ż ż.
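The list above amounts to a substitution table, so it can be sketched as a short Python function. This is a rough illustration rather than anything official: the names `REPLACEMENTS` and `to_revised` are invented here, and the function simply assumes that every occurrence of a digraph's two letters is that digraph, which would mis-convert any word where the two letters are really separate sounds.

```python
import unicodedata

# Standard spelling -> revised letter, following the list above.
REPLACEMENTS = {
    "ç": "ċ", "dh": "ḑ", "ë": "ę", "gj": "ġ", "ll": "ł", "nj": "ṅ",
    "rr": "ŕ", "sh": "ṡ", "th": "ț", "xh": "ẋ", "zh": "ż",
}
# Normalize the table so keys and values are single pre-composed code points.
REPLACEMENTS = {
    unicodedata.normalize("NFC", old): unicodedata.normalize("NFC", new)
    for old, new in REPLACEMENTS.items()
}

def to_revised(text):
    """Convert standard Albanian spelling to the revised orthography."""
    text = unicodedata.normalize("NFC", text)
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        # Try the two-letter digraphs first, then the single letters.
        if len(pair) == 2 and pair.lower() in REPLACEMENTS:
            new = REPLACEMENTS[pair.lower()]
            out.append(new.upper() if pair[0].isupper() else new)
            i += 2
            continue
        ch = text[i]
        new = REPLACEMENTS.get(ch.lower())
        if new is None:
            out.append(ch)  # Letter is unchanged in the revision.
        else:
            out.append(new.upper() if ch.isupper() else new)
        i += 1
    return "".join(out)

print(to_revised("Natën e mirë, dhe paç fat."))
```

Trying the digraphs before the single letters keeps the function to one left-to-right pass, and a capitalized digraph such as Sh comes out as a single capital letter.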

Where do these letters come from?

Ċ ċ and Ġ ġ originate from the Maltese alphabet. They were also used, alongside Ṡ ṡ, to write Irish. (Irish nowadays digraphs with h.) Ę ę, Ł ł, and Ż ż are borrowed from Polish. Ŕ ŕ is used in Slovak. Ḑ ḑ and Ț ț are found in Romanian. (The former of the two is an old letter-- it was eliminated in a 1904 spelling reform, which replaced it with Z z, e.g. Bună ḑiua → Bună ziua, "Good day.") Ẋ ẋ is used in the Latin orthography of Chechen. Finally, Ṅ ṅ is used in transliterating Hindi, and in the Igbo and Emilian languages.

Why these specific changes?

In the case of Gj and Nj, I took the tittle off the j and placed it on the G and N. Then the rest of the letter (ȷ) was eliminated. This applied to most palatal, and then all post-alveolar, sounds.

Q q remained unchanged, and Ç ç was changed to Ċ ċ, both for æsthetic reasons. They both preserve the general grapheme-shapes that were used for a little over a century. In addition, K̇ k̇ is not pre-composed (and the dot goes above the caps-height line for the lowercase), and Ç ç would be a glaring exception to the other post-alveolar sounds: Ṡ ṡ, Ẋ ẋ, and Ż ż.

Why dots at all? Gaj's Alphabet, itself based on the Czech orthography, uses a downward-pointing diacritic known in Czech as the háček. I deem its use too partial to Albania's Slavic neighbors to the north, so I used dots as a way of avoiding the appearance of such. (Remember, Albanian constitutes its own branch in the Indo-European family tree.)

The dental fricatives Dh dh and Th th had commas placed below, since dots above indicate only palatal or post-alveolar sounds, and dental fricatives are too far forward.

Ll ll was changed to Ł ł because, in Polish, this used to represent a velarized (or "dark") L, [ɫ]. In Polish, it evolved into [w]; however, [ɫ] is still present in Albanian.

In a similar vein to long vowels in Gheg, Rr rr was changed to Ŕ ŕ.

Speaking of vowels, Ë ë was changed to Ę ę-- not as a nasal vowel, like in Polish, but as an E caudata. Developing separately from Polish, but with very similar appearances, it variably represented a, æ, or ea.

I left most of the Gheg vowels alone, but changed Ä ä to Å å. The reason for this is that the ring above originated in Old Norse as a letter o above A, whereas the umlaut originated as a letter e over the letter in question; the ring is much less misleading, since in Albanian the letter represents an open back rounded vowel [ɒ]-- about the mid-way point between a and o.

This is all. Nothing else has been changed. To give you an idea, take the following passage in Tosk Albanian:

Të gjithë njerëzit lindin të lirë dhe të barabartë në dinjitet dhe në të drejta. Ata kanë arsye dhe ndërgjegje dhe duhet të sillen ndaj njëri tjetrit me frymë vëllazërimi.

The same passage in Gheg Albanian:

Zhdo njeri kan le t'lir mê njãjit dinjitêt edhê dreta. Ata jan të pajisun mê mênjê edhê vet-dijê edhê duhën të veprôjnë ka njãni-tjetrin mê nji shpirt vllâznimit.

Under my new orthography, the Tosk version would read:

Tę ġițę ṅeręzit lindin tę lirę ḑe tę barabartę nę diṅitet ḑe nę tę drejta. Ata kanę arsye ḑe ndęrġeġe ḑe duhet tę siłen ndaj ṅęri tjetrit me frymę vęłazęrimi.

And the Gheg:

Żdo ṅeri kan le t'lir mê ṅãjit diṅitêt eḑê dreta. Ata jan tę pajisun mê mênjê eḑê vet-dijê eḑê duhęn tę veprôjnę ka ṅãni-tjetrin mê ṅi ṡpirt vłâznimit.

And of course, in English:

All human beings are born free and equal in dignity and rights. They are endowed with reason and conscience and should act towards one another in a spirit of brotherhood.

This is Article 1 of the Universal Declaration of Human Rights.

Natën e mirë, dhe paç fat. (Good night, and good luck.) (Natęn e mirę, ḑe paċ fat.)

Comments ( 1 )

Definitely going to read this later. I do love history lessons :raritystarry: For now, I'm just going to read a fic. :derpytongue2:
