Language : fr, de, en, lb, eo

Last update : October 24, 2017

Language is the human capacity for acquiring and using complex systems of communication, and a language is any specific example of such a system. The scientific study of language is called linguistics.

In the context of a text-to-speech (TTS) and automatic speech recognition (ASR) project, I assembled the following information about the French, German, English, Luxembourgish and Esperanto languages.

French

French is a Romance language spoken worldwide by 340 million people. Written French uses the 26 letters of the Latin script, four diacritics appearing on vowels (circumflex accent, acute accent, grave accent, diaeresis) and the cedilla appearing in ç. There are two ligatures, œ and æ. The French language is regulated by the Académie française. The language codes are fr (ISO 639-1), fre, fra (ISO 639-2) and fra (ISO 639-3).

Spoken French distinguishes 26 vowels, plus 8 for Quebec French. There are 23 consonants. The Grand Robert lists about 100,000 French words.

German

German is a West Germanic language spoken by 120 million people. In addition to the 26 standard Latin letters, German has three vowels with umlauts (ä, ö, ü) and the letter ß, called Eszett. German is the most widely spoken native language in the European Union. The German language is regulated by the Rat für deutsche Rechtschreibung. The language codes are de (ISO 639-1), ger, deu (ISO 639-2) and 22 variants in ISO 639-3.

Spoken German uses 29 vowels and 27 consonants. The 2013 release of the Duden lists about 140,000 German words.

English

English is a West Germanic language spoken by more than a billion people. It is an official language of almost 60 sovereign states and the third-most-common native language in the world. Written English uses the 26 letters of the Latin script, with rare optional ligatures in words derived from Latin or Greek. There is no regulatory body for the English language. The language codes are en (ISO 639-1) and eng (ISO 639-2 and ISO 639-3).

Spoken English distinguishes 25 vowels and 34 consonants, including the variants used in the United Kingdom and the United States. The Oxford English Dictionary lists more than 250,000 distinct words, not including many technical, scientific, and slang terms.

Luxembourgish

Luxembourgish (Lëtzebuergesch) is a Moselle Franconian variety of West Central German that is spoken mainly in Luxembourg by about 400,000 native speakers. The Luxembourgish alphabet consists of the 26 Latin letters plus three letters with diacritics: é, ä, and ë. In loanwords from French and German, the original diacritics are usually preserved. The Luxembourgish language is regulated by the Conseil Permanent de la Langue Luxembourgeoise (CPLL). The language codes are lb (ISO 639-1) and ltz (ISO 639-2 and ISO 639-3).

Spoken Luxembourgish uses 22 vowels (14 monophthongs, 8 diphthongs) and 26 consonants. The Luxembourgish-French dictionary dico.lu includes about 50,000 words; the Luxembourgish-German dictionary luxdico lists about 26,000 words. The full online Luxembourgish dictionary www.lod.lu is under construction; at present, words beginning with A to S can be accessed via the search engine.

Esperanto

Esperanto is a constructed international auxiliary language. Between 100,000 and 2,000,000 people worldwide fluently or actively speak Esperanto. Esperanto was recognized by UNESCO in 1954, and Google Translate added it in 2012 as its 64th language. The 28-letter Esperanto alphabet is based on the Latin script, using a one-sound-one-letter principle. It includes six letters with diacritics: ĉ, ĝ, ĥ, ĵ, ŝ (with circumflex), and ŭ (with breve). The alphabet does not include the letters q, w, x, or y, which are only used when writing unassimilated foreign terms or proper names. The language is regulated by the Akademio de Esperanto. The language codes are eo (ISO 639-1) and epo (ISO 639-2 and ISO 639-3).
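For text normalization in a TTS front end, the alphabet described above can be used to flag characters that fall outside Esperanto orthography. The following is only a minimal sketch: the names ESPERANTO_ALPHABET and non_esperanto_letters are hypothetical and chosen for illustration, and the input is assumed to be Unicode text.

```python
import unicodedata

# The 28 letters of the Esperanto alphabet (no q, w, x, y).
ESPERANTO_ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"


def non_esperanto_letters(text):
    """Return the set of letters in `text` that are not Esperanto letters.

    The text is NFC-normalized first, so that a base letter followed by a
    combining circumflex or breve is treated like the precomposed letter.
    Digits, spaces and punctuation are ignored.
    """
    normalized = unicodedata.normalize("NFC", text).lower()
    return {ch for ch in normalized
            if ch.isalpha() and ch not in ESPERANTO_ALPHABET}


if __name__ == "__main__":
    print(len(ESPERANTO_ALPHABET))
    print(non_esperanto_letters("Ĉu vi parolas Esperanton?"))
    print(non_esperanto_letters("New York"))
```

A non-empty result suggests a foreign term or proper name, which a TTS system would probably have to handle with separate pronunciation rules.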

Esperanto has 5 vowels, 23 consonants and 2 semivowels that combine with the vowels to form 6 diphthongs. The core vocabulary of Esperanto contains 900 roots which can be expanded into tens of thousands of words using prefixes, suffixes, and compounding.
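The language codes given for the five languages can be gathered in a small lookup table, which is convenient when TTS and ASR tools expect different ISO 639 parts. This is a sketch under assumptions: the names LanguageCodes, LANGUAGES and to_iso639_3 are hypothetical, and the single ISO 639-3 code deu for German is my choice of the Standard German entry among the variants mentioned above.

```python
from typing import NamedTuple, Optional, Tuple


class LanguageCodes(NamedTuple):
    name: str
    iso639_1: str
    iso639_2: Tuple[str, ...]  # one or two codes (bibliographic, terminologic)
    iso639_3: str


LANGUAGES = {
    "fr": LanguageCodes("French", "fr", ("fre", "fra"), "fra"),
    # Assumption: "deu" (Standard German) picked among the ISO 639-3 variants.
    "de": LanguageCodes("German", "de", ("ger", "deu"), "deu"),
    "en": LanguageCodes("English", "en", ("eng",), "eng"),
    "lb": LanguageCodes("Luxembourgish", "lb", ("ltz",), "ltz"),
    "eo": LanguageCodes("Esperanto", "eo", ("epo",), "epo"),
}


def to_iso639_3(code: str) -> Optional[str]:
    """Map any ISO 639-1, 639-2 or 639-3 code in the table to ISO 639-3."""
    code = code.strip().lower()
    for entry in LANGUAGES.values():
        if code == entry.iso639_1 or code in entry.iso639_2 or code == entry.iso639_3:
            return entry.iso639_3
    return None  # unknown code


if __name__ == "__main__":
    for key in ("fr", "ger", "ltz", "xx"):
        print(key, "->", to_iso639_3(key))
```

Returning None for an unknown code, rather than raising an error, leaves the decision about unsupported languages to the caller.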

Links

A list of links to websites with additional information about the five languages (mainly Luxembourgish) follows: