Last update: October 24, 2017
Language is the human capacity for acquiring and using complex systems of communication, and a language is any specific example of such a system. The scientific study of language is called linguistics.
In the context of a text-to-speech (TTS) and automatic speech recognition (ASR) project, I assembled the following information about the French, German, English, Luxembourgish and Esperanto languages.
French is a Romance language spoken worldwide by 340 million people. Written French uses the 26 letters of the Latin script, four diacritics that appear on vowels (circumflex accent, acute accent, grave accent, diaeresis) and the cedilla, which appears in ç. There are two ligatures, œ and æ. The French language is regulated by the Académie française. The language codes are fr (ISO 639-1), fre and fra (ISO 639-2), and fra (ISO 639-3).
Spoken French distinguishes 26 vowels, plus 8 for Quebec French. There are 23 consonants. The Grand Robert lists about 100,000 French words.
German is a West Germanic language spoken by 120 million people. In addition to the 26 standard Latin letters, German has three vowels with umlauts (ä, ö, ü) and the letter ß, called Eszett. German is the most widely spoken native language in the European Union. The German language is regulated by the Rat für deutsche Rechtschreibung. The language codes are de (ISO 639-1), ger and deu (ISO 639-2), and 22 variants in ISO 639-3.
Spoken German uses 29 vowels and 27 consonants. The 2013 release of the Duden lists about 140,000 German words.
English is a West Germanic language spoken by more than a billion people. It is an official language of almost 60 sovereign states and the third-most-common native language in the world. Written English uses the 26 letters of the Latin script, with rare optional ligatures in words derived from Latin or Greek. There is no regulatory body for the English language. The language codes are en (ISO 639-1) and eng (ISO 639-2 and ISO 639-3).
Spoken English distinguishes 25 vowels and 34 consonants, including the variants used in the United Kingdom and the United States. The Oxford English Dictionary lists more than 250,000 distinct words, not including many technical, scientific, and slang terms.
Luxembourgish (Lëtzebuergesch) is a Moselle Franconian variety of West Central German that is spoken mainly in Luxembourg by about 400,000 native speakers. The Luxembourgish alphabet consists of the 26 Latin letters plus three letters with diacritics: é, ä, and ë. In loanwords from French and German, the original diacritics are usually preserved. The Luxembourgish language is regulated by the Conseil Permanent de la Langue Luxembourgeoise (CPLL). The language codes are lb (ISO 639-1) and ltz (ISO 639-2 and ISO 639-3).
Spoken Luxembourgish uses 22 vowels (14 monophthongs, 8 diphthongs) and 26 consonants. The Luxembourgish-French dictionary dico.lu includes about 50,000 words; the Luxembourgish-German dictionary luxdico lists about 26,000 words. The full online Luxembourgish dictionary www.lod.lu is under construction; at present, words beginning with A to S can be accessed via the search engine.
Esperanto is a constructed international auxiliary language. Between 100,000 and 2,000,000 people worldwide fluently or actively speak Esperanto. Esperanto was recognized by UNESCO in 1954, and Google Translate added it in 2012 as its 64th language. The 28-letter Esperanto alphabet is based on the Latin script and follows a one-sound-one-letter principle. It includes six letters with diacritics: ĉ, ĝ, ĥ, ĵ, ŝ (with circumflex), and ŭ (with breve). The alphabet does not include the letters q, w, x, or y, which are only used when writing unassimilated foreign terms or proper names. The language is regulated by the Akademio de Esperanto. The language codes are eo (ISO 639-1) and epo (ISO 639-2 and ISO 639-3).
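For a TTS or ASR front end, the alphabet described above can be turned into a simple check that flags letters outside the Esperanto alphabet, for example in unassimilated foreign terms. The following is only a minimal sketch; the names ESPERANTO_ALPHABET and foreign_letters are illustrative assumptions and do not come from any existing library.

```python
import string
import unicodedata

# Basic Latin letters that the Esperanto alphabet does not include.
EXCLUDED_LETTERS = set("qwxy")

# The six Esperanto letters with diacritics, normalised to precomposed form.
DIACRITIC_LETTERS = set(unicodedata.normalize("NFC", "ĉĝĥĵŝŭ"))

# 26 basic Latin letters, minus 4 excluded, plus 6 with diacritics = 28.
ESPERANTO_ALPHABET = (set(string.ascii_lowercase) - EXCLUDED_LETTERS) | DIACRITIC_LETTERS


def foreign_letters(word: str) -> list:
    """Return the sorted letters of `word` that are not in the Esperanto alphabet."""
    # NFC joins a base letter and a combining accent into a single character.
    normalised = unicodedata.normalize("NFC", word).lower()
    return sorted({ch for ch in normalised if ch.isalpha() and ch not in ESPERANTO_ALPHABET})
```

A word such as "ĉevalo" yields an empty list, while a proper name such as "Washington" is reported as containing the non-Esperanto letter w.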
Esperanto has 5 vowels, 23 consonants and 2 semivowels that combine with the vowels to form 6 diphthongs. The core vocabulary of Esperanto contains 900 roots which can be expanded into tens of thousands of words using prefixes, suffixes, and compounding.
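The language codes given for the five languages can be collected in a small lookup table, which is convenient when a TTS or ASR tool has to select a language from a code. This is a sketch under stated assumptions: the names LanguageCodes, LANGUAGES and find_language are hypothetical, and only the ISO 639-1 and ISO 639-2 codes listed in the text are included.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LanguageCodes:
    name: str
    iso639_1: str              # two-letter code
    iso639_2: Tuple[str, ...]  # one or two three-letter codes


# Codes exactly as listed in the language descriptions above.
LANGUAGES = (
    LanguageCodes("French", "fr", ("fre", "fra")),
    LanguageCodes("German", "de", ("ger", "deu")),
    LanguageCodes("English", "en", ("eng",)),
    LanguageCodes("Luxembourgish", "lb", ("ltz",)),
    LanguageCodes("Esperanto", "eo", ("epo",)),
)


def find_language(code: str) -> Optional[LanguageCodes]:
    """Look up a language by ISO 639-1 or ISO 639-2 code, ignoring case."""
    wanted = code.strip().lower()
    for lang in LANGUAGES:
        if wanted == lang.iso639_1 or wanted in lang.iso639_2:
            return lang
    return None
```

For instance, find_language("ltz") returns the Luxembourgish entry, and an unknown code returns None.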
A list of links to websites with additional information about the five languages (mainly Luxembourgish) follows:
- Automatic phonetic transcription of Luxembourgish, by Peter Gilles
- Phonetic Online Tool: Sequitur g2p, Gramophone
- PhonLaF : Phonetic Online Material for Luxembourgish as a Foreign Language
- Luxembourgish, by Peter Gilles and Jürgen Trouvain
- MaryLux: Erste Sprachsynthese des Luxemburgischen
- 6000 Wierder op Lëtzebuergesch
- Infolux – Fuerschungsportal iwwert d’Lëtzebuergesch
- Luxembourgish – Omniglot
- Petite grammaire luxembourgeoise, par Francis André-Cartigny
- Richteg Lëtzebuergesch schreiwen, Athénée de Luxembourg
- Lexilogos, dictionnaire luxembourgeois
- Luxogramm, uni.lu
- Introduction à l’orthographe luxembourgeoise, par François Schanen et Jérôme Lulling
- Translating-IT, Pascal Zotto
- Luxembourgish alphabet and pronunciation, by VetoVeto
- Luxembourgeois: phonologie et orthographe, par Johan Viroux
- Lëtzebuergesch : eis Schreifweis, Wikipedia
- Category:Luxembourgish language, Wiktionary
- Wéi ee Lëtzebuergesch schreiwt, Arrêté ministériel du 10 octobre 1975
- Orthographe luxembourgeoise, Règlement grand-ducal du 30 juillet 1999
- D’Lëtzebuerger Schreifweis, par François Schanen
- Phonological domains in Luxembourgish, by Peter Gilles
- Lëtzebuergesch …Ei wéi flott !, vum Jean-Marie Nau
- Das luxemburgischsprachige Oeuvre von Michel Rodange (1827-1876) – Editionsphilologische und korpuslinguistische Analyse, von Joshgun Sirajzade
- Multilingual Speech Processing, edited by Tanja Schultz and Katrin Kirchhoff