Attempto Controlled English (ACE) is a controlled natural language, i.e. a subset of standard English with a restricted syntax and restricted semantics described by a small set of construction and interpretation rules. It has been under development at the University of Zurich since 1995. In 2013, ACE version 6.7 was announced.
ACE can serve as knowledge representation, specification, and query language, and is intended for professionals who want to use formal notations and formal methods, but may not be familiar with them. Though ACE appears perfectly natural – it can be read and understood by any speaker of English – it is in fact a formal language.
ACE and its related tools have been used in the fields of software specifications, theorem proving, text summaries, ontologies, rules, querying, medical documentation and planning.
Here are some simple examples:
1. Every woman is a human.
2. A woman is a human.
3. A man tries-on a new tie. If the tie pleases his wife then the man buys it.
ACE construction rules require that each noun be introduced by a determiner (a, every, no, some, at least 5, ...). Regarding the list of examples above, ACE interpretation rules decide that (1) is interpreted as universally quantified, while (2) is interpreted as existentially quantified. Sentences like "Women are human" do not follow ACE syntax and are consequently not valid.
Interpretation rules resolve the anaphoric references in (3): the tie and it of the second sentence refer to a new tie of the first sentence, while his and the man of the second sentence refer to a man of the first sentence. Thus an ACE text is a coherent entity of anaphorically linked sentences.
The Attempto Parsing Engine (APE) translates ACE texts unambiguously into discourse representation structures (DRS) that use a variant of the language of first-order logic. A DRS can be further translated into other formal languages, for instance AceRules with various semantics, OWL, and SWRL. Translating an ACE text into (a fragment of) first-order logic allows users to reason about the text, for instance to verify, to validate, and to query it.
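As a rough illustration of the quantifier readings described above, the first two example sentences correspond to the first-order formulas below. This is only a simplified sketch: the predicate names woman and human are chosen here for illustration, and the formulas are ordinary first-order logic rather than the DRS notation that APE actually produces.

```latex
% Simplified first-order readings of the first two example sentences.
% The predicate names are illustrative assumptions, not APE output.
\begin{align*}
\text{Every woman is a human.} &\quad \forall x\,\bigl(\mathit{woman}(x) \rightarrow \mathit{human}(x)\bigr) \\
\text{A woman is a human.}     &\quad \exists x\,\bigl(\mathit{woman}(x) \wedge \mathit{human}(x)\bigr)
\end{align*}
```

The determiner alone selects the quantifier: "every" yields an implication under a universal quantifier, while "a" yields a conjunction under an existential one.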
- "Attempto Controlled English" | 2019-09-12 | 243 Upvotes 60 Comments
The Cherokee syllabary is a syllabary invented by Sequoyah in the late 1810s and early 1820s to write the Cherokee language. His creation of the syllabary is particularly noteworthy as he could not previously read any script. He first experimented with logograms, but his system later developed into a syllabary. In his system, each symbol represents a syllable rather than a single phoneme; the 85 (originally 86) characters provide a suitable method to write Cherokee. Although some symbols resemble Latin, Greek, and Cyrillic letters, they are not used to represent the same sounds.
- "Cherokee Syllabary" | 2019-06-30 | 56 Upvotes 12 Comments
Afrikaans is a daughter language of Dutch and—unlike Netherlands Dutch, Belgian Dutch and Surinamese Dutch—a separate standard language rather than a national variety. As an estimated 90 to 95% of Afrikaans vocabulary is ultimately of Dutch origin, there are few lexical differences between the two languages; however, Afrikaans has a considerably more regular morphology, grammar, and spelling.
- "Comparison of Afrikaans and Dutch" | 2019-02-03 | 11 Upvotes 1 Comments
The Dolgopolsky list is a word list compiled by Aharon Dolgopolsky in 1964. It lists the 15 lexical items that have the most semantic stability, i.e. they are the 15 words least likely to be replaced by other words as a language evolves. It was based on a study of 140 languages from across Eurasia.
The words, with the first being the most stable, are:
- you (singular, informal)
- nail (finger-nail)
The first item in the list, I/me, has been replaced in none of the 140 languages during their recorded history; the fifteenth, dead, has been replaced in 25% of the languages.
The twelfth item, louse/nit, is well kept in the North Caucasian languages, Dravidian and Turkic, but not in other proto-languages.
- "Dolgopolsky list" | 2015-01-28 | 42 Upvotes 13 Comments
Languages spoken in India belong to several language families, the major ones being the Indo-Aryan languages spoken by 78.05% of Indians and the Dravidian languages spoken by 19.64% of Indians. Languages spoken by the remaining 2.31% of the population belong to the Austroasiatic, Sino-Tibetan, Tai-Kadai and a few other minor language families and isolates. India (780) has the world's second highest number of languages, after Papua New Guinea (839).
Article 343 of the Indian constitution stated that the official language of the Union should become Hindi in Devanagari script instead of the extant English. Later, a constitutional amendment, The Official Languages Act, 1963, allowed for the continuation of English alongside Hindi in the Indian government indefinitely until legislation decides to change it. The form of numerals to be used for the official purposes of the Union is "the international form of Indian numerals", which are referred to as Arabic numerals in most English-speaking countries. Contrary to a common misconception, Hindi is not the national language of India. The Constitution of India does not give any language the status of national language.
The Eighth Schedule of the Indian Constitution lists 22 languages, which have been referred to as scheduled languages and given recognition, status and official encouragement. In addition, the Government of India has awarded the distinction of classical language to Kannada, Malayalam, Odia, Sanskrit, Tamil and Telugu. Classical language status is given to languages which have a rich heritage and independent nature.
According to the Census of India of 2001, India has 122 major languages and 1599 other languages. However, figures from other sources vary, primarily due to differences in definition of the terms "language" and "dialect". The 2001 Census recorded 30 languages which were spoken by more than a million native speakers and 122 which were spoken by more than 10,000 people. Two contact languages have played an important role in the history of India: Persian and English. Persian was the court language during the Mughal period in India. It reigned as an administrative language for several centuries until the era of British colonisation. English continues to be an important language in India. It is used in higher education and in some areas of the Indian government. Hindi, the most commonly spoken language in India today, serves as the lingua franca across much of North and Central India. Bengali is the second most spoken and understood language in the country, with a significant number of speakers in the Eastern and North-eastern regions. However, concerns have been raised about Hindi being imposed in South India, most notably in the states of Tamil Nadu and Karnataka. Maharashtra, West Bengal, Assam, Punjab and other non-Hindi regions have also started to voice concerns about Hindi.
- "Languages of India" | 2019-06-05 | 216 Upvotes 239 Comments
The language of the court and government of the Ottoman Empire was Ottoman Turkish, but many other languages were in contemporary use in parts of the empire. Although the minorities of the Ottoman Empire were free to use their language amongst themselves, if they needed to communicate with the government they had to use Ottoman Turkish.
The Ottomans had three influential languages: Turkish, spoken by the majority of the people in Anatolia and by the majority of Muslims of the Balkans except in Albania, Bosnia, and various Aegean Sea islands; Persian, initially used by the educated in northern portions of the Ottoman Empire before being displaced by Ottoman Turkish; and Arabic, used in southern portions of the Ottoman Empire and spoken mainly in Arabia, North Africa, Mesopotamia and the Levant. Throughout the vast Ottoman bureaucracy the official language was Ottoman Turkish, a version of Turkish with a large admixture of Arabic and Persian grammar and vocabulary.
Virtually all intellectual and literate pursuits were conducted in Turkish. Some ordinary people had to hire special "request-writers" (arzuhâlcis) to be able to communicate with the government. The ethnic groups (e.g., Jews, Greeks, Armenians) continued to speak their own languages within their families and neighborhoods (mahalles). In villages where two or more populations lived together, the inhabitants would often speak each other's language. In cosmopolitan cities, people often spoke their family languages, and many non-ethnic Turks spoke Turkish as a second language. Educated Ottoman Turks spoke Arabic and Persian, as these were the main foreign languages in the pre-Tanzimat era, with the former being used for science and the latter for literary affairs.
In the last two centuries, French and English emerged as popular languages, especially among the Christian Levantine communities. The elite learned French at school, and used European products as a fashion statement. The use of Ottoman Turkish for science and literature grew steadily under the Ottomans, while Persian declined in those functions. Ottoman Turkish, during the period, gained many loanwords from Arabic and Persian. Up to 88% of the vocabulary of a particular work would be borrowed from those two languages.
Linguistic groups were varied and overlapping. In the Balkan Peninsula, Slavic, Greek and Albanian speakers were the majority, but there were substantial minorities of Turks and Romance-speaking Vlachs. In most of Anatolia, Turkish was the majority language, but Greek, Armenian and, in the east and southeast, Kurdish were also spoken. In Syria, Iraq, Arabia, Egypt and North Africa, most of the population spoke varieties of Arabic with, above them, a Turkish-speaking elite. However, in no province of the Empire was there a unique language.
- "Languages of the Ottoman Empire" | 2020-02-22 | 87 Upvotes 18 Comments
Linguistic purism in English is the preference for using words of native origin rather than foreign-derived ones. "Native" can mean "Anglo-Saxon" or it can be widened to include all Germanic words. Linguistic purism in English primarily focuses on words of Latinate and Greek origin, due to their prominence in the English language and the belief that they may be difficult to understand. In its mildest form, it merely means using existing native words instead of foreign-derived ones (such as using begin instead of commence). In a less mild form, it also involves coining new words from Germanic roots (such as wordstock for vocabulary). In a more extreme form, it also involves reviving native words which are no longer widely used (such as ettle for intend). The resulting language is sometimes called Anglish (coined by the author and humorist Paul Jennings), or Roots English (referring to the idea that it is a "return to the roots" of English). The mild form is often advocated as part of Plain English, but the more extreme form has been and is still a fringe movement; the latter can also be undertaken as a form of constrained writing.
English linguistic purism is discussed by David Crystal in the Cambridge Encyclopedia of the English Language. The idea dates at least to the inkhorn term controversy of the 16th and 17th centuries. In the 19th century, writers such as Charles Dickens, Thomas Hardy and William Barnes advocated linguistic purism and tried to introduce words like birdlore for ornithology and bendsome for flexible. A notable supporter in the 20th century was George Orwell, who had a preference for plain Saxon words over complex Latin or Greek ones, and the idea continues to have advocates today.
- "Linguistic purism in English" | 2016-05-17 | 43 Upvotes 55 Comments
The following is a list of English-speaking population by country, including information on both native speakers and second-language speakers.
Some of the entries in this list are dependent territories (e.g.: U.S. Virgin Islands), autonomous regions (e.g.: Hong Kong) or associated states (e.g.: Cook Islands) of other countries, rather than being fully sovereign countries in their own right.
- "Today a greater percentage of Dutch people speak English than Canadians" | 2016-10-28 | 13 Upvotes 5 Comments
This is a list of extinct languages sorted by their time of extinction. A language is determined to be extinct when its last native or fluent speaker dies. When the exact time of death of the last remaining speaker is not known, either an approximate time or the date when the language was last recorded is given.
- "List of languages by time of extinction" | 2016-08-20 | 98 Upvotes 68 Comments