

    UNESCO reports on language as an intangible cultural heritage 

    Living languages are carriers of values and knowledge, and are very often used in the practice and transmission of intangible cultural heritage. The spoken word in the mother language is important in the enactment and transmission of virtually all intangible heritage, especially oral traditions and expressions, songs and most rituals. Speaking their mother tongue, bearers of specific traditions often use highly specialized sets of terms and expressions, which reveal the deep intrinsic oneness between mother tongue and intangible cultural heritage.

    People use their mother tongue idiomatically

    Indeed, people use their mother tongue idiomatically unless there is a particular reason not to do so, according to John Searle, one of the most influential 20th-century philosophers of language [1]. This oral wealth is preserved and becomes a subject of study through writing (a brilliant example being the Homeric epics). However, in the 21st century, of the 7,111 living (spoken) languages, only about half (3,995) have developed a writing system (see source1, source2, source3).

    Survival of languages

    The survival of those languages is currently supported by digital technologies, which presuppose a writing system as well as a set of electronic resources and specialized tools. A living spoken language is not only part of the cultural heritage; it is above all a day-to-day tool of the highest usability and of enormous symbolic and emotional value.


    Today, in the era of the Internet and social media, spoken language is being studied and processed as a subject of economic interest. It is, however, “wayward” [2], both for learners whose mother tongue it is not and for processing with technological tools, and it requires dedicated infrastructure, which currently attracts international research interest [3,4].


    In view of this situation, PHILOTIS adopts a general framework of excellence and aims to create an infrastructure, a methodology and tools for recording and analyzing living spoken languages, with emphasis on idiomatic expressions. The infrastructure is characterized by state-of-the-art approaches and technologies for content analysis, and the new tools and services will be combined into an integrated, sophisticated digital resource platform. The project includes:

    • A novel methodology and best-practice guide for the recording of living spoken languages
    • State-of-the-art tools and services for source collection, vocabulary development, multiword expression recognition and digital rights management
    • A complete web resource/portal dedicated to the study of living spoken languages
    • A technological infrastructure (hardware and software) for the capturing and processing of various media relating to spoken languages
    • A case study regarding a language with minimal resources and no writing system, the Pomak language
    • A case study regarding a language with limited resources and a writing system, the Greek language


    1. Searle, J. (1975). Indirect speech acts. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics. Speech acts (pp. 59–82). New York: Academic Press.
    2. Sag, I. A., Baldwin, T., Bond, F., Copestake, A., & Flickinger, D. (2002, February). Multiword expressions: A pain in the neck for NLP. In International Conference on Intelligent Text Processing and Computational Linguistics (pp. 1–15). Springer, Berlin, Heidelberg.
    3. Klyueva, N., Doucet, A., & Straka, M. (2017, April). Neural networks for multi-word expression detection. In Proceedings of the 13th Workshop on Multiword Expressions (MWE 2017) (pp. 60–65).
    4. Gharbieh, W., Bhavsar, V., & Cook, P. (2017, August). Deep learning models for multiword expression identification. In Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017) (pp. 54–64).
