The project of a deeply tagged parallel corpus of Middle Russian translations from Latin

Keywords: Middle Russian, Church Slavonic, Latin, translation, electronic corpora, syntactic alignment


Tagged parallel corpora are powerful tools for the analysis of natural language. For historical linguistics, whose characteristic limitation is the lack of living native speakers, corpora (paper or electronic collections of written texts) are the main source of linguistic information. Old and Middle Russian are well-documented languages, and a host of manuscripts in both varieties, including many that contain translations, are available for investigation. Nevertheless, no parallel translational corpus of Middle Russian exists to date. As a result, a number of important written sources containing information valuable for linguists, literary scholars and historians cannot be studied properly. This article provides a preliminary account of the project of a deeply tagged parallel corpus of Middle Russian translations from Latin. Such a corpus may prove useful for the formal description of the translation techniques of the time, which in turn may help to divide the anonymous texts of the period into groups based on their linguistic features. Such grouping may help with authorship attribution and, consequently, with placing each translation in its proper cultural context.

From the linguistic point of view, such a corpus could provide researchers with crucial information on the vocabulary, morphology and syntax of Middle Russian, with an emphasis on the argument structure of verbs, the use of borrowed lexical items and set expressions, and the professional skills of the translators of the period. The article outlines the key features of the prospective Middle Russian translational corpus, its possible initial contents, and its text standardization and annotation principles. It also gives the reasons for not using the theory-neutral syntactic apparatus characteristic of existing historical corpora of ancient Indo-European languages, such as TOROT or PROIEL, and explains how potential users of the corpus could benefit from our non-standard tagging principles.



Applied Linguistics