Research Areas


My linguistics training had a very strong historical component, with a focus on Slavic and wider Indo-European. I like experimental and computational approaches and get bored quickly when doing interpretative, close-reading work (which is my limit: I look up to traditional philologists).

The main areas I work on are:

My doctoral project looks (quantitatively, through treebank data) into the competition between finite and non-finite temporal subordinates in Early Slavic, and their position within the typology of when-clauses in 1400+ languages of the world.

Digital Humanities

It’s hard to set Digital Humanities apart from my main areas of research, and that’s often true for digital humanists at large: we tend to come to DH to answer questions in our own research areas, and may then find ourselves wondering about DH tools and techniques as such. The following are some of the areas I work in most frequently:

Some of my recent contributions include:

Open Scholarship in the Humanities

I am Editorial Assistant for the Journal of Open Humanities Data (JOHD) and Fellow at RROx, the Oxford ‘branch’ of the UK Reproducibility Network (UKRN).

I’m interested in the specific challenges faced by the Humanities in making research reproducible (also: I get a bit angry when I am not given the steps followed by another researcher to get to an interpretation or a result).

Some of my contributions to the discussion:

Selected past projects


In 2020 I spent two months as a Research Assistant at the ReadOxford research group, based at the Department of Experimental Psychology of the University of Oxford. The group investigates questions related to child literacy development. I mainly dealt with data processing, developing R scripts to make corpus data reproducible and analysable for morphological complexity and lexical variation. My major contribution was an R script that automatically calculates the Average Reduced Frequency (ARF) of combined lemmata/parts of speech in the Oxford Children Corpus and CHILDES treebank.
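For readers unfamiliar with ARF, here is a minimal Python sketch of the measure as it is usually defined: an item's frequency is discounted when its occurrences cluster together rather than spreading evenly through the corpus. This is not the R script mentioned above. The function name and the flat token-list input are illustrative assumptions, and the corpus is treated as cyclic, so the gap before the first occurrence wraps around from the last one.

```python
from collections import defaultdict


def average_reduced_frequency(tokens):
    """Return {item: ARF} for every distinct item in a token sequence.

    Items can be any hashable value, e.g. (lemma, part_of_speech) tuples.
    """
    n = len(tokens)

    # Record the corpus positions at which each item occurs.
    positions = defaultdict(list)
    for index, item in enumerate(tokens):
        positions[item].append(index)

    arf = {}
    for item, pos in positions.items():
        f = len(pos)   # raw frequency of the item
        v = n / f      # gap size if the item were spread perfectly evenly
        # Distances between consecutive occurrences; the first distance
        # wraps around from the last occurrence to the first (cyclic corpus).
        distances = [pos[0] + n - pos[-1]]
        distances += [pos[k] - pos[k - 1] for k in range(1, f)]
        # Each gap contributes at most v, so clustered items score lower.
        arf[item] = sum(min(d, v) for d in distances) / v
    return arf


if __name__ == "__main__":
    # Hypothetical toy corpus of (lemma, part-of-speech) pairs.
    corpus = [("dog", "NOUN"), ("run", "VERB"), ("dog", "NOUN"), ("run", "VERB")]
    for item, score in average_reduced_frequency(corpus).items():
        print(item, score)
```

Under this definition, an item that is spread perfectly evenly gets an ARF equal to its raw frequency, while an item occurring only in one tight cluster gets a value closer to 1.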


Starting from mid-2020, I was first a Research Assistant at, and then a collaborator with, the International Multimodal Communication Centre (IMCC), based within the Oxford School of Global and Area Studies (OSGA) at the University of Oxford. I carried out annotation and correlation analyses of speech-gesture co-occurrences in Russian and American media, largely within the project Depictions of Post-COVID-19 Futures in Russian International Media: Multimodal Viewpoint Analysis.

British Library

In 2015, I spent two months as a trainee Assistant Curator-Cataloguer for the Slavonic Collections at The British Library. During that time, I enhanced the online catalogue of all Slavonic early-printed Cyrillic books held at the British Library (and fuelled my interest in all things data and pre-modern Slavic). Check out two posts I wrote for the British Library’s European Studies Blog:

Check out some of my projects

Parallel Bibles
Temporal subordination in 1400+ languages of the world
Machines in the media
Semantic change in the era of mechanization
A scalable dependency parser for pre-modern Slavic
Ancient Greek graph-based syntactic embeddings
Syntactic word representations for Ancient Greek