Barbara McGillivray

I am a research fellow at the Department of Theoretical and Applied Linguistics of the University of Cambridge and at the Alan Turing Institute. I am a member of the Language Technology Lab (LTL) and of the NLP interest group at the Alan Turing Institute. I am also a member of the scientific committee of the LaTeCH workshop (Language Technology for Cultural Heritage, Social Sciences, and Humanities) and of the working group Texts and Topics of the EU-funded COST Action IS1310 Reassembling the Republic of Letters, 1500-1800.

I co-founded the research network HiCor (History & Corpus Linguistics) funded by The Oxford Research Centre in the Humanities from 2013 to 2014 and co-organized the workshop From Text To Tech (NLP and Python for Humanities) during the 2015, 2016, and 2017 editions of the Digital Humanities at Oxford Summer School.

My research focusses on computational linguistics with a special interest in historical languages. I received a degree in Mathematics and one in Classics from the University of Florence (Italy), and a Ph.D. in Computational Linguistics from the University of Pisa (2010). I have worked as a language technologist in the Dictionary division of Oxford University Press and as a data scientist in the Open Research Group of Springer Nature.

I am interested in computational semantics applied to current and historical data, and in methodological issues concerned with answering questions in historical linguistics and digital humanities with corpus data and quantitative-computational techniques.

Research topics:


Email: bm517 [at]


McGillivray, B. and Jenset, G. B. (2017). Empirical Historical Linguistics, Oxford: Oxford University Press.

McGillivray, B. and Vatri, A. (2015). Computational valency lexica for Latin and Greek in use: a case study of syntactic ambiguity. Journal of Latin Linguistics, vol. 14 (1)

Passarotti, M., McGillivray, B., and Bamman, D. (2015). A Treebank-based Study on Latin Word Order. In Haverling, G. (Ed.), Latin linguistics in the early 21st century. Acts of the 16th international colloquium on Latin linguistics, Uppsala, June 6th-11th, 2011, Uppsala.

McGillivray, B. and Passarotti, M. (2015). Accessing and using a corpus-driven Latin Valency Lexicon. In Haverling, G. (Ed.), Latin linguistics in the early 21st century. Acts of the 16th international colloquium on Latin linguistics, Uppsala, June 6th-11th, 2011, Uppsala.

McGillivray, B. (2015). Metodi in linguistica computazionale latina. In Molinelli, P., Putzu, I. (Eds.), Modelli epistemologici, metodologie della ricerca e qualità del dato. Dalla linguistica storica alla sociolinguistica storica. Milan: Franco Angeli.

McGillivray, B. (2013). Latin preverbs and verb argument structure: New insights from new methods. In Elly van Gelderen, Jóhanna Barðdal, and Michela Cennamo (eds.), Argument Structure in Flux. The Naples-Capri Papers, John Benjamins

McGillivray, B. (2013). Methods in Latin Computational Linguistics, Leiden: Brill.

McGillivray, B. and Kilgarriff, A. (2013). Tools for historical corpus research, and a corpus of Latin. In Paul Bennett, Martin Durrell, Silke Scheible, Richard J. Whitt (eds.), New Methods in Historical Corpus Linguistics, Tübingen: Narr.

Jenset, G. and McGillivray, B. (2012). Multivariate analyses of affix productivity in translated English. In: Michael P. Oakes and Meng Ji (eds.), Quantitative Methods in Corpus-Based Translation Studies. Amsterdam: John Benjamins

Barðdal, J., Smitherman, T., Bjarnadóttir, V., Danesi, S., Jenset, G., and McGillivray, B. (2012). Reconstructing Constructional Semantics: The Dative Subject Construction in Old Norse-Icelandic, Latin, Ancient Greek, Old Russian and Old Lithuanian. Studies in Language, 36(3)

McGillivray, B. (2011). Contribution to the lexical entries contained in Manni, P., and Biffi, M. (Eds.), Glossario Leonardiano. Nomenclatura delle macchine nei codici di Madrid e Atlantico. Florence: Olschki

McGillivray, B., Passarotti, M., and Ruffolo, P. (2009). The Index Thomisticus Treebank Project. Annotation, parsing and valency lexicon. Traitement Automatique des Langues, vol. 50 (2)

Meini, L. and McGillivray, B. (2010). Between semantics and syntax: spatial verbs and prepositions in Latin. Space in Language. Proceedings of the Pisa International Conference, Pisa: ETS

McGillivray, B. (2010). Automatic Selectional Preference Acquisition for Latin verbs. In Proceedings of the ACL 2010 Student Research Workshop, Uppsala

McGillivray, B. and Johansson, C. (2009). Making sense through correspondence. Arena Romanistica, vol. 4

McGillivray, B. and Passarotti, M. (2009). The Development of the Index Thomisticus Treebank Valency Lexicon. In Proceedings of the Workshop on Language Technology and Resources for Cultural Heritage, Social Sciences, Humanities, and Education, Athens

McGillivray, B. (2009). Selectional Preferences from a Latin Treebank. In Przepiórkowski A., M. Passarotti, S. Raynaud, and F. van Eynde (eds.) Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT8). Milan: EDUCatt

McGillivray, B., Johansson, C., and Apollon, D. (2008). Semantic structure from Correspondence Analysis. In Proceedings of the Workshop on Graph-based Algorithms for Natural Language Processing (COLING 2008), Manchester

Lenci, A., McGillivray, B., Montemagni, S., and Pirrelli, V. (2008). Unsupervised Acquisition of Verb Subcategorization Frames from Shallow-Parsed Corpora. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech

McGillivray, B. (2006). A probabilistic algorithm for the secant defect of Grassmann varieties. Linear Algebra and its Applications, vol. 418(2-3)


PhD in Computational Linguistics, University of Pisa, Italy (2007-2010)
Three-year degree in Classics, equivalent to Bachelor's degree, University of Florence, Italy (2004-2006)
Four-year degree in Mathematics, equivalent to Master's degree, University of Florence, Italy (1999-2004)