SimVerb-3500 is a gold-standard evaluation resource for the semantic similarity of verbs.
We provide 3500 verb pairs with ratings on a scale of 0 to 10. Here are some examples:
|to reply / to respond||9.79|
|to participate / to join||5.64|
|to stay / to leave||0.17|
SimVerb-3500 covers all normed verb types from the USF free-association database, and provides at least three examples for every VerbNet class.
Please contact Daniela Gerz for any questions.
Download SimVerb-3500 by clicking here.
The .zip file includes the full dataset, as well as a development and test split. In addition to the averaged scores (as shown above), we also provide the raw individual ratings per annotator. Please see the accompanying readme file for the file formats and details.
Please cite the following paper if you use SimVerb in your work:
Here is a benchmark of current models on SimVerb-3500. The presented numbers are Spearman correlation scores.
Please consult the supplementary material for an explanation of the models.
|Word2Vec SGNS-BOW-8B (dim=500) ||0.348||0.378||0.350|
|Word2Vec SGNS-DEPS-8B (dim=500) ||0.356||0.389||0.351|
|Symmetric Pattern Vectors 8B (dim=500) ||0.328||0.276||0.347|
|Paragram (dim=300) ||0.540||0.525||0.537|
|Paragram + counter-fitting (dim=300) ||0.628||0.611||0.624|
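As a rough illustration of how a Spearman correlation score of this kind can be computed, here is a minimal sketch in plain Python. It scores each verb pair by the cosine similarity of its two word vectors and correlates those scores with the gold ratings. The three gold pairs are the examples shown above; the two-dimensional vectors in `TOY_VECTORS`, the helper names, and the choice to skip pairs with a missing vector are all assumptions made for illustration, not the evaluation setup used for the benchmark.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def cosine(u, v):
    """Cosine similarity of two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def evaluate(gold_pairs, vectors):
    """Correlate gold ratings with cosine similarities of the model vectors.

    Assumption: pairs with a verb missing from `vectors` are skipped.
    """
    gold, predicted = [], []
    for verb1, verb2, rating in gold_pairs:
        if verb1 in vectors and verb2 in vectors:
            gold.append(rating)
            predicted.append(cosine(vectors[verb1], vectors[verb2]))
    return spearman(gold, predicted)


# Gold ratings: the three example pairs listed above.
GOLD = [
    ("reply", "respond", 9.79),
    ("participate", "join", 5.64),
    ("stay", "leave", 0.17),
]

# Made-up 2-d vectors, for illustration only (not from any real model).
TOY_VECTORS = {
    "reply": [1.0, 0.1], "respond": [0.9, 0.2],
    "participate": [0.5, 0.5], "join": [0.1, 0.9],
    "stay": [1.0, 0.0], "leave": [-1.0, 0.1],
}

score = evaluate(GOLD, TOY_VECTORS)
print(f"Spearman correlation: {score:.3f}")
```

Because the toy vectors were chosen to preserve the gold ordering of the three pairs, the correlation here comes out at 1 up to floating-point rounding. With real embeddings and the full dataset, the same procedure yields a single score per model and split.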
Tomas Mikolov, Kai Chen, Gregory S. Corrado, and Jeffrey Dean. 2013a. Efficient estimation of word representations in vector space. In ICLR: Workshop Papers.
Omer Levy and Yoav Goldberg. 2014. Dependency-based word embeddings. In ACL, pages 302–308.
Roy Schwartz, Roi Reichart, and Ari Rappoport. 2016. Symmetric patterns and coordinations: Fast and enhanced representations of verbs and adjectives. In NAACL.
Roy Schwartz, Roi Reichart, and Ari Rappoport. 2015. Symmetric pattern based word embeddings for improved word similarity prediction. In CoNLL, pages 258–267.
Manaal Faruqui and Chris Dyer. 2015. Non-distributional word vector representations. In ACL, pages 464–469.
John Wieting, Mohit Bansal, Kevin Gimpel, and Karen Livescu. 2015. From paraphrase database to compositional paraphrase model and back. Transactions of the ACL, 3:345–358.
Nikola Mrkšić, Diarmuid Ó Séaghdha, Blaise Thomson, Milica Gašić, Lina Maria Rojas-Barahona, Pei-Hao Su, David Vandyke, Tsung-Hsien Wen, and Steve J. Young. 2016. Counter-fitting word vectors to linguistic constraints. In NAACL-HLT.