mateplus

This repository contains code for an extended version of the mate-tools semantic role labeler. Most extensions are described in Roth and Woodsend (2014). Unpublished extensions include feature selection routines and some additional functionality that has not yet been described elsewhere.

June 2015: The current version achieves state-of-the-art performance on the CoNLL-2009 data set. With F1-scores of 87.33 in-domain and 76.38 out-of-domain, it is the best-performing system for SRL in English to date.¹ With an in-domain F1-score of 81.38, it is also the best SRL system available for German. A demo is available online here.

Dependencies

The following libraries and model files need to be downloaded in order to run mateplus on English text:

To run mateplus on German text, additional preprocessing libraries need to be downloaded:

If you want to run mateplus on German text using ParZu as an external dependency parser (recommended for non-newswire text), please use this model from Google Drive.

Running mateplus

If copies of all required libraries and models are available in the subdirectories lib/ and models/, respectively, mateplus can be run as a standalone application using the script scripts/parse.sh. This script runs the mate-tools pipeline to preprocess a given input text file (assuming one sentence per line) and then applies our state-of-the-art model for identifying and role-labeling semantic predicate-argument structures. For German, please use the script scripts/parse-ger.sh (recommended for newswire text) or scripts/parse-ger-ext.sh (recommended for non-newswire text).
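For example, assuming the directory layout above, a run might look like the following. This is only a sketch: the exact calling convention of the scripts (input arguments, output handling) is not documented here, so the file names and output redirection are illustrative assumptions.

    # English: plain text, one sentence per line (file names are placeholders)
    sh scripts/parse.sh input.txt > output.conll

    # German newswire text
    sh scripts/parse-ger.sh input-de.txt > output-de.conll

    # German non-newswire text (ParZu-based preprocessing)
    sh scripts/parse-ger-ext.sh input-de.txt > output-de.conll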

It is also possible to apply the mateplus SRL model to already preprocessed text in the CoNLL 2009 format, using the Java class se.lth.cs.srl.Parse. Since mateplus is trained on input preprocessed with mate-tools, however, we strongly recommend using the complete pipeline to achieve the best performance.
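A minimal sketch of such an invocation is given below. The classpath, jar name, model file, and argument order are assumptions (modeled on the original mate-tools SRL parser and the lib/ and models/ layout mentioned above), not taken from this README; the source of se.lth.cs.srl.Parse in this repository is the authoritative reference for the arguments it actually accepts.

    # Sketch only: jar/model names and argument order are assumed, not documented here.
    # The input file must already be in CoNLL 2009 format.
    java -Xmx8g -cp "mateplus.jar:lib/*" se.lth.cs.srl.Parse \
        eng preprocessed.conll models/srl-eng.model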

References

If you are using mateplus in your work--and we highly recommend you do!--please cite the following publication:

Michael Roth and Kristian Woodsend (2014). Composition of word representations improves semantic role labelling. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, pp. 407–413.

Depending on which parts of the pipeline you are using, please also cite the following.

German joint parsing model: Bernd Bohnet, Joakim Nivre, Igor Boguslavsky, Richárd Farkas, Filip Ginter, and Jan Hajič (2013). Joint morphological and syntactic analysis for richly inflected languages. Transactions of the Association for Computational Linguistics (TACL), 1:415–428.

ParZu (the Zurich Dependency Parser): Rico Sennrich, Martin Volk, and Gerold Schneider (2013). Exploiting synergies between open resources for German dependency parsing, POS-tagging, and morphological analysis. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP), Hissar, Bulgaria.

English parsing model: Bernd Bohnet (2010). Very high accuracy and fast dependency parsing is not a contradiction. In Proceedings of the 23rd International Conference on Computational Linguistics (COLING), Beijing, China.

Original mate-tools SRL model: Anders Björkelund, Love Hafdell, and Pierre Nugues (2009). Multilingual semantic role labeling. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL), Boulder, Colorado, pp. 43–48.


¹ To reproduce our evaluation results on the CoNLL-2009 data set, preprocessing components must be retrained on the training split only, using 10-fold jackknifing.
