Educe

Abstract representation of a discourse-annotated corpus.


About

Educe is a library for working with discourse-annotated corpora. It also includes utility scripts for building, maintaining, and querying these corpora. The currently supported corpora are:

  • (SDRT) STAC corpus
  • (RST) RST Discourse Treebank (experimental, 2014-07-14)
  • (PDTB) Penn Discourse Treebank (experimental, 2014-07-14)

If you have a discourse-annotated corpus, or are trying to build one, you may find it useful to add support for it to educe.

Documentation

Installation

First, check that pip is available:

pip --help

If this doesn't work, download the distribute_setup.py setup script and run:

python distribute_setup.py
easy_install pip

Once pip is installed, install educe and its dependencies from the top of the educe source directory (the one containing requirements.txt):

pip install -r requirements.txt .
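
The checks in the steps above can also be scripted. The following is a minimal sketch, not part of educe itself: the helper names `command_available` and `module_available` are hypothetical, and it assumes the installed package is importable under the name `educe`.

```python
import importlib.util
import shutil


def command_available(name):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def module_available(name):
    """Return True if a top-level module called `name` can be located for import."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Before installing: is pip on PATH?
    print("pip command found:", command_available("pip"))
    # After installing: assumes the import name is "educe".
    print("educe importable:", module_available("educe"))
```

If the second line reports False after installation, the install most likely went to a different Python environment than the one running the script.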

See also