Computer Science and Engineering

Implicit Argument Annotation Data

Here you will find our annotation data for implicit arguments of nominal predicates. Note that these data differ somewhat from those used in our 2010 paper; however, they exactly match the data used in our 2012 CL article. First, download and unpack the data. The XML elements should be interpreted as follows:

annotations: contains all of the annotations for a particular predicate node. The location of the predicate node is given by the "for_node" attribute on this element.

annotation: indicates an argument label applied to a parse tree node. The argument label is given by the "value" attribute. The argument label is applied to the node at the location given by the "node" attribute.

attribute: there are two possible values for this element, "Explicit" and "Split". The former indicates that the annotation label is supplied by NomBank. The latter indicates that the annotated node combines with other annotated nodes to form a complete argument. These other nodes exist under the same "annotations" element and will also have the "Split" attribute applied to them. Note that for an implicit split Arg0, for example, you will find at least two annotated Arg0 nodes with the "Split" attribute. You might also find some Arg0 nodes without the "Split" attribute. This simply indicates that we found other mentions of the Arg0 that were not split.
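
As a rough illustration of the element descriptions above, the following Python sketch reads one "annotations" element with the standard library. The sample XML is invented for illustration: the nesting of "attribute" inside "annotation", the storage of its value as element text, the label spelling, and the node locations are all assumptions, so check them against the unpacked files before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. The exact layout of the real files is an assumption here.
SAMPLE = """
<annotations for_node="wsj_0001:3:7:0">
  <annotation value="Arg0" node="wsj_0001:2:0:1">
    <attribute>Split</attribute>
  </annotation>
  <annotation value="Arg0" node="wsj_0001:2:4:1">
    <attribute>Split</attribute>
  </annotation>
  <annotation value="Arg1" node="wsj_0001:3:8:2">
    <attribute>Explicit</attribute>
  </annotation>
  <annotation value="Arg0" node="wsj_0001:1:2:1"/>
</annotations>
"""


def read_annotations(elem):
    """Return the predicate location and one dict per annotated node."""
    predicate = elem.get("for_node")
    rows = []
    for ann in elem.findall("annotation"):
        # Assumed: each flag is the text of a child "attribute" element.
        flags = {a.text.strip() for a in ann.findall("attribute") if a.text}
        rows.append({
            "label": ann.get("value"),
            "node": ann.get("node"),
            "explicit": "Explicit" in flags,  # label supplied by NomBank
            "split": "Split" in flags,        # one piece of a split argument
        })
    return predicate, rows


predicate, rows = read_annotations(ET.fromstring(SAMPLE.strip()))
# Annotations without the "Explicit" flag are the implicit ones.
implicit = [r for r in rows if not r["explicit"]]
```

In this sample, the two "Split" Arg0 nodes together form one implicit argument, while the last Arg0 node is a separate, unsplit mention of the same argument.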

All node locations are in the format "wsj_xxxx:a:b:c". The "xxxx" indicates an article number, "a" indicates a zero-based sentence number within article "wsj_xxxx", "b" indicates a zero-based terminal number within sentence "a", and "c" indicates the height of the target node above the terminal given by "b". When counting terminals to locate a node in a parse tree, be sure to include "trace" terminals, which are present in Penn Treebank parses.
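
The location format can be sketched in Python as follows. This is only an illustration under stated assumptions: the tree is a hand-built nested tuple rather than a real Penn Treebank parse, the names NodeLocation, parse_location, and resolve are hypothetical, and height 0 is assumed to denote the part-of-speech node directly above the word (the usual PropBank/NomBank pointer convention).

```python
from typing import NamedTuple


class NodeLocation(NamedTuple):
    article: str   # e.g. "wsj_0001"
    sentence: int  # zero-based sentence number within the article
    terminal: int  # zero-based terminal number, traces included
    height: int    # levels above the terminal


def parse_location(text):
    """Split a "wsj_xxxx:a:b:c" string into its four fields."""
    article, sentence, terminal, height = text.split(":")
    return NodeLocation(article, int(sentence), int(terminal), int(height))


def resolve(tree, terminal, height):
    """Find a node in a tree of nested tuples: (label, child, ...).

    A terminal is a (tag, word) pair. Trace terminals such as
    ("-NONE-", "*-1") are counted like any other terminal.
    """
    chains = []  # one root-to-terminal chain of nodes per terminal

    def walk(node, ancestors):
        chain = ancestors + [node]
        children = node[1:]
        if len(children) == 1 and isinstance(children[0], str):
            chains.append(chain)
        else:
            for child in children:
                walk(child, chain)

    walk(tree, [])
    chain = chains[terminal]
    if height >= len(chain):
        raise IndexError("height exceeds the depth of the terminal")
    # Assumed convention: height 0 is the (tag, word) node itself.
    return chain[len(chain) - 1 - height]


# Toy parse containing one trace terminal (index 4).
TREE = ("S",
        ("NP", ("DT", "The"), ("NN", "sale")),
        ("VP", ("VBD", "was"),
         ("VP", ("VBN", "completed"), ("NP", ("-NONE-", "*-1")))),
        (".", "."))

loc = parse_location("wsj_0001:0:5:0")
node = resolve(TREE, loc.terminal, loc.height)
```

Because the trace occupies terminal index 4, the final period is terminal 5. Skipping traces while counting would shift every later terminal by one and point at the wrong node.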

We did our best to adhere to the following requirement: for a missing argument position (i.e., one not given by NomBank) for a nominal predicate, annotate all mentions of the argument in the current and all preceding sentences. If you find annotations that don't fit with this requirement (e.g., annotations in the sentences following the predicate), you should ignore them.
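
One way to apply this rule is to compare sentence numbers in the node locations, as in the Python sketch below. The function names are hypothetical, and the sketch assumes every annotation lies in the same article as its predicate.

```python
def sentence_index(location):
    """Return the zero-based sentence number from "wsj_xxxx:a:b:c"."""
    return int(location.split(":")[1])


def keep_in_scope(predicate_node, annotation_nodes):
    """Keep annotations in the predicate's sentence or an earlier one."""
    limit = sentence_index(predicate_node)
    return [n for n in annotation_nodes if sentence_index(n) <= limit]


kept = keep_in_scope(
    "wsj_0001:3:7:0",
    ["wsj_0001:2:0:1", "wsj_0001:3:9:0", "wsj_0001:4:1:0"],
)
# The annotation in sentence 4 follows the predicate and is dropped.
```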

NomBank often marks the predicate as itself supplying a so-called incorporated argument. Furthermore, NomBank completely ignores predicates if they are only associated with an incorporated argument (see the NomBank guidelines for more information). Because we annotated all predicate instances in our study, we added incorporated argument labels to predicate nodes where needed. We did not, however, include incorporated argument positions in our implicit argument identification evaluation. To reproduce our results, you should not evaluate argument positions for which there exists an incorporated argument label on the predicate node.
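
The exclusion can be sketched in Python as follows. The sketch assumes that an incorporated argument shows up as an annotation whose "node" equals the predicate's "for_node"; that reading follows from the description above but should be verified against the files. The function name and sample locations are hypothetical.

```python
def drop_incorporated(predicate_node, annotations):
    """Remove every argument position that is incorporated in the predicate.

    annotations is a list of (label, node_location) pairs taken from one
    "annotations" element.
    """
    # Assumed: a label placed on the predicate node itself is incorporated.
    incorporated = {label for label, node in annotations
                    if node == predicate_node}
    return [(label, node) for label, node in annotations
            if label not in incorporated]


anns = [
    ("Arg0", "wsj_0001:3:7:0"),  # on the predicate node: incorporated
    ("Arg0", "wsj_0001:2:0:1"),
    ("Arg1", "wsj_0001:1:4:2"),
]
to_evaluate = drop_incorporated("wsj_0001:3:7:0", anns)
```

Here the whole Arg0 position is skipped, including the mention in sentence 2, because the predicate node itself carries an Arg0 label.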

If you have any questions about the annotations, please email us (email addresses are provided in our papers).
