Extended constituent-to-dependency conversion for English

Author

Editor

  • Joakim Nivre
  • Heiki-Jaan Kalep
  • Kadri Muischnek
  • Mare Koit

Summary, in English

We describe a new method for converting English constituent trees annotated in the Penn Treebank style into dependency trees. The new format was inspired by annotation practices used in other dependency treebanks and is intended to provide a better interface to further semantic processing than existing methods do. In particular, we used a richer set of edge labels and introduced links to handle long-distance phenomena such as wh-movement and topicalization.
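
As a rough illustration of what constituent-to-dependency conversion involves in general, the sketch below uses head percolation: each phrase picks one child as its head, and the lexical heads of the remaining children become dependents of the phrase's lexical head. This is not the conversion procedure of the paper. The `Node` class, the toy `HEAD_RULES` table, and the use of the dependent's phrase label as the edge label are all assumptions made for the example; the richer edge labels and long-distance links described in the summary are not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Node:
    label: str                              # phrase label or POS tag
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None              # set only on leaves
    index: Optional[int] = None             # 1-based token position, leaves only

# Toy head rules (hypothetical): for each phrase label, the child labels
# to look for, in priority order.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VBD", "VBZ", "VB", "VP"],
    "NP": ["NN", "NNS", "NNP", "NP"],
    "PP": ["IN"],
}

def head_child(node: Node) -> Node:
    """Pick the head child of a phrase using the toy rule table."""
    for wanted in HEAD_RULES.get(node.label, []):
        for child in node.children:
            if child.label == wanted:
                return child
    return node.children[-1]                # fallback: rightmost child

def convert(node: Node, arcs: List[Tuple[int, int, str]]) -> Node:
    """Return the lexical head leaf of `node`, collecting
    (dependent, head, label) arcs along the way."""
    if node.word is not None:
        return node
    head = head_child(node)
    head_leaf = convert(head, arcs)
    for child in node.children:
        if child is head:
            continue
        dep_leaf = convert(child, arcs)
        # Placeholder label: the phrase label of the dependent.
        arcs.append((dep_leaf.index, head_leaf.index, child.label))
    return head_leaf

def to_dependencies(root: Node) -> List[Tuple[int, int, str]]:
    arcs: List[Tuple[int, int, str]] = []
    top = convert(root, arcs)
    arcs.append((top.index, 0, "ROOT"))     # 0 stands for the artificial root
    return sorted(arcs)

# (S (NP (DT The) (NN dog)) (VP (VBD chased) (NP (DT a) (NN cat))))
example = Node("S", [
    Node("NP", [Node("DT", word="The", index=1), Node("NN", word="dog", index=2)]),
    Node("VP", [
        Node("VBD", word="chased", index=3),
        Node("NP", [Node("DT", word="a", index=4), Node("NN", word="cat", index=5)]),
    ]),
])
example_arcs = to_dependencies(example)
```

In this toy tree the verb becomes the root, both nouns attach to it, and each determiner attaches to its noun. A real converter needs far more detailed head rules and a separate step for assigning grammatical function labels.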

The resulting trees generally have a more complex dependency structure. For example, 6% of the trees contain at least one nonprojective link, a configuration that many parsing algorithms find difficult to handle. As expected, the more complex structure and the enriched set of edge labels make the trees harder to predict, and we observed a decrease in parsing accuracy when applying two dependency parsers to the new corpus. However, the richer information contained in the new trees resulted in a 23% error reduction in a baseline FrameNet semantic role labeler that relied on dependency arc labels only.
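
To make the notion of a nonprojective link concrete, here is a minimal sketch that flags such arcs under the usual definition: an arc from head h to dependent d is projective if every token strictly between them is a descendant of h. The representation (a list `heads` in which `heads[i]` is the head of token `i + 1`, with 0 for the artificial root) and the function names are assumptions made for the example, and the input is assumed to be a well-formed tree without cycles.

```python
from typing import List, Tuple

def is_descendant(heads: List[int], node: int, ancestor: int) -> bool:
    """True if `ancestor` is reached by following head links up from `node`."""
    while node != 0:                 # the artificial root 0 has no head
        node = heads[node - 1]
        if node == ancestor:
            return True
    return False

def nonprojective_arcs(heads: List[int]) -> List[Tuple[int, int]]:
    """Return the (head, dependent) arcs that span a token which is
    not a descendant of the arc's head."""
    flagged = []
    for dep, head in enumerate(heads, start=1):
        lo, hi = sorted((head, dep))
        for between in range(lo + 1, hi):
            if not is_descendant(heads, between, head):
                flagged.append((head, dep))
                break
    return flagged

def share_nonprojective(treebank: List[List[int]]) -> float:
    """Fraction of trees containing at least one nonprojective arc."""
    hits = sum(1 for heads in treebank if nonprojective_arcs(heads))
    return hits / len(treebank)

# Illustrative analyses (assumed for this example, not taken from the corpus):
# "The dog chased a cat"
projective = [2, 3, 0, 5, 3]
# "A hearing is scheduled on the issue today", with "on" attached to
# "hearing" and "today" attached to "scheduled"
crossing = [2, 3, 0, 3, 2, 7, 5, 4]

flagged = nonprojective_arcs(crossing)
share = share_nonprojective([projective, crossing])
```

In the second sentence the arcs hearing → on and scheduled → today cross each other, so both are flagged. A statistic such as the 6% figure above corresponds to `share_nonprojective` computed over a whole corpus.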

Publishing year

2007

Language

English

Pages

105-112

Publication/Series

NODALIDA 2007 Proceedings

Document type

Conference paper

Publisher

University of Tartu

Topic

  • Computer Science

Keywords

  • dependency syntax
  • treebanks
  • Natural language processing

Conference name

16th Nordic Conference of Computational Linguistics

Conference date

2007-05-25 - 2007-05-26

Conference place

Tartu, Estonia

Status

Published

ISBN/ISSN/Other

  • ISBN: 978-9985-4-0514-7