TY - GEN
T1 - Active learning for dependency parsing with partial annotation
AU - Li, Zhenghua
AU - Zhang, Min
AU - Zhang, Yue
AU - Liu, Zhanyi
AU - Chen, Wenliang
AU - Wu, Hua
AU - Wang, Haifeng
N1 - Publisher Copyright:
© 2016 Association for Computational Linguistics.
PY - 2016
Y1 - 2016
N2 - Different from traditional active learning based on sentence-wise full annotation (FA), this paper proposes active learning with dependency-wise partial annotation (PA) as a finer-grained unit for dependency parsing. At each iteration, we select a few most uncertain words from an unlabeled data pool, manually annotate their syntactic heads, and add the partial trees into labeled data for parser retraining. Compared with sentence-wise FA, dependency-wise PA gives us more flexibility in task selection and avoids wasting time on annotating trivial tasks in a sentence. Our work makes the following contributions. First, we are the first to apply a probabilistic model to active learning for dependency parsing, which can 1) provide tree probabilities and dependency marginal probabilities as principled uncertainty metrics, and 2) directly learn parameters from PA based on a forest-based training objective. Second, we propose and compare several uncertainty metrics through simulation experiments on both Chinese and English. Finally, we conduct human annotation experiments to compare FA and PA on real annotation time and quality.
AB - Different from traditional active learning based on sentence-wise full annotation (FA), this paper proposes active learning with dependency-wise partial annotation (PA) as a finer-grained unit for dependency parsing. At each iteration, we select a few most uncertain words from an unlabeled data pool, manually annotate their syntactic heads, and add the partial trees into labeled data for parser retraining. Compared with sentence-wise FA, dependency-wise PA gives us more flexibility in task selection and avoids wasting time on annotating trivial tasks in a sentence. Our work makes the following contributions. First, we are the first to apply a probabilistic model to active learning for dependency parsing, which can 1) provide tree probabilities and dependency marginal probabilities as principled uncertainty metrics, and 2) directly learn parameters from PA based on a forest-based training objective. Second, we propose and compare several uncertainty metrics through simulation experiments on both Chinese and English. Finally, we conduct human annotation experiments to compare FA and PA on real annotation time and quality.
UR - https://www.scopus.com/pages/publications/85011965796
U2 - 10.18653/v1/p16-1033
DO - 10.18653/v1/p16-1033
M3 - 会议稿件
AN - SCOPUS:85011965796
T3 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
SP - 344
EP - 354
BT - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 7 August 2016 through 12 August 2016
ER -