
HITMI&T at SemEval-2021 Task 5: Integrating Transformer and CRF for Toxic Spans Detection

Chenyi Wang, Tianshu Liu, Tiejun Zhao*
*Corresponding author for this work
Harbin Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper describes our system for SemEval-2021 Task 5: Toxic Spans Detection. The task is to locate toxic spans within a text accurately. Using the BIO tagging scheme, we model the task as token-level sequence labeling. Our system uses a single model built on a multi-layer bidirectional transformer encoder, and we add a conditional random field (CRF) so that the model learns the constraints between tags. We use ERNIE as the pre-trained model, which proved more suitable for the task in our experiments. In addition, we use adversarial training with the fast gradient method (FGM) to improve the robustness of the system. Our system obtains an F1 score of 69.85%, ranking 3rd in the official evaluation.
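
As an illustration of the BIO formulation mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes toxic spans are given as a list of character offsets and that tokens are whitespace-delimited; the function name `bio_tags` and the example sentence are hypothetical, and the actual system would use the tokenizer of its pre-trained model.

```python
import re


def bio_tags(text, toxic_offsets):
    """Label whitespace-delimited tokens with B/I/O tags.

    Assumption: a token counts as toxic if any of its characters
    appears in toxic_offsets (a list of character indices).
    """
    toxic = set(toxic_offsets)
    tagged = []
    prev_toxic = False
    for match in re.finditer(r"\S+", text):
        is_toxic = any(i in toxic for i in range(match.start(), match.end()))
        if not is_toxic:
            tag = "O"      # outside any toxic span
        elif prev_toxic:
            tag = "I"      # continues the span opened by an earlier token
        else:
            tag = "B"      # first token of a toxic span
        tagged.append((match.group(), tag))
        prev_toxic = is_toxic
    return tagged


example = "you are a stupid idiot honestly"
offsets = list(range(10, 22))  # characters covering "stupid idiot"
print(bio_tags(example, offsets))
```

In this simplified scheme, toxic tokens that directly follow one another are always merged into one span, and a tag sequence such as O followed by I never occurs. Ruling out such invalid transitions is the kind of constraint between tags that a CRF layer can learn.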

Original language: English
Title of host publication: SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop
Editors: Alexis Palmer, Nathan Schneider, Natalie Schluter, Guy Emerson, Aurelie Herbelot, Xiaodan Zhu
Publisher: Association for Computational Linguistics (ACL)
Pages: 870-874
Number of pages: 5
ISBN (Electronic): 9781954085701
State: Published - 2021
Event: 15th International Workshop on Semantic Evaluation, SemEval 2021, co-located with The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online, Thailand
Duration: 5 Aug 2021 to 6 Aug 2021

Publication series

Name: SemEval 2021 - 15th International Workshop on Semantic Evaluation, Proceedings of the Workshop

Conference

Conference: 15th International Workshop on Semantic Evaluation, SemEval 2021, co-located with The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
Country/Territory: Thailand
City: Virtual, Online
Period: 5/08/21 to 6/08/21
