Malicious Domain Detection on Out-of-Distribution Gray Data through Graph Contrastive Learning with Structure Aggregation

  • East China Normal University
  • Harbin Institute of Technology Shenzhen
  • Shenzhen Loop Area Institute

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Graph-based threat detection methods model Indicators of Compromise (IoC) using heterogeneous graphs and train node classifiers to identify malicious domains. Despite their promising performance, these approaches still face two major challenges. First, the high cost of node annotation means that they are rarely evaluated on extensive gray data (unlabeled data). Second, previous observations reveal a significant distribution shift in the Domain Maliciousness Graph (DMG), where structural differences between labeled and unlabeled domains hinder model performance. Existing graph learning methods have not yet addressed both challenges simultaneously. To fill this gap, we frame the problem as semi-supervised graph node classification under out-of-distribution (OOD) constraints. We introduce graph aggregative contrastive learning (GRAVEL), which leverages the inherent structure of the DMG to enhance detection performance on OOD unlabeled domains. GRAVEL is pre-trained end-to-end on abundant in-distribution malicious and benign samples, then fine-tuned with scarce OOD malicious data via mixup. During pre-training, label propagation seeds pseudo-labels and a label-guided aggregation classifier warms up the model, after which multi-view contrastive learning sharpens features for unlabeled domains. Extensive industrial evaluations demonstrate that GRAVEL improves F1 by 5-20% across diverse benchmarks for OOD malicious domain detection, consistently outperforming state-of-the-art baselines.
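
The abstract mentions that label propagation seeds pseudo-labels for unlabeled (gray) domains but does not give the propagation scheme. As a rough illustration only, the following is a minimal sketch of generic clamped label propagation on a small homogeneous toy graph. It is not GRAVEL's implementation: the function names `propagate_labels` and `pseudo_labels`, the confidence threshold, the neutral starting score of 0.5, and the node names are all assumptions made for this example.

```python
from typing import Dict, List


def propagate_labels(
    adj: Dict[str, List[str]],
    seeds: Dict[str, float],
    iterations: int = 20,
) -> Dict[str, float]:
    """Generic clamped label propagation (illustrative, not GRAVEL's method).

    adj   : undirected adjacency list; every neighbor must also be a key.
    seeds : known maliciousness scores (1.0 = malicious, 0.0 = benign).
    Unlabeled nodes start at 0.5 and repeatedly take the mean score of
    their neighbors, while seed nodes are reset to their label each round.
    """
    scores = {node: float(seeds.get(node, 0.5)) for node in adj}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adj.items():
            if node in seeds:
                updated[node] = float(seeds[node])  # clamp labeled nodes
            elif neighbors:
                updated[node] = sum(scores[n] for n in neighbors) / len(neighbors)
            else:
                updated[node] = scores[node]  # isolated node keeps its score
        scores = updated
    return scores


def pseudo_labels(
    scores: Dict[str, float],
    seeds: Dict[str, float],
    threshold: float = 0.7,
) -> Dict[str, int]:
    """Keep only confident unlabeled nodes; ambiguous ones get no pseudo-label."""
    labels = {}
    for node, score in scores.items():
        if node in seeds:
            continue
        if score >= threshold:
            labels[node] = 1
        elif score <= 1.0 - threshold:
            labels[node] = 0
    return labels


# Hypothetical toy graph: two labeled domains and three gray nodes.
adj = {
    "bad.example": ["u1"],
    "good.example": ["u2"],
    "u1": ["bad.example", "u3"],
    "u2": ["good.example", "u3"],
    "u3": ["u1", "u2"],
}
seeds = {"bad.example": 1.0, "good.example": 0.0}

scores = propagate_labels(adj, seeds)
labels = pseudo_labels(scores, seeds, threshold=0.7)
print(labels)
```

In this toy graph the node next to the malicious seed receives pseudo-label 1, the node next to the benign seed receives 0, and the node midway between them stays unlabeled because its score remains at 0.5. Abstaining on ambiguous nodes is a design choice of this sketch; the paper may handle low-confidence nodes differently.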

Original language: English
Title of host publication: KDD 2026 - Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1
Publisher: Association for Computing Machinery
Pages: 324-335
Number of pages: 12
ISBN (Electronic): 9798400722585
DOIs
State: Published - 20 Apr 2026
Externally published: Yes
Event: 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, KDD 2026 - Jeju Island, Korea, Republic of
Duration: 9 Aug 2026 to 13 Aug 2026

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 1-A
ISSN (Print): 2154-817X

Conference

Conference: 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.1, KDD 2026
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 9/08/26 to 13/08/26

Keywords

  • heterogeneous graph
  • indicator of compromise
  • malicious domain detection
  • out-of-distribution
