TY - GEN
T1 - CDGM
T2 - 20th International Conference on Advanced Data Mining Applications, ADMA 2024
AU - Xie, Yushun
AU - Wang, Haiyan
AU - Tan, Runnan
AU - Song, Xiangyu
AU - Gu, Zhaoquan
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Cyberattacks can lead to data breaches, service disruptions, and economic losses, and may even threaten national security and social stability. Researchers have therefore proposed various methods based on public datasets to improve the intelligence and automation of cybersecurity defense techniques. However, these public datasets usually cover only a limited range of cyberattack types, so the proposed methods are ineffective against attacks not included in the dataset. Meanwhile, cybersecurity defenders often need to study cyberattack scenarios involving specific assets that are usually not represented in public datasets. To address these challenges, we propose a new approach to controlled cybersecurity dataset generation. Our method can reproduce any cyberattack using our four-role architecture, generating customized private attack data that includes specific assets; this capability satisfies the needs of researchers. By integrating the private attack data with a cybersecurity knowledge base derived from open-source datasets, we construct a comprehensive cybersecurity dataset. Extensive experiments demonstrate that the cybersecurity dataset generated by our method is suitable for various common cybersecurity tasks, such as threat hunting, alert analysis, and knowledge reasoning.
KW - Cyberattack
KW - Cybersecurity Dataset
KW - Data generate
UR - https://www.scopus.com/pages/publications/85214375335
U2 - 10.1007/978-981-96-0850-8_16
DO - 10.1007/978-981-96-0850-8_16
M3 - Conference contribution
AN - SCOPUS:85214375335
SN - 9789819608492
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 238
EP - 253
BT - Advanced Data Mining and Applications - 20th International Conference, ADMA 2024, Proceedings
A2 - Sheng, Quan Z.
A2 - Zhang, Xuyun
A2 - Wu, Jia
A2 - Ma, Congbo
A2 - Dobbie, Gill
A2 - Jiang, Jing
A2 - Zhang, Wei Emma
A2 - Manolopoulos, Yannis
A2 - Mansoor, Wathiq
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 3 December 2024 through 5 December 2024
ER -