TY - GEN
T1 - Semantic-Integrated Online Audit Log Reduction for Efficient Forensic Analysis
AU - Liao, Wenhao
AU - Sun, Jia
AU - Wang, Haiyan
AU - Gu, Zhaoquan
AU - Yang, Jianye
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
N2 - Audit logs are crucial for revealing and tracking sophisticated cyber threats due to their abundant system-level information. However, the immense scale of logs burdens storage resources and limits their lifecycle to days, which is insufficient for tracking multi-step attacks over months or years. Although log reduction techniques that cater to cold storage can mitigate this issue, many of these are restricted to offline batch processing in data centers. This incurs significant storage and transmission costs at endpoints. Moreover, many log reduction techniques fail to yield a suitable pattern for forensic analysis, which aims to identify signs of malicious activities by scrutinizing past events. In this paper, we present Sopr, an online audit log reduction technique designed to preserve traceability. Sopr enables real-time execution of the entire process, allowing reduction to be performed on raw log data streams. Specifically, our approach can effectively reduce events that lack causal dependence and involve repeated dependency relationships. To achieve this objective, we design a dual-cache architecture that simultaneously models semantically similar files and utilizes a versioned graph to preserve causality between log events. The synergy of these two components enhances the effectiveness of Sopr in log reduction. Our experiments on the DARPA TC datasets show that Sopr, operating online, can achieve an event reduction factor comparable to that of state-of-the-art offline approaches. Moreover, the runtime overhead and forensic analysis validity meet the deployment requirements for real-world environments.
AB - Audit logs are crucial for revealing and tracking sophisticated cyber threats due to their abundant system-level information. However, the immense scale of logs burdens storage resources and limits their lifecycle to days, which is insufficient for tracking multi-step attacks over months or years. Although log reduction techniques that cater to cold storage can mitigate this issue, many of these are restricted to offline batch processing in data centers. This incurs significant storage and transmission costs at endpoints. Moreover, many log reduction techniques fail to yield a suitable pattern for forensic analysis, which aims to identify signs of malicious activities by scrutinizing past events. In this paper, we present Sopr, an online audit log reduction technique designed to preserve traceability. Sopr enables real-time execution of the entire process, allowing reduction to be performed on raw log data streams. Specifically, our approach can effectively reduce events that lack causal dependence and involve repeated dependency relationships. To achieve this objective, we design a dual-cache architecture that simultaneously models semantically similar files and utilizes a versioned graph to preserve causality between log events. The synergy of these two components enhances the effectiveness of Sopr in log reduction. Our experiments on the DARPA TC datasets show that Sopr, operating online, can achieve an event reduction factor comparable to that of state-of-the-art offline approaches. Moreover, the runtime overhead and forensic analysis validity meet the deployment requirements for real-world environments.
KW - Forensic Analysis
KW - Log Reduction
KW - Provenance Graph
UR - https://www.scopus.com/pages/publications/85214394627
U2 - 10.1007/978-981-96-0850-8_21
DO - 10.1007/978-981-96-0850-8_21
M3 - Conference contribution
AN - SCOPUS:85214394627
SN - 9789819608492
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 318
EP - 333
BT - Advanced Data Mining and Applications - 20th International Conference, ADMA 2024, Proceedings
A2 - Sheng, Quan Z.
A2 - Zhang, Xuyun
A2 - Wu, Jia
A2 - Ma, Congbo
A2 - Dobbie, Gill
A2 - Jiang, Jing
A2 - Zhang, Wei Emma
A2 - Manolopoulos, Yannis
A2 - Mansoor, Wathiq
PB - Springer Science and Business Media Deutschland GmbH
T2 - 20th International Conference on Advanced Data Mining and Applications, ADMA 2024
Y2 - 3 December 2024 through 5 December 2024
ER -