Abstract
Security operations centre (SOC) analysts must investigate alerts, correlate threat intelligence and interpret heterogeneous telemetry under tight time constraints. Although large language models (LLMs) offer strong language understanding capabilities, applying them directly to SOC environments remains challenging due to semantic ambiguity in analyst queries, fragmented multisource event data, limited domain-specific reasoning and reliability concerns associated with unconstrained query generation. We present a task-driven knowledge-augmented framework designed to produce verifiable and contextually grounded responses for SOC workflows. The framework integrates four components: (i) contrastive context task recognition that mitigates semantic ambiguity by mapping analyst queries to predefined SOC task types; (ii) expert-guided knowledge augmentation that fuses dense and sparse retrieval to bridge the semantic gap; (iii) schema-aligned event retrieval combined with entity-centric evidence profiling to ensure reliable and secure access to heterogeneous telemetry; and (iv) verifiable task-aware generation that constrains model outputs to retrieved knowledge and security events. To assess the framework, we construct a benchmark of 12,500 validated question–answer pairs derived through semiautomated synthesis over more than 34 million real SOC records. Experiments across multiple foundation models demonstrate consistent improvements in relevance and grounding quality. Our results indicate that the four proposed components substantially enhance LLMs' reliability in practical SOC analysis.
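Component (ii) fuses dense and sparse retrieval, but the abstract does not say how the two result lists are combined. As a minimal sketch only, the snippet below uses reciprocal rank fusion (RRF), a common generic choice that is assumed here rather than taken from the paper. The function name, the constant `k=60` and the `kb-*` document identifiers are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Assumption: RRF stands in for the paper's unspecified fusion step.
    Each list contributes 1 / (k + rank) to the score of every id it contains.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense_hits = ["kb-17", "kb-03", "kb-42"]
sparse_hits = ["kb-03", "kb-99", "kb-17"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)  # ['kb-03', 'kb-17', 'kb-99', 'kb-42']
```

Entries that both retrievers return rise to the top, which is the usual motivation for combining embedding-based and keyword-based search over a knowledge base.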
| Original language | English |
|---|---|
| Journal | CAAI Transactions on Intelligence Technology |
| State | Accepted/In press - 2026 |
| Externally published | Yes |
Keywords
- heterogeneous security telemetry
- knowledge-augmented retrieval
- large language models
- security operations centre
- task recognition
Title
From Ambiguous Queries to Verifiable Insights: A Task-Driven Framework for LLM-Powered SOC Analysis